This article outlines common problems with RapidMiner Server.

Switching Java distributions

When switching Java distributions, e.g. from Oracle JDK to OpenJDK, the temporary files of JBoss need to be cleaned up. To do this, first make sure that RapidMiner Server is shut down. Then go to the installation directory of RapidMiner Server and delete the <install-directory>/standalone/tmp folder or its contents.
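The cleanup step above can be scripted. The following is a minimal sketch, not an official tool: the function name clear_jboss_tmp is made up for illustration, and it assumes that RapidMiner Server has already been shut down and that you pass your own installation directory.

```python
import shutil
from pathlib import Path


def clear_jboss_tmp(install_dir: str) -> int:
    """Delete everything inside <install_dir>/standalone/tmp.

    Returns the number of top-level entries removed. The tmp folder
    itself is kept. Hypothetical helper: run it only while the server
    is shut down.
    """
    tmp_dir = Path(install_dir) / "standalone" / "tmp"
    if not tmp_dir.is_dir():
        return 0  # nothing to clean
    removed = 0
    for entry in tmp_dir.iterdir():
        if entry.is_dir() and not entry.is_symlink():
            shutil.rmtree(entry)  # remove a sub-folder recursively
        else:
            entry.unlink()  # remove a plain file or a symlink
        removed += 1
    return removed
```

Deleting only the contents keeps the folder's ownership and permissions untouched, which is why the sketch does not remove the tmp folder itself.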

Growing ActiveMQ working directory

The following applies if you're using our embedded ActiveMQ broker, which is the default setup.

ActiveMQ stores data and log files (journal files with the .log file extension) in its working directory within the RapidMiner Server/AI Hub home directory under rapidminer-server-home/data/broker/. They're used internally by ActiveMQ to manage messages consistently. An ActiveMQ instance can hold references to multiple journal files and only prunes them once they're no longer in use. If a lot of messages are being sent to and from ActiveMQ, e.g. by submitting jobs frequently or by having multiple schedules in place, the working directory can grow in size.

If you manually delete journal files, you will probably lose messages such as pending jobs, so we advise against deleting them. Instead, we recommend that you periodically check the size of the directory. It should always stay below 50GB, because ActiveMQ has a default limit of 50GB for the working directory size. Once that limit is reached, ActiveMQ stops functioning properly, e.g. jobs can no longer be submitted and your jobs will not be executed.
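A periodic size check could look like the following minimal sketch. The function names and the constant are made up for illustration, and the sketch assumes that the 50GB limit is counted in binary gigabytes, which the text does not state.

```python
import os

# Default working directory limit mentioned above.
# Assumption: 1 GB = 1024**3 bytes.
LIMIT_BYTES = 50 * 1024**3


def directory_size_bytes(path: str) -> int:
    """Sum the sizes of all regular files below path (symlinks are skipped)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            file_path = os.path.join(root, name)
            if not os.path.islink(file_path):
                total += os.path.getsize(file_path)
    return total


def usage_ratio(path: str, limit_bytes: int = LIMIT_BYTES) -> float:
    """Return the directory size as a fraction of the limit (1.0 means the limit is reached)."""
    return directory_size_bytes(path) / limit_bytes
```

For example, calling usage_ratio on rapidminer-server-home/data/broker/ from a scheduled script and alerting once the value exceeds a threshold of your choice (say 0.8) leaves time to react before the limit is hit.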

There are two main approaches to resolving such a situation:

  1. Let ActiveMQ's cleanup mechanism handle it for you, or
  2. Let RapidMiner AI Hub delete all messages on every application start.

Use ActiveMQ's cleanup mechanism

To ensure that ActiveMQ cleans up most of the journal files automatically, make sure that no more jobs are submitted to any queue and that there are no pending jobs on any queue, e.g. by purging a queue on the RapidMiner AI Hub queues page and pausing any active schedule. Once this is the case, the internal cleanup mechanism deletes unused journal files automatically.

You can also change the default size of each journal file by adjusting the property broker.activemq.embeddedBroker.journalMaxFileLength = 256 in the file, which sets the maximum file size in MB. The default size is 128 MB, which ActiveMQ automatically allocates and occupies on start.

By default, ActiveMQ checks every 5 seconds whether it can clean up unused journal files. If this interval is too long for your setup, you can lower it, e.g. to one second, by setting the property broker.activemq.embeddedBroker.cleanupInterval = 1000 (in milliseconds), and then observe whether journal files are now cleaned up correctly.
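To observe whether journal files are actually being cleaned up, you can list them before and after a change and compare the results. The following is a minimal sketch: the function name journal_files is made up for illustration, and it relies only on the .log file extension mentioned above.

```python
from pathlib import Path
from typing import List, Tuple


def journal_files(broker_dir: str) -> List[Tuple[str, int]]:
    """Return (relative path, size in bytes) for every *.log file below broker_dir.

    The result is sorted by path so that two snapshots can be compared easily.
    """
    base = Path(broker_dir)
    found = [
        (str(p.relative_to(base)), p.stat().st_size)
        for p in base.rglob("*.log")
        if p.is_file()
    ]
    return sorted(found)
```

If the list returned for rapidminer-server-home/data/broker/ keeps growing while no jobs are pending, the cleanup is most likely not keeping up with your setup.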

Wiping the broker's working directory on start

Be aware that the following is a destructive operation: all submitted and pending jobs (all pending messages) will be dropped. Please ensure beforehand that all schedules have been stopped and that no submitted job is left in the queue.

The RapidMiner AI Hub embedded broker provides a means to automatically wipe the broker's working directory during every application start. This is useful if you know that you don't need to keep pending jobs after restarts, e.g. when they're automatically added by a schedule anyway.

You should only use the wipe-on-start mechanism if it is really necessary or if your deployment has limited disk space.

Follow these steps:

  1. Ensure that there are no important pending jobs in any job queue that have not finished yet. You probably want to pause all schedules on the schedules page and wait for all pending jobs to be executed.
  2. Shut down RapidMiner AI Hub.
  3. Open the configuration file within the rapidminer-server-home/configuration/ folder and add the property broker.activemq.embeddedBroker.wipeWorkDir = true to enable the wiping functionality for every restart.
  4. Start RapidMiner AI Hub.
  5. Resume any schedule that you paused prior to restarting AI Hub.
  6. Remove the added property from your configuration file, or set it to false, if you don't wish to wipe on every restart.
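Adding the property and later switching it back to false can be sketched as follows. This is an illustration only: the function name set_property is made up, the file is assumed to use plain "key = value" lines as in the property examples in this article, and the path of the configuration file is deliberately left as a parameter.

```python
from pathlib import Path


def set_property(config_path: str, key: str, value: str) -> None:
    """Set key to value in a 'key = value' style file, appending the key if it is missing.

    Assumption: one property per line; lines starting with # or ! are comments.
    """
    path = Path(config_path)
    lines = path.read_text(encoding="utf-8").splitlines() if path.exists() else []
    new_line = f"{key} = {value}"
    replaced = False
    for i, line in enumerate(lines):
        stripped = line.strip()
        if stripped.startswith(("#", "!")):
            continue  # leave comment lines untouched
        name = stripped.split("=", 1)[0].strip()
        if "=" in stripped and name == key:
            lines[i] = new_line  # overwrite the existing entry
            replaced = True
    if not replaced:
        lines.append(new_line)
    path.write_text("\n".join(lines) + "\n", encoding="utf-8")
```

With such a helper, step 3 corresponds to setting broker.activemq.embeddedBroker.wipeWorkDir to true, and step 6 to setting it to false again, both while RapidMiner AI Hub is shut down.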

The following describes how to determine which journal files ActiveMQ is still actively using and is intended for advanced users only. Using ActiveMQ's automatic cleanup and/or the wipe-on-start functionality described above is the safer approach and should be preferred.

You can dive deep into ActiveMQ's internals by setting the property = trace in the file. The logs will then include information about which ActiveMQ queue/topic is still referencing which journal files, and therefore why they are not cleaned up automatically. Based on that, you can decide which journal files you would like to delete manually. More information can be found in the official ActiveMQ article "Why do KahaDB log files remain after cleanup".

Zip Dump does not contain the entire contents of a folder

For folders containing large files, it is very likely that the zip dump mechanism of the remote repository aborts unexpectedly and returns an incomplete zip file. This is because the underlying JBoss has a transaction timeout of 300 seconds configured by default. To overcome this issue, you need to increase the transaction timeout value:

  1. Open the file standalone.xml. It is located in configuration/ of the RapidMiner Server home directory.
  2. Locate the tag <coordinator-environment default-timeout="300" />.
  3. Increase the default-timeout value to a larger number, e.g. from 300 to 3000 seconds.
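The steps above can also be sketched as a small script. The function name set_transaction_timeout is made up for illustration, and the sketch assumes that standalone.xml contains exactly one coordinator-environment tag with a numeric default-timeout attribute, as shown in step 2.

```python
import re

# Matches the default-timeout attribute of the coordinator-environment tag.
_TIMEOUT_ATTR = re.compile(
    r'(<coordinator-environment\b[^>]*?\bdefault-timeout=")(\d+)(")'
)


def set_transaction_timeout(xml_text: str, seconds: int) -> str:
    """Return xml_text with the transaction timeout replaced by seconds.

    Raises ValueError unless exactly one matching attribute is found.
    """
    new_text, count = _TIMEOUT_ATTR.subn(
        lambda m: f"{m.group(1)}{seconds}{m.group(3)}", xml_text
    )
    if count != 1:
        raise ValueError(
            f"expected exactly one default-timeout attribute, found {count}"
        )
    return new_text
```

A plain text substitution is used here instead of an XML parser so that the rest of the file, including comments and formatting, stays unchanged. Keep a backup copy of standalone.xml before writing the result back.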

If the zip dump mechanism still aborts unexpectedly, please try increasing the timeout to a higher number.

We recommend setting the transaction timeout value to a lower number again when the zip dump mechanism is no longer needed, to avoid unwanted side effects elsewhere.