This documentation applies to RapidMiner Studio version 9.4.
Install the Python Scripting extension
The Python Scripting extension includes the Operator Execute Python, which makes it possible to run Python code inside a RapidMiner process. The Operator supports a variety of Python interpreters and distributions, including the popular Anaconda distribution and virtualenvwrapper environments.
Install
To install the extension, go to the Extensions menu, open the Marketplace (Updates and Extensions), and search for Python Scripting. For more detail, see Adding extensions.
Configure: choose the default Python
Even without any configuration, the extension will attempt to detect a Python environment / executable. If Anaconda is installed, it will by default use the base conda environment. Otherwise it will use the first executable it finds.
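To preview what automatic detection is likely to pick up on your machine, you can check which interpreter comes first on your system path. The following is only a minimal sketch: the function name `find_default_python` and the candidate names are illustrative assumptions, and the extension's actual detection logic is not documented here and may differ.

```python
import shutil


def find_default_python(candidates=("python3", "python")):
    """Return the path of the first candidate interpreter found on PATH, or None.

    Hypothetical helper for illustration; not part of the extension.
    """
    for name in candidates:
        path = shutil.which(name)
        if path is not None:
            return path
    return None


if __name__ == "__main__":
    print("First Python on PATH:", find_default_python())
    # If conda is on PATH, an Anaconda/Miniconda installation is probably available.
    print("conda on PATH:", shutil.which("conda") is not None)
```

If the reported interpreter is not the one you expect, set the default explicitly as described below.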
Before using the extension for the first time, you should configure its default settings: the location of your Python interpreter or the name of your virtual environment. Note, however, that you can override the default settings within the Operator Execute Python by unchecking the advanced parameter use default python.
To configure the extension with the default Python, take the following steps:
Open the Preferences dialog (on macOS, go to the RapidMiner -> Preferences... menu; on other systems, use the Settings -> Preferences... menu).
Go to the Python Scripting tab.
If the folder containing your Python interpreter (or conda/virtualenvwrapper) is not in your system path, edit Search paths and include it.
Based on your needs, set the package manager to one of the following:
If you are using Anaconda (or Miniconda), select the conda (anaconda) option from the drop-down list (this is the default). The conda environment parameter appears. From the drop-down list, select the name of the environment you want to use. Click the refresh button if you don't see your environment in the list.
If you are using virtualenvwrapper, select the virtualenvwrapper option from the drop-down list. The venvw environment parameter appears. From the drop-down list, select the name of the environment you want to use. Click the refresh button if you don't see your environment in the list.
If you are using some other Python executable/distribution, select the specific python binaries option from the drop-down list. The Python binary path parameter appears. The extension scans some commonly used directories plus the additional directories you provided in Search paths for Python executables; this scan may take some time. Select the desired Python executable from the drop-down menu. Click the refresh button if you don't see it in the list. Alternatively, locate your executable and provide the full path by typing it into the text box or by selecting it from the open file dialog.
When you click the refresh button, it may take a few seconds until the drop-down list is updated.
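If an expected executable does not show up in the list, it can help to check by hand which directories actually contain a Python interpreter. The sketch below scans a list of directories for files whose names look like a Python executable. The name pattern and the function `scan_for_pythons` are assumptions made for illustration; the directories and rules the extension itself uses may differ.

```python
import os
import re

# Assumed naming pattern: python, python3, python3.11, optionally ending in .exe
PYTHON_NAME = re.compile(r"^python(\d+(\.\d+)?)?(\.exe)?$", re.IGNORECASE)


def scan_for_pythons(search_paths):
    """Return a sorted list of executable files that are named like a Python interpreter."""
    found = set()
    for directory in search_paths:
        try:
            entries = os.listdir(directory)
        except OSError:
            # Skip directories that are missing or unreadable.
            continue
        for entry in entries:
            full = os.path.join(directory, entry)
            if (PYTHON_NAME.match(entry)
                    and os.path.isfile(full)
                    and os.access(full, os.X_OK)):
                found.add(full)
    return sorted(found)


if __name__ == "__main__":
    # Scan the directories on the system path; add your own directories as needed.
    paths = os.environ.get("PATH", "").split(os.pathsep)
    for executable in scan_for_pythons(paths):
        print(executable)
```

Any directory that contains the interpreter you want but is not on the system path is a candidate for the Search paths setting.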
Once you have selected the desired environment/executable, click the Test button. If your configuration is successful, the test result dialog confirms it.
If testing fails, try selecting a different environment/executable or address the problems shown in the dialog. Most likely, you will need to install or upgrade some Python packages in your environment/executable.
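To see which packages are missing, you can run a short check with the interpreter you selected. This is a minimal sketch: `missing_packages` is a hypothetical helper, and the list of required packages is an assumption (pandas is used as the example here); rely on the messages in the test dialog for the authoritative list.

```python
import importlib.util


def missing_packages(required):
    """Return the top-level package names in `required` that this interpreter cannot import."""
    return [name for name in required if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Assumption: adjust this list to match the packages named in the test dialog.
    required = ["pandas"]
    missing = missing_packages(required)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All listed packages can be imported.")
```

Run the script with the same executable or environment that you configured in the Preferences dialog; otherwise the result describes a different installation.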
Click OK to close the test result dialog. Click OK one more time to close the Preferences dialog and save the new settings.
Now you are ready to use the Execute Python operator.
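As a first script, it can be useful to confirm which interpreter the Operator actually runs. The sketch below assumes the operator's convention of calling a function named `rm_main` with one argument per connected input port; here no ports are connected, so the function takes no arguments and returns nothing. Treat the exact calling convention as an assumption and check the Operator's help text.

```python
import platform
import sys


def rm_main():
    # Assumed entry point called by Execute Python; no input or output ports are connected.
    # The printed text is expected to appear in RapidMiner's log output.
    print("Interpreter:", sys.executable)
    print("Python version:", platform.python_version())
```

The printed path should match the environment or executable you selected in the Preferences dialog.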
Execute Python: override the default Python
The Python Scripting extension supports conda and virtualenvwrapper virtual environments. Virtual environments make package management simpler when you are working on multiple projects simultaneously. The Operator Execute Python normally uses the default environment defined in the Preferences dialog, but you can use any environment you like by unchecking the parameter use default python in the Parameters Panel. Then follow the instructions above to select a non-default environment for Execute Python.
Virtual environments also make processes more portable: just make sure that the same Python environment, with the same packages installed, is available on the other instances of RapidMiner Studio or RapidMiner Server.
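One way to check that two machines have matching environments is to take a snapshot of the installed packages on each and compare the results. The sketch below uses the standard library's `importlib.metadata` (Python 3.8 or later); the function names `installed_packages` and `diff_environments` are illustrative and not part of RapidMiner.

```python
import json
from importlib import metadata


def installed_packages():
    """Map each installed distribution name (lowercased) to its version."""
    packages = {}
    for dist in metadata.distributions():
        name = dist.metadata["Name"]
        if name:
            packages[name.lower()] = dist.version
    return packages


def diff_environments(local, remote):
    """Return {name: (local_version, remote_version)} for every package that differs."""
    names = set(local) | set(remote)
    return {
        name: (local.get(name), remote.get(name))
        for name in sorted(names)
        if local.get(name) != remote.get(name)
    }


if __name__ == "__main__":
    # Save this output on each machine, then compare the files with diff_environments.
    print(json.dumps(installed_packages(), indent=2, sort_keys=True))
```

A version of `None` in the comparison means the package is missing on that side. For conda or virtualenvwrapper environments, the package manager's own export and install commands are the usual way to reproduce an environment once the differences are known.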