You are viewing the RapidMiner Python documentation for version 10.0.

Execute Python code

Here are the basic features of the extension. Make sure to explore the tutorial processes provided with the Execute Python operator as well. Other operators (Python Learner and Python Transformer) will be presented on the custom operators page.

How things work

It's important to understand how data is transferred between RapidMiner operators and the operators provided by the Python Scripting extension, that is, what happens when you connect the port of any RapidMiner operator to one of the Python operators (Execute Python, Python Learner, or Python Transformer).

When you pass data to one of the Python operators, RapidMiner ExampleSets are automatically transformed into Pandas DataFrames. The Pandas DataFrames returned by your rm_main function (see "How to structure your code" below) are automatically converted back to RapidMiner ExampleSets by the Python Scripting extension. Metadata propagation and automatic data type conversion are also in place in both directions.

How to structure your code

To execute your Python code inside RapidMiner, declare an rm_main function as the main entry point of your code. The number and order of the input parameters and return values of rm_main correspond to the input and output ports of the Execute Python operator.

You have to follow this convention regardless of whether you use the inline editor or embed a Python script or notebook file.
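The convention above can be sketched as follows. This is a minimal illustration, not code taken from the extension: the column names (customer_id, name, amount) and the two-input, two-output layout are assumptions chosen for the example. Inside RapidMiner, the two parameters would be filled from the first and second input ports, and the two returned DataFrames would be delivered to the first and second output ports.

```python
import pandas as pd


def rm_main(customers, orders):
    """Entry point called by Execute Python.

    Two parameters  -> two connected input ports (in this order).
    Two return values -> two output ports (in this order).
    The column names used here are hypothetical.
    """
    # Attach every order to its customer; customers without orders are kept.
    merged = customers.merge(orders, on="customer_id", how="left")

    # Second result: total order amount per customer.
    totals = merged.groupby("customer_id", as_index=False)["amount"].sum()

    return merged, totals


if __name__ == "__main__":
    # Local smoke test outside RapidMiner, using hand-made DataFrames.
    demo_customers = pd.DataFrame({"customer_id": [1, 2], "name": ["Ann", "Bob"]})
    demo_orders = pd.DataFrame({"customer_id": [1, 1, 2], "amount": [10.0, 5.0, 7.5]})
    merged_df, totals_df = rm_main(demo_customers, demo_orders)
    print(merged_df)
    print(totals_df)
```

Because rm_main is an ordinary function, the same file can be developed and tested in a regular Python environment before it is used in a process.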

Running scripts

You can execute your Python code either by editing it inline in the built-in script editor (which provides basic syntax highlighting but lacks the more powerful features of a Python IDE), or by specifying a script file in the Execute Python operator's script file parameter. If your script is stored in a location accessible via the internet (such as GitHub), you can also read the script file directly from there with the help of the Open File operator.

You can also store your script file in your RapidMiner project or repository.

As a convenience feature, if you drag and drop a .py or .ipynb file from your project or repository to the canvas, the correct operators will be automatically created for you.

Running notebooks

You can also execute .ipynb notebooks with Execute Python. In this case, use the script file parameter of the operator to locate your notebook. The same considerations on how to structure code apply to notebooks as to Python scripts.

If you have tagged your notebook cells, you can use selective, tag-based execution. One way to do this is to click the Show Preview... button on the Execute Python operator (once you have added your notebook to the script file parameter or connected it to the first input port) and pick which cells to exclude from the execution. Alternatively, you can specify which cells to execute by providing a regular expression in the notebook cell tag filter parameter.
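To get a feel for which cells a given regular expression would pick, the selection can be approximated outside RapidMiner. The sketch below relies on the standard notebook JSON layout, where tags are stored under each cell's metadata. The helper select_cells, the tag names, and the choice of full-match semantics are assumptions made for illustration; the extension's actual matching rules may differ.

```python
import re

# A hand-written stand-in for the parsed JSON of an .ipynb file.
# Tag names ("rm_prepare", "explore", "rm_main") are hypothetical.
notebook = {
    "cells": [
        {"cell_type": "code", "metadata": {"tags": ["rm_prepare"]},
         "source": "import pandas as pd"},
        {"cell_type": "code", "metadata": {"tags": ["explore"]},
         "source": "df.describe()"},
        {"cell_type": "code", "metadata": {"tags": ["rm_main"]},
         "source": "def rm_main(data):\n    return data"},
        {"cell_type": "code", "metadata": {},
         "source": "print('untagged cell')"},
    ]
}


def select_cells(nb, tag_pattern):
    """Return the cells that carry at least one tag fully matching tag_pattern.

    Full-match semantics are an assumption of this sketch.
    """
    regex = re.compile(tag_pattern)
    selected = []
    for cell in nb.get("cells", []):
        tags = cell.get("metadata", {}).get("tags", [])
        if any(regex.fullmatch(tag) for tag in tags):
            selected.append(cell)
    return selected


if __name__ == "__main__":
    for cell in select_cells(notebook, r"rm_.*"):
        print(cell["metadata"]["tags"], "->", cell["source"].splitlines()[0])
```

A consistent tag prefix for the cells meant to run inside RapidMiner keeps the filter expression short and leaves exploratory cells out of the process.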

Using RapidMiner macros

Macros added to the Python code inline with the %{myMacro} syntax are resolved before the script is executed, both for an inline script and for one provided as a script file. The drawback is that such code only runs inside RapidMiner; anywhere else it produces a syntax error.
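The effect can be reproduced with plain Python. In the sketch below, the raw script line fails to compile because %{myMacro} is not valid Python, while the same line compiles once the placeholder is replaced by a value. The substitute_macros helper is a hypothetical stand-in that mimics the substitution with a simple text replacement; it is not the extension's implementation.

```python
import re

# A script line as it would be written in the inline editor.
RAW_SCRIPT = "threshold = %{myMacro}\n"


def compiles(source):
    """Return True if the source parses as Python, False on a SyntaxError."""
    try:
        compile(source, "<script>", "exec")
        return True
    except SyntaxError:
        return False


def substitute_macros(source, macros):
    """Hypothetical stand-in: replace each %{name} with its value as text."""
    return re.sub(r"%\{(\w+)\}", lambda m: str(macros[m.group(1)]), source)


if __name__ == "__main__":
    print("raw script compiles:", compiles(RAW_SCRIPT))
    resolved = substitute_macros(RAW_SCRIPT, {"myMacro": "0.5"})
    print("resolved script:", resolved.strip())
    print("resolved script compiles:", compiles(resolved))
```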

Another, more Pythonic way to tackle this is to select the enable macros parameter on your Execute Python operator. You then need to add an extra parameter to your rm_main function, through which the macros are accessible during execution. This allows you not only to read macro values, but also to define new macros or overwrite the values of existing ones.
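A minimal sketch of this approach follows. It assumes that the macros arrive as a dictionary-like object with string values in the last parameter of rm_main; the macro names (threshold, row_count) and the score column are hypothetical. Check the operator's tutorial processes for the exact behavior.

```python
import pandas as pd


def rm_main(data, macros):
    """Filter rows by a threshold read from a macro and report back a count.

    Assumption: `macros` behaves like a dict of macro name -> string value.
    """
    # Read an existing macro, falling back to a default if it is not set.
    threshold = float(macros.get("threshold", "0.5"))

    filtered = data[data["score"] >= threshold]

    # Define a new macro (or overwrite an existing one) for later operators.
    macros["row_count"] = str(len(filtered))

    return filtered


if __name__ == "__main__":
    # Local run with a plain dict standing in for the macros object.
    demo_macros = {"threshold": "0.6"}
    demo_data = pd.DataFrame({"score": [0.2, 0.7, 0.9]})
    print(rm_main(demo_data, demo_macros))
    print(demo_macros)
```

Unlike the inline %{myMacro} syntax, this version is valid Python on its own, so it can also be run and tested outside RapidMiner by passing an ordinary dictionary.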