This is the RapidMiner Studio documentation for version 10.0.

Execute R (R Scripting)

Synopsis

Executes an R script.

Description

Before using this operator you need to specify the path to your R installation under Tools -> Preferences -> R Scripting. Your R installation has to include the 'data.table' package.
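
Before configuring the preference, it can help to confirm from an R session where the Rscript executable lives and whether 'data.table' is installed. The following is a minimal sketch using only base R functions; it assumes the R session you run it in belongs to the installation you intend to point RapidMiner at.

```r
# Locate the Rscript executable that belongs to the running R installation.
rscript_name <- if (.Platform$OS.type == "windows") "Rscript.exe" else "Rscript"
rscript_path <- file.path(R.home("bin"), rscript_name)
cat("Rscript executable:", rscript_path, "\n")

# Check whether the required 'data.table' package is available.
has_data_table <- requireNamespace("data.table", quietly = TRUE)
if (!has_data_table) {
  message("Package 'data.table' is missing; install.packages(\"data.table\") would add it.")
}
```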

This operator executes either a script file, provided through the script file port or the script_file parameter, or the script entered in the script parameter. The arguments of the script correspond to the input ports, where example sets are converted to data frames. Analogously, the values returned by the script are delivered at the output ports of the operator, where data frames are converted to example sets.

The console output of R is shown in the Log View (View -> Show View -> Log).
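
As an illustration of this flow, here is a minimal sketch of a script with one input and one output. The function name 'rm_main' comes from the script parameter description; the column names and the sample data frame are made up for the example, and the last two lines only simulate the call that the operator would make.

```r
# Entry point: one argument per connected input port.
rm_main <- function(data) {
  # 'data' arrives as a data frame converted from the incoming example set.
  numeric_cols <- vapply(data, is.numeric, logical(1))
  data$row_total <- rowSums(data[, numeric_cols, drop = FALSE])

  # Anything printed here is console output, which the operator shows in the Log View.
  print(summary(data))

  # The returned data frame is converted back to an example set.
  return(data)
}

# Stand-in call for trying the script outside RapidMiner (hypothetical data).
example <- data.frame(a = c(1, 2, 3), b = c(10, 20, 30), name = c("x", "y", "z"))
result <- rm_main(example)
```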

Input

  • script file (File)

    A file containing an R script to be executed. The file has to comply with the script parameter rules. This port is optional; a file can also be provided through the script file parameter.

  • input

    The Execute R operator can have multiple inputs. An input must be an example set, a file object, or an R object that was generated by an 'Execute R' operator.

Output

  • output

    The Execute R operator can have multiple outputs. An output can be an example set, a file object, or an R object generated by this operator.

Parameters

  • script

    The R script to execute. Define a function named 'rm_main' with as many arguments as there are connected input ports, or use the ellipsis argument ('...') to accept a dynamic number of inputs. The return values of 'rm_main' are delivered to the connected output ports. Return values of type data frame are converted to example sets and files are converted to File Objects; other R objects are serialized and can be used by other 'Execute R' operators or stored in the repository. Serialized R objects have to be smaller than 2 GB.

    If you pass an example set to your script through an input port, the meta data of the example set (types and roles) is available in the script. You can access it through the metaData list object in R. The names of the top-level components in the list are identical to the argument names of the rm_main() function. Each component contains the names of all attributes of that input argument, together with each attribute's type and role. To access or change a specific meta data entry, use metaData$inputArgument$attributeName$type or metaData$inputArgument$attributeName$role. Please note that changes to the meta data have to be made with the 'superassignment' operator <<-.

    For more information about the meta data handling in an R operator check the tutorial process 'Meta data handling' below.

    If a script file is provided either through the script file port or parameter (port takes precedence), that script will be used instead of the value of this parameter.

    Range: text
  • script_file

    A file containing an R script to be executed. The file has to comply with the script parameter rules. This parameter is optional.

    Range: filename
  • use_default_R

    Use the default R executable defined in Preferences if checked. If unchecked, you can use a custom executable in this operator.

    Range: boolean
  • Rscript_executable

    Path to Rscript executable. Under Windows you have to specify the path to 'Rscript.exe' including 'Rscript.exe' itself. On Linux or Mac you have to specify your Rscript executable.

    Range: filename
  • use_default_R_LIBS_paths

    Use the additional R_LIBS paths defined in Preferences if checked. If unchecked, you can override the global setting just for this operator.

    Range: boolean
  • R_LIBS_paths

    A list of paths to search for additional R packages. A path may be a relative path, in which case it must be relative to the used Rscript executable path.

    Range: enumeration
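
The ellipsis form of 'rm_main' mentioned under the script parameter can be sketched as follows. This is an illustrative example, not the operator's reference usage: it assumes that every connected input is an example set with the same columns, and the two data frames at the end are hypothetical stand-ins for inputs that the operator would pass.

```r
# Entry point that accepts any number of connected input ports.
rm_main <- function(...) {
  inputs <- list(...)

  # Keep only the inputs that arrived as data frames (converted example sets).
  frames <- Filter(is.data.frame, inputs)

  # Stack them row-wise; this assumes all frames share the same columns.
  combined <- do.call(rbind, frames)
  return(combined)
}

# Stand-in call for trying the script outside RapidMiner (hypothetical data).
first <- data.frame(id = 1:3, value = c(0.5, 1.5, 2.5))
second <- data.frame(id = 4:5, value = c(3.5, 4.5))
combined <- rm_main(first, second)
```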

Tutorial Processes

Training and applying a linear model in R

The polynomial data set is split into two parts. The first part is used by the 'Execute R' operator to train a linear model in R. The calculated model is passed on to the second 'Execute R' operator and applied there to the second part of the data set.
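
The two scripts in such a process might look roughly like the sketch below. It is not the shipped tutorial script. In RapidMiner each function would be named 'rm_main' inside its own operator; the names train_main and apply_main, the synthetic data, and the column name 'label' are assumptions made so that the example runs on its own.

```r
# Synthetic stand-in for the polynomial data set (hypothetical columns).
set.seed(42)
data <- data.frame(x1 = runif(100), x2 = runif(100))
data$label <- 3 * data$x1 - 2 * data$x2 + rnorm(100, sd = 0.1)
train_part <- data[1:70, ]
test_part <- data[71:100, ]

# Script of the first operator: fit a linear model and return it as an R object.
train_main <- function(train_data) {
  lm(label ~ ., data = train_data)
}

# Script of the second operator: receive the model and score the remaining data.
apply_main <- function(model, test_data) {
  test_data$prediction <- predict(model, newdata = test_data)
  test_data
}

model <- train_main(train_part)
scored <- apply_main(model, test_part)
```

Returning the fitted model from the first function corresponds to the "other R objects are serialized" case described under the script parameter.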

Generating probability density functions for different probability functions in R

This script generates sample points for some statistical density functions and returns them as an example set.
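
A script of this kind could be sketched as follows. The choice of distributions, the grid of sample points, and the column names are assumptions for illustration; the shipped tutorial may use different ones.

```r
# No input ports are connected, so rm_main takes no arguments.
rm_main <- function() {
  x <- seq(-4, 4, length.out = 81)

  # One column per density, evaluated at the same sample points.
  data.frame(
    x = x,
    normal = dnorm(x),
    student_t = dt(x, df = 3),
    cauchy = dcauchy(x)
  )
}

# Stand-in call for trying the script outside RapidMiner.
densities <- rm_main()
```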

Reading an example set from a file using R

This tutorial process uses the 'Execute R' operator to save example data in a CSV file. The second 'Execute R' operator receives this file, reads the data and returns a part of the data to the output port. The result is an example set.
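
The write and read steps can be sketched in plain R as shown below. This documentation does not spell out how a file object is handed to or returned from 'rm_main', so the sketch simply passes a file path between two hypothetical functions (write_main and read_main); treat that hand-over as an assumption rather than the operator's actual mechanism.

```r
# Hypothetical first script: save the incoming data as a CSV file.
write_main <- function(data, path) {
  write.csv(data, file = path, row.names = FALSE)
  path
}

# Hypothetical second script: read the file and return only part of the data.
read_main <- function(path) {
  all_rows <- read.csv(path)
  head(all_rows, 5)
}

# Stand-in calls for trying the scripts outside RapidMiner (hypothetical data).
example <- data.frame(id = 1:10, value = seq(0.1, 1.0, by = 0.1))
csv_path <- write_main(example, tempfile(fileext = ".csv"))
subset_rows <- read_main(csv_path)
```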

Meta data handling

This tutorial process shows how to access the meta data of incoming example sets inside an 'Execute R' operator. It also explains how to set the meta data for the outgoing example sets.
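
The access pattern described under the script parameter (metaData$inputArgument$attributeName$role, changed with <<-) can be sketched as follows. Inside the operator the metaData list is provided for you; the list built at the top of this example is only a stand-in so that the code runs outside RapidMiner, and its attribute names and the type and role strings are hypothetical placeholders.

```r
# Stand-in for the list that the operator provides (placeholder values).
metaData <- list(
  data = list(
    age = list(type = "integer", role = "regular"),
    churn = list(type = "nominal", role = "regular")
  )
)

rm_main <- function(data) {
  # Read a meta data entry: the top-level name matches the argument name 'data'.
  print(metaData$data$churn$type)

  # Change a meta data entry; the superassignment operator is required
  # so that the change is made to the outer metaData object.
  metaData$data$churn$role <<- "label"

  return(data)
}

# Stand-in call for trying the script outside RapidMiner (hypothetical data).
example <- data.frame(age = c(31L, 45L), churn = c("yes", "no"))
result <- rm_main(example)
```

With a plain assignment (<-) inside the function, R would modify a local copy of metaData and the change would be lost when the function returns.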