RapidMiner Studio documentation, version 8.0

Execute Program (Productivity)

Synopsis

This operator executes a system command or an external program on the underlying operating system.

Description

This operator executes a system command. The command and all its arguments are specified by the command parameter. Please note that the command is system dependent. The standard output stream of the process can be redirected to the log file by enabling the log stdout parameter. The standard error stream of the process can be redirected to the log file by enabling the log stderr parameter.
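The separation of the two output streams can be pictured with plain Java. The following is only a sketch: the names StreamCapture, Result and run are made up for illustration, the output is assumed to be UTF-8 text, Java 11 or later is assumed, and nothing here is taken from the operator's actual implementation.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

/** Hypothetical helper for illustration; not part of RapidMiner. */
final class StreamCapture {

    /** Exit code plus the text written to each stream. */
    static final class Result {
        final int exitCode;
        final String stdout;
        final String stderr;

        Result(int exitCode, String stdout, String stderr) {
            this.exitCode = exitCode;
            this.stdout = stdout;
            this.stderr = stderr;
        }
    }

    /** Runs the command and collects stdout and stderr separately. */
    static Result run(List<String> command) throws IOException, InterruptedException {
        // Temporary files avoid the blocking that can occur when a pipe buffer fills up.
        Path out = Files.createTempFile("stdout", ".log");
        Path err = Files.createTempFile("stderr", ".log");
        try {
            Process process = new ProcessBuilder(command)
                    .redirectOutput(out.toFile())
                    .redirectError(err.toFile())
                    .start();
            int exitCode = process.waitFor();
            // Assumption: the command writes UTF-8 (or plain ASCII) text.
            return new Result(exitCode,
                    new String(Files.readAllBytes(out), StandardCharsets.UTF_8),
                    new String(Files.readAllBytes(err), StandardCharsets.UTF_8));
        } finally {
            Files.deleteIfExists(out);
            Files.deleteIfExists(err);
        }
    }

    public static void main(String[] args) throws Exception {
        // Launch the JVM that runs this sketch, so no PATH lookup is needed.
        String javaBinary = Path.of(System.getProperty("java.home"), "bin", "java").toString();
        Result result = run(List.of(javaBinary, "-version"));
        System.out.println("exit code: " + result.exitCode);
        System.out.println("stdout: " + result.stdout);
        System.out.println("stderr: " + result.stderr);
    }
}
```

The java launcher writes its -version report to stderr rather than stdout, so in this run the version text should appear in the stderr field.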

On Windows / MS DOS, simple commands should be preceded by a 'cmd /c' call, e.g. 'cmd /c notepad'. Writing just 'notepad' in the command parameter also works if you are executing a program rather than a shell command. Windows then opens a new shell, executes the command, and closes the shell again. However, Windows 7 may not open a new shell and may just execute the command. Another option is to precede the command with 'cmd /c start', which opens the shell and keeps it open; the rest of the process is not executed until the user closes the shell. Please study the attached Example Processes for more information.
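If the same command has to work on several platforms, the 'cmd /c' prefix can be added conditionally. The sketch below shows one way to do this in plain Java; WindowsCommand, isWindows and forPlatform are hypothetical names, and the check relies on the os.name system property starting with "Windows" on Windows systems.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

/** Hypothetical helper for illustration; not part of RapidMiner. */
final class WindowsCommand {

    /** Returns true if the given os.name value denotes a Windows system. */
    static boolean isWindows(String osName) {
        return osName.toLowerCase(Locale.ROOT).startsWith("windows");
    }

    /**
     * Prefixes the tokens with "cmd /c" on Windows, so that the command is
     * handed to the Windows command interpreter. Other systems get the
     * tokens unchanged.
     */
    static List<String> forPlatform(String osName, String... tokens) {
        List<String> command = new ArrayList<>();
        if (isWindows(osName)) {
            command.add("cmd");
            command.add("/c");
        }
        command.addAll(Arrays.asList(tokens));
        return command;
    }

    public static void main(String[] args) {
        String osName = System.getProperty("os.name");
        // Only builds and prints the command; nothing is executed here.
        System.out.println(forPlatform(osName, "java", "-version"));
    }
}
```

For example, forPlatform("Windows 7", "notepad") yields the tokens cmd, /c, notepad, while forPlatform("Linux", "ls", "-l") leaves the tokens as they are.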

CAUTION: Due to a Java bug on Windows / MS DOS, the operator is only able to stop first-level child processes. For example, when stopping an operator that was started with a command containing a preceding 'cmd /c', only the direct child process (the shell) is closed; processes started by this shell keep running, detached from RapidMiner.

The Java ProcessBuilder is used for building and executing the command. Characters that have a special meaning in the shell, e.g. the pipe symbol, brackets, and braces, have no special meaning to Java. Please note that this Java method parses the string into tokens before it is executed; these tokens are not interpreted by a shell. If the desired command involves piping, redirection, or other shell features, it is best to create a small shell script to handle this.
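The difference between tokens and a shell command line can be illustrated as follows. This is a sketch under assumptions: the whitespace splitting shown here is a stand-in and not necessarily the tokenizer the operator uses, the names TokenSketch, tokenize and viaPosixShell are hypothetical, and the 'sh -c' variant assumes a POSIX system (Java 9 or later for List.of).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

/** Hypothetical helper for illustration; not part of RapidMiner. */
final class TokenSketch {

    /** Splits on whitespace only; quotes, pipes and braces get no special treatment. */
    static List<String> tokenize(String commandLine) {
        List<String> tokens = new ArrayList<>();
        StringTokenizer tokenizer = new StringTokenizer(commandLine);
        while (tokenizer.hasMoreTokens()) {
            tokens.add(tokenizer.nextToken());
        }
        return tokens;
    }

    /** Hands the whole line to a POSIX shell, which then interprets "|", ">" and so on. */
    static List<String> viaPosixShell(String commandLine) {
        return List.of("sh", "-c", commandLine);
    }

    public static void main(String[] args) {
        String line = "echo b a | sort";
        // The pipe symbol is just another argument for echo here.
        System.out.println(tokenize(line));      // prints [echo, b, a, |, sort]
        // Here the shell receives the line as one argument and sets up the pipe itself.
        System.out.println(viaPosixShell(line)); // prints [sh, -c, echo b a | sort]
    }
}
```

Passing the first list to ProcessBuilder would make echo print the pipe symbol literally. The second list is the inline equivalent of the small shell script recommended above.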

Input

  • in (File)

    A file object sent to this port will be piped to the standard input (stdin) of the process.

  • through (IOObject)

    It is not compulsory to connect any object with this port. Any object connected at this port is delivered without any modifications to the output port. This operator can have multiple inputs. When one input is connected, another through input port becomes available which is ready to accept another input (if any). The order of inputs remains the same. The object supplied at the first through input port of the Execute Program operator is available at the first through output port.

Output

  • out (File)

    If connected, the standard output stream (stdout) generated by this process will be delivered as a file object.

  • err (File)

    If connected, the standard error stream (stderr) generated by this process will be delivered as a file object.

  • through (IOObject)

    The objects that were given as input are passed unchanged to the output through this port. It is not compulsory to connect this port to any other port; the command is executed even if this port is left unconnected. The Execute Program operator can have multiple outputs. When one output is connected, another through output port becomes available which is ready to deliver another output (if any). The order of outputs remains the same. The object delivered at the first through input port of the Execute Program operator is delivered at the first through output port.

Parameters

  • command

    This parameter specifies the command to be executed. Range: string

  • log_stdout

    If set to true, the stdout stream (standard output stream) of the command is redirected to the log file. Only available if the out port is not connected. Range: boolean

  • log_stderr

    If set to true, the stderr stream (standard error stream) of the command is redirected to the log file. Only available if the err port is not connected. Range: boolean

  • working_directory

    Defines the working directory for the command. If no working directory is defined, the working directory of the current RapidMiner process is used. Range: string

  • env_variables

    Allows setting environment variables for the specified command. If an environment variable is defined multiple times, the last definition is used. Range: list
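The working_directory and env_variables parameters correspond to settings that Java's ProcessBuilder offers directly. The sketch below shows how such settings could be applied; LaunchSettings, configure and the DEMO_MODE variable are hypothetical, and the code is not taken from the operator itself (Java 9 or later for List.of).

```java
import java.io.File;
import java.util.List;
import java.util.Map;

/** Hypothetical helper for illustration; not part of RapidMiner. */
final class LaunchSettings {

    /**
     * Applies an optional working directory and a list of name/value pairs.
     * Later pairs overwrite earlier ones with the same name.
     */
    static ProcessBuilder configure(List<String> command, String workingDirectory,
                                    List<String[]> envVariables) {
        ProcessBuilder builder = new ProcessBuilder(command);
        if (workingDirectory != null && !workingDirectory.isEmpty()) {
            builder.directory(new File(workingDirectory));
        }
        // With no directory set, the child inherits the working directory of this JVM.
        Map<String, String> environment = builder.environment();
        for (String[] pair : envVariables) {
            environment.put(pair[0], pair[1]);
        }
        return builder;
    }

    public static void main(String[] args) {
        // Only configures the builder; the command is not started.
        ProcessBuilder builder = configure(
                List.of("java", "-version"),
                System.getProperty("java.io.tmpdir"),
                List.of(new String[] {"DEMO_MODE", "fast"}, new String[] {"DEMO_MODE", "safe"}));
        System.out.println(builder.environment().get("DEMO_MODE")); // prints safe
        System.out.println(builder.directory());
    }
}
```

Because the pairs are written into a map in list order, the second DEMO_MODE entry replaces the first, which matches the "last definition is used" rule described for env_variables.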

Tutorial Processes

Introduction to the Execute Program operator

This Example Process uses the Execute Program operator to execute commands in the shell of Windows 7. Two Execute Program operators are used. The command parameter of the first is set to 'cmd /c java -version', and that of the second to 'cmd /c notepad'. When the process is executed, the Java version is first written to the log window. Then Notepad opens, and the process waits until the user closes it before proceeding. Please note that setting the command parameter to just 'notepad' would also have worked here.

Opening Internet Explorer by the Execute Program operator

This Example Process uses the Execute Program operator to open the Internet Explorer browser by using the shell commands of Windows 7. The command parameter of the Execute Program operator is set to 'cmd /c start C:\"Program Files"\"Internet Explorer"\"iexplore.exe"'. When the process is executed, Internet Explorer opens. The process waits until the user closes the browser and then proceeds.

Piping data through a shell command

This process demonstrates how data can be streamed into and out of the executed command. In particular, we open the Iris data set and write it as a CSV file into an in-memory File Object. This buffer is then passed to the Execute Program operator, which executes the "sort" command. The command sorts the input and returns it at the out port. Another Read CSV operator parses the sorted output.
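Outside of RapidMiner, the same stdin-to-stdout flow can be sketched in plain Java. PipeThroughSort and pipe are hypothetical names, the three input lines are made-up sample data rather than the tutorial's CSV, and the code assumes a "sort" program on the search path, UTF-8 text, and Java 9 or later.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

/** Hypothetical helper for illustration; not part of RapidMiner. */
final class PipeThroughSort {

    /** Feeds the input to the command's stdin and returns what it wrote to stdout. */
    static String pipe(List<String> command, String input)
            throws IOException, InterruptedException {
        // stdout goes to a temporary file so that writing stdin cannot block on a full pipe.
        Path out = Files.createTempFile("sorted", ".txt");
        try {
            Process process = new ProcessBuilder(command)
                    .redirectOutput(out.toFile())
                    .redirectError(ProcessBuilder.Redirect.INHERIT)
                    .start();
            // Closing stdin (end of the try block) signals end of input to the command.
            try (OutputStream stdin = process.getOutputStream()) {
                stdin.write(input.getBytes(StandardCharsets.UTF_8));
            }
            process.waitFor();
            return new String(Files.readAllBytes(out), StandardCharsets.UTF_8);
        } finally {
            Files.deleteIfExists(out);
        }
    }

    public static void main(String[] args) throws Exception {
        // Made-up sample input: three lines in unsorted order.
        System.out.print(pipe(List.of("sort"), "virginica\nsetosa\nversicolor\n"));
    }
}
```

Here the string plays the role of the file object at the in port, and the returned text corresponds to what the operator would deliver at the out port.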