Categories

Versions

You are viewing the RapidMiner Studio documentation for version 10.1 - Check here for latest version

Write Excel (RapidMiner Studio Core)

Synopsis

This operator writes ExampleSets to an Excel spreadsheet file.

Description

The Write Excel operator can be used for writing ExampleSets into a Microsoft Excel spreadsheet file. This operator creates Excel files that are readable by Excel 95, 97, 2000, XP, 2003 and newer versions. Missing data values in the ExampleSet are indicated by empty cells in the Excel file. The first row of the resultant Excel file has the names of attributes of the input ExampleSet. Files written by the Write Excel operator can be loaded in RapidMiner using the Read Excel operator. Multiple input ExampleSets can be specified and this will lead to one excel file with multiple sheets. By default the sheets will be named after their corresponding ExampleSet's source. Sheet names can optionally be specified using the sheet names parameter.

Input

  • input

    This input port expects an ExampleSet or a Collection of ExampleSets. The specified ExampleSets will become sheets in the resulting Excel file. The Write Excel operator can have multiple inputs. When one input is connected, another input port becomes available which is ready to accept another input (if any). The order of inputs remains the same in the resulting Excel file.

Output

  • file

    The created Excel file is provided as a file object that can be used with other operators with file input ports like 'Write File'.

  • through

    The ExampleSet or collection of ExampleSets that was provided at the corresponding input port is delivered through this output port without any modifications. This is usually used to reuse the same ExampleSets in further operators of the process.

Parameters

  • excel_file The path of the Excel file is specified here. It can be selected using the choose a file button. Range: string
  • file_format Allows the user to specify if the resulting excel sheet should have the xls or xlsx format. Range: selection
  • sheet_names

    Sheets names can optionally be specified here. The first sheets name specified corresponds to the first input and so on. Sheet names that have not been specified will default to the name of their ExampleSet's source.

    Compatibility level below 9.3.0: The first sheet name will be taken from the sheet name parameter. Additional sheet names can be specified here as described above.

    Range: enumeration
  • sheet_nameThe user can specify the name of the first excel sheet here. This parameter will only be shown for xlsx file format. This parameter is outdated and has been removed for versions starting from 9.3.0. Range: string
  • date_formatThis is an expert parameter. Furthermore it is shown with file format xlsx only. Format dates should be saved in. Range: string
  • number_formatThis is an expert parameter. Furthermore it is shown with file format xlsx only. Format number should be saved in. Range: string
  • encodingThis is an expert parameter. Furthermore it is shown with file format xls only. There are different options. Users can choose any of them. Range: selection

Tutorial Processes

Writing multiple data sets into one Excel file

Multiple data sets are loaded using the Retrieve operator. The Write Excel operator takes the data sets as input and writes them as separate sheets into one Excel file. The first two sheet names are specified via the sheet names parameter. Please note that the specified sheet names need to respect the restrictions that Excel puts on its sheet names. They need to be unique and not longer than 31 characters. The rest of the sheet names are not explicitly given and, therefore, these sheets will be named after their corresponding example sets' sources.

The excel file parameter is provided with this path: '%{tempdir}/Multiple data sets.xlsx'. Thus an Excel file named 'Multiple data sets' is created in a temp folder of your computer. Note that the first rows of the resulting Excel sheets have the names of the attributes of your their corresponding data sets. Also note that all missing values in the data sets are represented by empty cells in the Excel sheets.

The Write Excel operator's input ports can handle collections of example sets as input. The loop operator retrieves the Iris data set three times, stores the three data sets in a collections and hands it to the Write Excel operator. The operator then unwraps the three data sets and writes them to the resulting Excel file as separate sheets. As the three data sets will all have the same source, their names will not be unique. To resolve this the Write Excel operator automatically adds an index to the names before writing them to the file.