You are viewing the RapidMiner Studio documentation for version 8.1.

Retrieve (RapidMiner Studio Core)

Synopsis

This Operator accesses information stored in the Repository and loads it into the Process.

Description

The Retrieve Operator loads a RapidMiner Object into the Process. This Object is often an ExampleSet but it can also be a Collection or a Model. Retrieving data this way also provides the meta data of the RapidMiner Object.

Differentiation

This Operator is similar to the various Read <source> Operators in the Data Access group. Storing data in a Repository has the advantage that meta data properties are stored as well. Meta data gives you additional information about the RapidMiner Object you retrieve. For an ExampleSet this includes, for example, the names and types of Attributes, their ranges, and the number of missing values. Meta data also makes it easy to configure parameters of other Operators; for example, you can select Attributes from a list of available Attributes.

The data stored in the Repository can only be changed within a RapidMiner Process. Data stored on disk or within a database can be changed by other means.

Output

  • output (IOObject)

    It returns the RapidMiner Object whose path was specified in the repository entry parameter.

Parameters

  • repository_entry

    The path to the RapidMiner Object which should be loaded. This parameter references an entry in the repository, which will be returned as output of this Operator.

    Repository locations are resolved relative to the Repository folder containing the current Process. Folders in the Repository are separated by a forward slash ('/'). A '..' references the parent folder. A leading forward slash references the root folder of the Repository containing the current Process. A leading double forward slash ('//') is interpreted as an absolute path starting with the name of a Repository. The list below shows the different methods:

    • 'MyData' looks up an entry 'MyData' in the same folder as the current Process
    • '../Input/MyData' looks up an entry 'MyData' located in a folder 'Input' next to the folder containing the current Process
    • '/data/Model' looks up an entry 'Model' in a top-level folder 'data' in the Repository holding the current Process
    • '//Samples/data/Golf' looks up the Golf data set in the folder 'data' of the 'Samples' Repository.
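
The resolution rules above can be illustrated with a small Python sketch. This is not RapidMiner's implementation; the function name `resolve_repository_entry`, the Repository name 'Local', and the folder names 'projects' and 'churn' are hypothetical and chosen only for illustration. The sketch assumes that a '..' which would climb above the Repository root is an error, which the description above does not specify.

```python
def resolve_repository_entry(entry, process_repository, process_folder):
    """Resolve a repository_entry value to an absolute '//Repository/...' path.

    entry              -- value of the repository_entry parameter
    process_repository -- name of the Repository holding the current Process
    process_folder     -- list of folder names that contain the current Process
    """
    if entry.startswith("//"):
        # Absolute path: the first segment names the Repository.
        parts = [p for p in entry[2:].split("/") if p]
        if not parts:
            raise ValueError("absolute path must name a Repository")
        repository, segments = parts[0], parts[1:]
        resolved = []
    elif entry.startswith("/"):
        # Leading slash: start at the root of the Process's own Repository.
        repository = process_repository
        segments = [p for p in entry[1:].split("/") if p]
        resolved = []
    else:
        # Relative path: start in the folder containing the current Process.
        repository = process_repository
        segments = [p for p in entry.split("/") if p]
        resolved = list(process_folder)

    for segment in segments:
        if segment == "..":
            # Assumption: climbing above the Repository root is rejected.
            if not resolved:
                raise ValueError("path climbs above the Repository root")
            resolved.pop()
        else:
            resolved.append(segment)

    return "//" + "/".join([repository] + resolved)


# Hypothetical setup: the current Process lives in //Local/projects/churn
for entry in ["MyData", "../Input/MyData", "/data/Model", "//Samples/data/Golf"]:
    print(entry, "->", resolve_repository_entry(entry, "Local", ["projects", "churn"]))
```

With that hypothetical setup, the four entries map to '//Local/projects/churn/MyData', '//Local/projects/Input/MyData', '//Local/data/Model' and '//Samples/data/Golf' respectively.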

    When using the “Select the repository location” button, you can choose whether the path should be resolved relative to the current Process. Relative paths are useful when sharing Processes with others.

    Range:

Tutorial Processes

Load Example Data using the Retrieve Operator

This Process loads the Golf data set from the repository. The repository entry parameter is provided as an absolute path '//Samples/data/Golf'. Thus the Golf data set is returned from the sub-folder 'data' of the Samples repository.