RapidMiner Studio documentation, version 8.0

Add Entry to Archive File (RapidMiner Studio Core)

Synopsis

This operator adds entries to an archive file object. Currently, the only available archive type is a zip file.

Description

The Add Entry to Archive File operator adds entries, i.e. files, to an archive file object created by the Create Archive File operator. By default, the entries are added to the root directory of the archive file, but you can specify a directory name to create a subdirectory inside the archive file. See the tutorial processes of this operator for examples of its usage.

Input

  • archive file

    This input port expects an archive file object, as created by the Create Archive File operator. Because this operator delivers the archive file object at its output port, the output of another Add Entry to Archive File operator can also be connected here.

  • file input (File)

    The Add Entry to Archive File operator can have multiple file inputs. When one file input port is connected, another file input port becomes available to accept a further input. Each of these ports expects a File Object, which can be created, for example, with the Open File operator.

Output

  • archive file

    This port delivers the same archive file object that was connected to the input port, with the additional entries added by this operator.

Parameters

  • directory

    This parameter specifies the directory where the entry will be stored inside the archive file. Specify it in the form 'my/sub/directory', or leave it empty to store the entry in the root folder. Range: string

  • override_compression_level

    This parameter allows you to override the default compression level of the archive file object for the entries created by this operator. The default level is set by the Create Archive File operator that created the archive file object. This is useful if you are adding pre-compressed files to the archive, such as zip files or jar files. These files cannot be compressed much further, so you can save some execution time by setting the compression level for new entries of this kind to a low value. Range: boolean

  • compression_level

    This parameter specifies the compression level of the newly created entries. In general, higher compression levels also result in a longer runtime. Range: integer
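
The effect of these parameters can be illustrated outside RapidMiner. The following is a minimal sketch using Python's standard zipfile module, not RapidMiner's own implementation. The helper names entry_name and add_entry, the file names, and the chosen levels are assumptions made for illustration only. The archive-wide level stands in for the default set by Create Archive File, and the per-entry level stands in for compression_level when override_compression_level is enabled.

```python
import io
import zipfile


def entry_name(directory: str, filename: str) -> str:
    """Join an optional archive subdirectory and a file name into an entry path."""
    directory = directory.strip("/")
    return f"{directory}/{filename}" if directory else filename


def add_entry(archive, directory, filename, data, compression_level=None):
    """Add one entry; compression_level=None keeps the archive's default level."""
    archive.writestr(
        entry_name(directory, filename),
        data,
        compress_type=zipfile.ZIP_DEFLATED,
        compresslevel=compression_level,
    )


buffer = io.BytesIO()
# The archive-wide level plays the role of the default chosen when the archive is created.
with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED, compresslevel=9) as archive:
    # Empty directory: the entry is stored in the root of the archive.
    add_entry(archive, "", "notes.txt", b"plain text " * 100)
    # Pre-compressed content: override the default with a low level to save time.
    add_entry(archive, "my/sub/directory", "library.jar",
              b"already compressed", compression_level=1)

# Reopen the archive to list the entry paths that were created.
with zipfile.ZipFile(buffer) as archive:
    names = archive.namelist()
```

Here the first entry ends up in the root of the archive and the second under my/sub/directory, which mirrors leaving the directory parameter empty versus filling it in.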

Tutorial Processes

Creating and storing a zip file

This Example Process demonstrates how a zip file can be created in RapidMiner, how entries can be added and how the file can be written to disk.

First, the zip file is created with the Create Archive File operator. Then, some entries are added: the Open File operators open some files from your hard disk, and these files are added to the zip file via the Add Entry to Archive File operators. Note that you can add several files in a single step, and that you can also chain several Add Entry to Archive File operators. Finally, the zip file is written to disk with the Write File operator.

Please be sure to select some valid files from your hard disk in the Open File operators, and to specify a valid location in the Write File operator!

The second Add Entry to Archive File operator creates a directory inside the zip file. After the execution of the process you may open the archive file from your disk and inspect the results.
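
For readers who want to compare the result with a scripted equivalent, the same sequence of steps can be sketched with Python's standard zipfile module. This is an analogy under stated assumptions, not how RapidMiner executes the process: the file names, the subdirectory name and the target name result.zip are hypothetical, and a temporary directory stands in for the locations you would choose in Open File and Write File. Unlike the process, the sketch writes the archive to disk directly instead of building an archive object first.

```python
import tempfile
import zipfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    work = Path(tmp)

    # Stand-ins for the files selected in the Open File operators.
    sources = []
    for name in ("a.txt", "b.txt", "c.txt"):
        path = work / name
        path.write_text(f"contents of {name}\n")
        sources.append(path)

    # Stand-in for the location configured in the Write File operator.
    target = work / "result.zip"

    with zipfile.ZipFile(target, "w", zipfile.ZIP_DEFLATED) as archive:
        # First step: several files are added to the root of the archive at once.
        for path in sources[:2]:
            archive.write(path, arcname=path.name)
        # Second step: one more file is placed in a directory inside the archive.
        archive.write(sources[2], arcname=f"my/sub/directory/{sources[2].name}")

    # Inspect the result, as you would by opening the archive from disk.
    with zipfile.ZipFile(target) as archive:
        entries = archive.namelist()
```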

Storing freshly created data in a zip file

This process loads the Iris data set, creates a CSV file from it and stores it in a zip file. The zip file is then saved to disk with the Write File operator. Please note that you have to set the filename parameter of the Write File operator.

When you load a file from disk with Open File, the filename is known to RapidMiner. In the current process, that is not the case, since the CSV file has been created on the fly: we must assign the name manually. This is done by defining the Filename annotation with the help of the Annotate operator.
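
The same situation arises in any tool that zips data generated in memory: the content has no file name of its own, so one has to be supplied explicitly. The sketch below shows this with Python's standard csv and zipfile modules. It is an illustration only and does not use RapidMiner or the Annotate operator. The column names, the handful of rows and the name iris.csv are assumptions chosen for the example, not the actual Iris data set shipped with RapidMiner.

```python
import csv
import io
import zipfile

# A few illustrative rows standing in for the loaded data set.
rows = [
    ["sepal_length", "sepal_width", "label"],
    ["5.1", "3.5", "Iris-setosa"],
    ["7.0", "3.2", "Iris-versicolor"],
]

# Create the CSV content on the fly; it never exists as a file on disk.
text = io.StringIO()
csv.writer(text).writerows(rows)
csv_bytes = text.getvalue().encode("utf-8")

# No file name is attached to in-memory content, so it is assigned manually.
entry_filename = "iris.csv"

buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
    archive.writestr(entry_filename, csv_bytes)

# Read the entry back to confirm it is stored under the assigned name.
with zipfile.ZipFile(buffer) as archive:
    names = archive.namelist()
    restored = archive.read(entry_filename).decode("utf-8")
```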

For more information about annotations and the related operators, please have a look at the documentation of the Annotate operator.