API changes in RapidMiner 7.3

RapidMiner 7.3 brings two changes that affect the development of extensions. First, a central API for the creation of data sets (ExampleSet instances) was introduced. Second, the ExampleSet interface gained a cleanup() method that lets implementations free unused data.

These changes only affect you if your extension includes operators that generate new data sets or defines its own ExampleSets (e.g., custom views).

Generating data sets

RapidMiner 7.3 adds the ExampleSets class, which provides a set of static methods for building new data sets. These methods replace direct instantiation of ExampleTable implementations such as MemoryExampleTable. In particular, all public constructors of the MemoryExampleTable class have been deprecated.

The new API provides methods to create data sets from both columnar and row-oriented data:

import com.rapidminer.example.Attribute;
import com.rapidminer.example.Attributes;
import com.rapidminer.example.ExampleSet;
import com.rapidminer.example.table.AttributeFactory;
import com.rapidminer.example.table.BinominalMapping;
import com.rapidminer.example.table.NominalMapping;
import com.rapidminer.example.utils.ExampleSetBuilder;
import com.rapidminer.example.utils.ExampleSets;
import com.rapidminer.operator.Operator;
import com.rapidminer.operator.OperatorDescription;
import com.rapidminer.operator.OperatorException;
import com.rapidminer.tools.Ontology;

// create example set using column fillers
Attribute topTen = AttributeFactory.createAttribute("Top Ten Numbers", Ontology.INTEGER);
Attribute coinFlip = AttributeFactory.createAttribute("Coin Flip", Ontology.BINOMINAL);

NominalMapping coin = new BinominalMapping();
int heads = coin.mapString("Heads");
int tails = coin.mapString("Tails");
coinFlip.setMapping(coin);

ExampleSet numbers = ExampleSets.from(topTen, coinFlip)
    .withRole(topTen, Attributes.ID_NAME)
    .withBlankSize(10)
    .withColumnFiller(topTen, i -> i + 1)
    .withColumnFiller(coinFlip, i -> Math.random() < 0.5 ? heads : tails)
    .build();

// create example set from double matrix
ExampleSetBuilder builder = ExampleSets.from(AttributeFactory.createAttribute(Ontology.REAL),
    AttributeFactory.createAttribute(Ontology.REAL),
    AttributeFactory.createAttribute(Ontology.REAL));

builder.withExpectedSize(10);

double[][] rawData = new double[10][3];
for (double[] row : rawData) {
    builder.addRow(row);
}

ExampleSet matrix = builder.build();

Freeing unused resources

The ExampleSet interface has been extended by the cleanup() method. RapidMiner will invoke this method at certain points during process execution, e.g., in between operators. Note that the default implementation does nothing.

/**
 * Frees unused resources, if supported by the implementation. Does nothing by default.
 *
 * Should only be used on freshly {@link #clone}ed {@link ExampleSet}s to ensure that the
 * cleaned up resources are not requested afterwards.
 *
 * @since 7.3
 */
public default void cleanup() {
    // does nothing by default
}

When implementing custom example sets that manage their own resources, please use this method to free unused data such as temporary attributes.
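
As a rough illustration of this idea, the following self-contained sketch models a resource-owning data set without depending on RapidMiner. CleanableSet is a hypothetical stand-in for the cleanup() part of the ExampleSet interface, and ColumnStore, its column map, and the isTemporary flag are invented for this example. A real implementation would track its temporary attributes in whatever way fits its own storage.

```java
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for the cleanup() part of the ExampleSet interface.
interface CleanableSet {
    default void cleanup() {
        // does nothing by default
    }
}

// Hypothetical data set that owns its column data (not a RapidMiner class).
class ColumnStore implements CleanableSet {
    private final Map<String, double[]> columns = new LinkedHashMap<>();
    private final Set<String> temporary = new HashSet<>();

    // Registers a column and remembers whether it is only needed temporarily.
    void addColumn(String name, double[] values, boolean isTemporary) {
        columns.put(name, values);
        if (isTemporary) {
            temporary.add(name);
        }
    }

    boolean hasColumn(String name) {
        return columns.containsKey(name);
    }

    int columnCount() {
        return columns.size();
    }

    @Override
    public void cleanup() {
        // Release the data of all temporary columns and forget their names.
        columns.keySet().removeAll(temporary);
        temporary.clear();
    }
}

class OwnedResourcesDemo {
    public static void main(String[] args) {
        ColumnStore store = new ColumnStore();
        store.addColumn("age", new double[] {31, 45}, false);
        store.addColumn("age_scaled_tmp", new double[] {0.31, 0.45}, true);

        store.cleanup();
        System.out.println(store.columnCount()); // prints 1
    }
}
```

Calling cleanup() a second time is harmless in this sketch because the set of temporary names is cleared after the first call.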

If you do not manage your own resources, but implement a custom ExampleSet that acts as a view on top of another data set, please delegate the call accordingly.

For instance, most of RapidMiner's view implementations reference a single parent example set. Thus, their implementation of cleanup() boils down to:

@Override
public void cleanup() {
    parent.cleanup();
}
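
To see why the delegation matters, consider the following self-contained sketch. It does not use RapidMiner classes: CleanableData is a hypothetical stand-in for the cleanup() part of the ExampleSet interface, and RootData, ParentView, and ForgetfulView are invented for illustration. The sketch shows that a chain of delegating views forwards the call to the underlying data, while a single view that keeps the default no-op silently stops it.

```java
// Hypothetical stand-in for the cleanup() part of the ExampleSet interface.
interface CleanableData {
    default void cleanup() {
        // does nothing by default
    }
}

// Hypothetical root data set that records how often it was cleaned up.
class RootData implements CleanableData {
    private int cleanupCalls = 0;

    @Override
    public void cleanup() {
        cleanupCalls++;
    }

    int getCleanupCalls() {
        return cleanupCalls;
    }
}

// Hypothetical view that holds no data of its own and delegates to its parent.
class ParentView implements CleanableData {
    private final CleanableData parent;

    ParentView(CleanableData parent) {
        this.parent = parent;
    }

    @Override
    public void cleanup() {
        parent.cleanup();
    }
}

// Hypothetical view that does not override cleanup() and thus inherits the no-op.
class ForgetfulView implements CleanableData {
    private final CleanableData parent;

    ForgetfulView(CleanableData parent) {
        this.parent = parent;
    }
}

class DelegationDemo {
    public static void main(String[] args) {
        RootData root = new RootData();

        // Two stacked delegating views: the call reaches the root.
        new ParentView(new ParentView(root)).cleanup();
        System.out.println(root.getCleanupCalls()); // prints 1

        // A non-delegating view in the chain swallows the call.
        new ParentView(new ForgetfulView(root)).cleanup();
        System.out.println(root.getCleanupCalls()); // still prints 1
    }
}
```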