You are viewing the RapidMiner Developers documentation for version 9.9 - Check here for latest version

This guide targets the new Connection Management introduced with RapidMiner Studio 9.3 and replaces the section on custom configurators.

To convert old style configurators to the new Connection Management, follow instructions here.

Creating Custom Connection Handlers

Imagine that you want to create a RapidMiner extension that offers an operator for reading data from a Shared Data system. Your operator needs the information for accessing the Shared Data, such as a user name or a password, and maybe some additional parameters. One approach is to add text fields to the parameters of the operator and let the user type in the required information. Though this may seem convenient, it becomes redundant as soon as the same information is needed in other RapidMiner processes or operators, since the user would have to enter it each time. Alternatively, you can define the Shared Data connection in a repository and let the user select which Shared Data to get data from.

This is a scenario where Connection Types come in handy. A connection handler can manage connection objects of a certain type and allows you to create, edit and delete them inside a repository through a customizable connection dialog. For this example, we will implement a connection handler for Shared Data entries that allows us to automatically configure those entries using a dialog accessible through the repository entry. Moreover, a connection handler can be used along with a parameter in an operator, which allows the user to easily select a connection from the repository. Alternatively, the operator can provide a pass-through port pair to take in a connection from the repository via a Retrieve operator.

You can follow the implementation of a custom connection type along with this extension project. The extension was created as described in the main extension developer documentation. If you have an extension that is using the legacy configurator mechanism, you can follow instructions here to convert it to the new connection management with the added benefit of a built-in conversion mechanism for existing connections.

Connection Handler

In order to implement your own connection handler, you need to know the following classes and interfaces:

  • ConnectionHandler is the interface to implement to create, validate and test a connection of a specific type.
  • ConnectionHandlerRegistry is used to register ConnectionHandlers in RapidMiner.
  • ConnectionInformation + ConnectionInformationBuilder: The connection information holds the configuration and the paths to all related files.
  • ConnectionConfiguration + ConnectionConfigurationBuilder: The configuration holds all parameters sorted into groups; the builder helps to create a fresh configuration or to expand an existing one.
  • ConfigurationParameter + ParameterUtility: A parameter can be encrypted, can have a default value, and can be injected. The utility class provides builders for parameters, e.g. when creating connections from scratch.

To implement your handler, create a class that implements the ConnectionHandler interface. It is recommended to make the handler a singleton, either through a static instance field or by implementing it as an enum.
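
As a minimal sketch of the enum variant, the following uses a hypothetical stand-in interface (TypedHandler) instead of the real ConnectionHandler interface, so that it compiles on its own. Only getType() is modeled, and the names SharedDataHandlerSketch, NAMESPACE and TYPE are illustrative assumptions. The type string follows the <extension_namespace>:<connection-type> pattern used by the example project.

```java
// Hypothetical stand-in for the real ConnectionHandler interface;
// only the type getter is modeled so the sketch compiles on its own.
interface TypedHandler {
    String getType();
}

// Enum singleton: the JVM creates exactly one INSTANCE.
enum SharedDataHandlerSketch implements TypedHandler {
    INSTANCE;

    // In a real extension the namespace constant would be declared in the PluginInit class.
    public static final String NAMESPACE = "shared_data";

    // Full connection type: <extension_namespace>:<connection-type>
    public static final String TYPE = NAMESPACE + ":shared-data";

    @Override
    public String getType() {
        return TYPE;
    }
}
```

With the enum approach, other classes can refer to the handler as SharedDataHandlerSketch.INSTANCE and to the type as SharedDataHandlerSketch.TYPE without any extra singleton bookkeeping.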

To implement the interface's methods, we suggest the following order:

  1. String getType(): This specifies the type of the connection this handler can manage. The type should include the extension's namespace to avoid possible duplications and should look like this: <extension_namespace>:<connection-type>

    In the example project Shared Data, the namespace of the extension is shared_data by convention, and the connection type is shared-data. This may look confusing, but it is commonplace for extensions whose main feature is a new connection type. The full connection type would be shared_data:shared-data.

    It is best practice to declare the namespace as a constant in the PluginInit class and to have the full type as a public constant in the handler.

  2. void initialize() and boolean isInitialized(): If your handler needs any kind of one-time initialization, like extracting files from your extension, this goes into the initialize method. It is not called automatically, but gives you the ability to trigger the initialization whenever you think it is necessary. Similarly, isInitialized should indicate whether the handler was successfully initialized. If you do not need any additional initialization, just make the initialize method a no-op and always return true from isInitialized.

    Note: These methods are here for convenience and are not directly called or checked by RapidMiner.

  3. ConnectionInformation createNewConnectionInformation(String name): In this method, a fresh "default" connection of your specified type will be created. A ConnectionInformation consists of (at least) a ConnectionConfiguration that holds the parameters.

    1. Parameters in connections are sorted into groups. You do not need to have more than one group, but at least one group is necessary. Group names as well as parameter names (also called keys) should be in snake_case, i.e. words are in all lower-case and separated with an underscore. This will be relevant for i18n translation in a bit.
    2. You can create parameters with a builder by calling ParameterUtility.getCPBuilder(<parameter_key>) and then specify details about the parameter. You can create an encrypted parameter by adding true as a second argument to the getCPBuilder method. With the parameter builder, you can set the enabled state of the parameter (call enable(), disable() or enable(boolean)) as well as the default value (call withValue(String)). Afterwards you can build the parameter.
    3. Collect parameters that belong to one group in a specific list.
    4. You can create a ConnectionConfigurationBuilder with the given name and your connection type.
    5. To create a group in the configuration builder, call withKeys(<group_key>, ).
    6. You can add tags or a default description to the configuration if you want to have defaults for those too.

    After you have finished and built the configuration, you hand it over to a ConnectionInformationBuilder and build the ConnectionInformation from that. Again, we recommend keeping group and parameter keys as (public) constants so they can be referenced later and from outside the handler.

    Additionally, a connection object can hold library files (such as .jar or .dll) as well as other files (e.g. an xml configuration, a plain text file etc.). This might be relevant if you need a specific file per connection. In RapidMiner Studio for example, this is done for JDBC database connections to ensure the correct JDBC driver is used and always bundled with a given connection.

    To utilize the additional files, you will need to provide some UI component to take care of that. See the next section for more information.

  4. ValidationResult validate(ConnectionInformation connection) and TestResult test(ConnectionInformation connection): These methods should check your connection for correct parameters. The validation should only do general checks, like making sure that all mandatory parameters are set and the values make sense. This should not take much time, as it is used for live updates. Testing, on the other hand, is done asynchronously and can include long-running operations. Here you should test whether the parameters can be used to establish a successful connection.

    Parameters can be injected, i.e. be provided at process runtime. Built-in providers are macros and server. For more information on this, see this section. To help check a parameter during validation, you can use one of the ParameterUtility#validateParameterValue methods. They determine whether a parameter is set, i.e. whether its value is not null or the parameter is injected.

  5. Additional methods/classes: In order to create a "connection" to the system your connection type represents, we recommend having a central point to establish it from a ConnectionInformation object. This could be a (static) method in the handler, or in some helper class. In other extensions, this usually results in a specific object holding onto the connection while it is used and taking care of clean up. Approaching connections this way will help with reusing this connection type in multiple operators without duplicating code. You can find more on usage of connections in operators further down.

    To resolve all parameters for a connection, including injected values, you can simply call ValueProviderHandlerRegistry.getInstance().injectValues(connection, operator, false) to get a map of all parameters to values. The parameters are fully qualified, i.e. with their respective group prefixes.
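
The "is set" rule used during validation and the fully qualified parameter keys can be sketched as follows. This is a self-contained model, not the RapidMiner API: ParameterSketch and ValidationSketch are hypothetical stand-ins for ConfigurationParameter and the validation logic of a handler, and the "." separator between group and parameter key is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a connection parameter; not the RapidMiner ConfigurationParameter class.
final class ParameterSketch {
    final String group;      // group key in snake_case
    final String key;        // parameter key in snake_case
    final String value;      // null if no value was entered
    final boolean injected;  // true if the value is provided at process runtime

    ParameterSketch(String group, String key, String value, boolean injected) {
        this.group = group;
        this.key = key;
        this.value = value;
        this.injected = injected;
    }

    // Fully qualified key with group prefix; the "." separator is an assumption.
    String fullKey() {
        return group + "." + key;
    }

    // A parameter counts as set if it has a non-null value or is injected.
    boolean isSet() {
        return value != null || injected;
    }
}

final class ValidationSketch {
    // Returns the mandatory keys that are absent or not set, in their given order.
    static List<String> missingKeys(List<ParameterSketch> parameters, List<String> mandatoryFullKeys) {
        List<String> missing = new ArrayList<>(mandatoryFullKeys);
        for (ParameterSketch parameter : parameters) {
            if (parameter.isSet()) {
                missing.remove(parameter.fullKey());
            }
        }
        return missing;
    }
}
```

A check like this is cheap enough for live updates; anything that actually contacts the remote system belongs in the test method instead.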

When you are done with the connection handler, you can register it with the ConnectionHandlerRegistry. This should be done in the init method of your PluginInit class.

ConnectionGUI and I18N

If your connection only consists of string parameters, whether plain text or encrypted, you can simply register an instance of DefaultConnectionGUIProvider for your connection, like so:

ConnectionGUIRegistry.INSTANCE.registerGUIProvider(new DefaultConnectionGUIProvider(), yourConnectionHandler.getType());

Otherwise you need to provide some Java Swing components for the UI of the connection. To implement a custom(ized) connection UI, you need to look at the following classes:

  • AbstractConnectionGUI, DefaultConnectionGUI and respective provider classes: The main UI class that provides the label and input for each parameter; also provides some static helper methods
  • ConnectionGUIRegistry is the registry for custom connection GUIs
  • ConnectionParameterLabel, ConnectionParameterTextField, ConnectionParameterCheckBox are simple label and input components for parameters, used in the GUI
  • InjectableComponentWrapper is a component helper class for parameters that can be injected

To customize a connection GUI, the easiest starting point is the DefaultConnectionGUI. It already comes packed with several helper methods that we will discuss in order of complexity. The default GUI displays connection parameters similarly to how parameters of operators are displayed: labels on the left, input on the right. More on labels in the i18n section (see below). So the simple way to implement a connection GUI is to extend the default GUI.

  1. JComponent getParameterInputComponent(String type, ConnectionParameterModel parameter): The input component by default is a simple text field that reacts to whether a parameter is encrypted or not, displaying either a plain text field, or a password field (which obfuscates text with * symbols). This is the ConnectionParameterTextField.

    Other predefined components or helper methods exist for boolean parameters in the form of a check box (ConnectionParameterCheckBox) and a combo box that can be created with InjectableComponentWrapper.getInjectableCombobox(parameter, String[], String). All of these components use an InjectableComponentWrapper to allow for parameter injection (see the section on injection in the main documentation).

    Since the GUI methods are called both for edit and view mode, you might need to differentiate between these two modes to create appropriate input components. Text fields and check boxes already adapt to these modes and are shown as uneditable. In the example project, we chose to display combo boxes as simple uneditable text fields in view mode.

    In the default GUI, check boxes don't have a separate label component, since they are displayed as checkbox first, label second. Also input components are wrapped with an information icon that displays a description of the parameter.
    See the i18n section below for more information.

    More complex input components can be constructed if necessary, but the three input types explained above can be found in the example project.

  2. List<ConnectionParameterModel> getInjectableParameters(ConnectionParameterGroupModel group): Here you can modify which parameters are actually injectable. For some connections it might make sense to not allow all parameters to be injectable. By default all parameters are injectable.

  3. List<ConnectionParameterModel> orderedParameters(ConnectionParameterGroupModel groupModel): If you want to enforce the order of the connection parameters for a group, you can do so here. The JSON format used for connection persistence might have been written by hand, so the parameters are not necessarily in the expected order. Make sure that all parameters are in the final list, because otherwise they will not show up in the GUI at all. By default the order is taken from the group model.

  4. Defining parameter dependencies: If you want certain parameters to only show up if another parameter has a certain value, there are several steps to take.

    1. First of all you should override JComponent getComponentForGroup(ConnectionParameterGroupModel groupModel, ConnectionModel connectionModel) to make sure that all involved parameters actually exist (see the getOrCreateParameter method in the example project) and then register the enabled properties of parameters with the values associated with them. E.g. singleAParameter.enabledProperty().bind(optionParameter.valueProperty().isEqualTo(OPTION_A)); would enable the A parameter if and only if the option parameter is set to option A. After preparing the parameters, simply call the super implementation.

    2. Next up we need to make sure that disabled parameters are not shown in the UI anymore. This can be achieved using the visibilityWrapper method provided in the AbstractConnectionGUI. This method takes an arbitrary JComponent and a ConnectionParameterModel to bind the enabled property of that parameter to the visibility of the component. We need to make sure that both the label and the input components are affected.

      To achieve this, override both getParameterLabelComponent and wrapInformationIcon by wrapping the super implementation calls in a visibilityWrapper call.
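
Returning to the orderedParameters step above, one way to enforce an order without losing parameters can be sketched like this. The sketch works on plain parameter keys rather than ConnectionParameterModel objects, and ParameterOrderSketch is a hypothetical helper, not part of the RapidMiner API. Keys in the preferred order come first; any remaining keys are appended so that they still show up in the GUI.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

final class ParameterOrderSketch {
    // Orders actualKeys by preferredOrder and appends all keys not mentioned there.
    static List<String> order(List<String> actualKeys, List<String> preferredOrder) {
        // LinkedHashSet keeps the original order of the keys that are left over.
        Set<String> remaining = new LinkedHashSet<>(actualKeys);
        List<String> ordered = new ArrayList<>();
        for (String key : preferredOrder) {
            // Only keys that really exist in the group are taken over.
            if (remaining.remove(key)) {
                ordered.add(key);
            }
        }
        // Nothing is dropped: unknown keys go to the end.
        ordered.addAll(remaining);
        return ordered;
    }
}
```

In a real GUI class the same idea would be applied to the parameter models of the group, for example by looking them up via their keys.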
      

For more complex modifications to the GUI, look into the Javadoc for both AbstractConnectionGUI and DefaultConnectionGUI or ask questions on the community.

I18N

To make RapidMiner easier to translate into other languages, most parts of the connection GUI are internationalized, requiring entries in the GUI<ExtensionName>.properties file. This covers the icon, label and description of a connection type, the same for groups (icons and descriptions are currently unused for groups), and the label and description of each parameter.

All i18n keys for connections are prefixed with gui.label.connection, and suffixes include icon, label and tip. The specifiers are as follows:

  • Connection type: type.<extension_namespace>.<connection-type>
  • Group: group.<extension_namespace>.<connection-type>.<group_name>
  • Parameter: parameter.<extension_namespace>.<connection-type>.<group_name>.<parameter_name>

Example: The parameter option in the basic group of the example project has the full key

gui.label.connection.parameter.shared_data.shared-data.basic.option

followed by the label or tip suffix.
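
The key scheme can be sketched as a small helper. ConnectionI18nKeys is a hypothetical class for illustration and is not the I18NHelper from the example project; joining the suffix with a "." is an assumption based on the wording above.

```java
final class ConnectionI18nKeys {
    // Common prefix of all connection related i18n keys.
    static final String PREFIX = "gui.label.connection";

    static String typeKey(String namespace, String connectionType) {
        return String.join(".", PREFIX, "type", namespace, connectionType);
    }

    static String groupKey(String namespace, String connectionType, String group) {
        return String.join(".", PREFIX, "group", namespace, connectionType, group);
    }

    static String parameterKey(String namespace, String connectionType, String group, String parameter) {
        return String.join(".", PREFIX, "parameter", namespace, connectionType, group, parameter);
    }

    // Appends a suffix such as "icon", "label" or "tip" (the "." separator is an assumption).
    static String withSuffix(String key, String suffix) {
        return key + "." + suffix;
    }
}
```

For the example above, ConnectionI18nKeys.parameterKey("shared_data", "shared-data", "basic", "option") yields the full key shown, to which the label or tip suffix is then appended.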

You can use the I18NHelper class in the example project to create all connection related i18n keys with some simple prefilled labels. The original GUI property file is kept as a .bak file in case some other i18n keys get lost. The helper class looks something like the following.

// Plugin initializer and namespace of the extension
Runnable initializer = PluginInit<ExtensionName>::initPlugin;
String namespace = PluginInit<ExtensionName>.NAMESPACE;
Path propertyPath = Paths.get("src/main/resources/com/rapidminer/extension/resources/i18n/GUI<ExtensionName>.properties").toAbsolutePath();
// Make sure the GUI property file exists before keys are added
if (!Files.exists(propertyPath)) {
    Files.createFile(propertyPath);
}
BiFunction<String, Properties, String> defaultValues = connectionAdaptionHandlerDefaultValues(namespace);
Path newPath = appendOrReplaceConnectionKeys(initializer, namespace, propertyPath, defaultValues);
if (newPath == null) {
    return;
}
// Copy path to path so that no input stream is left open
Files.copy(newPath, propertyPath, StandardCopyOption.REPLACE_EXISTING);

Now you can create, edit and view your custom connections!

Usage

To use your custom connections in an operator, the following classes are used:

  • ConnectionInformationSelector: Helper class that gives access to parameter types, helps with meta data transformation and data forwarding, as well as retrieving the specified connection
  • ConnectionSelectionProvider: Interface to give access to the selector (optional)

To incorporate the selector with your operator, create a field and initialize a selector with the current operator and the fully qualified connection type, e.g. shared_data:shared-data in the example project. This minimal constructor will automatically create both an input and output port for the operator, so that a connection that is fed into the operator can be passed through.

To customize this, a second constructor is available that takes an input and output port as parameters. The ports can be preconfigured ports from your operator or can be left as null to only rely on a parameter to specify the connection.

We recommend following these steps to set up your operator:

  1. In the operator constructor, if you have at least an input port for connections, call ConnectionInformationSelector#makeDefaultPortTransformation to attach a precondition to the input to ensure that connections have the correct type and add a pass through rule for input and output port (if possible).

  2. In the getParameterTypes method of the operator, add all parameter types created by the selector by calling the static method ConnectionInformationSelector#createParameterTypes(ConnectionInformationSelector). The selector creates a single parameter by default with key connection_entry and, if an input port is present, adds a condition to hide the parameter if the connection input port is connected.

  3. If you implement the ConnectionSelectionProvider interface, provide the field to the getter method and leave the setter as a no-op method. The interface can be utilized if you have multiple operators that need specific selectors, but most use cases do not need this.

  4. In the doWork method, to retrieve the specified connection, call ConnectionInformationSelector#getConnection. The method automatically throws a UserError if no connection is specified and as such never returns null. The connection can then be used with your handler's central method (see the last point in the handler section above) to establish a connection with the system your connection type represents. After that you can implement whatever is necessary.

  5. At the end of the doWork method, don't forget to call ConnectionInformationSelector#passDataThrough or ConnectionInformationSelector#passCloneThrough if you have input and output ports in your selector to hand the data through, either as the original or as a clone. This way you can chain operators that use the same connection very easily.
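
The way a connection is picked can be modeled in a few lines. This is a simplified sketch and not the selector's actual code: ConnectionSourceSketch is a hypothetical class, connections are represented as plain strings, IllegalStateException stands in for the UserError, and the precedence of the input port over the parameter is an assumption derived from the parameter being hidden while the port is connected.

```java
final class ConnectionSourceSketch {
    // Returns the connection to use; never returns null.
    static String resolve(String fromInputPort, String fromParameter) {
        // A connection delivered at the input port wins (assumed precedence).
        if (fromInputPort != null) {
            return fromInputPort;
        }
        // Otherwise fall back to the repository location given in the parameter.
        if (fromParameter != null && !fromParameter.isEmpty()) {
            return fromParameter;
        }
        // Stand-in for the UserError thrown when no connection is specified.
        throw new IllegalStateException("No connection specified");
    }
}
```

Because the lookup either returns a connection or fails, the code in doWork does not need a null check before handing the connection to the handler's central method.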