You are viewing the RapidMiner Studio documentation for version 9.8 - Check here for latest version
Update Database (JDBC Connectors)
Synopsis
This operator updates the values of all examples with matching ID values in a database.Description
The Update Database operator is used for updating an existing table in the specified SQL database. You need to have at least basic understanding of databases and database connections in order to use this operator properly. Go through the parameters and the attached Example Process to understand the flow of this operator.
The user can specify the database connection, a table name and ID column names. The most convenient way of defining the necessary parameters is the Manage Database Connections wizard. The most important parameters (database URL and user name) will be automatically determined by this wizard.
The row(s) to update are specified via the db id attribute name parameter. If the id columns of the table do not match all the id values of any given example, the row will be inserted instead. The ExampleSet attribute names must be a subset of the table column names, otherwise the operator will fail.
Input
- input (Data Table)
This input port expects an ExampleSet. It is output of the Retrieve operator in the attached Example Process.
- connection (Connection)
This input port expects a Connection object if any. See the parameter connection entry for more information.
Output
- through (Data Table)
The ExampleSet that was provided at the input port is delivered through this output port without any modifications. This is usually used to reuse the same ExampleSet in further operators of the process.
- connection (Connection)
This output port delivers the Connection object from the input port. If the input port is not connected the port delivers nothing.
Parameters
- define_connectionThis parameter indicates how the database connection should be specified. It gives you four options: repository, predefined, url and jndi. The parameter is not visible if the connection input port is connected. Range: selection
- connection_entryThis parameter is only available when the define connection parameter is set to repository. This parameter is used to specify a repository location that represents a connection entry. The connection can also be provided using the connection input port. Range: string
- connectionThis parameter is only available when the define connection parameter is set to predefined. This parameter is used for connecting to the database using a predefined connection. You can have many predefined connections. You can choose one of them using the drop down box. You can add a new connection or modify previous connections using the button next to the drop down box. You may also accomplish this by clicking on Manage Database Connections... from the Tools menu in the main window. A new window appears. This window asks for several details e.g. Host, Port, Database system, schema, username and password. The Test button in this new window will allow you to check whether the connection can be made. Save the connection once the test is successful. After saving a new connection, it can be chosen from the drop down box of the connection parameter. You need to have basic understanding of databases for configuring a connection. Range: string
- database_systemThis parameter is only available when the define connection parameter is set to url. This parameter is used for selecting the database system in use. It can have one of the following values: MySQL, PostgreSQL, Sybase, HSQLDB, ODBC Bridge (e.g. Access), Microsoft SQL Server (JTDS), Ingres, Oracle. Range: selection
- database_urlThis parameter is only available when the define connection parameter is set to url. This parameter is used for defining the URL connection string for the database, e.g. 'jdbc:mysql://foo.bar:portnr/database'. Range: string
- usernameThis parameter is only available when the define connection parameter is set to url. This parameter is used for specifying the username of the database. Range: string
- passwordThis parameter is only available when the define connection parameter is set to url. This parameter is used for specifying the password of the database. Range: string
- jndi_nameThis parameter is only available when the define connection parameter is set to jndi. This parameter is used for giving the JNDI a name for a data source. Range: string
- table_nameThis parameter is used for selecting the required table from the specified database. Please note that you can also write a table name here, if the table does not exist it will be created during writing. Range:
- attribute_filter_typeThis parameter allows you to select the ID attribute which values ALL have to match in the example set and the database for the row to be updated. It has the following options:
- all: Does not make sense in this context so do not use, will break the process.
- single: This option allows the selection of a single id attribute.
- subset: This option allows the selection of multiple id attributes through a list. This option will not work if the meta data is not known.
- regular_expression: This option allows you to specify a regular expression for the id attribute selection. When this option is selected some other parameters (regular expression, use except expression) become visible in the Parameter panel.
- value_type: This option allows selection of all the id attributes of a particular type. It should be noted that types are hierarchical. For example real and integer types both belong to the numeric type. The user should have a basic understanding of type hierarchy when selecting attributes through this option. When this option is selected some other parameters (value type, use value type exception) become visible in the Parameter panel.
- block_type: This option is similar in working to the value_type option. This option allows the selection of all the attributes of a particular block type. It should be noted that block types may be hierarchical. For example value_series_start and value_series_end block types both belong to the value_series block type. When this option is selected some other parameters (block type, use block type exception) become visible in the Parameter panel.
- no_missing_values: This option simply selects all the attributes of the ExampleSet which don't contain a missing value in any example. Attributes that have even a single missing value are removed.
- numeric_value_filter: When this option is selected another parameter (numeric condition) becomes visible in the Parameter panel. All numeric attributes whose examples all satisfy the mentioned numeric condition are selected. Please note that all nominal attributes are also selected irrespective of the given numerical condition.
Tutorial Processes
Updating an ExampleSet in a mySQL database
The 'Iris' data set is loaded using the Retrieve operator. The Update Database operator is used to update an existing database table named "Test" in the "My connection" SQL database. Rows in the example set and table which match on their "ID" column will be updated. If no match can be found, the row will be inserted instead.