You are viewing the RapidMiner Studio documentation for version 9.2.

Read Cassandra (NoSQL)

Synopsis

This operator reads an example set from a Cassandra table.

Description

The example set to be read can be specified via a CQL statement, a CQL file, or a table name.

Input

  • file (File)

    The CQL file that specifies the CQL statement to be executed. If the 'define query' parameter is set to the 'query file' option, the input port 'file' is used for the CQL file. Note: If this input port is connected to the 'file' output port of another operator, the file selected in the 'query file' parameter is ignored.

Output

  • output (IOObject)

    The example set specified via either the CQL statement or the table.

  • file (File)

    If the input port 'file' is connected, the unchanged CQL file is returned.

Parameters

  • connection The connection details for the Cassandra connection have to be specified. If you have already configured a Cassandra connection, you can select it from the drop-down list. If you have not configured a Cassandra connection yet, click the Cassandra icon to the right of the drop-down list and create a new Cassandra connection in the Manage connections box. The contact points and keyspace name are mandatory. Range: configurable
  • consistency_level The consistency level for the Cassandra query. The consistency level defines how many Cassandra nodes have to respond for the query to succeed. Possible levels are: ONE, TWO, THREE, QUORUM, ALL, ANY.
    • ONE: At least one replica node must respond.
    • TWO: At least two replica nodes must respond.
    • THREE: At least three replica nodes must respond.
    • QUORUM: A quorum of replica nodes must respond. A quorum is calculated as (replication_factor / 2) + 1, rounded down to a whole number. For example, with a replication factor of 3, a quorum is 2 (can tolerate 1 node down). With a replication factor of 6, a quorum is 4 (can tolerate 2 nodes down).
    • ALL: All replica nodes in the cluster for that row key must respond.
    • ANY: At least one node must respond. Note: Cassandra defines this level for write operations.
    Range: selection
  • define_query This parameter selects how the query that defines the data is specified.
    • query: Define a CQL query via the 'query' parameter.
    • query file: Load the CQL query from a file. If the 'file' input port is connected, the query is loaded from the provided file object.
    • query table: Select a table to be loaded without defining a CQL query.
    Range: selection
  • query This parameter is only displayed when 'define query' is set to 'query'. If you click in the 'Edit text...' field, the 'Edit parameter: query' editor opens, where you specify the CQL query. Only SELECT statements are allowed. Range: string
  • query_file This parameter is only displayed when 'define query' is set to 'query file'. Select the file that contains the CQL statement that defines the data. Only SELECT statements are allowed. Note: If the input port of the Read Cassandra operator is connected to an Open File operator, this parameter is not displayed. Range: file
  • prepare_statement This parameter is displayed if you have selected either 'query' or 'query file' for the 'define query' parameter. It specifies whether the query is executed as a prepared query or as a normal query. If activated, the parameter 'parameters' is shown. Range: boolean
  • parameters If you have activated the 'prepare statement' checkbox, this parameter lets you specify prepared values for the query. Every '?' in the specified CQL query is replaced by the prepared values in the order they are listed in the 'Edit parameter list: parameters' dialog. Note: If you select the wrong type for a parameter, an error message informs you about it. Range: enumeration
  • table If 'define query' is set to 'query table', this parameter is displayed. It lets you select the table to be read. Range: string
  • datamanagement This parameter allows you to select the appropriate data type for the internal data description. Range: selection
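
The quorum arithmetic given for the 'consistency_level' parameter can be checked with a short Python sketch. The function names are illustrative only and are not part of RapidMiner or any Cassandra driver; the sketch simply evaluates (replication_factor / 2) + 1 rounded down.

```python
def quorum(replication_factor: int) -> int:
    """Return the quorum size: (replication_factor / 2) + 1, rounded down."""
    if replication_factor < 1:
        raise ValueError("replication_factor must be at least 1")
    # Integer division performs the rounding down.
    return replication_factor // 2 + 1


def tolerated_failures(replication_factor: int) -> int:
    """Return how many replica nodes may be down while a quorum can still respond."""
    return replication_factor - quorum(replication_factor)


if __name__ == "__main__":
    for rf in (3, 6):
        print(f"replication factor {rf}: quorum {quorum(rf)}, "
              f"tolerates {tolerated_failures(rf)} node(s) down")
```

For replication factors 3 and 6 this reproduces the values stated above: quorums of 2 and 4, tolerating 1 and 2 nodes down respectively.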
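
The positional matching described for the 'parameters' parameter, where each '?' takes the prepared value at the same position in the list, can be illustrated with a minimal Python sketch. This is not how the operator is implemented: the helper name, the table and the column names are hypothetical, and counting '?' characters is a simplification that ignores question marks inside string literals.

```python
def pair_placeholders(query: str, values: list) -> list:
    """Pair each '?' placeholder (1-based position) with the value listed at that position."""
    placeholder_count = query.count("?")  # simplification: ignores '?' inside string literals
    if placeholder_count != len(values):
        raise ValueError(
            f"query has {placeholder_count} placeholder(s) but {len(values)} value(s) were given"
        )
    return list(enumerate(values, start=1))


if __name__ == "__main__":
    # Hypothetical table and columns, used only for illustration.
    cql = "SELECT name FROM users WHERE user_id = ? AND year = ?"
    for position, value in pair_placeholders(cql, [42, 2019]):
        print(f"placeholder {position} <- {value!r}")
```

The first listed value binds to the first '?', the second to the second, and so on; a mismatch between the number of placeholders and the number of values is reported as an error in this sketch.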