You are viewing the RapidMiner Studio documentation for version 9.4.

Read Cassandra (NoSQL)

Synopsis

This operator reads an example set from a Cassandra table.

Description

The example set to be read can be specified via a CQL statement, a CQL file or by specifying a table name.
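
As a rough illustration of these three options, the following Python sketch maps each mode of the 'define query' parameter to the CQL text that would be executed. The helper name resolve_cql and the SELECT * FROM form used for the table option are assumptions made for this example; this page does not state how the operator builds its statement internally.

```python
from typing import Optional


def resolve_cql(mode: str,
                query: Optional[str] = None,
                file_text: Optional[str] = None,
                table: Optional[str] = None) -> str:
    """Return the CQL text for one of the three 'define query' modes.

    Hypothetical helper for illustration only; not part of RapidMiner.
    """
    if mode == "query":
        source = query
    elif mode == "query file":
        source = file_text
    elif mode == "query table":
        # Assumption: reading a whole table corresponds to SELECT *.
        source = f"SELECT * FROM {table}" if table else None
    else:
        raise ValueError(f"unknown mode: {mode!r}")
    if not source or not source.strip():
        raise ValueError(f"no input supplied for mode {mode!r}")
    # Trim surrounding whitespace and a trailing semicolon.
    return source.strip().rstrip(";")


print(resolve_cql("query", query="SELECT id, name FROM users WHERE id = 1;"))
print(resolve_cql("query table", table="users"))
```

The table and column names (users, id, name) are placeholders, not names taken from the operator.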

Input

  • file (File)

    The CQL file that contains the CQL statement to be executed. If the 'define query' parameter is set to 'query file', the CQL file can be supplied through the 'file' input port. Note: If this port is connected to the file output port of another operator, the file from the port is used and the 'query file' parameter is ignored.

  • connection (Connection)

    This input port accepts a Connection object, if one is available. See the 'connection entry' parameter for more information.

Output

  • output (IOObject)

    The example set specified via either the CQL statement or the table.

  • file (File)

    If the input port 'file' is connected, the unchanged CQL file is returned.

  • connection (Connection)

    This output port delivers the Connection object from the input port. If the input port is not connected, the port delivers nothing.

Parameters

  • connection_source This parameter indicates how the connection should be specified. It gives you two options, predefined and repository. The parameter is not visible if the connection input port is connected. Range: selection
  • connection_entry This parameter is only available when the connection source parameter is set to repository. This parameter is used to specify a repository location that represents a connection entry. The connection can also be provided using the connection input port. Range: string
  • connection This parameter is only available when the connection source parameter is set to predefined. The connection details for the Cassandra connection can be specified here. If you have already configured a Cassandra connection, you can select it from the drop-down list. If you have not configured a Cassandra connection yet, select the Cassandra icon to the right of the drop-down list and create a new Cassandra connection in the Manage connections box. The contact points and keyspace name are mandatory. Range: configurable
  • consistency_level The consistency level for the Cassandra query. The consistency level defines how many Cassandra nodes have to respond to the query for it to succeed. Possible levels are: ONE, TWO, THREE, QUORUM, ALL, ANY.
    • ONE: At least one replica node must respond.
    • TWO: At least two replica nodes must respond.
    • THREE: At least three replica nodes must respond.
    • QUORUM: A quorum of replica nodes must respond. A quorum is calculated as (replication_factor / 2) + 1, rounded down to a whole number. For example, with a replication factor of 3, a quorum is 2 (can tolerate 1 node down). With a replication factor of 6, a quorum is 4 (can tolerate 2 nodes down).
    • ALL: All replica nodes for that row key must respond.
    • ANY: At least one node must respond. Note that Cassandra itself defines ANY for write operations only.
    Range: selection
  • define_query This parameter selects how the query that defines the data is specified.
    • query: Define a CQL query via the 'query' parameter.
    • query file: Load the CQL query from a file. If the 'file' input port is connected, the query is loaded from the provided file object.
    • query table: Select a table to be loaded without defining a CQL query.
    Range: selection
  • query This parameter is only displayed when 'define query' is set to 'query'. If you click in the 'Edit text...' field, the 'Edit parameter: query' editor opens, where you specify the CQL query. Only SELECT statements are allowed. Range: string
  • query_file This parameter is only displayed when 'define query' is set to 'query file'. You can select the file that contains the CQL statement that defines the data. Only SELECT statements are allowed. Note: If the 'file' input port of the Read Cassandra operator is connected to an Open File operator, this parameter is not displayed. Range: file
  • prepare_statement This parameter is displayed if you have selected either 'query' or 'query file' for the 'define query' parameter. It specifies whether the query is executed as a prepared query or as a normal query. If activated, the parameter 'parameters' is shown. Range: boolean
  • parameters If you have activated the 'prepare statement' checkbox, this parameter lets you specify prepared values for the query. Every '?' in the specified CQL query is replaced by the prepared values in the order in which they are listed in the 'Edit parameter list: parameters' dialog. Note: If you select the wrong type for a parameter, an error message informs you about it. Range: enumeration
  • table This parameter is displayed if 'define query' is set to 'query table'. It lets you select the table that should be read. Range: string
  • datamanagement This parameter allows you to select the appropriate data type for the internal data description. Range: selection
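
The quorum arithmetic given for the consistency_level parameter can be checked with a short Python sketch. The function names quorum and tolerated_down are chosen for this example only; the formula is the one stated above, (replication_factor / 2) + 1 rounded down.

```python
def quorum(replication_factor: int) -> int:
    """Quorum size: (replication_factor / 2) + 1, rounded down."""
    if replication_factor < 1:
        raise ValueError("replication_factor must be at least 1")
    # Integer division performs the rounding down.
    return replication_factor // 2 + 1


def tolerated_down(replication_factor: int) -> int:
    """Replicas that may be unavailable while a quorum can still respond."""
    return replication_factor - quorum(replication_factor)


for rf in (3, 6):
    print(rf, quorum(rf), tolerated_down(rf))
```

For replication factors 3 and 6 this reproduces the values from the parameter description: quorums of 2 and 4, tolerating 1 and 2 nodes down.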
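
To make the positional matching of the 'parameters' list concrete, here is a minimal Python sketch that pairs each '?' in a query with the value at the same position in the list. The helper pair_placeholders, the table users and its columns are hypothetical and only illustrate the ordering rule; the sketch does not reproduce how the operator or the Cassandra driver binds values internally.

```python
from typing import Any, List, Sequence, Tuple


def pair_placeholders(cql: str, values: Sequence[Any]) -> List[Tuple[int, Any]]:
    """Pair each '?' in the query with the value at the same list position.

    Hypothetical helper: it only checks counts and order. It does not
    skip '?' characters inside string literals, and a real driver sends
    the values separately instead of pasting them into the query text.
    """
    count = cql.count("?")
    if count != len(values):
        raise ValueError(
            f"query has {count} placeholders but {len(values)} values were given"
        )
    # Placeholder n (counted from 1) receives the n-th listed value.
    return list(enumerate(values, start=1))


pairs = pair_placeholders(
    "SELECT * FROM users WHERE country = ? AND age > ?",
    ["DE", 30],
)
print(pairs)  # [(1, 'DE'), (2, 30)]
```

Swapping the two list entries would bind 30 to country and 'DE' to age, which is the kind of type mismatch the error message mentioned for this parameter is meant to catch.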