Generate Transaction Data (RapidMiner Studio Core)

Synopsis

This operator generates an ExampleSet that represents transaction data. The number of transactions, number of customers, number of items and number of clusters can be specified by the user.

Description

The Generate Transaction Data operator generates an ExampleSet representing transaction data. This ExampleSet can be used when you do not have a data set containing real transaction data, or as a placeholder until such data is available. The data set has 2 regular attributes and 1 special attribute. The regular attributes are Item (nominal) and Amount (integer). The special attribute is Id (nominal), which represents the customer Id. All items purchased by a single customer are listed as multiple examples with the same customer Id. The Item attribute tells which item was purchased and the Amount attribute tells the quantity of that item. The number of transactions can be set by the number transactions parameter. To have a look at this ExampleSet, just run the attached Example Process.
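To make the layout of this ExampleSet concrete, the following Python sketch builds rows with the same three attributes (Id, Item, Amount) and groups them by customer. It is only an illustration under stated assumptions, not the operator's actual algorithm: the function name generate_transactions, the value labels ("Id 7", "Item 12"), the uniform sampling and the Amount range of 1 to 10 are all hypothetical, and the number clusters parameter is not modeled.

```python
import random
from collections import defaultdict


def generate_transactions(number_transactions, number_customers, number_items, seed=None):
    """Return a list of rows shaped like the operator's ExampleSet.

    Hypothetical stand-in: values are drawn uniformly at random, which is
    an assumption and not the operator's documented behaviour.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(number_transactions):
        rows.append({
            "Id": f"Id {rng.randint(1, number_customers)}",   # special attribute: customer Id
            "Item": f"Item {rng.randint(1, number_items)}",   # regular attribute: purchased item
            "Amount": rng.randint(1, 10),                     # regular attribute: quantity (assumed range)
        })
    return rows


rows = generate_transactions(1000, 50, 80, seed=1992)

# All purchases of one customer are the rows that share the same Id.
items_by_customer = defaultdict(list)
for row in rows:
    items_by_customer[row["Id"]].append((row["Item"], row["Amount"]))
```

Each entry of items_by_customer then holds the (Item, Amount) pairs of one customer, which mirrors how several examples with the same Id together describe that customer's purchases.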

Output

  • output (IOObject)

    The Generate Transaction Data operator generates an ExampleSet which is delivered through this port. The meta data is also delivered along with the data. This output is the same as the output of the Retrieve operator.

Parameters

  • number_transactions

    This parameter specifies the number of generated transactions. Range: integer

  • number_customers

    This parameter specifies the number of generated customers. Range: integer

  • number_items

    This parameter specifies the number of generated items. Range: integer

  • number_clusters

    This parameter specifies the number of generated clusters. Range: integer

  • use_local_random_seed

    This parameter indicates if a local random seed should be used for randomization. Using the same local random seed value produces the same ExampleSet. Changing the local random seed value changes the way examples are randomized, so the ExampleSet will have a different set of values. Range: boolean

  • local_random_seed

    This parameter specifies the local random seed. This parameter is only available if the use local random seed parameter is set to true. Range: integer
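The effect of a fixed seed can be illustrated outside RapidMiner with a small Python sketch. It uses Python's own random.Random generator, not the generator RapidMiner uses, and the helper name sample_rows as well as the value ranges are hypothetical choices made for this example.

```python
import random


def sample_rows(n, seed=None):
    """Draw n (customer, item) pairs.

    seed=None stands in for not setting a local random seed;
    an integer seed stands in for local_random_seed (assumed analogy).
    """
    rng = random.Random(seed)
    return [(rng.randint(1, 50), rng.randint(1, 80)) for _ in range(n)]


same_a = sample_rows(5, seed=1992)
same_b = sample_rows(5, seed=1992)   # same seed: identical rows
other = sample_rows(5, seed=2024)    # different seed: a different random sequence

print(same_a == same_b)
```

Running the sketch prints True for the two runs with the same seed, which is the same reproducibility the local random seed parameter provides for the generated ExampleSet.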

Tutorial Processes

Introduction to the Generate Transaction Data operator

The Generate Transaction Data operator is applied for generating an ExampleSet that represents transaction data. The number transactions parameter is set to 1000, thus the ExampleSet will have 1000 examples. The number customers parameter is set to 50, thus there will be 50 unique values in the Id attribute. The number items parameter is set to 80, thus there will be 80 unique values in the Item attribute. You can see the ExampleSet in the Results Workspace. The use local random seed parameter is set to false in this Example Process. Set the use local random seed parameter to true and run the process with different values of local random seed. You will see that changing the values of local random seed changes the randomization.