You are viewing the RapidMiner Hub documentation for version 10.0.

Tableau dashboards

Requirements

Before getting started, make sure that:

Also, make sure that you have access to RapidMiner Go if you plan to use it together with Tableau.

Introduction

Tableau Software's outstanding data visualization tools can be enriched via the Tableau Analytics Extensions API. One of the core scenarios for Analytics Extensions is the integration of predictive models into Tableau visualizations. RapidMiner's implementation of the API, the Tableau Connector, makes it possible to feed data from Tableau to a RapidMiner machine learning model and to return the results to Tableau. As a result, you can integrate calculations executed by RapidMiner Go models and by RapidMiner web service/Scoring Agent deployments into Tableau dashboards.

The documentation presented here is long, because several pieces of software have to communicate with each other, and you can't afford to have any missing pieces. Don't lose sight of the forest for the trees! In essence, there are only two steps:

  1. In RapidMiner, deploy a model.

  2. Tell Tableau how to get the results from that model.

If you've never before deployed a model in RapidMiner, you will benefit from taking a slower approach. Start by reading Example: Churn prediction so that you will understand how all the pieces fit together.

If you know how to deploy a model in RapidMiner, you can skip the example and concentrate on The JSON snippet. The quick summary of this entire document is contained in the following two lines:

  • In a Tableau workbook, select Analysis > Create Calculated Field.
  • In the Calculation Editor, enter a name for the calculated field and paste the JSON snippet.

The JSON snippet

Read more: Pass Expressions with Analytics Extensions

Tableau communicates with external services through calculated fields. Each such calculated field passes its expression to the service via one of the SCRIPT functions and operates as a table calculation.

RapidMiner's Tableau Connector supports the following SCRIPT functions:

  • SCRIPT_STR
  • SCRIPT_REAL
  • SCRIPT_INT

The SCRIPT functions expect the first parameter to be a script for the analytics extension. For example, when using Tableau’s analytics extension for Python (TabPy), the script is an actual Python script to be executed by the extension. When using RapidMiner's Tableau Connector, the script is a small JSON snippet.

From the perspective of RapidMiner, the JSON snippet contains the minimal set of information needed to send data from Tableau to a deployed model and get a result in return:

  • it specifies the RapidMiner back end, either a RapidMiner Go model or a web service, and
  • it provides additional information such as column names (if required).

For example, the following snippet says to connect to the RapidMiner Go engine and to use the RapidMiner Go model with the given id. RapidMiner Go expects the data to have three columns named Year, Country, and Cases.

{
    "engine": "go",
    "id": "980807d6-9332-4da9-b026-d8af0fa4879e",
    "inputs": ["Year", "Country", "Cases"]
}

To create a calculated field in Tableau, using RapidMiner Go to generate the result, proceed as follows:

  • Select Analysis > Create Calculated Field.
  • In the Calculation Editor, enter a name for the calculated field, and paste a JSON snippet as follows.

JSON snippet

SCRIPT_STR('{"engine": "go",
        "id": "980807d6-9332-4da9-b026-d8af0fa4879e",
        "inputs": ["Year", "Country", "Cases"]}',
    ATTR([Year]),
    ATTR([Country Code]),
    ATTR([Cases])
)

In this example, the column called Country Code in Tableau will be passed to a column called Country in RapidMiner Go.

By default, Tableau will not pass the column names to the Tableau Connector. Hence, naming the inputs explicitly in the JSON snippet will be necessary most of the time, even if there is no reason to rename columns such as Country Code in the example above.
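The pairing of JSON input names with Tableau fields can also be generated programmatically. The following Python sketch is not part of the Tableau Connector; the function name build_calculated_field and its arguments are hypothetical. It only assembles the text that you would paste into the Calculation Editor, assuming that every Tableau field is wrapped in ATTR() as in the example above.

```python
import json

SUPPORTED_FUNCTIONS = ("SCRIPT_STR", "SCRIPT_REAL", "SCRIPT_INT")


def build_calculated_field(script_function, snippet, tableau_fields):
    """Return the text of a calculated field that calls the Tableau Connector.

    script_function: one of the supported SCRIPT functions
    snippet: dict holding the JSON snippet ("engine", "id", "inputs", ...)
    tableau_fields: Tableau field names, in the same order as snippet["inputs"]
    """
    if script_function not in SUPPORTED_FUNCTIONS:
        raise ValueError(f"unsupported SCRIPT function: {script_function}")
    inputs = snippet.get("inputs")
    if inputs is not None and len(inputs) != len(tableau_fields):
        raise ValueError("each name in 'inputs' needs exactly one Tableau field")
    script = json.dumps(snippet)
    # The snippet is wrapped in single quotes, so this sketch simply rejects
    # snippets that contain a single quote instead of trying to escape it.
    if "'" in script:
        raise ValueError("the JSON snippet must not contain single quotes")
    arguments = ",\n".join(f"    ATTR([{name}])" for name in tableau_fields)
    return f"{script_function}('{script}',\n{arguments}\n)"


# The Tableau field "Country Code" is passed to the model column "Country".
field = build_calculated_field(
    "SCRIPT_STR",
    {
        "engine": "go",
        "id": "980807d6-9332-4da9-b026-d8af0fa4879e",
        "inputs": ["Year", "Country", "Cases"],
    },
    ["Year", "Country Code", "Cases"],
)
print(field)
```

The order of the Tableau fields is what matters: the first field is sent under the first name in inputs, the second under the second, and so on.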

JSON for RapidMiner Go

Read more: Deploy a model in RapidMiner Go

engine

When connecting Tableau to RapidMiner Go, the back-end engine is go.

id

When using RapidMiner Go as the back end, you have to specify the deployment id. This id is simply the last part of the deployment link, visible on the Manage Deployments page of RapidMiner Go. The deployment link always has the same structure, for example:

https://aihub.company.test/am/api/deployments/980807d6-9332-4da9-b026-d8af0fa4879e

and in this case, the deployment id is 980807d6-9332-4da9-b026-d8af0fa4879e.
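If you handle many deployments, you may prefer to extract the id from the link rather than copy it by hand. The sketch below uses only the Python standard library; the helper name deployment_id_from_link is hypothetical, and it assumes that the id is always the last path segment, as described above.

```python
from urllib.parse import urlparse


def deployment_id_from_link(link):
    """Return the last path segment of a RapidMiner Go deployment link."""
    path = urlparse(link).path.rstrip("/")
    deployment_id = path.rsplit("/", 1)[-1]
    if not deployment_id:
        raise ValueError(f"no deployment id found in {link!r}")
    return deployment_id


link = "https://aihub.company.test/am/api/deployments/980807d6-9332-4da9-b026-d8af0fa4879e"
print(deployment_id_from_link(link))  # 980807d6-9332-4da9-b026-d8af0fa4879e
```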

inputs

The inputs must match the column names you used when creating the model. If in doubt, you can find the column names in the example request on the Manage Deployments page of RapidMiner Go.

output

By default, when using RapidMiner Go as the engine, the Tableau Connector will return only the prediction values. However, it is possible to select other fields by specifying the output in the JSON snippet. For example, if the model is a classifier with values "Yes" and "No", you can request the confidence values for the class "Yes" as follows:

JSON snippet

SCRIPT_REAL('{"engine": "go",
        "id": "980807d6-9332-4da9-b026-d8af0fa4879e",
        "inputs": ["Year", "Country", "Cases"],
        "output": "confidence(Yes)"}',
    ATTR([Year]),
    ATTR([Country Code]),
    ATTR([Cases])
)

Note here that we also changed the Tableau function call from SCRIPT_STR to SCRIPT_REAL.

JSON for RapidMiner web services

Read more: Create a web service/Scoring Agent deployment

The Tableau Connector can also connect Tableau to a RapidMiner web service/Scoring Agent, if the web service satisfies all of the following conditions:

  • the web service is public,
  • the web service accepts data in the same JSON format as RapidMiner Go, and
  • the web service responds with a data table in JSON format.

Connecting to such a web service follows the same approach as with RapidMiner Go, but there are a few differences.

engine

When connecting Tableau to a RapidMiner web service, the back-end engine is rms.

id

When using a RapidMiner web service as the back end, the deployment id is the name of the web service.

inputs

Unlike with RapidMiner Go, specifying inputs is optional. If inputs are omitted, the data will be passed to RapidMiner using generic column names: _arg1, _arg2, …, _argN.
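The naming rule can be illustrated with a few lines of Python. This is only a sketch of the behavior described above, not the Tableau Connector's code; the function name effective_column_names is hypothetical.

```python
def effective_column_names(snippet, argument_count):
    """Return the column names under which the data reaches the web service.

    snippet: dict holding the JSON snippet
    argument_count: number of Tableau fields passed after the snippet
    """
    inputs = snippet.get("inputs")
    if inputs is None:
        # No inputs given: generic names _arg1, _arg2, ..., _argN are used.
        return [f"_arg{i}" for i in range(1, argument_count + 1)]
    if len(inputs) != argument_count:
        raise ValueError("the number of inputs must match the number of fields")
    return list(inputs)


print(effective_column_names({"engine": "rms", "id": "echo"}, 3))
print(effective_column_names(
    {"engine": "rms", "id": "echo", "inputs": ["Year", "Country", "Cases"]}, 3))
```

A web service that refers to its input columns by name therefore has to expect either the generic names or the names listed in inputs.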

Furthermore, the Tableau Connector supports passing constants such as Tableau parameters to RapidMiner as process macros using query parameters. Please note that you will have to configure the query parameters in the settings for the RapidMiner web service.

output

Finally, since the web service might not produce any prediction, the Tableau Connector will by default return the first column. To return any other column, include the column name as output.

The following examples invoke a public web service named echo that simply echoes the input data. You can download the RapidMiner process used to create the echo web service here.

In the first example, we do not specify an output, and the Tableau Connector returns the values of the first column (Year):

JSON snippet

SCRIPT_STR('{"engine": "rms",
        "id": "echo"}',
    ATTR([Year]),
    ATTR([Country Code]),
    ATTR([Cases])
)

In the second example, the second column is renamed as it was for RapidMiner Go. Moreover, the Tableau Connector returns the values of the third column, Cases, not the first column (Year):

JSON snippet

SCRIPT_REAL('{"engine": "rms",
        "id": "echo",
        "inputs": ["Year", "Country", "Cases"],
        "output": "Cases"}',
    ATTR([Year]),
    ATTR([Country Code]),
    ATTR([Cases])
)
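The difference between the two examples lies in which column of the response is handed back to Tableau. The following Python sketch mimics that selection rule on a response table; the function name select_output and the data values are made up for illustration, and the real connector may be implemented differently.

```python
def select_output(columns, snippet):
    """Pick the column that is returned to Tableau.

    columns: dict mapping column name to a list of values, in response order
    snippet: dict holding the JSON snippet
    """
    output = snippet.get("output")
    if output is None:
        # Default for web services: the first column of the response.
        first_name = next(iter(columns))
        return columns[first_name]
    if output not in columns:
        raise KeyError(f"the response has no column named {output!r}")
    return columns[output]


# Made-up response of the echo web service for two rows of data.
response = {"Year": [2019, 2020], "Country": ["AT", "DE"], "Cases": [120, 340]}

print(select_output(response, {"engine": "rms", "id": "echo"}))
print(select_output(response, {"engine": "rms", "id": "echo", "output": "Cases"}))
```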

Finally, in the third example, we show how to pass Tableau parameters or other constant values to the RapidMiner web service as process macros. The Tableau parameters should be listed before the other inputs:

JSON snippet

SCRIPT_REAL('{"engine": "rms",
        "id": "echo",
        "parameters": ["some-process-macro"],
        "inputs": ["Year", "Country", "Cases"],
        "output": "Cases"}',
    // Parameters go first
    [Some Tableau Parameter],
    // Regular inputs come second
    ATTR([Year]),
    ATTR([Country Code]),
    ATTR([Cases])
)
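The positional rule (parameters first, regular inputs second) can be sketched in Python as follows. The function name split_arguments and the sample values are hypothetical; the sketch only shows how the arguments that follow the JSON snippet line up with the names in parameters and inputs, and it does not claim to reproduce the connector's internals.

```python
def split_arguments(snippet, arguments):
    """Split the values that follow the JSON snippet into macros and inputs.

    The first len(parameters) arguments are matched to the names listed in
    "parameters"; everything after them is treated as regular input data.
    """
    parameter_names = snippet.get("parameters", [])
    if len(arguments) < len(parameter_names):
        raise ValueError("fewer arguments than declared parameters")
    count = len(parameter_names)
    macros = dict(zip(parameter_names, arguments[:count]))
    input_columns = list(arguments[count:])
    return macros, input_columns


snippet = {
    "engine": "rms",
    "id": "echo",
    "parameters": ["some-process-macro"],
    "inputs": ["Year", "Country", "Cases"],
    "output": "Cases",
}
# Made-up values: one Tableau parameter followed by three data columns.
macros, input_columns = split_arguments(
    snippet, ["0.5", [2019, 2020], ["AT", "DE"], [120, 340]])
print(macros)
print(input_columns)
```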

Example: Churn prediction with RapidMiner Go

In the remainder of this document we will look at a simple example, integrating a predictive RapidMiner Go model into an interactive Tableau dashboard. The example has four parts:

  1. The data
  2. Deploy a model in RapidMiner Go
  3. Create a calculated field in Tableau
  4. Interactive Tableau dashboards

The data

The example considers a simple churn use case for a fictional mobile phone and internet service provider. We have historical customer data in the following format.

Click here to download the data file: historical_customer_data.csv

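| Customer ID | Mobile Plan | Partner Plan | Internet Plan | Monthly Payment | Churn |
|-------------|-------------|--------------|---------------|-----------------|-------|
| CX484660    | Yes         | No           | No            | 30.95           | No    |
| CX968493    | Yes         | No           | No            | 20.95           | No    |
| CX762775    | No          | No           | Yes           | 42.95           | Yes   |
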
If the value in the Churn column is "Yes", the customer has cancelled their contract with the service provider. If it is "No", the customer has remained with the company.

The idea is to predict which of the current customers are likely to churn during the next period. The data for these customers is part of a second data set which has the same structure except that it does not include the Churn column.

Click here to download the data file: current_customer_data.csv

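| Customer ID | Mobile Plan | Partner Plan | Internet Plan | Monthly Payment |
|-------------|-------------|--------------|---------------|-----------------|
| CX112292    | Yes         | Yes          | Yes           | 77.89           |
| CT245008    | Yes         | Yes          | Yes           | 74.89           |
| CX227029    | No          | No           | Yes           | 27.95           |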

Deploy a model in RapidMiner Go

First, we upload the first data set (historical_customer_data.csv) to RapidMiner Go to build a predictive model. In Choose Column to Predict, select Churn:

In Select your inputs, RapidMiner Go will automatically detect the Customer ID column as a unique identifier and exclude it from the analysis based on High ID-ness:

For this example we use the preset focusing on Easily interpretable algorithms, but you can of course use other algorithms.

Please note that we have turned off the Explain predictions feature. This feature, when enabled, augments predictions with additional information about influence factors, but at the cost of slowing down the prediction. Since we will not use this information in the Tableau dashboard, we can disable the feature:

Of the three validated models, the Generalized Linear Model performs the best with our data set:

You can deploy the model by navigating to the model details and then clicking on Apply Model > Deploy Model. Make sure to copy the deployment URL, since its last part is the id that will be inserted into the JSON snippet.

Our model is now ready for use!

Create a calculated field in Tableau

Next, we import the second data set (current_customer_data.csv) into Tableau. Note that this second data set lacks the Churn column -- the Churn value will be predicted by RapidMiner Go. Once it is imported into Tableau, the data looks as follows:

In Tableau, take the following steps:

  • select Analysis > Create Calculated Field, and
  • paste the JSON snippet given below into the Calculation Editor.

The JSON snippet includes the RapidMiner Go id we saved previously, plus the column names. Although in this case the column names are the same in RapidMiner Go and Tableau, we still list them explicitly, because Tableau does not include the names when sending data to the Tableau Connector.

JSON snippet

// https://aihub.company.test/am/api/deployments/1cfea671-d53b-4b45-97cc-72edcc54a7bc
SCRIPT_STR('{
        "engine": "go",
        "id": "1cfea671-d53b-4b45-97cc-72edcc54a7bc",
        "inputs": ["Mobile Plan", "Partner Plan", "Internet Plan", "Monthly Payment"]
    }',
    ATTR([Mobile Plan]),
    ATTR([Partner Plan]),
    ATTR([Internet Plan]),
    ATTR([Monthly Payment])
)

We can now add this table calculation to the Tableau dashboard, e.g., as a color mark for the table. Tableau will update the dashboard automatically with the churn predictions returned by RapidMiner Go:

Interactive Tableau dashboards

The dashboard created in the previous section provides very little for the dashboard user to interact with. In particular, the RapidMiner Go model is always applied to the same data. However, with only a small change you can turn the dashboard into an interactive simulator.

The idea is to introduce a parameter representing a rebate on monthly payments. We then score the modified data to see what effect the rebate has on churn prediction. The result is a Churn Simulator.

In the first step, we create a parameter called Rebate, with a range of values between 0% and 90%.

Next, we use this parameter to create a calculated field called Lowered Monthly Payment, which applies the rebate to the field Monthly Payment taken from our original data:

[Monthly Payment] * (100 - [Rebate]) / 100
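To check the arithmetic outside Tableau, the same formula can be written as a small Python function. The function name lowered_monthly_payment is hypothetical, and the sketch assumes that Rebate is stored as a percentage number between 0 and 90, matching the parameter range chosen above.

```python
def lowered_monthly_payment(monthly_payment, rebate_percent):
    """Apply a percentage rebate to a monthly payment."""
    if not 0 <= rebate_percent <= 90:
        raise ValueError("the rebate must lie between 0 and 90 percent")
    return monthly_payment * (100 - rebate_percent) / 100


# Customer CX112292 from current_customer_data.csv pays 77.89 per month.
for rebate in (0, 10, 20):
    print(rebate, round(lowered_monthly_payment(77.89, rebate), 2))
```

With a rebate of 10, for instance, the payment of 77.89 is multiplied by 90 / 100, which gives roughly 70.10.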

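Finally, we modify the Churn Prediction so that it takes the Lowered Monthly Payment instead of the Monthly Payment. In effect, the model scores a simulated data set that depends on the Rebate. The name in inputs remains Monthly Payment, because that is the column name the model expects.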

JSON snippet

// https://aihub.company.test/am/api/deployments/1cfea671-d53b-4b45-97cc-72edcc54a7bc
SCRIPT_STR('{
        "engine": "go",
        "id": "1cfea671-d53b-4b45-97cc-72edcc54a7bc",
        "inputs": ["Mobile Plan", "Partner Plan", "Internet Plan", "Monthly Payment"]
    }',
    ATTR([Mobile Plan]),
    ATTR([Partner Plan]),
    ATTR([Internet Plan]),
    // The lowered monthly payment includes a simulated rebate
    ATTR([Lowered Monthly Payment])
)

All that is left to do is to add the parameter Rebate and the field Lowered Monthly Payment to the dashboard. Change the slider value for the Rebate, and the predictions are updated automatically.

As expected, some predictions change from "Yes" to "No" -- the model predicts that those customers could be convinced to stay with the company if we lower their monthly payments.