RapidMiner Go documentation, version 9.9
Apply your model
The new data set
When you created your model, you knew the answers in advance. For example, when we were creating a model to predict sales from advertising, we knew the advertising budget for TV, radio, and newspapers, and we knew the resulting sales figures. Without that information, there would have been no obvious way to build the model.
Advertising.csv
| | TV | radio | newspaper | sales |
|---|---|---|---|---|
| 1 | 230.1 | 37.8 | 69.2 | 22.1 |
| 2 | 44.5 | 39.3 | 45.1 | 10.4 |
Once you have a model, it's the job of the model to provide the answers. To do so, it requires new data as input, compatible with the data that was used to build the model.
- The new data must include columns of the same type as the inputs you selected when you built the model.
- The model ignores any columns of the same type as the inputs you deselected when you built the model.
- The model also ignores any column of the same type as the target column; its values are what the model predicts.
In Advertising.csv, the first column is unneeded, because it was identified as an ID and deselected. The target column was "sales"; future values are unknown.
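The compatibility rules above can be checked before you upload a file. The following is a minimal sketch, not part of RapidMiner Go: the function name `check_columns` is hypothetical, and it compares column names only, which is an assumption about how the columns are matched.

```python
# Input columns selected when the example model was built (from Advertising.csv).
MODEL_INPUTS = ["TV", "radio", "newspaper"]


def check_columns(header, inputs=MODEL_INPUTS):
    """Compare a new data set's header against the model's inputs.

    Returns (missing, ignored):
      missing - model inputs that the new data set does not contain
      ignored - columns the model would not use (deselected columns or the target)
    """
    present = set(header)
    missing = [name for name in inputs if name not in present]
    ignored = [name for name in header if name not in inputs]
    return missing, ignored


# The three-column proposal table has every input and nothing extra.
missing, ignored = check_columns(["TV", "radio", "newspaper"])
print("missing:", missing, "ignored:", ignored)
```

A header such as `["id", "TV", "radio", "sales"]` would report `newspaper` as missing, and `id` and `sales` as ignored.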
Suppose we're planning the budget for our next advertising campaign, and we have three different proposals. We can predict "sales" for each of them, and compare the results, by applying the model to the data below.
Three proposals for the advertising budget
TV | radio | newspaper |
---|---|---|
200 | 0 | 100 |
250 | 50 | 0 |
300 | 0 | 0 |
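If you prefer to generate the upload file rather than type it by hand, the table above can be written out as CSV. This is a sketch under assumptions: the helper name `proposals_to_csv` is hypothetical, and it assumes the new data set is uploaded as a CSV file with the same column names as Advertising.csv.

```python
import csv
import io

# Column names and values copied from the proposal table above.
COLUMNS = ["TV", "radio", "newspaper"]
PROPOSALS = [(200, 0, 100), (250, 50, 0), (300, 0, 0)]


def proposals_to_csv(rows, columns=COLUMNS):
    """Return the proposals as CSV text: one header row, then one row per proposal."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(columns)
    writer.writerows(rows)
    return buffer.getvalue()


csv_text = proposals_to_csv(PROPOSALS)
print(csv_text)
```

Writing `csv_text` to a file of your choice produces a data set that contains only the model's inputs, without an ID or "sales" column.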
Notice that these values were chosen from within the range of values in the training set:
- TV: 0 - 300
- radio: 0 - 50
- newspaper: 0 - 114
If you want to make predictions based on values outside this range, the errors are likely to be larger than they were for the test set.
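The range check described above is easy to automate before you apply the model. The sketch below uses the ranges listed for the training set; the function name `out_of_range` is an illustrative assumption, not a RapidMiner Go feature.

```python
# Value ranges of the training set, as listed above: (lowest, highest).
TRAINING_RANGES = {"TV": (0, 300), "radio": (0, 50), "newspaper": (0, 114)}


def out_of_range(row, ranges=TRAINING_RANGES):
    """Return the names of columns whose value lies outside the training range."""
    return [
        name
        for name, (low, high) in ranges.items()
        if not low <= row[name] <= high
    ]


# The three proposals for the advertising budget.
proposals = [
    {"TV": 200, "radio": 0, "newspaper": 100},
    {"TV": 250, "radio": 50, "newspaper": 0},
    {"TV": 300, "radio": 0, "newspaper": 0},
]

for row in proposals:
    # An empty list means every value is inside the training range.
    print(row, out_of_range(row))
```

Any column name returned for a row marks a value for which the prediction is an extrapolation and should be treated with more caution.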
Apply your model to the new data set
You can apply your model to the new data set either:
- immediately after you've finished building it, from within Inspect results, or
- later, by clicking on a recent analysis, via Home > Manage recent analyses
Once you have chosen an analysis:
1. Choose a model (in this example, Decision Tree).
2. Click on Apply Model, and select Apply to new data set from the drop-down menu.
3. Click Add Data, and upload the new data set.
4. When the new data set is displayed, click on Calculate Predictions.

The new data set is displayed, together with the predictions.
Regarding our three proposals for the advertising budget, there's a clear winner! The predictive model says that to maximize the predicted sales, we should invest heavily in radio advertising. The amount of TV and newspaper advertising seems to be less important, at least in these examples.
Click the Export button if you want to download the table in Excel format (prediction_result.xlsx).