To say whether a model is good or bad, and in particular whether it is better or worse than some other model, you need some basis of comparison. By assigning the model a numeric measure of success, a so-called performance metric, you can compare it with other models and get some idea of its relative success.
The complication is that many different performance metrics exist, and none of them is an absolute standard of success; each has strengths and weaknesses, depending on the problem you're trying to solve. You will have to choose the performance metric that best fits your problem; with the help of that metric, you can then choose the best model.
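As a rough illustration of choosing a model once a metric has been fixed, here is a minimal Python sketch. The function `best_model`, the model names, and the scores are all hypothetical and invented for this example; they are not output from Auto Model Web. The one point it tries to show is that "best" depends on the metric's direction: for some metrics (such as accuracy) higher is better, for others (such as an error measure) lower is better.

```python
def best_model(scores, higher_is_better=True):
    """Return the name of the model with the best score for one metric.

    scores: dict mapping a model name to its metric value.
    higher_is_better: True for metrics like accuracy, False for error metrics.
    """
    pick = max if higher_is_better else min
    return pick(scores, key=scores.get)


# Hypothetical values, made up purely for illustration.
accuracy_by_model = {"model_a": 0.81, "model_b": 0.86, "model_c": 0.79}
error_by_model = {"model_a": 4.2, "model_b": 5.1, "model_c": 3.7}

best_by_accuracy = best_model(accuracy_by_model)                     # highest wins
best_by_error = best_model(error_by_model, higher_is_better=False)   # lowest wins
```

Note that the two metrics in this made-up example disagree about which model is best, which is why the choice of metric comes first.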
To calculate the performance metrics, we start by building a model based on a random sample of 80% of your data (called the training set). Once it is built, we apply the model to the remaining 20% of your data (called the test set) and compare the predictions with the known values. Ideally there would be no difference, but in practice there usually is, because a model's predictions are rarely 100% correct.
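The 80/20 procedure described above can be sketched in a few lines of Python. This is not how Auto Model Web is implemented; it is a minimal sketch under stated assumptions. The helper names (`split_rows`, `fit_majority_class`, `accuracy`), the toy "model" that always predicts the most frequent label, and the generated data are all hypothetical stand-ins chosen to keep the example self-contained.

```python
import random


def split_rows(rows, train_fraction=0.8, seed=0):
    """Shuffle the rows and split them into a training set and a test set."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def fit_majority_class(train_rows):
    """Toy stand-in for a model: remember the most frequent training label."""
    labels = [label for _, label in train_rows]
    return max(set(labels), key=labels.count)


def accuracy(predictions, actuals):
    """Fraction of predictions that match the known values."""
    matches = sum(p == a for p, a in zip(predictions, actuals))
    return matches / len(actuals)


# Hypothetical data set: (feature, label) pairs, invented for illustration.
rows = [(i, "yes" if i % 3 else "no") for i in range(100)]

train, test = split_rows(rows)                 # 80% training, 20% test
majority_label = fit_majority_class(train)     # "build" the model on the training set
predictions = [majority_label for _ in test]   # apply it to the test set
actuals = [label for _, label in test]         # the known values
score = accuracy(predictions, actuals)         # compare predictions with known values
```

Because the test rows were never used to build the model, `score` estimates how the model would behave on data it has not seen.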
Recall from "2. Choose column" that the type of problem you're solving depends on the values in the target column. Are they categorical or numerical? Depending on what you're trying to predict, there are different performance metrics. For a more detailed discussion, including examples, see the links below:
- 5A. Binary classification (categorical data, two possible values)
- 5B. Multiclass classification (categorical data, three or more possible values)
- 5C. Regression (numerical data)
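The three cases listed above can be expressed as a small decision rule on the target column. The following Python sketch is only an illustration; the function `problem_type` is hypothetical, and its rule (any all-numeric column counts as regression) is a simplifying assumption rather than the logic Auto Model Web actually applies.

```python
from numbers import Number


def problem_type(target_values):
    """Classify a prediction problem from the values in the target column.

    Assumption: a column whose values are all numbers is treated as numerical,
    even if it only contains two distinct values such as 0 and 1.
    """
    values = [v for v in target_values if v is not None]  # ignore missing values
    if not values:
        raise ValueError("target column has no values")

    if all(isinstance(v, Number) and not isinstance(v, bool) for v in values):
        return "regression"

    distinct = len(set(values))
    if distinct == 2:
        return "binary classification"
    if distinct >= 3:
        return "multiclass classification"
    raise ValueError("target column has only one distinct value")


kind_a = problem_type(["yes", "no", "yes", "no"])       # two categories
kind_b = problem_type(["red", "green", "blue", "red"])  # three categories
kind_c = problem_type([12.5, 7.0, 3.2, 9.9])            # numerical values
```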
The results include charts and tables illustrating the relative strength of each model. When you're done, you can download these charts and tables by clicking Export.
Auto Model Web is not a black box. When you're done building the model, you can download a copy of the RapidMiner process that created it, and you can import the process into RapidMiner Studio for more detailed examination. You can run this process, you can modify this process, you can make any changes you like!
Charts as PNG
The charts include all the performance measures, displayed graphically.
Tables as Excel
Any tables displayed in the results, including confusion matrices and performance comparison tables, can be exported in an Excel format (