Model Breakdown: Tech Specs
Written by Obviously AI
Updated over a week ago

This article describes how to evaluate a trained model’s performance once you have uploaded your data to the platform and built your first prediction model. The model is trained on 80% of the data and its performance is evaluated on the remaining 20%. All the technical details of your model, including graphs, can be found under the Tech Specs tab of the model.
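The 80/20 split described above can be sketched with plain NumPy (a toy dataset and a hypothetical split; the platform performs this step internally and its exact procedure may differ):

```python
import numpy as np

rng = np.random.default_rng(0)
n_rows = 50
X = rng.normal(size=(n_rows, 3))      # toy feature matrix standing in for your upload
y = rng.integers(0, 2, size=n_rows)   # toy binary labels

# Shuffle row indices, then put 80% of rows in train and 20% in test.
idx = rng.permutation(n_rows)
cut = int(0.8 * n_rows)
train_idx, test_idx = idx[:cut], idx[cut:]

X_train, y_train = X[train_idx], y[train_idx]
X_test, y_test = X[test_idx], y[test_idx]
print(len(X_train), len(X_test))  # 40 10
```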

AutoML

Classification

The first section is the Model Charts.

There are 3 main graphs (that you can download using the corresponding download buttons):

1. Decision tree: The decision tree details the model’s decision-making process, splitting on each feature and its value in order to arrive at a decision on the classes.

2. Confusion Matrix: The confusion matrix gives the performance details of the trained classification model (classifier); here you can check the precision and recall values. It is shown as an m×m table: if you have 2 classes you have a 2×2 confusion matrix, if you have 3 classes (multiclass prediction) you have a 3×3 matrix, and so on. The y-axis shows the actual labels and the x-axis shows the predicted labels. The higher the values on the diagonal, the better the classification model, since it means the model predicted most of the values correctly.

3. Actual vs Predicted: Next, we have the Actual vs. Predicted Value bar charts, which convey the same information as the confusion matrix but can be easier to follow. Blue is for correctly predicted (True) labels and red is for incorrectly predicted (False) ones. On the x-axis we have the classes and on the y-axis we have the count/frequency of those classes. For example, the Churn bar chart conveys that, of all the values the model predicted for the Churn = No class, blue is the count of correct classifications and red is the count of incorrect classifications. The depiction is the same for all the other classes.
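As an illustration of how a confusion matrix relates to precision and recall, here is a minimal sketch on made-up binary Churn labels (1 = Yes, 0 = No); this is not the platform’s code, just the standard construction:

```python
import numpy as np

# Hypothetical test-set labels for a binary Churn model (1 = "Yes", 0 = "No").
actual    = np.array([0, 0, 1, 1, 0, 1, 0, 1, 1, 0])
predicted = np.array([0, 1, 1, 1, 0, 0, 0, 1, 1, 0])

# Rows = actual class, columns = predicted class (matching the y/x axes above).
cm = np.zeros((2, 2), dtype=int)
for a, p in zip(actual, predicted):
    cm[a, p] += 1
print(cm)

# Precision and recall for the positive class read straight off the matrix.
tp, fp, fn = cm[1, 1], cm[0, 1], cm[1, 0]
precision = tp / (tp + fp)
recall = tp / (tp + fn)
print(precision, recall)
```

Correct predictions accumulate on the diagonal (`cm[0, 0]` and `cm[1, 1]`), which is why a strong diagonal indicates a good classifier.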

The next section is the Model Metrics, where you’ll find the details of the algorithm such as the algorithm name; training, validation and testing accuracies; F1 score; precision; recall; area under curve; number of iterations; upsampling %; loss function; regularization method and learning rate. In the middle you’ll find the corresponding value of each metric. On the right you’ll find an indication of whether the value needs to increase for better performance. The tooltip associated with each metric gives a brief description of that metric.
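Several of these metrics can be reproduced with scikit-learn on a toy test set. The labels and scores below are made up for illustration; the platform computes its metrics internally and may weight or average them differently:

```python
import numpy as np
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, roc_auc_score)

# Hypothetical held-out labels, hard predictions, and probability scores.
y_true  = np.array([0, 0, 1, 1, 0, 1, 0, 1, 1, 0])
y_pred  = np.array([0, 1, 1, 1, 0, 0, 0, 1, 1, 0])
y_score = np.array([0.2, 0.7, 0.9, 0.8, 0.1, 0.4, 0.3, 0.95, 0.85, 0.05])

acc = accuracy_score(y_true, y_pred)
prec = precision_score(y_true, y_pred)
rec = recall_score(y_true, y_pred)
f1 = f1_score(y_true, y_pred)
auc = roc_auc_score(y_true, y_score)   # area under the ROC curve
print(acc, prec, rec, f1, auc)
```

Note that accuracy, precision, recall and F1 use the hard predictions, while the area under curve is computed from the probability scores.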

The next section shows the details of the uploaded dataset on which you built this model. Rows or columns in your dataset that were not used during model building are reflected here. Additionally, you’ll find details of which columns were used and which were dropped by the platform; the list of dropped columns also includes the columns that you chose not to use for model building.

The final section is the Other Machine Learning Models that were tested. When the model training process starts, the platform runs six main classification algorithms on the backend, each with different hyper-parameter combinations. This results in more than 10,000 models being run in parallel. Finally, the model with the best weighted metric combination is chosen by the platform. We display the top 2 hyper-parameter combinations (named A and B) for each algorithm. You have the flexibility to choose a different model and check its details by clicking Re-run with this model.
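Keeping the two best hyper-parameter combinations per algorithm can be sketched as a simple grid search. Everything below is hypothetical: the grid, the parameter names, and the `score` function are stand-ins for the platform’s real training runs and weighted metric combination:

```python
import itertools

# Hypothetical hyper-parameter grid for one algorithm (values are illustrative).
grid = {
    "learning_rate": [0.01, 0.1, 0.3],
    "max_depth": [3, 5, 7],
}

def score(params):
    # Stand-in for training and evaluating a model with these settings;
    # the platform would use a weighted combination of real metrics instead.
    return 1.0 - abs(params["learning_rate"] - 0.1) - 0.01 * params["max_depth"]

# Enumerate every combination, rank by score, and keep the top two.
combos = [dict(zip(grid, values)) for values in itertools.product(*grid.values())]
ranked = sorted(combos, key=score, reverse=True)
top_a, top_b = ranked[:2]  # the "A" and "B" combinations shown in the UI
print(top_a, top_b)
```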

Regression

The first section is the Model Charts. There are 3 main graphs (that you can download using the corresponding download buttons):

1. Percentage Error Plot: The percentage error plot is a 3D plot showing the test data points on the x-axis, with the prediction column and the percent error on the y-axes.

2. Error Plot Histogram: The error plot histogram gives the same information as the percentage error plot, but as a different visualization: the bar chart depicts the number of data points that fall within each range of percent error.

3. Actual vs Predicted: The Actual vs. Predicted Value line plot gives the details of the test-set results. The x-axis has the data points and the y-axis has the target values. The blue line represents the actual values in the test set and the red line represents the values predicted by the model. Where the blue and red lines overlap, the actual and predicted values are the same; the more the overlap, the better the performance of the model.
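The quantities behind the two error charts can be sketched with NumPy. The actual and predicted values below are made up for illustration, as are the histogram bin edges:

```python
import numpy as np

# Hypothetical actual vs. predicted values on a held-out test set.
actual    = np.array([100.0, 200.0, 150.0, 80.0, 120.0])
predicted = np.array([110.0, 190.0, 150.0, 100.0, 114.0])

# Percent error per test point, as plotted in the Percentage Error Plot.
pct_error = 100.0 * np.abs(actual - predicted) / np.abs(actual)
print(pct_error)

# Bucketing those errors gives the Error Plot Histogram bar heights:
# the count of data points falling in each percent-error range.
counts, edges = np.histogram(pct_error, bins=[0, 10, 20, 30])
print(counts)
```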

The next section is the Model Metrics, where you’ll find the details of the algorithm such as the algorithm name; training, validation and testing RMSE (root mean squared error); R2 score; mean squared error; mean absolute error; min and max absolute errors; mean absolute percentage error; root mean squared percentage error; mean squared log error and loss function. In the middle you’ll find the corresponding value of each metric. On the right you’ll find an indication of whether the value needs to increase or decrease for better performance. The tooltip associated with each metric gives a brief description of that metric.
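A few of these regression metrics can be reproduced with scikit-learn and NumPy on toy values (made up for illustration; the platform computes its metrics internally):

```python
import numpy as np
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score

# Hypothetical actual and predicted values on a held-out test set.
y_true = np.array([3.0, 5.0, 2.5, 7.0])
y_pred = np.array([2.5, 5.0, 4.0, 8.0])

mse = mean_squared_error(y_true, y_pred)
rmse = np.sqrt(mse)                       # root mean squared error
mae = mean_absolute_error(y_true, y_pred)
r2 = r2_score(y_true, y_pred)
mape = 100.0 * np.mean(np.abs((y_true - y_pred) / y_true))  # in percent
print(rmse, mae, r2, mape)
```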

The next section shows the details of the uploaded dataset on which you built this model. Rows or columns in your dataset that were not used during model building are reflected here. Additionally, you’ll find details of which columns were used and which were dropped by the platform; the list of dropped columns also includes the columns that you chose not to use for model building.

The final section is the Other Machine Learning Models that were tested. When the model training process starts, the platform runs five main regression algorithms on the backend, each with different hyper-parameter combinations. This results in more than 10,000 models being run in parallel. Finally, the model with the best weighted metric combination is chosen by the platform. We display the top 2 hyper-parameter combinations (named A and B) for each algorithm. You have the flexibility to choose a different model and check its details by clicking Re-run with this model.

Time Series

The Tech Specs tab shows the following sections:

The Model Charts are downloadable and consist of the Error Plot Histogram and the Actual vs. Predicted plot.

The error plot histogram is a visualization of the % errors and their corresponding counts. On the x-axis we have the % error (between actual and predicted values) and on the y-axis we have the count, so each bar represents the number of data points with the corresponding % error. For example, a bar of height 5 at 10% means there are 5 data points with an error of 10%.

The Actual vs. Predicted graph shows the line plot for the actual sales (blue) and predicted sales (red) on the test set. The more the overlap, the better the forecast. On the x-axis we have the Dates and on the y-axis we have the sales.

The next section is the Model Metrics, where you’ll find the details of the algorithm such as the algorithm name; testing RMSE (root mean squared error); mean squared error; mean absolute error; mean absolute percentage error; mean absolute scaled error and skill score. In the middle you’ll find the corresponding value of each metric. On the right you’ll find an indication of whether the value needs to increase or decrease for better performance. The tooltip associated with each metric gives a brief description of that metric.
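As a rough sketch of two of the time-series metrics, mean absolute scaled error (MASE) compares the forecast against a naive baseline that predicts each value from its predecessor. The sales numbers below are made up, and the platform’s exact skill-score formula is not documented here; the skill score shown is the common "improvement over baseline" definition, used as an assumption:

```python
import numpy as np

# Hypothetical sales series: training history plus a held-out test window.
train = np.array([100.0, 110.0, 105.0, 120.0, 115.0])
actual = np.array([125.0, 130.0])
forecast = np.array([120.0, 128.0])

# MASE: forecast MAE divided by the in-sample MAE of a naive one-step
# forecast (each value predicted by the previous one).
mae_forecast = np.mean(np.abs(actual - forecast))
mae_naive = np.mean(np.abs(np.diff(train)))
mase = mae_forecast / mae_naive
print(mase)  # below 1 means the model beats the naive baseline

# Assumed skill-score form: fractional error reduction vs. the baseline.
skill = 1.0 - mae_forecast / mae_naive
print(skill)
```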

The next section shows the details of the uploaded dataset on which you built this model: specifically, the dataset name, source and number of rows. Additionally, on the right we display the aggregation details, that is, the data level, aggregation function and seasonality chosen before building this model.

The final section is the Other Machine Learning Models that were tested. When the model training process starts, the platform runs seven main time series algorithms on the backend. Finally, the model with the best skill score is chosen by the platform. You have the flexibility to choose a different model and check its details by clicking Re-run with this model.

To see Obviously AI in action, check out this demo video OR enroll in the No-Code AI University for free to become a certified no-code AI expert.
