1. This article describes how to run predictions on a trained model once you have uploaded your dataset to the platform and created a prediction report. A prediction report is a repository of information about the model: its top drivers, the algorithms used, the accuracy achieved, the significance of the model fit, and so on.

  2. The Predictions tab on the prediction report opens the Predictions page. Here you can try out new combinations of values for your feature columns, and you can also preview the test data alongside the model’s predictions on it.

  3. The first section of the Predictions page focuses on Personas. Personas let you test what-if scenarios: you enter a combination of feature values that might occur in the future and see what the model would predict for it.

  4. The prediction column is on the left (e.g., severity here). This prediction column has four classes, and the current random combination of feature column values predicts minor_damage with 100% probability. The platform also shows the relative percentage for every other class, so the percentages across all classes total 100%. Every feature value has a dropdown associated with it. For example, numeric values can be increased or decreased to a desired value, and for text feature columns the dropdown lets you choose among the class values present in the original dataset.
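As a rough illustration of how the class percentages relate to each other, the sketch below uses the four severity classes from this article with made-up probabilities. The numbers are assumptions chosen for the example, not output from the platform.

```python
# Hypothetical class probabilities for one persona, in percent.
# The class names come from the article; the numbers are illustrative only.
probabilities = {
    "Fatal": 2.5,
    "minor_damage": 81.0,
    "serious_damage": 10.5,
    "significant_damage": 6.0,
}

# The relative percentages across all classes add up to 100%.
total = sum(probabilities.values())

# The predicted class is the one with the highest probability.
predicted_class = max(probabilities, key=probabilities.get)

print(f"total = {total:.1f}%")
print(f"predicted class = {predicted_class}")
```

Changing a feature value in a persona would shift these percentages, but their sum would stay at 100%.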

  5. The “Ideal Personas” tab shows the feature value combinations that lead the model to predict a particular class in the prediction column. For example, here the severity classes are Fatal, minor_damage, serious_damage and significant_damage. The platform generates “Ideal Personas” for all classes by default, so you can see which feature values lead to a 100% probability of each class in the prediction column.

    For regression tasks, the Ideal Personas show the feature value combinations that lead to the highest and lowest predicted values.

  6. Next on this page is the “Sample Predictions” section, which details the model’s predictions on the test set. As mentioned in our Data Preprocessing methods article, we split the dataset 80-20: 80% of the data is used to train the models and 20% is used to evaluate the performance of the trained model.

    On the left are the Sample Inputs: the feature columns from the original dataset without the prediction column, so the model does not see the actual value for any of these rows. Once the model is trained, we run it on this test dataset and collect the predicted value for each row, shown in the Predictions section on the far right. Whenever the predicted value matches the actual value, the output is labeled “Correct”. If the predicted value differs from the actual value, it is labeled “Incorrect”.

    For regression tasks, each value in the prediction column of the Sample Predictions carries the tag “In Range” or “Out Range”. The tags come from the precision range, shown as the uncertainty in the Personas section. For example, Agility = 67.70 ± 4.938 means that the predicted value can lie between 67.70 - 4.938 = 62.762 and 67.70 + 4.938 = 72.638. This helps the user understand the range of a particular predicted value. “In Range” means that the actual value lies within the range of the predicted value, and “Out Range” means that the actual value lies outside it.
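The range arithmetic above can be sketched in a few lines of Python. The function names are hypothetical, and treating the two endpoints as inside the range is an assumption; the article does not say how the platform handles values that fall exactly on a boundary.

```python
def prediction_range(predicted, uncertainty):
    """Return the (low, high) bounds of a prediction with +/- uncertainty."""
    return predicted - uncertainty, predicted + uncertainty


def range_tag(actual, predicted, uncertainty):
    """Tag an actual value as 'In Range' or 'Out Range' for a prediction.

    Assumption: the bounds themselves count as in range.
    """
    low, high = prediction_range(predicted, uncertainty)
    return "In Range" if low <= actual <= high else "Out Range"


# The Agility example from the text: 67.70 +/- 4.938
low, high = prediction_range(67.70, 4.938)
print(round(low, 3), round(high, 3))

# Two made-up actual values, one inside and one outside the range
print(range_tag(70.0, 67.70, 4.938))
print(range_tag(75.0, 67.70, 4.938))
```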

    The Sample Predictions can be downloaded as a CSV file from here so that you can inspect the entire 20% test set, since the platform shows only the first 16 data points.

    For classification, the file contains the prediction column with the actual values, a column with the model’s predicted values, and the prediction probability.

    For regression, the file contains only the actual values and the predicted values, without the uncertainty range.
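Once a classification file is downloaded, the “Correct” and “Incorrect” labels can be reproduced offline by comparing the actual and predicted columns. The sketch below is a minimal example using only the Python standard library. The column names (`severity`, `predicted_severity`, `probability`) and the rows are hypothetical stand-ins, so check the header of your own file before adapting it.

```python
import csv
import io

# Stand-in for a downloaded classification CSV.
# Column names and rows are hypothetical, not taken from the platform.
sample_csv = """severity,predicted_severity,probability
minor_damage,minor_damage,0.93
Fatal,serious_damage,0.55
serious_damage,serious_damage,0.78
significant_damage,significant_damage,0.64
"""

rows = list(csv.DictReader(io.StringIO(sample_csv)))

# Label each row the way the Sample Predictions section does:
# "Correct" when the prediction matches the actual value.
labels = [
    "Correct" if row["severity"] == row["predicted_severity"] else "Incorrect"
    for row in rows
]

# Share of test rows that the model predicted correctly.
accuracy = labels.count("Correct") / len(labels)

print(labels)
print(f"accuracy = {accuracy:.0%}")
```

To use a real download, replace `io.StringIO(sample_csv)` with an opened file handle for the CSV and adjust the column names to match its header.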
