1. This article describes the factors that influence the predictions made on your data. The data you collect may contain many feature columns. While it is advisable to include as many feature columns as possible, not all features are equally important, so it also helps to know how much each one contributes to the predictions.

  2. After you upload your data and generate your prediction report, the Overview tab shows a section called Drivers. The left side of this section lists the feature columns from the original data that was used to train the model. The features are ranked by their influence on the final prediction column.

  3. The feature with the highest impact appears at the top of the list, and the feature with the lowest impact appears at the bottom.

  4. The right side shows the distribution graph of each feature. For example, when the severity is Fatal, the highest impact comes from accident_type, and its distribution graph appears on the right. The accident_type feature has 7 classes, and each bar shows the frequency of one class.

  5. For a numerical column such as safety_score, the distribution graph is a continuous probability distribution. In this example, the curve for severity Fatal reaches its maximum when safety_score is around 2.5.

    The impact values of all the feature columns add up to approximately 100%.
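
The ranking described in the steps above can be illustrated with a short Python sketch. This is not the method used to build the Drivers section, which this article does not describe. It only shows one plausible way to turn raw impact scores into percentages and sort them from highest to lowest. The scores, the `rank_drivers` helper, and the feature names other than accident_type and safety_score are hypothetical.

```python
# Hypothetical raw impact scores. The real scoring method is not described
# in this article; these numbers are made up for illustration only.
raw_impact = {
    "safety_score": 3.1,
    "accident_type": 4.2,
    "days_since_inspection": 1.5,  # hypothetical feature name
    "control_metric": 1.2,         # hypothetical feature name
}


def rank_drivers(scores):
    """Return (feature, percent) pairs sorted from highest to lowest impact."""
    total = sum(scores.values())
    if total == 0:
        raise ValueError("at least one feature must have a non-zero impact")
    # Express each score as a share of the total so the shares sum to 100.
    percents = {name: 100.0 * value / total for name, value in scores.items()}
    return sorted(percents.items(), key=lambda item: item[1], reverse=True)


drivers = rank_drivers(raw_impact)
for name, pct in drivers:
    print(f"{name:<24}{pct:5.1f}%")
```

If each percentage is rounded before it is displayed, the rounded values may sum to slightly more or less than 100, which could be one reason a displayed total is only close to 100%. That explanation is an assumption, not something the report confirms.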

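The bar chart for a categorical feature such as accident_type can be approximated with a frequency count. The sketch below assumes the bars count rows whose severity matches the selected outcome. The article does not confirm that detail, and the rows, class names, and `class_frequencies` helper are hypothetical.

```python
from collections import Counter

# Hypothetical (accident_type, severity) rows; the class names are made up.
rows = [
    ("engine_failure", "Fatal"),
    ("bird_strike", "Minor"),
    ("engine_failure", "Fatal"),
    ("runway_excursion", "Fatal"),
    ("bird_strike", "Fatal"),
    ("engine_failure", "Minor"),
]


def class_frequencies(rows, target):
    """Count each accident_type class among rows with the given severity."""
    return Counter(kind for kind, severity in rows if severity == target)


fatal_counts = class_frequencies(rows, "Fatal")
# Print a simple text bar for each class, most frequent first.
for kind, count in fatal_counts.most_common():
    print(f"{kind:<18}{'#' * count} ({count})")
```
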