
NJP

Tuning Predictive Intelligence Models (part 1)

Import · Apr 06, 2020 · article

You're sold on using Predictive Intelligence to route work more efficiently, drive faster resolution, and flag major incidents before they become a serious problem. AWESOME! Then you run your first classification, similarity, or clustering solution definition and the results aren’t what you expected. What do you do? Part 1 of this article breaks down some of the most common tuning patterns we see in the field for classification. Part 2 will provide similar tuning recommendations for similarity and clustering.

Classification is used to auto-populate fields on case/incident forms to assist in intelligent routing to the most efficient assignment group. To be clear, machine learning will never be right (i.e., precise) 100% of the time. The goal is for the machine learning model to be right more often than its human counterparts. Using real-world data, classification models with a precision higher than 70% and coverage higher than 80% are typically more effective than agents triaging thousands of incidents/cases per month.
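As a rough mental model (this is an illustrative sketch, not ServiceNow's internal implementation), precision is the fraction of the predictions the model actually makes that turn out to be correct, and coverage is the fraction of records the model is confident enough to predict at all. The field values below are hypothetical:

```python
# Sketch: how precision and coverage relate for a classifier that only
# predicts when its confidence clears a threshold. Labels and confidence
# values are made up for illustration.

def precision_and_coverage(predictions, actuals, confidences, threshold=0.5):
    """predictions/actuals are assignment-group labels; confidences in [0, 1]."""
    made = [(p, a) for p, a, c in zip(predictions, actuals, confidences)
            if c >= threshold]          # records the model actually predicts
    coverage = len(made) / len(predictions)
    correct = sum(1 for p, a in made if p == a)
    precision = correct / len(made) if made else 0.0
    return precision, coverage

preds  = ["Deskside", "Network", "Deskside", "Network", "Deskside"]
actual = ["Deskside", "Network", "Network", "Network", "Deskside"]
conf   = [0.9, 0.8, 0.6, 0.4, 0.95]

p, c = precision_and_coverage(preds, actual, conf, threshold=0.5)
# 4 of 5 records clear the threshold (coverage 0.80); 3 of those 4 are
# correct (precision 0.75)
print(f"precision={p:.2f}, coverage={c:.2f}")
```

Note that the two metrics pull against each other: demanding more confidence raises precision but lowers coverage, which is exactly the trade-off the tuning techniques below are navigating.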

Figure 1 represents a good classification model. The blue circles represent our assignment groups; the larger the circle, the greater the distribution. What's distribution? Look at the blue bubble "Deskside Support": this assignment group accounts for 22% of the incidents in the training set. What makes this a good model is that all the blue bubbles (the assignment groups) sit in the upper-right quadrant, which means they have both good precision and good coverage. Note: the x-axis changes to “recall” in Orlando.

[Figure 1: assignment groups plotted by precision and coverage]

In Figure 2 we notice a couple of outliers that are not in the upper-right quadrant. Those are assignment groups that could be excluded from the model: they have very low distribution and are most likely rare cases where a human is required to decide when an incident goes to that assignment group. Our Predictive Intelligence documentation lists the steps for excluding classes from the model.

[Figure 2: a good model with a couple of low-distribution outliers]

Figure 3 below represents a model that needs a lot of work. You’ll notice many of the assignment group bubbles with large distributions are scattered all over the classification graph, with several in the low-precision and low-coverage regions.

[Figure 3: a model needing work, with bubbles scattered across the graph]

So how do you get all those assignment group bubbles to move to the upper right, like our ideal model in Figure 1?

Starting in Orlando, you can set the target precision and coverage for a solution definition. For most customers, setting a target solution-level precision should be the first step when they start using Predictive Intelligence. By setting the target precision, Predictive Intelligence will automatically adjust recall and coverage values to get all classes close to the target precision of 90% (see Figure 4).

[Figure 4: setting a target precision of 90% for a solution definition]
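One way to picture the trade-off behind a target precision (a conceptual sketch only, not ServiceNow's actual algorithm): raise the confidence threshold until precision on an evaluation set reaches the target, and accept whatever coverage is left.

```python
# Sketch: find the lowest confidence threshold (in steps of 0.01) whose
# precision on this evaluation set meets the target. Illustrative only --
# not how Predictive Intelligence is implemented internally.

def threshold_for_target_precision(preds, actuals, confs, target=0.90):
    """Return (threshold, precision, coverage), or None if unreachable."""
    for t in range(101):
        threshold = t / 100
        made = [(p, a) for p, a, c in zip(preds, actuals, confs)
                if c >= threshold]
        if not made:
            break                      # no predictions left to make
        precision = sum(p == a for p, a in made) / len(made)
        if precision >= target:
            coverage = len(made) / len(preds)
            return threshold, precision, coverage
    return None

# Toy evaluation set: high-confidence predictions are mostly correct.
preds  = ["A"] * 10
actual = ["A", "A", "A", "A", "A", "B", "A", "B", "B", "B"]
confs  = [0.95, 0.90, 0.85, 0.80, 0.75, 0.60, 0.55, 0.50, 0.45, 0.40]

print(threshold_for_target_precision(preds, actual, confs, target=0.90))
# -> (0.61, 1.0, 0.5): predicting only above 0.61 confidence hits the
#    target precision, but coverage drops to 50% of records
```

This is why a high target precision often shows up as reduced coverage per class: the two move together.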

What if target metrics aren’t helping? Below, I’ve listed a few approaches that can help you move the model to a more ideal state.

Techniques to improve the classification model precision & coverage:

  • Understand the quality of the data. If there are lots of assignment groups that mean the same thing with different spellings, or if there are empty/null assignment groups, this will hurt the precision of the model. You’ll want to either exclude those from the training set or clean up your data.
  • “More is not always better.” Training on 200k records rather than a smaller selection of 80k records from the past 3 months can actually drive down precision and coverage. If your precision is low, try training with a smaller data set first (e.g., 30k records), then increase the number of records and see whether precision improves. In some cases, 200k records may reduce precision simply because a larger training set leaves more room for bad data.
  • Make sure there is a good distribution of assignment groups. If 90% of the incidents are skewed toward “Deskside Support,” then you don’t need a machine learning model to tell you where new incidents will land. Also, avoid inputs with lots of unique values. For example, configuration item may have millions of records, which can make it harder for the machine learning algorithm to predict accurately if you use CI as an input field.
  • Look at individual class-level settings and see if a higher precision can be selected for that class.
  • If you started with the OOTB classification solution definition and it’s giving low precision/coverage, try analyzing the incident/case data to determine what other inputs may help the ML model make more precise predictions. Running lists/reports with these different input combinations should shed light on whether additional inputs such as category, subcategory, or location might improve the precision/coverage of the model.
  • Simplify your model. Because Predictive Intelligence is easy to configure, many customers add lots of inputs to the solution definition without understanding that “more is not better.” If you have lots of inputs and your model has low precision/coverage, strip out all the inputs and start again with just short description. Update and retrain, then add input #2, and so forth until you reach the desired precision.
  • You might not be able to predict assignment group with just one model, so plan to use multiple models. You can call different solution definitions in multiple passes. For the model in Figure 3, I used four classification models to predict the assignment group.
  • Finally, use Performance Analytics (PA) with Predictive Intelligence. PA has helpful OOTB indicators, such as reassignment count, that can help identify areas where you want to focus Predictive Intelligence.
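The data-quality and distribution checks from the first and third bullets above can be run on an export before you ever train. A minimal sketch, assuming you have pulled the raw assignment-group values out of your incident data (the normalization rules and the 90%/1% cutoffs here are illustrative, not product settings):

```python
# Sketch: pre-training checks -- collapse assignment-group spelling
# variants, drop empty/null labels, and flag distribution skew.
from collections import Counter

def clean_and_profile(records):
    """records: list of raw assignment-group strings (may include None/'')."""
    # Normalize spelling variants: trim, collapse whitespace, lowercase,
    # so "Deskside Support" and " deskside  support " count as one class.
    cleaned = [" ".join(r.split()).lower() for r in records if r and r.strip()]
    counts = Counter(cleaned)
    total = len(cleaned)
    profile = {group: n / total for group, n in counts.most_common()}
    # Flag the skew problems from the bullets: a dominant group means the
    # model adds little value; rare groups are candidates for exclusion.
    dominant = [g for g, share in profile.items() if share >= 0.90]
    rare = [g for g, share in profile.items() if share < 0.01]
    return profile, dominant, rare

raw = ["Deskside Support", " deskside  support", "Network", None, "", "Network"]
profile, dominant, rare = clean_and_profile(raw)
print(profile)   # {'deskside support': 0.5, 'network': 0.5}
```

Running something like this first tells you whether to fix the data before blaming the model.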
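The "start with just short description, then add inputs one at a time" advice above is essentially a greedy selection loop. Here is a sketch of that discipline; the `train_and_score` callback is a hypothetical stand-in for retraining the solution definition and reading back its precision, which in practice you do through the Predictive Intelligence UI, not code:

```python
# Sketch of the "simplify your model" retraining loop: add one candidate
# input per round and keep it only if measured precision improves.

def greedy_input_selection(candidate_inputs, train_and_score,
                           base=("short_description",)):
    """Return (chosen_inputs, best_precision) after one greedy pass."""
    chosen = list(base)
    best = train_and_score(chosen)
    for field in candidate_inputs:
        trial = chosen + [field]
        score = train_and_score(trial)
        if score > best:            # keep the input only if it helps
            chosen, best = trial, score
    return chosen, best

def scorer(fields):
    # Toy stand-in for a retrain: pretend "category" helps precision
    # and "location" hurts it. Real numbers come from your retrained model.
    effects = {"short_description": 0.62, "category": 0.05, "location": -0.02}
    return round(sum(effects.get(f, 0.0) for f in fields), 2)

print(greedy_input_selection(["category", "location"], scorer))
# -> (['short_description', 'category'], 0.67)
```

The point is the discipline, not the code: one change per retrain, and an input only survives if the measured precision says it earned its place.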

If you’d like to learn more about how to interpret and tune similarity and clustering models make sure to check back on the Analytics, Intelligence, and Reporting Community.

Thanks for reading. Stay healthy and safe!

View original source

https://www.servicenow.com/community/intelligence-ml-articles/tuning-predictive-intelligence-models-part-1/ta-p/2301076