Alert CTR Score - Research discussion #13638
-
I'm noticing that the model is overfitting a bit on the alarm values themselves. I'll have to go back to the drawing board a little to "round" or "bin" the alarm values in some sensible way so that the model has less chance to overfit on really specific alarm values. Some logic to try to control for this, for example rounding % values to the nearest 5%, along with more data, will help here. Will iterate a bit more on this as it's a problem I need to resolve.
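For illustration, here's a minimal sketch of the kind of binning logic described above. The field names and unit handling are assumptions for the example, not Netdata's actual alert schema:

```python
import math

# Coarsen raw alarm values before they reach the model, so it cannot
# memorize overly specific values. Units handling here is illustrative.
def bin_alarm_value(value: float, units: str) -> float:
    """Round an alarm value to a coarse bucket to reduce overfitting."""
    if units == "%":
        # Round percentages to the nearest 5%, as suggested above.
        return round(value / 5) * 5
    if value == 0:
        return 0.0
    # For other units, keep roughly one significant figure of precision.
    magnitude = 10 ** math.floor(math.log10(abs(value)))
    return round(value / magnitude) * magnitude

print(bin_alarm_value(87.3, "%"))    # -> 85
print(bin_alarm_value(1234.0, "ms")) # -> 1000.0
```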
-
Cross-linking, as this discussion has graduated into issue netdata/netdata-cloud#760.
-
This is a discussion around some internal research we are doing related to Alert CTR prediction that could end up as a feature in Netdata.
Idea
Build a model that will score each alert based on the probability of a click. This "Alert CTR Score" can then be used to rank, sort, and filter alerts based on which ones the model has learned tend to be more or less likely (than average) to result in a click.
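As an illustration of how such a score could be consumed, here's a hedged sketch assuming a trained scikit-learn-style classifier (`model`) and a DataFrame of pending alerts; all names are hypothetical:

```python
import pandas as pd

def rank_alerts_by_ctr_score(model, alerts: pd.DataFrame,
                             feature_cols: list[str]) -> pd.DataFrame:
    """Score each alert and sort highest CTR score first."""
    scored = alerts.copy()
    # predict_proba[:, 1] is the model's probability of the positive
    # (clicked) class, used here as the Alert CTR Score.
    scored["ctr_score"] = model.predict_proba(scored[feature_cols])[:, 1]
    # Downstream UIs could sort/filter/threshold on this column.
    return scored.sort_values("ctr_score", ascending=False)
```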
Approach
Take all the clicks from alert emails sent by Netdata as our positive examples, and randomly sample a similar number of alert emails that did not result in clicks as negatives. This becomes the training data for a binary classification model. The model can then be used to score new alerts; those with a high score should, on average, be more likely to solicit a click or response from the user than those with a low score.
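The post doesn't say which classifier is used; below is a rough sketch of the described data construction, with a gradient-boosted tree from scikit-learn as one plausible stand-in. The column names and the balanced negative sampling details are assumptions:

```python
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

# clicked:     alert emails that resulted in a click (positives)
# not_clicked: alert emails with no click (pool of negatives)
def build_ctr_model(clicked: pd.DataFrame, not_clicked: pd.DataFrame,
                    feature_cols: list[str]):
    # Randomly sample a similar number of negatives to balance the classes.
    negatives = not_clicked.sample(n=len(clicked), random_state=42)
    data = pd.concat([clicked.assign(label=1), negatives.assign(label=0)])
    X_train, X_test, y_train, y_test = train_test_split(
        data[feature_cols], data["label"], test_size=0.2, random_state=42
    )
    model = GradientBoostingClassifier()
    model.fit(X_train, y_train)
    print("holdout accuracy:", model.score(X_test, y_test))
    return model
```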
User Value
A decent "Alert CTR Score" can then be just another Lego block that users could use in deciding how to filter/sort/respond to alerts.
Obviously, a low Alert CTR Score does not mean the alert "does not matter"; it just means that, on average, users tend to click on such alerts less often than they might on one with a higher score.
Pros
Cons
Initial Research Results
If we train a model on a full month of August data and then use a sample of emails from September, we see a plot like the one below. Here we took a random sample of 250,000 email alerts sent by Netdata in September (never seen or trained on by the model), scored each one with the model, and then sorted all those alerts into 10 deciles. So, for example, decile 9 is the top 10% of scored alerts. In this group, the average alert CTR score ("true prob mean") was 66.81% and the actual alert CTR rate ("true true mean") was 0.768%. This is (0.00768 / 0.002484) = 3.09 times higher (the "uplift factor") than the average across the full sample of alerts (a "no model" benchmark, i.e. what you would get by guessing randomly). The gap between the lowest and highest deciles is even bigger: (0.00768 / 0.0008) = 9.6 times uplift when comparing the lowest-scored alerts to the highest-scored ones. This makes sense, as the model seems to have learned which sorts of alerts very rarely get clicked on and which ones have a much higher likelihood of getting clicked. It's also nice to see that the actual alert CTR rate follows the alert CTR score deciles, as one would hope.
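Here's a sketch of how such a decile/uplift table could be computed, assuming arrays of model scores and observed 0/1 click outcomes for the holdout alerts (illustrative, not the actual analysis code):

```python
import numpy as np
import pandas as pd

def decile_uplift(scores: np.ndarray, clicked: np.ndarray) -> pd.DataFrame:
    """Group holdout alerts into score deciles and compute uplift vs. average."""
    df = pd.DataFrame({"score": scores, "clicked": clicked})
    # Decile 9 = top 10% of scored alerts, matching the convention above.
    # Ranking first breaks ties so qcut always yields 10 equal buckets.
    df["decile"] = pd.qcut(df["score"].rank(method="first"), 10,
                           labels=range(10))
    overall_ctr = df["clicked"].mean()  # the "no model" benchmark
    summary = df.groupby("decile", observed=True).agg(
        pred_prob_mean=("score", "mean"),
        true_ctr=("clicked", "mean"),
    )
    summary["uplift_factor"] = summary["true_ctr"] / overall_ctr
    return summary
```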
If we repeat a similar exercise 50 times on random samples of 50,000 alerts from the holdout data and plot the same lines, the plot below shows the stability of this result.
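And a sketch of that repeated-sampling stability check, reusing the `decile_uplift` helper from the previous sketch (again illustrative, assuming a holdout DataFrame with "score" and "clicked" columns):

```python
import pandas as pd

def stability_curves(holdout: pd.DataFrame, n_runs: int = 50,
                     sample_size: int = 50_000) -> pd.DataFrame:
    """Repeat the decile analysis on random holdout samples."""
    runs = []
    for i in range(n_runs):
        sample = holdout.sample(n=sample_size, random_state=i)
        summary = decile_uplift(sample["score"].to_numpy(),
                                sample["clicked"].to_numpy())
        runs.append(summary["true_ctr"].rename(f"run_{i}"))
    # One column per run; plotting these lines against the decile index
    # shows how stable the decile-level CTR curve is across samples.
    return pd.concat(runs, axis=1)
```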