Strategies for Anomaly/Outlier Detection

This article applies to:

  • Querying
  • Product edition: All
  • Feature Category: Data Ingestion

Overview

The ability to detect anomalies or outliers is one of the key benefits of monitoring. Comparing current behavior against a static threshold is the most basic way to do this. However, the rich set of functions available in the Wavefront Query Language lets you use various dynamic strategies instead. This article highlights a few of them; the See Also section links to additional resources that cover more.

Strategies

At the core of these strategies is comparing current behavior with past behavior. This requires deriving a baseline from past behavior. The strategy, or combination of strategies, that works best depends on the nature of the behavior and on your use case.

Using the Previous Week as a Baseline

Comparing current behavior to last week's behavior is straightforward with the lag() function. Assuming the query that specifies current behavior can be referenced with the query line variable ${current}, an example query is:

${current}/lag(1w, ${current})

This gives the ratio between current behavior and what it was a week ago. If preferred, it is easy to express this as a percentage change instead of a ratio. If it is more suitable, we can just as easily compare against a different point in time, such as the previous day, instead of the previous week. Note that the ratio is undefined at points where last week's value was zero, and that no comparison is possible until a week's worth of history exists.

Using the Average Behavior of the Last X Weeks as a Baseline

This approach is similar to the previous one, but it accounts for several weeks' worth of behavior rather than just the previous week. For example, say we want to establish a baseline from the last 3 weeks' worth of behavior. We would have 3 queries, each specifying the behavior at the same point in one of the previous 3 weeks:

1-week: lag(1w, ${current})
2-week: lag(2w, ${current})
3-week: lag(3w, ${current})

Then, we can find the average behavior from these 3 weeks:

baseline: (${1-week}+${2-week}+${3-week})/3

or

baseline: rawavg(collect(${1-week},${2-week},${3-week}))

Again, we can determine the ratio of the current behavior to this baseline:

${current}/${baseline}

Note: Instead of an average, we could just as easily use another statistic, such as the maximum or a percentile, as the baseline.

Using Standard Deviation

We can use standard deviation to identify when behavior deviates from historical behavior. For example, if we want to see how far current behavior deviates from behavior over the last week, we can use:

(${current} - mavg(1w, ${current})) / sqrt(mvar(1w, ${current}))

This returns the number of standard deviations by which current behavior differs from the mean over the last week (a z-score). Note that the division is undefined at points where the variance over the window is zero, that is, where the series was perfectly flat.
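The three strategies above, worked through in plain Python over a small constructed series. This is an added example and not code from the original article or from Wavefront; it assumes a regularly sampled series with a fixed number of points per week (daily resolution here) and treats "current" as the point at a given index. The sample data is constructed: a fixed weekly pattern repeated for four weeks with one injected spike at the end.

```python
from math import sqrt
from statistics import mean, pvariance

WEEK = 7  # points per week in the constructed daily-resolution sample (assumption)

# Constructed data: the same weekly pattern repeated for four weeks, with an
# injected spike (30 instead of 10) at the final point to act as the outlier.
PATTERN = [10.0, 12.0, 14.0, 16.0, 14.0, 12.0, 10.0]
SAMPLE = PATTERN * 4
SAMPLE[-1] = 30.0


def lag_value(series, i, n):
    """Value n points before index i (the lag(n, ...) idea), or None if the
    series has no history that far back."""
    j = i - n
    return series[j] if j >= 0 else None


def ratio_to_last_week(series, i):
    """${current}/lag(1w, ${current}) at index i. None when there is no value
    a week back or when last week's value is zero (division undefined)."""
    prev = lag_value(series, i, WEEK)
    if prev is None or prev == 0.0:
        return None
    return series[i] / prev


def ratio_to_avg_of_weeks(series, i, weeks=3):
    """${current}/${baseline}, where the baseline averages the same point in
    each of the previous `weeks` weeks. All of them must exist; a missing
    week would otherwise bias the average toward the weeks that are present."""
    prev = [lag_value(series, i, w * WEEK) for w in range(1, weeks + 1)]
    if any(p is None for p in prev):
        return None
    baseline = mean(prev)
    return series[i] / baseline if baseline != 0.0 else None


def zscore(series, i, window=WEEK, include_current=False):
    """Standard deviations between series[i] and a trailing window, i.e.
    (${current} - mavg(...)) / sqrt(mvar(...)). include_current=True scores
    the point against a window that ends at the point itself; False scores it
    against history only. Returns None until a full window exists, or when the
    window is flat (zero variance, so the division is undefined)."""
    end = i + 1 if include_current else i
    if end < window:
        return None
    hist = series[end - window:end]
    sd = sqrt(pvariance(hist))
    if sd == 0.0:
        return None
    return (series[i] - mean(hist)) / sd


def flag_anomalies(series, threshold=3.0, **kwargs):
    """Indexes whose |z-score| exceeds the threshold: the condition an alert
    on the z-score query would fire on."""
    flagged = []
    for i in range(len(series)):
        z = zscore(series, i, **kwargs)
        if z is not None and abs(z) > threshold:
            flagged.append(i)
    return flagged


if __name__ == "__main__":
    last = len(SAMPLE) - 1
    print("ratio vs last week:", ratio_to_last_week(SAMPLE, last))
    print("ratio vs 3-week average:", ratio_to_avg_of_weeks(SAMPLE, last))
    print("z-score, history only:", round(zscore(SAMPLE, last), 2))
    print("z-score, window includes the spike:",
          round(zscore(SAMPLE, last, include_current=True), 2))
    print("flagged indexes:", flag_anomalies(SAMPLE))
```

The last two lines of output show a caveat worth keeping in mind when alerting on the standard-deviation query: when the moving window includes the current point, a large spike inflates both the mean and the variance it is being measured against, so the same spike scores roughly 8.5 standard deviations against history alone but only about 2.3 once it is part of its own window, and an alert threshold of 3 would miss it. Comparing against a window that ends just before the current point (for example, by applying lag() to the moving average and variance) avoids this.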

See Also

Detecting Anomalies with Functions and Statistical Functions

How to Auto-Detect Cloud App Anomalies with Analytics: 10 Smart Alerting Examples (Part 1, Part 2, Part 3, Part 4)

Query Language Recipes
