The ExtraHop platform uses several different metric types. When creating custom metrics, it is important to choose the correct metric type. The following is copied from the AI Trigger API Doc:
- Top-level metrics. Time series of simple datatypes.
  - count: Number (e.g., HTTP requests).
  - snapshot: A special type of count metric that, when queried over time, returns the most recent value (e.g., TCP established connections).
  - dataset: Statistical summary of timing information (5-number summary: min, 25th-percentile, median, 75th-percentile, max).
  - sampleset: Statistical summary of timing information (mean and standard deviation).
  - max: A special type of count metric that preserves the maximum.
- Detail metrics. Time series of datatypes consisting of key-value pairs, where the key is a string or an IP address and the value is a top-level data type. Detail metrics provide drilldown information for top-level metrics.
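To make the key-value idea concrete, here is a minimal sketch in plain Python. It is not ExtraHop trigger code; the URIs, the observation list, and the variable names are all made up for illustration. It shows the same request data stored once as a top-level count and once as a detail count keyed by URI.

```python
from collections import Counter

# Hypothetical observations for one roll-up period: (URI, number of requests).
observations = [
    ("/login", 1),
    ("/search", 1),
    ("/login", 1),
    ("/cart", 1),
    ("/login", 1),
]

# Top-level count metric: a single number for the whole period.
http_requests = sum(n for _, n in observations)

# Detail count metric: the same data broken out by key (here, the URI).
# Each key maps to a value of a top-level type, in this case a count.
http_requests_by_uri = Counter()
for uri, n in observations:
    http_requests_by_uri[uri] += n

print(http_requests)               # 5
print(dict(http_requests_by_uri))  # {'/login': 3, '/search': 1, '/cart': 1}
```

The detail values add up to the top-level value, which is what makes the detail metric a drilldown of the top-level one.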
Below we expand the definitions of the different metric types. Remember that each data point represents a specific roll-up time period: either 30 seconds, 5 minutes, 1 hour, or 24 hours (if you are using the extended datastore) depending on the time range being analyzed.
The following metric types are plain numbers. Use them when a single number is representative of the data (requests, connections, money earned, bytes):
- Count: The sum of values over a data point’s time period. Every value committed adds to the sum for the data point. Remember that rates are controlled in the visualization (charts/widgets), so bytes stored in a count can be shown either as total traffic volume (GB) or as throughput (Gbps), depending on how the visualization is configured.
- Snapshot: The most recent value over a data point’s time period. Use this for point-in-time metrics or metrics that don’t make sense to sum. Examples would be ratios or current connections.
- Max: The largest value over a data point’s time period. We often use the Max data type with metrics tied to SLAs, especially in conjunction with the Single Value widget.
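The difference between the three number types is easiest to see when the same values are rolled up three ways. The sketch below is plain Python that mimics the behavior described above; it is not ExtraHop code, and the committed values and the `bytes_to_gbps` helper are hypothetical.

```python
# Hypothetical values committed during a single 30-second roll-up period,
# listed in the order they arrived.
committed = [120, 340, 95, 410, 230]

count_value = sum(committed)    # Count: every committed value adds to the sum
snapshot_value = committed[-1]  # Snapshot: only the most recent value is kept
max_value = max(committed)      # Max: only the largest value is kept

print(count_value, snapshot_value, max_value)  # 1195 230 410


def bytes_to_gbps(total_bytes, period_seconds):
    """Turn a byte count for one roll-up period into an average rate in Gbps."""
    return total_bytes * 8 / period_seconds / 1e9


# The same stored count can be read as a volume (3.75 GB) or as a rate.
print(bytes_to_gbps(3_750_000_000, 30))  # 1.0
```

The last two lines illustrate the point about rates: the stored count does not change, only how the chart chooses to present it.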
The following metrics are set-type metrics. They are useful for data that you would want to analyze statistically, i.e., to find “normal” behavior (e.g., processing time, RTT, deal size). Bear in mind that a mean is affected by outliers (values that are very large or very small compared to normal), whereas a median is not.
- Dataset: Summary of the set of data using the median and percentiles. Datasets offer finer granularity than samplesets, but they do not calculate the mean. Instead, the data points represent sorted sets of all values for the metric, which can show the median or other percentiles (5th, 25th, 75th, 95th, etc.). Typically, datasets are used for summary metrics, especially across multiple devices, applications, etc., where one outlier can skew the data.
- Sampleset: Summary of the set of data using the mean (average) and standard deviation. Samplesets offer less granularity but can quickly identify “normal” behavior, so they are commonly used as a detail metric, especially to summarize data points about a single thing (e.g., a server, URI, or file). You’ll notice the ExtraHop UI is organized in this manner, as shown below.
Notice how the top-level RTT metric is broken down by quartiles (min, 25%, median, 75%, max). This is indicative of a dataset metric.
On the RTT drilldown, in this case where RTT is broken out by URI, the RTT metrics are broken out by mean (average) and standard deviation.
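To see why the choice between the two set types matters, the sketch below summarizes one set of values both ways using Python's standard `statistics` module. The timing values are invented, and the quartile method used here is Python's, not necessarily the one ExtraHop uses internally.

```python
import statistics

# Hypothetical processing times in milliseconds; the last value is an outlier.
times_ms = [10, 12, 11, 13, 12, 11, 500]

# Sampleset-style summary: mean and standard deviation.
mean = statistics.mean(times_ms)
stdev = statistics.pstdev(times_ms)

# Dataset-style summary: min, quartiles, max (a 5-number summary).
q1, median, q3 = statistics.quantiles(times_ms, n=4, method="inclusive")
five_number = (min(times_ms), q1, median, q3, max(times_ms))

print(f"mean={mean:.1f} stdev={stdev:.1f}")  # the mean is pulled up to about 81.3
print(five_number)                           # the median stays at 12.0
```

A single 500 ms value drags the mean far away from the typical 10 to 13 ms range, while the median and quartiles barely move. The outlier is still visible in the dataset summary, but only as the max.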
By better understanding each metric type, you are better equipped to identify when to use each for your own data. In a later post, we’ll cover detail metrics in more depth, but until then, questions and comments are welcome!