Time series basics

Metrics collected by the monitoring system are stored in a time-series database.
It is important to understand how queries work to ensure that the
displayed data is accurate and relevant.

Concepts

A time series consists of values associated with timestamps.
A time series has a name identifying the kind of data it represents, such as system.disk.free_space.
A time series can also have tags, generally used to identify the resources associated with the data, such as a server name or a disk.
One time series is created for each unique combination of tags. Each one is essentially a table associating timestamps with values.

Querying

Time series name

To display data in a table or a graph, you query the database for the data associated with a particular time series name (also called a metric) within a period of time.
For a particular metric name, there can be multiple underlying time series, one for each combination of tags.
Example: the metric server.disk.free_space, reported for 2 servers with 2 disks each, produces 4 time series:

12:00:00 server.disk.free_space;server=server1;disk=C, 50 
12:00:00 server.disk.free_space;server=server1;disk=D, 80 
12:00:00 server.disk.free_space;server=server2;disk=C, 20
12:00:00 server.disk.free_space;server=server2;disk=D, 10
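
The relationship between one metric name and its underlying time series can be sketched in a few lines of Python. This is only an illustration, not how the database is implemented: the `store` dictionary and the `record` and `series_for` helpers are hypothetical names invented for this example.

```python
from collections import defaultdict

# Hypothetical in-memory store: one entry per unique (metric name, tag set).
# Each entry is a small table mapping a timestamp to a value.
store = defaultdict(dict)


def record(name, tags, timestamp, value):
    """Add one data point to the series identified by its name and tags."""
    key = (name, frozenset(tags.items()))
    store[key][timestamp] = value


def series_for(name):
    """Return every underlying time series that shares the given metric name."""
    return {key: points for key, points in store.items() if key[0] == name}


metric = "server.disk.free_space"
record(metric, {"server": "server1", "disk": "C"}, "12:00:00", 50)
record(metric, {"server": "server1", "disk": "D"}, "12:00:00", 80)
record(metric, {"server": "server2", "disk": "C"}, "12:00:00", 20)
record(metric, {"server": "server2", "disk": "D"}, "12:00:00", 10)

matching = series_for(metric)
print(len(matching))  # 4
```

Querying by metric name alone returns all four series, which is why the query must then say how they are combined.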

Aggregation

When building a query, you must first define the time series name and the time window.
Then you must define the aggregation: this defines how, for each timestamp, the values of the underlying time series are combined into a single value.
Using the above example, this gives:
- Max → 80
- Min → 10
- Avg → 40
- Sum → 160 (not relevant)
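
As a minimal sketch, the four results above can be reproduced by applying each aggregation to the values reported at 12:00:00. The `values` list and the `aggregations` mapping are names chosen for this example only.

```python
# Values of the four underlying time series at timestamp 12:00:00.
values = [50, 80, 20, 10]


def avg(numbers):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(numbers) / len(numbers)


# Each aggregation reduces the values of one timestamp to a single number.
aggregations = {"max": max, "min": min, "avg": avg, "sum": sum}
results = {name: function(values) for name, function in aggregations.items()}

for name, result in results.items():
    print(name, result)
```

The same input yields anything from 10 to 160 depending on the aggregation chosen.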

Grouping

For a given time series name, there can be multiple time series, one for each combination of tags.
It is often necessary to combine those time series.
As with a SQL GROUP BY, you need to select the operation and the tags.
The group-by operation is the one defined during the previous Aggregation step.
Here you define which tags are used as keys to regroup the data.
Using the above example:
- max(none) → returns 1 result: 80
- max(server) → returns 2 results: 80 for server1 and 20 for server2
- max(disk) → returns 2 results: 50 for disk C and 80 for disk D
- max(server, disk) → returns all 4 results. No combining happens when the query groups by every tag, since each combination of tags is already its own time series.
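
The grouping results above can be checked with a short sketch. The `group_by` function below is a hypothetical helper written for this example, not part of any query language; it groups the values by the selected tags and applies the aggregation to each group.

```python
from collections import defaultdict

# The four data points of the example, as (tags, value) pairs.
points = [
    ({"server": "server1", "disk": "C"}, 50),
    ({"server": "server1", "disk": "D"}, 80),
    ({"server": "server2", "disk": "C"}, 20),
    ({"server": "server2", "disk": "D"}, 10),
]


def group_by(points, tags, aggregate):
    """Group values by the given tag names, then aggregate each group."""
    groups = defaultdict(list)
    for point_tags, value in points:
        # The key is the tuple of values of the selected tags.
        # With no tags selected, every point falls into the same group ().
        key = tuple(point_tags[tag] for tag in tags)
        groups[key].append(value)
    return {key: aggregate(values) for key, values in groups.items()}


print(group_by(points, [], max))                  # {(): 80}
print(group_by(points, ["server"], max))          # {('server1',): 80, ('server2',): 20}
print(group_by(points, ["disk"], max))            # {('C',): 50, ('D',): 80}
print(len(group_by(points, ["server", "disk"], max)))  # 4
```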

Roll-up

So far, you have defined how to aggregate the values vertically, across time series, for each timestamp.
You also need to define how the data points are aggregated horizontally, across time. This is called the roll-up.
In a graph, the roll-up defines how to aggregate the values that fall within a given time window.
Example:
- A bar graph shows the number of errors per day, and the number of errors is collected every 15 minutes, which gives 96 data points per day.
- The roll-up defines how to compute the single value that best represents those 96 data points in one bar: Min / Max / Sum / Avg / Last.
- Any data point on a graph usually has to represent multiple, more granular data points.
- The roll-up defines the best way to reflect the information you want to get.
In a table or a gauge, all the data points of the combined series must still be reduced to a single value; the roll-up is used for that.
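
A roll-up can be sketched as bucketing data points by time window and reducing each bucket to one value. The error counts below are made-up sample data (a repeating 0, 1, 2, 3 pattern), and `roll_up`, `avg`, and `last` are hypothetical helpers written for this example.

```python
from collections import defaultdict

INTERVAL = 15 * 60      # collection interval in seconds (15 minutes)
DAY = 24 * 60 * 60      # roll-up window in seconds (1 day)

# Made-up data: two days of error counts, 96 data points per day.
samples = [(i * INTERVAL, i % 4) for i in range(2 * 96)]


def roll_up(samples, window, method):
    """Reduce all data points falling in the same time window to one value."""
    buckets = defaultdict(list)
    for timestamp, value in samples:
        buckets[timestamp // window].append(value)
    # Return one value per window, in chronological order.
    return [method(values) for _, values in sorted(buckets.items())]


def avg(values):
    return sum(values) / len(values)


def last(values):
    return values[-1]


print(roll_up(samples, DAY, sum))   # [144, 144]
print(roll_up(samples, DAY, max))   # [3, 3]
print(roll_up(samples, DAY, avg))   # [1.5, 1.5]
print(roll_up(samples, DAY, last))  # [3, 3]
```

For a "number of errors per day" bar, only the Sum roll-up gives the daily total; Max, Avg, and Last would each show a value that describes a single 15-minute interval instead.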

Summary

As you can see, the choice of aggregation and roll-up is VERY important.
The displayed information can be very misleading if they are not correctly defined.
