Getting started
Architecture overview
Gnocchi consists of several services: an HTTP REST API (see REST API Usage), an optional statsd-compatible daemon (see Statsd Daemon Usage), and an asynchronous processing daemon (named gnocchi-metricd). Data is received via the HTTP REST API or the statsd daemon. gnocchi-metricd then processes the received data in the background: it computes statistics, cleans up metrics, and so on.
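The split between receiving data and processing it can be pictured with a toy model. This is only a sketch under stated assumptions: `IncomingStorage`, `api_post_measures` and `metricd_run_once` are hypothetical names invented for illustration, not Gnocchi's internal API, and the "statistics" are reduced to a count and a sum.

```python
from collections import defaultdict


class IncomingStorage:
    """Toy stand-in for the incoming measure storage (hypothetical class)."""

    def __init__(self):
        self._pending = defaultdict(list)

    def add(self, metric_id, measures):
        self._pending[metric_id].extend(measures)

    def drain(self):
        # Hand over everything received so far and start with an empty backlog.
        pending, self._pending = self._pending, defaultdict(list)
        return dict(pending)


def api_post_measures(incoming, metric_id, measures):
    """Receiving side (API or statsd): record raw measures and return."""
    incoming.add(metric_id, measures)


def metricd_run_once(incoming, aggregated):
    """Background side: turn pending measures into running statistics."""
    for metric_id, measures in incoming.drain().items():
        values = [value for _timestamp, value in measures]
        stats = aggregated.setdefault(metric_id, {"count": 0, "sum": 0.0})
        stats["count"] += len(values)
        stats["sum"] += sum(values)


incoming, aggregated = IncomingStorage(), {}
api_post_measures(incoming, "cpu", [(0, 1.0), (60, 3.0)])
pending_only = dict(aggregated)  # still empty: nothing has been processed yet
metricd_run_once(incoming, aggregated)
```

In this model the receiving side never computes anything itself, which is why adding more workers on the processing side is enough to absorb a growing backlog.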
All of these services are stateless and therefore horizontally scalable. Unlike many time series databases, Gnocchi places no limit on the number of gnocchi-metricd daemons or gnocchi-api endpoints that you can run. If your load starts to increase, you just need to spawn more daemons to handle the flow of new requests. The same applies if you want to handle high-availability scenarios: just start more Gnocchi daemons on independent servers.
As the architecture diagram above shows, Gnocchi needs three external components to work correctly:
An incoming measure storage
An aggregated metric storage
An index
Those three parts are provided by drivers. Gnocchi is entirely pluggable and offers different options for those services.
Incoming and storage drivers
Gnocchi can leverage different storage systems for its incoming measures and aggregated metrics, such as:
File (default)
Ceph (preferred)
Depending on the size of your architecture, using the file driver and storing your data on a disk might be enough. If you need to scale the number of servers with the file driver, you can export and share the data via NFS among all Gnocchi processes. For larger deployments, the S3, Ceph, and Swift drivers are more scalable storage options. Ceph also offers better consistency, and hence is the recommended driver.
A typical recommendation for medium- to large-scale deployments is to use Redis as the incoming measure storage and Ceph as the aggregate storage.
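As a rough illustration of that recommendation, the following sketch renders a configuration fragment with Redis for incoming measures and Ceph for aggregates. The section and option names (`[incoming]`, `[storage]`, `driver`, `redis_url`) are assumptions based on Gnocchi's usual configuration layout, the Redis URL is a placeholder, and `render_storage_conf` is a hypothetical helper; check the sample configuration shipped with your Gnocchi version before relying on any of them.

```python
import configparser
import io


def render_storage_conf(incoming_driver="redis",
                        storage_driver="ceph",
                        redis_url="redis://localhost:6379"):
    """Return an INI fragment selecting the incoming and storage drivers.

    Option names are assumptions; verify them against your installation.
    """
    conf = configparser.ConfigParser(interpolation=None)
    conf["incoming"] = {"driver": incoming_driver, "redis_url": redis_url}
    conf["storage"] = {"driver": storage_driver}
    buf = io.StringIO()
    conf.write(buf)
    return buf.getvalue()


fragment = render_storage_conf()
print(fragment)
```

A real Ceph deployment would need further driver-specific options (pool, credentials, and so on), which are deliberately left out here.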
Indexer driver
You also need a database to index the resources and metrics that Gnocchi will handle. The supported drivers are:
PostgreSQL (preferred)
MySQL (at least version 5.6.4)
The indexer is responsible for storing the index of all resources, archive policies and metrics, along with their definitions, types and properties. The indexer is also responsible for linking resources with metrics and for tracking the relationships between resources.
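To make the indexer's role more concrete, here is a heavily simplified in-memory model of what it keeps track of. All class and field names are hypothetical and chosen for illustration; the real indexer stores this information in SQL tables whose schema is not shown here.

```python
from dataclasses import dataclass, field


@dataclass
class ArchivePolicy:
    name: str
    aggregation_methods: tuple


@dataclass
class Metric:
    id: str
    archive_policy_name: str


@dataclass
class Resource:
    id: str
    type: str
    attributes: dict = field(default_factory=dict)
    metrics: dict = field(default_factory=dict)  # metric name -> metric id


class Index:
    """Toy index: definitions plus the links between resources and metrics."""

    def __init__(self):
        self.archive_policies = {}
        self.metrics = {}
        self.resources = {}

    def attach_metric(self, resource_id, name, metric):
        # Refuse metrics that point at an archive policy the index does not know.
        if metric.archive_policy_name not in self.archive_policies:
            raise KeyError(metric.archive_policy_name)
        self.metrics[metric.id] = metric
        self.resources[resource_id].metrics[name] = metric.id

    def metrics_of(self, resource_id):
        resource = self.resources[resource_id]
        return {name: self.metrics[mid] for name, mid in resource.metrics.items()}


index = Index()
index.archive_policies["low"] = ArchivePolicy("low", ("mean", "max"))
index.resources["vm-1"] = Resource("vm-1", "instance", {"host": "compute1"})
index.attach_metric("vm-1", "cpu_util", Metric("m-1", "low"))
```

Note that the index holds only definitions and links; the measures and aggregates themselves live in the incoming and storage drivers.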
Understanding aggregation
The way data points are aggregated is configurable on a per-metric basis, using an archive policy.
An archive policy defines which aggregations to compute and how many aggregates to keep. Gnocchi supports a variety of aggregation methods, such as minimum, maximum, average, Nth percentile, standard deviation, etc. Those aggregations are computed over a period of time (called granularity) and are kept for a defined timespan.
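The interplay of granularity and timespan can be sketched in a few lines. This is a simplified illustration, not Gnocchi's implementation: `aggregate` is a hypothetical function, timestamps are plain seconds, and retention is approximated by keeping only the most recent `timespan // granularity` non-empty periods.

```python
from collections import defaultdict
from statistics import mean


def aggregate(measures, granularity, timespan, methods):
    """Group (timestamp, value) pairs into periods and aggregate each period.

    granularity and timespan are in seconds; methods maps a name to a
    function that reduces a list of values to a single number.
    """
    if timespan < granularity:
        raise ValueError("timespan must cover at least one granularity period")
    buckets = defaultdict(list)
    for timestamp, value in measures:
        period_start = timestamp - (timestamp % granularity)
        buckets[period_start].append(value)
    max_points = timespan // granularity
    kept = sorted(buckets)[-max_points:]  # drop periods older than the timespan
    return {
        name: [(start, func(buckets[start])) for start in kept]
        for name, func in methods.items()
    }


measures = [(0, 1.0), (30, 3.0), (60, 5.0), (90, 7.0), (120, 9.0)]
result = aggregate(measures, granularity=60, timespan=120,
                   methods={"mean": mean, "max": max})
```

With a 60-second granularity and a 120-second timespan, only two aggregated points per method are kept, so the period starting at 0 is discarded once the period starting at 120 exists.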
Gnocchi uses three different back-ends for storing data: one for storing new incoming measures (the incoming driver), one for storing the time series aggregates (the storage driver) and one for indexing the data (the indexer driver). By default, the incoming driver is configured to use the same value as the storage driver.
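The defaulting rule for the incoming driver can be expressed as a small lookup. The sketch below assumes the configuration has already been parsed into nested dictionaries keyed by section name; `effective_drivers` is a hypothetical helper, and the `file` fallback reflects the default storage driver named earlier in this document.

```python
def effective_drivers(conf):
    """Resolve the storage and incoming drivers from a parsed configuration.

    conf is a dict of sections, e.g. {"storage": {"driver": "ceph"}}.
    """
    storage = conf.get("storage", {}).get("driver", "file")
    # The incoming driver falls back to whatever the storage driver is.
    incoming = conf.get("incoming", {}).get("driver", storage)
    return {"incoming": incoming, "storage": storage}


defaults = effective_drivers({})
ceph_only = effective_drivers({"storage": {"driver": "ceph"}})
split = effective_drivers({"storage": {"driver": "ceph"},
                           "incoming": {"driver": "redis"}})
```

Setting only the storage driver therefore changes both back-ends at once, while an explicit incoming driver lets you split them.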