res/res_prometheus.exports.in · eef29d24e1a11e62edf017c695968e6c3e50599c · Voice / asterisk

6 years ago

Add core Prometheus support to Asterisk · c50f29df

Matt Jordan authored 6 years ago

Prometheus is the defacto monitoring tool for containerized applications.
This patch adds native support to Asterisk for serving up Prometheus
compatible metrics, such that a Prometheus server can scrape an Asterisk
instance in the same fashion as it does other HTTP services.

The core module in this patch provides an API that future work can build
on top of. The API manages metrics in one of two ways:
(1) Registered metrics. In this particular case, the API assumes that
    the metric (either allocated on the stack or on the heap) will have
    its value updated by the module registering it at will, and not
    just when Prometheus scrapes Asterisk. When a scrape does occur,
    the metrics are locked so that the current value can be retrieved.
(2) Scrape callbacks. In this case, the API allows consumers to be
    called via a callback function when a Prometheus initiated scrape
    occurs. The consumers of the API are responsible for populating
    the response to Prometheus themselves, typically using stack
    allocated metrics that are then formatted properly into strings
    via this module's convenience functions.

These two mechanisms balance the different ways in which information is
generated within Asterisk: some information is generated in a fashion
that makes it appropriate to update the relevant metrics immediately;
some information is better to defer until a Prometheus server asks for
it.

Note that some care has been taken in how metrics are defined to
minimize the impact on performance. Prometheus's metric definition
and its support for nesting metrics based on labels - which are
effectively key/value pairs - can make storage and managing of metrics
somewhat tricky. While a naive approach, where we allow for any number
of labels and perform a lot of heap allocations to manage the information,
would absolutely have worked, this patch instead opts to try to place
as much information in length limited arrays, stack allocations, and
vectors to minimize the performance impacts of scrapes. The author of
this patch has worked on enough systems that were driven to their knees
by poor monitoring implementations to be a bit cautious.

Additionally, this patch only adds support for gauges and counters.
Additional work to add summaries, histograms, and other Prometheus
metric types may add value in the future. This would be of particular
interest if someone wanted to track SIP response types.

Finally, this patch includes unit tests for the core APIs.

ASTERISK-28403

Change-Id: I891433a272c92fd11c705a2c36d65479a415ec42

c50f29df

History

Add core Prometheus support to Asterisk

Matt Jordan authored 6 years ago

Prometheus is the defacto monitoring tool for containerized applications.
This patch adds native support to Asterisk for serving up Prometheus
compatible metrics, such that a Prometheus server can scrape an Asterisk
instance in the same fashion as it does other HTTP services.

The core module in this patch provides an API that future work can build
on top of. The API manages metrics in one of two ways:
(1) Registered metrics. In this particular case, the API assumes that
    the metric (either allocated on the stack or on the heap) will have
    its value updated by the module registering it at will, and not
    just when Prometheus scrapes Asterisk. When a scrape does occur,
    the metrics are locked so that the current value can be retrieved.
(2) Scrape callbacks. In this case, the API allows consumers to be
    called via a callback function when a Prometheus initiated scrape
    occurs. The consumers of the API are responsible for populating
    the response to Prometheus themselves, typically using stack
    allocated metrics that are then formatted properly into strings
    via this module's convenience functions.

These two mechanisms balance the different ways in which information is
generated within Asterisk: some information is generated in a fashion
that makes it appropriate to update the relevant metrics immediately;
some information is better to defer until a Prometheus server asks for
it.

Note that some care has been taken in how metrics are defined to
minimize the impact on performance. Prometheus's metric definition
and its support for nesting metrics based on labels - which are
effectively key/value pairs - can make storage and managing of metrics
somewhat tricky. While a naive approach, where we allow for any number
of labels and perform a lot of heap allocations to manage the information,
would absolutely have worked, this patch instead opts to try to place
as much information in length limited arrays, stack allocations, and
vectors to minimize the performance impacts of scrapes. The author of
this patch has worked on enough systems that were driven to their knees
by poor monitoring implementations to be a bit cautious.

Additionally, this patch only adds support for gauges and counters.
Additional work to add summaries, histograms, and other Prometheus
metric types may add value in the future. This would be of particular
interest if someone wanted to track SIP response types.

Finally, this patch includes unit tests for the core APIs.

ASTERISK-28403

Change-Id: I891433a272c92fd11c705a2c36d65479a415ec42