Understanding the state of your systems and their underlying infrastructure at all times is paramount for ensuring the stability and reliability of your services. Up-to-date information about the performance and health of your deployments not only helps your team react to issues in real time, it also gives them the security to make changes with confidence and to forecast system failures or performance hiccups before they occur.
Two very popular monitoring applications in the world of cloud computing are AWS CloudWatch, the principal monitoring application in the AWS suite, and Prometheus, a widely used open source monitoring application originally developed at SoundCloud.
When using CloudWatch and Prometheus, we are given a wide range of built-in metrics to choose from. But sometimes we need to monitor more than the standard set of metrics that Prometheus or CloudWatch give us. It is important that our monitoring platforms allow users to define their own metrics, tailored to the system they are working with.
The purpose of this article is to provide an educational comparison of exposing and using custom metrics with these two popular monitoring applications: AWS CloudWatch and Prometheus.
To get started, jump on to the MetricFire free trial, where you can build your own Prometheus custom metrics. Our platform lets you try Prometheus and Grafana directly, and you can build your own custom dashboards. You can also integrate AWS CloudWatch with MetricFire, and monitor your AWS metrics in our platform. This is very helpful for AWS users who are looking for a second platform with greater flexibility and dashboarding options. Check out how to integrate AWS CloudWatch with Grafana on our Hosted Graphite documentation.
Simply put, custom metrics are metrics defined by the application user. They are different from built-in system metrics, and their purpose is to allow users or system administrators to define whatever they want to monitor or track from their systems, even if this data is not natively exposed by that system.
You can create and publish custom metrics to CloudWatch using the AWS Command Line Interface (CLI) tools or the AWS API. These custom metrics can be created to collect all sorts of data, from application performance data that is not exposed by default, to business metrics like purchases made in a sales application.
Custom metrics can be created for any application running on an AWS service, with slightly different processes and requirements depending on the service. For example, compute services like Elastic Beanstalk and Elastic Compute Cloud (EC2) allow the use of CloudWatch Monitoring Scripts, which are essentially Perl scripts that allow you to create and report custom metrics.
These scripts are written in such a way that they define what metrics they would collect, and how they are collected, abstracting them away from the user. Using these scripts is as easy as installing and running them on the compute instances whose data you wish to collect.
CloudWatch Monitoring Scripts are highly reusable, as you can install and run the same scripts on any compute instances you wish to monitor. The metrics collected by these scripts are then graphed in the CloudWatch console, allowing you to see all your custom metrics at a glance, in one location.
One downside of using these scripts is that you have to predefine what metrics you wish to collect, before installing and running them on a compute instance. A detailed example of such scripts can be found in the EC2 docs.
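To make the idea concrete, here is a minimal Python sketch of the kind of work such a script does: it samples disk usage on the instance and shapes the result as a CloudWatch-style metric datum. This is an analogue for illustration only, not the AWS-provided Perl script. The namespace `Custom/System`, the metric name `DiskSpaceUtilization`, and the `Filesystem` dimension are assumptions chosen for this example.

```python
import shutil

NAMESPACE = "Custom/System"  # hypothetical namespace chosen for this sketch


def disk_utilization_percent(path="/"):
    """Return the percentage of disk space in use on the filesystem holding `path`."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def build_datum(path="/"):
    """Shape one sample as a dict in the layout PutMetricData expects for a datum."""
    return {
        "MetricName": "DiskSpaceUtilization",  # hypothetical metric name
        "Dimensions": [{"Name": "Filesystem", "Value": path}],  # hypothetical dimension
        "Value": disk_utilization_percent(path),
        "Unit": "Percent",
    }
```

Note how the metric is fixed in the code before it ever runs on an instance, which is exactly the predefinition trade-off described above.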
Another way of creating custom CloudWatch metrics is through the AWS API, which allows us to create metrics directly from within our application code. Say, for example, we want to track certain user interactions on an e-commerce website (our Key Performance Indicator) that would give us insight into business performance over time. Using the AWS SDK, we could write code that creates custom metrics to track this data, and execute this code in a Lambda function whenever the relevant event occurs. This code might be similar to the snippet below:
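A minimal sketch of such a function, under stated assumptions: the namespace `EcommerceApp`, the `Page` dimension, and the `value` key read from the event are hypothetical names chosen for illustration. The `put_metric_data` call and its `Namespace` and `MetricData` parameters are part of the boto3 CloudWatch client.

```python
import datetime

NAMESPACE = "EcommerceApp"  # hypothetical namespace for this sketch


def build_metric_data(value):
    """Build the MetricData list for a single KeyPerformanceIndex data point."""
    return [
        {
            "MetricName": "KeyPerformanceIndex",
            "Dimensions": [{"Name": "Page", "Value": "checkout"}],  # hypothetical
            "Timestamp": datetime.datetime.now(datetime.timezone.utc),
            "Value": float(value),
            "Unit": "Count",
        }
    ]


def publish_kpi(client, value):
    """Publish one data point through a CloudWatch client (or a test double)."""
    return client.put_metric_data(
        Namespace=NAMESPACE,
        MetricData=build_metric_data(value),
    )


def lambda_handler(event, context):
    """Lambda entry point: publish the value carried by the triggering event."""
    # Imported here so the helpers above can be exercised without AWS access.
    import boto3

    client = boto3.client("cloudwatch")
    # Assumption: the event carries the KPI value under a "value" key.
    publish_kpi(client, event.get("value", 1))
    return {"statusCode": 200}
```

Passing the client into `publish_kpi` keeps the publishing logic easy to test locally with a stand-in object, while `lambda_handler` wires in the real client at runtime.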
In simple terms, the Lambda function above would create a custom metric called KeyPerformanceIndex, which records whatever data we specify in the Value field of the MetricData. Ideally, this data would be fed from our application and would represent the metric value each time the metric is submitted. So, in essence, we could feed into this metric any data whose trend we want to keep an eye on. This function uses boto3, the AWS SDK for Python, to publish metrics to CloudWatch via the put_metric_data method, which wraps the PutMetricData API call.
It is worth noting that the Monitoring Scripts discussed above also use the PutMetricData API call under the hood. Metrics produced by AWS services are standard resolution by default (one-minute granularity), while custom metrics can be published at either standard resolution or high resolution (one-second granularity). Keep in mind that PutMetricData calls for custom metrics are charged, so publishing a high-resolution metric more often can lead to a much higher cost. CloudWatch bills have a reputation for growing beyond expectations; for more information, see CloudWatch pricing.
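The effect of resolution on call volume is easy to work out. The sketch below counts how many PutMetricData calls a single metric would generate over a 30-day month at each granularity, and shows one way to cut that number by buffering data points and sending them in batches. The helper names and the batch size are assumptions for illustration; `StorageResolution` is the datum field that marks a data point as high resolution (1) or standard resolution (60). Actual charges depend on current AWS pricing, which this example does not model.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def calls_per_month(interval_seconds, days=30):
    """Number of publish calls if one call is made every `interval_seconds`."""
    return (SECONDS_PER_DAY * days) // interval_seconds


def high_resolution_datum(name, value):
    """Build a datum flagged as high resolution (one-second granularity)."""
    return {"MetricName": name, "Value": float(value), "StorageResolution": 1}


def batch(data, size):
    """Split buffered data points into lists of at most `size` items per call."""
    return [data[i:i + size] for i in range(0, len(data), size)]


standard = calls_per_month(60)      # one call per minute
high_res = calls_per_month(1)       # one call per second
print(standard, high_res, high_res // standard)
```

Publishing every second means 60 times as many calls as publishing every minute. Buffering 60 one-second data points and sending them together in one request brings the call count back down to the per-minute figure while keeping the one-second granularity.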
Prometheus is a very popular open source monitoring tool built around a simple model: tell it where to find metrics by configuring a series of scrape jobs, with each job specifying a set of targets whose endpoints are to be scraped. Prometheus then scrapes these endpoints for metric data at regular intervals, persists the data locally, and makes it available for querying and graphing.
You can display your metrics in the built-in Prometheus Expression Browser or visualize them in other graphing UIs such as Grafana. For the most part, Prometheus custom metrics are similar to CloudWatch custom metrics in that they give users the flexibility to monitor whatever aspect of their application they wish. However, Prometheus is a little more restrictive about the kinds of metrics it supports, limiting them to four core types: counter, gauge, histogram, and summary.
See more about Prometheus data structures in our article on how to query with PromQL.
So in order for us to create custom metrics, we’d have to use one of the above types in our code and expose it on a dedicated HTTP endpoint (conventionally /metrics), then have Prometheus scrape that endpoint at configurable intervals for the metric data.
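As a rough illustration of that flow, the sketch below defines a counter and a gauge, renders them in the Prometheus text exposition format, and serves the result on `/metrics`. It is a hand-rolled example that uses only the Python standard library so the moving parts stay visible; real applications would normally use one of the official client libraries instead. The metric names, the port, and the `serve` helper are assumptions made for this example.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


class Counter:
    """A value that only ever goes up, such as total purchases."""
    kind = "counter"

    def __init__(self, name, help_text):
        self.name, self.help_text, self.value = name, help_text, 0.0

    def inc(self, amount=1.0):
        if amount < 0:
            raise ValueError("counters cannot decrease")
        self.value += amount


class Gauge:
    """A value that can go up or down, such as active sessions."""
    kind = "gauge"

    def __init__(self, name, help_text):
        self.name, self.help_text, self.value = name, help_text, 0.0

    def set(self, value):
        self.value = float(value)


def render(metrics):
    """Render metrics as HELP, TYPE, and sample lines in the text format."""
    lines = []
    for m in metrics:
        lines.append(f"# HELP {m.name} {m.help_text}")
        lines.append(f"# TYPE {m.name} {m.kind}")
        lines.append(f"{m.name} {m.value}")
    return "\n".join(lines) + "\n"


PURCHASES = Counter("purchases_total", "Total purchases made.")  # hypothetical
SESSIONS = Gauge("active_sessions", "Sessions currently open.")  # hypothetical


class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render([PURCHASES, SESSIONS]).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8000):
    """Block forever, answering scrapes on the given port."""
    HTTPServer(("0.0.0.0", port), MetricsHandler).serve_forever()
```

Application code would call `PURCHASES.inc()` or `SESSIONS.set(n)` as events happen, and a scrape job pointed at the host and port passed to `serve` would collect the current values on each scrape.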
However, the sheer number of integration options Prometheus offers makes up for this slight lack of flexibility. Prometheus achieves this through exporters and other integrations, which allow third-party software to expose metrics for Prometheus to scrape with very little fuss. Because Prometheus is an open source project, these exporters can be written and maintained as part of the Prometheus GitHub organization, or written and hosted outside of it. Check out our article on Prometheus exporters here.
This is powerful in that it allows you to write your own exporter for your system, and use this exporter to expose your metrics to Prometheus. As a result, custom monitoring is a whole lot easier with Prometheus. Also, because Prometheus integrates with a huge variety of systems, the actual API we use in our code will depend on the client library available for the language our system is written in. Here is a list of some of the client libraries in use with Prometheus.
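The core of an exporter is a translation step: at scrape time, it reads the current state of the system being monitored and converts it into Prometheus samples. The sketch below shows that step for an imaginary message queue. The `fetch_queue_stats` function, its data, and the metric names are all hypothetical stand-ins; a real exporter would query the actual system and would usually be built on a Prometheus client library.

```python
def fetch_queue_stats():
    """Stand-in for a call to the monitored system (hypothetical data)."""
    return {
        "orders": {"depth": 12, "consumers": 3},
        "emails": {"depth": 0, "consumers": 1},
    }


def escape_label(value):
    """Escape backslashes, double quotes, and newlines in a label value."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def collect(stats):
    """Translate per-queue stats into text-format samples, grouped by metric."""
    families = [
        ("queue_depth", "Messages waiting in each queue.", "depth"),
        ("queue_consumers", "Consumers attached to each queue.", "consumers"),
    ]
    lines = []
    for name, help_text, key in families:
        lines.append(f"# HELP {name} {help_text}")
        lines.append(f"# TYPE {name} gauge")
        for queue, values in sorted(stats.items()):
            label = escape_label(queue)
            lines.append(f'{name}{{queue="{label}"}} {float(values[key])}')
    return "\n".join(lines) + "\n"


print(collect(fetch_queue_stats()))
```

An HTTP handler would call `collect(fetch_queue_stats())` on every request to its metrics endpoint, so Prometheus always receives values that are fresh as of the scrape. The `queue` label lets one metric name cover every queue, which keeps queries and dashboards simpler than one metric per queue.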
Having looked at both systems, how their custom metrics work, and how to create and publish them, it is clear that they share striking similarities. Custom metrics, as a concept, are meant to help developers and other stakeholders gain valuable insight into their applications: they let you specify and track almost any aspect of a system, and feed that data into a monitoring application where it is processed and displayed in an easy-to-use manner.
Both AWS CloudWatch and Prometheus custom metrics satisfy this requirement sufficiently well. However, they do have a few differences that favor one over the other in some scenarios:
If you don’t want to go through the hassle of downloading and installing Prometheus locally, you can try out the hosted version, which has all the benefits of Prometheus, without having to worry about its maintenance overhead. Feel free to check out MetricFire’s hosted Prometheus offering.
Custom metrics are powerful features, no matter which monitoring platform is used to collect and process them, and despite the advantages and disadvantages of both platforms that we discussed above, they both have massive potential that’s just waiting to be applied.
There is an AWS CloudWatch integration with MetricFire, so one popular method is to send CloudWatch metrics over to MetricFire, where it's easy to do customized monitoring and dashboarding.
Try the MetricFire free trial to start monitoring with Prometheus today. Also, if you want to talk to the MetricFire team directly, book a demo and get us on a video call. We’re always happy to talk about the best monitoring solutions for your company.