Building and deploying highly scalable, distributed applications in the ever-changing landscape of software development is only half the journey. The other half is monitoring your application states and instances while recording accurate metrics.
At times you will want to check how many resources an application consumes, or how many files a particular process has open. Such metrics offer valuable insight into how our tech stack runs and is managed. They help us understand the real performance of what we have designed and, eventually, optimize it.
A wide range of tools already exists for the job, but today we will focus on StatsD. We will learn how to set up a Python StatsD client, how to use it to monitor your Python applications, and eventually how to store the recorded metrics in a database. Let’s get started!
StatsD is a Node.js project that listens for statistics and system performance metrics. Applications push these statistics to it over the network (UDP by default) as they come, which lets us collect many kinds of metric data. A major advantage of using StatsD is its easy integration with other tools such as Grafana, Graphite, and InfluxDB.
But… what do you mean metrics are ‘pushed’ as they come?
Primarily, metric reporting has two execution models. In the pull model, the monitoring system "scrapes" your app at the given HTTP endpoint. In the push model, which is used by StatsD, the application sends the metrics to the monitoring system as they come.
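To make the push model concrete, here is a small standard-library sketch of both ends of the exchange. A local UDP socket stands in for the StatsD server, and the metric name `app.queue_depth` as well as the helper names are made up for illustration. The `name:value|g` text is StatsD's line format for a gauge.

```python
import socket


def push_metric(line, addr):
    """Application side: send one metric line as a UDP datagram and move on."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(line.encode("ascii"), addr)


def receive_one(listener, bufsize=1024):
    """Monitoring side: read a single datagram, as a StatsD-like server would."""
    data, _sender = listener.recvfrom(bufsize)
    return data.decode("ascii")


def demo():
    # Stand-in for the StatsD server: a UDP socket on an OS-chosen free port.
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as listener:
        listener.bind(("127.0.0.1", 0))
        listener.settimeout(2.0)  # do not hang forever if the datagram is lost
        # Hypothetical metric name; "|g" marks the value as a gauge.
        push_metric("app.queue_depth:7|g", listener.getsockname())
        return receive_one(listener)


if __name__ == "__main__":
    print(demo())
```

The application never waits for a reply, which is why pushing metrics this way adds very little overhead to the code being measured.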
1. First up, we need Python 3.6 or above and pip installed on the system.
You can verify your Python installation by running the following command on your Linux, Windows, or macOS system.
$ python --version
If not installed, check out these installation instructions for your system.
2. You will need StatsD (the Python client), Flask, and Flask-StatsD, which collects Flask-related metrics automatically. Along with those, we need virtualenv, a tool for creating isolated Python environments, and Flask-SQLAlchemy, which connects Flask to a sample database through SQLAlchemy.
$ pip install statsd flask Flask-StatsD virtualenv Flask-SQLAlchemy
Pip will automatically install the latest versions of these packages.
We will start by implementing a Basic Timer:
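A minimal sketch of such a timer, written with only the standard library so that the StatsD line format stays visible. It assumes a StatsD server listening on UDP port 8125 of the local machine. The metric name and the `format_timing`, `send_udp`, and `timer` helpers are illustrative and not part of any library.

```python
import socket
import time
from contextlib import contextmanager

# Assumption: StatsD runs locally on its default UDP port.
STATSD_ADDR = ("127.0.0.1", 8125)


def format_timing(name, millis):
    """Build a StatsD timer line: <name>:<milliseconds>|ms"""
    return f"{name}:{millis:.3f}|ms"


def send_udp(line, addr=STATSD_ADDR):
    """Fire-and-forget: one metric per UDP datagram, no reply expected."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(line.encode("ascii"), addr)


@contextmanager
def timer(name, send=send_udp):
    """Time the enclosed block and report the elapsed wall-clock time."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed_ms = (time.perf_counter() - start) * 1000.0
        send(format_timing(name, elapsed_ms))


if __name__ == "__main__":
    with timer("tutorial.sleep"):  # hypothetical metric name
        time.sleep(0.1)
```

With the `statsd` package installed earlier, `statsd.StatsClient("localhost", 8125).timer("tutorial.sleep")` can be used as a context manager or decorator in the same way.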
Similarly, for a Basic Counter:
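A counter can be sketched along the same lines. The version below again uses only the standard library and assumes StatsD on local UDP port 8125. The `Counter` class, `format_count` helper, and metric name are illustrative names, not an existing API.

```python
import socket

# Assumption: StatsD runs locally on its default UDP port.
STATSD_ADDR = ("127.0.0.1", 8125)


def format_count(name, delta=1):
    """Build a StatsD counter line: <name>:<delta>|c"""
    return f"{name}:{delta}|c"


class Counter:
    """Illustrative counter that pushes increments and decrements to StatsD."""

    def __init__(self, name, addr=STATSD_ADDR):
        self.name = name
        self.addr = addr
        self.sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

    def _send(self, delta):
        line = format_count(self.name, delta)
        self.sock.sendto(line.encode("ascii"), self.addr)
        return line  # returned so callers can log or inspect what was sent

    def incr(self, delta=1):
        return self._send(delta)

    def decr(self, delta=1):
        return self._send(-delta)

    def close(self):
        self.sock.close()


if __name__ == "__main__":
    counter = Counter("tutorial.button_clicks")  # hypothetical metric name
    counter.incr()
    counter.incr(5)
    counter.decr()
    counter.close()
```

The `statsd` package exposes the same operations as `StatsClient.incr()` and `StatsClient.decr()`.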
For this tutorial, we’ll build a basic to-do list application in Flask and record metrics for its operations.
The complete tutorial repo can be forked from GitHub.
Step 1: Import dependencies - lines 5-12:
Step 2: Start the Flask app, StatsD client, and DB - lines 14-23:
Step 3: Create a task class and define it in the DB model - lines 26-35:
Step 4: Add a task - lines 42-57:
The code adds a task whose content is received from the form in the POST request. More important here, however, is the metric reporting added around it.
Step 5: Delete a task - lines 60-65:
The above code deletes a task from the DB and increments the basic counter to record the deletion.
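The add and delete operations described above share one pattern: do the database work, then report a metric. The sketch below shows that pattern outside of Flask so it can run anywhere. It is not the tutorial's code: the standard library's `sqlite3` stands in for Flask-SQLAlchemy, `RecordingMetrics` is a hypothetical stand-in for a StatsD client (it mirrors the `incr` and `timing` method names of `statsd.StatsClient` but only stores lines in a list), and all metric names are invented.

```python
import sqlite3
import time


class RecordingMetrics:
    """Hypothetical stand-in for a StatsD client: keeps metric lines in a list."""

    def __init__(self):
        self.lines = []

    def incr(self, name, count=1):
        self.lines.append(f"{name}:{count}|c")

    def timing(self, name, millis):
        self.lines.append(f"{name}:{millis:.3f}|ms")


def create_schema(db):
    db.execute(
        "CREATE TABLE IF NOT EXISTS task ("
        "id INTEGER PRIMARY KEY, content TEXT NOT NULL)"
    )


def add_task(db, metrics, content):
    """Insert a task, time the write, and count the addition."""
    start = time.perf_counter()
    cursor = db.execute("INSERT INTO task (content) VALUES (?)", (content,))
    db.commit()
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    metrics.timing("todo.add.db_write", elapsed_ms)
    metrics.incr("todo.add.count")
    return cursor.lastrowid


def delete_task(db, metrics, task_id):
    """Delete a task and count the deletion only if a row was removed."""
    cursor = db.execute("DELETE FROM task WHERE id = ?", (task_id,))
    db.commit()
    deleted = cursor.rowcount > 0
    if deleted:
        metrics.incr("todo.delete.count")
    return deleted


if __name__ == "__main__":
    db = sqlite3.connect(":memory:")
    create_schema(db)
    metrics = RecordingMetrics()
    task_id = add_task(db, metrics, "write the report")
    delete_task(db, metrics, task_id)
    print("\n".join(metrics.lines))
```

In the Flask application, the same calls would sit inside the route handlers, with a real StatsD client passed in place of the recording stub.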
Recording these metrics with StatsD is enough to get started. In a production-grade environment, however, they should be handled by a service that makes storing and graphing them easy. This is where Graphite comes in.
Graphite is a monitoring tool used to track the performance of websites, applications, other services, and network servers. It helped ignite a new generation of monitoring tools by making it much easier not just to store and retrieve time-series data, but also to share and visualize it.
Graphite essentially performs two operations: it stores numeric time-series data, and it renders graphs of that data on demand.
Graphite is not a collection agent and shouldn’t be treated like one; rather, it offers a simple path for getting your measurements into a time-series DB. To test sending metrics from your server or local machine to an already running Graphite instance, run the following one-line command:
$ echo "foo.bar 1 $(date +%s)" | nc localhost 2003
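The same test can be sketched in Python. Each datapoint in Graphite's plaintext protocol is a line of the form `<path> <value> <timestamp>`, sent over TCP to port 2003. The helper names below are illustrative, and the host and port are assumed to match the command above.

```python
import socket
import time


def graphite_line(path, value, timestamp=None):
    """Format one datapoint as '<path> <value> <timestamp>' plus a newline."""
    if timestamp is None:
        timestamp = int(time.time())
    return f"{path} {value} {int(timestamp)}\n"


def send_to_graphite(line, host="localhost", port=2003, timeout=5.0):
    """Open a TCP connection to Graphite, send the line, and close."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(line.encode("ascii"))


if __name__ == "__main__":
    try:
        send_to_graphite(graphite_line("foo.bar", 1))
    except OSError as exc:
        # No Graphite instance is listening, or the host is unreachable.
        print(f"Could not reach Graphite: {exc}")
```

Unlike the UDP push to StatsD, this uses a TCP connection, so a missing or unreachable Graphite instance surfaces as an error rather than passing silently.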
Once Graphite is installed, simply logging metrics with StatsD lets Graphite pick up all the data logged. Graphite is a big deal, but it still has certain drawbacks that developers would like to see resolved. This is where MetricFire comes in.
But if you would still prefer a self-hosted and self-managed service, with complete control over it all, a straightforward way is to launch Graphite with StatsD and Docker.
It is possible to deploy StatsD in your favorite environment with your preferred architecture and other services/microservices. Just make sure the StatsD server is reachable by all the client-side apps that want to send metrics to the StatsD server - and StatsD won’t complain about it.
Just in: AWS CloudWatch now also supports StatsD metrics, in case you use the AWS cloud to host your infrastructure.
As far as visualization for the metrics we have accumulated is concerned, Grafana is the de facto tool for it.
The Python StatsD client also comes with its own API and can be triggered through HTTP requests if that’s how you wish to integrate it.
About the Authors
Written by Mrinal Wahal. Along with being a writer at Mixster, Mrinal is also a visionary computer scientist in the making who also heads his premier company Oversight. Oversight is primarily targeted towards the enhancement of Research & Innovation.
Edited by Vipul Gupta, a strong generalist, OSS python developer & documentation specialist. Apart from his love for party parrots and being a full-time student, he has been contributing to open-source both online & offline for the past 4 years. He leads several developer communities & works with the outreach team for the non-profit, Sugar Labs.
He runs his own initiative, Mixster which specializes in writing technical content for startups & organizations, just like the amazing folks at MetricFire. Available all over the web as vipulgupta2048 & always looking for opportunities. Let’s connect!