How to monitor Python Applications with Prometheus


Introduction 

Prometheus is becoming a popular tool for monitoring Python applications, even though it was originally designed for single-process, multi-threaded applications rather than multi-process ones.

Prometheus was developed at SoundCloud and was inspired by Google’s Borgmon. In its original environment, Borgmon relies on straightforward service discovery: Borg can easily find all jobs running on a cluster.

Prometheus inherits these assumptions, so it treats each target as a single multi-threaded process. Prometheus’s client libraries also assume that metrics come from various libraries and subsystems, in multiple threads of execution, running in a shared address space.

To get started, sign up for the MetricFire free trial, where you can start using our Prometheus alternative on our platform and try out what you learn from this article.

 

 

Key Takeaways

  1. Prometheus is gaining popularity as a monitoring tool for Python applications, even though it was originally designed for single-process multi-threaded applications, not multi-process applications.
  2. Prometheus was developed at SoundCloud and was inspired by Google's Borgmon. It inherits some assumptions from Borgmon, including straightforward service discovery methods.
  3. Prometheus assumes that a target represents a single multi-threaded process and that metrics come from various libraries and subsystems running in multiple threads within a shared address space.
  4. When Prometheus scrapes a multi-process application, it may receive different values from different workers for the same metric, leading to inconsistency.
  5. Monitoring multi-process applications with Prometheus is challenging, but the four solutions described here are workable ways to use Prometheus as a monitoring tool for both IT resources and Application Performance Monitoring (APM).

 

Problems with integrating Prometheus into Python WSGI applications 

We start to see these assumptions break down when we run a Python app under a WSGI application server. A WSGI server typically spreads requests across several worker processes rather than handling them all in a single process. This results in a multi-process application.

 

When this kind of application exports to Prometheus, each scrape request is answered by whichever worker happens to receive it, and that worker responds only with the values it knows. This means that Prometheus could scrape a counter metric and see 100, then scrape again immediately afterwards and see 200. Because each worker exports its own value, the counter reflects an arbitrary slice of the traffic rather than the whole job.
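To make the problem concrete, here is a minimal simulation using only the Python standard library. The Worker class and the round-robin scrape order are assumptions made for illustration; they stand in for real WSGI worker processes and for whatever routing your application server applies.

```python
import itertools


class Worker:
    """Stands in for one WSGI worker process with its own private counter."""

    def __init__(self, name):
        self.name = name
        self.request_count = 0

    def handle_request(self):
        self.request_count += 1

    def scrape(self):
        # A worker can only report the requests it handled itself.
        return self.request_count


workers = [Worker("worker-1"), Worker("worker-2")]

# 300 requests are spread unevenly across the two workers.
for _ in range(100):
    workers[0].handle_request()
for _ in range(200):
    workers[1].handle_request()

# Successive scrapes land on different workers (round-robin for simplicity).
scrape_targets = itertools.cycle(workers)
observed = [next(scrape_targets).scrape() for _ in range(4)]
total = sum(w.request_count for w in workers)

print(observed)  # [100, 200, 100, 200]
print(total)     # 300
```

No single scrape ever reports the real total of 300. The drop from 200 back to 100 is also harmful in its own right, because Prometheus interprets a decreasing counter as a counter reset.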

 

To handle these issues, we have four solutions listed below.

   

Sum all of the worker nodes

If you attach a unique label to each worker's metrics, you can query all of the workers at once, and effectively query the whole job. For example, if each worker adds a label such as worker_name, you can write a query such as:

  

sum by (instance, http_status) (sum without (worker_name) (rate(request_count[5m])))

  

This aggregates all of the worker nodes for one job at once. The drawback is an explosion in the number of time series you have, since every metric is multiplied by the number of workers.
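As a rough illustration of what the aggregation computes, the sketch below reproduces the "sum without (worker_name)" step in plain Python. The instance name, worker names, and rate values are invented sample data, and sum_without_worker is a hypothetical helper, not part of any Prometheus library.

```python
from collections import defaultdict

# Hypothetical per-second request rates: one series per worker and status code.
per_worker_rate = {
    ("app-1:8000", "worker-1", "200"): 4.0,
    ("app-1:8000", "worker-2", "200"): 6.0,
    ("app-1:8000", "worker-1", "500"): 0.5,
    ("app-1:8000", "worker-2", "500"): 0.25,
}


def sum_without_worker(series):
    """Drop the worker_name label and add up the series that now collide."""
    totals = defaultdict(float)
    for (instance, _worker_name, http_status), value in series.items():
        totals[(instance, http_status)] += value
    return dict(totals)


job_rate = sum_without_worker(per_worker_rate)

print(job_rate)              # {('app-1:8000', '200'): 10.0, ('app-1:8000', '500'): 0.75}
print(len(per_worker_rate))  # 4 stored series
print(len(job_rate))         # 2 series after aggregation
```

The two lengths show the cost of this approach: Prometheus has to store one series per worker for every label combination, even though the query only ever returns the aggregated ones.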

  

Multi-process mode

This method is our favorite here at MetricFire, and we use it to monitor our own application with Prometheus. It relies on the multi-process mode of the Prometheus Python client, which supports multi-process apps running on the Gunicorn application server.
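The idea behind multi-process mode can be sketched with the standard library alone: every worker persists its own samples to a shared directory, and whichever worker receives the scrape adds up all of the files. This is a simplified illustration and not the client's implementation. The real Prometheus Python client keeps per-process memory-mapped files in the directory named by the PROMETHEUS_MULTIPROC_DIR environment variable and aggregates them with its MultiProcessCollector. The JSON files and the record_request and collect functions below are hypothetical.

```python
import json
import tempfile
from collections import Counter
from pathlib import Path


def record_request(shared_dir, worker_id, http_status):
    """Increment this worker's own file; no other worker writes to it."""
    path = Path(shared_dir) / f"counter_{worker_id}.json"
    counts = json.loads(path.read_text()) if path.exists() else {}
    counts[http_status] = counts.get(http_status, 0) + 1
    path.write_text(json.dumps(counts))


def collect(shared_dir):
    """Run by whichever worker receives the scrape: sum every worker's file."""
    totals = Counter()
    for path in Path(shared_dir).glob("counter_*.json"):
        totals.update(json.loads(path.read_text()))
    return dict(totals)


shared_dir = tempfile.mkdtemp()
for _ in range(100):
    record_request(shared_dir, worker_id=1, http_status="200")
for _ in range(200):
    record_request(shared_dir, worker_id=2, http_status="200")

print(collect(shared_dir))  # {'200': 300}
```

Because the aggregation happens at scrape time, it no longer matters which worker answers: every scrape sees the job-wide total.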

  

The Django Prometheus Client

This method designates each worker as a completely separate target. The Django Prometheus client sets things up so that each worker listens for Prometheus’s scrape requests on its own port.
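One way to picture the per-worker port approach is a worker that claims the first free port in a configured range when it starts. The sketch below is an assumption about how such a scheme could work, not the Django Prometheus client's actual code; claim_metrics_port and the 8001-8020 range are hypothetical.

```python
import socket


def claim_metrics_port(port_range, host="127.0.0.1"):
    """Bind the first free port in port_range and return (socket, port)."""
    for port in port_range:
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        try:
            sock.bind((host, port))
        except OSError:
            # Port already taken, for example by a sibling worker.
            sock.close()
            continue
        sock.listen()
        return sock, port
    raise RuntimeError("no free metrics port in the configured range")


# Hypothetical range; each worker would run this once at startup.
port_range = range(8001, 8021)
sock_a, port_a = claim_metrics_port(port_range)  # first worker
sock_b, port_b = claim_metrics_port(port_range)  # second worker

print(port_a != port_b)  # True

sock_a.close()
sock_b.close()
```

Each claimed port then needs its own entry in the Prometheus scrape configuration, so the number of targets grows with the number of workers.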

  

StatsD exporter

This method rejects the concept that Prometheus must scrape our application directly. Instead, send metrics from your app to a locally running StatsD exporter, and set up Prometheus to scrape the exporter instead of the application. This gives you more control over what’s counted by each counter.
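On the application side, this can be as small as sending StatsD counter lines over UDP. The sketch below uses only the standard library. The metric name is invented, the helper functions are hypothetical, and the host and port are assumptions (9125 is the StatsD exporter's usual UDP port, but check your own configuration).

```python
import socket


def format_counter(name, value=1):
    """Render one StatsD counter increment, e.g. 'myapp.request_count:1|c'."""
    return f"{name}:{value}|c"


def send_counter(name, value=1, host="127.0.0.1", port=9125):
    """Fire-and-forget a counter increment to the local exporter over UDP."""
    payload = format_counter(name, value).encode("ascii")
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, (host, port))


print(format_counter("myapp.request_count"))     # myapp.request_count:1|c
print(format_counter("myapp.request_count", 5))  # myapp.request_count:5|c
```

Every worker would call send_counter("myapp.request_count") as it handles a request. Since all workers report to the same exporter, the exporter holds a single job-wide counter, and Prometheus scrapes one consistent value.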

  

Conclusion

Although multi-process applications cannot be natively monitored with Prometheus, these four solutions are great workarounds. They allow us to use Prometheus as the main monitoring tool throughout the corporation, for both IT resources and APM.

For more information about how Prometheus can be used to monitor Python apps, check out our articles on Python Based Exporters, and our series on Developing and Deploying a Python API with Kubernetes.

To try out our Prometheus alternative, check out our free trial. You can use Hosted Graphite directly in our platform, and monitor metrics without any setup. Also, talk to us directly by booking a demo - we’re always happy to talk with you about your company’s monitoring needs.
