Easiest Way to Monitor Loki Performance With Telegraf

Introduction 

Loki is a powerful, scalable log aggregation system designed by Grafana Labs to efficiently collect, store, and query logs. It’s often deployed alongside Prometheus as part of modern observability stacks. Loki’s design emphasizes cost-effective storage by indexing only metadata (labels) rather than full log content, which makes it a great choice for high-volume environments.

But while Loki excels at log ingestion and indexing, many teams overlook the critical task of monitoring Loki itself. Without visibility into Loki’s own performance (ingestion rate, query latency, cache efficiency, and resource utilization), it is difficult to detect bottlenecks or prevent outages before they impact users.

In this article, we'll detail how to use the Telegraf agent to collect, convert, and forward Loki performance metrics to a FREE data source.

Getting Started with the Telegraf Agent

Telegraf is a plugin-driven server agent built by InfluxData that collects and sends metrics and events from databases, systems, processes, devices, and applications. It is written in Go, compiles into a single binary with no external dependencies, and has a minimal memory footprint. Telegraf runs on many operating systems and offers a wide range of input and output plugins for collecting and forwarding system performance metrics.

Installing and configuring Telegraf is easy, but we simplified this process further with our HG-CLI tool. Install the tool on any OS, run it in TUI mode, and enter your Hosted Graphite API key to get Telegraf up and running quickly!

  • Install the HG-CLI tool:
curl -s "https://www.hostedgraphite.com/scripts/hg-cli/installer/" | sudo sh
  • Run it in TUI mode:
hg-cli tui

If you don't already have a Hosted Graphite account, sign up for a free trial here to obtain a Hosted Graphite API key.

Otherwise, you can configure a different Telegraf output plugin to forward metrics to another data source.
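
As a rough illustration of that option, here is a minimal sketch of an alternative output. It assumes you only want to print the collected metrics to stdout in Graphite format for local inspection. The outputs.file plugin and the graphite data format ship with Telegraf; choosing stdout as the destination is just an example, not a recommendation from this guide.

```toml
# Hypothetical alternative output: print collected metrics to stdout
# in Graphite line format instead of sending them to Hosted Graphite.
[[outputs.file]]
  files = ["stdout"]
  data_format = "graphite"
```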

Install and Configure Loki (Linux)

Download Loki version 2.9.4, the release used throughout this example:

sudo wget https://github.com/grafana/loki/releases/download/v2.9.4/loki-linux-amd64.zip -O /usr/local/bin/loki.zip

    Extract the package using the unzip utility, and prepare the binary:

    cd /usr/local/bin

    sudo unzip loki.zip
    sudo mv loki-linux-amd64 loki
    sudo chmod +x loki

    Configure Loki with basic settings that tell it how to ingest, store, and serve logs. Create a new file for this at /etc/loki/loki-config.yaml (create the /etc/loki directory first if it does not exist):

    auth_enabled: false

    server:
      http_listen_port: 3100

    ingester:
      lifecycler:
        ring:
          kvstore:
            store: inmemory
          replication_factor: 1

    schema_config:
      configs:
        - from: 2024-01-01
          store: boltdb-shipper
          object_store: filesystem
          schema: v13
          index:
            prefix: index_
            period: 24h

    storage_config:
      boltdb_shipper:
        active_index_directory: /tmp/loki/index
        cache_location: /tmp/loki/cache
        cache_ttl: 24h
      filesystem:
        directory: /tmp/loki/chunks

    limits_config:
      max_entries_limit_per_query: 5000

    table_manager:
      retention_deletes_enabled: true
      retention_period: 24h

    compactor:
      working_directory: /tmp/loki/compactor

    Save the config file and create the following data directories. These directories match the paths in the config; creating them up front avoids path and permission errors when Loki starts:

    sudo mkdir -p /tmp/loki/index /tmp/loki/cache /tmp/loki/chunks /tmp/loki/compactor
    sudo chown -R root:root /tmp/loki

    Run and Test Loki

    Run Loki manually and inspect the output for any errors:

    sudo /usr/local/bin/loki -config.file=/etc/loki/loki-config.yaml

    In another terminal window, confirm that Loki is healthy:

    curl -s http://localhost:3100/ready

    Confirm that the /metrics endpoint is returning metrics:

    curl -s http://localhost:3100/metrics | head -n 25
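
The raw /metrics output also includes Go runtime and process metrics, so it can help to list only the Loki-specific metric names. The helper below is a small sketch for that; the function name list_loki_metrics is made up for this example, and it assumes Loki's own metrics carry the loki_ prefix.

```shell
# Print the distinct metric names that start with "loki_" from
# Prometheus text-format input on stdin (comment lines are skipped).
list_loki_metrics() {
  grep -v '^#' | grep '^loki_' | sed 's/[{ ].*//' | sort -u
}

# Usage (assumes Loki is listening on localhost:3100, as configured above):
#   curl -s http://localhost:3100/metrics | list_loki_metrics
```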

    Alternatively, you can create a systemd config for Loki, so it always runs in the background on startup. But for this example, you can just leave it running and continue with the next steps in another terminal window.
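
If you prefer the systemd route, a unit file could look roughly like the sketch below. The unit contents and the helper name write_loki_unit are assumptions for illustration, not an official Loki unit file; only the binary and config paths come from the steps above.

```shell
# Write a minimal systemd unit for Loki to the path given as $1.
# The unit contents are an illustrative assumption, not an official unit file.
write_loki_unit() {
  cat > "$1" <<'EOF'
[Unit]
Description=Loki log aggregation system
After=network-online.target
Wants=network-online.target

[Service]
ExecStart=/usr/local/bin/loki -config.file=/etc/loki/loki-config.yaml
Restart=on-failure

[Install]
WantedBy=multi-user.target
EOF
}

# Usage (requires root):
#   write_loki_unit /etc/systemd/system/loki.service
#   systemctl daemon-reload
#   systemctl enable --now loki
```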

    Test Loki by running this ingestion stress loop in your terminal (press Ctrl+C to stop it):

    while true; do
      # Current time in nanoseconds, the timestamp format the push API expects
      ts=$(date +%s%N)
      log="Stress log $(date) $(shuf -i 1-1000000 -n 1)"
      curl -s -XPOST "http://localhost:3100/loki/api/v1/push" -H "Content-Type: application/json" \
        -d '{"streams":[{"stream":{"job":"loki","level":"info"},"values":[["'"$ts"'", "'"$log"'"]]}]}' > /dev/null
      sleep 0.005
    done
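
The nested quoting in the loop above is easy to get wrong, so here is a sketch that builds the same push payload with printf and then reads the entries back. The helper name build_push_payload is hypothetical, and the message is not JSON-escaped, so it must stay free of double quotes and backslashes.

```shell
# Build a Loki push payload containing a single log line.
# $1 = timestamp in nanoseconds, $2 = log message.
# Note: the message is not JSON-escaped, so it must not contain
# double quotes or backslashes.
build_push_payload() {
  printf '{"streams":[{"stream":{"job":"loki","level":"info"},"values":[["%s","%s"]]}]}\n' "$1" "$2"
}

# Usage (assumes Loki on localhost:3100):
#   build_push_payload "$(date +%s%N)" "hello from curl" |
#     curl -s -XPOST "http://localhost:3100/loki/api/v1/push" \
#       -H "Content-Type: application/json" --data-binary @-
#
# Read back the most recent entries for the same stream:
#   curl -s -G "http://localhost:3100/loki/api/v1/query_range" \
#     --data-urlencode 'query={job="loki"}' --data-urlencode 'limit=5'
```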

    Configure Telegraf's Prometheus Input Plugin

    Telegraf has many input plugins that can collect a wide range of data from popular technologies and third-party sources. In this example, Loki publishes its metrics at http://localhost:3100/metrics. These are in Prometheus format and must be converted to Graphite format as they are forwarded to the Hosted Graphite data source (configured in the earlier steps).

    First, open your Telegraf configuration file (generally located at /etc/telegraf/telegraf.conf), and add the following section:

      [[inputs.prometheus]]
        urls = ["http://localhost:3100/metrics"]
        metric_version = 2
        name_prefix = "loki-performance."

      Then save your changes and run the Telegraf daemon with the command below. Running it in the foreground lets you spot any configuration errors in the output:

      telegraf --config /etc/telegraf/telegraf.conf
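
Before leaving the daemon running, you may want a one-off dry run. The sketch below checks that the config file contains the new section and then uses Telegraf's --test flag, which gathers metrics once and prints them to stdout without sending them to any output. The helper name has_prometheus_input is invented for this example.

```shell
# Return success if the given Telegraf config file contains an
# [[inputs.prometheus]] section.
has_prometheus_input() {
  grep -Eq '^[[:space:]]*\[\[inputs\.prometheus\]\]' "$1" 2>/dev/null
}

conf=/etc/telegraf/telegraf.conf
if has_prometheus_input "$conf" && command -v telegraf >/dev/null 2>&1; then
  # Gather once from the prometheus input only and show the first lines.
  telegraf --config "$conf" --input-filter prometheus --test | head -n 20
else
  echo "No [[inputs.prometheus]] section found in $conf (or telegraf is not installed)"
fi
```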

      Telegraf now scrapes the local /metrics endpoint and forwards these metrics to your Hosted Graphite account. You can locate them in the Hosted Graphite Metrics UI (with the *loki-performance* prefix):

      Easiest Way to Monitor Loki Performance With Telegraf - 1


      See the official GitHub repository for additional details and configuration options for the Prometheus input plugin.

      Use MetricFire to Create Custom Dashboards and Alerts

      MetricFire is a monitoring platform that enables you to gather, visualize and analyze metrics and data from servers, databases, networks, processes, devices, and applications. Using MetricFire, you can effortlessly identify problems and optimize resources within your infrastructure. Hosted Graphite by MetricFire removes the burden of self-hosting your monitoring solution, allowing you more time and freedom to work on your most important tasks.

      Once you have signed up for a Hosted Graphite account and used the above steps to configure your server(s) with the Telegraf Agent, metrics will be forwarded, timestamped, and aggregated into the Hosted Graphite backend.

      1. Metrics will be sent and stored in the Graphite format of: metric.name.path <numeric-value> <unix-timestamp>

      2. The dot notation format provides a tree-like data structure, making it efficient to query.

      3. Metrics are stored in your Hosted Graphite account for two years, and you can use them to create custom Alerts and Grafana dashboards.
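
To make the Graphite format from the list above concrete, here is a small sketch that prints one data point in that layout. The helper name graphite_line and the metric path are hypothetical; Telegraf generates the real paths for you.

```shell
# Format one data point as a Graphite plaintext line:
#   metric.name.path <numeric-value> <unix-timestamp>
# $1 = metric path, $2 = value, $3 = Unix timestamp (defaults to now).
graphite_line() {
  printf '%s %s %s\n' "$1" "$2" "${3:-$(date +%s)}"
}

# The metric path below is hypothetical.
graphite_line "loki-performance.example.metric" 42 1700000000
# prints: loki-performance.example.metric 42 1700000000
```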

      Build Custom Dashboards in Hosted Grafana

      In the Hosted Graphite UI, navigate to Dashboards and select + New Dashboard to create a new visualization.

      Then go into Edit mode and use the Query UI to select a Graphite metric path (the default data source will be the Hosted Graphite backend if you are accessing Grafana via your HG account).

      The HG datasource also supports wildcard (*) searching to grab all metrics that match a specified path.

      Now you can apply Graphite functions to these metrics like aliasByNode() to clean up the metric names, and exclude() to omit specified patterns:

      Easiest Way to Monitor Loki Performance With Telegraf - 2


      Grafana has many additional options to apply different visualizations, modify the display, set units of measurement, and some more advanced features like configuring dashboard variables and event annotations.

      This is what a production-level Loki Performance dashboard might look like:

      Easiest Way to Monitor Loki Performance With Telegraf - 3

      Create Graphite Alerts

      In the Hosted Graphite UI, navigate to Alerts => Graphite Alerts to create a new alert. Name the alert, add a query to the alerting metric field, and optionally add a description of what the alert is for.

      Then, select the Alert Criteria tab to set a value threshold and select a notification channel. The default notification channel will be the email you used to sign up for the Hosted Graphite account. However, you can easily configure channels for Slack, PagerDuty, Microsoft Teams, OpsGenie, custom webhooks, and more. This way you can receive a notification any time your metric values are outside of their expected bounds.

      Please see the Hosted Graphite docs for more details on Alerts and Notification Channels.

      Conclusion

      By combining Loki’s native /metrics endpoint with Telegraf’s Prometheus input plugin, you can build a complete Loki monitoring solution without needing a complicated Prometheus setup. This will give you real-time visibility into Loki’s ingestion rate, query latency, cache efficiency, and memory usage. This approach also scales well from development environments to production, and can be easily extended using Hosted Graphite's built-in alerting and visualization tools.

      Sign up for the free trial and begin monitoring your infrastructure today. You can also book a demo and talk to the MetricFire team directly about your monitoring needs.
