Introduction
Monitoring doesn't always need to be complex. In this guide, we'll show you how to turn raw logs into usable metrics using a lightweight open-source setup: no ELK stack and no heavy lifting. We'll use Loki, Python, and Telegraf to convert logs into Graphite metrics you can easily monitor or alert on. This is perfect for system admins, DevOps beginners, or anyone curious about building smarter monitoring pipelines from scratch. If you don't already have a Hosted Graphite account with MetricFire, sign up for a free 14-day trial HERE.
- Loki: A log database from Grafana Labs that's super lightweight compared to Elasticsearch.
- Python: We'll write a small script to parse logs into metrics.
- Telegraf: A metrics agent that will run our script and forward metrics to a Hosted Graphite account.
Follow along with the (Linux) examples below to create Graphite metrics from your system logs. We're aware this setup has several moving parts, so stay tuned for Part 2 of this series, where we detail how to accomplish this with a more minimal setup using Grok.
Install and Configure Loki
Loki is a log-structured database built by Grafana Labs. Think of it like Prometheus for logs, as it indexes labels instead of raw log content, making it fast and efficient. In this setup, we’ll run Loki locally, store logs on disk, and query them over HTTP using a simple Python script. No Promtail, no Elasticsearch, and no cloud buckets needed!
sudo wget https://github.com/grafana/loki/releases/download/v2.9.4/loki-linux-amd64.zip -O /usr/local/bin/loki.zip
cd /usr/local/bin
sudo unzip loki.zip
sudo mv loki-linux-amd64 loki
sudo chmod +x loki
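You can confirm the binary is in place before moving on; the -version flag prints build information:
loki -version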
Create a Loki config file at /etc/loki-config.yaml with basic settings:
auth_enabled: false

server:
  http_listen_port: 3100

ingester:
  lifecycler:
    ring:
      kvstore:
        store: inmemory
      replication_factor: 1

schema_config:
  configs:
    - from: 2024-01-01
      store: boltdb-shipper
      object_store: filesystem
      schema: v13
      index:
        prefix: index_
        period: 24h

storage_config:
  boltdb_shipper:
    active_index_directory: /tmp/loki/index
    cache_location: /tmp/loki/cache
    cache_ttl: 24h
  filesystem:
    directory: /tmp/loki/chunks

limits_config:
  max_entries_limit_per_query: 5000

table_manager:
  retention_deletes_enabled: true
  retention_period: 24h
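Since YAML is indentation-sensitive, a mis-pasted config is a common reason Loki fails to start. As a quick sanity check, here's a minimal sketch (assuming PyYAML is installed via pip install pyyaml) that just confirms the file parses and lists its top-level sections:
#!/usr/bin/env python3
# Sanity-check /etc/loki-config.yaml before starting Loki
# (assumes PyYAML is installed: pip install pyyaml)
import yaml

with open("/etc/loki-config.yaml") as f:
    config = yaml.safe_load(f)

# Expect sections like auth_enabled, server, ingester, schema_config, ...
print(sorted(config.keys()))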
Run Loki manually; it will listen on localhost:3100:
sudo /usr/local/bin/loki -config.file=/etc/loki-config.yaml
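Before wiring anything else up, you can confirm Loki is answering. The snippet below is a minimal sketch using only the Python standard library; /ready and /loki/api/v1/labels are part of Loki's built-in HTTP API, and we assume the default localhost:3100 address from the config above:
#!/usr/bin/env python3
# Quick health check against a local Loki instance
import json
import urllib.request

BASE = "http://localhost:3100"

# /ready returns "ready" once Loki has finished starting up
with urllib.request.urlopen(f"{BASE}/ready") as resp:
    print("readiness:", resp.read().decode().strip())

# /loki/api/v1/labels lists the label names Loki has indexed so far
with urllib.request.urlopen(f"{BASE}/loki/api/v1/labels") as resp:
    print("labels:", json.loads(resp.read()))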
Create a Simple Python Parser
This Python script reads the last 500 lines of /var/log/syslog and counts how many times common system events occur, like successful or failed SSH logins, sudo command usage, and cron job executions. It outputs these counts as Graphite-formatted metrics, which Telegraf can forward to your Hosted Graphite account. This gives you a lightweight way to track key system activity (like login attempts or job schedules) without needing a full logging stack. Just create a new Python file at: /etc/telegraf/parse_loki_metrics.py
#!/usr/bin/env python3
import time
import re
from collections import deque

LOG_PATH = "/var/log/syslog"

# Patterns and counters for the most common log events
patterns = {
    "logs.sshd.success": r"sshd.*Accepted password",
    "logs.sshd.failure": r"sshd.*Failed password",
    "logs.sudo.command": r"sudo: .*COMMAND=",
    "logs.cron.job": r"CRON\[.*\]:"
}

metrics = {key: 0 for key in patterns}

try:
    # Read only the last 500 lines of the log file
    with open(LOG_PATH, "r") as f:
        recent_lines = deque(f, maxlen=500)

    # Tally every line that matches one of the patterns above
    for line in recent_lines:
        for metric, pattern in patterns.items():
            if re.search(pattern, line):
                metrics[metric] += 1

    # Emit one Graphite-formatted line per counter: <path> <value> <timestamp>
    ts = int(time.time())
    for key, val in metrics.items():
        print(f"{key} {val} {ts}")
except Exception as e:
    print(f"logs.script_error 1 {int(time.time())} # error: {e}")
Make the script executable:
sudo chmod +x /etc/telegraf/parse_loki_metrics.py
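It's worth running the script by hand once to confirm the output before handing it to Telegraf. Each line follows the Graphite plaintext format: metric path, value, and epoch timestamp (the counts and timestamp below are illustrative):
sudo /etc/telegraf/parse_loki_metrics.py
logs.sshd.success 2 1715000000
logs.sshd.failure 5 1715000000
logs.sudo.command 11 1715000000
logs.cron.job 7 1715000000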
Configure Telegraf to Run the Script
If you don't already have an instance of Telegraf running on your server, install our HG-CLI tool to quickly and easily get Telegraf up and running:
curl -s "https://www.hostedgraphite.com/scripts/hg-cli/installer/" | sudo sh
Now open your Telegraf configuration file at /etc/telegraf/telegraf.conf and add the following section:
[[inputs.exec]]
  commands = ["/etc/telegraf/parse_loki_metrics.py"]
  timeout = "5s"
  data_format = "graphite"
  name_prefix = "syslog-metrics."
If your syslog can only be accessed with sudo permissions, you may need to update the Telegraf 'commands' line to something like this:
commands = ["/bin/bash -c 'sudo /usr/bin/python3 /etc/telegraf/parse_loki_metrics.py'"]
Once you restart the Telegraf service, the Exec Input Plugin will execute your Python script, read its output, and forward the data to your Hosted Graphite account. It's that easy!
telegraf --config /etc/telegraf/telegraf.conf
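You can also run Telegraf with the --test flag, which gathers each input once and prints the parsed metrics to stdout without forwarding them anywhere, a quick way to confirm the exec plugin can run your script:
telegraf --config /etc/telegraf/telegraf.conf --test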
Visualize Your Metrics
Once Loki and Telegraf are both running on your server, metrics will be forwarded to your Hosted Graphite account and can be found in the Metrics Search UI (under the telegraf.syslog-metrics.* prefix).
See our Dashboard docs to learn how to use these metrics to create visualizations in our Hosted Grafana, where you can chart syslog and Loki performance logs as metrics.
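For example, a hypothetical panel target that charts all of the SSH counters on one graph, using Graphite's aliasByNode function to label each series by its final path segment (the exact node index may differ if your Telegraf template adds a host segment to the path):
aliasByNode(telegraf.syslog-metrics.logs.sshd.*, 4)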
Conclusion
By completing this setup, you've built a powerful pipeline that transforms raw system logs into structured, real-time metrics, all using lightweight, open-source tools. Instead of sifting through endless log lines manually, you can now monitor key activities like SSH logins, cron jobs, and system events directly from your Graphite dashboards. This gives you instant visibility into system health without the complexity (or cost) of a full ELK stack.
In a DevOps role, having log observability isn't just nice to have; it's crucial. Monitoring logs as metrics helps you spot failures faster, catch suspicious activity early, and automate your incident response. It empowers you to move from reactive troubleshooting to proactive system management. And best of all, this approach is lightweight enough to scale from a single server to an entire fleet without breaking your infrastructure budget.
Want to learn more? Reach out to us today and start a conversation. Happy monitoring!