This article is intended for people who have little experience with collectd, Graphite, and Grafana, or with performance monitoring in general. Because few people use an Arch machine for monitoring, this article compiles information from these search terms:
what is Grafana and Graphite
what is Grafana
what is Graphite Linux
Linux collectd docs
Arch Linux collectd and Graphite
get collectd to send metrics
check if collectd is working
where does collectd store metrics
Going through this article should smooth out the process of setting up a system performance monitor on an Arch machine. A few components need to be understood before we get started.
RRDtool: Short for Round Robin Database Tool, it stores metric data in the format of timestamp: metric_value in a database with the file extension .rrd.
collectd: This is a daemon (an always-running process) that collects system performance data and can store it with RRDtool. When RRDtool is used for storage, the default data folder on Arch is typically /var/lib/collectd/rrd.
Graphite: This can receive metrics data from any supported source and store it in its own Graphite database.
Grafana: This can graph the metrics data stored in your Graphite database, or in any other supported database.
Hosted Graphite: This is a hosted version of the open source Graphite that automatically applies Grafana for visualizations. Hosted Graphite is one of the products offered by MetricFire, alongside Hosted Prometheus and Grafana. Hosted Graphite handles nearly all of the setup, so all you need to do is send it metrics with your API key.
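To make the "timestamp plus metric value" idea concrete, here is a minimal sketch of how a single data point looks in Graphite's plaintext format, which is one line of the form "path value timestamp". In the setup described in this article, collectd's write_graphite plugin does this formatting for you; the function name graphite_line and the metric path below are made up purely for illustration.

```python
import time


def graphite_line(path, value, timestamp=None):
    """Format one data point as a Graphite plaintext line: 'path value timestamp\\n'."""
    if timestamp is None:
        # Default to "now", expressed as whole seconds since the Unix epoch.
        timestamp = int(time.time())
    return f"{path} {value} {int(timestamp)}\n"


# Hypothetical metric path and reading, for illustration only.
line = graphite_line("collectd.myhost.cpu-0.cpu-user", 12.5, 1700000000)
print(line, end="")  # collectd.myhost.cpu-0.cpu-user 12.5 1700000000
```

Each dot in the path becomes one level in the metric tree, which is why the metrics sent by collectd can later be browsed level by level.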
Using Hosted Graphite will let us skip the process of setting up Graphite and Grafana. Below is a chart on what Hosted Graphite can do for us:
As you can see, Hosted Graphite does most of the heavy lifting for us, and significantly simplifies the process of setting up a performance monitoring system. The general idea of this chart also applies to anything else you may want to monitor, but the setup process will be slightly different based on your monitoring environment.
First, make an account with Hosted Graphite; it should automatically start you off with a 14-day trial. Then let's install collectd and rrdtool. Use the code below:
sudo pacman -S collectd
sudo pacman -S rrdtool
Next we will get our pre-configured config file from Hosted Graphite. Go to your main overview, then go to add-ons, add-ons, collectd, and download config 5.4. Check the image below for reference.
Then we want to move the file that we just downloaded into the proper directory, which on Arch is /etc. If there is already a config file there, remove it first with sudo rm /etc/collectd.conf (or rename it if you want to keep a backup). Then go into the directory with the downloaded file and move it from the terminal with sudo mv collectd.conf /etc/. This config file should already come preset with your API key in the write_graphite plugin parameters.
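If you want to confirm that the config you just moved actually enables the plugin that sends data out, a rough sketch like the one below can help. It only looks for an uncommented LoadPlugin line of the simple one-line form; the helper name plugin_is_loaded and the sample text are illustrative assumptions, not the contents of the real downloaded file.

```python
import re


def plugin_is_loaded(config_text, plugin):
    """Return True if an uncommented 'LoadPlugin <plugin>' line is present."""
    pattern = re.compile(r'^\s*LoadPlugin\s+"?' + re.escape(plugin) + r'"?\s*$')
    # Lines starting with '#' never match, so commented-out plugins are ignored.
    return any(pattern.match(line) for line in config_text.splitlines())


# Illustrative sample only; the real file from Hosted Graphite will differ.
sample = """
#LoadPlugin rrdtool
LoadPlugin cpu
LoadPlugin write_graphite
"""

print(plugin_is_loaded(sample, "write_graphite"))  # True
print(plugin_is_loaded(sample, "rrdtool"))         # False
```

To run it against the real file, read the text first, for example with open("/etc/collectd.conf").read(), and pass that in place of sample.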
Now let's start and enable collectd. Use this code below:
sudo systemctl enable collectd
sudo systemctl start collectd
To check if collectd is actually working, you can use top; a process called collectd -f should blink on and off.
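Watching top works, but a yes/no answer can be easier to script. The sketch below asks systemd directly using systemctl is-active, which prints "active" for a running unit. The function name unit_is_active and its run parameter (there only so the call can be swapped out for testing) are assumptions of this example.

```python
import subprocess


def unit_is_active(unit, run=subprocess.run):
    """Return True if `systemctl is-active <unit>` reports 'active'."""
    try:
        result = run(
            ["systemctl", "is-active", unit],
            capture_output=True,
            text=True,
            check=False,  # a non-zero exit just means "not active"
        )
    except FileNotFoundError:
        # systemctl is not installed on this machine.
        return False
    return result.stdout.strip() == "active"
```

On the Arch machine you would call unit_is_active("collectd") and expect True once the service has been started.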
If you want to check whether collectd is actually storing data locally, you can use the command rrdtool last FILEPATH, where FILEPATH is one of the .rrd files collectd has written. This prints the Unix timestamp of the most recent update, which looks like a big integer.
We can use the small script below to make the big integer more readable. Check the image above to see the script in action.
# replace the big int with the `rrdtool last` output
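One possible version of such a script is sketched here in Python. It assumes the big integer is a Unix timestamp (seconds since 1970-01-01 UTC); the value 1700000000 and the helper name readable are placeholders for this example.

```python
from datetime import datetime, timezone


def readable(ts, tz=None):
    """Convert a Unix timestamp to 'YYYY-MM-DD HH:MM:SS' (local time if tz is None)."""
    return datetime.fromtimestamp(int(ts), tz=tz).strftime("%Y-%m-%d %H:%M:%S")


# Placeholder value: replace it with the integer printed by `rrdtool last`.
last_update = 1700000000

print(readable(last_update))                   # local time, depends on your timezone
print(readable(last_update, tz=timezone.utc))  # 2023-11-14 22:13:20
```

If the printed time is within the last few seconds or minutes, collectd is still writing to that file.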
By this time, metrics on your system performance should have been sent to Hosted Graphite. Check your metrics by going to your overview and hovering over Metrics, then click on Metrics Treemap. You should see a giant collectd box, and you can click through it to check what it contains.
Now that our metrics are being sent to Hosted Graphite, it can use that data to create graphs. Let’s go to Add Dashboard and then Add Query to add a visualization to our graph panel. Check the image for reference:
You can query by clicking Select Metric, and then giving it only the first word from your Metrics Treemap. If you go one level at a time, it should offer the options available at each step. We can do collectd -> HOSTNAME -> cpu-* -> cpu-user to get how much CPU the user is using (* for all cores). Check the image below for reference.
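To see what the * in cpu-* does, here is a small sketch that mimics the wildcard on a handful of metric names. It matches one dot-separated level at a time, so * never spans across a dot. The hostname myhost, the metric list, and the helper name matches are invented for the example, and the matching uses Python's fnmatch rather than Graphite's own code.

```python
from fnmatch import fnmatchcase


def matches(pattern, metric):
    """Match a dotted metric path against a pattern, one level at a time."""
    pattern_parts = pattern.split(".")
    metric_parts = metric.split(".")
    if len(pattern_parts) != len(metric_parts):
        return False
    return all(fnmatchcase(m, p) for p, m in zip(pattern_parts, metric_parts))


# Hypothetical metric names, shaped like the collectd tree described above.
metrics = [
    "collectd.myhost.cpu-0.cpu-user",
    "collectd.myhost.cpu-1.cpu-user",
    "collectd.myhost.cpu-0.cpu-system",
    "collectd.myhost.memory.memory-used",
]

query = "collectd.myhost.cpu-*.cpu-user"
for name in metrics:
    if matches(query, name):
        print(name)
# collectd.myhost.cpu-0.cpu-user
# collectd.myhost.cpu-1.cpu-user
```

The query picks up cpu-user for every core while leaving out cpu-system and the memory metrics, which is the same selection the graph panel makes.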
Now the graph should populate with some data points, and you have made your first graph.
There are several other options to play around with when creating your graph, but this should get you started quickly and leave you time to explore. To link to the graphs externally, you can use the access keys provided under Access in the left-hand menu; go there and click Sharing. Check the image below for reference.
While system performance monitoring is most useful for web-app servers, it has personal uses too. For example, we could use a personal laptop’s metrics to work out a good time to charge it. Such an app could also integrate with your calendar to see when you will be away from an outlet for a long period of time, so it can remind you to charge your battery beforehand.
Regardless of the OS you plan to monitor and your purpose for monitoring, Hosted Graphite can save a lot of time in setting up that system. All you need to do is send metrics to your Hosted Graphite account, and all the heavy lifting is done there.