Securing Your Monitoring Infrastructure


Oct 08, 2020 ∙ 12 min read

SerHack

Table of Contents

Introduction
Where are the Vulnerabilities
How to Secure These Vulnerabilities
The Zero Trust Architecture
Limit Your Infrastructure’s Exposure
Installing the Firewall
- Installing NginX
- Installing Let’s Encrypt and Obtaining Your First TLS Certificate
Conclusion: Security Requires Constant Attention

Great systems are not just built. They are monitored.

MetricFire runs Graphite and Grafana as a fully managed service for growing engineering teams, taking care of storage, scaling, and version updates so your team doesn't have to. Plans start at $19/month, billed per metric namespace rather than per host, and include engineer-staffed support. Integrations work natively with Heroku, AWS, Azure, and GCP, and data is stored with 3× redundancy in SOC2- and ISO:27001-certified data centres.

Introduction

Your monitoring system provides a comprehensive overview of your infrastructure. To monitor your infrastructure and systems effectively, you need to get all of your data into one place, whether you have 1 node or 10. This centralization inevitably creates a point of vulnerability that attackers can target and exploit.

In this article, we look at how to design your infrastructure securely and how to secure your nodes. In the second half, we focus exclusively on the central node, the point where data collected from the internal nodes is aggregated.

By working with MetricFire, you can hand over the worry of maintaining your monitoring to the experts, and ensure that your monitoring infrastructure is fully secured. Check out our free trial, and book a demo to talk about monitoring security!

Where are the Vulnerabilities

There are a few critical ways that a monitoring system can be exploited.

First, if an attacker gains unauthorized access to the monitoring data, they may be able to manipulate the information in your dashboards in order to hide red flags.

Second, monitoring an infrastructure requires a specific set of skills and if you are not an expert, it is quite easy to make errors that can lead to misconfigurations. As we can see from a previous article, Monitoring your own infrastructure with open-source Graphite and Grafana, configuring Graphite and Grafana is not a simple task.

Potential errors are sometimes difficult to detect due to the complex nature of the configurations, and misconfigurations or other errors can become an open door into your infrastructure for attackers.

How to Secure These Vulnerabilities

Because every infrastructure is different, there is no single prescribed way to handle an attack; there are only practical, well-reasoned methods to put into practice.

This article is not intended to be a complete guide, but it provides insight that can be used to deepen your research within the area of computer security.

Now, we will take a look at common best practices for securing your monitoring systems.

The Zero Trust Architecture

It is considered best practice to implement a zero trust architecture, in which no component, internal ones included, trusts a user or another component by default; every request must be authenticated and authorized (hence, zero trust).

Because no entity is trusted by default and the architecture is compartmentalized, an attacker who compromises one component will have difficulty extending the attack to the rest. While configuring a zero trust architecture is resource intensive, it is considered one of the strongest approaches that a thoughtfully designed IT security plan can include.

Limit Your Infrastructure’s Exposure

One of the most effective ways to reduce your attack surface is to limit your infrastructure’s exposure on the web. Most companies have an intranet (the internal company network) and an extranet (the part of the company network that faces outwards).

In this context, two elements are useful to us: a firewall and a reverse proxy. In the following tutorial sections, we assume that the nodes are configured to communicate only with the internal company network.

The tutorial sections assume a machine running a Debian distribution, with the IP x.x.x.x and the domain example.com.

Installing the Firewall

As previously mentioned, we will focus exclusively on the central node, the point where data collected from the internal nodes is aggregated. In this section, let’s install the firewall on the central node.

On Linux systems, the kernel keeps tables of rules for packet filtering, and iptables is the traditional tool for managing them. Each TCP/IP packet carries header information (such as source and destination addresses, ports, and protocol) that iptables uses to decide whether or not to accept the packet. In very simple terms, this is how a firewall works. Although iptables is the most complete solution, it is often complicated for newcomers to configure.

UFW (Uncomplicated Firewall) is a utility developed in Python that simplifies the configuration of iptables. It is one of the best utilities developed by the Linux community, and some distributions, such as Ubuntu, ship it by default.

To install UFW, we first update the package index and then install the package:

root@node:~# apt update
root@node:~# apt install ufw

If all went well, running ufw version from the shell should print the installed version. UFW works by enforcing rules that allow or block traffic on ports. The basic syntax is as follows:

ufw [option] [port/protocol]

option: allow or deny
port: the port number (an integer between 0 and 65535, according to the TCP/IP standard); it can be combined with a protocol (tcp or udp), as in 22/tcp
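
As a rough illustration of this syntax, the helper below checks the action, the port range, and the protocol, then prints the matching command instead of running it. Note that ufw_rule is a hypothetical function written for this example, not part of UFW, and it assumes the lowercase actions allow and deny that the ufw command line accepts.

```shell
# ufw_rule ACTION PORT [PROTO] -- hypothetical helper, not part of UFW.
# Prints the ufw command for a rule; nothing is applied to the firewall.
ufw_rule() {
  action="$1" port="$2" proto="${3:-}"

  case "$action" in
    allow|deny) ;;
    *) echo "action must be allow or deny" >&2; return 1 ;;
  esac

  # The port must be a whole number of at most five digits, up to 65535.
  case "$port" in
    ''|*[!0-9]*|??????*) echo "port must be a number between 0 and 65535" >&2; return 1 ;;
  esac
  if [ "$port" -gt 65535 ]; then
    echo "port must be a number between 0 and 65535" >&2
    return 1
  fi

  case "$proto" in
    '')      echo "ufw $action $port" ;;           # rule covers TCP and UDP
    tcp|udp) echo "ufw $action $port/$proto" ;;    # rule covers one protocol
    *) echo "protocol must be tcp or udp" >&2; return 1 ;;
  esac
}

ufw_rule allow 22 tcp    # prints: ufw allow 22/tcp
ufw_rule deny 8125 udp   # prints: ufw deny 8125/udp
```

Printing the command first makes it easy to review a rule before pasting it into a root shell.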

Our approach will now be to first prohibit all incoming and outgoing connections and then open only those of interest.

ufw default deny incoming
ufw default deny outgoing

Configured this way, the node rejects any incoming connection that has not been explicitly allowed. Even if an attacker gets in through a program listening on a whitelisted port, it becomes much harder for them to open a reverse shell, because connections from the node to the outside are blocked by default. It is important to note that before enabling the firewall, it is vital to whitelist SSH, or whichever protocol you use to reach the machine (RDP or similar); otherwise you will lock yourself out.

ufw allow ssh
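
One consequence of denying outgoing traffic by default is that the node itself can no longer resolve DNS names, download packages, or contact Let’s Encrypt until those connections are allowed explicitly. The sketch below assumes the standard ports for DNS (53), HTTP (80), and HTTPS (443). The run wrapper and the APPLY variable are conventions invented for this example, so the rules can be reviewed before anything changes.

```shell
# run: print the command by default; execute it only when APPLY=1.
# (Illustrative wrapper so the rules can be reviewed before applying them.)
run() {
  if [ "${APPLY:-0}" = "1" ]; then
    "$@"
  else
    echo "$*"
  fi
}

allow_essential_outgoing() {
  run ufw allow out 53        # DNS lookups (TCP and UDP)
  run ufw allow out 80/tcp    # HTTP, e.g. Debian package mirrors
  run ufw allow out 443/tcp   # HTTPS, e.g. the Let's Encrypt API
}

allow_essential_outgoing
```

Run as-is, the script only prints the three rules. Running it as root with APPLY=1 set in the environment applies them.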

So, now we are ready to enable the firewall:

ufw enable

For the moment, we do not enable any additional rules. To check the status of our firewall (and which ports it blocks), we launch the command:

ufw status

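Next, we open the port on which Graphite receives metrics, so that the internal nodes can keep sending data. Grafana will be reached through the reverse proxy that we install in the next section: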

ufw allow [port graphite]/tcp
ufw allow [port graphite]/udp
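
For illustration, suppose Graphite’s Carbon daemon listens on its default plaintext port, 2003, and the internal nodes live in 10.0.0.0/24. Both values are assumptions; substitute your own. Rather than opening the port to everyone, UFW’s extended syntax can restrict it to that subnet. The sketch prints the commands so they can be checked first.

```shell
# Assumed values -- adjust to your environment.
INTERNAL_NET="${INTERNAL_NET:-10.0.0.0/24}"   # hypothetical internal subnet
GRAPHITE_PORT="${GRAPHITE_PORT:-2003}"        # Carbon's default plaintext port

# Print one rule per protocol, limited to traffic from the internal subnet.
graphite_rules() {
  for proto in tcp udp; do
    echo "ufw allow from $INTERNAL_NET to any port $GRAPHITE_PORT proto $proto"
  done
}

graphite_rules
```

Once the output looks right, the same lines can be run in a root shell. Limiting the source addresses keeps the metrics port closed to everything outside the internal network.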

Installing NginX

A reverse proxy is a server that receives requests from external clients and forwards them to an internal server, so the internal server is never exposed directly. It adds another layer to the architecture and can be combined with a load balancer to handle many requests without overloading the internal server.

In our case, the internal server is Grafana, while the reverse proxy is NginX. NginX is a well known web server (together with Apache), written in C and originally developed by Igor Sysoev. NetCraft, a popular internet services company, has estimated that approximately 36% of the web is served through NginX.

NginX can act as an HTTP server, but also as a load balancer and a reverse proxy. Unlike Apache, which typically dedicates a process or thread to each connection, NginX handles requests with an asynchronous, event-based approach. As a result, NginX generally consumes fewer resources than Apache under load, making it a logical choice in this context.

As a first step, we install NginX through package management:

root@node:~# apt update && apt install nginx

To verify the success of the installation, we open ports 80 (HTTP) and 443 (HTTPS) on the UFW firewall:

root@node:~# ufw allow 'Nginx Full'

Now, let’s check http://example.com. If you see the “Welcome to nginx!” page, the installation went well. Next, we set up the reverse proxy. On Debian, NginX site configurations live in /etc/nginx/sites-available and are activated by linking them into /etc/nginx/sites-enabled. Copy the default configuration file to a new file for the domain and enable it.

NginX has a few simple rules for writing the configuration. Each simple directive must end with a semicolon (;). If a line starts with #, then it is a comment. Related directives are grouped inside curly braces in a section called a “block”, depending on what they configure (for example, the “http” block holds directives that affect how the server handles HTTP traffic).

root@node:~# cp /etc/nginx/sites-available/default /etc/nginx/sites-available/example.com
root@node:~# ln -s /etc/nginx/sites-available/example.com /etc/nginx/sites-enabled/example.com
root@node:~# nano /etc/nginx/sites-available/example.com

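From here, we replace the contents of the file with the following server block:

server {
  listen       80;
  server_name  example.com;

  location / {
    # Forward every request to Grafana, listening locally on port 3000
    proxy_pass         http://127.0.0.1:3000;
    proxy_redirect     off;
    proxy_http_version 1.1;
    proxy_set_header   Host             $host;
    proxy_set_header   X-Real-IP        $remote_addr;
    proxy_set_header   X-Forwarded-For  $proxy_add_x_forwarded_for;
    # Let WebSocket connections pass through the proxy
    proxy_set_header   Upgrade          $http_upgrade;
    proxy_set_header   Connection       "upgrade";
  }
}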

The proxy_pass directive is important, as it specifies the internal address to which NginX forwards requests. The proxy_set_header directives pass information about the original request on to the internal server: the Host header the client asked for and the client’s IP address, which Grafana would otherwise see as 127.0.0.1 for every request. The Upgrade and Connection headers allow WebSocket connections to pass through the proxy.

To verify that the configuration is syntactically correct, we run NginX with the -t flag:

root@node:~# nginx -t

If there is no error, let’s restart NginX (making sure that the graph-web service is active):

root@node:~# systemctl restart nginx

Let’s access http://example.com. After doing so, the Grafana user interface should appear.
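
Since Grafana should now be reachable only through NginX, it may be worth confirming that nothing is listening on port 3000 on a public address. The filter below is a minimal sketch: exposed_listeners is a hypothetical helper that reads the output of ss -tln and prints any listener on the given port that is not bound to loopback. It assumes the usual ss column layout, with the local address in the fourth column.

```shell
# exposed_listeners PORT -- hypothetical helper.
# Reads `ss -tln` output on stdin; prints local addresses listening on PORT
# that are not bound to loopback (127.0.0.0/8 or [::1]).
exposed_listeners() {
  awk -v suffix=":$1" '
    $1 == "LISTEN" &&
    substr($4, length($4) - length(suffix) + 1) == suffix &&
    $4 !~ /^(127\.|\[::1\])/ { print $4 }
  '
}

# Usage on the node:  ss -tln | exposed_listeners 3000
# No output means the port is bound to loopback only (or not in use).
```

If the check does print an address, the firewall still blocks outside access to that port. Binding Grafana to 127.0.0.1 as well gives a second layer of protection.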

Installing Let’s Encrypt and Obtaining Your First TLS Certificate

HTTP is one of the major protocols used for communication in the client-server model, in which the client requests a service from the server. With plain HTTP, the exchange of information between the client and the server is carried out in clear text; that is, the information is not encrypted.

As a result, fundamental properties of computer security are missing, notably the confidentiality and the authenticity of the message. With this protocol, there is no way for the client to verify whether the server’s response is actually the original response. To resolve this, a new type of protocol was introduced: TLS (Transport Layer Security, the successor to SSL).

The TLS protocol allows client-server applications to communicate over a network in a way designed to prevent data tampering, forgery, and interception. Attacks such as “Man in the Middle”, where a malicious node sits between client and server and reads or modifies the packets at will, become far harder to carry out thanks to TLS.

HTTPS, an extension of HTTP, is an application layer protocol that encrypts HTTP traffic using TLS. To enable HTTPS for your domain, you need to obtain a certificate that lets clients verify the identity of your server. Let’s Encrypt is a non-profit certificate authority that allows you to obtain free certificates for your infrastructure.

Now, let’s install CertBot ― a utility that enables you to get a certificate in a simple and fast way without being an encryption expert:

sudo apt-get install certbot python3-certbot-nginx

Let’s generate the new certificate and install it!

sudo certbot --nginx

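The certificates obtained through CertBot are valid for a limited time (90 days), after which they must be renewed with certbot renew. The following command simulates a renewal without replacing any certificate, which is a convenient way to confirm that renewal will work: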

sudo certbot renew --dry-run
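
To see whether a renewal is actually due, the certificate’s expiry can be checked with openssl. The sketch below assumes CertBot’s usual layout, /etc/letsencrypt/live/example.com/fullchain.pem, and a 30-day threshold. The name cert_status is a hypothetical helper chosen for this example.

```shell
# cert_status FILE [DAYS] -- hypothetical helper.
# Exit status: 0 = still valid DAYS days from now,
#              1 = expires sooner, has expired, or cannot be parsed,
#              2 = the file cannot be read.
cert_status() {
  cert="$1" days="${2:-30}"
  [ -r "$cert" ] || { echo "cannot read $cert" >&2; return 2; }
  # -checkend N succeeds only if the certificate is still valid N seconds from now.
  if openssl x509 -checkend "$((days * 86400))" -noout -in "$cert" >/dev/null; then
    return 0
  fi
  return 1
}

# Usage (path assumes CertBot's default layout, readable by root):
#   cert_status /etc/letsencrypt/live/example.com/fullchain.pem 30 \
#     || echo "renewal due (or certificate unreadable)"
```

A check like this fits well in a scheduled job, so an expiring certificate is noticed before browsers start rejecting it.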

Conclusion: Security Requires Constant Attention

Maintaining a security system is as important as designing and installing it. These are complex systems made up of different technologies, all of which must be fully operational to do their job. A single bug has the potential to be catastrophic for our monitoring infrastructure and, in turn, could have unintended consequences for an entire company.

As we have seen with Let’s Encrypt, and with the application of the “critical” patches in the previous article, data security requires constant maintenance. It is important to remember that thorough maintenance depends on best-in-class resources, which for companies means substantial time and money. A company that decides to subcontract these responsibilities must select the third party diligently, or it may end up underwhelmed by the services provided.

If you have time constraints and want to work efficiently and effectively, MetricFire allows you to sleep soundly thanks to its 24/7 team that constantly monitors the security of your monitoring infrastructure and applies continuous updates to keep up with the best-in-class standards. Check out MetricFire’s demo or free trial for more!