{"id":540,"date":"2022-03-26T20:29:22","date_gmt":"2022-03-26T20:29:22","guid":{"rendered":"https:\/\/andrejacobs.org\/?p=540"},"modified":"2022-04-11T20:22:58","modified_gmt":"2022-04-11T20:22:58","slug":"installing-prometheus-on-ubuntu-20-04","status":"publish","type":"post","link":"https:\/\/andrejacobs.org\/linux\/installing-prometheus-on-ubuntu-20-04\/","title":{"rendered":"Installing Prometheus on Ubuntu 20.04"},"content":{"rendered":"\n
I noticed that the hard drives<\/a> in my Ubuntu server were not spinning down after being idle, so I needed a way to monitor their power states and to verify, through trial and error, whether any of my attempts were working. Enter Prometheus to the rescue!<\/p>\n I will be following this guide<\/a> to start with.<\/p>\n This article will cover the following:<\/p>\n I would recommend you first check which ports are already in use and perhaps assign a less obvious port number. To see what ports are being used to listen on, run I use UFW, so I will allow access to the service only from the specific NIC, IP address and port.<\/p>\n Next I will be setting up Basic Auth following the official guide<\/a>.<\/p>\n I will be using a self-signed certificate for my server since it is only accessible from the local network.<\/p>\n I found this article<\/a> showing how you can generate a self-signed certificate using an IP address.<\/p>\n You will have issues getting your browser to trust the self-signed certificate. What I have done is to copy the cert.pem file over to my Mac and import it into Keychain. 
Then \u201cGet Info\u201d on the certificate and change the trust settings to \u201cAlways Trust\u201d.<\/p>\n Now I can open https:\/\/192.168.x.x:9090<\/a> in my browser and I am asked for the username and password, and the site shows that it is protected by TLS.<\/p>\n To gather some system metrics I will be installing and using the Prometheus Node Exporter following the official guide<\/a>.<\/p>\n While trying to figure out why my drives are no longer being put to sleep while idle, I came across this article<\/a> from Peter Marheine, which is what led me to install and use Prometheus in the first place.<\/p>\n In order to gather metrics from smartmon I will need to use the See my other article about openSeaChest<\/a> for more details on using Seagate\u2019s openSeaChest tools.<\/p>\n Since I am using openSeaChest to correctly control my Seagate drives I will use On the Prometheus dashboard I can already see that drives have started going into the various idle states.<\/p>\n After two days of trial and error I have finally managed to get the drives to spin down after about 25 minutes of inactivity.<\/p>\n https:\/\/linoxide.com\/how-to-install-prometheus-on-ubuntu\/<\/a><\/p>\n https:\/\/medium.com\/@antelle\/how-to-generate-a-self-signed-ssl-certificate-for-an-ip-address-f0dd8dddf754<\/a><\/p>\n\n
Prerequisites<\/h2>\n
\n
$ sudo apt update && sudo apt upgrade\n# Optionally remove packages no longer used\n$ sudo apt autoremove\n<\/code><\/pre>\n
\n
$ sudo mkdir -p \/etc\/prometheus\n$ sudo mkdir -p \/var\/lib\/prometheus\n<\/code><\/pre>\n
\n
Download and Install<\/h2>\n
\n
$ wget https:\/\/github.com\/prometheus\/prometheus\/releases\/download\/v2.34.0\/prometheus-2.34.0.linux-amd64.tar.gz\n$ tar -xvf prometheus-2.34.0.linux-amd64.tar.gz\n<\/code><\/pre>\n
\n
$ cd prometheus-2.34.0.linux-amd64\n$ sudo cp prometheus promtool \/usr\/local\/bin\/\n$ sudo cp -r consoles\/ console_libraries\/ \/etc\/prometheus\/\n$ sudo cp prometheus.yml \/etc\/prometheus\/prometheus.yml\n<\/code><\/pre>\n
\n
$ cd ~\n$ prometheus --version\n\nprometheus, version 2.34.0 (branch: HEAD, revision: 881111fec4332c33094a6fb2680c71fffc427275)\n build user: root@121ad7ea5487\n build date: 20220315-15:18:00\n go version: go1.17.8\n platform: linux\/amd64\n\n$ promtool --version\n\npromtool, version 2.34.0 (branch: HEAD, revision: 881111fec4332c33094a6fb2680c71fffc427275)\n build user: root@121ad7ea5487\n build date: 20220315-15:18:00\n go version: go1.17.8\n platform: linux\/amd64\n<\/code><\/pre>\n
Configure permissions<\/h2>\n
\n
$ sudo groupadd --system prometheus\n$ sudo useradd -s \/sbin\/nologin --system -g prometheus prometheus\n<\/code><\/pre>\n
\n
$ sudo chown -R prometheus:prometheus \/etc\/prometheus\/ \/var\/lib\/prometheus\/\n$ sudo chmod -R 775 \/etc\/prometheus\/ \/var\/lib\/prometheus\/\n<\/code><\/pre>\n
Configure Prometheus to be run as a systemd service<\/h2>\n
\n
$ sudo vi \/etc\/systemd\/system\/prometheus.service\n\n# Add the following\n[Unit]\nDescription=Prometheus\nWants=network-online.target\nAfter=network-online.target\n\n[Service]\nUser=prometheus\nGroup=prometheus\nRestart=always\nType=simple\nExecStart=\/usr\/local\/bin\/prometheus \\\n --config.file=\/etc\/prometheus\/prometheus.yml \\\n --storage.tsdb.path=\/var\/lib\/prometheus\/ \\\n --web.console.templates=\/etc\/prometheus\/consoles \\\n --web.console.libraries=\/etc\/prometheus\/console_libraries \\\n --web.listen-address=0.0.0.0:9090\n\n[Install]\nWantedBy=multi-user.target\n<\/code><\/pre>\n
netstat -tulpn | grep LISTEN<\/code><\/p>\n
\n
\/etc\/prometheus\/prometheus.yml<\/code> to suit your purposes.<\/li>\n
$ sudo systemctl enable prometheus\n<\/code><\/pre>\n
\n
$ sudo systemctl start prometheus\n$ sudo systemctl status prometheus\n...\n\u25cf prometheus.service - Prometheus\n Loaded: loaded (\/etc\/systemd\/system\/prometheus.service; enabled; vendor preset: enabled)\n Active: active (running) since Tue 2022-03-22 16:34:37 UTC; 15s ago\n Main PID: 57191 (prometheus)\n Tasks: 10 (limit: 76945)\n Memory: 19.4M\n<\/code><\/pre>\n
Allow access from firewall<\/h3>\n
# Get the name of the NIC to use\n$ ip a\n...\n2: enp2s0: <BROADCAST,MULTICAST,UP,LOWER_UP> ...\n\n# port 9090 here is what prometheus was configured on\n$ sudo ufw allow in on enp2s0 proto tcp to 192.168.x.x port 9090\n<\/code><\/pre>\n
\n
<\/p>\n
Securing Prometheus<\/h2>\n
\n
$ sudo apt install python3-bcrypt\n<\/code><\/pre>\n
\n
gen-bcrypt-password.py<\/code>.<\/li>\n<\/ul>\n
import getpass\nimport bcrypt\n\npassword = getpass.getpass("password: ")\nhashed_password = bcrypt.hashpw(password.encode("utf-8"), bcrypt.gensalt())\nprint(hashed_password.decode())\n<\/code><\/pre>\n
\n
$ python3 gen-bcrypt-password.py\n\npassword:\n$2b$12$z2shPn1IGtuyj7CL3E1Tb.dnci.HA7KXxFFysng3rWViTpaZK0LMS\n<\/code><\/pre>\n
\n
\/etc\/prometheus\/web.yml<\/code> config file. The username in the example is admin.<\/li>\n<\/ul>\n
$ sudo vi \/etc\/prometheus\/web.yml\n<\/code><\/pre>\n
basic_auth_users:\n admin: $2b$12$z2shPn1IGtuyj7CL3E1Tb.dnci.HA7KXxFFysng3rWViTpaZK0LMS\n<\/code><\/pre>\n
\n
# Change the permissions\n$ sudo chown prometheus:prometheus \/etc\/prometheus\/web.yml\n$ sudo chmod 640 \/etc\/prometheus\/web.yml\n\n$ sudo promtool check web-config \/etc\/prometheus\/web.yml\n\/etc\/prometheus\/web.yml SUCCESS\n<\/code><\/pre>\n
\n
$ sudo vi \/etc\/systemd\/system\/prometheus.service\n\n# Modify this setting to include\nExecStart=\n...\n# Add this bit (ensure you add \\ to the previous line)\n --web.config.file=\/etc\/prometheus\/web.yml\n\n# Restart the service\n$ sudo systemctl daemon-reload\n$ sudo systemctl restart prometheus\n$ sudo systemctl status prometheus\n<\/code><\/pre>\n
\n
$ curl --head http:\/\/localhost:9090\/graph\nHTTP\/1.1 401 Unauthorized\n\n# Open the dashboard in your browser again and you should be\n# asked for username and password.\n<\/code><\/pre>\n
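The 401 above is the server asking for credentials. As a minimal sketch of what a client sends in reply, the snippet below builds the HTTP Basic `Authorization` header value (base64 of `username:password`). The function name `basic_auth_header` and the password `secret` are placeholders for illustration and are not part of the setup above; only the `admin` username comes from the web.yml example.

```python
import base64


def basic_auth_header(username: str, password: str) -> str:
    """Return the value of an HTTP Basic Authorization header."""
    # Basic auth is base64("username:password"); it is an encoding, not encryption,
    # which is one reason to enable TLS as well.
    token = base64.b64encode(f"{username}:{password}".encode("utf-8"))
    return "Basic " + token.decode("ascii")


if __name__ == "__main__":
    # "admin" is the username from web.yml; "secret" is a placeholder password.
    print(basic_auth_header("admin", "secret"))
```

Prometheus compares the password it receives against the bcrypt hash stored in web.yml, so the plain password never needs to be kept on the server.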
Enable TLS<\/h3>\n
\n
generate-ip-cert.sh<\/code>.<\/li>\n<\/ul>\n
#!\/bin\/sh\n# Generate a self signed certificate using the IP\n# Based on: https:\/\/raw.githubusercontent.com\/antelle\/generate-ip-cert\/master\/generate-ip-cert.sh\n\nIP=$(echo $1 | egrep -o "^[0-9]{1,3}\\.[0-9]{1,3}\\.[0-9]{1,3}\\.[0-9]{1,3}$")\n\nif [ ! $IP ]\nthen\n echo "Usage: generate-ip-cert.sh 127.0.0.1"\n exit 1\nfi\n\necho "[req]\ndefault_bits = 4096\ndistinguished_name = req_distinguished_name\nreq_extensions = req_ext\nx509_extensions = v3_req\nprompt = no\n\n[req_distinguished_name]\ncountryName = XX\nstateOrProvinceName = N\/A\nlocalityName = N\/A\norganizationName = Self-signed certificate\ncommonName = $IP: Self-signed certificate\n\n[req_ext]\nsubjectAltName = @alt_names\n\n[v3_req]\nsubjectAltName = @alt_names\n\n[alt_names]\nIP.1 = $IP\n" > san.cnf\n\nopenssl req -x509 -nodes -days 3650 -newkey rsa:4096 -keyout key.pem -out cert.pem -config san.cnf\nrm san.cnf\n<\/code><\/pre>\n
\n
$ .\/generate-ip-cert.sh 192.168.x.x\n\n$ sudo mv cert.pem key.pem \/etc\/prometheus\n$ sudo chown prometheus:prometheus \/etc\/prometheus\/*.pem\n<\/code><\/pre>\n
\n
tls_server_config:\n cert_file: \/etc\/prometheus\/cert.pem\n key_file: \/etc\/prometheus\/key.pem\n<\/code><\/pre>\n
\n
$ sudo systemctl daemon-reload\n$ sudo systemctl restart prometheus\n<\/code><\/pre>\n
Gather the first metrics<\/h2>\n
\n
$ wget https:\/\/github.com\/prometheus\/node_exporter\/releases\/download\/v1.3.1\/node_exporter-1.3.1.linux-amd64.tar.gz\n$ tar xvfz node_exporter-*.*-amd64.tar.gz\n<\/code><\/pre>\n
\n
$ cd node_exporter-*.*-amd64\n$ .\/node_exporter\n...\n# See some output.\n\n# Start a new SSH session and get the metrics from this new service\n$ curl http:\/\/localhost:9100\/metrics\n...\n# See some metric output.\n<\/code><\/pre>\n
\n
$ sudo cp node_exporter \/usr\/local\/bin\/\n$ sudo vi \/etc\/systemd\/system\/node_exporter.service\n<\/code><\/pre>\n
[Unit]\nDescription=Node Exporter\nAfter=network.target\n\n[Service]\nUser=prometheus\nGroup=prometheus\nType=simple\nExecStart=\/usr\/local\/bin\/node_exporter\n\n[Install]\nWantedBy=multi-user.target\n<\/code><\/pre>\n
\n
$ sudo systemctl daemon-reload\n$ sudo systemctl start node_exporter\n# Enable the service to start at boot\n$ sudo systemctl enable node_exporter\n\n# Check status\n$ sudo systemctl status node_exporter\n<\/code><\/pre>\n
\n
$ sudo vi \/etc\/prometheus\/prometheus.yml\n\n# Add the following into the scrape_configs section\n - job_name: 'node_exporter_metrics'\n scrape_interval: 1m\n static_configs:\n - targets: ['localhost:9100']\n\n$ sudo systemctl daemon-reload\n$ sudo systemctl restart prometheus\n<\/code><\/pre>\n
\n
node_exporter.service<\/code> file.<\/li>\n<\/ul>\n
# Example of disabling all collectors and then specify each collector individually\nExecStart=\/usr\/local\/bin\/node_exporter \\\n --collector.disable-defaults \\\n --collector.<name>\n<\/code><\/pre>\n
\u201cReal world\u201d usage: Check how often my drives are being spun up<\/h2>\n
textfile<\/code> collector from
node_exporter<\/code>.<\/p>\n
\n
$ sudo mkdir -p \/var\/lib\/prometheus\/textfiles\n$ sudo chown prometheus:prometheus \/var\/lib\/prometheus\/textfiles\n<\/code><\/pre>\n
\n
$ sudo vi \/etc\/systemd\/system\/node_exporter.service\n\n# Modify this value to include the textfile collector if needed\n# and you do need to set the textfile.directory\nExecStart=\/usr\/local\/bin\/node_exporter \\\n --collector.textfile \\\n --collector.textfile.directory=\/var\/lib\/prometheus\/textfiles\n\n# Reload service\n$ sudo systemctl daemon-reload\n$ sudo systemctl restart node_exporter\n$ sudo systemctl status node_exporter\n<\/code><\/pre>\n
\n
$ sudo mkdir -p \/usr\/local\/bin\/node_exporter_textfile_collector\n<\/code><\/pre>\n
Trial run<\/h3>\n
\n
$ sudo vi \/usr\/local\/bin\/node_exporter_textfile_collector\/hello_world.sh\n\n#!\/bin\/bash\nMAGIC=`shuf -i 0-10 -n 1`\necho '# HELP test_hello_world_number Used for testing that the textfile collector is working'\necho '# TYPE test_hello_world_number gauge'\necho "test_hello_world_number ${MAGIC}"\n\n# Save and change permissions\n$ sudo chmod +x \/usr\/local\/bin\/node_exporter_textfile_collector\/hello_world.sh\n# Run the script a couple of times and you should see output like this\n\n# HELP test_hello_world_number Used for testing that the textfile collector is working\n# TYPE test_hello_world_number gauge\ntest_hello_world_number 4\n<\/code><\/pre>\n
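To sanity check what a collector script prints before handing it to node_exporter, a small parser can help. This is only a sketch under simplifying assumptions: each sample line is `name value` with no trailing timestamp, and `# HELP` / `# TYPE` comment lines are skipped. The function name `parse_textfile_metrics` is made up for this example and is not part of node_exporter.

```python
def parse_textfile_metrics(text: str) -> dict:
    """Parse simple 'name value' sample lines, ignoring comments and blank lines."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the last space so a label set such as {dev="/dev/sda"}
        # stays attached to the metric name.
        name, _, value = line.rpartition(" ")
        samples[name] = float(value)
    return samples


if __name__ == "__main__":
    sample = (
        "# HELP test_hello_world_number Used for testing that the textfile collector is working\n"
        "# TYPE test_hello_world_number gauge\n"
        "test_hello_world_number 4\n"
    )
    print(parse_textfile_metrics(sample))
```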
\n
sponge<\/code> to write the output files. Looking into this, it is because you want to ensure a textfile\u2019s metrics are written completely before node_exporter starts reading them, i.e. atomically. To install sponge
sudo apt install moreutils<\/code>.<\/li>\n
$ sudo sh -c '\/usr\/local\/bin\/node_exporter_textfile_collector\/hello_world.sh | sponge \\\n\/var\/lib\/prometheus\/textfiles\/test_hello_world.prom'\n<\/code><\/pre>\n
<\/p>\n
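If a collector is written in Python rather than shell, the same "write completely before node_exporter reads it" goal can be approached by writing to a temporary file in the target directory and then renaming it over the final name. The sketch below assumes the temporary file and the target are on the same filesystem; the function name `write_prom_atomically` is hypothetical.

```python
import os
import tempfile


def write_prom_atomically(path: str, content: str) -> None:
    """Write content to path via a temporary file and a rename."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file does not end in .prom, so the textfile collector ignores it.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(content)
        # mkstemp creates the file with mode 0600; make it readable for the
        # user that node_exporter runs as.
        os.chmod(tmp_path, 0o644)
        # Replace the target in one step so readers see either the old or the new file.
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

A call such as `write_prom_atomically("/var/lib/prometheus/textfiles/test_hello_world.prom", "test_hello_world_number 4\n")` would then stand in for the `sponge` pipeline, assuming the script has write access to that directory.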
Smartmon monitoring<\/h3>\n
\n
\/usr\/local\/bin\/node_exporter_textfile_collector\/smartmon.sh<\/code>. Make it executable
sudo chmod +x<\/code><\/li>\n
$ sudo \/usr\/local\/bin\/node_exporter_textfile_collector\/smartmon.sh\n...\n# Expect to see a lot of stats here. If you only get the version then the\n# user running the script does not have enough permissions for smartctl.\n\n# Next, schedule the script to run every 5 minutes (a root cron job is used below)\n# to produce a textfile for node_exporter to pick up.\n<\/code><\/pre>\n
\n
$ sudo crontab -e\n\n# Export smartctl metrics to Prometheus\n*\/5 * * * * \/usr\/local\/bin\/node_exporter_textfile_collector\/smartmon.sh | sponge \/var\/lib\/prometheus\/textfiles\/smartmon.prom\n\n<\/code><\/pre>\n
\n
Time to roll my own script<\/h3>\n
openSeaChest_PowerControl<\/code> to report on the power status and capture that to a textfile for node_exporter to pick up.<\/p>\n
\n
$ cd openSeaChest\/builddir\n$ sudo mkdir -p \/usr\/local\/bin\/openSeaChest\n$ sudo cp openSeaChest_* \/usr\/local\/bin\/openSeaChest\/\n<\/code><\/pre>\n
\n
$ sudo \/usr\/local\/bin\/openSeaChest\/openSeaChest_PowerControl \\\n-q -d \/dev\/... --checkPowerMode\n\n# This is the list of the various states\n\nDevice is in the PM0: Active state or PM1: Idle State\nDevice is in the PM2: Standby state and device is in the Standby_z power condition\nDevice is in the PM2: Standby state.\nDevice is in the PM1: Idle state and the device is in the Idle_a power condition\nDevice is in the PM1: Idle state and the device is in the Idle_b power condition\nDevice is in the PM1: Idle state and the device is in the Idle_c power condition\n<\/code><\/pre>\n
\n
\/usr\/local\/bin\/node_exporter_textfile_collector\/drive_powermode.sh<\/code><\/li>\n<\/ul>\n
#!\/bin\/bash\n# Check drive states and report them as metrics for the node exporter textfile collector\n\necho '# HELP andre_drive_powermode Report the power mode of the \/dev\/sd? drives'\necho '# TYPE andre_drive_powermode gauge'\n\nfunction checkPowerMode() {\n # Default to 1.0 (active) when the state is not recognised\n local gauge=1.0\n local powerMode=$(\/usr\/local\/bin\/openSeaChest\/openSeaChest_PowerControl -q -d "$1" --checkPowerMode)\n \n if echo "$powerMode" | grep -q -w 'Active'; then\n gauge=1.0\n elif echo "$powerMode" | grep -q -w 'Standby'; then\n gauge=0.0\n elif echo "$powerMode" | grep -q -w 'Idle_c'; then\n gauge=0.25\n elif echo "$powerMode" | grep -q -w 'Idle_b'; then\n gauge=0.5\n elif echo "$powerMode" | grep -q -w 'Idle_a'; then\n gauge=0.75\n fi\n \n echo "andre_drive_powermode{dev=\\"$1\\"} $gauge"\n}\n\nfor drive in \/dev\/sd? ; do\n checkPowerMode "$drive"\ndone\n<\/code><\/pre>\n
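The mapping from openSeaChest's report text to a gauge value can be tested without touching a real drive. The sketch below mirrors the order of checks in the shell script (Active, Standby, Idle_c, Idle_b, Idle_a) using whole-word matches, and falls back to 1.0 for unrecognised text, as the script does. The function name `power_mode_to_gauge` is made up for this illustration.

```python
import re

# Checked in the same order as the shell script; the first match wins.
_POWER_MODE_GAUGES = [
    ("Active", 1.0),
    ("Standby", 0.0),
    ("Idle_c", 0.25),
    ("Idle_b", 0.5),
    ("Idle_a", 0.75),
]


def power_mode_to_gauge(report: str) -> float:
    """Map a --checkPowerMode report line to the gauge value used in the script."""
    for keyword, value in _POWER_MODE_GAUGES:
        # \b gives a whole-word match, similar to grep -w.
        if re.search(rf"\b{re.escape(keyword)}\b", report):
            return value
    return 1.0  # Unknown text is treated as active.


if __name__ == "__main__":
    line = "Device is in the PM1: Idle state and the device is in the Idle_b power condition"
    print(power_mode_to_gauge(line))
```

The numeric values are arbitrary levels chosen for graphing, where lower means a deeper power-saving state.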
\n
$ sudo crontab -e\n\n# Export drive power mode metrics to Prometheus\n*\/5 * * * * \/usr\/local\/bin\/node_exporter_textfile_collector\/drive_powermode.sh | sponge \/var\/lib\/prometheus\/textfiles\/drive_powermode.prom\n<\/code><\/pre>\n
<\/p>\n
<\/p>\n
References:<\/h2>\n