Grafana
Logging, really. The page is titled Grafana only because it is a popular open-source log monitoring tool.
Installation
> sudo apt install -y software-properties-common
> wget -q -O - https://packages.grafana.com/gpg.key | sudo apt-key add -
> echo "deb https://packages.grafana.com/oss/deb stable main" | sudo tee -a /etc/apt/sources.list.d/grafana.list
> sudo apt update
> sudo apt install grafana
> sudo systemctl daemon-reload
> sudo systemctl enable grafana-server
> sudo systemctl start grafana-server
Following this tutorial: https://grafana.com/tutorials/grafana-fundamentals/
- Eventually can take a look at this too: https://prometheus.io/docs/tutorials/getting_started/
- Very convenient Docker ps command:
docker ps --format "table {{.Image}}\t{{.Names}}\t{{.Status}}\t{{.Ports}}"
- Or you can try dry:
curl -sSf https://moncho.github.io/dry/dryup.sh | sudo sh && sudo chmod 755 /usr/local/bin/dry
Prometheus
https://www.digitalocean.com/community/tutorials/how-to-install-prometheus-on-ubuntu-16-04
sudo useradd --no-create-home --shell /bin/false prometheus
sudo mkdir /etc/prometheus
sudo mkdir /var/lib/prometheus
sudo chown prometheus:prometheus /etc/prometheus
sudo chown prometheus:prometheus /var/lib/prometheus
wget https://github.com/prometheus/prometheus/releases/download/v2.34.0/prometheus-2.34.0.linux-amd64.tar.gz
tar -xvf prometheus-2.34.0.linux-amd64.tar.gz
cd prometheus-2.34.0.linux-amd64
sudo cp prometheus /usr/local/bin/
sudo cp promtool /usr/local/bin/
sudo cp -r consoles/ /etc/prometheus/
sudo cp -r console_libraries/ /etc/prometheus/
sudo chown prometheus:prometheus /usr/local/bin/prometheus
sudo chown prometheus:prometheus /usr/local/bin/promtool
sudo chown -R prometheus:prometheus /etc/prometheus

# vim /etc/prometheus/prometheus.yml
global:
  scrape_interval: 15s

scrape_configs:
  - job_name: "prometheus"
    scrape_interval: 5s
    static_configs:
      - targets: ["localhost:9090"]
  - job_name: "node_exporter"
    scrape_interval: 5s
    static_configs:
      - targets: ["localhost:9100"]

# vim /etc/systemd/system/prometheus.service
[Unit]
Description=Prometheus
Wants=network-online.target
After=network-online.target

[Service]
User=prometheus
Group=prometheus
Type=simple
ExecStart=/usr/local/bin/prometheus \
    --config.file /etc/prometheus/prometheus.yml \
    --storage.tsdb.path /var/lib/prometheus/ \
    --web.console.templates=/etc/prometheus/consoles \
    --web.console.libraries=/etc/prometheus/console_libraries

[Install]
WantedBy=multi-user.target

# systemctl start prometheus
# systemctl enable prometheus
#
# Download node_exporter from https://prometheus.io/download/#node_exporter
# chown prometheus:prometheus node_exporter
#
# Note that collectors can be whitelisted with: --collectors.enabled meminfo,loadavg,filesystem
# (that is the old syntax; node_exporter 0.15 and later use per-collector flags such as --collector.meminfo)

# vim /etc/systemd/system/node_exporter.service
[Unit]
Description=Node Exporter
Wants=network-online.target
After=network-online.target

[Service]
User=prometheus
Group=prometheus
Type=simple
ExecStart=/usr/local/bin/node_exporter

[Install]
WantedBy=multi-user.target
Proceed to do all the usual proxying etc.
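Once both services are running, it may be worth confirming that Prometheus actually scrapes the two jobs before putting a proxy in front. The sketch below reads the `/api/v1/targets` endpoint of the Prometheus HTTP API; the function names are made up for this note, and the default URL simply assumes the `localhost:9090` address used in the config above.

```python
import json
from urllib.request import urlopen


def fetch_targets(base_url: str = "http://localhost:9090") -> dict:
    """Fetch the target list from the Prometheus HTTP API (needs a running server)."""
    with urlopen(f"{base_url}/api/v1/targets", timeout=5) as resp:
        return json.load(resp)


def down_targets(payload: dict) -> list:
    """Return (job, scrape URL) pairs for active targets whose health is not "up"."""
    return [
        (target["labels"].get("job", "?"), target["scrapeUrl"])
        for target in payload["data"]["activeTargets"]
        if target["health"] != "up"
    ]
```

Calling `down_targets(fetch_targets())` on the host should return an empty list when both `prometheus` and `node_exporter` are being scraped; anything else points at the job to look into.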
Loki
Installation: https://grafana.com/docs/loki/latest/installation/local/
# Download promtail
> curl -O -L "https://github.com/grafana/loki/releases/download/v2.5.0/promtail-linux-amd64.zip"
> unzip promtail-linux-amd64.zip
> sudo mv promtail-linux-amd64 /usr/local/bin/promtail
> sudo chown prometheus:prometheus /usr/local/bin/promtail

# Download loki
> curl -O -L "https://github.com/grafana/loki/releases/download/v2.5.0/loki-linux-amd64.zip"
> unzip loki-linux-amd64.zip
> sudo mv loki-linux-amd64 /usr/local/bin/loki
> sudo chown prometheus:prometheus /usr/local/bin/loki

# Download loki configuration
> sudo mkdir -p /etc/loki/
> sudo wget https://raw.githubusercontent.com/grafana/loki/master/cmd/loki/loki-local-config.yaml -O /etc/loki/loki.yml
> sudo chown -R prometheus:prometheus /etc/loki
> sudo cat /etc/loki/loki.yml
auth_enabled: false

server:
  http_listen_port: 3101
  grpc_listen_port: 9096

common:
  path_prefix: /tmp/loki
  storage:
    filesystem:
      chunks_directory: /tmp/loki/chunks
      rules_directory: /tmp/loki/rules
  replication_factor: 1
  ring:
    instance_addr: 127.0.0.1
    kvstore:
      store: inmemory

schema_config:
  configs:
    - from: 2020-10-24
      store: boltdb-shipper
      object_store: filesystem
      schema: v11
      index:
        prefix: index_
        period: 24h

ruler:
  alertmanager_url: http://localhost:9093

# Download promtail configuration
> sudo mkdir -p /etc/promtail/
> sudo wget https://raw.githubusercontent.com/grafana/loki/main/clients/cmd/promtail/promtail-local-config.yaml -O /etc/promtail/promtail.yml
> sudo chown -R prometheus:prometheus /etc/promtail
> sudo cat /etc/promtail/promtail.yml

# Create systemd services, as usual
# Reader's exercise to do the same for promtail
> cat /etc/systemd/system/loki.service
[Unit]
Description=Loki
Wants=network-online.target
After=network-online.target

[Service]
User=prometheus
Group=prometheus
Type=simple
ExecStart=/usr/local/bin/loki \
    --config.file=/etc/loki/loki.yml

[Install]
WantedBy=multi-user.target
Stuff to note
- Concepts: promtail is the agent that tails log files and picks up new lines whenever the files change. It can emit logs or metrics, which are then handled by Loki or Prometheus respectively. Grafana in turn queries Loki for logs and Prometheus for metrics.
- Logs
- Best practices when setting up promtail and Loki. Key points:
- Use labels only for bounded values (at most tens of values). Each unique combination of label values is indexed as a separate stream, so unbounded labels will kill Loki.
- Loki monitors logs exclusively, i.e. there is no functionality to monitor metrics/values over time. Use the metrics feature of promtail for this, and have Prometheus scrape from promtail instead.
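To make the label advice concrete: a stream is identified by its full label set, so the number of streams equals the number of distinct label combinations. The sketch below counts them for made-up entries; the label names and values are hypothetical, and this is plain Python for illustration, not Loki code.

```python
from typing import Iterable, Mapping


def count_streams(entries: Iterable[Mapping[str, str]]) -> int:
    """Count distinct label sets, i.e. how many streams these entries would create."""
    return len({frozenset(labels.items()) for labels in entries})


# Hypothetical: 1000 log lines from one job, labelled by a two-valued level.
bounded = [{"job": "tapo", "level": "info" if i % 2 else "warn"} for i in range(1000)]

# Same number of lines, but a per-line value is promoted to a label.
unbounded = [{"job": "tapo", "reading": str(i)} for i in range(1000)]

print(count_streams(bounded))    # 2
print(count_streams(unbounded))  # 1000
```

The bounded case stays at two streams however many lines arrive, while the unbounded case grows by one stream per new value, which is the failure mode the first bullet warns about.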
The basic use of pipelines in promtail is as follows:
...
scrape_configs:
  - job_name: system
    static_configs:
      - labels:
          job: tapo
          __path__: /var/log/tapo/energy.log
    pipeline_stages:
      - match:
          selector: '{job="tapo"}'
          stages:
            - regex:
                expression: '(?P<timestamp>[\dT\-+:]+)\t(?P<energy>\d+).*'
            - labels:
                energy:
            - timestamp:
                format: RFC3339
                source: timestamp
With ingested logs:
...
2022-04-25T12:10:01+08:00 29773 [29, 30, 5]
2022-04-25T12:11:02+08:00 29684 [29, 30, 5]
2022-04-25T12:12:02+08:00 29604 [29, 30, 6]
2022-04-25T12:13:01+08:00 29123 [29, 30, 6]
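A quick way to sanity-check the regex stage outside promtail is to run the same expression over a sample line. This is a rough sketch in Python, which shares the `(?P<name>...)` named-group syntax with Go's regexp engine; `parse_line` is a made-up helper, and the tab-separated sample is an assumption based on the `\t` in the expression.

```python
import re
from datetime import datetime

# Same expression as in the regex stage of the pipeline.
LINE_RE = re.compile(r"(?P<timestamp>[\dT\-+:]+)\t(?P<energy>\d+).*")


def parse_line(line: str):
    """Return (timestamp, energy) for a matching line, or None if nothing matches."""
    match = LINE_RE.search(line)
    if match is None:
        return None
    # RFC3339 timestamps with a numeric offset are accepted by fromisoformat.
    timestamp = datetime.fromisoformat(match.group("timestamp"))
    return timestamp, int(match.group("energy"))


# Assumption: fields in energy.log are separated by tabs.
sample = "2022-04-25T12:10:01+08:00\t29773\t[29, 30, 5]"
print(parse_line(sample))
```

A line whose fields are separated by spaces rather than tabs returns `None` here, which is worth checking first if the extracted fields never show up in Grafana.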
Quick reads:
- Logs vs metrics - log is the usual, metrics are aggregations (and hence more scalable): https://grafana.com/blog/2016/01/05/logs-and-metrics-and-graphs-oh-my/
- InfluxDB's time structured merge tree: https://docs.influxdata.com/influxdb/v1.7/concepts/storage_engine/
- On using Grafana + Prometheus + AlertManager stack:
Sidenote, some quick reading on K8s:
- K8s vs EC2 Docker autoscaled instances: https://ably.com/blog/no-we-dont-use-kubernetes
kb/intranet/services/admin/start.txt · Last modified: 19 months ago ( 2 May 2023) by 127.0.0.1