Linux Performance Monitoring: A Complete Guide to Essential Tools

This guide introduces essential and advanced Linux monitoring tools, shows how to interpret their metrics, and helps you quickly identify performance bottlenecks.

Updated on February 14, 2026
15-30 min
Medium
FixPedia Team
Applies to: Ubuntu 22.04, CentOS 8, Debian 11, Fedora 36+

Introduction to Linux Performance Monitoring

Linux performance monitoring is not just about checking CPU load. It's a comprehensive analysis of the system: processor, memory, disks, network, and I/O. Understanding metrics helps prevent downtime, optimize resource costs, and quickly respond to anomalies.

In this guide, you'll master both basic utilities and advanced tools. We'll focus on practical scenarios: how to find a "hot" process, why a disk is slow, why the network is overloaded. All commands work on most distributions (Ubuntu, CentOS, Debian, Fedora).

Basic Utilities for Daily Use

top and htop: Interactive Process Monitoring

top is usually the first tool to reach for when a system feels slow. Run it and study the screen:

top

Key lines:

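  • %Cpu(s): CPU time breakdown into us (user processes), sy (kernel), id (idle), and wa (waiting for I/O).
  • MiB Mem (KiB Mem in older versions of top): RAM usage: total, free, used, buff/cache.
  • MiB Swap (KiB Swap in older versions of top): swap usage.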

Sorting: press P (by CPU) or M (by memory); both are uppercase. To show individual threads instead of whole processes, add -H at startup: top -H.

Tip: htop is an improved alternative with colors, a process tree, and convenient process management. Install it via sudo apt install htop or sudo yum install htop (on CentOS/RHEL, enable the EPEL repository first).
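
The percentages on the %Cpu(s) line are derived from the counters in /proc/stat. The following is a minimal sketch of that arithmetic, assuming the standard field order of the aggregate cpu line (user, nice, system, idle, iowait, irq, softirq, steal); the function name cpu_busy_pct is hypothetical, not part of any tool.

```shell
# cpu_busy_pct BEFORE AFTER
# BEFORE and AFTER are the aggregate "cpu" lines from two reads of /proc/stat.
# Prints the share of time the CPU was neither idle nor waiting for I/O.
cpu_busy_pct() {
  printf '%s\n%s\n' "$1" "$2" | awk '
    {
      total = 0
      # fields 2..9: user nice system idle iowait irq softirq steal
      for (i = 2; i <= 9; i++) total += $i
      idle = $5 + $6                      # idle + iowait
      if (NR == 1) { t0 = total; i0 = idle }
      else         { t1 = total; i1 = idle }
    }
    END {
      dt = t1 - t0
      if (dt <= 0) { print "0.0"; exit }
      printf "%.1f\n", 100 * (dt - (i1 - i0)) / dt
    }'
}
```

Usage: a=$(grep '^cpu ' /proc/stat); sleep 1; b=$(grep '^cpu ' /proc/stat); cpu_busy_pct "$a" "$b". A longer sleep smooths out short spikes.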

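vmstat: Virtual Memory Statistics

vmstat N prints a system summary every N seconds. Ideal for a quick "health check". The first line of output shows averages since boot, so judge the current state by the lines that follow.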

vmstat 2

Example output:

procs -----------memory---------- ---swap-- -----io------ -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 2  0      0 123456  78900 456789    0    0   100   200  123  456 30 10 55  5  0

Decoding:

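  • r: runnable processes (running or waiting for a CPU). A value that stays above the number of cores indicates CPU shortage.
  • si/so: memory swapped in from disk and out to disk per second. Sustained non-zero values indicate insufficient RAM.
  • us/sy: CPU time in user and kernel code. A combined value above 80% indicates heavy CPU load.
  • wa: CPU time spent waiting for I/O. High wa (e.g., >20%) usually points to a slow or overloaded disk.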
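
The checks above can be sketched as a small filter over vmstat output. This is an illustration under stated assumptions: the column positions match the example output shown earlier (r in column 1, si/so in columns 7 and 8, wa in column 16), the 20% iowait threshold is the rule of thumb from this guide, and the name vmstat_flags is hypothetical.

```shell
# vmstat_flags CORES
# Reads vmstat output on stdin and prints a warning for each sample that
# crosses one of the thresholds: run queue, swap activity, iowait.
vmstat_flags() {
  awk -v cores="$1" '
    $1 !~ /^[0-9]+$/ { next }             # skip header lines (vmstat repeats them)
    {
      n++
      if ($1 > cores)       printf "sample %d: run queue %d > %d cores\n", n, $1, cores
      if ($7 > 0 || $8 > 0) printf "sample %d: swapping (si=%d so=%d)\n", n, $7, $8
      if ($16 > 20)         printf "sample %d: iowait %d%%\n", n, $16
    }'
}
```

Usage: vmstat 2 10 | vmstat_flags "$(nproc)". Sample 1 is the since-boot average, so warnings for it describe history rather than the current moment.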

iostat: Disk and CPU Details

iostat is part of the sysstat package; install it if you haven't already. Then run:

iostat -x 2

Key metrics for disks (Device):

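  • %util: percentage of time the device had at least one request in flight. Values close to 100% indicate saturation on a single HDD. SSDs and RAID arrays serve requests in parallel, so for them a high %util alone does not prove overload.
  • await: average time (in ms) a request spends queued and being serviced. High values (e.g., >50 ms for SSD) indicate a problem. Recent sysstat versions split it into r_await and w_await.
  • svctm: average service time per operation. Compare with await. If await >> svctm, requests spend most of their time in the queue. This field is unreliable and has been removed from recent sysstat versions; there, check the average queue size (aqu-sz) instead.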

For CPU: %user, %system, %idle.
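
Because the set of columns in iostat -x differs between sysstat versions, scripts are safer locating %util by its header name than by position. A minimal sketch of that idea follows; the function name busy_disks and the threshold argument are assumptions made for illustration.

```shell
# busy_disks THRESHOLD
# Reads `iostat -x` output on stdin and prints "device %util" for every
# device whose %util exceeds THRESHOLD.
busy_disks() {
  awk -v limit="$1" '
    $1 ~ /^Device/ {                      # header row ("Device" or "Device:")
      col = 0
      for (i = 1; i <= NF; i++) if ($i == "%util") col = i
      next
    }
    NF == 0 { col = 0; next }             # a blank line ends the device table
    col && $col + 0 > limit { printf "%s %s\n", $1, $col }'
}
```

Usage: LC_ALL=C iostat -x 2 3 | busy_disks 80. Setting LC_ALL=C keeps the decimal separator a dot so the numeric comparison works in any locale.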

df and du: Disk Space

Quick check of free space:

df -h

The -h flag prints sizes in human-readable units (K, M, G). Watch the Use% column. If it exceeds 90%, clean up logs or grow the volume.

To find the largest "space eaters" in a specific folder:

du -sh /var/* | sort -rh | head -10

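This shows the 10 largest entries (files and directories) directly under /var. Run it with sudo if some directories are not readable by your user.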
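
The 90% rule can be checked automatically. The sketch below assumes the POSIX output format of df -P (usage percentage in column 5, mount point in column 6) and mount points without spaces; the name full_filesystems is hypothetical.

```shell
# full_filesystems LIMIT
# Reads `df -P` output on stdin and prints mount points whose usage
# percentage exceeds LIMIT.
full_filesystems() {
  awk -v limit="$1" '
    NR == 1 { next }                      # skip the header line
    {
      pct = $5
      sub(/%$/, "", pct)                  # "93%" -> "93"
      if (pct + 0 > limit) printf "%s %s%%\n", $6, pct
    }'
}
```

Usage: df -P | full_filesystems 90.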

ss and netstat: Network Activity

ss is the modern replacement for netstat. Quick view of connections:

ss -tuln

Flags:

  • -t: TCP,
  • -u: UDP,
  • -l: listening,
  • -n: numeric (no name resolution).

For interface statistics:

ip -s link

Or for detailed network packet stats:

nstat
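
To see at a glance which TCP states dominate, the ss output can be grouped by its first column. A minimal sketch, assuming the default ss -tan layout with the state in column 1 and a single header line; socket_states is a hypothetical helper name.

```shell
# socket_states
# Reads `ss -tan` output on stdin and prints "count state", busiest first.
socket_states() {
  awk 'NR > 1 { count[$1]++ } END { for (s in count) print count[s], s }' |
    sort -rn
}
```

Usage: ss -tan | socket_states. Here -a includes sockets in every state, not only listening ones.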

Advanced Tools for Deep Analysis

sar: Historical Data Collection

sar (System Activity Reporter) records metrics at a fixed interval (every 10 minutes by default). Data is stored in /var/log/sysstat/ on Debian/Ubuntu or /var/log/sa/ on RHEL/CentOS/Fedora, in one file per day named saDD (e.g., sa14 for day 14).

View today's data:

sar -u  # CPU
sar -r  # Memory
sar -b  # I/O
sar -n DEV  # Network interfaces

Example: sar -u 2 5 — CPU every 2 seconds, 5 times.

Advantage: you can see what happened during a problem, even if you weren't at the terminal.
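
To look at a past day rather than today, point sar at that day's data file with -f and narrow the time window with -s and -e. The helper below is a small sketch that builds the file path; the name sa_file is hypothetical, and the default directory is the Debian/Ubuntu location (pass /var/log/sa as the second argument on RHEL-family systems).

```shell
# sa_file DAY [DIR]
# Prints the path of the sysstat data file for a day of the month.
sa_file() {
  day=${1#0}                              # drop a leading zero so printf does not read it as octal
  printf '%s/sa%02d\n' "${2:-/var/log/sysstat}" "$day"
}
```

Usage: sar -u -f "$(sa_file 14)" -s 09:00:00 -e 10:00:00 shows CPU usage between 09:00 and 10:00 on day 14, provided that file still exists on the system.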

nmon: Interactive Monitoring of All Resources

Install nmon (sudo apt install nmon). Run:

nmon

Keys:

  • c — CPU,
  • m — memory,
  • d — disks,
  • n — network,
  • t — top processes,
  • q — exit.

nmon is useful for a quick overview and session recording (.nmon file), which can later be analyzed in Excel or via nmon2csv.

glances: Cross-Platform Monitoring

glances is a Python utility that combines many metrics in one interactive interface. Installation:

pip install glances
# or for the system:
sudo apt install glances

Run: glances. Supports colors, alerts (thresholds), export to JSON, InfluxDB, Elasticsearch.

Graphical and Web Solutions

For long-term monitoring and visualization, use combinations:

  1. Prometheus + Grafana: Prometheus collects metrics via exporters (node_exporter), and Grafana displays them on dashboards.
  2. Netdata: "out-of-the-box" monitoring with a web interface on port 19999. Install: bash <(curl -Ss https://my-netdata.io/kickstart.sh).
  3. Zabbix/Nagios: for enterprise monitoring with alerts.

Practical Scenarios

Scenario 1: High CPU Load

  1. Run top or htop.
  2. Sort by %CPU. Find the process with the highest consumption.
  3. If it's java, python, node — check the application logs.
  4. If it's kworker or migration — the problem might be in the kernel or IRQ.
  5. Use perf top for profiling (perf ships in the linux-tools packages on Ubuntu and in the perf package on CentOS/Fedora).

Scenario 2: Disk Fully Busy

  1. iostat -x 2 — look at %util and await per disk.
  2. iotop (install via sudo apt install iotop) — shows which process is writing/reading.
  3. If await is high but %util is low — the problem might be in the network (NFS, iSCSI).
  4. Check disk queue: cat /proc/diskstats | grep <device>.
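
For step 4, the most useful number in /proc/diskstats is field 12, the count of I/O requests currently in progress. A minimal sketch of extracting it; the function name inflight_io is hypothetical, and the optional file argument exists only so the function can be tried on saved data.

```shell
# inflight_io DEVICE [FILE]
# Prints the number of I/O requests currently in progress for DEVICE.
# FILE defaults to /proc/diskstats; field 3 is the device name,
# field 12 is "I/Os currently in progress".
inflight_io() {
  awk -v dev="$1" '$3 == dev { print $12 }' "${2:-/proc/diskstats}"
}
```

Usage: inflight_io sda. A value that stays well above zero across repeated calls means requests are queuing up on that device.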

Scenario 3: Memory Shortage

  1. free -h: look at the available column and at swap usage.
  2. If swap is actively used (si/so in vmstat >0) — insufficient RAM.
  3. ps aux --sort=-%mem | head -10 — top 10 by memory.
  4. Check cache: cat /proc/meminfo | grep -E "Cached|Buffers". Large cache is normal; the OS uses free RAM.
  5. If a process is "eating" memory — look for leaks (e.g., via valgrind for C/C++).
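
The same check can be scripted against /proc/meminfo, which is where free reads its numbers. The sketch below assumes a kernel that exposes MemAvailable (3.14 or later); mem_available_pct is a hypothetical name.

```shell
# mem_available_pct [FILE]
# Prints MemAvailable as a whole-number percentage of MemTotal.
# FILE defaults to /proc/meminfo.
mem_available_pct() {
  awk '
    $1 == "MemTotal:"     { total = $2 }
    $1 == "MemAvailable:" { avail = $2 }
    END { if (total > 0) printf "%d\n", 100 * avail / total }' "${1:-/proc/meminfo}"
}
```

Usage: [ "$(mem_available_pct)" -lt 10 ] && echo "Low memory". The 10% threshold here is only an example value.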

Scenario 4: Network Overload

  1. ip -s link: check the errors and dropped counters per interface.
  2. ss -s: summary of sockets by state (e.g., a large timewait count).
  3. nethogs (install it from your package manager): shows traffic per process.
  4. iftop — similar to top, but for network.

Automation and Alerting

For regular data collection, enable the sysstat service, which runs the sar collector on a schedule:

# Enable data collection (if not running)
sudo systemctl enable sysstat
sudo systemctl start sysstat

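The file /etc/default/sysstat (Debian/Ubuntu) or /etc/sysconfig/sysstat (RHEL/CentOS) contains collection parameters (e.g., SA1_OPTIONS="-S XALL" for all metrics; depending on the sysstat version the variable may be named SADC_OPTIONS instead). On Debian/Ubuntu, also set ENABLED="true" in /etc/default/sysstat, otherwise no data is collected.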

For alerts, use:

  • monit — simple daemon that watches processes, disks, CPU.
  • nagios/zabbix — complex systems with web interfaces.
  • Bash/Python scripts that check metrics and send notifications (e.g., via mail or Telegram API).

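Example script that alerts when the 1-minute load average exceeds the number of cores (it requires a configured mail command):

#!/bin/bash
# Alert when the 1-minute load average exceeds the number of CPU cores.
LOAD=$(awk '{print $1}' /proc/loadavg)  # 1-minute load average
THRESHOLD=$(nproc)                      # number of cores
# awk does the floating-point comparison, so bc is not required
if awk -v load="$LOAD" -v limit="$THRESHOLD" 'BEGIN { exit !(load > limit) }'; then
  echo "High load: $LOAD" | mail -s "Alert: CPU load" admin@example.com
fi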

Interpreting Metrics and Prevention

Key Indicators

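  • CPU: sustained %idle < 20% suggests overload. For busy web servers, 70-80% utilization can still be normal if the run queue stays short.
  • Memory: available < 10% of total is an alarm. Watch swap: if it is active, that is a sign of insufficient RAM.
  • Disk: as a rough guide, await > 10 ms for SSD or > 20 ms for HDD suggests a problem (an HDD is slower, so its threshold is the higher one). %util > 80% means the disk is close to saturation.
  • Network: rising dropped or errors counters point to overload or a driver problem.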

Prevention

  • Regularly check logs (/var/log/syslog on Debian/Ubuntu, /var/log/messages on RHEL/CentOS, journalctl, and dmesg for kernel messages).
  • Set up monitoring with thresholds (e.g., CPU > 90% for 5 minutes).
  • Limit processes via cgroups (systemd slice, docker limits).
  • Update kernel and drivers — sometimes problems are fixed in new versions.
  • For I/O-intensive tasks, use ionice and nice.

Common Beginner Mistakes

  • Looking only at process CPU usage in top without considering wa, and so missing I/O problems.
  • Treating the free column in free -m as "free memory" and ignoring cache. Use the available column instead.
  • Ignoring si/so in vmstat: active swapping kills performance.
  • Not setting up alerts, and finding out about a problem only after the server has crashed.

Conclusion

Monitoring is a continuous process. Start with basic utilities (top, vmstat, iostat), then add sar for history and glances/nmon for a comprehensive overview. For production environments, definitely set up graphical dashboards (Grafana) and alerts.

Remember: metrics without context are useless. Know your workload: requests per second, data volume, peak hours. Then anomalies will be visible immediately.

Hints

Install necessary utilities
Use `top` for a quick overview
Analyze overall statistics with `vmstat`
Check disk activity with `iostat`
Monitor the network with `ss` and `netstat`
Collect historical data with `sar`
