Linux Monitoring: 10 Essential Utilities for Administration

Introduction / Why This Is Needed

Continuous monitoring of system resources is the foundation of stable operation for any Linux server or workstation. Without knowing what is happening with the CPU, memory, disks, and network, problem diagnosis becomes guesswork. This guide introduces a set of utilities that are either preinstalled or easy to install on most distributions. You'll gain practical skills for quickly identifying bottlenecks and maintaining system performance.

Requirements / Preparation

Access to a command line (SSH or local terminal).
Superuser privileges (sudo) for some commands (e.g., ss -p, iftop, nethogs, lsof for other users' processes, or viewing systemd logs). Basic viewing commands (df, top, vmstat, iostat) do not require them.
Basic familiarity with the Linux terminal.

It is recommended to install additional packages for more convenient monitoring:

# For Ubuntu/Debian
sudo apt update && sudo apt install htop nmon sysstat iftop nethogs

# For CentOS/RHEL/Fedora
sudo yum install htop nmon sysstat iftop nethogs
# or on newer versions
sudo dnf install htop nmon sysstat iftop nethogs

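Note: sysstat contains the iostat and sar utilities. nethogs groups traffic by process. On CentOS/RHEL, htop, nmon, iftop, and nethogs are usually provided by the EPEL repository, so it may need to be enabled first.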
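Before installing anything, it can help to see which tools are already present. The following is a minimal sketch assuming a POSIX shell; missing_tools is a hypothetical helper name, and it checks command names (iostat) rather than package names (sysstat).

```shell
# missing_tools: print each given command name that is not found in PATH.
# Hypothetical helper for illustration; relies only on the POSIX `command -v`.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Example: check the commands used throughout this guide.
missing_tools top htop df du ss iostat vmstat nmon iftop nethogs lsof
```

Any name it prints still needs to be installed; an empty result means everything was found.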

Step-by-Step Instructions

Step 1: Overall CPU and Memory Load (`top` / `htop`)

These utilities are your primary tools for a quick system snapshot.

top — the classic, always-available utility.
```
top
```
Key fields for analysis:
- %Cpu(s): breakdown of load into us (user), sy (system), id (idle), wa (I/O wait). A sustained high wa means the CPU sits idle while waiting on I/O, which usually points to slow or overloaded storage.
- KiB Mem (MiB Mem in newer versions of top): total, used, and free RAM.
- Process list (PID, USER, %CPU, %MEM, COMMAND). Press P to sort by CPU, M to sort by memory.

htop is an enhanced version with colors, graphs, and mouse support.
```
htop
```
Advantages: convenient process tree (F5), ability to kill a process (F9), CPU load per core is immediately visible.

💡 Tip: In htop, add extra columns via Setup (F2 -> Columns), for example IO_READ_RATE and IO_WRITE_RATE, to monitor disk I/O directly in the process list. Seeing I/O figures for other users' processes generally requires running htop as root.
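For scripts, the wa value can be pulled out of top's batch output. This is a sketch only: cpu_wa is a hypothetical helper, and it assumes the procps-ng summary layout ("%Cpu(s): ... 0.3 wa, ...") and a locale that uses "." as the decimal separator.

```shell
# cpu_wa: read `top -bn1` output on stdin and print the wa (I/O wait) value
# from the %Cpu(s) summary line. Hypothetical helper; assumes the procps-ng
# summary layout and "." as the decimal separator.
cpu_wa() {
    awk '/^%Cpu\(s\)/ {
        for (i = 2; i <= NF; i++)
            if ($i ~ /^wa,?$/) { print $(i - 1); exit }
    }'
}

# Usage (not run here): LC_ALL=C top -bn1 | cpu_wa
```

Looking the value up by its "wa" label instead of a fixed column keeps the helper working if neighbouring fields change width.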

Step 2: Disk Space Usage (`df` and `du`)

Check free space regularly, before a full root partition brings services down.

df (disk free) — shows free space on mounted filesystems.
```
df -h
```
The -h flag ("human-readable") prints sizes in units such as G and M. Pay attention to the Use% column for the root partition (/) and /var.
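The Use% check can be automated with a small filter. This is a sketch under stated assumptions: disk_alert is a hypothetical helper, the 90% default is an arbitrary example threshold, and the input is expected in the POSIX `df -P` layout (one line per filesystem, mount point in the sixth column, no spaces in mount paths).

```shell
# disk_alert: read `df -P` output on stdin and print every mount point whose
# usage is at or above the given percentage (default 90, an example value).
# Hypothetical helper; assumes mount points contain no spaces.
disk_alert() {
    awk -v limit="${1:-90}" 'NR > 1 {
        use = $5
        sub(/%$/, "", use)                 # "95%" -> "95"
        if (use + 0 >= limit + 0) print $6, use "%"
    }'
}

# Usage (not run here): df -P | disk_alert 85
```

An empty result means no filesystem has reached the threshold.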

du (disk usage) — estimates the size of files and directories.

# Estimate the size of a specific folder (quickly)
du -sh /var/log

# Find the 10 largest folders in the current directory
# (the first output line is the total for the directory itself, hence 11)
du -h --max-depth=1 | sort -hr | head -n 11

-s — total summary size, -h — readable format, --max-depth=1 — depth of subdirectories.

Step 3: Network Activity and Open Ports (`ss`, `netstat`)

ss — a modern and fast replacement for the outdated netstat. Use it by default.

# Show all listening TCP/UDP sockets with port numbers and processes
sudo ss -tulnpe

# Key flags:
# -t: TCP, -u: UDP, -l: listening, -n: numeric (don't resolve names), -p: show process,
# -e: extended socket details (e.g., uid, inode)

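Look for lines with LISTEN (listening UDP sockets appear as UNCONN) and check which address (Local Address:Port) each service is bound to. An address of 0.0.0.0, [::], or * means the service accepts connections on all interfaces.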
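To spot services reachable on every interface, the Local Address column can be filtered. A minimal sketch: wildcard_listeners is a hypothetical helper, and it assumes the usual `ss -tuln` column order, with the local address in the fifth column.

```shell
# wildcard_listeners: read `ss -tuln` output on stdin and print protocol and
# local address for sockets bound to all interfaces (0.0.0.0, [::] or *).
# Hypothetical helper; assumes Local Address:Port is the fifth column.
wildcard_listeners() {
    awk '$1 != "Netid" && $5 ~ /^(0\.0\.0\.0|\[::\]|\*):/ { print $1, $5 }'
}

# Usage (not run here): ss -tuln | wildcard_listeners
```

Services that should only be reachable locally (databases, admin panels) are expected to be absent from this list.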

iftop — a top analog for the network. Shows which connections use the most traffic.
```
sudo iftop -i eth0  # Replace eth0 with your network interface (ip a)
```
Note: requires root to capture packets on the interface.

nethogs groups network traffic by process (PID/Program). It is essential for finding the application "eating" traffic.
```
sudo nethogs
```

Step 4: Disk I/O Statistics (`iostat`)

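Utility from the sysstat package. Shows how loaded the disks are. The share of CPU time spent waiting for I/O (%iowait) appears in iostat's CPU report (iostat -c, or iostat with no options); the -d flag used here limits output to device statistics. The first report covers the time since boot, and each later report covers the preceding interval.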

iostat -dx 2 5

-d — show disk statistics.
-x — extended statistics.
2 — update interval in seconds.
5 — number of reports.

What to look for:

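%util: percentage of time the device was busy servicing requests. A value close to 100% suggests saturation on a single spinning disk; SSDs, NVMe drives, and RAID arrays serve requests in parallel, so they can show high %util while still having headroom.
await: average time (in ms) a request took to complete, including time spent in the queue. Newer sysstat versions report it separately as r_await (reads) and w_await (writes). High await combined with high %util is a clear sign of a bottleneck.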
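The %util check can be scripted as a filter over iostat output. This is a sketch: busy_disks is a hypothetical helper and 80 is an example threshold. It locates the %util column from the header line, since the set of columns differs between sysstat versions, and it assumes "." as the decimal separator.

```shell
# busy_disks: read `iostat -dx` output on stdin and print each device whose
# %util is at or above the given threshold (default 80, an example value).
# Hypothetical helper; finds the %util column by name from the header line.
busy_disks() {
    awk -v limit="${1:-80}" '
        /^Device/ {
            for (i = 1; i <= NF; i++) if ($i == "%util") col = i
            next
        }
        col && NF >= col && $col + 0 >= limit + 0 { print $1, $col }
    '
}

# Usage (not run here): LC_ALL=C iostat -dx 2 5 | busy_disks 80
```

Because the first iostat report averages over the time since boot, a device may appear once for that report and again for later intervals.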

Step 5: Overall System Statistics (`vmstat`)

Reports on processes, memory, swap, block I/O, and CPU over a selected interval.

vmstat 2 5

procs: r (runnable processes: running or waiting for CPU time), b (processes in uninterruptible sleep, usually waiting on I/O).
memory: swpd (swap space in use), free (free RAM).
cpu: breakdown of CPU time (us, sy, id, wa, st; st is time stolen from this virtual machine by the hypervisor).

Quick diagnostics:

r consistently above the number of CPU cores → insufficient CPU resources.
Sustained non-zero si/so (swap in/out) → insufficient RAM. A non-zero swpd alone is not a problem as long as si/so stay at 0.
High wa → disk I/O problems.
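The three rules above can be sketched as a filter over vmstat output. vmstat_hints is a hypothetical helper; it evaluates only the last sample, looks columns up by header name, and uses 20% as an example cut-off for "high" wa, which is an assumption rather than a standard value.

```shell
# vmstat_hints: read `vmstat N M` output on stdin and print a hint for each
# rule triggered by the LAST sample. The argument is the number of CPU cores.
# Hypothetical helper; the 20% wa cut-off is an example value.
vmstat_hints() {
    awk -v cores="${1:-1}" '
        $1 == "r" && $2 == "b" {           # header line: map names to columns
            for (i = 1; i <= NF; i++) col[$i] = i
            next
        }
        ("r" in col) && $1 ~ /^[0-9]+$/ {  # data line: remember latest values
            r = $(col["r"]); si = $(col["si"]); so = $(col["so"]); wa = $(col["wa"])
        }
        END {
            if (r + 0 > cores + 0) print "cpu: run queue " r " exceeds " cores " cores"
            if (si + so > 0)       print "memory: swapping (si=" si ", so=" so ")"
            if (wa + 0 >= 20)      print "io: high wa (" wa "%)"
        }
    '
}

# Usage (not run here): vmstat 2 5 | vmstat_hints "$(nproc)"
```

A single sample can be noisy, so a hint is more meaningful when it repeats across several runs.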

Step 6: Viewing System Logs (`journalctl`)

For systems using systemd, journalctl is the single entry point for all logs (kernel, services, boot).

# View logs in real-time (like tail -f)
sudo journalctl -f

# Logs for a specific service (e.g., nginx)
sudo journalctl -u nginx.service

# Logs from the current boot
sudo journalctl -b

# Logs for a specific period
sudo journalctl --since "2026-02-16 09:00:00" --until "2026-02-16 12:00:00"

# Show only messages with priority from err up to alert (use -p err to include emerg as well)
sudo journalctl -p err..alert
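
When there are many error lines, grouping them by source shows where to look first. A minimal sketch: errors_by_source is a hypothetical helper, and it assumes journalctl's default short output, where the fifth field looks like "nginx[812]:" or "kernel:". Continuation lines of multi-line messages would be miscounted.

```shell
# errors_by_source: read journalctl short-format lines on stdin and print
# "<count> <source>" sorted by count, highest first.
# Hypothetical helper; assumes the default "Mon DD HH:MM:SS host source[pid]: msg" layout.
errors_by_source() {
    awk 'NF >= 5 && $1 != "--" {
             src = $5
             sub(/(\[[0-9]+\])?:$/, "", src)   # "nginx[812]:" -> "nginx"
             count[src]++
         }
         END { for (s in count) print count[s], s }' | sort -rn
}

# Usage (not run here): sudo journalctl -b -p err --no-pager | errors_by_source | head
```

The top entries are good candidates for a closer look with journalctl -u.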

Step 7: Universal Monitoring (`nmon`)

nmon is a powerful interactive utility that collects and displays all major metrics on one screen: CPU, Memory, Network, Disks, Filesystem, Kernel.

nmon

Controls: press letters (c — CPU, m — memory, d — disks, n — network, t — top processes) to switch views. To collect data to a file for later analysis: nmon -f -s 2 -c 30 (collect every 2 seconds, 30 times).

Step 8: Analyzing Process Load (`ps` and `pstree`)

To get a detailed "snapshot" of processes at a point in time.

# Top 10 processes by memory consumption
ps aux --sort=-%mem | head -n 11

# Top 10 processes by CPU
ps aux --sort=-%cpu | head -n 11

# Process tree (see which process spawned which)
pstree -p
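
Services that run many worker processes are easier to judge when memory is summed per command name. This is a sketch: rss_by_command is a hypothetical helper that expects "RSS-in-KiB command" pairs, such as those printed by ps -eo rss=,comm=. RSS counts shared pages once per process, so the totals overestimate real usage.

```shell
# rss_by_command: read "<rss_kib> <command>" lines on stdin and print the
# summed RSS per command name, largest first.
# Hypothetical helper; totals overcount memory shared between processes.
rss_by_command() {
    awk '{
             kb = $1
             $1 = ""                 # drop the RSS field, keep the full name
             sub(/^ +/, "")
             rss[$0] += kb
         }
         END { for (c in rss) printf "%d %s\n", rss[c], c }' | sort -rn
}

# Usage (not run here): ps -eo rss=,comm= | rss_by_command | head -n 5
```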

Step 9: Monitoring Specific File Descriptors (`lsof`)

"List open files" utility. Shows which processes have files, sockets, devices open. Critical for diagnosing "cannot delete file" or "cannot unmount disk".

# Which processes are using /var/log/syslog?
sudo lsof /var/log/syslog

# Which processes have port 80 open?
sudo lsof -i :80

# All open files under the current directory (+D recurses and can be slow;
# without sudo, other users' processes may not be shown)
lsof +D .
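
For scripting, lsof offers a field-oriented output mode (-F) that is easier to parse than the default table. A minimal sketch: lsof_holders is a hypothetical helper, and it assumes input from lsof -F pc, where each line starts with a one-letter field identifier (p for PID, c for command name).

```shell
# lsof_holders: read `lsof -F pc` output on stdin and print "<pid> <command>"
# for every process listed.
# Hypothetical helper; lines with other field identifiers (e.g. f) are ignored.
lsof_holders() {
    awk '/^p/ { pid = substr($0, 2) }
         /^c/ { print pid, substr($0, 2) }'
}

# Usage (not run here): sudo lsof -F pc /var/log/syslog | lsof_holders
```

The resulting list shows which processes to stop or restart before deleting the file or unmounting the disk.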

Step 10: Creating a Simple Wrapper Script for a Basic Report

For regular collection of key metrics, you can create a script.

#!/bin/bash
# basic_sys_report.sh
echo "=== System Status Report: $(date) ==="
echo -e "\n--- CPU and Memory Load (top -bn1) ---"
top -bn1 | head -n 15
echo -e "\n--- Disk Usage (df -h) ---"
df -h
echo -e "\n--- Memory Usage (free -h) ---"
free -h
echo -e "\n--- Open Ports (ss -tuln) ---"
ss -tuln  # listing sockets without -p needs no root, so the report runs without a password prompt

Save as basic_sys_report.sh, make it executable (chmod +x basic_sys_report.sh), and run as needed.
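
To keep a history of reports, the output can be written to timestamped files. This is a sketch only: save_report is a hypothetical helper, and the directory and file naming are assumptions. If the wrapped command calls sudo, a non-interactive run (for example from cron) would need passwordless sudo or a root crontab.

```shell
# save_report: run a command and store its output (stdout and stderr) in a
# timestamped file inside the given directory; prints the file path.
# Hypothetical helper; the "report_<timestamp>.log" naming is an assumption.
save_report() {
    dir=$1
    shift
    mkdir -p "$dir" || return 1
    file="$dir/report_$(date +%Y%m%d_%H%M%S).log"
    "$@" > "$file" 2>&1
    status=$?
    echo "$file"
    return "$status"
}

# Usage (not run here): save_report "$HOME/sys_reports" ./basic_sys_report.sh
```

The function returns the wrapped command's exit status, so a failed report can still be detected by the caller.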

Verification

After completing the steps, you should:

Be able to quickly determine which resource (CPU, RAM, Disk I/O, Network) is the bottleneck.
Find "heavy" processes and services.
Check remaining disk space and the size of key directories.
View error logs for specific services.
Understand the basic output of iostat and vmstat.

Check: Try to answer the questions:

Which two processes use the most CPU? (htop or ps aux --sort=-%cpu).
On which partition did space run out? (df -h).
Why did the system "freeze"? Check top for high %wa (disk) or vmstat for rising r (processes in queue).

Potential Issues

⚠️ Important: Several commands (ss -p, lsof, iftop, nethogs, journalctl) need sudo or root privileges to show complete data. If you get "Permission denied" or incomplete output, try running the command with sudo.

iostat is not installed. Solution: install the sysstat package (see Requirements).
htop/nethogs/iftop not found. Solution: install them from your distribution's repository.
df doesn't show /home or another expected partition. It may not be mounted, or it may simply be a directory on the root filesystem rather than a separate partition. Check with mount or cat /etc/fstab.
journalctl doesn't show old boot logs. The -b flag shows only the current boot. Use journalctl --list-boots to get a list and select the needed index: journalctl -b -1 for the previous one.
ss/netstat show a port but not the process. Add the -p flag. If even with sudo the process is not indicated (?), the socket might be used by the kernel (e.g., for NFS) or the process may have already terminated.

No iowait data in top. The wa value is part of the %Cpu(s) summary line, not a per-process column. If the summary area is hidden, press t to bring the CPU line back, or use vmstat/iostat instead. For per-process I/O, pidstat -d (from sysstat) or iotop are better suited.
du takes a very long time on a large directory. This is normal. To speed it up, use du -x (stay within one filesystem) or limit depth (--max-depth). For a preliminary estimate, you can use ncdu (installed separately).

Additional Resources

For deeper study, explore the man pages for each utility (man top, man iostat). For long-term metric collection and graphing, explore monitoring systems like Prometheus + Grafana or Zabbix, which gather the same kinds of metrics through their own exporters or agents.

F.A.Q.

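Which tool is better for beginners: top or htop?
htop is usually easier to start with thanks to colors, per-core CPU meters, and function-key actions (F5 for the process tree, F9 to kill a process). top is still worth learning because it is available almost everywhere.

How do I monitor the system in real time without interactive mode?
Use batch or interval modes, for example top -bn1, vmstat 2 5, or iostat -dx 2 5. Their output can be redirected to a file or collected by a script.

Where are system logs stored in Linux?
On systemd systems, the journal is read with journalctl and kept under /var/log/journal (persistent) or /run/log/journal (lost on reboot). Traditional text logs live in /var/log.

Which tool should be used for monitoring performance history?
sar from the sysstat package and nmon in file mode (nmon -f) record metrics over time. For long-term storage and graphs, a monitoring system such as Prometheus + Grafana or Zabbix is a better fit.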
