Introduction / Why This Is Needed
Continuous monitoring of system resources is the foundation of stable operation for any Linux server or workstation. Without understanding what's happening with the CPU, memory, disks, and network, problem diagnosis becomes guesswork. This guide will introduce you to a set of built-in and popular utilities that come preinstalled or are easily installable on any distribution. You'll gain practical skills for quickly identifying bottlenecks and maintaining system performance.
Requirements / Preparation
- Access to a command line (SSH or local terminal).
- Superuser privileges (sudo) for some commands (e.g., ss -p, iftop, nethogs, or viewing system logs with journalctl). Basic viewing (df, top, vmstat, iostat) does not require them.
- Basic familiarity with the Linux terminal.
- It is recommended to install additional packages for more convenient monitoring:
# For Ubuntu/Debian
sudo apt update && sudo apt install htop nmon sysstat iftop nethogs
# For CentOS/RHEL/Fedora
sudo yum install htop nmon sysstat iftop nethogs
# or on newer versions
sudo dnf install htop nmon sysstat iftop nethogs
Note: sysstat contains the iostat and sar utilities. nethogs groups traffic by processes.
Step-by-Step Instructions
Step 1: Overall CPU and Memory Load (top / htop)
These utilities are your primary tools for a quick system snapshot.
top: the classic, always-available utility.
top
Key fields for analysis:
- %Cpu(s): breakdown of load into us (user), sy (system), id (idle), wa (I/O wait). A persistently high wa usually points to slow or overloaded storage.
- KiB Mem (MiB Mem in newer versions): total, used, and free RAM.
- Process list (PID, USER, %CPU, %MEM, COMMAND). Press P to sort by CPU, M to sort by memory.
htop: an enhanced version with colors, graphs, and mouse support.
htop
Advantages: convenient process tree (F5), ability to kill a process (F9), CPU load per core is immediately visible.
💡 Tip: In htop, open Setup (F2) and add columns such as IO_READ_RATE and IO_WRITE_RATE in the Columns section (named Screens in newer htop releases) to monitor disk I/O directly in the process list. Seeing I/O rates for other users' processes requires root.
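For scripts, the memory figures that top displays can also be read directly from /proc/meminfo. The following is a minimal sketch, not something top itself provides: the function name mem_available_pct is hypothetical, and it assumes a kernel that exposes the MemAvailable field.

```shell
# mem_available_pct: hypothetical helper. Prints available RAM as a whole
# percentage of total RAM. Reads a meminfo-format file (default: /proc/meminfo).
mem_available_pct() {
    local file="${1:-/proc/meminfo}"
    awk '
        $1 == "MemTotal:"     { total = $2 }   # values are in kB
        $1 == "MemAvailable:" { avail = $2 }
        END {
            if (total > 0) printf "%d\n", avail * 100 / total
            else exit 1                        # field missing or file unreadable
        }
    ' "$file"
}

# Usage: warn when less than 10% of RAM is available
# [ "$(mem_available_pct)" -lt 10 ] && echo "Low memory"
```

MemAvailable is used rather than MemFree because it accounts for cache the kernel can reclaim, which is closer to what an application can actually allocate.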
Step 2: Disk Space Usage (df and du)
A full root partition can stop services or make the system unusable, so check free space before it runs out.
df (disk free): shows free space on mounted filesystems.
df -h
The -h flag ("human-readable") outputs sizes in GB/MB. Pay attention to the usage percentage (the Use% column) for the root partition (/) or /var.
du (disk usage): estimates the size of files and directories.
# Estimate the size of a specific folder (quickly)
du -sh /var/log
# Find the 10 largest folders in the current directory
du -h --max-depth=1 | sort -hr | head -n 10
-s: total summary size, -h: readable format, --max-depth=1: depth of subdirectories.
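The usage check described above can be automated. The sketch below is one possible approach: the function name fs_over_threshold and the default limit of 90 are assumptions for illustration. It reads the POSIX-format output of df -P and assumes mount points contain no spaces.

```shell
# fs_over_threshold: hypothetical helper. Reads `df -P` output on stdin and
# prints "mountpoint usage%" for every filesystem at or above the limit.
fs_over_threshold() {
    local limit="${1:-90}"
    awk -v limit="$limit" '
        NR == 1 { next }              # skip the header line
        {
            use = $5                  # column 5 of df -P is Capacity, e.g. "93%"
            sub(/%/, "", use)
            if (use + 0 >= limit) print $6, use "%"
        }
    '
}

# Usage: df -P | fs_over_threshold 90
```

The -P flag is used instead of -h because it keeps each filesystem on a single line with a fixed column order, which is easier to parse.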
Step 3: Network Activity and Open Ports (ss, netstat)
ss: a modern and fast replacement for the outdated netstat. Use it by default.
# Show all listening TCP/UDP sockets with port numbers and processes
sudo ss -tulnpe
# Key flags:
# -t: TCP, -u: UDP, -l: listening, -n: numeric (don't resolve names), -p: show process, -e: extended socket details
Look for lines with LISTEN (TCP) or UNCONN (UDP) and check on which interfaces (Local Address:Port) services are running.
iftop: a top analog for the network. Shows which connections use the most traffic.
sudo iftop -i eth0 # Replace eth0 with your network interface (see ip a)
Note: requires root to access statistics.
nethogs: groups network traffic by processes (PID/Program). Essential for finding the application "eating" traffic.
sudo nethogs
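When reviewing the Local Address:Port column, services bound to all interfaces are usually the ones worth a second look. The sketch below filters them out of ss output; the function name public_listeners is hypothetical, and it assumes an iproute2 version whose ss supports -H (no header).

```shell
# public_listeners: hypothetical helper. Reads `ss -H -tuln` output on stdin
# and prints "protocol port" for sockets bound to all interfaces.
public_listeners() {
    awk '
        {
            addr = $5                       # Local Address:Port
            port = addr
            sub(/.*:/, "", port)            # text after the last colon is the port
            host = substr(addr, 1, length(addr) - length(port) - 1)
            if (host == "0.0.0.0" || host == "[::]" || host == "*")
                print $1, port
        }
    ' | sort -u
}

# Usage: ss -H -tuln | public_listeners
```

Sockets bound to a loopback address such as 127.0.0.1 are skipped, since they are not reachable from other hosts.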
Step 4: Disk I/O Statistics (iostat)
Utility from the sysstat package. Shows how loaded the disks are. When run without -d (or with -c), it also reports the share of CPU time spent waiting for I/O operations (%iowait).
iostat -dx 2 5
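- -d: show disk statistics.
- -x: extended statistics.
- 2: update interval in seconds.
- 5: number of reports.
The first report shows averages since boot; each later report covers the preceding interval.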
What to look for:
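- %util: percentage of time the disk was busy processing requests. A value close to 100% indicates a "clogged" disk (for SSDs and RAID arrays, which serve requests in parallel, it is a weaker signal).
- await: average time (in ms) an I/O operation took to complete, including time spent in the queue. Newer sysstat versions report it separately as r_await and w_await. High await combined with high %util is a clear sign of a bottleneck.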
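To pick out busy devices from a long iostat run, the %util column can be filtered with awk. This is a sketch under assumptions: the function name busy_disks and the default limit of 80 are illustrative, and the column is located by its header name because its position differs between sysstat versions.

```shell
# busy_disks: hypothetical helper. Reads `iostat -dx` output on stdin and
# prints "device %util" for each line at or above the limit.
busy_disks() {
    local limit="${1:-80}"
    awk -v limit="$limit" '
        /^Device/ {                       # header row: locate the %util column
            for (i = 1; i <= NF; i++) if ($i == "%util") col = i
            next
        }
        col && NF >= col && ($col + 0) >= limit { print $1, $col }
    '
}

# Usage: LC_ALL=C iostat -dx 2 5 | busy_disks 80
```

LC_ALL=C keeps the decimal separator a dot so awk compares the values numerically. The since-boot report is included in the input, so a device can appear once more than expected.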
Step 5: Overall System Statistics (vmstat)
Reports on processes, memory, swap, block I/O, and CPU over a selected interval.
vmstat 2 5
- procs: r (runnable processes, running or waiting for CPU time), b (processes in uninterruptible sleep).
- memory: swpd (virtual memory used), free (free RAM).
- cpu: breakdown of CPU time (us, sy, id, wa, st; st is time stolen from a virtual machine).
Quick diagnostics:
- r consistently greater than the number of CPU cores → insufficient CPU resources.
- Persistent swpd > 0 and growth in si/so (swap in/out) → insufficient RAM.
- High wa → disk I/O problems.
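The three rules above can be expressed as a small filter over vmstat output. This is only a sketch: the function name vmstat_check is hypothetical, and the wa threshold of 20% is an arbitrary assumption, since the guide does not define what counts as "high".

```shell
# vmstat_check: hypothetical helper. Reads `vmstat N M` output on stdin and
# prints a line for each sample that matches one of the quick-diagnostic rules.
# $1 = number of CPU cores.
vmstat_check() {
    local cores="$1"
    awk -v cores="$cores" '
        $1 == "r" {                       # header row with column names
            for (i = 1; i <= NF; i++) idx[$i] = i
            next
        }
        !("r" in idx) || $1 !~ /^[0-9]+$/ { next }   # skip banner and blank lines
        ++seen == 1 { next }              # first sample is an average since boot
        {
            r  = $(idx["r"]) + 0
            si = $(idx["si"]) + 0
            so = $(idx["so"]) + 0
            wa = $(idx["wa"]) + 0
            if (r > cores + 0) print "cpu: run queue " r " exceeds " cores " cores"
            if (si + so > 0)   print "ram: swapping (si=" si ", so=" so ")"
            if (wa >= 20)      print "io: wa=" wa "%"   # assumed threshold
        }
    '
}

# Usage: vmstat 2 5 | vmstat_check "$(nproc)"
```

Columns are looked up by header name rather than position, because newer procps versions add columns to the cpu group.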
Step 6: Viewing System Logs (journalctl)
For systems using systemd, journalctl is the single entry point for all logs (kernel, services, boot).
# View logs in real-time (like tail -f)
sudo journalctl -f
# Logs for a specific service (e.g., nginx)
sudo journalctl -u nginx.service
# Logs from the last boot
sudo journalctl -b
# Logs for a specific period
sudo journalctl --since "2026-02-16 09:00:00" --until "2026-02-16 12:00:00"
# Show only messages with priority err through alert (use -p err to also include emerg)
sudo journalctl -p err..alert
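When the error list is long, it helps to see which services produce most of it. The sketch below counts entries per source; the function name errors_by_source is hypothetical, and it assumes the default short output format (timestamp, host, then identifier[PID]:).

```shell
# errors_by_source: hypothetical helper. Reads journalctl short-format output
# on stdin and prints "count source", most frequent first.
errors_by_source() {
    awk '
        /^-- / { next }                   # skip boot and "No entries" marker lines
        /^[[:space:]]/ { next }           # skip continuation lines of multi-line messages
        NF >= 5 {
            src = $5
            sub(/:$/, "", src)            # drop the trailing colon
            sub(/\[[0-9]+\]$/, "", src)   # drop the [PID] suffix
            count[src]++
        }
        END { for (s in count) print count[s], s }
    ' | sort -k1,1nr -k2
}

# Usage: sudo journalctl -b -p err -o short --no-pager | errors_by_source
```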
Step 7: Universal Monitoring (nmon)
nmon is a powerful interactive utility that collects and displays all major metrics on one screen: CPU, Memory, Network, Disks, Filesystem, Kernel.
nmon
Controls: press letters (c — CPU, m — memory, d — disks, n — network, t — top processes) to switch views. To collect data to a file for later analysis: nmon -f -s 2 -c 30 (collect every 2 seconds, 30 times).
Step 8: Analyzing Process Load (ps and pstree)
To get a detailed "snapshot" of processes at a point in time.
# Top 10 processes by memory consumption
ps aux --sort=-%mem | head -n 11
# Top 10 processes by CPU
ps aux --sort=-%cpu | head -n 11
# Process tree (see which process spawned which)
pstree -p
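Services that run many worker processes (web servers, databases) can look small per process yet large in total. The sketch below sums resident memory per command name; the function name rss_by_command is hypothetical. Note that RSS counts shared pages once per process, so the totals are an upper bound.

```shell
# rss_by_command: hypothetical helper. Reads `ps -eo rss=,comm=` output on stdin
# and prints "MiB command" for the N largest commands (default 10).
rss_by_command() {
    awk '
        {
            rss = $1                      # resident set size in kB
            $1 = ""                       # remainder of the line is the command name
            sub(/^ +/, "")
            total[$0] += rss
        }
        END { for (c in total) printf "%d %s\n", total[c] / 1024, c }
    ' | sort -k1,1nr | head -n "${1:-10}"
}

# Usage: ps -eo rss=,comm= | rss_by_command 10
```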
Step 9: Monitoring Specific File Descriptors (lsof)
"List open files" utility. Shows which processes have files, sockets, devices open. Critical for diagnosing "cannot delete file" or "cannot unmount disk".
# Which processes are using /var/log/syslog?
sudo lsof /var/log/syslog
# Which processes have port 80 open?
sudo lsof -i :80
# Processes that have the current directory itself open, e.g. as their working directory (use lsof +D . to include files beneath it)
lsof .
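lsof output can also be summarized to spot a process that holds an unusually large number of open files. The sketch below counts lines per command and PID; the function name open_files_by_process is hypothetical. Depending on the lsof version, per-thread lines may inflate the counts, so treat the numbers as relative.

```shell
# open_files_by_process: hypothetical helper. Reads lsof output on stdin and
# prints "count command pid" for the N busiest processes (default 10).
open_files_by_process() {
    awk '
        NR == 1 { next }                  # skip the COMMAND PID USER ... header
        { n[$1 " " $2]++ }                # key on command name and PID
        END { for (k in n) print n[k], k }
    ' | sort -k1,1nr | head -n "${1:-10}"
}

# Usage: sudo lsof -n -P | open_files_by_process 5
```

The -n and -P flags skip host and port name lookups, which makes a full lsof listing noticeably faster.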
Step 10: Creating a Simple Wrapper Script for a Basic Report
For regular collection of key metrics, you can create a script.
#!/bin/bash
# basic_sys_report.sh
echo "=== System Status Report: $(date) ==="
echo -e "\n--- CPU and Memory Load (top -bn1) ---"
top -bn1 | head -n 15
echo -e "\n--- Disk Usage (df -h) ---"
df -h
echo -e "\n--- Memory Usage (free -h) ---"
free -h
echo -e "\n--- Open Ports (ss -tuln) ---"
# No sudo needed without -p, so the script also works unattended
ss -tuln
Save as basic_sys_report.sh, make it executable (chmod +x basic_sys_report.sh), and run as needed.
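For regular collection, the report is more useful when each run is kept in its own file. The sketch below wraps any report script this way; the function name save_report, the file name pattern, and the example directory are assumptions for illustration.

```shell
# save_report: hypothetical helper. Runs a report script with bash and stores
# its output in a timestamped file. Prints the path of the file it wrote.
# $1 = path to the report script, $2 = output directory.
save_report() {
    local script="$1" outdir="$2" outfile
    mkdir -p "$outdir" || return 1
    outfile="$outdir/sys_report_$(date +%Y%m%d_%H%M%S).txt"
    # Capture stdout and stderr so failed commands are visible in the report
    bash "$script" > "$outfile" 2>&1
    echo "$outfile"
}

# Usage: save_report ./basic_sys_report.sh /var/tmp/sys_reports
```

Comparing two saved reports with diff is a quick way to see what changed between a healthy and a degraded state.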
Verification
After completing the steps, you should:
- Be able to quickly determine which resource (CPU, RAM, Disk I/O, Network) is the bottleneck.
- Find "heavy" processes and services.
- Check remaining disk space and the size of key directories.
- View error logs for specific services.
- Understand the basic output of iostat and vmstat.
Check: Try to answer the questions:
- Which two processes use the most CPU? (htop or ps aux --sort=-%cpu).
- On which partition did space run out? (df -h).
- Why did the system "freeze"? Check top for high wa (disk) or vmstat for rising r (processes in queue).
Potential Issues
⚠️ Important: Many commands (ss -p, lsof, iftop, nethogs, journalctl) need sudo or root privileges to show data for all users and system services. If you get "Permission denied" or empty output, try running the command with sudo.
- iostat is not installed. Solution: install the sysstat package (see Requirements).
- htop/nethogs/iftop not found. Solution: install them from your distribution's repository.
- df doesn't show the /home partition or another one. It might not be mounted. Check mount or cat /etc/fstab.
- journalctl doesn't show old boot logs. The -b flag shows only the current boot. Use journalctl --list-boots to get a list and select the needed index: journalctl -b -1 for the previous one. If only the current boot is listed, the journal is probably not stored persistently (no /var/log/journal directory).
- ss/netstat show a port but not the process. Add the -p flag. If the process is still not shown even with sudo, the socket might be used by the kernel (e.g., for NFS) or the process may have already terminated.
- No iowait data in top. The wa value is part of the %Cpu(s) summary line, not a per-process column. If that line is hidden, press t to cycle the summary display, or use vmstat/iostat.
- du takes a very long time on a large directory. This is normal. To speed it up, use du -x (stay within one filesystem) so other mounts are skipped; --max-depth shortens the output, but du still walks the whole tree. For interactive exploration, you can use ncdu (installed separately).
Additional Resources
For deeper study, explore the man pages for each utility (man top, man iostat). For long-term metric collection and graphing, look at monitoring systems such as Prometheus + Grafana or Zabbix, which gather the same kinds of metrics continuously through their own exporters or agents.