Linux Perf Analysis - Quickly Check Your System's Health (1)
Introduction
I’ve been using Linux for a while. In the early days, I had a very cheap and slow computer, and I always wondered why the hell it would get so slow all of a sudden. I was new to Linux and had been a Windows user before, so I had no idea how to even open the terminal and check what was wrong. Eventually, of course, I learned through tutorials. Well, that’s the intro; let’s get straight into it.
Note: This is a multi-part series. In this part, we’ll cover some basic tools.
Goal
The goal is simple: quickly collect system data and form a rough diagnosis.
We want to identify where the problem lies:
- CPU?
- Memory?
- Disk?
- Network?
uptime
$ uptime
19:41:35 up 10:50, 1 user, load average: 0.87, 1.06, 1.11
- uptime is a quick way to check the load averages over time. The output above ends with three numbers, which Linux updates continuously using an exponential moving average.
- The kernel recalculates them roughly every 5 seconds. Each value (1, 5, and 15 minutes) is just a different smoothing window.
- So:
  - 1-min load → reacts quickly
  - 5-min load → smoother
  - 15-min load → very slow, shows the stable trend
- In the image below, the 1-min load jumped because I started playing a random 4K video, while the 5-min load rose a bit and the 15-min load barely moved.
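To get a feel for the three smoothing windows, here is a rough sketch that replays the update rule in awk. It assumes one update every 5 seconds and uses floating point, whereas the kernel uses fixed-point arithmetic, so treat the numbers as approximate. The function name simulate_load is made up for this example.

```shell
# simulate_load ACTIVE SECONDS
# Start from an idle system, keep ACTIVE tasks runnable for SECONDS seconds,
# and print the resulting 1-, 5- and 15-minute load averages.
simulate_load() {
  awk -v n="$1" -v secs="$2" 'BEGIN {
    # Decay factor per 5-second tick for the 60, 300 and 900 second windows.
    e1 = exp(-5 / 60); e5 = exp(-5 / 300); e15 = exp(-5 / 900)
    l1 = 0; l5 = 0; l15 = 0
    for (t = 5; t <= secs; t += 5) {
      l1  = l1  * e1  + n * (1 - e1)
      l5  = l5  * e5  + n * (1 - e5)
      l15 = l15 * e15 + n * (1 - e15)
    }
    printf "%.2f %.2f %.2f\n", l1, l5, l15
  }'
}
```

For example, simulate_load 1 60 models one busy task for a minute and prints 0.63 0.18 0.06: the 1-min value has moved most of the way while the 15-min value has hardly reacted.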
Tip: Use watch to observe load changes live:
$ watch -n 2 uptime
This is especially useful when testing something like this. Learn more about watch using man watch.
Important: Load average is NOT CPU usage. It includes:
- Running processes
- Runnable (waiting for CPU)
- Uninterruptible sleep (usually I/O wait)
What can we infer from this and what to look at?
- Compare load to CPU count:
  - Load ≈ CPU cores → system is busy but fine
  - Load >> CPU cores → contention/bottleneck
- If load spikes suddenly:
  - Check top or pidstat
- If load is high but CPU usage is low:
  - Likely I/O bottleneck → check iostat
- Trend matters:
  - 1-min >> 15-min → recent spike
  - All high → sustained pressure
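The "compare load to CPU count" rule can be sketched as a small helper. The function name classify_load and the cut-offs (below 1 per core is "ok", below 2 is "busy", anything above is "contended") are arbitrary choices for illustration, not official thresholds.

```shell
# classify_load LOAD CORES
# Prints "ok", "busy" or "contended" based on load per core.
classify_load() {
  awk -v load="$1" -v cores="$2" 'BEGIN {
    if (cores <= 0) { print "unknown"; exit }
    ratio = load / cores
    if (ratio < 1)      print "ok"
    else if (ratio < 2) print "busy"
    else                print "contended"
  }'
}

# Live usage on Linux (the first field of /proc/loadavg is the 1-min load):
#   classify_load "$(cut -d" " -f1 /proc/loadavg)" "$(nproc)"
```

With the uptime output above (0.87 on a 12-CPU machine) this prints ok.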
vmstat
$ vmstat 1
procs -----------memory---------- ---swap-- -----io---- -system-- -------cpu-------
r b swpd free buff cache si so bi bo in cs us sy id wa st gu
1 0 582540 6354152 909084 5165968 0 0 259 408 2377 6 4 1 94 1 0 0
0 0 582540 6337568 909084 5168056 0 0 0 0 2188 7145 1 1 97 0 0 0
0 0 582540 6344640 909084 5168188 0 0 0 944 1445 5891 1 1 98 0 0 0
0 0 582540 6344864 909084 5166072 0 0 0 0 1678 5439 1 1 98 0 0 0
0 0 582540 6356228 909084 5166584 0 0 0 0 2747 10801 2 2 96 0 0 0
1 0 582540 6358864 909092 5166584 0 0 0 92 1777 6262 1 1 98 0 0 0
0 0 582540 6359072 909092 5166072 0 0 0 0 1420 5457 1 1 98 0 0 0
0 0 582540 6356900 909092 5168120 0 0 0 1084 1982 5958 1 1 98 0 0 0
0 0 582540 6357072 909092 5168120 0 0 0 0 1448 5994 1 1 98 0 0 0
0 0 582540 6358244 909092 5166136 0 0 0 0 1676 5427 1 1 98 0 0 0
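- vmstat = Virtual Memory Statistics (not just memory, by the way).
- It gives a compact, real-time view of the entire system: CPU, memory, processes, I/O, and context switching.
- The first line of output reports averages since boot; the lines after it are per-interval samples.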
Proc
- r: number of processes running on a CPU or waiting for a turn (doesn't include I/O).
- b: blocked (waiting on I/O).
Memory and Swap
- free: free memory in kilobytes.
- si: swap-in.
- so: swap-out. If either si or so stays non-zero, the system is short on memory (mostly relevant when swap devices are configured).
- swpd: swap used, in kilobytes (about 569 MiB in the output above).
- buff: kernel buffers.
- cache: filesystem cache (the important one).
I/O
- bi: blocks read from disk.
- bo: blocks written to disk.
CPU
- us: user CPU %
- sy: kernel CPU %
- id: idle %
- wa: waiting on I/O
- st: stolen (VM)
- gu: guest (VMs)
What can we infer from this and what to look at?
- r > CPU cores → CPU contention
- b > 0 → I/O blocking → check disks
- si/so > 0 → memory pressure (bad sign)
- High wa → disk bottleneck
- High us → user-space CPU heavy workload
- High sy → kernel/system overhead
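These checks are easy to automate. The sketch below reads vmstat output on stdin and prints a line for every sample that trips one of the rules. It assumes the column layout shown above (r, b, si, so and wa in fields 1, 2, 7, 8 and 16); the name vmstat_flags and the 20% wa cut-off are made up for the example. Remember that the first sample is the since-boot average.

```shell
# vmstat_flags CORES   (reads vmstat output on stdin)
vmstat_flags() {
  awk -v cores="$1" '
    $1 !~ /^[0-9]+$/ { next }              # skip the two header lines
    {
      n++
      msg = ""
      if ($1 > cores)       msg = msg " cpu-contention"
      if ($2 > 0)           msg = msg " io-blocked"
      if ($7 > 0 || $8 > 0) msg = msg " swapping"
      if ($16 > 20)         msg = msg " high-iowait"
      if (msg != "") print "sample " n ":" msg
    }'
}

# Live usage: take 10 one-second samples on this machine.
#   vmstat 1 10 | vmstat_flags "$(nproc)"
```

A healthy system prints nothing, which is the point: output only appears when something deserves a closer look.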
dmesg
$ sudo dmesg | tail
[ 1588.754691] iwlwifi 0000:00:14.3: Unhandled alg: 0x703
[ 1588.754694] iwlwifi 0000:00:14.3: Unhandled alg: 0x703
[ 1588.754697] iwlwifi 0000:00:14.3: Unhandled alg: 0x703
[ 1588.754700] iwlwifi 0000:00:14.3: Unhandled alg: 0x703
[ 1588.754703] iwlwifi 0000:00:14.3: Unhandled alg: 0x703
[ 1602.189160] input: realme Buds Wireless 3 (AVRCP) as /devices/virtual/input/input32
[13068.129622] input: realme Buds Wireless 3 (AVRCP) as /devices/virtual/input/input33
[17264.803519] nvme nvme0: using unchecked data buffer
[17264.807705] block nvme0n1: No UUID available providing old NGUID
- dmesg shows the kernel ring buffer: informational messages, warnings, errors, and sometimes debug logs.
- You can see things like hardware information, driver messages, filesystem events, kernel warnings, and security messages (if on SELinux).
What can we infer from this and what to look at?
Look for hardware errors (disk, GPU, USB), driver failures, and filesystem issues. Filter by severity using:
$ sudo dmesg --level=err,warn
- Disk errors → check iostat
- OOM killer → check memory (free, vmstat)
- Repeated warnings → often point to the root cause
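Spotting repeated messages by eye gets tedious, so here is a small sketch that strips the timestamp prefix and counts identical lines. The function name dmesg_repeats is hypothetical, and it assumes the default [seconds.microseconds] timestamp format shown above.

```shell
# dmesg_repeats   (reads dmesg output on stdin)
# Prints "COUNT<TAB>MESSAGE" for every message seen more than once,
# most frequent first.
dmesg_repeats() {
  sed -E 's/^\[ *[0-9]+\.[0-9]+\] //' |
    awk '{ count[$0]++ }
         END { for (m in count) if (count[m] > 1) print count[m] "\t" m }' |
    sort -rn
}

# Live usage:
#   sudo dmesg --level=err,warn | dmesg_repeats
```

On the sample output above, this would surface the iwlwifi "Unhandled alg" line with a count of 5.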
iostat
$ iostat -xz 1
Linux 6.18.16-1-lts (vv) 05/08/2026 _x86_64_ (12 CPU)
avg-cpu: %user %nice %system %iowait %steal %idle
4.08 0.00 1.44 0.64 0.00 93.83
Device r/s rkB/s rrqm/s %rrqm r_await rareq-sz w/s wkB/s wrqm/s %wrqm w_await wareq-sz d/s dkB/s drqm/s %drqm d_await dareq-sz f/s f_await aqu-sz %util
loop0 0.00 0.04 0.00 0.00 0.06 15.17 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
nvme0n1 13.03 246.12 5.06 27.96 0.22 18.90 8.33 394.98 9.28 52.72 3.44 47.44 0.00 0.00 0.00 0.00 0.00 0.00 1.13 1.93 0.03 1.19
zram0 0.00 0.07 0.00 0.00 0.00 16.46 0.00 0.00 0.00 0.00 0.00 4.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
iostat shows per-device I/O metrics along with a CPU summary.
Options:
- -x: extended stats
- -z: hide idle devices
- 1: refresh every second
Key Fields
- r/s, w/s, rkB/s, wkB/s: reads, writes, kilobytes read, and kilobytes written per second
- %user: CPU time running user-space programs
- %nice: user processes with modified priority (nice)
- %system: kernel work (syscalls, interrupts, etc.)
- %iowait: CPU waiting for disk I/O
- %steal: time stolen by the hypervisor (VMs)
- aqu-sz: average queue size
What can we infer from this and what to look at?
- %util ~100% → disk saturated (less conclusive on NVMe and RAID devices, which serve requests in parallel)
- High r_await/w_await (>10–20ms SSD, >50ms HDD) → latency issue
- High aqu-sz → queue buildup
- Low util but high await → possible driver/fs issue
- High writes → check logs, journaling, apps
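The %util check can be scripted too. Column positions differ between sysstat versions, so this sketch finds the %util column from the header row instead of hard-coding it. The function name iostat_busy and the default limit of 80 are assumptions for the example.

```shell
# iostat_busy [LIMIT]   (reads iostat -x output on stdin)
# Prints "DEVICE UTIL" for every device line at or above LIMIT percent.
iostat_busy() {
  awk -v limit="${1:-80}" '
    $1 ~ /^Device:?$/ {                   # header row: locate the %util column
      for (i = 1; i <= NF; i++) if ($i == "%util") col = i
      next
    }
    col && NF >= col && $col + 0 >= limit { print $1, $col }
  '
}

# Live usage: five one-second reports.
#   iostat -xz 1 5 | iostat_busy 80
```

The CPU summary lines are skipped automatically because they have fewer fields than the device rows.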
free
$ free -h
total used free shared buff/cache available
Mem: 15Gi 4.9Gi 6.2Gi 1.1Gi 5.7Gi 10Gi
Swap: 7.7Gi 522Mi 7.1Gi
- Focus on available, not free
- Low available memory → pressure
- High and growing swap usage → bad sign
- High cache is GOOD (Linux uses memory efficiently)
- If swapping:
  - Check vmstat
  - Identify heavy processes (top, pidstat)
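To turn "focus on available" into a single number, you can compute available memory as a percentage of total. This sketch reads text in the /proc/meminfo format on stdin and uses its MemTotal and MemAvailable fields; the function name mem_available_pct is made up here.

```shell
# mem_available_pct   (reads /proc/meminfo-style text on stdin)
# Prints available memory as a whole percentage of total memory.
mem_available_pct() {
  awk '
    $1 == "MemTotal:"     { total = $2 }
    $1 == "MemAvailable:" { avail = $2 }
    END { if (total > 0) printf "%d\n", avail * 100 / total }
  '
}

# Live usage on Linux:
#   mem_available_pct < /proc/meminfo
```

What counts as "low" depends on the workload, so treat any cut-off you pick as a starting point rather than a rule.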
top
top - 14:03:35 up 5:44, 1 user, load average: 0.58, 0.98, 1.14
Tasks: 329 total, 1 running, 325 sleep, 0 d-sleep, 0 stopped, 3 zombie
%Cpu(s): 2.2 us, 1.7 sy, 0.0 ni, 95.5 id, 0.2 wa, 0.3 hi, 0.1 si, 0.0 st
MiB Mem : 15684.2 total, 6405.1 free, 4956.9 used, 5850.3 buff/cache
MiB Swap: 7842.0 total, 7319.1 free, 522.9 used. 10727.3 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
162550 vamsi 20 0 3313788 646444 214152 S 7.3 4.0 2:36.70 Isolated Web Co
1107 vamsi 20 0 496004 32352 18480 S 2.3 0.2 2:15.87 wireplumber
1960 vamsi 20 0 1555272 248560 218228 S 2.3 1.5 14:16.67 Hyprland
28869 vamsi 20 0 1131588 299768 257972 S 2.3 1.9 0:04.43 alacritty
3366 vamsi 20 0 838444 68296 60576 S 1.7 0.4 1:18.29 Utility Process
2810 vamsi 20 0 12.3g 849872 542272 S 1.3 5.3 52:18.70 zen
880 root 20 0 344216 26152 20996 S 1.0 0.2 0:45.65 NetworkManager
911 root 20 0 3164688 122652 78932 S 1.0 0.8 4:47.72 opensnitchd
2756 vamsi 20 0 583756 14084 10584 S 1.0 0.1 1:55.32 btop
2003 vamsi 20 0 2033888 549632 306264 S 0.7 3.4 2:00.91 qs
157278 vamsi 20 0 377676 81484 15300 S 0.7 0.5 1:48.58 nvim
775 dbus 20 0 6080 4436 2408 S 0.3 0.0 0:35.84 dbus-broker
777 avahi 20 0 6704 4304 4000 S 0.3 0.0 0:03.01 avahi-daemon
881 polkitd 20 0 382936 11712 7628 S 0.3 0.1 0:15.28 polkitd
927 root 20 0 2380228 57820 38896 S 0.3 0.4 0:17.39 containerd
2470 vamsi 20 0 1142996 221492 195400 S 0.3 1.4 0:58.89 alacritty
3832 vamsi 9 -11 201668 23648 9896 S 0.3 0.1 0:32.19 pipewire-pulse
185532 vamsi 20 0 11144 8180 5892 R 0.3 0.1 0:00.01 top
1 root 20 0 23460 14068 9732 S 0.0 0.1 0:05.95 systemd
- Identify top CPU consumers
- Look for:
  - Runaway processes
  - Zombies
  - High memory users
- CPU breakdown:
  - High us → apps
  - High sy → kernel
  - High wa → disk wait
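top reports how many zombies exist (3 in the output above) but not which ones. One way to list them, sketched here with a hypothetical helper named list_zombies, is to filter ps output on a state that starts with Z. It assumes the four-column format produced by ps -eo pid,ppid,stat,comm.

```shell
# list_zombies   (reads `ps -eo pid,ppid,stat,comm` output on stdin)
# Prints "PID PPID COMMAND" for every zombie process.
list_zombies() {
  awk 'NR > 1 && $3 ~ /^Z/ { print $1, $2, $4 }'
}

# Live usage:
#   ps -eo pid,ppid,stat,comm | list_zombies
```

The parent PID matters most: a zombie only goes away once its parent reaps it, so the parent is usually the process to investigate.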
mpstat
$ mpstat -P ALL 1
Linux 6.18.16-1-lts (vv) 05/08/2026 _x86_64_ (12 CPU)
02:06:06 PM CPU %usr %nice %sys %iowait %irq %soft %steal %guest %gnice %idle
02:06:07 PM all 1.84 0.00 0.67 0.17 0.33 0.17 0.00 0.00 0.00 96.82
02:06:07 PM 0 1.03 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 98.97
02:06:07 PM 1 0.98 0.00 0.98 0.00 0.00 0.98 0.00 0.00 0.00 97.06
02:06:07 PM 2 1.03 0.00 1.03 0.00 1.03 0.00 0.00 0.00 0.00 96.91
02:06:07 PM 3 1.01 0.00 0.00 0.00 1.01 0.00 0.00 0.00 0.00 97.98
02:06:07 PM 4 2.00 0.00 1.00 1.00 0.00 0.00 0.00 0.00 0.00 96.00
02:06:07 PM 5 0.00 0.00 0.99 0.00 0.00 0.00 0.00 0.00 0.00 99.01
02:06:07 PM 6 5.05 0.00 2.02 0.00 0.00 1.01 0.00 0.00 0.00 91.92
02:06:07 PM 7 1.00 0.00 1.00 0.00 0.00 0.00 0.00 0.00 0.00 98.00
02:06:07 PM 8 3.00 0.00 1.00 0.00 1.00 0.00 0.00 0.00 0.00 95.00
02:06:07 PM 9 2.02 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 97.98
02:06:07 PM 10 1.98 0.00 0.00 0.99 0.99 0.00 0.00 0.00 0.00 96.04
02:06:07 PM 11 3.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 97.00
What can we infer from this and what to look at?
- Per-core CPU usage
- Single-core bottleneck (one core at 100%) or imbalanced workloads
- High %iowait → disk issue
- Useful for diagnosing multi-threading issues and CPU pinning problems
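To spot a single hot core without reading every row, the sketch below picks the busiest core from mpstat output, taking busy as 100 minus %idle. It assumes %idle is the last column, as in the output above, and finds the CPU column from the header so it works with or without the AM/PM field. The name busiest_core is made up for the example.

```shell
# busiest_core   (reads `mpstat -P ALL` output on stdin)
# Prints the core with the highest busy percentage across all samples.
busiest_core() {
  awk '
    BEGIN { max = -1 }
    { for (i = 1; i <= NF; i++) if ($i == "CPU") { c = i; next } }  # header row
    c && $c ~ /^[0-9]+$/ {                 # per-core rows only, skips "all"
      busy = 100 - $NF
      if (busy > max) { max = busy; core = $c }
    }
    END { if (max >= 0) printf "core %s: %.2f%% busy\n", core, max }
  '
}

# Live usage: five one-second samples.
#   mpstat -P ALL 1 5 | busiest_core
```

On the sample above, core 6 comes out on top at about 8% busy, which confirms there is no single-core bottleneck.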
pidstat
$ pidstat 1
Linux 6.18.16-1-lts (vv) 05/08/2026 _x86_64_ (12 CPU)
02:09:16 PM UID PID %usr %system %guest %wait %CPU CPU Command
02:09:17 PM 1000 1107 1.98 0.99 0.00 0.00 2.97 8 wireplumber
02:09:17 PM 1000 1960 0.99 0.00 0.00 0.00 0.99 4 Hyprland
02:09:17 PM 1000 2003 5.94 1.98 0.00 0.00 7.92 8 qs
02:09:17 PM 1000 2470 0.00 0.99 0.00 0.00 0.99 8 alacritty
02:09:17 PM 1000 2810 0.99 0.00 0.00 0.00 0.99 2 zen
02:09:17 PM 1000 3366 0.99 0.00 0.00 0.00 0.99 7 Utility Process
02:09:17 PM 1000 157278 0.99 0.00 0.00 0.00 0.99 2 nvim
02:09:17 PM 1000 162550 4.95 1.98 0.00 0.00 6.93 6 Isolated Web Co
02:09:17 PM 0 184335 0.00 0.99 0.00 0.00 0.99 3 kworker/u49:1-hci0
02:09:17 PM 1000 188110 0.00 0.99 0.00 0.00 0.99 7 pidstat
- Per-process breakdown over time
- Identify:
  - CPU-heavy processes
  - Processes starved of CPU (%wait is time spent waiting to run on a CPU, not I/O wait; use pidstat -d for per-process disk I/O)
- This is better than top for trends
- Use when:
  - Issue is intermittent
  - Need per-process historical view
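For intermittent issues it helps to average each process over several samples instead of watching rows scroll by. This sketch, with a hypothetical helper named pidstat_avg, reads pidstat output on stdin, finds the PID and %CPU columns from the header, and prints the mean %CPU per PID, highest first.

```shell
# pidstat_avg   (reads pidstat output on stdin)
# Prints "PID MEAN_CPU" sorted by mean %CPU, highest first.
pidstat_avg() {
  awk '
    $1 == "Average:" { next }              # skip the summary block at the end
    {
      for (i = 1; i <= NF; i++) if ($i == "%CPU") {   # header row
        cpu = i
        for (j = 1; j <= NF; j++) if ($j == "PID") pid = j
        next
      }
    }
    cpu && pid && $pid ~ /^[0-9]+$/ { sum[$pid] += $cpu; n[$pid]++ }
    END { for (p in sum) printf "%s %.2f\n", p, sum[p] / n[p] }
  ' | sort -k2,2rn
}

# Live usage: thirty one-second samples.
#   pidstat 1 30 | pidstat_avg
```

Note that pidstat only lists a process in intervals where it was active, so the mean is over active intervals, not over the whole run.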
Final Thoughts
A quick workflow:
- uptime → is load high?
- vmstat → CPU vs memory vs I/O?
- iostat → disk bottleneck?
- top/pidstat → which process?
- dmesg → any kernel-level issues?
More advanced tools later 🙂