The examples so far have focused on tracing a particular program or workload - in other words, every profiling run has specified the program to profile in the command-line e.g. 'perf record wget ...'.
It's also possible, and more interesting in many cases, to run a system-wide profile or trace while running the workload in a separate shell.
To do system-wide profiling or tracing, you typically use the -a flag to 'perf record'.
To demonstrate this, open up one window and start the profile using the -a flag (press Ctrl-C to stop tracing):
root@crownbay:~# perf record -g -a ^C[ perf record: Woken up 6 times to write data ] [ perf record: Captured and wrote 1.400 MB perf.data (~61172 samples) ]
In another window, run the wget test:
root@crownbay:~# wget http://downloads.yoctoproject.org/mirror/sources/linux-2.6.19.2.tar.bz2 Connecting to downloads.yoctoproject.org (140.211.169.59:80) linux-2.6.19.2.tar.b 100% |*******************************| 41727k 0:00:00 ETA
Here we see entries not only for our wget load, but for other processes running on the system as well:
![]() |
In the snapshot above, we can see callchains that originate in libc, and a callchain from Xorg that demonstrates that we're using a proprietary X driver in userspace (notice the presence of 'PVR' and some other unresolvable symbols in the expanded Xorg callchain).
Note also that we have both kernel and userspace entries in the above snapshot. We can also tell perf to focus on userspace but providing a modifier, in this case 'u', to the 'cycles' hardware counter when we record a profile:
root@crownbay:~# perf record -g -a -e cycles:u ^C[ perf record: Woken up 2 times to write data ] [ perf record: Captured and wrote 0.376 MB perf.data (~16443 samples) ]
![]() |
Notice in the screenshot above, we see only userspace entries ([.])
Finally, we can press 'enter' on a leaf node and select the 'Zoom into DSO' menu item to show only entries associated with a specific DSO. In the screenshot below, we've zoomed into the 'libc' DSO which shows all the entries associated with the libc-xxx.so DSO.
![]() |
We can also use the system-wide -a switch to do system-wide tracing. Here we'll trace a couple of scheduler events:
root@crownbay:~# perf record -a -e sched:sched_switch -e sched:sched_wakeup ^C[ perf record: Woken up 38 times to write data ] [ perf record: Captured and wrote 9.780 MB perf.data (~427299 samples) ]
We can look at the raw output using 'perf script' with no arguments:
root@crownbay:~# perf script perf 1383 [001] 6171.460045: sched_wakeup: comm=kworker/1:1 pid=21 prio=120 success=1 target_cpu=001 perf 1383 [001] 6171.460066: sched_switch: prev_comm=perf prev_pid=1383 prev_prio=120 prev_state=R+ ==> next_comm=kworker/1:1 next_pid=21 next_prio=120 kworker/1:1 21 [001] 6171.460093: sched_switch: prev_comm=kworker/1:1 prev_pid=21 prev_prio=120 prev_state=S ==> next_comm=perf next_pid=1383 next_prio=120 swapper 0 [000] 6171.468063: sched_wakeup: comm=kworker/0:3 pid=1209 prio=120 success=1 target_cpu=000 swapper 0 [000] 6171.468107: sched_switch: prev_comm=swapper/0 prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=kworker/0:3 next_pid=1209 next_prio=120 kworker/0:3 1209 [000] 6171.468143: sched_switch: prev_comm=kworker/0:3 prev_pid=1209 prev_prio=120 prev_state=S ==> next_comm=swapper/0 next_pid=0 next_prio=120 perf 1383 [001] 6171.470039: sched_wakeup: comm=kworker/1:1 pid=21 prio=120 success=1 target_cpu=001 perf 1383 [001] 6171.470058: sched_switch: prev_comm=perf prev_pid=1383 prev_prio=120 prev_state=R+ ==> next_comm=kworker/1:1 next_pid=21 next_prio=120 kworker/1:1 21 [001] 6171.470082: sched_switch: prev_comm=kworker/1:1 prev_pid=21 prev_prio=120 prev_state=S ==> next_comm=perf next_pid=1383 next_prio=120 perf 1383 [001] 6171.480035: sched_wakeup: comm=kworker/1:1 pid=21 prio=120 success=1 target_cpu=001