3.1.2.4. Using Dynamic Tracepoints

perf isn't restricted to the fixed set of static tracepoints listed by 'perf list'. Users can also add their own 'dynamic' tracepoints anywhere in the kernel. For instance, suppose we want to define our own tracepoint on do_fork(). We can do that using the 'perf probe' perf subcommand:

     root@crownbay:~# perf probe do_fork
     Added new event:
       probe:do_fork        (on do_fork)

     You can now use it in all perf tools, such as:

	     perf record -e probe:do_fork -aR sleep 1
                

Adding a new tracepoint via 'perf probe' results in an event with all the expected files and format in /sys/kernel/debug/tracing/events, just the same as for static tracepoints (as discussed in more detail in the trace events subsystem section:

     root@crownbay:/sys/kernel/debug/tracing/events/probe/do_fork# ls -al
     drwxr-xr-x    2 root     root             0 Oct 28 11:42 .
     drwxr-xr-x    3 root     root             0 Oct 28 11:42 ..
     -rw-r--r--    1 root     root             0 Oct 28 11:42 enable
     -rw-r--r--    1 root     root             0 Oct 28 11:42 filter
     -r--r--r--    1 root     root             0 Oct 28 11:42 format
     -r--r--r--    1 root     root             0 Oct 28 11:42 id

     root@crownbay:/sys/kernel/debug/tracing/events/probe/do_fork# cat format
     name: do_fork
     ID: 944
     format:
	     field:unsigned short common_type;	offset:0;	size:2;	signed:0;
	     field:unsigned char common_flags;	offset:2;	size:1;	signed:0;
	     field:unsigned char common_preempt_count;	offset:3;	size:1;	signed:0;
	     field:int common_pid;	offset:4;	size:4;	signed:1;
	     field:int common_padding;	offset:8;	size:4;	signed:1;

	     field:unsigned long __probe_ip;	offset:12;	size:4;	signed:0;

     print fmt: "(%lx)", REC->__probe_ip
                

We can list all dynamic tracepoints currently in existence:

     root@crownbay:~# perf probe -l
      probe:do_fork        (on do_fork)
      probe:schedule       (on schedule)
                

Let's record system-wide ('sleep 30' is a trick for recording system-wide but basically do nothing and then wake up after 30 seconds):

     root@crownbay:~# perf record -g -a -e probe:do_fork sleep 30
     [ perf record: Woken up 1 times to write data ]
     [ perf record: Captured and wrote 0.087 MB perf.data (~3812 samples) ]
                

Using 'perf script' we can see each do_fork event that fired:

     root@crownbay:~# perf script

     # ========
     # captured on: Sun Oct 28 11:55:18 2012
     # hostname : crownbay
     # os release : 3.4.11-yocto-standard
     # perf version : 3.4.11
     # arch : i686
     # nrcpus online : 2
     # nrcpus avail : 2
     # cpudesc : Intel(R) Atom(TM) CPU E660 @ 1.30GHz
     # cpuid : GenuineIntel,6,38,1
     # total memory : 1017184 kB
     # cmdline : /usr/bin/perf record -g -a -e probe:do_fork sleep 30
     # event : name = probe:do_fork, type = 2, config = 0x3b0, config1 = 0x0, config2 = 0x0, excl_usr = 0, excl_kern
      = 0, id = { 5, 6 }
     # HEADER_CPU_TOPOLOGY info available, use -I to display
     # ========
     #
      matchbox-deskto  1197 [001] 34211.378318: do_fork: (c1028460)
      matchbox-deskto  1295 [001] 34211.380388: do_fork: (c1028460)
              pcmanfm  1296 [000] 34211.632350: do_fork: (c1028460)
              pcmanfm  1296 [000] 34211.639917: do_fork: (c1028460)
      matchbox-deskto  1197 [001] 34217.541603: do_fork: (c1028460)
      matchbox-deskto  1299 [001] 34217.543584: do_fork: (c1028460)
               gthumb  1300 [001] 34217.697451: do_fork: (c1028460)
               gthumb  1300 [001] 34219.085734: do_fork: (c1028460)
               gthumb  1300 [000] 34219.121351: do_fork: (c1028460)
               gthumb  1300 [001] 34219.264551: do_fork: (c1028460)
              pcmanfm  1296 [000] 34219.590380: do_fork: (c1028460)
      matchbox-deskto  1197 [001] 34224.955965: do_fork: (c1028460)
      matchbox-deskto  1306 [001] 34224.957972: do_fork: (c1028460)
      matchbox-termin  1307 [000] 34225.038214: do_fork: (c1028460)
      matchbox-termin  1307 [001] 34225.044218: do_fork: (c1028460)
      matchbox-termin  1307 [000] 34225.046442: do_fork: (c1028460)
      matchbox-deskto  1197 [001] 34237.112138: do_fork: (c1028460)
      matchbox-deskto  1311 [001] 34237.114106: do_fork: (c1028460)
                 gaku  1312 [000] 34237.202388: do_fork: (c1028460)
                

And using 'perf report' on the same file, we can see the callgraphs from starting a few programs during those 30 seconds:

Tying it Together: The trace events subsystem accommodate static and dynamic tracepoints in exactly the same way - there's no difference as far as the infrastructure is concerned. See the ftrace section for more details on the trace event subsystem.
Tying it Together: Dynamic tracepoints are implemented under the covers by kprobes and uprobes. kprobes and uprobes are also used by and in fact are the main focus of SystemTap.