aboutsummaryrefslogtreecommitdiff
path: root/clang/docs/CallGraphSection.rst
blob: d5db507f9ca9a60e5e2feaeaa4b8e1d53fb8ddd4 (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
==================
Call Graph Section
==================

Introduction
============

With ``-fcall-graph-section``, the compiler will create a call graph section 
in the object file. It will include type identifiers for indirect calls and 
targets. This information can be used to map indirect calls to their receivers 
with matching types. A complete and high-precision call graph can be 
reconstructed by complementing this information with disassembly 
(see ``llvm-objdump --call-graph-info``).

Semantics
=========

A coarse-grained, type-agnostic call graph may allow indirect calls to target
any function in the program. This approach ensures completeness since no
indirect call edge is missing. However, it is generally poor in precision
due to having unneeded edges.

A call graph section provides type identifiers for indirect calls and targets.
This information can be used to restrict the receivers of an indirect target to
indirect calls with matching type. Consequently, the precision for indirect
call edges are improved while maintaining the completeness.

The ``llvm-objdump`` utility provides a ``--call-graph-info`` option to extract
full call graph information by parsing the content of the call graph section
and disassembling the program for complementary information, e.g., direct
calls.

Section layout
==============

A call graph section consists of zero or more call graph entries.
Each entry contains information on a function and its indirect calls.

An entry of a call graph section has the following layout in the binary:

+---------------------+-----------------------------------------------------------------------+
| Element             | Content                                                               |
+=====================+=======================================================================+
| FormatVersionNumber | Format version number.                                                |
+---------------------+-----------------------------------------------------------------------+
| FunctionEntryPc     | Function entry address.                                               |
+---------------------+-----------------------------------+-----------------------------------+
|                     | A flag whether the function is an | - 0: not an indirect target       |
| FunctionKind        | indirect target, and if so,       | - 1: indirect target, unknown id  |
|                     | whether its type id is known.     | - 2: indirect target, known id    |
+---------------------+-----------------------------------+-----------------------------------+
| FunctionTypeId      | Type id for the indirect target. Present only when FunctionKind is 2. |
+---------------------+-----------------------------------------------------------------------+
| CallSiteCount       | Number of type id to indirect call site mappings that follow.         |
+---------------------+-----------------------------------------------------------------------+
| CallSiteList        | List of type id and indirect call site pc pairs.                      |
+---------------------+-----------------------------------------------------------------------+

Each element in an entry (including each element of the contained lists and
pairs) occupies 64-bit space.

The format version number is repeated per entry to support concatenation of
call graph sections with different format versions by the linker.

As of now, the only supported format version is described above and has version
number 0.

Type identifiers
================

The type for an indirect call or target is the function signature.
The mapping from a type to an identifier is an ABI detail.
In the current experimental implementation, an identifier of type T is
computed as follows:

  -  Obtain the generalized mangled name for “typeinfo name for T”.
  -  Compute MD5 hash of the name as a string.
  -  Reinterpret the first 8 bytes of the hash as a little-endian 64-bit integer.

To avoid mismatched pointer types, generalizations are applied.
Pointers in return and argument types are treated as equivalent as long as the
qualifiers for the type they point to match.
For example, ``char*``, ``char**``, and ``int*`` are considered equivalent
types. However, ``char*`` and ``const char*`` are considered separate types.

Missing type identifiers
========================

For functions, two cases need to be considered. First, if the compiler cannot
deduce a type id for an indirect target, it will be listed as an indirect target
without a type id. Second, if an object without a call graph section gets
linked, the final call graph section will lack information on functions from
the object. For completeness, these functions need to be taken as receiver to
any indirect call regardless of their type id.
``llvm-objdump --call-graph-info`` lists these functions as indirect targets
with `UNKNOWN` type id.

For indirect calls, current implementation guarantees a type id for each
compiled call. However, if an object without a call graph section gets linked,
no type id will be present for its indirect calls. For completeness, these calls
need to be taken to target any indirect target regardless of their type id. For
indirect calls, ``llvm-objdump --call-graph-info`` prints 1) a complete list of
indirect calls, 2) type id to indirect call mappings. The difference of these
lists allow to deduce the indirect calls with missing type ids.

TODO: measure and report the ratio of missed type ids

Performance
===========

A call graph section does not affect the executable code and does not occupy
memory during process execution. Therefore, there is no performance overhead.

The scheme has not yet been optimized for binary size.

TODO: measure and report the increase in the binary size

Example
=======

For example, consider the following C++ code:

.. code-block:: cpp

    namespace {
      // Not an indirect target
      void foo() {}
    }

    // Indirect target 1
    void bar() {}

    // Indirect target 2
    int baz(char a, float *b) {
      return 0;
    }

    // Indirect target 3
    int main() {
      char a;
      float b;
      void (*fp_bar)() = bar;
      int (*fp_baz1)(char, float*) = baz;
      int (*fp_baz2)(char, float*) = baz;

      // Indirect call site 1
      fp_bar();

      // Indirect call site 2
      fp_baz1(a, &b);

      // Indirect call site 3: shares the type id with indirect call site 2
      fp_baz2(a, &b);

      // Direct call sites
      foo();
      bar();
      baz(a, &b);

      return 0;
    }

Following will compile it with a call graph section created in the binary:

.. code-block:: bash

  $ clang -fcall-graph-section example.cpp

During the construction of the call graph section, the type identifiers are 
computed as follows:

+---------------+-----------------------+----------------------------+----------------------------+
| Function name | Generalized signature | Mangled name (itanium ABI) | Numeric type id (md5 hash) |
+===============+=======================+============================+============================+
|  bar          | void ()               | _ZTSFvvE.generalized       | f85c699bb8ef20a2           |
+---------------+-----------------------+----------------------------+----------------------------+
|  baz          | int (char, void*)     | _ZTSFicPvE.generalized     | e3804d2a7f2b03fe           |
+---------------+-----------------------+----------------------------+----------------------------+
|  main         | int ()                | _ZTSFivE.generalized       | a9494def81a01dc            |
+---------------+-----------------------+----------------------------+----------------------------+

The call graph section will have the following content:

+---------------+-----------------+--------------+----------------+---------------+--------------------------------------+
| FormatVersion | FunctionEntryPc | FunctionKind | FunctionTypeId | CallSiteCount | CallSiteList                         |
+===============+=================+==============+================+===============+======================================+
| 0             | EntryPc(foo)    | 0            | (empty)        | 0             | (empty)                              |
+---------------+-----------------+--------------+----------------+---------------+--------------------------------------+
| 0             | EntryPc(bar)    | 2            | TypeId(bar)    | 0             | (empty)                              |
+---------------+-----------------+--------------+----------------+---------------+--------------------------------------+
| 0             | EntryPc(baz)    | 2            | TypeId(baz)    | 0             | (empty)                              |
+---------------+-----------------+--------------+----------------+---------------+--------------------------------------+
| 0             | EntryPc(main)   | 2            | TypeId(main)   | 3             | * TypeId(bar), CallSitePc(fp_bar())  |
|               |                 |              |                |               | * TypeId(baz), CallSitePc(fp_baz1()) |
|               |                 |              |                |               | * TypeId(baz), CallSitePc(fp_baz2()) |
+---------------+-----------------+--------------+----------------+---------------+--------------------------------------+


The ``llvm-objdump`` utility can parse the call graph section and disassemble
the program to provide complete call graph information. This includes any
additional call sites from the binary:

.. code-block:: bash

    $ llvm-objdump --call-graph-info a.out

    # Comments are not a part of the llvm-objdump's output but inserted for clarifications.

    a.out:  file format elf64-x86-64
    # These warnings are due to the functions and the indirect calls coming from linked objects.
    llvm-objdump: warning: 'a.out': callgraph section does not have type ids for 3 indirect calls
    llvm-objdump: warning: 'a.out': callgraph section does not have information for 10 functions

    # Unknown targets are the 10 functions the warnings mention.
    INDIRECT TARGET TYPES (TYPEID [FUNC_ADDR,])
    UNKNOWN 401000 401100 401234 401050 401090 4010d0 4011d0 401020 401060 401230
    a9494def81a01dc 401150            # main()
    f85c699bb8ef20a2 401120           # bar()
    e3804d2a7f2b03fe 401130           # baz()

    # Notice that the call sites share the same type id as target functions
    INDIRECT CALL TYPES (TYPEID [CALL_SITE_ADDR,])
    f85c699bb8ef20a2 401181           # Indirect call site 1 (fp_bar())
    e3804d2a7f2b03fe 401191 4011a1    # Indirect call site 2 and 3 (fp_baz1() and fp_baz2())

    INDIRECT CALL SITES (CALLER_ADDR [CALL_SITE_ADDR,])
    401000 401012                     # _init
    401150 401181 401191 4011a1       # main calls fp_bar(), fp_baz1(), fp_baz2()
    4011d0 401215                     # __libc_csu_init
    401020 40104a                     # _start

    DIRECT CALL SITES (CALLER_ADDR [(CALL_SITE_ADDR, TARGET_ADDR),])
    4010d0 4010e2 401060              # __do_global_dtors_aux
    401150 4011a6 401110 4011ab 401120 4011ba 401130   # main calls foo(), bar(), baz()
    4011d0 4011fd 401000              # __libc_csu_init

    FUNCTIONS (FUNC_ENTRY_ADDR, SYM_NAME)
    401000 _init
    401100 frame_dummy
    401234 _fini
    401050 _dl_relocate_static_pie
    401090 register_tm_clones
    4010d0 __do_global_dtors_aux
    401110 _ZN12_GLOBAL__N_13fooEv    # (anonymous namespace)::foo()
    401150 main                       # main
    4011d0 __libc_csu_init
    401020 _start
    401060 deregister_tm_clones
    401120 _Z3barv                    # bar()
    401130 _Z3bazcPf                  # baz(char, float*)
    401230 __libc_csu_fini