\input texinfo @c -*- Texinfo -*-
@setfilename ctf-spec.info
@settitle The CTF File Format
@ifnottex
@xrefautomaticsectiontitle on
@end ifnottex
@synindex fn cp
@synindex tp cp
@synindex vr cp
@copying
Copyright @copyright{} 2021-2024 Free Software Foundation, Inc.
Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU General Public License, Version 3 or any
later version published by the Free Software Foundation. A copy of the
license is included in the section entitled ``GNU General Public
License''.
@end copying
@dircategory Software development
@direntry
* CTF: (ctf-spec). The CTF file format.
@end direntry
@titlepage
@title The CTF File Format
@subtitle Version 3
@author Nick Alcock
@page
@vskip 0pt plus 1filll
@insertcopying
@end titlepage
@contents
@ifnottex
@node Top
@top The CTF file format
This manual describes version 3 of the CTF file format, which is
intended to model the C type system in a fashion that C programs can
consume at runtime.
@end ifnottex
@node Overview
@unnumbered Overview
@cindex Overview
The CTF file format compactly describes C types and the association
between function and data symbols and types: if embedded in ELF objects,
it can exploit the ELF string table to reduce duplication further.
There is no real concept of namespacing: only top-level types are
described, not types scoped to within single functions.
CTF dictionaries can be @dfn{children} of other dictionaries, in a
one-level hierarchy: child dictionaries can refer to types in the
parent, but the opposite is not sensible (since if you refer to a child
type in the parent, the actual type you cited would vary depending on
what child was attached). This parent/child definition is recorded in
the child, but only as a recommendation: users of the API have to attach
parents to children explicitly, and can choose to attach a child to any
parent they like, or to none, though doing so might lead to unpleasant
consequences like dangling references to types. @xref{Type indexes and
type IDs}. Type lookups in child dicts that are not associated with a
parent at all will fail with @code{ECTF_NOPARENT} if a parent type was
needed.
The associated API to generate, merge together, and query this file
format will be described in the accompanying @code{libctf} manual once
it is written. There is no API to modify dictionaries once they've been
written out: CTF is a write-once file format. (However, it is always
possible to dynamically create a new child dictionary on the fly and
attach it to a pre-existing, read-only parent.)
There are two major pieces to CTF: the @dfn{archive} and the
@dfn{dictionary}. Some relatives and ancestors of CTF call dictionaries
@dfn{containers}: the archive format is unique to this variant of CTF.
(Much of the source code still uses the old term.)
The archive file format is a very simple mmappable archive used to group
multiple dictionaries together: it is expected to slowly go
away and be replaced by other mechanisms, but right now it is an
important part of the file format, used to group dictionaries containing
types with conflicting definitions in different TUs with the overarching
dictionary used to store all other types. (Even when archives go away,
the @code{libctf} API used to access them will remain, and access the
other mechanisms that replace it instead.)
The CTF dictionary consists of a @dfn{preamble}, which does not vary
between versions of the CTF file format, and a @dfn{header} and some
number of @dfn{sections}, which can vary between versions.
The rest of this specification describes the format of these sections,
first for the latest version of CTF, then for all earlier versions
supported by @code{libctf}: the earlier versions are defined in terms of
their differences from the next later one. We describe each part of the
format first by reproducing the C structure which defines that part,
then describing it at greater length in terms of file offsets.
The description of the file format ends with a description of relevant
limits that apply to it. These limits can vary between file format
versions.
This document is quite young, so for now the C code in @file{ctf.h}
should be presumed correct when this document conflicts with it.
@node CTF archive
@chapter CTF archives
@cindex archive, CTF archive
The CTF archive format maps names to CTF dictionaries. The names may
contain any character other than \0, but for now archives containing
slashes in the names may not extract correctly. It is possible to
insert multiple members with the same name, but these are quite hard to
access reliably (you have to iterate through all the members rather than
opening by name) so this is not recommended.
CTF archives are not themselves compressed: the constituent components,
CTF dictionaries, can be compressed (@pxref{CTF header}).
CTF archives usually contain a collection of related dictionaries, one
parent and many children of that parent. CTF archives can have a member
with a @dfn{default name}, @code{.ctf} (which can be represented as
@code{NULL} in the API). If present, this member is usually the parent
of all the children, but it is possible for CTF producers to emit
parents with different names if they wish (usually for
backward-compatibility purposes).
@code{.ctf} sections in ELF objects consist of a single CTF dictionary
rather than an archive of dictionaries if and only if the section
contains no types with identical names but conflicting definitions: if
two conflicting definitions exist, the deduplicator will place the type
most commonly referred to by other types in the parent and will place
the other type in a child named after the translation unit it is found
in, and will emit a CTF archive containing both dictionaries instead of
a raw dictionary. All types that refer to such conflicting types are
also placed in the per-translation-unit child.
The definition of an archive in @file{ctf.h} is as follows:
@verbatim
struct ctf_archive
{
  uint64_t ctfa_magic;
  uint64_t ctfa_model;
  uint64_t ctfa_nfiles;
  uint64_t ctfa_names;
  uint64_t ctfa_ctfs;
};

typedef struct ctf_archive_modent
{
  uint64_t name_offset;
  uint64_t ctf_offset;
} ctf_archive_modent_t;
@end verbatim
(Note one irregularity here: the @code{ctf_archive_t} is not a typedef
to @code{struct ctf_archive}, but a different typedef, private to
@code{libctf}, so that things that are not really archives can be made
to appear as if they were.)
All the above items are always in little-endian byte order, regardless
of the machine endianness.
The archive header has the following fields:
@tindex struct ctf_archive
@multitable {Offset} {@code{uint64_t ctfa_nfiles}} {The data model for this archive: an arbitrary integer}
@headitem Offset @tab Name @tab Description
@item 0x00
@tab @code{uint64_t ctfa_magic}
@vindex ctfa_magic
@vindex struct ctf_archive, ctfa_magic
@tab The magic number for archives, @code{CTFA_MAGIC}: 0x8b47f2a4d7623eeb.
@tindex CTFA_MAGIC
@item 0x08
@tab @code{uint64_t ctfa_model}
@vindex ctfa_model
@vindex struct ctf_archive, ctfa_model
@tab The data model for this archive: an arbitrary integer that serves no
purpose but to be handed back by the libctf API. @xref{Data models}.
@item 0x10
@tab @code{uint64_t ctfa_nfiles}
@vindex ctfa_nfiles
@vindex struct ctf_archive, ctfa_nfiles
@tab The number of CTF dictionaries in this archive.
@item 0x18
@tab @code{uint64_t ctfa_names}
@vindex ctfa_names
@vindex struct ctf_archive, ctfa_names
@tab Offset of the name table, in bytes from the start of the archive.
The name table is an array of @code{ctf_archive_modent_t[ctfa_nfiles]}.
@item 0x20
@tab @code{uint64_t ctfa_ctfs}
@vindex ctfa_ctfs
@vindex struct ctf_archive, ctfa_ctfs
@tab Offset of the CTF table. Each element starts with a @code{uint64_t} size,
followed by a CTF dictionary.
@end multitable
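As an illustration of the layout above, the following sketch decodes the
five header fields from a byte buffer without relying on the byte order
of the host. The names @code{archive_header}, @code{read_le64} and
@code{decode_archive_header} are hypothetical and are not part of
@code{libctf}: only @code{CTFA_MAGIC} and the field offsets are taken
from the table.
@verbatim
#include <stddef.h>
#include <stdint.h>

#define CTFA_MAGIC 0x8b47f2a4d7623eebULL

/* Hypothetical host-byte-order copy of struct ctf_archive.  */
struct archive_header
{
  uint64_t magic;
  uint64_t model;
  uint64_t nfiles;
  uint64_t names;
  uint64_t ctfs;
};

/* Read a little-endian 64-bit value from a possibly unaligned buffer.  */
static uint64_t
read_le64 (const unsigned char *p)
{
  uint64_t v = 0;

  for (int i = 7; i >= 0; i--)
    v = (v << 8) | p[i];
  return v;
}

/* Decode the 0x28-byte archive header.  Return 0 on success, -1 if the
   buffer is too short or the magic number does not match.  */
static int
decode_archive_header (const unsigned char *buf, size_t len,
                       struct archive_header *out)
{
  if (len < 0x28)
    return -1;

  out->magic = read_le64 (buf + 0x00);
  out->model = read_le64 (buf + 0x08);
  out->nfiles = read_le64 (buf + 0x10);
  out->names = read_le64 (buf + 0x18);
  out->ctfs = read_le64 (buf + 0x20);

  return out->magic == CTFA_MAGIC ? 0 : -1;
}
@end verbatim
Reading the fields one byte at a time avoids both alignment and
byte-order assumptions, which matters because the archive fields are
little-endian even on big-endian hosts.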
The array pointed to by @code{ctfa_names} is an array of entries of
@code{ctf_archive_modent}:
@tindex struct ctf_archive_modent
@tindex ctf_archive_modent_t
@multitable {Offset} {@code{uint64_t name_offset}} {Offset of this name, in bytes from the start}
@headitem Offset @tab Name @tab Description
@item 0x00
@tab @code{uint64_t name_offset}
@vindex name_offset
@vindex struct ctf_archive_modent, name_offset
@vindex ctf_archive_modent_t, name_offset
@tab Offset of this name, in bytes from the start of the archive.
@item 0x08
@tab @code{uint64_t ctf_offset}
@vindex ctf_offset
@vindex struct ctf_archive_modent, ctf_offset
@vindex ctf_archive_modent_t, ctf_offset
@tab Offset of this CTF dictionary, in bytes from the start of the archive.
@end multitable
The @code{ctfa_names} array is sorted into ASCIIbetical order by name
(i.e. by the result of dereferencing the @code{name_offset}).
The archive file also contains a name table and a table of CTF
dictionaries: these are pointed to by the structures above. The name
table is a simple strtab which is not required to be sorted; the
dictionary array is described above in the entry for @code{ctfa_ctfs}.
The relative order of these various parts is not defined, except that
the header naturally always comes first.
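Because the @code{ctfa_names} array is sorted by name, a consumer can
find a member with a binary search. The following is a minimal sketch of
that idea, not the @code{libctf} implementation. It assumes the entries
have already been converted to host byte order, and it takes the base
address against which @code{name_offset} is resolved as a parameter, so
that the caller decides that detail by consulting @file{ctf.h}. The
names @code{modent}, @code{modent_key}, @code{compare_modent} and
@code{find_member} are hypothetical.
@verbatim
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical host-byte-order copy of one ctf_archive_modent_t.  */
struct modent
{
  uint64_t name_offset;
  uint64_t ctf_offset;
};

/* Search key: the wanted name, plus the base address that name
   offsets are resolved against.  */
struct modent_key
{
  const char *name;
  const char *name_base;
};

/* bsearch() comparator.  strcmp() compares bytes as unsigned char,
   which gives the plain byte ordering the array is sorted in.  */
static int
compare_modent (const void *key_, const void *ent_)
{
  const struct modent_key *key = key_;
  const struct modent *ent = ent_;

  return strcmp (key->name, key->name_base + ent->name_offset);
}

/* Return the entry named NAME, or NULL if there is none.  If several
   members share a name, which of them is returned is unspecified.  */
static const struct modent *
find_member (const struct modent *ents, size_t nfiles,
             const char *name_base, const char *name)
{
  struct modent_key key = { name, name_base };

  return bsearch (&key, ents, nfiles, sizeof (struct modent),
                  compare_modent);
}
@end verbatim
The unspecified result for duplicate names is one reason why such
members can only be reached reliably by iterating over every entry.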
@node CTF dictionaries
@chapter CTF dictionaries
@cindex dictionary, CTF dictionary
CTF dictionaries consist of a header, starting with a preamble, and a
number of sections.
@node CTF Preamble
@section CTF Preamble
The preamble is the only part of the CTF dictionary whose format cannot
vary between versions. It is never compressed. It is correspondingly
simple:
@verbatim
typedef struct ctf_preamble
{
  unsigned short ctp_magic;
  unsigned char ctp_version;
  unsigned char ctp_flags;
} ctf_preamble_t;
@end verbatim
@code{#define}s are provided under the names @code{cth_magic},
@code{cth_version} and @code{cth_flags} to make the fields of the
@code{ctf_preamble_t} appear to be part of the @code{ctf_header_t}, so
consuming programs rarely need to consider the existence of the preamble
as a separate structure.
@tindex struct ctf_preamble
@tindex ctf_preamble_t
@multitable {Offset} {@code{unsigned char ctp_version}} {The magic number for CTF dictionaries}
@headitem Offset @tab Name @tab Description
@item 0x00
@tab @code{unsigned short ctp_magic}
@vindex ctp_magic
@vindex cth_magic
@vindex ctf_preamble_t, ctp_magic
@vindex struct ctf_preamble, ctp_magic
@vindex ctf_header_t, cth_magic
@vindex struct ctf_header, cth_magic
@tab The magic number for CTF dictionaries, @code{CTF_MAGIC}: 0xdff2.
@tindex CTF_MAGIC
@item 0x02
@tab @code{unsigned char ctp_version}
@vindex ctp_version
@vindex cth_version
@vindex ctf_preamble_t, ctp_version
@vindex struct ctf_preamble, ctp_version
@vindex ctf_header_t, cth_version
@vindex struct ctf_header, cth_version
@tab The version number of this CTF dictionary.
@item 0x03
@tab @code{unsigned char ctp_flags}
@vindex ctp_flags
@vindex cth_flags
@vindex ctf_preamble_t, ctp_flags
@vindex struct ctf_preamble, ctp_flags
@vindex ctf_header_t, cth_flags
@vindex struct ctf_header, cth_flags
@tab Flags for this CTF file. @xref{CTF file-wide flags}.
@end multitable
@cindex alignment
Every element of a dictionary must be naturally aligned unless otherwise
specified. (This restriction will be lifted in later versions.)
@cindex endianness
CTF dictionaries are stored in the native endianness of the system that
generates them: the consumer (e.g., @code{libctf}) can detect whether to
endian-flip a CTF dictionary by inspecting the @code{ctp_magic}. (If it
appears as 0xf2df, endian-flipping is needed.)
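The check described above can be sketched as follows. This is only an
illustration: the enumeration and the function @code{detect_byte_order}
are hypothetical names, and the sketch assumes that
@code{unsigned short} is 16 bits wide, so that @code{uint16_t} can stand
in for the type of @code{ctp_magic}.
@verbatim
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CTF_MAGIC 0xdff2

/* Hypothetical result type for the sketch below.  */
enum ctf_byte_order
{
  CTF_ORDER_NATIVE,   /* Magic matches: no conversion needed.  */
  CTF_ORDER_SWAPPED,  /* Magic is byte-swapped: endian-flip needed.  */
  CTF_ORDER_INVALID   /* Not a CTF dictionary (or too short).  */
};

/* Inspect the first two bytes of BUF, read in host byte order.  */
static enum ctf_byte_order
detect_byte_order (const unsigned char *buf, size_t len)
{
  uint16_t magic;

  if (len < sizeof magic)
    return CTF_ORDER_INVALID;

  /* memcpy avoids any alignment requirement on BUF.  */
  memcpy (&magic, buf, sizeof magic);

  if (magic == CTF_MAGIC)
    return CTF_ORDER_NATIVE;
  if (magic == 0xf2df)
    return CTF_ORDER_SWAPPED;
  return CTF_ORDER_INVALID;
}
@end verbatim
Since the preamble is never compressed, a check of this kind can run
before any other part of the dictionary is interpreted.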
The version of the CTF dictionary can be determined by inspecting
@code{ctp_version}. The following versions are currently valid, and
@code{libctf} can read all of them:
@tindex CTF_VERSION_3
@cindex CTF versions, versions
@multitable {@code{CTF_VERSION_1_UPGRADED_3}} {Number} {First version, rare. Very similar to Solaris CTF.}
@headitem Version @tab Number @tab Description
@item @code{CTF_VERSION_1}
@tab 1 @tab First version, rare. Very similar to Solaris CTF.
@item @code{CTF_VERSION_1_UPGRADED_3}
@tab 2 @tab First version, upgraded to v3 or higher and written out again.
Name may change. Very rare.
@item @code{CTF_VERSION_2}
@tab 3 @tab Second version, with many range limits lifted.
@item @code{CTF_VERSION_3}
@tab 4 @tab Third and current version, documented here.
@end multitable
This section documents @code{CTF_VERSION_3}.
@vindex ctp_flags
@node CTF file-wide flags
@subsection CTF file-wide flags
The preamble contains bitflags in its @code{ctp_flags} field that
describe various file-wide properties. Some of the flags are valid only
for particular file-format versions, which means the flags can be used
to fix file-format bugs. Consumers that see unknown flags should
accordingly assume that the dictionary is not comprehensible, and
refuse to open it.
The following flags are currently defined. Many are bug workarounds,
valid only in CTFv3, and will not be valid in any future versions: the
same values may be reused for other flags in v4+.
@multitable {@code{CTF_F_NEWFUNCINFO}} {Versions} {Value} {The external strtab is in @code{.dynstr} and the}
@headitem Flag @tab Versions @tab Value @tab Meaning
@tindex CTF_F_COMPRESS
@item @code{CTF_F_COMPRESS} @tab All @tab 0x1 @tab Compressed with zlib
@tindex CTF_F_NEWFUNCINFO
@item @code{CTF_F_NEWFUNCINFO} @tab 3 only @tab 0x2
@tab ``New-format'' func info section.
@tindex CTF_F_IDXSORTED
@item @code{CTF_F_IDXSORTED} @tab 3+ @tab 0x4 @tab The index section is
in sorted order
@tindex CTF_F_DYNSTR
@item @code{CTF_F_DYNSTR} @tab 3 only @tab 0x8 @tab The external strtab is
in @code{.dynstr} and the symtab used is @code{.dynsym}.
@xref{The string section}.
@end multitable
@code{CTF_F_NEWFUNCINFO} and @code{CTF_F_IDXSORTED} relate to the
function info and data object sections. @xref{The symtypetab sections}.
Further flags (and further compression methods) will be added in future.
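The rule that unknown flags must cause the dictionary to be rejected can
be sketched like this. The flag values and the version number come from
the tables in this section; the helper functions @code{valid_flags} and
@code{flags_ok} are hypothetical, and the sketch only knows about
versions up to @code{CTF_VERSION_3}.
@verbatim
#include <stdint.h>

#define CTF_F_COMPRESS    0x1
#define CTF_F_NEWFUNCINFO 0x2
#define CTF_F_IDXSORTED   0x4
#define CTF_F_DYNSTR      0x8

#define CTF_VERSION_3 4

/* Return the set of flags this sketch accepts for the given value of
   ctp_version, following the "Versions" column of the table.  */
static uint8_t
valid_flags (uint8_t version)
{
  uint8_t flags = CTF_F_COMPRESS;       /* Valid in all versions.  */

  if (version == CTF_VERSION_3)
    flags |= CTF_F_NEWFUNCINFO | CTF_F_IDXSORTED | CTF_F_DYNSTR;
  return flags;
}

/* Return nonzero if FLAGS contains only bits known for VERSION.  A
   consumer should refuse to open the dictionary otherwise.  */
static int
flags_ok (uint8_t version, uint8_t flags)
{
  return (flags & (uint8_t) ~valid_flags (version)) == 0;
}
@end verbatim
Checking against a per-version mask, rather than one global mask, leaves
room for a later version to reuse a bit value with a different meaning.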
@node CTF header
@section CTF header
@cindex CTF header
@cindex Sections, header
The CTF header is the first part of a CTF dictionary, including the
preamble. All parts of it other than the preamble (@pxref{CTF Preamble})
can vary between CTF file versions and are never compressed. It
contains things that apply to the dictionary as a whole, and a table of
the sections into which the rest of the dictionary is divided. The
sections tile the file: each section runs from the offset given until
the start of the next section. Only the last section cannot follow this
rule, so the header has a length for it instead.
All section offsets, here and in the rest of the CTF file, are relative to the
@emph{end} of the header. (This is annoyingly different to how offsets in CTF
archives are handled.)
This is the first structure to include offsets into the string table, which are
not straight references because CTF dictionaries can include references into the
ELF string table to save space, as well as into the string table internal to the
CTF dictionary. @xref{The string section}, for more on these. Offset 0 is
always the null string.
@verbatim
typedef struct ctf_header
{
  ctf_preamble_t cth_preamble;
  uint32_t cth_parlabel;
  uint32_t cth_parname;
  uint32_t cth_cuname;
  uint32_t cth_lbloff;
  uint32_t cth_objtoff;
  uint32_t cth_funcoff;
  uint32_t cth_objtidxoff;
  uint32_t cth_funcidxoff;
  uint32_t cth_varoff;
  uint32_t cth_typeoff;
  uint32_t cth_stroff;
  uint32_t cth_strlen;
} ctf_header_t;
@end verbatim
In detail:
@tindex struct ctf_header
@tindex ctf_header_t
@multitable {Offset} {@code{ctf_preamble_t cth_preamble}} {The parent label, if deduplication happened against}
@headitem Offset @tab Name @tab Description
@item 0x00
@tab @code{ctf_preamble_t cth_preamble}
@vindex cth_preamble
@vindex struct ctf_header, cth_preamble
@vindex ctf_header_t, cth_preamble
@tab The preamble (conceptually embedded in the header). @xref{CTF Preamble}.
@item 0x04
@tab @code{uint32_t cth_parlabel}
@vindex cth_parlabel
@vindex struct ctf_header, cth_parlabel
@vindex ctf_header_t, cth_parlabel
@tab The parent label, if deduplication happened against a specific label: a
strtab offset. @xref{The label section}. Currently unused and always 0, but may
be used in future when semantics are attached to the label section.
@item 0x08
@tab @code{uint32_t cth_parname}
@vindex cth_parname
@vindex struct ctf_header, cth_parname
@vindex ctf_header_t, cth_parname
@tab The name of the parent dictionary deduplicated against: a strtab offset.
Interpretation is up to the consumer (usually a CTF archive member name). 0
(the null string) if this is not a child dictionary.
@item 0x0c
@tab @code{uint32_t cth_cuname}
@vindex cth_cuname
@vindex struct ctf_header, cth_cuname
@vindex ctf_header_t, cth_cuname
@tab The name of the compilation unit, for consumers like GDB that want to
know which CU a single-CU dictionary describes: a strtab offset. 0 if this
dictionary describes types from many CUs.
@item 0x10
@tab @code{uint32_t cth_lbloff}
@vindex cth_lbloff
@vindex struct ctf_header, cth_lbloff
@vindex ctf_header_t, cth_lbloff
@tab The offset of the label section, which tiles the type space into
named regions. @xref{The label section}.
@item 0x14
@tab @code{uint32_t cth_objtoff}
@vindex cth_objtoff
@vindex struct ctf_header, cth_objtoff
@vindex ctf_header_t, cth_objtoff
@tab The offset of the data object symtypetab section, which maps ELF data symbols to
types. @xref{The symtypetab sections}.
@item 0x18
@tab @code{uint32_t cth_funcoff}
@vindex cth_funcoff
@vindex struct ctf_header, cth_funcoff
@vindex ctf_header_t, cth_funcoff
@tab The offset of the function info symtypetab section, which maps ELF function
symbols to a return type and arg types. @xref{The symtypetab sections}.
@item 0x1c
@tab @code{uint32_t cth_objtidxoff}
@vindex cth_objtidxoff
@vindex struct ctf_header, cth_objtidxoff
@vindex ctf_header_t, cth_objtidxoff
@tab The offset of the object index section, which maps ELF object symbols to
entries in the data object section. @xref{The symtypetab sections}.
@item 0x20
@tab @code{uint32_t cth_funcidxoff}
@vindex cth_funcidxoff
@vindex struct ctf_header, cth_funcidxoff
@vindex ctf_header_t, cth_funcidxoff
@tab The offset of the function info index section, which maps ELF function
symbols to entries in the function info section. @xref{The symtypetab sections}.
@item 0x24
@tab @code{uint32_t cth_varoff}
@vindex cth_varoff
@vindex struct ctf_header, cth_varoff
@vindex ctf_header_t, cth_varoff
@tab The offset of the variable section, which maps string names to types.
@xref{The variable section}.
@item 0x28
@tab @code{uint32_t cth_typeoff}
@vindex cth_typeoff
@vindex struct ctf_header, cth_typeoff
@vindex ctf_header_t, cth_typeoff
@tab The offset of the type section, the core of CTF, which describes types
using variable-length array elements. @xref{The type section}.
@item 0x2c
@tab @code{uint32_t cth_stroff}
@vindex cth_stroff
@vindex struct ctf_header, cth_stroff
@vindex ctf_header_t, cth_stroff
@tab The offset of the string section. @xref{The string section}.
@item 0x30
@tab @code{uint32_t cth_strlen}
@vindex cth_strlen
@vindex struct ctf_header, cth_strlen
@vindex ctf_header_t, cth_strlen
@tab The length of the string section (not an offset!). The CTF file ends
at this point.
@end multitable
Everything from this point on (until the end of the file at @code{cth_stroff} +
@code{cth_strlen}) is compressed with zlib if @code{CTF_F_COMPRESS} is set in
the preamble's @code{ctp_flags}.
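Because the sections tile the file, the size of each one can be derived
from the header alone. The sketch below shows one way to do so. It
assumes that the sections are laid out in the same order as their offset
fields appear in the header, which the text above implies but does not
state outright, and that the offsets have already been converted to host
byte order. The enumeration and the function @code{section_sizes} are
hypothetical names.
@verbatim
#include <stdint.h>

/* Hypothetical indexes, in header field order: cth_lbloff,
   cth_objtoff, cth_funcoff, cth_objtidxoff, cth_funcidxoff,
   cth_varoff, cth_typeoff, cth_stroff.  */
enum
{
  SEC_LABEL, SEC_OBJT, SEC_FUNC, SEC_OBJTIDX, SEC_FUNCIDX,
  SEC_VAR, SEC_TYPE, SEC_STR, SEC_COUNT
};

/* Fill SIZES with the length in bytes of each section.  OFFS holds
   the eight offset fields; every section but the last ends where the
   next one starts, and the string section has an explicit length.
   Return 0 on success, -1 if the offsets are not in ascending order.  */
static int
section_sizes (const uint32_t offs[SEC_COUNT], uint32_t cth_strlen,
               uint32_t sizes[SEC_COUNT])
{
  for (int i = 0; i < SEC_COUNT - 1; i++)
    {
      if (offs[i + 1] < offs[i])
        return -1;
      sizes[i] = offs[i + 1] - offs[i];
    }

  sizes[SEC_STR] = cth_strlen;
  return 0;
}
@end verbatim
A section whose offset equals that of the next section is simply empty.
All of these offsets are relative to the end of the header, so a
consumer adds @code{sizeof (ctf_header_t)} to obtain a position in the
dictionary.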
@node The type section
@section The type section
@cindex Type section
@cindex Sections, type
This section is the most important section in CTF, describing all the top-level
types in the program. It consists of an array of type structures, each of which
describes a type of some @dfn{kind}: each kind of type has some amount of
variable-length data associated with it (some kinds have none). The amount of
variable-length data associated with a given type can be determined by
inspecting the type, so the reading code can walk through the types in sequence
at opening time.
Each type structure is one of a set of overlapping structures in a discriminated
union of sorts: the variable-length data for each type immediately follows the
type's type structure. Here's the largest of the overlapping structures, which
is only needed for huge types and so is very rarely seen:
@verbatim
typedef struct ctf_type
{
  uint32_t ctt_name;
  uint32_t ctt_info;
  __extension__
  union
  {
    uint32_t ctt_size;
    uint32_t ctt_type;
  };
  uint32_t ctt_lsizehi;
  uint32_t ctt_lsizelo;
} ctf_type_t;
@end verbatim
Here's the much more common smaller form:
@verbatim
typedef struct ctf_stype
{
  uint32_t ctt_name;
  uint32_t ctt_info;
  __extension__
  union
  {
    uint32_t ctt_size;
    uint32_t ctt_type;
  };
} ctf_stype_t;
@end verbatim
If @code{ctt_size} is the #define @code{CTF_LSIZE_SENT}, 0xffffffff, this type
is described by a @code{ctf_type_t}: otherwise, a @code{ctf_stype_t}.
@tindex CTF_LSIZE_SENT
Here's what the fields mean:
@tindex struct ctf_type
@tindex struct ctf_stype
@tindex ctf_type_t
@tindex ctf_stype_t
@multitable {0x1c (@code{ctf_type_t}} {@code{uint32_t ctt_lsizehi}} {The size of this type, if this type is of a kind for}
@headitem Offset @tab Name @tab Description
@item 0x00
@tab @code{uint32_t ctt_name}
@vindex ctt_name
@tab Strtab offset of the type name, if any (0 if none).
@item 0x04
@tab @code{uint32_t ctt_info}
@vindex ctt_info
@vindex struct ctf_type, ctt_info
@vindex ctf_type_t, ctt_info
@vindex struct ctf_stype, ctt_info
@vindex ctf_stype_t, ctt_info
@tab The @dfn{info word}, containing information on the kind of this type, its
variable-length data and whether it is visible to name lookup. @xref{The
info word}.
@item 0x08
@tab @code{uint32_t ctt_size}
@vindex ctt_size
@vindex struct ctf_type, ctt_size
@vindex ctf_type_t, ctt_size
@vindex struct ctf_stype, ctt_size
@vindex ctf_stype_t, ctt_size
@tab The size of this type, if this type is of a kind for which a size needs
to be recorded (constant-size types don't need one). If this is
@code{CTF_LSIZE_SENT}, this type is a huge type described by @code{ctf_type_t}.
@item 0x08
@tab @code{uint32_t ctt_type}
@vindex ctt_type
@vindex struct ctf_stype, ctt_type
@vindex ctf_stype_t, ctt_type
@tab The type this type refers to, if this type is of a kind which refers to
other types (like a pointer). All such types are fixed-size, and no types that
are variable-size refer to other types, so @code{ctt_size} and @code{ctt_type}
overlap. All type kinds that use @code{ctt_type} are described by
@code{ctf_stype_t}, not @code{ctf_type_t}. @xref{Type indexes and type IDs}.
@item 0x0c (@code{ctf_type_t} only)
@tab @code{uint32_t ctt_lsizehi}
@vindex ctt_lsizehi
@vindex struct ctf_type, ctt_lsizehi
@vindex ctf_type_t, ctt_lsizehi
@tab The high 32 bits of the size of a very large type. The @code{CTF_TYPE_LSIZE} macro
can be used to get a 64-bit size out of this field and the next one.
@code{CTF_SIZE_TO_LSIZE_HI} splits the @code{ctt_lsizehi} out of it again.
@findex CTF_TYPE_LSIZE
@findex CTF_SIZE_TO_LSIZE_HI
@item 0x10 (@code{ctf_type_t} only)
@tab @code{uint32_t ctt_lsizelo}
@vindex ctt_lsizelo
@vindex struct ctf_type, ctt_lsizelo
@vindex ctf_type_t, ctt_lsizelo
@tab The low 32 bits of the size of a very large type.
@code{CTF_SIZE_TO_LSIZE_LO} splits the @code{ctt_lsizelo} out of a 64-bit size.
@findex CTF_SIZE_TO_LSIZE_LO
@end multitable
Two aspects of this need further explanation: the info word, and what exactly a
type ID is and how you determine it. (Information on the various
type-kind-dependent things, like whether @code{ctt_size} or @code{ctt_type} is used,
is described in the section devoted to each kind.)
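To make the relationship between the two structures concrete, here is a
small sketch of how a reader might decide which structure heads a type
and recover its size. The functions @code{type_header_len} and
@code{type_size} are hypothetical and are not the macros from
@file{ctf.h}. The lengths 12 and 20 follow from the three and five
@code{uint32_t} fields shown above. The result of @code{type_size} is
only meaningful for kinds that use @code{ctt_size}.
@verbatim
#include <stddef.h>
#include <stdint.h>

#define CTF_LSIZE_SENT 0xffffffffu

/* Length in bytes of the structure that heads a type: 12 for
   ctf_stype_t, 20 for ctf_type_t.  Any variable-length data starts
   this many bytes after the beginning of the type.  */
static size_t
type_header_len (uint32_t ctt_size)
{
  return ctt_size == CTF_LSIZE_SENT ? 20 : 12;
}

/* Size of a type whose kind records a size.  Small types store it
   directly in ctt_size; huge types store the sentinel there and split
   the real 64-bit size across ctt_lsizehi and ctt_lsizelo.  */
static uint64_t
type_size (uint32_t ctt_size, uint32_t ctt_lsizehi, uint32_t ctt_lsizelo)
{
  if (ctt_size != CTF_LSIZE_SENT)
    return ctt_size;
  return ((uint64_t) ctt_lsizehi << 32) | ctt_lsizelo;
}
@end verbatim
For example, a type with @code{ctt_size} equal to
@code{CTF_LSIZE_SENT}, @code{ctt_lsizehi} equal to 1 and
@code{ctt_lsizelo} equal to 2 has a size of 0x100000002 bytes.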
@node The info word
@subsection The info word, ctt_info
The info word is a bitfield split into three parts. From MSB to LSB:
@multitable {Bit offset} {@code{isroot}} {Length of variable-length data for this type (some kinds only).}
@headitem Bit offset @tab Name @tab Description
@item 26--31
@tab @code{kind}
@tab Type kind: @pxref{Type kinds}.
@item 25
@tab @code{isroot}
@tab 1 if this type is visible to name lookup
@item 0--24
@tab @code{vlen}
@tab Length of variable-length data for this type (some kinds only).
The variable-length data directly follows the @code{ctf_type_t} or
@code{ctf_stype_t}. This is a kind-dependent array length value,
not a length in bytes. Some kinds have no variable-length data, or
fixed-size variable-length data, and do not use this value.
@end multitable
The most mysterious of these is undoubtedly @code{isroot}. This indicates
whether types with names (nonzero @code{ctt_name}) are visible to name lookup:
if zero, this type is considered a @dfn{non-root type} and you can't look it up
by name at all. Multiple types with the same name in the same C namespace
(struct, union, enum, other) can exist in a single dictionary, but only one of
them may have a nonzero value for @code{isroot}. @code{libctf} validates this
at open time and refuses to open dictionaries that violate this constraint.
Historically, this feature was introduced for the encoding of bitfields
(@pxref{Integer types}): for instance, int bitfields will all be named
@code{int} with different widths or offsets, but only the full-width one at
offset zero is wanted when you look up the type named @code{int}. With the
introduction of slices (@pxref{Slices}) as a more general bitfield encoding
mechanism, this is less important, but we still use non-root types to handle
conflicts if the linker API is used to fuse multiple translation units into one
dictionary and those translation units contain types with the same name and
conflicting definitions. (We do not discuss this further here, because the
linker never does this: only specialized type mergers do, like that used for the
Linux kernel. The libctf documentation will describe this in more detail.)
@c XXX update when libctf docs are written.
The @code{CTF_TYPE_INFO} macro can be used to compose an info word from
a @code{kind}, @code{isroot}, and @code{vlen}; @code{CTF_V2_INFO_KIND},
@code{CTF_V2_INFO_ISROOT} and @code{CTF_V2_INFO_VLEN} pick it apart again.
@findex CTF_TYPE_INFO
@findex CTF_V2_INFO_KIND
@findex CTF_V2_INFO_ISROOT
@findex CTF_V2_INFO_VLEN
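The bit layout in the table above can be sketched in a few lines of C. The
function names here are hypothetical and only illustrate the layout: real code
should use the macros named above.

@verbatim
#include <stdint.h>

/* Hypothetical helpers mirroring the table: kind in bits 26-31,
   isroot in bit 25, vlen in bits 0-24.  */

static uint32_t
info_compose (uint32_t kind, int isroot, uint32_t vlen)
{
  return ((kind & 0x3fu) << 26)
    | ((isroot ? 1u : 0u) << 25)
    | (vlen & 0x1ffffffu);
}

static uint32_t
info_kind (uint32_t info)
{
  return info >> 26;
}

static uint32_t
info_isroot (uint32_t info)
{
  return (info >> 25) & 1u;
}

static uint32_t
info_vlen (uint32_t info)
{
  return info & 0x1ffffffu;
}
@end verbatim

For instance, a root-visible structure (kind 6) with three members would be
composed as @code{info_compose (6, 1, 3)}.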
@node Type indexes and type IDs
@subsection Type indexes and type IDs
@cindex Type indexes
@cindex Type IDs
@cindex Type, IDs of
@cindex Type, indexes of
@cindex ctf_id_t
@cindex Parent range
@cindex Child range
@cindex Type IDs, ranges
Types are referred to within the CTF file via @dfn{type IDs}. A type ID is a
number from 0 to @math{2^32-1}, from a space divided in half. Types @math{2^31-1}
and below are in the @dfn{parent range}: these IDs are used for dictionaries
that have not had any other dictionary @code{ctf_import}ed into them as a parent.
Both completely standalone dictionaries and parent dictionaries with children
hanging off them have types in this range. Types @math{2^31} and above are in
the @dfn{child range}: only types in child dictionaries are in this range.
These IDs appear in @code{ctf_type_t.ctt_type} (@pxref{The type section}), but
the types themselves have no visible ID: quite intentionally, because adding an
ID uses space, and every ID is different so they don't compress well. The IDs
are implicit: at open time, the consumer walks through the entire type section
and counts the types in it. The type section is an array of
variable-length elements, so each entry could be considered as having an index,
starting from 1. We count these indexes and associate each with its
corresponding @code{ctf_type_t} or @code{ctf_stype_t}.
Lookups of types with IDs in the parent space look in the parent dictionary if
this dictionary has one associated with it; lookups of types with IDs in the
child space error out if the dictionary does not have a parent, and otherwise
convert the ID into an index by shaving off the top bit and look up the index
in the child.
These properties mean that the same dictionary can be used as a parent of child
dictionaries and can also be used directly with no children at all, but a
dictionary created as a child dictionary must always be associated with a parent
--- usually, the same parent --- because its references to its own types have
the high bit turned on and this is only flipped off again if this is a child
dictionary. (This is not a problem, because if you @emph{don't} associate the
child with a parent, any references within it to its parent types will fail, and
there are almost certain to be many such references, or why is it a child at
all?)
This does mean that consumers should keep a close eye on the distinction between
type IDs and type indexes: if you mix them up, everything will appear to work as
long as you're only using parent dictionaries or standalone dictionaries, but as
soon as you start using children, everything will fail horribly.
Type index zero, and type ID zero, are used to indicate that this type cannot be
represented in CTF as currently constituted: they are emitted by the compiler,
but all type chains that terminate in the unknown type are erased at link time
(structure fields that use them just vanish, etc). So you will probably never
see a use of type zero outside the symtypetab sections, where they serve as
sentinels of sorts, to indicate symbols with no associated type.
The macros @code{CTF_V2_TYPE_TO_INDEX} and @code{CTF_V2_INDEX_TO_TYPE} may help
in translation between types and indexes: @code{CTF_V2_TYPE_ISPARENT} and
@code{CTF_V2_TYPE_ISCHILD} can be used to tell whether a given ID is in the
parent or child range.
@findex CTF_V2_TYPE_TO_INDEX
@findex CTF_V2_INDEX_TO_TYPE
@findex CTF_V2_TYPE_ISPARENT
@findex CTF_V2_TYPE_ISCHILD
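A minimal sketch of the ID/index distinction, assuming only what is described
above (the top bit of a 32-bit ID marks the child range, and the remaining bits
are the index). The helper names are hypothetical: they are not the
@code{CTF_V2_*} macros themselves.

@verbatim
#include <stdbool.h>
#include <stdint.h>

#define CHILD_BIT 0x80000000u   /* Illustrative name for the top bit.  */

/* True if this ID lies in the child range (2^31 and above).  */
static bool
id_is_child (uint32_t id)
{
  return (id & CHILD_BIT) != 0;
}

/* Shave off the top bit to get the index into the type section.  */
static uint32_t
id_to_index (uint32_t id)
{
  return id & ~CHILD_BIT;
}

/* Turn an index back into an ID for a parent or child dictionary.  */
static uint32_t
index_to_id (uint32_t index, bool child)
{
  return child ? (index | CHILD_BIT) : index;
}
@end verbatim

Note that for parent and standalone dictionaries the ID and the index are
numerically equal, which is exactly why mixing them up goes unnoticed until
child dictionaries appear.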
It is quite possible and indeed common for type IDs to point forward in the
dictionary, as well as backward.
@node Type kinds
@subsection Type kinds
@cindex Type kinds
@cindex Type, kinds of
Every type in CTF is of some @dfn{kind}. Each kind is some variety of C type:
all structures are a single kind, as are all unions, all pointers, all arrays,
all integers regardless of their bitfield width, etc. The kind of a type is
given in the @code{kind} field of the @code{ctt_info} word (@pxref{The info
word}).
The space of type kinds is only a quarter full so far, so there is plenty of
room for expansion. It is likely that in future versions of the file format,
types with smaller kinds will be more efficiently encoded than types with larger
kinds, so their numerical value will actually start to matter in future. (So
these IDs will probably change their numerical values in a later release of this
format, to move more frequently-used kinds like structures and cv-quals towards
the top of the space, and move rarely-used kinds like integers downwards. Yes,
integers are rare: how many kinds of @code{int} are there in a program? They're
just very frequently @emph{referenced}.)
Here's the set of kinds so far. Each kind has a @code{#define} associated with
it, also given here.
@multitable {Kind} {@code{CTF_K_VOLATILE}} {Indicates a type that cannot be represented in CTF, or that} {@xref{Pointers typedefs and cvr-quals}}
@headitem Kind @tab Macro @tab Purpose
@item 0
@tab @code{CTF_K_UNKNOWN}
@tab Indicates a type that cannot be represented in CTF, or that is being skipped.
It is very similar to type ID 0, except that you can have @emph{multiple}, distinct types
of kind @code{CTF_K_UNKNOWN}.
@tindex CTF_K_UNKNOWN
@item 1
@tab @code{CTF_K_INTEGER}
@tab An integer type. @xref{Integer types}.
@item 2
@tab @code{CTF_K_FLOAT}
@tab A floating-point type. @xref{Floating-point types}.
@item 3
@tab @code{CTF_K_POINTER}
@tab A pointer. @xref{Pointers typedefs and cvr-quals}.
@item 4
@tab @code{CTF_K_ARRAY}
@tab An array. @xref{Arrays}.
@item 5
@tab @code{CTF_K_FUNCTION}
@tab A function pointer. @xref{Function pointers}.
@item 6
@tab @code{CTF_K_STRUCT}
@tab A structure. @xref{Structs and unions}.
@item 7
@tab @code{CTF_K_UNION}
@tab A union. @xref{Structs and unions}.
@item 8
@tab @code{CTF_K_ENUM}
@tab An enumerated type. @xref{Enums}.
@item 9
@tab @code{CTF_K_FORWARD}
@tab A forward. @xref{Forward declarations}.
@item 10
@tab @code{CTF_K_TYPEDEF}
@tab A typedef. @xref{Pointers typedefs and cvr-quals}.
@item 11
@tab @code{CTF_K_VOLATILE}
@tab A volatile-qualified type. @xref{Pointers typedefs and cvr-quals}.
@item 12
@tab @code{CTF_K_CONST}
@tab A const-qualified type. @xref{Pointers typedefs and cvr-quals}.
@item 13
@tab @code{CTF_K_RESTRICT}
@tab A restrict-qualified type. @xref{Pointers typedefs and cvr-quals}.
@item 14
@tab @code{CTF_K_SLICE}
@tab A slice, a change of the bit-width or offset of some other type. @xref{Slices}.
@end multitable
Now we cover all type kinds in turn. Some are more complicated than others.
@node Integer types
@subsection Integer types
@cindex Integer types
@cindex Types, integer
@tindex int
@tindex long
@tindex long long
@tindex short
@tindex char
@tindex bool
@tindex unsigned int
@tindex unsigned long
@tindex unsigned long long
@tindex unsigned short
@tindex unsigned char
@tindex signed int
@tindex signed long
@tindex signed long long
@tindex signed short
@tindex signed char
@cindex CTF_K_INTEGER
Integral types are all represented as types of kind @code{CTF_K_INTEGER}. These
types fill out @code{ctt_size} in the @code{ctf_stype_t} with the size in bytes
of the integral type in question. They are always represented by
@code{ctf_stype_t}, never @code{ctf_type_t}. Their variable-length data is one
@code{uint32_t} in length: @code{vlen} in the info word should be disregarded
and is always zero.
The variable-length data for integers has multiple items packed into it much
like the info word does.
@multitable {Bit offset} {Encoding} {The integer encoding and desired display representation.}
@headitem Bit offset @tab Name @tab Description
@item 24--31
@tab Encoding
@tab The desired display representation of this integer. You can extract this
field with the @code{CTF_INT_ENCODING} macro. See below.
@findex CTF_INT_ENCODING
@item 16--23
@tab Offset
@tab The offset of this integral type in bits from the start of its enclosing
structure field, adjusted for endianness: @pxref{Structs and unions}. You can
extract this field with the @code{CTF_INT_OFFSET} macro.
@findex CTF_INT_OFFSET
@item 0--15
@tab Bit-width
@tab The width of this integral type in bits. You can extract this field with
the @code{CTF_INT_BITS} macro.
@findex CTF_INT_BITS
@end multitable
If you choose, bitfields can be represented using the things above as a sort of
integral type with the @code{isroot} bit flipped off and the offset and bits
values set in the vlen word: you can populate it with the @code{CTF_INT_DATA}
macro. (But it may be more convenient to represent them using slices of a
full-width integer: @pxref{Slices}.)
@findex CTF_INT_DATA
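The packing of this word can be sketched as follows. This only illustrates the
bit layout in the table above, using hypothetical function names in place of
the real macros.

@verbatim
#include <stdint.h>

/* Hypothetical helpers: encoding in bits 24-31, offset in bits 16-23,
   bit-width in bits 0-15.  */

static uint32_t
int_data (uint32_t encoding, uint32_t offset, uint32_t bits)
{
  return ((encoding & 0xffu) << 24)
    | ((offset & 0xffu) << 16)
    | (bits & 0xffffu);
}

static uint32_t
int_encoding (uint32_t data)
{
  return (data >> 24) & 0xffu;
}

static uint32_t
int_offset (uint32_t data)
{
  return (data >> 16) & 0xffu;
}

static uint32_t
int_bits (uint32_t data)
{
  return data & 0xffffu;
}
@end verbatim

A plain 32-bit signed @code{int} would then be @code{int_data (0x01, 0, 32)},
and a 3-bit signed bitfield starting at bit 5 would be
@code{int_data (0x01, 5, 3)} on a non-root type.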
Integers that are bitfields usually have a @code{ctt_size} rounded up to the
nearest power of two in bytes, for natural alignment (e.g. a 17-bit integer
would have a @code{ctt_size} of 4). However, not all types are naturally
aligned on all architectures: packed structures may in theory use integral
bitfields with different @code{ctt_size}, though this is rarely observed.
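The usual rounding can be sketched like this. It assumes the natural-alignment
case only, and the function name is made up for illustration: packed
structures may not follow this rule.

@verbatim
#include <stdint.h>

/* Round a bit-width up to the nearest power-of-two number of bytes.  */
static uint32_t
natural_size_bytes (uint32_t bits)
{
  uint32_t bytes = (bits + 7) / 8;   /* Whole bytes needed.  */
  uint32_t size = 1;

  while (size < bytes)
    size *= 2;
  return size;
}
@end verbatim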
The @dfn{encoding} for integers is a bit-field composed of the values below,
which consumers can use to decide how to display values of this type:
@multitable {Offset} {@code{CTF_INT_VARARGS}} {If set, this is a char type. It is platform-dependent whether unadorned}
@headitem Offset @tab Name @tab Description
@item 0x01
@tab @code{CTF_INT_SIGNED}
@tab If set, this is a signed int: if false, unsigned.
@tindex CTF_INT_SIGNED
@item 0x02
@tab @code{CTF_INT_CHAR}
@tab If set, this is a char type. It is platform-dependent whether unadorned
@code{char} is signed or not: the @code{CTF_CHAR} macro produces an integral
type suitable for the definition of @code{char} on this platform.
@tindex CTF_INT_CHAR
@findex CTF_CHAR
@item 0x04
@tab @code{CTF_INT_BOOL}
@tab If set, this is a boolean type. (It is theoretically possible to turn this
and @code{CTF_INT_CHAR} on at the same time, but it is not clear what this would
mean.)
@tindex CTF_INT_BOOL
@item 0x08
@tab @code{CTF_INT_VARARGS}
@tab If set, this is a varargs-promoted value in a K&R function definition.
This is not currently produced or consumed by anything that we know of: it is set
aside for future use.
@end multitable
The GCC ``@code{Complex int}'' and fixed-point extensions are not yet supported:
references to such types will be emitted as type 0.
@node Floating-point types
@subsection Floating-point types
@cindex Floating-point types
@cindex Types, floating-point
@tindex float
@tindex double
@tindex signed float
@tindex signed double
@tindex unsigned float
@tindex unsigned double
@tindex Complex, float
@tindex Complex, double
@tindex Complex, signed float
@tindex Complex, signed double
@tindex Complex, unsigned float
@tindex Complex, unsigned double
@cindex CTF_K_FLOAT
Floating-point types are all represented as types of kind @code{CTF_K_FLOAT}.
Like integers, these types fill out @code{ctt_size} in the @code{ctf_stype_t}
with the size in bytes of the floating-point type in question. They are always
represented by @code{ctf_stype_t}, never @code{ctf_type_t}.
This part of CTF shows many rough edges in the more obscure corners of
floating-point handling, and is likely to change in format v4.
The variable-length data for floats has multiple items packed into it just like
integers do:
@multitable {Bit offset} {Encoding} {The floating-point encoding and desired display representation.}
@headitem Bit offset @tab Name @tab Description
@item 24--31
@tab Encoding
@tab The desired display representation of this float. You can extract this
field with the @code{CTF_FP_ENCODING} macro. See below.
@findex CTF_FP_ENCODING
@item 16--23
@tab Offset
@tab The offset of this floating-point type in bits from the start of its enclosing
structure field, adjusted for endianness: @pxref{Structs and unions}. You can
extract this field with the @code{CTF_FP_OFFSET} macro.
@findex CTF_FP_OFFSET
@item 0--15
@tab Bit-width
@tab The width of this floating-point type in bits. You can extract this field with
the @code{CTF_FP_BITS} macro.
@findex CTF_FP_BITS
@end multitable
The purpose of the floating-point offset and bit-width is somewhat opaque, since
there are no such things as floating-point bitfields in C: the bit-width should
be filled out with the full width of the type in bits, and the offset should
always be zero. It is likely that these fields will go away in the future. As
with integers, you can use @code{CTF_FP_DATA} to assemble one of these vlen
items from its component parts.
@findex CTF_FP_DATA
The @dfn{encoding} for floats is not a bitfield but a simple value indicating
the display representation. Many of these are unused, relate to
Solaris-specific compiler extensions, and will be recycled in future: some are
unused and will become used in future.
@multitable {Offset} {@code{CTF_FP_LDIMAGRY}} {This is a @code{float} interval type, a Solaris-specific extension.}
@headitem Offset @tab Name @tab Description
@item 1
@tab @code{CTF_FP_SINGLE}
@tab This is a single-precision IEEE 754 @code{float}.
@tindex CTF_FP_SINGLE
@item 2
@tab @code{CTF_FP_DOUBLE}
@tab This is a double-precision IEEE 754 @code{double}.
@tindex CTF_FP_DOUBLE
@item 3
@tab @code{CTF_FP_CPLX}
@tab This is a @code{Complex float}.
@tindex CTF_FP_CPLX
@item 4
@tab @code{CTF_FP_DCPLX}
@tab This is a @code{Complex double}.
@tindex CTF_FP_DCPLX
@item 5
@tab @code{CTF_FP_LDCPLX}
@tab This is a @code{Complex long double}.
@tindex CTF_FP_LDCPLX
@item 6
@tab @code{CTF_FP_LDOUBLE}
@tab This is a @code{long double}.
@tindex CTF_FP_LDOUBLE
@item 7
@tab @code{CTF_FP_INTRVL}
@tab This is a @code{float} interval type, a Solaris-specific extension.
Unused: will be recycled.
@tindex CTF_FP_INTRVL
@cindex Unused bits
@item 8
@tab @code{CTF_FP_DINTRVL}
@tab This is a @code{double} interval type, a Solaris-specific extension.
Unused: will be recycled.
@tindex CTF_FP_DINTRVL
@cindex Unused bits
@item 9
@tab @code{CTF_FP_LDINTRVL}
@tab This is a @code{long double} interval type, a Solaris-specific extension.
Unused: will be recycled.
@tindex CTF_FP_LDINTRVL
@cindex Unused bits
@item 10
@tab @code{CTF_FP_IMAGRY}
@tab This is the imaginary part of a @code{Complex float}. Not currently
generated. May change.
@tindex CTF_FP_IMAGRY
@cindex Unused bits
@item 11
@tab @code{CTF_FP_DIMAGRY}
@tab This is the imaginary part of a @code{Complex double}. Not currently
generated. May change.
@tindex CTF_FP_DIMAGRY
@cindex Unused bits
@item 12
@tab @code{CTF_FP_LDIMAGRY}
@tab This is the imaginary part of a @code{Complex long double}. Not currently
generated. May change.
@tindex CTF_FP_LDIMAGRY
@cindex Unused bits
@end multitable
The use of the complex floating-point encodings is obscure: it is possible that
@code{CTF_FP_CPLX} is meant to be used for only the real part of complex types,
and @code{CTF_FP_IMAGRY} et al for the imaginary part -- but for now, we are
emitting @code{CTF_FP_CPLX} to cover the entire type, with no way to get at its
constituent parts. There appear to be no uses of these encodings anywhere, so
they are quite likely to change incompatibly in future.
@node Slices
@subsection Slices
@cindex Slices
@cindex Types, slices of integral
@tindex CTF_K_SLICE
Slices, with kind @code{CTF_K_SLICE}, are an unusual CTF construct: they do not
directly correspond to any C type, but are a way to model other types in a more
convenient fashion for CTF generators.
Slices are like pointers and other reference types in that they are always
represented by @code{ctf_stype_t}: but unlike pointers and other reference
types, they populate the @code{ctt_size} field just like integral types do, and
come with an attached encoding and transform the encoding of the underlying
type. The underlying type is described in the variable-length data, similarly
to structure and union fields: see below. Requests for the type size should
also chase down to the referenced type.
Slices are always nameless: @code{ctt_name} is always zero for them.
(The @code{libctf} API behaviour is unusual as well, and justifies the existence
of slices: @code{ctf_type_kind} never returns @code{CTF_K_SLICE} but always the
underlying type kind, so that consumers never need to know about slices: they
can tell if an apparent integer is actually a slice if they need to by calling
@code{ctf_type_reference}, which will uniquely return the underlying integral
type rather than erroring out with @code{ECTF_NOTREF} if this is actually a
slice. So slices act just like an integer with an encoding, but more closely
mirror DWARF and other debugging information formats by allowing CTF file
creators to represent a bitfield as a slice of an underlying integral type.)
@findex Slices, effect on ctf_type_kind
@findex Slices, effect on ctf_type_reference
@findex libctf, effect of slices
The vlen in the info word for a slice should be ignored and is always zero. The
variable-length data for a slice is a single @code{ctf_slice_t}:
@verbatim
typedef struct ctf_slice
{
  uint32_t cts_type;
  unsigned short cts_offset;
  unsigned short cts_bits;
} ctf_slice_t;
@end verbatim
@tindex struct ctf_slice
@tindex ctf_slice_t
@multitable {Offset} {@code{unsigned short cts_offset}} {The type this slice is a slice of. Must be an}
@headitem Offset @tab Name @tab Description
@item 0x0
@tab @code{uint32_t cts_type}
@vindex cts_type
@vindex struct ctf_slice, cts_type
@vindex ctf_slice_t, cts_type
@tab The type this slice is a slice of. Must be an integral type (or a
floating-point type, but this nonsensical option will go away in v4.)
@item 0x4
@tab @code{unsigned short cts_offset}
@vindex cts_offset
@vindex struct ctf_slice, cts_offset
@vindex ctf_slice_t, cts_offset
@tab The offset of this integral type in bits from the start of its enclosing
structure field, adjusted for endianness: @pxref{Structs and unions}. Identical
semantics to the @code{CTF_INT_OFFSET} field: @pxref{Integer types}. This field
is much too long, because the maximum possible offset of an integral type would
easily fit in a char: this field is bigger just for the sake of alignment. This
will change in v4.
@item 0x6
@tab @code{unsigned short cts_bits}
@vindex cts_bits
@vindex struct ctf_slice, cts_bits
@vindex ctf_slice_t, cts_bits
@tab The bit-width of this integral type. Identical semantics to the
@code{CTF_INT_BITS} field: @pxref{Integer types}. As above, this field is
really too large and will shrink in v4.
@end multitable
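The offsets in the table above can be checked mechanically. The sketch below
repeats the structure definition so that it stands alone, and assumes a 16-bit
@code{unsigned short}, as on all common ABIs. @code{slice_of} is a hypothetical
helper, not part of any real API.

@verbatim
#include <stddef.h>
#include <stdint.h>

typedef struct ctf_slice
{
  uint32_t cts_type;
  unsigned short cts_offset;
  unsigned short cts_bits;
} ctf_slice_t;

/* Field offsets from the table, verified at compile time.  */
_Static_assert (offsetof (ctf_slice_t, cts_type) == 0x0, "cts_type");
_Static_assert (offsetof (ctf_slice_t, cts_offset) == 0x4, "cts_offset");
_Static_assert (offsetof (ctf_slice_t, cts_bits) == 0x6, "cts_bits");
_Static_assert (sizeof (ctf_slice_t) == 8, "ctf_slice_t size");

/* Describe a bitfield as a slice of an existing integral type.  */
static ctf_slice_t
slice_of (uint32_t int_type_id, unsigned short bit_offset,
          unsigned short bit_width)
{
  ctf_slice_t s;

  s.cts_type = int_type_id;
  s.cts_offset = bit_offset;
  s.cts_bits = bit_width;
  return s;
}
@end verbatim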
@node Pointers typedefs and cvr-quals
@subsection Pointers, typedefs, and cvr-quals
@cindex Pointers
@cindex Typedefs
@cindex cvr-quals
@tindex typedef
@tindex const
@tindex volatile
@tindex restrict
@tindex CTF_K_POINTER
@tindex CTF_K_TYPEDEF
@tindex CTF_K_CONST
@tindex CTF_K_VOLATILE
@tindex CTF_K_RESTRICT
Pointers, @code{typedef}s, and @code{const}, @code{volatile} and @code{restrict}
qualifiers are represented identically except for their type kind (though they
may be treated differently by consuming libraries like @code{libctf}, since
pointers affect assignment-compatibility in ways cvr-quals do not, and they may
have different alignment requirements, etc).
All of these are represented by @code{ctf_stype_t}, have no variable data at
all, and populate @code{ctt_type} with the type ID of the type they point
to. These types can stack: a @code{CTF_K_RESTRICT} can point to a
@code{CTF_K_CONST} which can point to a @code{CTF_K_POINTER} etc.
Apart from typedefs, whose @code{ctt_name} gives the name being defined, they
are all unnamed: @code{ctt_name} is 0.
The size of @code{CTF_K_POINTER} is derived from the data model (@pxref{Data
models}), i.e. in practice, from the target machine ABI, and is not explicitly
represented. The size of other kinds in this set should be determined by
chasing ctf_types as necessary until a non-typedef/const/volatile/restrict is
found, and using that.
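The size-chasing rule might be sketched as below. This is a toy model, not how
@code{libctf} stores types: @code{struct toy_type} and @code{toy_size} are
invented for illustration, types are held in a plain array indexed by type
index, and only a standalone dictionary (no child range) is assumed.

@verbatim
#include <stdint.h>

/* Kind values from the table of type kinds.  */
enum
{
  K_INTEGER = 1, K_POINTER = 3, K_TYPEDEF = 10,
  K_VOLATILE = 11, K_CONST = 12, K_RESTRICT = 13
};

/* Toy type record: for reference kinds, ref_or_size is the referenced
   type index; for everything else it is the size in bytes.  */
struct toy_type
{
  uint32_t kind;
  uint32_t ref_or_size;
};

/* Return the size in bytes of type INDEX, or 0 if it cannot be found.
   POINTER_SIZE stands in for the value taken from the data model.  */
static uint32_t
toy_size (const struct toy_type *types, uint32_t ntypes, uint32_t index,
          uint32_t pointer_size)
{
  uint32_t hops = 0;

  /* The hop count guards against malformed, cyclic chains.  */
  while (index != 0 && index < ntypes && hops++ < ntypes)
    {
      const struct toy_type *t = &types[index];

      switch (t->kind)
        {
        case K_TYPEDEF:
        case K_VOLATILE:
        case K_CONST:
        case K_RESTRICT:
          index = t->ref_or_size;   /* Keep chasing.  */
          break;
        case K_POINTER:
          return pointer_size;      /* Not stored in the type itself.  */
        default:
          return t->ref_or_size;    /* A type with a real size.  */
        }
    }
  return 0;
}
@end verbatim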
@node Arrays
@subsection Arrays
@cindex Arrays
Arrays are encoded as types of kind @code{CTF_K_ARRAY} in a @code{ctf_stype_t}.
The @code{ctt_size} for arrays is zero. The variable-length data is a
@code{ctf_array_t}: @code{vlen} in the info word should be disregarded and is
always zero.
@verbatim
typedef struct ctf_array
{
  uint32_t cta_contents;
  uint32_t cta_index;
  uint32_t cta_nelems;
} ctf_array_t;
@end verbatim
@tindex struct ctf_array
@tindex ctf_array_t
@multitable {Offset} {@code{unsigned short cta_contents}} {The type of the array index: a type ID of an}
@headitem Offset @tab Name @tab Description
@item 0x0
@tab @code{uint32_t cta_contents}
@vindex cta_contents
@vindex struct ctf_array, cta_contents
@vindex ctf_array_t, cta_contents
@tab The type of the array elements: a type ID.
@item 0x4
@tab @code{uint32_t cta_index}
@vindex cta_index
@vindex struct ctf_array, cta_index
@vindex ctf_array_t, cta_index
@tab The type of the array index: a type ID of an integral type.
If this is a variable-length array, the index type ID will be 0
(but the actual index type of this array is probably @code{int}).
Probably redundant and may be dropped in v4.
@item 0x8
@tab @code{uint32_t cta_nelems}
@vindex cta_nelems
@vindex struct ctf_array, cta_nelems
@vindex ctf_array_t, cta_nelems
@tab The number of array elements. 0 for VLAs, and also for
the historical variety of VLA which has explicit zero dimensions (which will
have a nonzero @code{cta_index}.)
@end multitable
The size of an array can be computed by simple multiplication of the size of the
@code{cta_contents} type by the @code{cta_nelems}.
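In code, that multiplication is a one-liner. The sketch assumes the caller has
already resolved the size of the @code{cta_contents} type: the function name is
hypothetical.

@verbatim
#include <stdint.h>

typedef struct ctf_array
{
  uint32_t cta_contents;
  uint32_t cta_index;
  uint32_t cta_nelems;
} ctf_array_t;

/* ELEM_SIZE is the size in bytes of the cta_contents type.  The result
   is widened to 64 bits so that large arrays do not overflow.  */
static uint64_t
array_size_bytes (const ctf_array_t *arr, uint64_t elem_size)
{
  return elem_size * arr->cta_nelems;
}
@end verbatim

A VLA, with @code{cta_nelems} of 0, naturally comes out with a size of zero.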
@node Function pointers
@subsection Function pointers
@cindex Function pointers
@cindex Pointers, to functions
Function pointers are explicitly represented in the CTF type section by a type
of kind @code{CTF_K_FUNCTION}, always encoded with a @code{ctf_stype_t}. The
@code{ctt_type} is the function return type ID. The @code{vlen} in the info
word is the number of arguments, each of which is a type ID, a @code{uint32_t}:
if the last argument is 0, this is a varargs function and the number of
arguments is one less than indicated by the vlen.
If the number of arguments is odd, a single @code{uint32_t} of padding is
inserted to maintain alignment.
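A reader of the argument list might handle the padding and the varargs marker
roughly as follows. The helper names are invented for this sketch.

@verbatim
#include <stdint.h>

/* Number of uint32_t words occupied by the argument list, including
   the single word of padding added when VLEN is odd.  */
static uint32_t
func_vlen_words (uint32_t vlen)
{
  return vlen + (vlen & 1u);
}

/* A trailing argument of type ID 0 marks a varargs function.  */
static int
func_is_varargs (const uint32_t *args, uint32_t vlen)
{
  return vlen > 0 && args[vlen - 1] == 0;
}

/* Number of real (named) arguments.  */
static uint32_t
func_nargs (const uint32_t *args, uint32_t vlen)
{
  return func_is_varargs (args, vlen) ? vlen - 1 : vlen;
}
@end verbatim

So a function taking two named arguments followed by @code{...} has a vlen of
3, two real arguments, and occupies four words once the padding is counted.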
@node Enums
@subsection Enums
@cindex Enums
@tindex enum
@tindex CTF_K_ENUM
Enumerated types are represented as types of kind @code{CTF_K_ENUM} in a
@code{ctf_stype_t}. The @code{ctt_size} is always the size of an int from the
data model (enum bitfields are implemented via slices). The @code{vlen} is a
count of enumerations, each of which is represented by a @code{ctf_enum_t} in
the vlen:
@verbatim
typedef struct ctf_enum
{
  uint32_t cte_name;
  int32_t cte_value;
} ctf_enum_t;
@end verbatim
@tindex struct ctf_enum
@tindex ctf_enum_t
@multitable {Offset} {@code{int32_t cte_value}} {Strtab offset of the enumeration name.}
@headitem Offset @tab Name @tab Description
@item 0x0
@tab @code{uint32_t cte_name}
@vindex cte_name
@vindex struct ctf_enum, cte_name
@vindex ctf_enum_t, cte_name
@tab Strtab offset of the enumeration name. Must not be 0.
@item 0x4
@tab @code{int32_t cte_value}
@vindex cte_value
@vindex struct ctf_enum, cte_value
@vindex ctf_enum_t, cte_value
@tab The enumeration value.
@end multitable
Enumeration values larger than @math{2^32} are not yet supported and are omitted
from the enumeration. (v4 will lift this restriction by encoding the value
differently.)
Forward declarations of enums are not implemented with this kind: @pxref{Forward
declarations}.
Enumerated type names, as usual in C, go into their own namespace, and do not
conflict with non-enums, structs, or unions with the same name.
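Walking the enumerators is straightforward. The sketch below maps a value back
to its name, under the simplifying assumption that every @code{cte_name} is a
plain byte offset into a single in-memory string table: @code{enum_name} is a
hypothetical helper, not a @code{libctf} function.

@verbatim
#include <stddef.h>
#include <stdint.h>

typedef struct ctf_enum
{
  uint32_t cte_name;
  int32_t cte_value;
} ctf_enum_t;

/* Return the name of the first enumerator with the given VALUE, or
   NULL if there is none.  VLEN is the count from the info word.  */
static const char *
enum_name (const ctf_enum_t *en, uint32_t vlen, const char *strtab,
           int32_t value)
{
  for (uint32_t i = 0; i < vlen; i++)
    if (en[i].cte_value == value)
      return strtab + en[i].cte_name;
  return NULL;
}
@end verbatim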
@node Structs and unions
@subsection Structs and unions
@cindex Structures
@cindex Unions
@tindex struct
@tindex union
@tindex CTF_K_STRUCT
@tindex CTF_K_UNION
Structures and unions are represented as types of kind @code{CTF_K_STRUCT} and
@code{CTF_K_UNION}: their representation is otherwise identical, and it is
perfectly allowed for ``structs'' to contain overlapping fields etc, so we will
treat them together for the rest of this section.
They fill out @code{ctt_size}, and use @code{ctf_type_t} in preference to
@code{ctf_stype_t} if the structure size is greater than @code{CTF_MAX_SIZE}
(0xfffffffe).
@tindex CTF_MAX_SIZE
The vlen for structures and unions is a count of structure fields, but the type
used to represent a structure field (and thus the size of the variable-length
array element representing the type) depends on the size of the structure: truly
huge structures, greater than @code{CTF_LSTRUCT_THRESH} bytes in length, use a
different type. (@code{CTF_LSTRUCT_THRESH} is 536870912, so such structures are
vanishingly rare: in v4, this representation will change somewhat for greater
compactness. It's inherited from v1, where the limits were much lower.)
@tindex CTF_LSTRUCT_THRESH
Most structures can get away with using @code{ctf_member_t}:
@verbatim
typedef struct ctf_member_v2
{
  uint32_t ctm_name;
  uint32_t ctm_offset;
  uint32_t ctm_type;
} ctf_member_t;
@end verbatim
Huge structures that are represented by @code{ctf_type_t} rather than
@code{ctf_stype_t} have to use @code{ctf_lmember_t}, which splits the offset as
@code{ctf_type_t} splits the size:
@verbatim
typedef struct ctf_lmember_v2
{
  uint32_t ctlm_name;
  uint32_t ctlm_offsethi;
  uint32_t ctlm_type;
  uint32_t ctlm_offsetlo;
} ctf_lmember_t;
@end verbatim
Here's what the fields of @code{ctf_member} mean:
@tindex struct ctf_member_v2
@tindex ctf_member_t
@multitable {Offset} {@code{uint32_t ctm_offset}} {The offset of this field @emph{in bits}. (Usually, for bitfields, this is}
@headitem Offset @tab Name @tab Description
@item 0x00
@tab @code{uint32_t ctm_name}
@vindex ctm_name
@vindex struct ctf_member_v2, ctm_name
@vindex ctf_member_t, ctm_name
@tab Strtab offset of the field name.
@item 0x04
@tab @code{uint32_t ctm_offset}
@vindex ctm_offset
@vindex struct ctf_member_v2, ctm_offset
@vindex ctf_member_t, ctm_offset
@tab The offset of this field @emph{in bits}. (Usually, for bitfields, this is
machine-word-aligned and the individual field has an offset in bits, but
the format allows for the offset to be encoded in bits here.)
@item 0x08
@tab @code{uint32_t ctm_type}
@vindex ctm_type
@vindex struct ctf_member_v2, ctm_type
@vindex ctf_member_t, ctm_type
@tab The type ID of the type of the field.
@end multitable
Here's what the fields of the very similar @code{ctf_lmember} mean:
@tindex struct ctf_lmember_v2
@tindex ctf_lmember_t
@multitable {Offset} {@code{uint32_t ctlm_offsethi}} {The offset of this field @emph{in bits}. (Usually, for bitfields, this is}
@headitem Offset @tab Name @tab Description
@item 0x00
@tab @code{uint32_t ctlm_name}
@vindex ctlm_name
@vindex struct ctf_lmember_v2, ctlm_name
@vindex ctf_lmember_t, ctlm_name
@tab Strtab offset of the field name.
@item 0x04
@tab @code{uint32_t ctlm_offsethi}
@vindex ctlm_offsethi
@vindex struct ctf_lmember_v2, ctlm_offsethi
@vindex ctf_lmember_t, ctlm_offsethi
@tab The high 32 bits of the offset of this field in bits.
@item 0x08
@tab @code{uint32_t ctlm_type}
@vindex ctlm_type
@vindex struct ctf_lmember_v2, ctlm_type
@vindex ctf_lmember_t, ctlm_type
@tab The type ID of the type of the field.
@item 0x0c
@tab @code{uint32_t ctlm_offsetlo}
@vindex ctlm_offsetlo
@vindex struct ctf_lmember_v2, ctlm_offsetlo
@vindex ctf_lmember_t, ctlm_offsetlo
@tab The low 32 bits of the offset of this field in bits.
@end multitable
Macros @code{CTF_LMEM_OFFSET}, @code{CTF_OFFSET_TO_LMEMHI} and
@code{CTF_OFFSET_TO_LMEMLO} serve to extract and install the values of the
@code{ctlm_offset} fields, much as with the split size fields in
@code{ctf_type_t}.
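What those macros do can be sketched with ordinary functions. The names below
are hypothetical and stand in for the real macros: the structure definition is
repeated so that the example is self-contained.

@verbatim
#include <stdint.h>

typedef struct ctf_lmember_v2
{
  uint32_t ctlm_name;
  uint32_t ctlm_offsethi;
  uint32_t ctlm_type;
  uint32_t ctlm_offsetlo;
} ctf_lmember_t;

/* Reassemble the 64-bit offset (in bits) of a large-structure member.  */
static uint64_t
lmember_offset (const ctf_lmember_t *m)
{
  return ((uint64_t) m->ctlm_offsethi << 32) | m->ctlm_offsetlo;
}

/* Store a 64-bit bit offset into the two split fields.  */
static void
lmember_set_offset (ctf_lmember_t *m, uint64_t bit_offset)
{
  m->ctlm_offsethi = (uint32_t) (bit_offset >> 32);
  m->ctlm_offsetlo = (uint32_t) (bit_offset & 0xffffffffu);
}
@end verbatim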
Unnamed structure and union fields are simply implemented by collapsing the
unnamed field's members into the containing structure or union: this does mean
that a structure containing an unnamed union can end up being a ``structure''
with multiple members at the same offset. (A future format revision may
collapse @code{CTF_K_STRUCT} and @code{CTF_K_UNION} into the same kind and
decide among them based on whether their members do in fact overlap.)
Structure and union type names, as usual in C, go into their own namespace,
just as enum type names do.
Forward declarations of structures and unions are not implemented with this
kind: @pxref{Forward declarations}.
@node Forward declarations
@subsection Forward declarations
@cindex Forwards
@tindex enum
@tindex struct
@tindex union
@tindex CTF_K_FORWARD
When the compiler encounters a forward declaration of a struct, union, or enum,
it emits a type of kind @code{CTF_K_FORWARD}. If it later encounters a
non-forward declaration of the same thing, it marks the forward as
non-root-visible: before link time, therefore, non-root-visible forwards
indicate that a non-forward is coming.
After link time, forwards are fused with their corresponding non-forwards by the
deduplicator where possible. They are kept if there is no non-forward
definition (maybe it's not visible from any TU at all) or if @emph{multiple}
conflicting structures with the same name might match it. Otherwise, all other
forwards are converted to structures, unions, or enums as appropriate, even
across TUs if only one structure could correspond to the forward (after all,
all types across all TUs land in the same dictionary unless they conflict,
so promoting forwards to their concrete type seems most helpful).
A forward has a rather strange representation: it is encoded with a
@code{ctf_stype_t} but the @code{ctt_type} is populated not with a type (if it's
a forward, we don't have an underlying type yet: if we did, we'd have promoted
it and this wouldn't be a forward any more) but with the @code{kind} of the
forward. This means that we can distinguish forwards to structs, enums and
unions reliably and ensure they land in the appropriate namespace even before
the actual struct, union or enum is found.
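A consumer might interpret the @code{ctt_type} of a forward along these lines.
This is only a sketch with an invented function name: what to do with a forward
whose stored kind is none of the three is left open here, and the sketch simply
reports it as unrecognized.

@verbatim
#include <stddef.h>
#include <stdint.h>

/* Kind values from the table of type kinds.  */
enum { K_STRUCT = 6, K_UNION = 7, K_ENUM = 8 };

/* Given the ctt_type field of a CTF_K_FORWARD, return the C keyword
   naming the namespace the forward lands in, or NULL if the stored
   kind is not one of the three expected ones.  */
static const char *
forward_keyword (uint32_t ctt_type)
{
  switch (ctt_type)
    {
    case K_STRUCT:
      return "struct";
    case K_UNION:
      return "union";
    case K_ENUM:
      return "enum";
    default:
      return NULL;
    }
}
@end verbatim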
@node The symtypetab sections
@section The symtypetab sections
@cindex Symtypetab section
@cindex Sections, symtypetab
@cindex Function info section
@cindex Sections, function info
@cindex Data object section
@cindex Sections, data object
@cindex Function info index section
@cindex Sections, function info index
@cindex Data object index section
@cindex Sections, data object index
@tindex CTF_F_IDXSORTED
@tindex CTF_F_DYNSTR
@cindex Bug workarounds, CTF_F_DYNSTR
These are two very simple sections with identical formats, used by consumers to
map from ELF function and data symbols directly to their types. So they are
usually populated only in CTF sections that are embedded in ELF objects.
Their format is very simple: an array of type IDs. Which symbol each type ID
corresponds to depends on whether the optional @emph{index section} associated
with this symtypetab section has any content.
If the index section is nonempty, it is an array of @code{uint32_t} string table
offsets, each giving the name of the symbol whose type is at the same offset in
the corresponding non-index section: users can look up symbols in such a table
by name. The index section and corresponding symtypetab section are usually
ASCIIbetically sorted (indicated by the @code{CTF_F_IDXSORTED} flag in the
header): if they are sorted, the index can be bsearched for a symbol name
rather than having to use a slower linear search.

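
The sorted case can be sketched as follows. This is a minimal illustration, not
the @code{libctf} implementation: the function name @code{symtypetab_lookup} is
hypothetical, and for simplicity every index entry is assumed to be an offset
into one string table held in memory.

@verbatim
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Look up NAME in an indexed symtypetab whose index is sorted (the
   CTF_F_IDXSORTED case).  IDX holds strtab offsets, TYPES holds the type
   IDs at the same positions, and both have N entries.  Returns the type
   ID, or 0 if the symbol is not present.  */
static uint32_t
symtypetab_lookup (const uint32_t *idx, const uint32_t *types, size_t n,
                   const char *strtab, const char *name)
{
  size_t lo = 0, hi = n;

  while (lo < hi)
    {
      size_t mid = lo + (hi - lo) / 2;
      int cmp = strcmp (name, strtab + idx[mid]);

      if (cmp == 0)
        return types[mid];
      if (cmp < 0)
        hi = mid;
      else
        lo = mid + 1;
    }
  return 0;
}
@end verbatim

If the flag is not set, the same arrays would have to be scanned linearly
instead.
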
If the index section is empty, the entries in the data object and
function info sections are associated 1:1 with ELF symbols of type
@code{STT_OBJECT} (for data object) or @code{STT_FUNC} (for function info) with
a nonzero value: the linker shuffles the symtypetab sections to correspond with
the order of the symbols in the ELF file. Symbols with no name, undefined
symbols and symbols named ``@code{_START_}'' and ``@code{_END_}'' are skipped
and never appear in either section. Symbols that have no corresponding type are
represented by type ID 0. The section may have fewer entries than the symbol
table, in which case no later entries have associated types. This format is
more compact than an indexed form if most entries have types (since there is no
need to record any symbol names), but if the producer and consumer disagree even
slightly about which symbols are omitted, the types of all further symbols will
be wrong!
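
To make the fragility of the unindexed form concrete, here is a minimal sketch
of how a consumer might find the type of one symbol. The @code{sketch_sym}
structure is a hypothetical, simplified stand-in for a real ELF symbol, and
none of these names are part of @code{libctf}.

@verbatim
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { SKETCH_STT_OBJECT = 1, SKETCH_STT_FUNC = 2 };

/* Simplified, hypothetical view of an ELF symbol; a real consumer would
   read these fields from the ELF symbol table.  */
struct sketch_sym
{
  const char *name;      /* Symbol name ("" if none).  */
  unsigned int type;     /* SKETCH_STT_OBJECT or SKETCH_STT_FUNC.  */
  int defined;           /* Zero for undefined symbols.  */
  uint64_t value;
};

/* Does SYM consume a slot in the unindexed section for symbols of type
   WANT?  */
static int
sym_has_slot (const struct sketch_sym *sym, unsigned int want)
{
  return sym->type == want
    && sym->defined
    && sym->value != 0
    && sym->name[0] != '\0'
    && strcmp (sym->name, "_START_") != 0
    && strcmp (sym->name, "_END_") != 0;
}

/* Return the type ID of symbol number SYMIDX, or 0 if it has none.  */
static uint32_t
unindexed_type (const struct sketch_sym *syms, size_t symidx,
                unsigned int want, const uint32_t *types, size_t ntypes)
{
  size_t slot = 0;

  if (!sym_has_slot (&syms[symidx], want))
    return 0;

  /* Count the earlier symbols that also occupy a slot.  */
  for (size_t i = 0; i < symidx; i++)
    if (sym_has_slot (&syms[i], want))
      slot++;

  /* A short section means that later symbols have no types.  */
  return slot < ntypes ? types[slot] : 0;
}
@end verbatim

Because the slot number is just a count of earlier qualifying symbols, a single
disagreement in @code{sym_has_slot} shifts every later lookup by one.
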
The compiler always emits indexed symtypetab tables, because there is no symbol
table yet. The linker will always have to read them all in and always works
through them from start to end, so there is no benefit having the compiler sort
them either. The linker (actually, @code{libctf}'s linking machinery) will
automatically sort unsorted indexed sections, and convert indexed sections that
contain a lot of pads into the more compact, unindexed form.
If child dicts are in use, only symbols that use types actually mentioned in the
child appear in the child's symtypetab: symbols that use only types in the
parent appear in the parent's symtypetab instead. So the child's symtypetab will
almost always be very sparse, and thus will usually use the indexed form even in
fully linked objects. (It is, of course, impossible for symbols to exist that
use types from multiple child dicts at once, since it's impossible to declare a
function in C that uses types that are only visible in two different, disjoint
translation units.)
@node The variable section
@section The variable section
@cindex Variable section
@cindex Sections, variable
The variable section is a simple array mapping names (strtab entries) to type
IDs, intended to provide a replacement for the data object section in dynamic
situations in which there is no static ELF strtab but the consumer instead hands
back names. The section is sorted into ASCIIbetical order by name for rapid
lookup, like the CTF archive name table.
The section is an array of these structures:
@verbatim
typedef struct ctf_varent
{
uint32_t ctv_name;
uint32_t ctv_type;
} ctf_varent_t;
@end verbatim
@tindex struct ctf_varent
@tindex ctf_varent_t
@multitable {Offset} {@code{uint32_t ctv_name}} {Strtab offset of the name}
@headitem Offset @tab Name @tab Description
@item 0x00
@tab @code{uint32_t ctv_name}
@vindex ctv_name
@vindex struct ctf_varent, ctv_name
@vindex ctf_varent_t, ctv_name
@tab Strtab offset of the name
@item 0x04
@tab @code{uint32_t ctv_type}
@vindex ctv_type
@vindex struct ctf_varent, ctv_type
@vindex ctf_varent_t, ctv_type
@tab Type ID of this variable's type
@end multitable
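
Since the section is sorted by name, a consumer can binary-search it. The
following is a minimal sketch: @code{var_lookup} is a hypothetical helper, not
a @code{libctf} function, and it assumes that every @code{ctv_name} is an
offset into the internal string table.

@verbatim
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef struct ctf_varent
{
  uint32_t ctv_name;
  uint32_t ctv_type;
} ctf_varent_t;

/* Binary-search the N sorted entries in VARS for NAME.  Returns the
   matching entry, or NULL if there is none.  */
static const ctf_varent_t *
var_lookup (const ctf_varent_t *vars, size_t n, const char *strtab,
            const char *name)
{
  while (n > 0)
    {
      const ctf_varent_t *mid = vars + n / 2;
      int cmp = strcmp (name, strtab + mid->ctv_name);

      if (cmp == 0)
        return mid;
      if (cmp > 0)
        {
          /* Continue in the upper half, excluding MID.  */
          vars = mid + 1;
          n -= n / 2 + 1;
        }
      else
        n /= 2;
    }
  return NULL;
}
@end verbatim
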
There is no analogue of the function info section yet: v4 will probably drop
this section in favour of a way to put both indexed (thus, named) and nonindexed
symbols into the symtypetab sections at the same time.
@node The label section
@section The label section
@cindex Label section
@cindex Sections, label
The label section is a currently-unused facility allowing the tiling of the type
space with names taken from the strtab. The section is an array of these
structures:
@verbatim
typedef struct ctf_lblent
{
uint32_t ctl_label;
uint32_t ctl_type;
} ctf_lblent_t;
@end verbatim
@tindex struct ctf_lblent
@tindex ctf_lblent_t
@multitable {Offset} {@code{uint32_t ctl_label}} {Strtab offset of the label}
@headitem Offset @tab Name @tab Description
@item 0x00
@tab @code{uint32_t ctl_label}
@vindex ctl_label
@vindex struct ctf_lblent, ctl_label
@vindex ctf_lblent_t, ctl_label
@tab Strtab offset of the label
@item 0x04
@tab @code{uint32_t ctl_type}
@vindex ctl_type
@vindex struct ctf_lblent, ctl_type
@vindex ctf_lblent_t, ctl_type
@tab Type ID of the last type covered by this label
@end multitable
Semantics will be attached to labels soon, probably in v4 (the plan is to use
them to allow multiple disjoint namespaces in a single CTF file, removing many
uses of CTF archives, in particular in the @code{.ctf} section in ELF objects).
@node The string section
@section The string section
@cindex String section
@cindex Sections, string
This section is a simple ELF-format strtab, starting with a zero byte (thus
ensuring that the string with offset 0 is the null string, as assumed elsewhere
in this spec). The strtab is usually ASCIIbetically sorted to somewhat improve
compression efficiency.
Where the strtab is unusual is the @emph{references} to it. CTF has two
string tables, the internal strtab and an external strtab associated
with the CTF dictionary at open time: usually, this is the ELF dynamic
strtab (@code{.dynstr}) of a CTF dictionary embedded in an ELF file. We
distinguish between these strtabs by the most significant bit, bit 31,
of the 32-bit strtab references: if it is 0, the offset is in the
internal strtab: if 1, the offset is in the external strtab.
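
Decoding a reference is then a matter of splitting off the top bit. A minimal
sketch, using hypothetical helper names (@code{strref_table},
@code{strref_offset}, @code{strref_resolve}) rather than anything defined by
@code{libctf}:

@verbatim
#include <stddef.h>
#include <stdint.h>

/* Which string table does the reference REF point into?  0 for the
   internal strtab, 1 for the external one.  */
static unsigned int
strref_table (uint32_t ref)
{
  return ref >> 31;
}

/* The offset within that table: the low 31 bits.  */
static uint32_t
strref_offset (uint32_t ref)
{
  return ref & 0x7fffffffu;
}

/* Resolve REF to a string, or NULL if the needed table is absent.  */
static const char *
strref_resolve (uint32_t ref, const char *internal, const char *external)
{
  const char *tab = strref_table (ref) ? external : internal;

  return tab ? tab + strref_offset (ref) : NULL;
}
@end verbatim
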
@tindex CTF_F_DYNSTR
@cindex Bug workarounds, CTF_F_DYNSTR
There is a bug workaround in this area: in format v3 (the first version
to have working support for external strtabs), the external strtab is
@code{.strtab} unless the @code{CTF_F_DYNSTR} flag is set on the
dictionary (@pxref{CTF file-wide flags}). Format v4 will introduce a
header field that explicitly names the external strtab, making this flag
unnecessary.
@node Data models
@section Data models
@cindex Data models
The data model is a simple integer which indicates the ABI in use on this
platform. Right now, it is very simple, distinguishing only between 32- and
64-bit types: a model of 1 indicates ILP32, 2 indicates LP64. The mapping from
ABI integer to type sizes is hardwired into @code{libctf}: currently, we use
this to hardwire the size of pointers, function pointers, and enumerated types.
This is a very kludgy corner of CTF and will probably be replaced with explicit
header fields to record this sort of thing in future.
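
As a minimal sketch of such a hardwired mapping, covering pointer size only:
the names below are illustrative rather than taken from @code{libctf}, and the
sizes follow from the usual meaning of ILP32 (32-bit pointers) and LP64 (64-bit
pointers).

@verbatim
#include <stddef.h>

enum { SKETCH_MODEL_ILP32 = 1, SKETCH_MODEL_LP64 = 2 };

/* Return the pointer size in bytes implied by data model MODEL, or 0
   for an unrecognized model.  */
static size_t
model_pointer_size (int model)
{
  switch (model)
    {
    case SKETCH_MODEL_ILP32:
      return 4;
    case SKETCH_MODEL_LP64:
      return 8;
    default:
      return 0;
    }
}
@end verbatim
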
@node Limits of CTF
@section Limits of CTF
@cindex Limits
The following limits are imposed by various aspects of CTF version 3:
@table @code
@item CTF_MAX_TYPE
Maximum type identifier (maximum number of types accessible with parent and
child containers in use): 0xfffffffe
@item CTF_MAX_PTYPE
Maximum type identifier in a parent dictionary (also the maximum number of
types in any one dictionary): 0x7fffffff
@item CTF_MAX_NAME
Maximum offset into a string table: 0x7fffffff
@item CTF_MAX_VLEN
Maximum number of members in a struct, union, or enum, and maximum number of
function args: 0xffffff
@item CTF_MAX_SIZE
Maximum size of a @code{ctf_stype_t} in bytes before we fall back to
@code{ctf_type_t}: 0xfffffffe bytes
@end table
Other maxima without associated macros:
@itemize
@item
Maximum value of an enumerated type: 2^32
@item
Maximum size of an array element: 2^32
@end itemize
These maxima are generally considered to be too low, because C programs can and
do exceed them: they will be lifted in format v4.
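
A producer might check some of these limits before emitting a type. The sketch
below copies values from the table above under a @code{SKETCH_} prefix, to make
clear that these are not the real header's macros. It also assumes that type
IDs above the parent limit belong to a child dictionary, which follows from the
parent/child split but is not spelled out in this section.

@verbatim
#include <stdint.h>

/* Values copied from the table above; the SKETCH_ prefix marks these
   as illustrative copies rather than the real header's macros.  */
#define SKETCH_MAX_TYPE  0xfffffffeu
#define SKETCH_MAX_PTYPE 0x7fffffffu
#define SKETCH_MAX_VLEN  0xffffffu

/* Can a type with NMEMBERS members or arguments be encoded at all?  */
static int
vlen_fits (uint64_t nmembers)
{
  return nmembers <= SKETCH_MAX_VLEN;
}

/* Assumption: IDs above the parent limit belong to a child dict.  */
static int
type_id_in_child (uint32_t id)
{
  return id > SKETCH_MAX_PTYPE && id <= SKETCH_MAX_TYPE;
}
@end verbatim
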
@node Index
@unnumbered Index
@printindex cp
@bye