Diffstat (limited to 'clang/docs/AllocToken.rst')
-rw-r--r-- clang/docs/AllocToken.rst | 207
1 files changed, 207 insertions, 0 deletions
diff --git a/clang/docs/AllocToken.rst b/clang/docs/AllocToken.rst
new file mode 100644
index 0000000..bda8466
--- /dev/null
+++ b/clang/docs/AllocToken.rst
@@ -0,0 +1,207 @@
+=================
+Allocation Tokens
+=================
+
+.. contents::
+ :local:
+
+Introduction
+============
+
+Clang provides support for allocation tokens to enable allocator-level heap
+organization strategies. Clang assigns mode-dependent token IDs to allocation
+calls; the runtime behavior depends entirely on the implementation of a
+compatible memory allocator.
+
+Possible allocator strategies include:
+
+* **Security Hardening**: Placing allocations into separate, isolated heap
+  partitions. For example, separating pointer-containing types from raw data
+  can mitigate exploits that rely on overflowing a primitive buffer to corrupt
+  object metadata.
+
+* **Memory Layout Optimization**: Grouping related allocations to improve data
+  locality and cache utilization.
+
+* **Custom Allocation Policies**: Applying different management strategies to
+  different partitions.
+
+Token Assignment Mode
+=====================
+
+The default mode to calculate tokens is:
+
+* ``typehashpointersplit``: This mode assigns a token ID based on the hash of
+  the allocated type's name, where the top half of the ID space is reserved for
+  types that contain pointers and the bottom half for types that do not.
+
+Other token ID assignment modes are supported, but they may be subject to
+change or removal. These may (experimentally) be selected with ``-mllvm
+-alloc-token-mode=<mode>``:
+
+* ``typehash``: This mode assigns a token ID based on the hash of the allocated
+  type's name.
+
+* ``random``: This mode assigns a statically-determined random token ID to each
+  allocation site.
+
+* ``increment``: This mode assigns a simple, incrementally increasing token ID
+  to each allocation site.
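+
+As a rough illustration of how ``typehashpointersplit`` divides the ID space,
+the following C sketch models the split. It is not Clang's implementation:
+``sketch_hash`` (64-bit FNV-1a) and ``sketch_token`` are hypothetical names, the
+hash Clang actually uses is not specified here, and the sketch assumes an even,
+finite token maximum.
+
```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in hash (64-bit FNV-1a). ASSUMPTION: the hash Clang really uses is
   not specified in this document. */
static uint64_t sketch_hash(const char *s) {
  uint64_t h = 0xcbf29ce484222325ULL; /* FNV offset basis */
  for (; *s; ++s) {
    h ^= (unsigned char)*s;
    h *= 0x100000001b3ULL; /* FNV prime */
  }
  return h;
}

/* Hypothetical model of typehashpointersplit: pointer-free types land in
   [0, max/2), pointer-containing types in [max/2, max). */
static size_t sketch_token(const char *type_name, bool contains_pointer,
                           size_t max_tokens) {
  assert(max_tokens >= 2 && max_tokens % 2 == 0);
  size_t half = max_tokens / 2;
  size_t slot = (size_t)(sketch_hash(type_name) % half);
  return contains_pointer ? half + slot : slot;
}
```
+
+With a maximum of 512, ``sketch_token("int", false, 512)`` falls below 256 and
+``sketch_token("struct Node", true, 512)`` falls in the range 256 to 511, so an
+allocator could route the two halves to different heap partitions.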
+
+Allocation Token Instrumentation
+================================
+
+To enable instrumentation of allocation functions, code can be compiled with
+the ``-fsanitize=alloc-token`` flag:
+
+.. code-block:: console
+
+ % clang++ -fsanitize=alloc-token example.cc
+
+The instrumentation transforms allocation calls to include a token ID. For
+example:
+
+.. code-block:: c
+
+ // Original:
+ ptr = malloc(size);
+
+ // Instrumented:
+ ptr = __alloc_token_malloc(size, <token id>);
+
+The following command-line options affect generated token IDs:
+
+* ``-falloc-token-max=<N>``
+    Sets the maximum number of tokens. There is no maximum by default (token IDs
+    are bounded only by ``SIZE_MAX``).
+
+    .. code-block:: console
+
+        % clang++ -fsanitize=alloc-token -falloc-token-max=512 example.cc
+
+Runtime Interface
+-----------------
+
+A compatible runtime must be provided that implements the token-enabled
+allocation functions. The instrumentation generates calls to functions that
+take a final ``size_t token_id`` argument.
+
+.. code-block:: c
+
+ // C standard library functions
+ void *__alloc_token_malloc(size_t size, size_t token_id);
+ void *__alloc_token_calloc(size_t count, size_t size, size_t token_id);
+ void *__alloc_token_realloc(void *ptr, size_t size, size_t token_id);
+ // ...
+
+ // C++ operators (mangled names)
+ // operator new(size_t, size_t)
+ void *__alloc_token__Znwm(size_t size, size_t token_id);
+ // operator new[](size_t, size_t)
+ void *__alloc_token__Znam(size_t size, size_t token_id);
+ // ... other variants like nothrow, etc., are also instrumented.
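+
+To make the interface concrete, here is a minimal runtime sketch in C. It only
+counts allocations per token and forwards to the ordinary libc allocator; a
+real allocator would select a heap partition instead. ``SKETCH_MAX_TOKENS``,
+``sketch_partition_allocs`` and ``sketch_note`` are hypothetical names, and the
+sketch assumes the program was built with ``-falloc-token-max=4``.
+
```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* ASSUMPTION: the program is built with -falloc-token-max=4, so every
   token_id passed in is below SKETCH_MAX_TOKENS. */
#define SKETCH_MAX_TOKENS 4

/* Hypothetical per-token counters standing in for real heap partitions. */
static size_t sketch_partition_allocs[SKETCH_MAX_TOKENS];

static void sketch_note(size_t token_id) {
  assert(token_id < SKETCH_MAX_TOKENS);
  sketch_partition_allocs[token_id]++;
}

/* Entry points named in the runtime interface. Each one records the token
   and then forwards to the matching libc function. */
void *__alloc_token_malloc(size_t size, size_t token_id) {
  sketch_note(token_id);
  return malloc(size);
}

void *__alloc_token_calloc(size_t count, size_t size, size_t token_id) {
  sketch_note(token_id);
  return calloc(count, size);
}

void *__alloc_token_realloc(void *ptr, size_t size, size_t token_id) {
  sketch_note(token_id);
  return realloc(ptr, size);
}
```
+
+Memory returned by these functions is released with the regular ``free``,
+because the sketch never leaves the libc heap.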
+
+Fast ABI
+--------
+
+An alternative ABI can be enabled with ``-fsanitize-alloc-token-fast-abi``,
+which encodes the token ID hint in the allocation function name.
+
+.. code-block:: c
+
+ void *__alloc_token_0_malloc(size_t size);
+ void *__alloc_token_1_malloc(size_t size);
+ void *__alloc_token_2_malloc(size_t size);
+ ...
+ void *__alloc_token_0_Znwm(size_t size);
+ void *__alloc_token_1_Znwm(size_t size);
+ void *__alloc_token_2_Znwm(size_t size);
+ ...
+
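+This ABI avoids passing the token ID as an argument. It is intended for small
+values of ``-falloc-token-max``, because the runtime must provide one entry
+point per token ID for each allocation function.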
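+
+One way a runtime might supply the per-token entry points is to generate them
+with a macro, sketched below for three token IDs. The macro
+``SKETCH_DEFINE_FAST_MALLOC``, the counter array and ``sketch_fast_count`` are
+hypothetical, and forwarding to ``malloc`` stands in for real partition
+selection.
+
```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define SKETCH_FAST_TOKENS 3 /* ASSUMPTION: built with -falloc-token-max=3 */

/* Hypothetical per-token counters standing in for real heap partitions. */
static size_t sketch_fast_allocs[SKETCH_FAST_TOKENS];

/* Expands to one entry point, e.g. __alloc_token_0_malloc, whose token ID
   is fixed by its name rather than passed as an argument. */
#define SKETCH_DEFINE_FAST_MALLOC(ID)              \
  void *__alloc_token_##ID##_malloc(size_t size) { \
    sketch_fast_allocs[ID]++;                      \
    return malloc(size);                           \
  }

SKETCH_DEFINE_FAST_MALLOC(0)
SKETCH_DEFINE_FAST_MALLOC(1)
SKETCH_DEFINE_FAST_MALLOC(2)

/* Reads how many allocations a given token ID has seen so far. */
static size_t sketch_fast_count(size_t token_id) {
  assert(token_id < SKETCH_FAST_TOKENS);
  return sketch_fast_allocs[token_id];
}
```
+
+The same pattern would have to be repeated for every instrumented allocation
+function, which is why the number of entry points grows with the token maximum.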
+
+Instrumenting Non-Standard Allocation Functions
+-----------------------------------------------
+
+By default, AllocToken only instruments standard library allocation functions.
+This simplifies adoption, as a compatible allocator only needs to provide
+token-enabled variants for a well-defined set of standard functions.
+
+To extend instrumentation to custom allocation functions, enable broader
+coverage with ``-fsanitize-alloc-token-extended``. Such functions must be
+marked with the `malloc
+<https://clang.llvm.org/docs/AttributeReference.html#malloc>`_ or `alloc_size
+<https://clang.llvm.org/docs/AttributeReference.html#alloc-size>`_ attributes
+(or a combination).
+
+For example:
+
+.. code-block:: c
+
+ void *custom_malloc(size_t size) __attribute__((malloc));
+ void *my_malloc(size_t size) __attribute__((alloc_size(1)));
+
+ // Original:
+ ptr1 = custom_malloc(size);
+ ptr2 = my_malloc(size);
+
+ // Instrumented:
+ ptr1 = __alloc_token_custom_malloc(size, token_id);
+ ptr2 = __alloc_token_my_malloc(size, token_id);
+
+Disabling Instrumentation
+-------------------------
+
+To exclude specific functions from instrumentation, you can use the
+``no_sanitize("alloc-token")`` attribute:
+
+.. code-block:: c
+
+ __attribute__((no_sanitize("alloc-token")))
+ void* custom_allocator(size_t size) {
+   return malloc(size); // Uses original malloc
+ }
+
+Note: Independent of any given allocator support, the instrumentation aims to
+remain performance neutral. As such, ``no_sanitize("alloc-token")``
+functions may be inlined into instrumented functions and vice-versa. If
+correctness is affected, such functions should explicitly be marked
+``noinline``.
+
+The ``__attribute__((disable_sanitizer_instrumentation))`` attribute is also
+supported to disable this and other sanitizer instrumentations.
+
+Suppressions File (Ignorelist)
+------------------------------
+
+AllocToken respects the ``src`` and ``fun`` entity types in the
+:doc:`SanitizerSpecialCaseList`, which can be used to omit specified source
+files or functions from instrumentation.
+
+.. code-block:: bash
+
+ [alloc-token]
+ # Exclude specific source files
+ src:third_party/allocator.c
+ # Exclude function name patterns
+ fun:*custom_malloc*
+ fun:LowLevel::*
+
+.. code-block:: console
+
+ % clang++ -fsanitize=alloc-token -fsanitize-ignorelist=my_ignorelist.txt example.cc
+
+Conditional Compilation with ``__SANITIZE_ALLOC_TOKEN__``
+-----------------------------------------------------------
+
+In some cases, one may need to execute different code depending on whether
+AllocToken instrumentation is enabled. The ``__SANITIZE_ALLOC_TOKEN__`` macro
+can be used for this purpose.
+
+.. code-block:: c
+
+ #ifdef __SANITIZE_ALLOC_TOKEN__
+ // Code specific to -fsanitize=alloc-token builds
+ #endif