path: root/llvm/lib/Transforms/Instrumentation/InstrProfiling.cpp
Age | Commit message | Author | Files | Lines
2025-06-10[PGO][Offload] Fix offload coverage mapping (#143490)Ethan Luis McDonough1-6/+0
This pull request fixes coverage mapping on GPU targets. - It adds an address space cast to the coverage mapping generation pass. - It reads the profiled function names from the ELF directly. Reading it from public globals was causing issues in cases where multiple device-code object files are linked together.
2025-06-10[llvm] annotate interfaces in llvm/Transforms for DLL export (#143413)Andrew Rogers1-1/+2
## Purpose
This patch is one in a series of code-mods that annotate LLVM’s public interface for export. This patch annotates the `llvm/Transforms` library. These annotations currently have no meaningful impact on the LLVM build; however, they are a prerequisite to support an LLVM Windows DLL (shared library) build.
## Background
This effort is tracked in #109483. Additional context is provided in [this discourse](https://discourse.llvm.org/t/psa-annotating-llvm-public-interface/85307), and documentation for `LLVM_ABI` and related annotations is found in the LLVM repo [here](https://github.com/llvm/llvm-project/blob/main/llvm/docs/InterfaceExportAnnotations.rst).
The bulk of these changes were generated automatically using the [Interface Definition Scanner (IDS)](https://github.com/compnerd/ids) tool, followed by formatting with `git clang-format`.
The following manual adjustments were also applied after running IDS on Linux:
- Removed a redundant `operator<<` from Attributor.h. IDS only auto-annotates the 1st declaration, and the 2nd declaration being un-annotated resulted in an "inconsistent linkage" error on Windows when building LLVM as a DLL.
- `#include` the `VirtualFileSystem.h` in PGOInstrumentation.h and remove the local declaration of the `vfs::FileSystem` class. This is required because exporting the `PGOInstrumentationUse` constructor requires the class be fully defined because it is used by an argument.
- Add `#include "llvm/Support/Compiler.h"` to files where it was not auto-added by IDS due to no pre-existing block of include statements.
- Add `LLVM_TEMPLATE_ABI` and `LLVM_EXPORT_TEMPLATE` to exported instantiated templates.
## Validation
Local builds and tests to validate cross-platform compatibility. This included llvm, clang, and lldb on the following configurations:
- Windows with MSVC
- Windows with Clang
- Linux with GCC
- Linux with Clang
- Darwin with Clang
2025-05-03[Instrumentation] Remove an unused local variable (NFC) (#138383)Kazu Hirata1-2/+0
2025-04-23[InstrProfiling] Avoid unnecessary bitcast (NFC)Nikita Popov1-4/+2
Not needed with opaque pointers.
2025-04-15[nfc] move `isPresplitCoroSuspendExitEdge` to Analysis/CFG (#135849)Mircea Trofin1-0/+1
2025-03-06[IR] Store Triple in Module (NFC) (#129868)Nikita Popov1-1/+1
The module currently stores the target triple as a string. This means that any code that wants to actually use the triple first has to instantiate a Triple, which is somewhat expensive. The change in #121652 caused a moderate compile-time regression due to this. While it would be easy enough to work around, I think that architecturally, it makes more sense to store the parsed Triple in the module, so that it can always be directly queried. For this change, I've opted not to add any magic conversions between std::string and Triple for backwards-compatibility purposes, and instead write out needed Triple()s or str()s explicitly. This is because I think a decent number of them should be changed to work on Triple as well, to avoid unnecessary conversions back and forth. The only interesting part in this patch is that the default triple is Triple("") instead of Triple() to preserve existing behavior. The former defaults to using the ELF object format instead of unknown object format. We should fix that as well.
2025-01-24[NFC][DebugInfo] Use iterator moveBefore at many call-sites (#123583)Jeremy Morse1-3/+3
As part of the "RemoveDIs" project, BasicBlock::iterator now carries a debug-info bit that's needed when getFirstNonPHI and similar feed into instruction insertion positions. Call-sites where that's necessary were updated a year ago; however, to ensure some type safety, we'd like to have all calls to moveBefore use iterators. This patch adds a (guaranteed dereferenceable) iterator-taking moveBefore, and changes a bunch of call-sites where it's obviously safe to change to use it by just calling getIterator() on an instruction pointer. A follow-up patch will contain less-obviously-safe changes. We'll eventually deprecate and remove the instruction-pointer insertBefore, but not before adding concise documentation of what considerations are needed (very few).
2024-11-06[Instrumentation] Remove unused includes (NFC) (#115117)Kazu Hirata1-1/+0
Identified with misc-include-cleaner.
2024-10-22[PGO][SampledInstr] Correct off by 1s and allow 100% sampling (#113350)Michael O'Farrell1-39/+56
This corrects a couple of off-by-one errors related to the sampling of **instrumented** counters, and enables setting 100% rates for burst sampling (burst duration = period).
Off by ones: Prior to this change it was impossible to set a period of 65535 because this was converted to fast sampling, which rolls over at USHRT_MAX + 1 (65536). Similarly, the burst durations would collect burst duration + 1 counts as they used a ULE comparison.
100% sampling: Although this is not useful for a productionized use case, it does allow for more deterministic testing with the sampling checks in place. After all the off by ones are fixed, allowing for 100% sampling is a matter of letting burst duration = period.
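As a rough illustration of the burst-duration off-by-one described above, here is a standalone C++ model. It is a sketch only and not the IR that the pass emits; `BurstSampler`, `countedEvents`, and the `Inclusive` switch are hypothetical names introduced for this example. With an inclusive (ULE-style) comparison each period records burst + 1 events, while a strict comparison records exactly burst events and also permits burst duration = period.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of burst sampling: out of every `Period` events,
// the first `Burst` events update the instrumented counter.
struct BurstSampler {
  std::uint32_t Period;
  std::uint32_t Burst;
  bool Inclusive; // true models the old ULE-style comparison
  std::uint32_t Tick;

  BurstSampler(std::uint32_t P, std::uint32_t B, bool Incl)
      : Period(P), Burst(B), Inclusive(Incl), Tick(0) {
    assert(P > 0 && B <= P);
  }

  // Returns true when the current event should be counted.
  bool shouldCount() {
    bool InBurst = Inclusive ? (Tick <= Burst) : (Tick < Burst);
    Tick = (Tick + 1 == Period) ? 0 : Tick + 1; // wrap at the period
    return InBurst;
  }
};

// Runs `Events` events through the sampler and returns how many were counted.
std::uint32_t countedEvents(std::uint32_t Period, std::uint32_t Burst,
                            std::uint32_t Events, bool Inclusive) {
  BurstSampler S(Period, Burst, Inclusive);
  std::uint32_t N = 0;
  for (std::uint32_t I = 0; I < Events; ++I)
    N += S.shouldCount() ? 1 : 0;
  return N;
}
```

For one period of 100 events with a burst of 3, the strict form counts 3 events and the inclusive form counts 4. Setting the burst equal to the period makes the strict form count every event, which is the 100% sampling case.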
2024-10-16[LLVM] Add `Intrinsic::getDeclarationIfExists` (#112428)Rahul Joshi1-6/+6
Add `Intrinsic::getDeclarationIfExists` to lookup an existing declaration of an intrinsic in a `Module`.
2024-10-15[Coverage][WebAssembly] Add initial support for WebAssembly/WASI (#111332)Yuta Saito1-2/+3
Currently, WebAssembly/WASI target does not provide direct support for code coverage. This patch set fixes several issues to unlock the feature. The main changes are:
1. Port `compiler-rt/lib/profile` to WebAssembly/WASI.
2. Adjust profile metadata sections for Wasm object file format.
   - [CodeGen] Emit `__llvm_covmap` and `__llvm_covfun` as custom sections instead of data segments.
   - [lld] Align the interval space of custom sections at link time.
   - [llvm-cov] Copy misaligned custom section data if the start address is not aligned.
   - [llvm-cov] Read `__llvm_prf_names` from data segments
3. [clang] Link with profile runtime libraries if requested
See each commit message for more details and rationale. This is part of the effort to add code coverage support in Wasm target of Swift toolchain.
2024-10-03[MC/DC] Rework tvbitmap.update to get rid of the inlined function (#110792)NAKAMURA Takumi1-77/+33
Per the discussion in #102542, it is safe to insert BBs under `lowerIntrinsics()` since #69535 has made it tolerant of modifying BBs. So, I can get rid of using the inlined function `rmw_or`, introduced in #96040.
2024-09-19[LLVM] Use {} instead of std::nullopt to initialize empty ArrayRef (#109133)Jay Foad1-2/+2
It is almost always simpler to use {} instead of std::nullopt to initialize an empty ArrayRef. This patch changes all occurrences I could find in LLVM itself. In future the ArrayRef(std::nullopt_t) constructor could be deprecated or removed.
2024-09-15[Instrumentation] Move out to Utils (NFC) (#108532)Antonio Frighetto1-1/+1
Utility functions have been moved out to Utils. Minor opportunity to drop the header where not needed.
2024-08-22[PGO][OpenMP] Instrumentation for GPU devices (Revision of #76587) (#102691)Ethan Luis McDonough1-10/+34
This pull request is a revised version of #76587. This pull request fixes some build issues that were present in the previous version of this change.
> This pull request is the first part of an ongoing effort to extend PGO instrumentation to GPU device code. This PR makes the following changes:
>
> - Adds blank registration functions to device RTL
> - Gives PGO globals protected visibility when targeting a supported GPU
> - Handles any addrspace casts for PGO calls
> - Implements PGO global extraction in GPU plugins (currently only dumps info)
>
> These changes can be tested by supplying `-fprofile-instrument=clang` while targeting a GPU.
2024-08-16[InstrProf] Support conditional counter updates (#102542)gulfemsavrun1-0/+16
This patch adds support for conditional counter updates in single byte counters mode to reduce the write contention by first checking whether the counter is set before overwriting it. --------- Co-authored-by: Juan Manuel Martinez Caamaño <jmartinezcaamao@gmail.com>
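The check-before-write idea can be sketched in plain C++ as follows. This is a minimal model under stated assumptions, not the code the pass generates: `markCovered` and the `kCovered` encoding are hypothetical, and the real single byte counter encoding may differ. The point is that once a counter is set, later executions only read it, so the cache line is not repeatedly written by multiple threads.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical encoding for "this region has been executed".
constexpr std::uint8_t kCovered = 1;

// Marks a single-byte counter as covered, but only writes when the
// counter is not already set. Returns true if a store was performed.
bool markCovered(std::atomic<std::uint8_t> &Counter) {
  if (Counter.load(std::memory_order_relaxed) == kCovered)
    return false; // already set: skip the write to avoid contention
  Counter.store(kCovered, std::memory_order_relaxed);
  return true;
}
```

The first call on a fresh counter stores the value; every later call takes the read-only path.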
2024-07-31[MC/DC][Coverage] Introduce "Bitmap Bias" for continuous mode (#96126)NAKAMURA Takumi1-11/+15
`counter_bias` is incompatible with Bitmap. The distance between Counters and Bitmap is different between on-memory sections and profraw image. Reference to `__llvm_profile_bitmap_bias` is generated only if `-fcoverage-mcdc` and `-runtime-counter-relocation` are specified. Previously, this combination of options was rejected with the error: ``` Runtime counter relocation is presently not supported for MC/DC bitmaps ```
2024-07-22[PGO] Sampled instrumentation in PGO to speed up instrumentation binary (#69535)xur-llvm1-22/+223
In comparison to non-instrumented binaries, PGO instrumentation binaries can be significantly slower. For highly threaded programs, this slowdown can reach 10x due to data races or false sharing within counters. This patch incorporates sampling into the PGO instrumentation process to enhance the speed of instrumentation binaries. The fundamental concept is similar to the one proposed in https://reviews.llvm.org/D63949. Three sampling modes are introduced:
1. Simple Sampling: When '-sampled-instr-burst-duration' is set to 1.
2. Fast Burst Sampling: When not using simple sampling, and '-sampled-instr-period' is set to 65535. This is the default mode of sampling.
3. Full Burst Sampling: When neither simple nor fast burst sampling is used.
Utilizing this sampled instrumentation significantly improves the binary's execution speed. Measurements show up to 5x speedup with default settings. Fast burst sampling now results in only around 20% to 30% slowdown (compared to 8 to 10x slowdown without sampling). Our tests show that profile quality remains good with sampling, with edge counts typically showing more than 90% overlap. For applications whose behavior changes due to binary speed, sampling instrumentation can enhance performance. Observations have shown some apps experiencing up to a ~2% improvement in PGO. A potential drawback of this patch is the increased binary size and compilation time. The sampling method in this patch does not improve the speed of single-threaded instrumentation binaries.
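The mode selection described in this message can be summarized with a small C++ sketch. It follows only the wording of this commit message; `SamplingMode` and `classifySampling` are hypothetical names, and the boundary values were later adjusted by #113350, so this should not be read as the current behavior of the pass.

```cpp
#include <cstdint>

// Hypothetical names for the three modes listed in the commit message.
enum class SamplingMode { Simple, FastBurst, FullBurst };

// Classifies a (burst duration, period) pair as described in #69535:
// a burst duration of 1 selects simple sampling, otherwise a period of
// 65535 selects fast burst sampling, and anything else is full burst.
SamplingMode classifySampling(std::uint32_t BurstDuration,
                              std::uint32_t Period) {
  if (BurstDuration == 1)
    return SamplingMode::Simple;
  if (Period == 65535)
    return SamplingMode::FastBurst;
  return SamplingMode::FullBurst;
}
```

The order of the checks matters: simple sampling takes precedence even when the period is 65535.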
2024-07-20InstrProf: Mark BiasLI as invariant. (#95588)NAKAMURA Takumi1-0/+3
Bias doesn't change after startup. The test is enhanced for optimized sequences and atomic ops.
2024-06-28Revert "[PGO][OpenMP] Instrumentation for GPU devices (#76587)"Ethan Luis McDonough1-34/+10
This reverts commit 5fd2af38e461445c583d7ffc2fe23858966eee76. It caused build issues and broke the buildbot.
2024-06-28[PGO][OpenMP] Instrumentation for GPU devices (#76587)Ethan Luis McDonough1-10/+34
This pull request is the first part of an ongoing effort to extend PGO instrumentation to GPU device code. This PR makes the following changes:
- Adds blank registration functions to device RTL
- Gives PGO globals protected visibility when targeting a supported GPU
- Handles any addrspace casts for PGO calls
- Implements PGO global extraction in GPU plugins (currently only dumps info)
These changes can be tested by supplying `-fprofile-instrument=clang` while targeting a GPU.
2024-06-26[MC/DC][Coverage] Make tvbitmapupdate capable of atomic write (#96042)NAKAMURA Takumi1-0/+24
This also introduces "Test and conditional Read-Modify-Write". The flow to `atomicrmw or` is marked as `unlikely`.
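A minimal C++ model of "Test and conditional Read-Modify-Write" might look like the following. It is an assumption-laden sketch rather than the generated IR: `setBitIfClear` is a hypothetical helper, and `std::atomic::fetch_or` stands in for `atomicrmw or`. The bit is tested with a plain load first, and the atomic update runs only on the path where the bit is still clear, which is the path the commit marks as unlikely.

```cpp
#include <atomic>
#include <cstdint>

// Sets bit `Bit` (0..7) in `Byte` only if it is not already set.
// Returns true if the atomic read-modify-write was executed.
bool setBitIfClear(std::atomic<std::uint8_t> &Byte, unsigned Bit) {
  const std::uint8_t Mask = static_cast<std::uint8_t>(1u << (Bit & 7u));
  // Test: a relaxed load is enough to decide whether an update is needed.
  if ((Byte.load(std::memory_order_relaxed) & Mask) != 0)
    return false; // common path once the test vector has been recorded
  // Conditional RMW: this models the rarely taken `atomicrmw or` path.
  Byte.fetch_or(Mask, std::memory_order_relaxed);
  return true;
}
```

If two threads race on the first update, both may execute the `fetch_or`, which is harmless because OR-ing the same mask twice leaves the same value.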
2024-06-22[MC/DC][Coverage] Split out Read-modify-Write to rmw_or(ptr,i8) (#96040)NAKAMURA Takumi1-11/+52
`rmw_or` is defined as "private alwaysinline". At the moment, it performs only a simple "Read, Or, and Write", which is the same as the current implementation.
2024-06-19InstProfiling: Give the name to profc_bias. NFC. (#95587)NAKAMURA Takumi1-1/+1
2024-06-19InstrProfiling: Split creating Bias offset to getOrCreateBiasVar(Name). NFC. (#95692)NAKAMURA Takumi1-16/+28
2024-06-16Cleanup MC/DC intrinsics for #82448 (#95496)NAKAMURA Takumi1-35/+0
3rd arg of `tvbitmap.update` was made unused. Remove 3rd arg. Sweep `condbitmap.update`, since it is no longer used.
2024-06-14Reapply: [MC/DC][Coverage] Loosen the limit of NumConds from 6 (#82448)NAKAMURA Takumi1-8/+7
By storing possible test vectors instead of combinations of conditions, the restriction is dramatically relaxed. This introduces two options to `cc1`: * `-fmcdc-max-conditions=32767` * `-fmcdc-max-test-vectors=2147483646` This change makes coverage mapping, profraw, and profdata incompatible with Clang-18. - Bitmap semantics changed. It is incompatible with previous format. - `BitmapIdx` in `Decision` points to the end of the bitmap. - Bitmap is packed per function. - `llvm-cov` can understand `profdata` generated by `llvm-profdata-18`. RFC: https://discourse.llvm.org/t/rfc-coverage-new-algorithm-and-file-format-for-mc-dc/76798 -- Change(s) since llvmorg-19-init-14288-g7ead2d8c7e91 - Update compiler-rt/test/profile/ContinuousSyncMode/image-with-mcdc.c
2024-06-14Revert "[MC/DC][Coverage] Loosen the limit of NumConds from 6 (#82448)"Hans Wennborg1-7/+8
This broke the lit tests on Mac: https://green.lab.llvm.org/job/llvm.org/job/clang-stage1-RA/1096/ > By storing possible test vectors instead of combinations of conditions, > the restriction is dramatically relaxed. > > This introduces two options to `cc1`: > > * `-fmcdc-max-conditions=32767` > * `-fmcdc-max-test-vectors=2147483646` > > This change makes coverage mapping, profraw, and profdata incompatible > with Clang-18. > > - Bitmap semantics changed. It is incompatible with previous format. > - `BitmapIdx` in `Decision` points to the end of the bitmap. > - Bitmap is packed per function. > - `llvm-cov` can understand `profdata` generated by `llvm-profdata-18`. > > RFC: > https://discourse.llvm.org/t/rfc-coverage-new-algorithm-and-file-format-for-mc-dc/76798 This reverts commit 7ead2d8c7e9114b3f23666209a1654939987cb30.
2024-06-13[MC/DC][Coverage] Loosen the limit of NumConds from 6 (#82448)NAKAMURA Takumi1-8/+7
By storing possible test vectors instead of combinations of conditions, the restriction is dramatically relaxed. This introduces two options to `cc1`: * `-fmcdc-max-conditions=32767` * `-fmcdc-max-test-vectors=2147483646` This change makes coverage mapping, profraw, and profdata incompatible with Clang-18. - Bitmap semantics changed. It is incompatible with previous format. - `BitmapIdx` in `Decision` points to the end of the bitmap. - Bitmap is packed per function. - `llvm-cov` can understand `profdata` generated by `llvm-profdata-18`. RFC: https://discourse.llvm.org/t/rfc-coverage-new-algorithm-and-file-format-for-mc-dc/76798
2024-04-01[InstrFDO][TypeProf] Implement binary instrumentation and profile read/write (#66825)Mingming Liu1-6/+151
(The profile format change is split into a standalone change in https://github.com/llvm/llvm-project/pull/81691)
* For InstrFDO value profiling, implement instrumentation and lowering for virtual table address.
  * This is controlled by `-enable-vtable-value-profiling` and off by default.
  * When the option is on, raw profiles will carry serialized `VTableProfData` structs and compressed vtables as payloads.
* Implement profile reader and writer support
  * Raw profile reader is used by `llvm-profdata` but not the compiler. Raw profile reader will construct InstrProfSymtab with symbol names, and map profiled runtime addresses to vtable symbols.
  * Indexed profile reader is used by `llvm-profdata` and the compiler. When initialized, the reader stores a pointer to the beginning of the in-memory compressed vtable names and the length of the string. When used in `llvm-profdata`, the reader decompresses the string to show symbols of a profiled site. When used in the compiler, string decompression doesn't happen since IR is used to construct InstrProfSymtab.
  * Indexed profile writer collects the list of vtable names, and stores that to indexed profiles.
  * Text profile reader and writer support are added but mostly follow the implementation for indirect-call value type.
* `llvm-profdata show -show-vtables <args> <profile>` is implemented.
rfc in https://discourse.llvm.org/t/rfc-dynamic-type-profiling-and-optimizations-in-llvm/74600#pick-instrumentation-points-and-instrument-runtime-types-7
2024-03-04[nfc][InstrProfiling]For comdat setting helper function, move comment closer to the code (#83757)Mingming Liu1-21/+29
2024-03-04[nfc][InstrProfiling]Compute a boolean state as a constant and use it everywhere (#83756)Mingming Liu1-24/+24
2024-02-25LLVMInstrumentation: Simplify mcdc.tvbitmap.update with GEP.NAKAMURA Takumi1-10/+2
2024-01-07[InstrProfiling] No runtime registration for ELF, COFF, Mach-O and XCOFF (#77225)Petr Hosek1-6/+4
Whether runtime registration is needed is not dependent on the OS but the file format. For ELF, COFF, Mach-O or XCOFF, we can always use the linker support. This is important for baremetal platforms such as RTOS and UEFI platforms where there is no OS but we still don't want to use runtime registration and rely on linker support instead.
2023-12-14[Profile] Add binary profile correlation for code coverage. (#69493)Zequan Wu1-22/+49
## Motivation
Since we don't need the metadata sections at runtime, we can offload them from memory at runtime. Initially, I explored [debug info correlation](https://discourse.llvm.org/t/instrprofiling-lightweight-instrumentation/59113), which is used for PGO with value profiling disabled. However, it currently only works with DWARF and it would be hard to add such artificial debug info for every function into CodeView, which is used on Windows. So, offloading profile metadata sections at runtime seems to be a platform-independent option.
## Design
The idea is to use new section names for profile name and data sections and mark them as metadata sections. Under this mode, the new sections are non-SHF_ALLOC in ELF. So, they are not loaded into memory at runtime and can be stripped away as a post-linking step. After the process exits, the generated raw profiles will contain only headers + counters. llvm-profdata can be used to correlate raw profiles with the unstripped binary to generate an indexed profile.
## Data
For chromium base_unittests with code coverage on linux, the binary size overhead due to instrumentation is reduced from 64M to 38.8M (39.4%), and the raw profile file size is reduced from 128M to 68M (46.9%).
```
$ bloaty out/cov/base_unittests.stripped -- out/no-cov/base_unittests.stripped
    FILE SIZE        VM SIZE
 --------------  --------------
  +121% +30.4Mi  +121% +30.4Mi    .text
  [NEW] +14.6Mi  [NEW] +14.6Mi    __llvm_prf_data
  [NEW] +10.6Mi  [NEW] +10.6Mi    __llvm_prf_names
  [NEW] +5.86Mi  [NEW] +5.86Mi    __llvm_prf_cnts
   +95% +1.75Mi   +95% +1.75Mi    .eh_frame
  +108%  +400Ki  +108%  +400Ki    .eh_frame_hdr
  +9.5%  +211Ki  +9.5%  +211Ki    .rela.dyn
  +9.2% +95.0Ki  +9.2% +95.0Ki    .data.rel.ro
  +5.0% +87.3Ki  +5.0% +87.3Ki    .rodata
  [ = ]       0   +13% +47.0Ki    .bss
   +40% +1.78Ki   +40% +1.78Ki    .got
   +12% +1.49Ki   +12% +1.49Ki    .gcc_except_table
  [ = ]       0   +65% +1.23Ki    .relro_padding
   +62% +1.20Ki  [ = ]       0    [Unmapped]
   +13%    +448   +19%    +448    .init_array
  +8.8%    +192  [ = ]       0    [ELF Section Headers]
  +0.0%    +136  +0.0%     +80    [7 Others]
  +0.1%     +96  +0.1%     +96    .dynsym
  +1.2%     +96  +1.2%     +96    .rela.plt
  +1.5%     +80  +1.2%     +64    .plt
  [ = ]       0 -99.2% -3.68Ki    [LOAD #5 [RW]]
  +195% +64.0Mi  +194% +64.0Mi    TOTAL

$ bloaty out/cov-cor/base_unittests.stripped -- out/no-cov/base_unittests.stripped
    FILE SIZE        VM SIZE
 --------------  --------------
  +121% +30.4Mi  +121% +30.4Mi    .text
  [NEW] +5.86Mi  [NEW] +5.86Mi    __llvm_prf_cnts
   +95% +1.75Mi   +95% +1.75Mi    .eh_frame
  +108%  +400Ki  +108%  +400Ki    .eh_frame_hdr
  +9.5%  +211Ki  +9.5%  +211Ki    .rela.dyn
  +9.2% +95.0Ki  +9.2% +95.0Ki    .data.rel.ro
  +5.0% +87.3Ki  +5.0% +87.3Ki    .rodata
  [ = ]       0   +13% +47.0Ki    .bss
   +40% +1.78Ki   +40% +1.78Ki    .got
   +12% +1.49Ki   +12% +1.49Ki    .gcc_except_table
   +13%    +448   +19%    +448    .init_array
  +0.1%     +96  +0.1%     +96    .dynsym
  +1.2%     +96  +1.2%     +96    .rela.plt
  +1.2%     +64  +1.2%     +64    .plt
  +2.9%     +64  [ = ]       0    [ELF Section Headers]
  +0.0%     +40  +0.0%     +40    .data
  +1.2%     +32  +1.2%     +32    .got.plt
  +0.0%     +24  +0.0%      +8    [5 Others]
  [ = ]       0 -22.9%    -872    [LOAD #5 [RW]]
 -74.5% -1.44Ki  [ = ]       0    [Unmapped]
  [ = ]       0 -76.5% -1.45Ki    .relro_padding
  +118% +38.8Mi  +117% +38.8Mi    TOTAL
```
A few things to note:
1. llvm-profdata doesn't support filtering raw profiles by binary id yet, so when a raw profile doesn't belong to the binary being digested by llvm-profdata, merging will fail. Once this is implemented, llvm-profdata should be able to only merge raw profiles with the same binary id as the binary and discard the rest (with mismatched/missing binary id). The workflow I have in mind is to have scripts invoke llvm-profdata to get all binary ids for all raw profiles, and selectively pass the raw profiles with matching binary id, together with the binary, to llvm-profdata for merging.
2. Note: In COFF, currently they are still loaded into memory but not used. I didn't do it in this patch because I noticed that `.lcovmap` and `.lcovfunc` are loaded into memory. A separate patch will address it.
3. This should work with PGO when value profiling is disabled, as debug info correlation currently does, though I haven't tested this yet.
2023-12-12[NFC][InstrProf] Rename internal `InstrProfiling` to `InstrLowerer` (#75139)Mircea Trofin1-38/+38
Captures its responsibility a bit better.
2023-12-11[NFC][InstrProf] Move `InstrProfiling` to the .cpp file (#75018)Mircea Trofin1-0/+149
2023-12-10[NFC][InstrProf] Refactor InstrProfiling lowering pass (#74970)Mircea Trofin1-67/+59
Akin to other passes - refactored the name to `InstrProfilingLoweringPass` to better communicate what it does, and split the pass part and the transformation part to avoid needing to initialize object state during `::run`. A subsequent PR will move `InstrLowering` to the .cpp file and rename it to `InstrLowerer`.
2023-12-08Reland [InstrProf][X86] Mark non-directly accessed globals as large (#74778)Arthur Eubanks1-0/+4
We'd like to make various instrprof globals large to make them not contribute to relocation pressure since there are no direct accesses to them in the module. Similar to what was done for asan_globals in #74514. This affects the __llvm_prf_vals, __llvm_prf_vnds, and __llvm_prf_names sections. The reland fixes platform.ll.
2023-12-08Revert "[InstrProf][X86] Mark non-directly accessed globals as large (#74778)"Arthur Eubanks1-4/+0
This reverts commit 5507f70cc205a7ec21d264a64c703b3d314b998c. Breaks bots, e.g. https://lab.llvm.org/buildbot/#/builders/232/builds/16374
2023-12-08[InstrProf][X86] Mark non-directly accessed globals as large (#74778)Arthur Eubanks1-0/+4
We'd like to make various instrprof globals large to make them not contribute to relocation pressure since there are no direct accesses to them in the module. Similar to what was done for asan_globals in #74514. This affects the __llvm_prf_vals, __llvm_prf_vnds, and __llvm_prf_names sections.
2023-11-30[coro][pgo] Don't promote pgo counters in the suspend basic block (#71263)Mircea Trofin1-1/+7
If a suspend happens in the resume part (this can happen in the case of chained coroutines), and that's part of a loop, the pre-split CFG has the suspend block as an exit of that loop. PGO Counter Promotion will then try to commit the temporary counter to the global in that "exit" block (it also does that in the other loop exit BBs, which also includes the "destroy" case). This interferes with symmetric transfer. We don't need to commit the counter in the suspend case - it's not a loop exit from the perspective of the behavior of the program. The regular loop exit, together with the "destroy" case, completely cover any updates that may need to happen to the global counter.
2023-11-29Fix stale comment (#73846)David Li1-1/+1
Fix stale comment.
2023-11-17[llvm][InstrProfiling] Remove ptr-to-ptr bitcasts (NFC)Youngsuk Kim1-10/+5
Opaque ptr cleanup effort (NFC).
2023-11-15[Instrumentation] Remove unneeded pointer casts and migrate away from getInt8PtrTy. NFCFangrui Song1-2/+2
After opaque pointer migration, getInt8PtrTy() is considered legacy. Replace it with getPtrTy(), and while here, remove some unneeded pointer casts.
2023-11-14[InstrProfiling] Ensure data variables are always created for inlined functions (#72069)Alan Phipps1-14/+9
Fixes a bug introduced by commit f95b2f1acf11 ("Reland [InstrProf][compiler-rt] Enable MC/DC Support in LLVM Source-based Code Coverage (1/3)")
The InstrProfiling pass was refactored when introducing support for MC/DC such that the creation of the data variable was abstracted and called only once per function from ::run(). Because ::run() only iterated over functions that were not fully inlined, and because it only created the data variable for the first intrinsic that it saw, functions fully inlined into other instrumented callers would end up without a data variable, resulting in loss of coverage information.
This patch does the following:
1.) Move the call of createDataVariable() to getOrCreateRegionCounters() so that the creation of the data variable will happen indirectly either from ::new() or during profile intrinsic lowering when it is needed. This effectively restores the behavior prior to the refactor and ensures that all data variables are created when needed (and not duplicated).
2.) Process all MC/DC bitmap parameter intrinsics in ::run() prior to calling getOrCreateRegionCounters(). This ensures bitmap regions are created for each function including functions that are fully inlined. It also ensures that the bitmap region is created for each function prior to the creation of the data variable because it is referenced by the data variable. Again, duplication is prevented if the same parameter intrinsic is inlined into multiple functions.
3.) No longer pass the MC/DC intrinsic to createDataVariable(). This decouples the creation of the data variable from a specific MC/DC intrinsic. Instead, with #2 above, store the number of bitmap bytes required in the PerFunctionProfileData in the ProfileDataMap along with the function's CounterRegion and BitmapRegion variables. This ties the bitmap information directly to the function to which it belongs, and the data variable created for that function can reference that.
2023-11-12[llvm][InstrProfiling] Remove no-op ptr-to-ptr bitcasts (NFC)JOE19941-4/+3
Opaque ptr cleanup effort (NFC).
2023-11-11[InstrProfiling] Don't attempt to create duplicate data variables. (#71998)Alan Phipps1-0/+4
Fixes a bug introduced by commit f95b2f1acf11 ("Reland [InstrProf][compiler-rt] Enable MC/DC Support in LLVM Source-based Code Coverage (1/3)") createDataVariable() needs to check that a data variable wasn't already created before creating it. Previously, this was done inadvertently in getOrCreateRegionCounters(), which checked that the RegionCounters was not created multiple times before creating the counter section and the data variable. When the creation of the data variable was abstracted into its own function (createDataVariable()), there was no corresponding check. This was failing on a case in which an instrumented function was being inlined into multiple functions and a duplicate data variable was created, which led to a segfault in emitNameData(). Test case added based on the repro that also ensures a single data variable was created in this case.
2023-11-07[NFC] Remove Type::getInt8PtrTy (#71029)Paulo Matos1-5/+5
Replace this with PointerType::getUnqual(). Followup to the opaque pointer transition. Fixes an in-code TODO item.
2023-11-06[Transforms] Use StringRef::starts_with/ends_with instead of startswith/endswith. NFC.Simon Pilgrim1-1/+1
startswith/endswith wrap starts_with/ends_with and will eventually go away (to more closely match string_view)