Age Commit message Author Files Lines
2024-03-27[NFC][AArch64] Regenerate regression tests.Eli Friedman2-157/+186
2024-03-27[NFC][TLI] Move VecFuncs to statics to reduce stack usage (#86829)Alex MacLean1-62/+68
`TargetLibraryInfoImpl::addVectorizableFunctionsFromVecLib` has a lot of data in local stack arrays, which MSVC keeps on the stack even in release builds. To reduce stack usage, the data arrays (which are const), are moved outside the function as statics. This drops the method stack usage to be negligible.
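The effect of that move can be sketched in a few lines. This is a minimal illustration, not the LLVM code: `VecEntry`, both tables, the function names, and the vector function names in the table are hypothetical stand-ins.

```cpp
#include <cstddef>

// Hypothetical descriptor for illustration only; not LLVM's VecDesc.
struct VecEntry {
  const char *ScalarName;
  const char *VectorName; // made-up names, not real library symbols
  unsigned Width;
};

// Before: the const table is a local array, so a compiler may materialize it
// in this function's stack frame on every call.
std::size_t countWideLocal(unsigned MinWidth) {
  const VecEntry Table[] = {
      {"sinf", "vsinf4", 4}, {"cosf", "vcosf4", 4}, {"expf", "vexpf8", 8}};
  std::size_t N = 0;
  for (const VecEntry &E : Table)
    if (E.Width >= MinWidth)
      ++N;
  return N;
}

// After: the same data has static storage duration and lives outside the
// function, so a call no longer needs stack space for the table itself.
static const VecEntry StaticTable[] = {
    {"sinf", "vsinf4", 4}, {"cosf", "vcosf4", 4}, {"expf", "vexpf8", 8}};

std::size_t countWideStatic(unsigned MinWidth) {
  std::size_t N = 0;
  for (const VecEntry &E : StaticTable)
    if (E.Width >= MinWidth)
      ++N;
  return N;
}
```

Both functions return the same counts; only where the table is stored differs.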
2024-03-27[Libomptarget] Make dynamic loading libffi more verbose. (#86891)Ye Luo1-3/+13
2024-03-27[BOLT] Set EntryDiscriminator in YAML profile for indirect callsAmir Ayupov2-19/+30
Indirect call handling missed setting an `EntryDiscriminator` while it's set for direct calls and tail calls. Improve YAML profile accuracy by unifying the destination setting between direct and indirect calls into `setCSIDestination` method. Depends on: https://github.com/llvm/llvm-project/pull/86848 Test Plan: Updated bolt/test/X86/yaml-secondary-entry-discriminator.s Reviewers: ayermolo, maksfb, rafaelauler Reviewed By: maksfb Pull Request: https://github.com/llvm/llvm-project/pull/82128
2024-03-27[lldb] Avoid deadlock by unlocking before invoking callbacks (#86888)Jonas Devlieghere1-39/+45
Avoid deadlocks in the Alarm class by releasing the lock before invoking callbacks. This deadlock manifested itself in the ProgressManager:
1. On the main thread, the ProgressManager acquires its lock in ProgressManager::Decrement and calls Alarm::Create.
2. On the main thread, the Alarm acquires its lock in Alarm::Create.
3. On the alarm thread, the Alarm acquires its lock after waiting on the condition variable and calls ProgressManager::Expire.
4. On the alarm thread, the ProgressManager acquires its lock in ProgressManager::Expire.
Note how the two threads acquire the locks in different orders. Deadlocks can be avoided by always acquiring locks in the same order, but since the two mutexes here are private implementation details that belong to different classes, that's not straightforward. Luckily, we don't need to hold the Alarm mutex when invoking the callbacks. That's exactly how this patch solves the issue.
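The general pattern, releasing a private mutex before running user callbacks, might look roughly like the sketch below. `CallbackQueue` and its members are hypothetical and far simpler than lldb's Alarm class; the sketch only shows the locking discipline.

```cpp
#include <cstddef>
#include <functional>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical miniature of a callback holder; not lldb's Alarm.
class CallbackQueue {
public:
  using Callback = std::function<void()>;

  void add(Callback CB) {
    std::lock_guard<std::mutex> Guard(Mutex);
    Pending.push_back(std::move(CB));
  }

  // Moves the pending callbacks out while holding the lock, then invokes them
  // with the lock released. A callback may therefore call add() or take other
  // locks without participating in a lock-order inversion on Mutex.
  std::size_t runPending() {
    std::vector<Callback> ToRun;
    {
      std::lock_guard<std::mutex> Guard(Mutex);
      ToRun.swap(Pending);
    } // Mutex is released here, before any callback runs.
    for (Callback &CB : ToRun)
      CB();
    return ToRun.size();
  }

  std::size_t pendingCount() {
    std::lock_guard<std::mutex> Guard(Mutex);
    return Pending.size();
  }

private:
  std::mutex Mutex;
  std::vector<Callback> Pending;
};
```

If `runPending` invoked the callbacks while still holding `Mutex`, a callback that called `add()` would try to lock the same non-recursive mutex again and never return.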
2024-03-27[Clang][Sema] Allow flexible arrays in unions and alone in structs (#84428)Kees Cook9-30/+303
GNU and MSVC have extensions where flexible array members (or their equivalent) can be in unions or alone in structs. This is already fully supported in Clang through the 0-sized array ("fake flexible array") extension or when C99 flexible array members have been syntactically obfuscated. Clang needs to explicitly allow these extensions directly for C99 flexible arrays, since they are common code patterns in active use by the Linux kernel (and other projects). Such projects have been using either 0-sized arrays (which are considered deprecated in favor of C99 flexible array members) or obfuscated syntax, both of which complicate their code bases. For example, these do not error by default:
```
union one {
  int a;
  int b[0];
};

union two {
  int a;
  struct {
    struct { } __empty;
    int b[];
  };
};
```
But this does:
```
union three {
  int a;
  int b[];
};
```
Remove the default error diagnostics for this but continue to provide warnings under Microsoft or GNU extensions checks. This will allow for a seamless transition for code bases away from 0-sized arrays without losing existing code patterns. Add explicit checking for the warnings under various constructions. Additionally fixes a CodeGen bug with flexible array members in unions in C++, which was found when adding a testcase for:
```
union { char x[]; } z = {0};
```
which only had Sema tests originally. Fixes #84565
2024-03-27[CI][NFC] Fix shellcheck warnings in CI scripts (#86877)Marc Auberer2-8/+8
This fixes all shellcheck warnings we have in `monolithic-linux.sh` and `monolithic-windows.sh`. All of them have to do with [SC2086](https://www.shellcheck.net/wiki/SC2086) - Double quote to prevent globbing and word splitting.
2024-03-27[BOLT] Fix enumeration of secondary entry pointsAmir Ayupov2-3/+74
Make them start with 1 instead of 0 (reserved for primary entry point). Test Plan: ``` bin/llvm-lit -a tools/bolt/test/X86/yaml-secondary-entry-discriminator.s ``` Reviewers: rafaelauler, ayermolo, maksfb, dcci Reviewed By: maksfb Pull Request: https://github.com/llvm/llvm-project/pull/86848
2024-03-27[CLANG] Fix potential integer overflow value in getRVVTypeSize() (#86810)smanna121-2/+2
In getRVVTypeSize(clang::ASTContext &, clang::BuiltinType const *), a potential integer overflow occurs because the expression VScale->first * MinElts, of type unsigned int (32 bits, unsigned), is evaluated using 32-bit arithmetic and only then used in a context that expects an expression of type uint64_t (64 bits, unsigned). To avoid the overflow, this patch changes the types of the variables MinElts and EltSize to uint64_t instead of adding a cast. The change matches what was originally done in https://github.com/llvm/llvm-project/commit/7372c0d46d2185017c509eb30910b102b4f9cdaa. It looks like the revert happened in https://github.com/llvm/llvm-project/commit/c92ad411f2f94d8521cd18abcb37285f9a390ecb
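The class of bug can be reproduced in isolation. The sketch below is not the Clang code: the function names are hypothetical, and the operand values are chosen only to force a wrap, not to resemble real RVV parameters.

```cpp
#include <cstdint>

// The product is computed in 32-bit unsigned arithmetic and wraps modulo 2^32
// before it is widened to the 64-bit return type.
uint64_t sizeNarrow(uint32_t VScale, uint32_t MinElts, uint32_t EltSize) {
  return VScale * MinElts * EltSize;
}

// Declaring the operands as uint64_t makes the whole product 64 bits wide,
// so no separate cast is needed at the multiplication.
uint64_t sizeWide(uint32_t VScale, uint64_t MinElts, uint64_t EltSize) {
  return VScale * MinElts * EltSize;
}
```

With `VScale = 65536` and `MinElts = 65536`, the narrow version wraps to 0, while the wide version yields 2^32.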
2024-03-27[NFC][CLANG] Fix null pointer dereferences (#86760)smanna121-2/+2
This patch replaces getAs<> with castAs<> to resolve potential static analyzer bugs for 1. Dereferencing Proto1->param_type_begin(), which is known to be nullptr 2. Dereferencing Proto2->param_type_begin(), which is known to be nullptr 3. Dereferencing a pointer issue with nullptr Proto1 when calling param_type_end() 4. Dereferencing a pointer issue with nullptr Proto2 when calling param_type_end() in clang::Sema::getMoreSpecializedTemplate().
2024-03-27[libc] add flag for FP_*LOGB0/NAN values (#86723)Michael Jones2-2/+8
These values vary by system; this flag allows you to toggle their value.
2024-03-27[RISCV] Add test coverage for (add (shl Z, c1), Y, (shl Z, c2)) variantsPhilip Reames1-0/+90
Basically, testing for interaction of shNadd matching with one step of reassociation in the add.
2024-03-27[SLP][NFC]Improve compile time by size analysis limit and reduction size limit.Alexey Bataev1-11/+21
Used RecursionMaxDepth to limit number of lookups in BoUpSLP::getVectorElementSize and limited reduction width for bool reduced values.
2024-03-27[clang-installapi] Remove unnecessary copy (#86808)smanna121-1/+1
Reported by Static Analyzer Tool: In clang::installapi::InstallAPIVisitor::VisitFunctionDecl(clang::FunctionDecl const *): Using the auto keyword without an & causes the copy of an object of type DynTypedNode.
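The difference between `auto` and `const auto &` can be made observable with a type that counts its own copies. `Counted` and the helper functions are hypothetical; `Counted` stands in for any object that is expensive to copy, such as the `DynTypedNode` mentioned above.

```cpp
// Hypothetical type that records how often it is copy-constructed.
struct Counted {
  static int Copies;
  int Value = 0;
  Counted() = default;
  Counted(const Counted &Other) : Value(Other.Value) { ++Copies; }
};
int Counted::Copies = 0;

// Returns a reference, the way many accessors do.
const Counted &getRef(const Counted &C) { return C; }

// Plain `auto` deduces `Counted`, so the copy constructor runs.
int copiesWithAuto(const Counted &C) {
  int Before = Counted::Copies;
  auto Copy = getRef(C);
  (void)Copy;
  return Counted::Copies - Before;
}

// `const auto &` binds directly to the referenced object; nothing is copied.
int copiesWithAutoRef(const Counted &C) {
  int Before = Counted::Copies;
  const auto &Ref = getRef(C);
  (void)Ref;
  return Counted::Copies - Before;
}
```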
2024-03-27[GlobalISel] Update `MachineIRBuilder::buildAtomicRMW` interface (#86851)Shilei Tian2-31/+34
2024-03-27[AMDGPU] Fix missing `IsExact` flag when expanding vector binary operator (#86712)Shilei Tian2-0/+111
2024-03-27[SLP]Fix PR86763: do not truncate reductions to the demanded bits size.Alexey Bataev3-5/+11
Need to adjust ReductionBitWidth after minbitwidth analysis, if the demanded bits analysis shows that its size is less than the size of the vectorized value. It prevents an incorrect sign-zero extension transformation afterwards.
2024-03-27[lld-macho] Implement support for ObjC relative method lists (#86231)alx3211-7/+583
The MachO format supports relative offsets for ObjC method lists. This support is present already in ld64. With this change we implement this support in lld also. Relative method lists can be identified by a specific flag (0x80000000) in the method list header. When this flag is present, the method list will contain 32-bit relative offsets to the current Program Counter (PC), instead of absolute pointers. Additionally, when relative method lists are used, the offset to the selector name will now be relative and point to the selector reference (selref) instead of the name itself.
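The flag test and the offset resolution can be sketched as follows. Only the flag value 0x80000000 comes from the description above. The helper names are hypothetical, and so is the assumption that an offset is resolved against the address of the 32-bit field that stores it; this is not lld's implementation.

```cpp
#include <cstdint>

// Flag value taken from the commit description.
constexpr uint32_t RelativeMethodListFlag = 0x80000000u;

// Hypothetical helper: true if the header word marks the list as relative.
bool usesRelativeOffsets(uint32_t HeaderWord) {
  return (HeaderWord & RelativeMethodListFlag) != 0;
}

// Hypothetical helper: resolves a signed 32-bit offset against the address of
// the field holding it (an assumption of this sketch). The offset is
// sign-extended first, so negative offsets point backwards.
uint64_t resolveRelative(uint64_t FieldAddr, int32_t Offset) {
  return FieldAddr + static_cast<uint64_t>(static_cast<int64_t>(Offset));
}
```

Storing signed 32-bit offsets instead of absolute pointers halves the size of each entry on 64-bit targets.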
2024-03-27[SLP][NFC]Add a test with the wrong result extension after reduction, NFC.Alexey Bataev1-0/+33
2024-03-27Revert "[TBAA] Add verifier for tbaa.struct metadata (#86709)"Florian Hahn9-54/+9
This reverts commit df75183d70e029352a49c93f275db703c81a65c1. Revert for now as this appears to cause failures on some buildbots, e.g.: https://lab.llvm.org/buildbot/#/builders/93/builds/19428/steps/10/logs/stdio
2024-03-27[ELF] Change duplicate symbol errors to use errorOrWarnFangrui Song2-3/+6
so that --noinhibit-exec downgrades the error to a warning, which matches GNU ld. Most recoverable errors should use errorOrWarn.
2024-03-27[lldb] [ObjC runtime] Don't cast to signed when left shifting (#86605)Jason Molenda1-1/+1
This is fixing a report from ubsan which I don't think is super high value, but our testsuite hits it on TestDataFormatterObjCNSContainer.py so I'd like to work around it. We are getting
```
runtime error: left shift of negative value -8827055269646171913
   3159 int64_t data_payload_signed =
   3160     ((int64_t)((int64_t)unobfuscated
-> 3161                << m_objc_debug_taggedpointer_ext_payload_lshift) >>
   3162      m_objc_debug_taggedpointer_ext_payload_rshift);
```
At this point `unobfuscated` is 0x85800000000000f7 and `m_objc_debug_taggedpointer_ext_payload_lshift` is 9, so `(int64_t)0x85800000000000f7<<9` shifts off the "sign" bit and then some zeroes etc, and that's how we get this error. We're only trying to extract some bits in the middle of the doubleword, so the fact that we're "losing" the sign is not a bug. Change the inner cast to (uint64_t).
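A standalone version of the corrected pattern is sketched below. `extractPayload` is a hypothetical name, and the right-shift amount used in the usage note is made up, since the message does not state it. The sketch assumes two's-complement conversion and an arithmetic right shift on signed values, which C++20 guarantees and mainstream compilers also provide in earlier modes.

```cpp
#include <cstdint>

// Shifts left as an unsigned value, where discarding high bits is well
// defined, then reinterprets the result as signed so that the right shift
// sign-extends. Both shift amounts must be less than 64.
int64_t extractPayload(uint64_t Raw, unsigned LShift, unsigned RShift) {
  uint64_t Shifted = Raw << LShift;
  return static_cast<int64_t>(Shifted) >> RShift;
}
```

For the value from the report, `extractPayload(0x85800000000000f7, 9, 9)` yields 0xf7: the left shift drops the nine high bits without triggering the sanitizer.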
2024-03-27[Driver] Avoid repeated ToolChain.getTriple() calls. NFCFangrui Song1-8/+7
2024-03-27[HWASAN] Don't instrument loads from global if globals are not tagged (#86774)Vitaly Buka3-17/+7
2024-03-27[NFC][HWASAN] Promote InstrumentGlobals to member (#86773)Vitaly Buka1-2/+5
2024-03-27Revert "[libc][math][c23] Add remaining linux/* entrypoints for {,u}fromfp{,x}* (#86692)"Nick Desaulniers4-60/+16
This reverts commit cd17082b24079a31eff0057abe407da5cfb7b0fc because the newly added tests fail on 32b ARM. Link: #86692 Link: https://lab.llvm.org/buildbot/#/builders/229/builds/24458
2024-03-27[llvm-exegesis] Improve error handling for shm_open callsAiden Grossman1-1/+6
This patch adds error handling for shm_open failures in one case where they were not handled before and also makes an error handler in another case report the value of errno for diagnosis.
2024-03-27[TEST][HWASAN] Fix test after #86771Vitaly Buka1-0/+2
2024-03-27[TEST][HWASAN] Fix test after #86771Vitaly Buka1-2/+2
2024-03-27[LegalizeDAG] Freeze index when converting insert_elt/insert_subvector to load/store on stack.Craig Topper6-63/+82
We try to clamp the index to be within the bounds of the stack object we create, but if we don't freeze it, poison can propagate into the clamp code. This can cause the access to leave the bounds of the stack object. We have other instances of this issue in type legalization and extract_elt/subvector, but posting this patch first for direction check. Fixes #86717
2024-03-27[AArch64] Pre-commit test for #86717. NFCCraig Topper1-0/+22
2024-03-27Finish revert "[SystemZ][z/OS] TXT records in the GOFF reader (#74526)"Aiden Grossman1-148/+0
This finishes the revert started in aeb8628c218f8224e08dddcdd3199a445d8607a8 which didn't completely back out the original patch.
2024-03-27Fix the -Wmissing-designated-field-initializers on the clang-ppc64le-rhel botAmy Kwan2-2/+2
2024-03-27[Libomptarget][NFC] Remove trivially true checks on function pointers (#86804)Joseph Huber3-62/+19
Summary: Previously we had an interface that checked these function pointers to see if they are implemented by the plugin. This was removed as currently every single function is implemented as a part of the common interface. These checks are now always true and do nothing.
2024-03-27[libc][math][c23] Add remaining linux/* entrypoints for {,u}fromfp{,x}* (#86692)OverMighty4-16/+60
2024-03-27[CodeGen][arm64e] Add methods and data members to Address, which are needed to authenticate signed pointers (#86721)Akira Hatanaka50-1234/+1640
To authenticate pointers, CodeGen needs access to the key and discriminators that were used to sign the pointer. That information is sometimes known from the context, but not always, which is why `Address` needs to hold that information. This patch adds methods and data members to `Address`, which will be needed in subsequent patches to authenticate signed pointers, and uses the newly added methods throughout CodeGen. Although this patch isn't strictly NFC as it causes CodeGen to use different code paths in some cases (e.g., `mergeAddressesInConditionalExpr`), it doesn't cause any changes in functionality as it doesn't add any information needed for authentication. In addition to the changes mentioned above, this patch introduces class `RawAddress`, which contains a pointer that we know is unsigned, and adds several new functions for creating `Address` and `LValue` objects. This reapplies 8bd1f9116aab879183f34707e6d21c7051d083b6. The commit broke msan bots because LValue::IsKnownNonNull was uninitialized.
2024-03-27[RISCV] Model vd as a src for some Zvk* instructions in MC layer. (#86710)Craig Topper2-21/+39
Some Zvk instructions use vd as a source regardless of tail policy. Model this in the MC layer. We already do this for FMA for example.
2024-03-27[Target][RISCV] Add HwMode support to subregister index size/offset. (#86368)Craig Topper10-50/+242
This is needed to provide proper size and offset for the GPRPair subreg indices on RISC-V. The size of a GPR already uses HwMode. Previously we said the subreg indices have unknown size and offset, but this stops DwarfExpression::addMachineReg from being able to find the registers that make up the pair. I believe this fixes https://github.com/llvm/llvm-project/issues/85864 but need to verify.
2024-03-27Revert "[SystemZ][z/OS] TXT records in the GOFF reader (#74526)"Neumann Hon4-178/+14
This reverts commit 009f88fc0e3a036be97ef7b222b90af342bae0b7. Reverting PR due to test failure.
2024-03-27[libc++][NFC] Remove whitespace that doesn't belongLouis Dionne1-1/+1
2024-03-27Recommit "[VPlan] Replace disjoint or with add instead of dropping disjoint. (#83821)"Florian Hahn5-2/+33
Recommit with a fix for the use-after-free causing the revert. This reverts the revert commit f872043e055f4163c3c4b1b86ca0354490174987. Original commit message: Dropping disjoint from an OR may yield incorrect results, as some analysis may have converted it to an Add implicitly (e.g. SCEV used for dependence analysis). Instead, replace it with an equivalent Add. This is possible as all users of the disjoint OR only access lanes where the operands are disjoint or poison otherwise. Note that replacing all disjoint ORs with ADDs instead of dropping the flags is not strictly necessary. It is only needed for disjoint ORs that SCEV treated as ADDs, but those are not tracked. There are other places that may drop poison-generating flags; those likely need similar treatment. Fixes https://github.com/llvm/llvm-project/issues/81872 PR: https://github.com/llvm/llvm-project/pull/83821
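The arithmetic fact behind the replacement is that an OR of two values with no set bits in common never carries, so it equals their sum. A small sketch on plain integers, with hypothetical helper names, rather than on LLVM IR:

```cpp
#include <cstdint>

// Two operands are disjoint when they share no set bits.
bool isDisjoint(uint32_t A, uint32_t B) { return (A & B) == 0; }

// For disjoint operands no bit position produces a carry, so A + B == A | B.
uint32_t disjointOrAsAdd(uint32_t A, uint32_t B) { return A + B; }
```

For example, `0xF0 | 0x0F` and `0xF0 + 0x0F` are both 0xFF, whereas `3 | 1` is 3 but `3 + 1` is 4, which is why the rewrite is only valid when the operands are known to be disjoint.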
2024-03-27[Libomptarget] Move API implementations into GenericPluginTy (#86683)Joseph Huber2-147/+444
Summary: The plan is to remove the entire plugin interface and simply use the `GenericPluginTy` inside of `libomptarget` by statically linking against it. This means that inside of `libomptarget` we will simply do `Plugin.data_alloc` without the dynamically loaded interface. To reduce the amount of code required, this patch simply moves all of the RTL implementation functions inside of the Generic device. Now the `__tgt_rtl_` interface is simply a shallow wrapper that will soon go away. There is some redundancy here; this will be improved later. For now what is important is minimizing the changes to the API.
2024-03-27[Offload] Change unregister library to use `atexit` instead of destructor (#86830)Joseph Huber2-37/+41
Summary: The 'new driver' sets up the lifetime of a registered library using global constructors and destructors. Currently, this is put at priority 1, which isn't strictly conformant as it will conflict with system utilities. We now use 101, as this is the lowest suggested for non-system constructors and will still run before user constructors. Secondly, there were issues with the CUDA runtime when destructed with a global destructor. Because the global ones are in any order and potentially run before other things, we were hitting an edge case where the OpenMP runtime was uninitialized *after* `_dl_fini` was called. This would result in us erroring when we call into a destroyed `libcuda.so` instance. Using `atexit` is what CUDA / HIP use and it prevents this from happening. Most everything uses `atexit` except system utilities, and because of the constructor priority it will be unregistered *after* everything else but not after `_dl_fini`.
2024-03-27[clang-format] Fix anonymous reference parameter with default value (#86254)rayroudc2-2/+23
When enabling alignment of consecutive declarations and reference right alignment, the needed space between `& ` and ` = ` is removed in the following use case.
Problem (does not compile):
```
int a(const Test &= Test());
double b();
```
Expected:
```
int a(const Test & = Test());
double b();
```
Test command:
```
echo "int a(const Test& = Test()); double b();" | clang-format -style="{AlignConsecutiveDeclarations: true, ReferenceAlignment: Right}"
```
2024-03-27[C99] Claim conformance to digraphs/iso646Aaron Ballman2-1/+91
2024-03-27[nfc][PGO]Factor out profile scaling into a standalone helper function (#83780)Mingming Liu5-45/+197
- Put the helper function in `ProfDataUtil.h/cpp`, which is already a dependency of `Instructions.cpp` - The helper function could be re-used to update profiles of `InvokeInst` (in a follow-up pull request)
2024-03-27[HLSL] enforce unsigned types for reversebits (#86720)Farzon Lotfi5-261/+25
fixes #86719 - `SemaChecking.cpp` - Adds unsigned semaChecks to `__builtin_elementwise_bitreverse` - `hlsl_intrinsics.h` - remove signed `reversebits` apis
2024-03-27[AArch64] Clear kill flags when removing FMOVDr. (#86308)David Green2-1/+63
The uses of OldDef/NewDef may not be killed in the same place they previously were after they are replaced, and so need to be cleared.
2024-03-27Revert "[scudo] Use getMonotonicTimeFast for tryLock." (#86590)ChiaHungDuan1-3/+3
This reverts commit 36ca9a29025a2f678096e9545fa2ec44e8432592. We were using the `time` as the seed while choosing a new TSD. To make the access of TSDs evenly distributed, we require a higher precision in `time`. Otherwise, many threads may end up with the same random access pattern on TSDs because they share the same `time` in a certain period. On Linux, CLOCK_MONOTONIC_COARSE usually has a precision of about 4 ms. This is far coarser than the average access time of a TSD (which is usually less than 1 us). As a result, when multiple threads try to select a new TSD within a 4 ms interval, they share the same `time` seed and end up choosing and congesting on the same TSD.
2024-03-27[CSSPGO] Reject high checksum mismatched profile (#84097)Lei Wang2-0/+78
Error out the build if the checksum mismatch is extremely high; it's better to drop the profile than to apply a bad profile. Note that the check is on a module level. The user could make big changes to functions in one single module, but those changes might not be performance significant to the whole binary, so we want to be conservative and only expect to catch big perf regressions. To do this, we select a set of the "hot" functions for the check. We use two parameters (`hot-func-cutoff-for-staleness-error` and `min-functions-for-staleness-error`) to control the function selection, to make sure the selected functions are hot enough and the number of functions is not small. We tuned the parameters on our internal services, and the check catches big perf regressions caused by a high mismatch.