Age Commit message Author Files Lines
2024-02-13fixupusers/minglotus-6/spr/main.instrprofmingmingl1894-38586/+72495
2024-02-13[libc++][modules] Re-add build dir CMakeLists.txt. (#81370)Mark de Wever3-0/+112
This CMakeLists.txt is used to build modules without build system support. It was removed in d06ae33ec32122bb526fb35025c1f0cf979f1090. It is used in the documentation on how to use modules. Made some minor changes to make it work with the std.compat module using the std module. Note that the CMakeLists.txt in the build dir should be removed once build system support is generally available.
2024-02-14InstCombine: Enable SimplifyDemandedUseFPClass and remove flag (#81108)Matt Arsenault3-22/+45
This completes the unrevert of ef388334ee5a3584255b9ef5b3fefdb244fa3fd7.
2024-02-13[StatepointLowering] Use Constant instead of TargetConstant for undef value (#81635)Danila Malyutin2-1/+33
Prevents isel errors when trying to lower a gc relocate of an undef value (which turns into a CopyToReg of a TargetConstant). Such relocates may occur after DCE (e.g. after GVN removes some dead blocks) if no passes like instcombine are scheduled afterwards to clean them up. Fixes #80294 Co-authored-by: Matt Arsenault <arsenm2@gmail.com>
2024-02-13Revert "[clang] Remove #undef alloca workaround" (#81649)Prabhuk1-0/+4
Reverts llvm/llvm-project#81534
llvm/llvm-project#81534 breaks building (Fuchsia) Clang toolchain on Windows.
Log: https://logs.chromium.org/logs/fuchsia/buildbucket/cr-buildbucket/8756186536543250705/+/u/clang/install/stdout
Builder: https://ci.chromium.org/ui/p/fuchsia/builders/toolchain.ci/clang-windows-x64/b8756186536543250705/overview
```
FAILED: tools/clang/tools/extra/clang-include-fixer/tool/CMakeFiles/clang-include-fixer.dir/ClangIncludeFixer.cpp.obj
C:\b\s\w\ir\x\w\cipd\bin\clang-cl.exe /nologo -TP -DCLANG_REPOSITORY_STRING=\"https://llvm.googlesource.com/llvm-project\" -DGTEST_HAS_RTTI=0 -DUNICODE -D_CRT_NONSTDC_NO_DEPRECATE -D_CRT_NONSTDC_NO_WARNINGS -D_CRT_SECURE_NO_DEPRECATE -D_CRT_SECURE_NO_WARNINGS -D_GLIBCXX_ASSERTIONS -D_HAS_EXCEPTIONS=0 -D_SCL_SECURE_NO_DEPRECATE -D_SCL_SECURE_NO_WARNINGS -D_UNICODE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -IC:\b\s\w\ir\x\w\llvm_build\tools\clang\tools\extra\clang-include-fixer\tool -IC:\b\s\w\ir\x\w\llvm-llvm-project\clang-tools-extra\clang-include-fixer\tool -IC:\b\s\w\ir\x\w\llvm-llvm-project\clang\include -IC:\b\s\w\ir\x\w\llvm_build\tools\clang\include -IC:\b\s\w\ir\x\w\recipe_cleanup\tensorflow-venv\store\python_venv-q9i5kpsp0iun0ktmqgab125ti8\contents\Lib\site-packages\tensorflow\include -IC:\b\s\w\ir\x\w\llvm_build\include -IC:\b\s\w\ir\x\w\llvm-llvm-project\llvm\include -IC:\b\s\w\ir\x\w\llvm-llvm-project\clang-tools-extra\clang-include-fixer\tool\.. -imsvcC:\b\s\w\ir\x\w\zlib_install_target\include -imsvcC:\b\s\w\ir\x\w\zstd_install\include /DWIN32 /D_WINDOWS /Zc:inline /Zc:__cplusplus /Oi /Brepro /bigobj /permissive- /W4 -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wsuggest-override -Wstring-conversion -Wmisleading-indentation -Wctad-maybe-unsupported /Gw -no-canonical-prefixes /O2 /Ob2 -std:c++17 -MT /EHs-c- /GR- -UNDEBUG /showIncludes /Fotools\clang\tools\extra\clang-include-fixer\tool\CMakeFiles\clang-include-fixer.dir\ClangIncludeFixer.cpp.obj /Fdtools\clang\tools\extra\clang-include-fixer\tool\CMakeFiles\clang-include-fixer.dir\ -c -- C:\b\s\w\ir\x\w\llvm-llvm-project\clang-tools-extra\clang-include-fixer\tool\ClangIncludeFixer.cpp
In file included from C:\b\s\w\ir\x\w\llvm-llvm-project\clang-tools-extra\clang-include-fixer\tool\ClangIncludeFixer.cpp:11:
In file included from C:\b\s\w\ir\x\w\llvm-llvm-project\clang-tools-extra\clang-include-fixer\tool\..\IncludeFixer.h:15:
In file included from C:\b\s\w\ir\x\w\llvm-llvm-project\clang\include\clang/Sema/ExternalSemaSource.h:15:
In file included from C:\b\s\w\ir\x\w\llvm-llvm-project\clang\include\clang/AST/ExternalASTSource.h:18:
In file included from C:\b\s\w\ir\x\w\llvm-llvm-project\clang\include\clang/AST/DeclBase.h:18:
In file included from C:\b\s\w\ir\x\w\llvm-llvm-project\clang\include\clang/AST/DeclarationName.h:18:
In file included from C:\b\s\w\ir\x\w\llvm-llvm-project\clang\include\clang/Basic/IdentifierTable.h:18:
In file included from C:\b\s\w\ir\x\w\llvm-llvm-project\clang\include\clang/Basic/Builtins.h:63:
C:\b\s\w\ir\x\w\llvm_build\tools\clang\include\clang/Basic/Builtins.inc(151,1): error: redefinition of enumerator 'BI_alloca'
  151 | LANGBUILTIN(_alloca, "v*z", "n", ALL_MS_LANGUAGES)
      | ^
C:\b\s\w\ir\x\w\llvm_build\tools\clang\include\clang/Basic/Builtins.inc(15,54): note: expanded from macro 'LANGBUILTIN'
   15 | # define LANGBUILTIN(ID, TYPE, ATTRS, BUILTIN_LANG) BUILTIN(ID, TYPE, ATTRS)
      | ^
C:\b\s\w\ir\x\w\llvm-llvm-project\clang\include\clang/Basic/Builtins.h(62,34): note: expanded from macro 'BUILTIN'
   62 | #define BUILTIN(ID, TYPE, ATTRS) BI##ID,
      | ^
<scratch space>(72,1): note: expanded from here
   72 | BI_alloca
      | ^
C:\b\s\w\ir\x\w\llvm_build\tools\clang\include\clang/Basic/Builtins.inc(150,1): note: previous definition is here
  150 | LIBBUILTIN(alloca, "v*z", "fn", STDLIB_H, ALL_GNU_LANGUAGES)
      | ^
C:\b\s\w\ir\x\w\llvm_build\tools\clang\include\clang/Basic/Builtins.inc(11,61): note: expanded from macro 'LIBBUILTIN'
   11 | # define LIBBUILTIN(ID, TYPE, ATTRS, HEADER, BUILTIN_LANG) BUILTIN(ID, TYPE, ATTRS)
      | ^
C:\b\s\w\ir\x\w\llvm-llvm-project\clang\include\clang/Basic/Builtins.h(62,34): note: expanded from macro 'BUILTIN'
   62 | #define BUILTIN(ID, TYPE, ATTRS) BI##ID,
      | ^
<scratch space>(71,1): note: expanded from here
   71 | BI_alloca
      | ^
```
2024-02-13[InstCombine] Extend `(lshr/shl (shl/lshr -1, x), x)` -> `(lshr/shl -1, x)` for multi-useNoah Goldstein7-49/+61
We previously did this iff the inner `(shl/lshr -1, x)` was one-use. No instructions are added even if the inner `(shl/lshr -1, x)` is multi-use, and this canonicalization both makes the resulting instruction easier to analyze and shrinks its dependency chain. Closes #81576
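The identity behind this fold can be checked exhaustively for a fixed bit width. The following is a minimal sketch on 32-bit unsigned integers, not LLVM code; the helper names are hypothetical and shift amounts are restricted to values below the bit width, since larger shifts are poison in LLVM IR and undefined in C++.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers modelling the IR patterns on 32-bit values.
// The shift amount x must be < 32.
constexpr std::uint32_t kAllOnes = ~std::uint32_t{0};

// (lshr (shl -1, x), x)
constexpr std::uint32_t lshr_of_shl(unsigned x) { return (kAllOnes << x) >> x; }
// (lshr -1, x)
constexpr std::uint32_t lshr_only(unsigned x) { return kAllOnes >> x; }
// (shl (lshr -1, x), x)
constexpr std::uint32_t shl_of_lshr(unsigned x) { return (kAllOnes >> x) << x; }
// (shl -1, x)
constexpr std::uint32_t shl_only(unsigned x) { return kAllOnes << x; }

// Returns true if both directions of the fold hold for every valid shift.
bool fold_holds_for_all_shift_amounts() {
  for (unsigned x = 0; x < 32; ++x) {
    if (lshr_of_shl(x) != lshr_only(x)) return false;
    if (shl_of_lshr(x) != shl_only(x)) return false;
  }
  return true;
}
```

Because the outer shift never depends on the bits the inner shift cleared, the replacement needs no extra instruction even when the inner shift has other users.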
2024-02-13[NFC][InstrProf]Factor out getCanonicalName to compute the canonical name given a pgo name. (#81547)Mingming Liu2-19/+41
Also update `InstrProf::addFuncWithName` to call the newly added `getCanonicalName`.
2024-02-13[libc] Remove leftover target dependent intrinsicJoseph Huber1-8/+0
Summary: I forgot to remove these because I thought I did it already. This caused the build to fail when actually linked.
2024-02-13[mlir][ROCDL] Add synchronization primitives (#80888)Giuseppe Rossini3-0/+44
This PR adds two LLVM intrinsics to MLIR:
- llvm.amdgcn.s.setprio, which sets the priority of a wave for the GPU scheduler
- llvm.amdgcn.sched.barrier, which sets a software barrier so that the scheduler cannot move instructions around
2024-02-13[libc] Remove remaining GPU architecture dependent instructions (#81612)Joseph Huber5-32/+14
Summary: Recent patches have added solutions to the remaining sources of divergence. This patch simply removes the last occurrences of things like `has_builtin`, `ifdef` or builtins with feature requirements. The one exception here is `nanosleep`, but I made changes in the `__nvvm_reflect` pass to make usage like this actually work at O0. Depends on https://github.com/llvm/llvm-project/pull/81331
2024-02-13[lldb][NFCI] Add header guard to PlatformRemoteAppleXR.h (#81565)Alex Langford1-0/+5
2024-02-13[flang][cuda] Lower cluster_dims values (#81636)Valentin Clement (バレンタイン クレメン)5-2/+36
This PR adds a new attribute to carry over the information from `cluster_dims`. The new attribute `CUDAClusterDimsAttr` holds 3 integer attributes and is added to `func.func` operation.
2024-02-13[RISCV] Enable the TypePromotion pass from AArch64/ARM.Craig Topper7-87/+190
This pass looks for unsigned icmps that have illegal types and tries to widen the use/def graph to improve the placement of the zero extends that type legalization would need to insert. I've explicitly disabled it for i32 by adding a check for isSExtCheaperThanZExt to the pass. The generated code isn't perfect, but my data shows a net dynamic instruction count improvement on spec2017 for both base and Zba+Zbb+Zbs.
2024-02-13[RISCV] Copy typepromotion-overflow.ll from AArch64. NFCCraig Topper1-0/+388
2024-02-13[clang] Remove #undef alloca workaround (#81534)Arthur Eubanks1-4/+0
Added in 26670dcba1609574cba5942aff78ff97b567c5f3 to work around #4885. Windows CI and a local Windows build are happy with this change, so it seems like this has been properly fixed at some point. If this does break somebody, this can be easily reverted. (Also, Linux does the same `#define alloca` in system headers, so I'm not sure why it'd be different on Windows.) This is tech debt that caused breakages, see comments on #71709.
2024-02-13[IRGen][AArch64][RISCV] Generalize bitcast between i1 predicate vector and i8 fixed vector. (#76548)Craig Topper7-130/+94
Instead of only handling vscale x 16 x i1 predicate vectors, handle any scalable i1 vector where the known minimum is divisible by 8. This is used on RISC-V where we have multiple sizes of predicate types.
2024-02-13[TableGen] Trivial simplification in computeRegUnitSets. NFC.Jay Foad1-4/+2
2024-02-13[TableGen] Remove trivial helper function hasRegUnit. NFC.Jay Foad1-8/+2
2024-02-13[libc] Round up time for GPU nanosleep implementation (#81630)Joseph Huber1-7/+8
Summary: The GPU `nanosleep` tests would occasionally fail. This was due to the fact that we used integer division to determine how many ticks we had to sleep for. This would then truncate, leaving us with a value just slightly below the requested value. This would then occasionally leave us with a return value of `-1`. This patch just changes the code to round up by 1 so we always sleep for at least the requested value.
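The truncation problem described above can be illustrated with plain integer arithmetic. This is a sketch only, not the libc implementation: the function names and the clock frequency parameter are assumptions, and it uses ceiling division as one way to round up, whereas the patch itself is described as rounding up by 1.

```cpp
#include <cstdint>

// Hypothetical constant and helpers, for illustration only.
constexpr std::uint64_t kNanosPerSecond = 1'000'000'000ULL;

// Truncating conversion: may yield fewer ticks than the requested time.
constexpr std::uint64_t ticks_truncated(std::uint64_t nanos,
                                        std::uint64_t clock_hz) {
  return nanos * clock_hz / kNanosPerSecond;
}

// Ceiling conversion: never yields fewer ticks than the requested time.
// Note: nanos * clock_hz can overflow for very large inputs; a real
// implementation would need to guard against that.
constexpr std::uint64_t ticks_rounded_up(std::uint64_t nanos,
                                         std::uint64_t clock_hz) {
  return (nanos * clock_hz + kNanosPerSecond - 1) / kNanosPerSecond;
}
```

With a 100 MHz clock, a 15 ns request truncates to 1 tick (10 ns), which is shorter than requested; rounding up gives 2 ticks.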
2024-02-13[flang][cuda] Lower launch_bounds values (#81537)Valentin Clement (バレンタイン クレメン)5-7/+64
This PR adds a new attribute to carry over the information from `launch_bounds`. The new attribute `CUDALaunchBoundsAttr` holds 2 to 3 integer attributes and is added to `func.func` operation.
2024-02-13[libc] Rework the RPC interface to accept runtime wave sizes (#80914)Joseph Huber6-131/+108
Summary: The RPC interface needs to handle an entire warp or wavefront at once. This is currently done by using a compile time constant indicating the size of the buffer, which right now defaults to some value on the client (GPU) side. However, there are currently attempts to move the `libc` library to a single IR build. This is problematic as the size of the wavefronts changes between ISAs on AMDGPU. The builtin `__builtin_amdgcn_wavefrontsize()` will return the appropriate value, but it is only known at runtime now. In order to support this, this patch restructures the packet. Now instead of having an array of arrays, we simply have a large array of buffers and slice it according to the runtime value if we don't know it ahead of time. This also has the advantage of making the buffer contiguous within a page now that the header has been moved out of it.
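The "large array of buffers sliced by a runtime value" layout can be sketched on the host as follows. This is an illustration under stated assumptions, not the libc RPC code: `Buffer`, `PacketStorage`, and `slice` are hypothetical names, and the lane size is simply passed in at construction time.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical per-lane payload; the real buffer layout is not shown in
// the commit message.
struct Buffer {
  std::uint64_t data[8];
};

// One flat, contiguous array of buffers. Each port owns `lane_size`
// consecutive entries, where `lane_size` is only known at runtime.
class PacketStorage {
public:
  PacketStorage(std::size_t num_ports, std::size_t lane_size)
      : lane_size_(lane_size), buffers_(num_ports * lane_size) {}

  // Returns a pointer to the first buffer belonging to `port`.
  Buffer *slice(std::size_t port) {
    return buffers_.data() + port * lane_size_;
  }

  std::size_t lane_size() const { return lane_size_; }

private:
  std::size_t lane_size_;
  std::vector<Buffer> buffers_;
};
```

The same storage type then works for a lane size of 32 or 64 without recompiling, which is the property a single IR build needs.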
2024-02-13[mlir][nfc] Add tests for linalg.mmt4d (#81422)Andrzej Warzyński4-0/+132
linalg.mmt4d was added a while back (https://reviews.llvm.org/D105244), but there are virtually no tests in-tree. In the spirit of documenting through test, this PR adds a few basic examples.
2024-02-13[clang][docs] Fix warning in LanguageExtensionsDavid Spickett1-1/+1
build-llvm/tools/clang/docs/LanguageExtensions.rst:2768: WARNING: Title underline too short.
2024-02-13[lldb-dap][NFC] Add Breakpoint struct to share common logic. (#80753)Zequan Wu13-402/+459
This adds a layer between `SourceBreakpoint`/`FunctionBreakpoint` and `BreakpointBase` to have better separation and encapsulation so we are not directly operating on `SBBreakpoint`. I basically moved the `SBBreakpoint` and the methods that require it from `BreakpointBase` to `Breakpoint`. This makes it easier to add support for data watchpoints by sharing the logic inside `BreakpointBase`.
2024-02-13[clang][Driver] Small correction to print-runtime-dirDavid Spickett1-1/+1
2024-02-13[DirectX][NFC] Change specification of overload types and attribute in DXIL.td (#81184)S. Bharadwaj Yadavalli2-71/+133
- Specify overload types of DXIL Operation as list of types instead of a string.
- Add supported DXIL type record definitions to `DXIL.td` leveraging `LLVMType` to avoid duplicate definitions.
- Spell out DXIL Operation Attribute specification string.
- Make corresponding changes to process the records in DXILEmitter.cpp
2024-02-13[TableGen] Do not speculatively grow RegUnitSets. NFC.Jay Foad1-17/+8
This seems to be a trick to avoid copying a RegUnitSet, but it can be done more simply using std::move.
2024-02-13[LLVM] Add `__builtin_readsteadycounter` intrinsic and builtin for realtime clocks (#81331)Joseph Huber35-72/+229
Summary: This patch adds a new intrinsic and builtin function mirroring the existing `__builtin_readcyclecounter`. The difference is that this implementation targets a separate counter that some targets have, which returns a fixed frequency clock that can be used to determine elapsed time. This differs from the cycle counter, which often has a variable frequency. This patch only adds support for the NVPTX and AMDGPU targets. This is done as a new and separate builtin rather than an argument to `readcyclecounter` to avoid needing to change existing code and to make the separation more explicit.
2024-02-13[clang][Driver][HLSL] Fix formatting of clang-dxc options group titleDavid Spickett1-2/+2
Some extra `<>` and a missing full stop.
2024-02-13[Flang] Add __powerpc__ macro to set c_intmax_t to c_int64_t rather than c_int128_t as PowerPC only supports up to c_int64_t. (#81222)Daniel Chen3-1/+29
PowerPC only supports up to `c_int64_t`. Add macro `__powerpc__` and preprocess it for setting `c_intmax_t` in the `iso_c_binding` intrinsic module.
2024-02-13[NFC][LLVM][AsmWriter] Extract logic to write out ConstantFP from WriteConstantInternal.Paul Walker1-88/+94
This makes it easier to extend the code to support vector types.
2024-02-13[TableGen] Use emplace_back instead of resize to size() + 1. NFC.Jay Foad3-23/+18
2024-02-13[SLP] Add X86 version of non-power-of-2 vectorization tests.Florian Hahn4-0/+956
Extra X86 tests for https://github.com/llvm/llvm-project/pull/77790.
2024-02-13ci: Temporarily disable the buildkite job on Windows (#81538)Tom Stellard1-1/+4
The failure rate is too high. See https://discourse.llvm.org/t/rfc-future-of-windows-pre-commit-ci/76840
2024-02-13[DAGCombine] Fix multi-use miscompile in load combine (#81586)Nikita Popov2-3/+2
The load combine replaces a number of original loads with one new load and also replaces the output chains of the original loads with the output chain of the new load. This is incorrect if the original load is retained (due to multi-use), as it may get incorrectly reordered. Fix this by using makeEquivalentMemoryOrdering() instead, which will create a TokenFactor with both chains. Fixes https://github.com/llvm/llvm-project/issues/80911.
2024-02-13[ARM] __ARM_ARCH macro definition fix (#81493)James Westwood8-22/+112
This patch changes how the macro __ARM_ARCH is defined to match its definition in the ACLE. In ACLE 5.4.1, __ARM_ARCH is defined as equal to the major architecture version for ISAs up to and including v8. From v8.1 onwards, its definition is changed to include minor versions, such that for an architecture vX.Y, __ARM_ARCH = X*100 + Y. Before this patch, LLVM defined __ARM_ARCH using only the major architecture version for all architecture versions. This patch adds functionality to define __ARM_ARCH correctly for architectures greater than or equal to v8.1.
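The rule stated above can be written down as a small function. This is a sketch of the rule as the commit message describes it, not the code in the patch; the function name `arm_arch_macro_value` is hypothetical.

```cpp
// Hypothetical helper: computes the value __ARM_ARCH should have for an
// architecture vMAJOR.MINOR according to the rule quoted from the ACLE.
constexpr unsigned arm_arch_macro_value(unsigned major, unsigned minor) {
  // Up to and including v8 (that is, v8.0): the major version alone.
  if (major < 8 || (major == 8 && minor == 0))
    return major;
  // From v8.1 onwards: X * 100 + Y.
  return major * 100 + minor;
}

// Compile-time checks of the two cases named in the message.
static_assert(arm_arch_macro_value(8, 0) == 8, "v8 keeps the major version");
static_assert(arm_arch_macro_value(8, 1) == 801, "v8.1 becomes 801");
```

Code that previously tested `__ARM_ARCH >= 8` keeps working under this scheme, since every value from v8.1 onwards is larger than 8.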
2024-02-13[GitHub][workflows] Ask reviewers to merge PRs when author cannot (#81142)David Spickett2-0/+104
This uses https://pygithub.readthedocs.io/en/stable/github_objects/Repository.html?highlight=get_collaborator_permission#github.Repository.Repository.get_collaborator_permission, which calls https://docs.github.com/en/rest/collaborators/collaborators?apiVersion=2022-11-28#get-repository-permissions-for-a-user and returns the top level "permission" key. This is less detailed than the user/permissions key but should be fine for this use case. When a review is submitted we check:
* If it's an approval.
* Whether we have already left a merge-on-behalf comment (by looking for a hidden HTML comment).
* Whether the author has permissions to merge their own PR.
* Whether the reviewer has permissions to merge.
If needed we leave a comment tagging the reviewer. If the reviewer also doesn't have merge permission, then it asks them to find someone else who does.
2024-02-13Make use of std::inserter. NFC.Jay Foad2-5/+5
2024-02-13Fix warning by removing unused variable (#81604)Mats Petersson1-1/+1
Apparently, some compilers [correctly] warn that the variable that was created prior to this change is unused. This removes the variable.
2024-02-13[TableGen] Use std::move instead of swap. NFC. (#81606)Jay Foad5-12/+11
Historically TableGen has used `A.swap(B)` to move containers without the expense of copying them. Perhaps this predated rvalue references. In any case `A = std::move(B)` seems like a more direct way to implement this when only A is required after the operation.
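The two idioms compared above look like this in isolation. A minimal sketch with hypothetical function names, using `std::vector<std::string>` as a stand-in for TableGen's containers.

```cpp
#include <string>
#include <utility>
#include <vector>

// Older idiom: swap the source into an empty destination. The source ends
// up holding the destination's previous (empty) contents.
std::vector<std::string> take_with_swap(std::vector<std::string> &src) {
  std::vector<std::string> dst;
  dst.swap(src);
  return dst;
}

// Idiom suggested in the commit message: move-assign. Afterwards only the
// destination is meaningful; the source is left valid but unspecified.
std::vector<std::string> take_with_move(std::vector<std::string> &src) {
  std::vector<std::string> dst;
  dst = std::move(src);
  return dst;
}
```

Both avoid copying the elements. The move form states the intent directly when the source is not used again, which is the situation the commit message describes.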
2024-02-13Reapply "[DebugInfo][RemoveDIs] Turn on non-instrinsic debug-info by default"OCHyams1-1/+1
This reapplies commit bdde5f9 by undoing the revert bc66e0c. The previous reapplication 5c9f768 was reverted due to a crash (reproducer in comments for 5c9f768) which was fixed in #81595. As noted in the original commit, this commit may break downstream tests. If this commit is breaking your downstream tests, please see comment 12 in [0], which documents the kind of variation in tests we'd expect to see from this change and what to do about it. [0] https://discourse.llvm.org/t/rfc-instruction-api-changes-needed-to-eliminate-debug-intrinsics-from-ir/68939
2024-02-13[Object][COFF][NFC] Make writeImportLibrary NativeExports argument optional. (#81600)Jacek Caban5-18/+26
It's not interesting for the majority of downstream users.
2024-02-13MCDCTypes.h: Add ctors, fixup for #81227NAKAMURA Takumi1-0/+4
2024-02-13[gn build] Port f65577830073LLVM GN Syncbot1-0/+1
2024-02-13[OpenACC] Implement AST for OpenACC Compute Constructs (#81188)Erich Keane24-6/+352
'serial', 'parallel', and 'kernel' constructs are all considered 'Compute' constructs. This patch creates the AST type, plus the required infrastructure for such a type, plus some base types that will be useful in the future for breaking this up. The only difference between the three is the 'kind' (plus some minor clause legalization rules, but those can be differentiated easily enough), so rather than representing them as separate AST nodes, it seems to make sense to make them the same. Additionally, no clause AST functionality is being implemented yet, as that fits better in a separate patch, and this is enough to get the 'naked' constructs implemented. This is otherwise an 'NFC' patch, as it doesn't alter execution at all, so there aren't any tests. I did this to break up the review workload and to get feedback on the layout.
2024-02-13[clang][Interp] Handle Requires- and ConceptSpecializationExprsTimm Bäder3-0/+16
Just emit their satisfaction state, which is what the current interpreter does as well.
2024-02-13[TableGen] Use vectors instead of sets for testing intersection. NFC. (#81602)Jay Foad2-10/+8
In a few places we test whether sets (i.e. sorted ranges) intersect by computing the set_intersection and then testing whether it is empty. For this purpose it should be more efficient to use a std::vector instead of a std::set to hold the result of the set_intersection, since insertion is simpler.
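The pattern described above, collecting the intersection into a vector and testing for emptiness, can be sketched as follows. The function name and the use of `int` elements are assumptions for illustration; both inputs must already be sorted, as `std::set_intersection` requires.

```cpp
#include <algorithm>
#include <iterator>
#include <vector>

// Hypothetical helper: returns true if two sorted ranges share an element.
// The result is collected in a std::vector, where appending is a cheap
// push_back rather than a tree insertion as with std::set.
bool sorted_ranges_intersect(const std::vector<int> &a,
                             const std::vector<int> &b) {
  std::vector<int> common;
  std::set_intersection(a.begin(), a.end(), b.begin(), b.end(),
                        std::back_inserter(common));
  return !common.empty();
}
```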
2024-02-13[MC/DC] Refactor: Make `MCDCParams` as `std::variant` (#81227)NAKAMURA Takumi7-100/+153
Introduce `mcdc::DecisionParameters` and `mcdc::BranchParameters` and make sure they are not initialized as zero. FIXME: Could we make `CoverageMappingRegion` a smart tagged union?
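The general shape of such a `std::variant` can be sketched like this. It is an illustration only: the struct names, field names, and the use of `std::monostate` for the "no parameters" case are assumptions, not the definitions in MCDCTypes.h. Deleting the default constructors is one way to prevent accidental zero initialization.

```cpp
#include <variant>

// Hypothetical parameter structs. Neither can be default-constructed, so a
// value cannot silently start out as all zeros.
struct DecisionParams {
  unsigned bitmap_idx;
  unsigned num_conditions;
  DecisionParams() = delete;
  DecisionParams(unsigned bitmap_idx, unsigned num_conditions)
      : bitmap_idx(bitmap_idx), num_conditions(num_conditions) {}
};

struct BranchParams {
  int id;
  BranchParams() = delete;
  explicit BranchParams(int id) : id(id) {}
};

// Either no parameters, decision parameters, or branch parameters.
using Params = std::variant<std::monostate, DecisionParams, BranchParams>;

// Returns true if the variant currently holds branch parameters.
bool is_branch(const Params &p) {
  return std::holds_alternative<BranchParams>(p);
}
```

A default-constructed `Params` holds `std::monostate`, so "no MC/DC parameters" is an explicit state rather than a zero-filled struct.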
2024-02-13[SystemZ][z/OS][libcxx] mark aligned allocation tests XFAIL on z/OS (#80735)Abhina Sree4-0/+16
z/OS doesn't support aligned allocation, so mark these testcases as unsupported. Continuation of https://reviews.llvm.org/D102798
2024-02-13[clang][Interp] Handle CXXUuidofExprsTimm Bäder6-1/+74
Allocate storage and initialize it with the given APValue contents.