Age | Commit message | Author | Files | Lines |
|
|
|
This CMakeLists.txt is used to build modules without build system
support. This was removed in d06ae33ec32122bb526fb35025c1f0cf979f1090.
This is used in the documentation on how to use modules.
Made some minor changes to make it work with the std.compat module using
the std module.
Note the CMakeLists.txt in the build dir should be removed once build
system support is generally available.
|
|
This completes the unrevert of ef388334ee5a3584255b9ef5b3fefdb244fa3fd7.
|
|
(#81635)
Prevents isel errors when trying to lower gc relocate of undef value
(which turns into CopyToReg of TargetConstant). Such relocates may occur
after DCE (e.g. after GVN removes some dead blocks) if no passes like
instcombine are scheduled afterwards to clean them up.
Fixes #80294
---------
Co-authored-by: Matt Arsenault <arsenm2@gmail.com>
|
|
Reverts llvm/llvm-project#81534
llvm/llvm-project#81534 breaks building (Fuchsia) Clang toolchain on
Windows.
Log:
https://logs.chromium.org/logs/fuchsia/buildbucket/cr-buildbucket/8756186536543250705/+/u/clang/install/stdout
Builder:
https://ci.chromium.org/ui/p/fuchsia/builders/toolchain.ci/clang-windows-x64/b8756186536543250705/overview
```
FAILED: tools/clang/tools/extra/clang-include-fixer/tool/CMakeFiles/clang-include-fixer.dir/ClangIncludeFixer.cpp.obj
C:\b\s\w\ir\x\w\cipd\bin\clang-cl.exe /nologo -TP -DCLANG_REPOSITORY_STRING=\"https://llvm.googlesource.com/llvm-project\" -DGTEST_HAS_RTTI=0 -DUNICODE -D_CRT_NONSTDC_NO_DEPRECATE -D_CRT_NONSTDC_NO_WARNINGS -D_CRT_SECURE_NO_DEPRECATE -D_CRT_SECURE_NO_WARNINGS -D_GLIBCXX_ASSERTIONS -D_HAS_EXCEPTIONS=0 -D_SCL_SECURE_NO_DEPRECATE -D_SCL_SECURE_NO_WARNINGS -D_UNICODE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -IC:\b\s\w\ir\x\w\llvm_build\tools\clang\tools\extra\clang-include-fixer\tool -IC:\b\s\w\ir\x\w\llvm-llvm-project\clang-tools-extra\clang-include-fixer\tool -IC:\b\s\w\ir\x\w\llvm-llvm-project\clang\include -IC:\b\s\w\ir\x\w\llvm_build\tools\clang\include -IC:\b\s\w\ir\x\w\recipe_cleanup\tensorflow-venv\store\python_venv-q9i5kpsp0iun0ktmqgab125ti8\contents\Lib\site-packages\tensorflow\include -IC:\b\s\w\ir\x\w\llvm_build\include -IC:\b\s\w\ir\x\w\llvm-llvm-project\llvm\include -IC:\b\s\w\ir\x\w\llvm-llvm-project\clang-tools-extra\clang-include-fixer\tool\.. -imsvcC:\b\s\w\ir\x\w\zlib_install_target\include -imsvcC:\b\s\w\ir\x\w\zstd_install\include /DWIN32 /D_WINDOWS /Zc:inline /Zc:__cplusplus /Oi /Brepro /bigobj /permissive- /W4 -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wsuggest-override -Wstring-conversion -Wmisleading-indentation -Wctad-maybe-unsupported /Gw -no-canonical-prefixes /O2 /Ob2 -std:c++17 -MT /EHs-c- /GR- -UNDEBUG /showIncludes /Fotools\clang\tools\extra\clang-include-fixer\tool\CMakeFiles\clang-include-fixer.dir\ClangIncludeFixer.cpp.obj /Fdtools\clang\tools\extra\clang-include-fixer\tool\CMakeFiles\clang-include-fixer.dir\ -c -- C:\b\s\w\ir\x\w\llvm-llvm-project\clang-tools-extra\clang-include-fixer\tool\ClangIncludeFixer.cpp
In file included from C:\b\s\w\ir\x\w\llvm-llvm-project\clang-tools-extra\clang-include-fixer\tool\ClangIncludeFixer.cpp:11:
In file included from C:\b\s\w\ir\x\w\llvm-llvm-project\clang-tools-extra\clang-include-fixer\tool\..\IncludeFixer.h:15:
In file included from C:\b\s\w\ir\x\w\llvm-llvm-project\clang\include\clang/Sema/ExternalSemaSource.h:15:
In file included from C:\b\s\w\ir\x\w\llvm-llvm-project\clang\include\clang/AST/ExternalASTSource.h:18:
In file included from C:\b\s\w\ir\x\w\llvm-llvm-project\clang\include\clang/AST/DeclBase.h:18:
In file included from C:\b\s\w\ir\x\w\llvm-llvm-project\clang\include\clang/AST/DeclarationName.h:18:
In file included from C:\b\s\w\ir\x\w\llvm-llvm-project\clang\include\clang/Basic/IdentifierTable.h:18:
In file included from C:\b\s\w\ir\x\w\llvm-llvm-project\clang\include\clang/Basic/Builtins.h:63:
C:\b\s\w\ir\x\w\llvm_build\tools\clang\include\clang/Basic/Builtins.inc(151,1): error: redefinition of enumerator 'BI_alloca'
151 | LANGBUILTIN(_alloca, "v*z", "n", ALL_MS_LANGUAGES)
| ^
C:\b\s\w\ir\x\w\llvm_build\tools\clang\include\clang/Basic/Builtins.inc(15,54): note: expanded from macro 'LANGBUILTIN'
15 | # define LANGBUILTIN(ID, TYPE, ATTRS, BUILTIN_LANG) BUILTIN(ID, TYPE, ATTRS)
| ^
C:\b\s\w\ir\x\w\llvm-llvm-project\clang\include\clang/Basic/Builtins.h(62,34): note: expanded from macro 'BUILTIN'
62 | #define BUILTIN(ID, TYPE, ATTRS) BI##ID,
| ^
<scratch space>(72,1): note: expanded from here
72 | BI_alloca
| ^
C:\b\s\w\ir\x\w\llvm_build\tools\clang\include\clang/Basic/Builtins.inc(150,1): note: previous definition is here
150 | LIBBUILTIN(alloca, "v*z", "fn", STDLIB_H, ALL_GNU_LANGUAGES)
| ^
C:\b\s\w\ir\x\w\llvm_build\tools\clang\include\clang/Basic/Builtins.inc(11,61): note: expanded from macro 'LIBBUILTIN'
11 | # define LIBBUILTIN(ID, TYPE, ATTRS, HEADER, BUILTIN_LANG) BUILTIN(ID, TYPE, ATTRS)
| ^
C:\b\s\w\ir\x\w\llvm-llvm-project\clang\include\clang/Basic/Builtins.h(62,34): note: expanded from macro 'BUILTIN'
62 | #define BUILTIN(ID, TYPE, ATTRS) BI##ID,
| ^
<scratch space>(71,1): note: expanded from here
71 | BI_alloca
| ^
```
|
|
for multi-use
We previously did this iff the inner `(shl/lshr -1, x)` was
one-use. No instructions are added even if the inner `(shl/lshr -1,
x)` is multi-use and this canonicalization both makes the resulting
instruction easier to analyze and shrinks its dependency chain.
Closes #81576
|
|
given a pgo name. (#81547)
- Also update the `InstrProf::addFuncWithName` to call the newly added
`getCanonicalName`.
|
|
Summary:
I forgot to remove these because I thought I did it already. This caused
the build to fail when actually linked.
|
|
This PR adds two LLVM intrinsics to MLIR:
- llvm.amdgcn.s.setprio which sets the priority of a wave for the GPU
scheduler
- llvm.amdgcn.sched.barrier which sets a software barrier so that the
scheduler cannot move instructions around
|
|
Summary:
Recent patches have added solutions to the remaining sources of
divergence. This patch simply removes the last occurrences of things like
`has_builtin`, `ifdef` or builtins with feature requirements. The one
exception here is `nanosleep`, but I made changes in the
`__nvvm_reflect` pass to make usage like this actually work at O0.
Depends on https://github.com/llvm/llvm-project/pull/81331
|
|
|
|
This PR adds a new attribute to carry over the information from
`cluster_dims`. The new attribute `CUDAClusterDimsAttr` holds 3 integer
attributes and is added to the `func.func` operation.
|
|
This pass looks for unsigned icmps that have illegal types and tries
to widen the use/def graph to improve the placement of the zero
extends that type legalization would need to insert.
I've explicitly disabled it for i32 by adding a check for
isSExtCheaperThanZExt to the pass.
The generated code isn't perfect, but my data shows a net
dynamic instruction count improvement on spec2017 for both base and
Zba+Zbb+Zbs.
|
|
|
|
Added in 26670dcba1609574cba5942aff78ff97b567c5f3 to work around #4885.
Windows CI and a local Windows build are happy with this change, so it
seems like this has been properly fixed at some point. If this does
break somebody, this can be easily reverted. (Also, Linux does the same
`#define alloca` in system headers, so I'm not sure why it'd be
different on Windows)
This is tech debt that caused breakages, see comments on #71709.
|
|
i8 fixed vector. (#76548)
Instead of only handling vscale x 16 x i1 predicate vectors, handle any
scalable i1 vector where the known minimum is divisible by 8.
This is used on RISC-V where we have multiple sizes of predicate
types.
|
|
|
|
|
|
Summary:
The GPU `nanosleep` tests would occasionally fail. This was due to the
fact that we used integer division to determine how many ticks we had to
sleep for. This would then truncate, leaving us with a value just
slightly below the requested value. This would then occasionally leave
us with a return value of `-1`. This patch just changes the code to
round up by 1 so we always sleep for at least the requested value.
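The truncation described above can be illustrated with a minimal sketch. The function names and the 100 MHz clock rate are hypothetical and not taken from the libc source. The patch is described as rounding up by 1; the ceiling division below is one common way to get the same "never sleep too short" guarantee and is not necessarily what the patch does.

```python
NS_PER_SECOND = 1_000_000_000


def ticks_truncating(nanoseconds: int, ticks_per_second: int) -> int:
    """Integer division truncates, so the tick count can fall short of the request."""
    return (nanoseconds * ticks_per_second) // NS_PER_SECOND


def ticks_rounded_up(nanoseconds: int, ticks_per_second: int) -> int:
    """Ceiling division: always enough ticks to cover the requested duration."""
    return (nanoseconds * ticks_per_second + NS_PER_SECOND - 1) // NS_PER_SECOND


if __name__ == "__main__":
    # Hypothetical 100 MHz clock: one tick every 10 ns, request 15 ns.
    print(ticks_truncating(15, 100_000_000))
    print(ticks_rounded_up(15, 100_000_000))
```

With a 10 ns tick, a 15 ns request truncates to a single tick, which covers less time than was asked for; rounding up yields two ticks.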
|
|
This PR adds a new attribute to carry over the information from
`launch_bounds`. The new attribute `CUDALaunchBoundsAttr` holds 2 to 3
integer attributes and is added to the `func.func` operation.
|
|
Summary:
The RPC interface needs to handle an entire warp or wavefront at once.
This is currently done by using a compile time constant indicating the
size of the buffer, which right now defaults to some value on the client
(GPU) side. However, there are currently attempts to move the `libc`
library to a single IR build. This is problematic as the size of the
wavefronts changes between ISAs on AMDGPU. The builtin
`__builtin_amdgcn_wavefrontsize()` will return the appropriate value,
but it is only known at runtime now.
In order to support this, this patch restructures the packet. Now
instead of having an array of arrays, we simply have a large array of
buffers and slice it according to the runtime value if we don't know it
ahead of time. This also somewhat has the advantage of making the buffer
contiguous within a page now that the header has been moved out of it.
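As a rough illustration of slicing one flat array by a lane size that is only known at runtime, consider the sketch below. The names `port_slice`, `buffers`, `port`, and `lane_size` are hypothetical and do not come from the RPC implementation; the sketch only shows the indexing idea.

```python
def port_slice(buffers: list, port: int, lane_size: int) -> list:
    """Return the `lane_size` buffers that belong to `port` in one flat array.

    With an array of arrays, the inner dimension must be fixed up front.
    With a flat array, the same storage can be cut into slices of whatever
    lane size the hardware reports at runtime.
    """
    start = port * lane_size
    return buffers[start:start + lane_size]


if __name__ == "__main__":
    flat = list(range(128))  # stand-in for 128 buffers
    # The same storage viewed with two different (hypothetical) lane sizes.
    print(port_slice(flat, 1, 32)[:4])
    print(port_slice(flat, 1, 64)[:4])
```

The point of the design is that nothing about the storage depends on the lane size; only the offset computation does.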
|
|
linalg.mmt4d was added a while back (https://reviews.llvm.org/D105244),
but there are virtually no tests in-tree. In the spirit of documenting
through test, this PR adds a few basic examples.
|
|
build-llvm/tools/clang/docs/LanguageExtensions.rst:2768: WARNING: Title underline too short.
|
|
This adds a layer between `SourceBreakpoint`/`FunctionBreakpoint` and
`BreakpointBase` to have better separation and encapsulation so we are
not directly operating on `SBBreakpoint`.
I basically moved the `SBBreakpoint` and the methods that require it
from `BreakpointBase` to `Breakpoint`. This makes it easier to add
support for data watchpoints by sharing the logic inside `BreakpointBase`.
|
|
|
|
DXIL.td (#81184)
- Specify overload types of DXIL Operation as list of types instead of a
string.
- Add supported DXIL type record definitions to `DXIL.td` leveraging
`LLVMType` to avoid duplicate definitions.
- Spell out DXIL Operation Attribute specification string.
- Make corresponding changes to process the records in DXILEmitter.cpp
|
|
This seems to be a trick to avoid copying a RegUnitSet, but it can be
done more simply using std::move.
|
|
clocks (#81331)
Summary:
This patch adds a new intrinsic and builtin function mirroring the
existing `__builtin_readcyclecounter`. The difference is that this
implementation targets a separate counter that some targets have which
returns a fixed frequency clock that can be used to determine elapsed
time. This differs from the cycle counter, which often has a
variable frequency.
This patch only adds support for the NVPTX and AMDGPU targets.
This is done as a new and separate builtin rather than an argument to
`readcyclecounter` to avoid needing to change existing code and to make
the separation more explicit.
|
|
Some extra `<>` and a missing full stop.
|
|
c_int128_t as PowerPC only supports up to c_int64_t. (#81222)
PowerPC only supports up to `c_int64_t`. Add macro `__powerpc__` and
preprocess it for setting `c_intmax_t` in `iso_c_binding` intrinsic
module.
|
|
WriteConstantInternal.
This makes it easier to extend the code to support vector types.
|
|
|
|
Extra X86 tests for https://github.com/llvm/llvm-project/pull/77790.
|
|
The failure rate is too high.
See
https://discourse.llvm.org/t/rfc-future-of-windows-pre-commit-ci/76840
|
|
The load combine replaces a number of original loads with one new load
and also replaces the output chains of the original loads with the
output chain of the new load. This is incorrect if the original load is
retained (due to multi-use), as it may get incorrectly reordered.
Fix this by using makeEquivalentMemoryOrdering() instead, which will
create a TokenFactor with both chains.
Fixes https://github.com/llvm/llvm-project/issues/80911.
|
|
This patch changes how the macro __ARM_ARCH is defined to match its
definition in the ACLE. In ACLE 5.4.1, __ARM_ARCH is defined as equal to
the major architecture version for ISAs up to and including v8. From
v8.1 onwards, its definition is changed to include minor versions, such
that for an architecture vX.Y, __ARM_ARCH = X*100 + Y. Before this
patch, LLVM defined __ARM_ARCH using only the major architecture version
for all architecture versions. This patch adds functionality to define
__ARM_ARCH correctly for architectures greater than or equal to v8.1.
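The rule above can be written out as a small sketch. The helper name `arm_arch_macro` is hypothetical; it only restates the arithmetic given in the commit message and is not the Clang implementation.

```python
def arm_arch_macro(major: int, minor: int = 0) -> int:
    """Value of __ARM_ARCH for architecture v<major>.<minor>, per the rule above.

    Up to and including v8 (i.e. v8.0) the macro is just the major version.
    From v8.1 onwards it is major * 100 + minor.
    """
    if major < 8 or (major == 8 and minor == 0):
        return major
    return major * 100 + minor


if __name__ == "__main__":
    for version in [(7, 0), (8, 0), (8, 1), (8, 4)]:
        print(version, arm_arch_macro(*version))
```

Code that previously tested `__ARM_ARCH == 8` to mean "any v8.x" would need a range check under this scheme, since v8.1 and later no longer compare equal to 8.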
|
|
This uses
https://pygithub.readthedocs.io/en/stable/github_objects/Repository.html?highlight=get_collaborator_permission#github.Repository.Repository.get_collaborator_permission,
which calls
https://docs.github.com/en/rest/collaborators/collaborators?apiVersion=2022-11-28#get-repository-permissions-for-a-user
and returns the top level "permission" key.
This is less detailed than the user/permissions key but should be fine
for this use case.
When a review is submitted we check:
* If it's an approval.
* Whether we have already left a merge on behalf comment (by looking for
a hidden HTML comment).
* Whether the author has permissions to merge their own PR.
* Whether the reviewer has permissions to merge.
If needed we leave a comment tagging the reviewer. If the reviewer also
doesn't have merge permission, then it asks them to find someone else
who does.
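The checks listed above can be sketched as a pure function. Everything here is an assumption made for illustration: the function name, the parameter names, the comment wording, and the choice of `"write"` and `"admin"` as the permission strings that allow merging. The real script obtains the permission strings through PyGithub; this sketch takes them as plain arguments.

```python
from typing import Optional

# Assumption: these top-level permission values are treated as "can merge".
MERGE_PERMISSIONS = {"write", "admin"}


def merge_on_behalf_comment(
    is_approval: bool,
    already_commented: bool,
    author_permission: str,
    reviewer_permission: str,
    reviewer_login: str,
) -> Optional[str]:
    """Return the comment to leave, or None when no comment is needed."""
    if not is_approval or already_commented:
        return None
    if author_permission in MERGE_PERMISSIONS:
        return None  # the author can merge their own PR
    if reviewer_permission in MERGE_PERMISSIONS:
        return f"@{reviewer_login} the author cannot merge this PR; please merge it on their behalf."
    return (
        f"@{reviewer_login} the author cannot merge this PR and neither can you; "
        "please find someone who has merge permission."
    )


if __name__ == "__main__":
    print(merge_on_behalf_comment(True, False, "read", "write", "reviewer"))
```

Keeping the decision separate from the API calls makes each of the four conditions easy to exercise without network access.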
|
|
|
|
Apparently, some compilers [correctly] warn that the variable that was
created prior to this change is unused.
This removes the variable.
|
|
Historically TableGen has used `A.swap(B)` to move containers without
the expense of copying them. Perhaps this predated rvalue references. In
any case `A = std::move(B)` seems like a more direct way to implement
this when only A is required after the operation.
|
|
This reapplies commit bdde5f9 by undoing the revert bc66e0c.
The previous reapplication 5c9f768 was reverted due to a crash
(reproducer in comments for 5c9f768) which was fixed in #81595.
As noted in the original commit, this commit may break downstream tests.
If this commit is breaking your downstream tests, please see comment 12 in
[0], which documents the kind of variation in tests we'd expect to see from
this change and what to do about it.
[0] https://discourse.llvm.org/t/rfc-instruction-api-changes-needed-to-eliminate-debug-intrinsics-from-ir/68939
|
|
(#81600)
It's not interesting for the majority of downstream users.
|
|
|
|
|
|
'serial', 'parallel', and 'kernel' constructs are all considered
'Compute' constructs. This patch creates the AST type, plus the required
infrastructure for such a type, plus some base types that will be useful
in the future for breaking this up.
The only difference between the three is the 'kind' (plus some minor
clause legalization rules, but those can be differentiated easily
enough), so rather than representing them as separate AST nodes, it
seems to make sense to make them the same.
Additionally, no clause AST functionality is being implemented yet, as
that fits better in a separate patch, and this is enough to get the
'naked' constructs implemented.
This is otherwise an 'NFC' patch, as it doesn't alter execution at all,
so there aren't any tests. I did this to break up the review workload
and to get feedback on the layout.
|
|
Just emit their satisfaction state, which is what the current
interpreter does as well.
|
|
In a few places we test whether sets (i.e. sorted ranges) intersect by
computing the set_intersection and then testing whether it is empty. For
this purpose it should be more efficient to use a std::vector instead of
a std::set to hold the result of the set_intersection, since insertion
is simpler.
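The idea of computing the intersection of two sorted ranges into a plain sequential container and then testing for emptiness can be sketched as follows. This is a Python analogue with hypothetical function names, not the C++ code the commit touches; the list stands in for the `std::vector` result, where each insertion is a simple append.

```python
def sorted_intersection(a: list, b: list) -> list:
    """Merge-walk two sorted sequences, appending common elements to a list."""
    out = []
    i = j = 0
    while i < len(a) and j < len(b):
        if a[i] < b[j]:
            i += 1
        elif b[j] < a[i]:
            j += 1
        else:
            out.append(a[i])  # appending at the end needs no reordering
            i += 1
            j += 1
    return out


def intersects(a: list, b: list) -> bool:
    """True when the two sorted ranges share at least one element."""
    return len(sorted_intersection(a, b)) > 0


if __name__ == "__main__":
    print(intersects([1, 3, 5], [2, 3, 4]))
    print(intersects([1, 3, 5], [2, 4, 6]))
```

Because the merge walk already produces results in sorted order, an ordered container adds bookkeeping on each insertion without providing any benefit for an emptiness test.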
|
|
Introduce `mcdc::DecisionParameters` and `mcdc::BranchParameters` and make
sure they are not initialized as zero.
FIXME: Could we make `CoverageMappingRegion` as a smart tagged union?
|
|
zOS doesn't support aligned allocation, so mark these testcases as
unsupported.
Continuation of https://reviews.llvm.org/D102798
|
|
Allocate storage and initialize it with the given APValue contents.
|