Age / Commit message / Author / Files / Lines
2024-08-09[𝘀𝗽𝗿] changes to main this commit is based onusers/smeenai/sprmain.hmaptool-implement-simple-string-deduplicationShoaib Meenai1-2/+5
Created using spr 1.3.4 [skip ci]
2024-08-09clang-tidy: readability-redundant-smartptr-get does not remove (#97964) ↵akshaykumars6143-0/+56
(#100177) added a check to remove '->' if exists added testcase and modified Release Notes
2024-08-08Fix test on WindowsVitaly Buka1-3/+3
2024-08-08Revert "[clang] Reland: Instantiate concepts with sugared template arguments ↵Matheus Izvekov18-74/+59
(#101782)" (#102551)
2024-08-08[mlir][LLVM] Improve lowering of `llvm.byval` function arguments (#100028)Diego Caballero5-12/+184
When a function argument is annotated with the `llvm.byval` attribute, [LLVM expects](https://llvm.org/docs/LangRef.html#parameter-attributes) the function argument type to be an `llvm.ptr`. For example:

```
func.func (%args0 : llvm.ptr {llvm.byval = !llvm.struct<(i32)>} { ... }
```

Unfortunately, this makes the type conversion context-dependent, which is something that the type conversion infrastructure (i.e., `LLVMTypeConverter` in this particular case) doesn't support. For example, we may want to convert `MyType` to `llvm.struct<(i32)>` in general, but to an `llvm.ptr` type only when it's a function argument passed by value.

To fix this problem, this PR changes the FuncToLLVM conversion logic to generate an `llvm.ptr` when the function argument has a `llvm.byval` attribute. An `llvm.load` is inserted into the function to retrieve the value expected by the argument users.
2024-08-08[BOLT][docs] Fix typo (#98640)Peter Jung1-1/+1
Typo: `chwon` --> `chown`
Signed-off-by: Peter Jung <admin@ptr1337.dev>
2024-08-08[libc][math][c23] Fix setpayloadsig smoke test on RV32 (#102538)Job Henandez Lara1-2/+13
2024-08-08[RISCV] Add some Zfinx instructions to hasAllNBitUsers.Craig Topper8-40/+36
2024-08-08[mlir][vector] Handle corner cases in DropUnitDimsFromTransposeOp. (#102518)Han-Chung Wang2-0/+19
https://github.com/llvm/llvm-project/commit/da8778e499d8049ac68c2e152941a38ff2bc9fb2 breaks the lowering of vector.transpose ops in which all the dimensions are unit dimensions. This revision fixes the issue and adds a test.
Signed-off-by: hanhanW <hanhan0912@gmail.com>
2024-08-08[flang][cuda] Make CUFRegisterAllocator callable from C/Fortran (#102543)Valentin Clement (バレンタイン クレメン)3-5/+11
2024-08-08Reapply "[ctx_prof] Fix the pre-thinlink "use" case (#102511)"Mircea Trofin6-24/+37
This reverts commit 967185eeb85abb77bd6b6cdd2b026d5c54b7d4f3. The problem was link dependencies; this reapplication moves `UseCtxProfile` to `Analysis`.
2024-08-08[msan] Support most Arm NEON vector shift instructions (#102507)Thurston Dang2-636/+801
This adds support for the Arm NEON vector shift instructions that follow the same pattern as x86 (handleVectorShiftIntrinsic). VSLI is not supported because it does not follow the 2-argument pattern expected by handleVectorShiftIntrinsic. This patch also updates the arm64-vshift.ll MSan test that was introduced in https://github.com/llvm/llvm-project/commit/5d0a12d3e9b1606c36430cf908da20d19d101e04
2024-08-08[rtsan] Fix warnings after #101232Fangrui Song2-6/+2
2024-08-08[BOLT][DWARF] Add parallelization for processing of DWO debug information ↵Sayhaan Siddiqui47-62/+88
(#100282) Enables parallelization for the processing of DWO CUs.
2024-08-08[gn build] Port 8acf8852e9d4LLVM GN Syncbot1-0/+1
2024-08-08[LLVM][rtsan] Add RealtimeSanitizer transform pass (#101232)Chris Apple7-0/+170
Split from #100596. Introduce the RealtimeSanitizer transform, which inserts the rtsan_enter/exit functions at the appropriate places in an instrumented function.
2024-08-08Revert "[LLDB] Impove ObjectFileELF's .dynamic parsing and usage. (#101237)"Leonard Chan7-493/+126
This reverts commit 28ba8a56b6fb9ec61897fa84369f46e43be94c03. Reverting since this broke the buildbot at https://green.lab.llvm.org/job/llvm.org/view/LLDB/job/as-lldb-cmake/9352/.
2024-08-08Revert "Fix prctl to handle PR_GET_PDEATHSIG. (#101749)"Kirill Stoimenov2-13/+2
Broke the build. This reverts commit d46c26b8102dee763d72bf98341bc95b21767196.
2024-08-08[RISCV] Remove unused function argument in RISCVOptWInstrs. NFCCraig Topper1-3/+2
2024-08-08[libc] Make str_to_float independent of fenv (#102369)Michael Jones3-10/+5
The str_to_float conversion code doesn't need the features provided by fenv and the dependency is creating a blocker for hand-in-hand. This patch uses a workaround to remove this dependency.
2024-08-08[libc][math] Add scalbln{,f,l,f128} math functions (#102219)aaryanshukla27-1/+434
Co-authored-by: OverMighty <its.overmighty@gmail.com>
2024-08-08[libc][gpu] Add Sinf Benchmarks (#102532)jameshu158692-20/+41
This PR adds benchmarking for `sinf()` using the same setup as `sin()` but with a smaller range for floats.
2024-08-08Fix prctl to handle PR_GET_PDEATHSIG. (#101749)Kirill Stoimenov2-2/+13
2024-08-08Revert "[ctx_prof] Fix the pre-thinlink "use" case (#102511)"Aiden Grossman5-30/+19
This reverts commit 1a6d60e0162b3ef767c87c95512dd453bf4f4746. Broke some buildbots.
2024-08-08[llvm][ELF] Add statistics on various section sizes (#102363)Arthur Eubanks2-2/+67
Useful with other infrastructure that consumes LLVM statistics to get an idea of the distribution of section sizes. The breakdown of various section types is subject to change; this is just an initial go at gathering some sort of stats.

Example stats compiling X86ISelLowering.cpp (-g1):

```
"elf-object-writer.AllocROBytes": 308268,
"elf-object-writer.AllocRWBytes": 6240,
"elf-object-writer.AllocTextBytes": 1659203,
"elf-object-writer.DebugBytes": 3180386,
"elf-object-writer.OtherBytes": 5862,
"elf-object-writer.RelocationBytes": 2623440,
"elf-object-writer.StrtabBytes": 228599,
"elf-object-writer.SymtabBytes": 120336,
"elf-object-writer.UnwindBytes": 85216,
```
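To get a feel for the distribution these statistics describe, the example numbers can be turned into percentage shares. This is only a sketch: the dictionary is copied from the example output in the commit message, and `section_shares` is a hypothetical helper name, not something LLVM provides.

```python
# Values copied from the example stats in the commit message
# (X86ISelLowering.cpp compiled with -g1).
stats = {
    "elf-object-writer.AllocROBytes": 308268,
    "elf-object-writer.AllocRWBytes": 6240,
    "elf-object-writer.AllocTextBytes": 1659203,
    "elf-object-writer.DebugBytes": 3180386,
    "elf-object-writer.OtherBytes": 5862,
    "elf-object-writer.RelocationBytes": 2623440,
    "elf-object-writer.StrtabBytes": 228599,
    "elf-object-writer.SymtabBytes": 120336,
    "elf-object-writer.UnwindBytes": 85216,
}


def section_shares(stats):
    """Hypothetical helper: map each category to its percent of total bytes,
    ordered from largest to smallest."""
    total = sum(stats.values())
    # Drop the "elf-object-writer." prefix to keep the category name only.
    shares = {key.split(".", 1)[1]: 100.0 * value / total
              for key, value in stats.items()}
    return dict(sorted(shares.items(), key=lambda kv: kv[1], reverse=True))


for name, percent in section_shares(stats).items():
    print(f"{name:16s} {percent:5.1f}%")
```

Sorting by share makes it easy to see which category dominates an object file when comparing builds.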
2024-08-08[AMDGPU][True16][CodeGen] support v_mov_b16 and v_swap_b16 in true16 format ↵Brox Chen9-73/+192
(#102198) Support v_swap_b16 in true16 format. Update the TableGen pattern and folding for v_mov_b16.
Co-authored-by: guochen2 <guochen2@amd.com>
2024-08-08[ctx_prof] Fix the pre-thinlink "use" case (#102511)Mircea Trofin5-19/+30
Didn't notice in #101338 that the instrumentation in `llvm/test/Transforms/PGOProfile/ctx-prof-use-prelink.ll` was actually incorrect.
2024-08-08[scudo] Added test fixture for cache tests. (#102230)Joshua Baehring2-5/+108
The test fixture simplifies some of the logic for allocations, and mmap-based allocations are separated from the cache to allow for more direct cache tests. Additionally, a couple of end-to-end tests for the cache and the LRU algorithm are added.
2024-08-09AMDGPU: Preserve atomicrmw name when specializing address space (#102470)Matt Arsenault4-9/+20
2024-08-09AMDGPU: Avoid creating unnecessary block split in atomic expansion (#102440)Matt Arsenault6-54/+34
This was creating a new block to insert the is.shared check, but we can just do that in the original block.
2024-08-08[libc] [gpu] Fix Minor Benchmark UI Issues (#102529)jameshu158692-9/+11
Previously, `AmdgpuSinTwoPow_128` and others were too large for their table cells. This PR shortens the name to `AmdSin...`. There were also some `-` characters missing in the separator. This PR instead creates the separator string using the length of the headers.
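The separator fix can be illustrated with a small sketch. The actual benchmark code is not shown in this log, so the Python below only demonstrates the idea of deriving the separator from the rendered header line instead of hard-coding a run of dashes; `format_table` and the column names are hypothetical.

```python
def format_table(headers, rows):
    """Hypothetical helper: render a table whose separator always matches
    the width of the header line."""
    # Each column is as wide as its longest cell, header included.
    widths = [
        max([len(str(header))] + [len(str(row[i])) for row in rows])
        for i, header in enumerate(headers)
    ]
    header_line = " | ".join(str(h).ljust(w) for h, w in zip(headers, widths))
    # Derive the separator from the header line so the two cannot drift apart.
    separator = "-" * len(header_line)
    body = [
        " | ".join(str(cell).ljust(w) for cell, w in zip(row, widths))
        for row in rows
    ]
    return [header_line, separator] + body


for line in format_table(["Benchmark", "Cycles"], [["AmdSin", 12], ["NvSin", 9]]):
    print(line)
```

Because the separator length is computed, renaming a benchmark or adding a column never leaves dashes missing.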
2024-08-08Revert "[NVPTX] support switch statement with brx.idx" (#102530)Artem Belevich6-169/+8
Reverts llvm/llvm-project#102400. Causes LLVM to crash on some tests.
2024-08-08[libc][newhdrgen] add_function by alphabetical order (#102527)aaryanshukla1-1/+9
- add_function now adds the function in alphabetical order
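A minimal sketch of alphabetical insertion, assuming functions can be ordered by name alone. The `HeaderFile` class here is a hypothetical stand-in and does not reproduce the real newhdrgen classes; only the `add_function` name comes from the commit message.

```python
import bisect


class HeaderFile:
    """Hypothetical stand-in for a header description that keeps its
    function names sorted."""

    def __init__(self):
        self.functions = []

    def add_function(self, name):
        # insort places the name at its sorted position, so the list stays
        # alphabetical regardless of the order in which functions are added.
        bisect.insort(self.functions, name)


header = HeaderFile()
for name in ["strlen", "memcpy", "strcat"]:
    header.add_function(name)
print(header.functions)
```

Keeping the list sorted on insertion avoids a separate sorting pass and keeps generated output stable across runs.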
2024-08-08[NVPTX] support switch statement with brx.idx (#102400)Alex MacLean6-8/+169
Add custom lowering for `BR_JT` DAG nodes to the `brx.idx` PTX instruction ([PTX ISA 9.7.13.4. Control Flow Instructions: brx.idx](https://docs.nvidia.com/cuda/parallel-thread-execution/#control-flow-instructions-brx-idx)). Depending on the heuristics in DAG selection, `switch` statements may now be lowered using `brx.idx`.
2024-08-08[flang][cuda] Do not lower device variables in main program as globals (#102512)Valentin Clement (バレンタイン クレメン)3-4/+20
Flang treats arrays in the main program that are larger than 32 bytes as having the SAVE attribute and lowers them as globals. In CUDA Fortran, device variables are not allowed to have the SAVE attribute and should be allocated dynamically in the main program scope. This patch updates lowering so that CUDA Fortran device variables are not treated as having the SAVE attribute.
2024-08-08[libc] [gpu] Add Generic, NvSin, and OcmlSinf64 Throughput Benchmark (#101917)jameshu158696-80/+128
This PR implements https://github.com/lntue/llvm-project/commit/2a158426d4b90ffaa3eaecc9bc10e5aed11f1bcf to provide better throughput benchmarking for libc `sin()` and `__nv_sin()`. These changes have not been tested on AMDGPU yet, only compiled.
2024-08-08[lldb/Interpreter] Fix ambiguous partial command resolution (#101934)Med Ismail Bennani6-20/+116
This patch is a follow-up to #97263 that fixes ambiguous abbreviated command resolution. When multiple commands are resolved, instead of failing to pick a command to run, this patch changes the resolution logic to check whether there is a single alias match and, if so, runs the alias instead of the other matches. As a side effect, we don't need to make aliases for every substring of aliases to support abbreviated alias resolution.
Signed-off-by: Med Ismail Bennani <ismail@bennani.ma>
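The resolution rule described above can be sketched roughly as follows. This is not LLDB's implementation: the `resolve` function, the exact-match shortcut, and the command and alias names are assumptions made for illustration.

```python
def resolve(prefix, commands, aliases):
    """Hypothetical resolver: return the name to run for an abbreviated
    command, or None if the abbreviation stays ambiguous."""
    names = list(commands) + list(aliases)
    # Assumption: an exact match always wins.
    if prefix in names:
        return prefix
    matches = [name for name in names if name.startswith(prefix)]
    if len(matches) == 1:
        return matches[0]
    # Several candidates: prefer the alias if exactly one alias matched.
    alias_matches = [name for name in matches if name in aliases]
    if len(alias_matches) == 1:
        return alias_matches[0]
    return None


commands = ["process", "platform", "plugin"]  # illustrative names only
aliases = ["print"]

print(resolve("p", commands, aliases))    # one alias among several matches
print(resolve("pl", commands, aliases))   # two commands, no alias
print(resolve("pro", commands, aliases))  # unique match
```

With this rule a single alias covers all of its own abbreviations, which is why separate aliases for every substring become unnecessary.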
2024-08-08TTI: Check legalization cost of mulfix ISD nodes (#100520)Matt Arsenault1-24/+29
2024-08-08TTI: Check legalization cost of mul overflow ISD nodes (#100519)Matt Arsenault1-31/+36
2024-08-08[Clang][AST][NFC] Store template parameter position for TemplateTypeParmType ↵Krystian Stasiowski2-41/+37
in TypeBit (#102481) `TemplateTypeParmType` currently stores the depth, index, and whether a template type parameter is a pack in a union of `CanonicalTTPTInfo` and `TemplateTypeParmDecl*`, and only the canonical type stores the position information. These bits can be stored for all `TemplateTypeParmTypes` in `TypeBits` to avoid unnecessary indirection when accessing the position information.
2024-08-08TTI: Check legalization cost of add/sub overflow ISD nodes (#100518)Matt Arsenault5-344/+358
2024-08-08[AMDGPU] Clear load addresses between functions (#102515)Alexis Engelke2-0/+42
SLoadAddresses previously held data across different functions and used these for dominance queries of blocks in different functions. This is not intended; clear the state at the end of the pass.
2024-08-08AMDGPU: Support VALU add instructions in localstackalloc (#101692)Matt Arsenault6-4/+1642
Pre-enable this optimization before allowing folds of frame indexes into add instructions. Disables this fold when using scratch instructions for now. I see some code size improvements with it, but the optimization needs to be smarter about the uses depending on the register classes.
2024-08-08[ELF] scanRelocations: support .crel.eh_frameFangrui Song2-4/+14
Follow-up to #98115. For EhInputSection, RelocationScanner::scan calls sortRels, which doesn't support the CREL iterator. We should set supportsCrel to false to ensure that the initial_location fields in .eh_frame FDEs are relocated.
2024-08-08AMDGPU: Directly handle all atomicrmw cases in SIISelLowering (#102439)Matt Arsenault4-37/+85
2024-08-08[flang] Improve error message output (#102324)Peter Klausler2-2/+2
When a local character variable with non-constant length has an initializer, it's an error in a couple of ways (SAVE variable with unknown size, static initializer that isn't constant due to conversion to an unknown length). The error that f18 reports is the latter, but the message contains a formatted representation of the initialization expression that exposes a non-Fortran %SET_LENGTH() operation. Print the original expression in the message instead.
2024-08-08[flang] Warn on useless IOMSG= (#102250)Peter Klausler6-2/+28
An I/O statement with IOMSG= but neither ERR= nor IOSTAT= deserves a warning to the effect that it's not useful.
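The condition for the warning can be written as a small predicate. This is a sketch of the rule as stated in the commit message only; the `useless_iomsg` function and its input format (a list of control-specifier keywords) are hypothetical and unrelated to how flang represents I/O statements.

```python
def useless_iomsg(specifiers):
    """Hypothetical check: True when IOMSG= appears without ERR= or IOSTAT=,
    which is the case the commit message says deserves a warning."""
    names = {spec.upper() for spec in specifiers}
    return "IOMSG" in names and not ({"ERR", "IOSTAT"} & names)


print(useless_iomsg(["UNIT", "IOMSG"]))
print(useless_iomsg(["UNIT", "IOMSG", "IOSTAT"]))
```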
2024-08-08[flang] Catch structure constructor in its own type definition (#102241)Peter Klausler3-8/+5
The check for a structure constructor to a forward-referenced derived type wasn't tripping for constructors in the type definition itself. Set the forward reference flag unconditionally at the beginning of name resolution for the type definition.
2024-08-08[flang] Fix searches for polymorphic components (#102212)Peter Klausler6-11/+42
FindPolymorphicAllocatableUltimateComponent needs to be FindPolymorphicAllocatablePotentialComponent. The current search is missing cases where a derived type has an allocatable component whose type has a polymorphic allocatable component.
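The difference between the two searches can be sketched with a toy model. This assumes the usual Fortran definitions as I understand them: an ultimate-component search stops at allocatable or pointer components, while a potential-component search keeps descending through non-pointer components, including allocatable ones. The `Component` class and both functions are hypothetical and do not mirror flang's data structures.

```python
from dataclasses import dataclass


@dataclass
class Component:
    name: str
    type_name: str            # derived type name, or an intrinsic such as "integer"
    allocatable: bool = False
    pointer: bool = False
    polymorphic: bool = False


def find_ultimate(types, type_name):
    """Ultimate-style search: does not look inside allocatable or pointer components."""
    for comp in types.get(type_name, []):
        if comp.allocatable or comp.pointer:
            if comp.allocatable and comp.polymorphic:
                return comp
        else:
            found = find_ultimate(types, comp.type_name)
            if found is not None:
                return found
    return None


def find_potential(types, type_name, seen=None):
    """Potential-style search: also descends into allocatable components."""
    seen = set() if seen is None else seen
    if type_name in seen:     # guard against recursive allocatable types
        return None
    seen.add(type_name)
    for comp in types.get(type_name, []):
        if comp.allocatable and comp.polymorphic:
            return comp
        if not comp.pointer:
            found = find_potential(types, comp.type_name, seen)
            if found is not None:
                return found
    return None


# "outer" has an allocatable component of type "inner", and "inner" has a
# polymorphic allocatable component: the case the commit message describes.
types = {
    "inner": [Component("p", "base", allocatable=True, polymorphic=True)],
    "outer": [Component("a", "inner", allocatable=True)],
}
print(find_ultimate(types, "outer"))
print(find_potential(types, "outer"))
```

In this model the ultimate-style search stops at `a` and never reaches `p`, whereas the potential-style search finds it.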
2024-08-08[flang] Disallow references to some IEEE procedures in DO CONCURRENT (#102082)Peter Klausler3-29/+37
There's a numbered constraint that prohibits calls to some IEEE arithmetic and exception procedures within the body of a DO CONCURRENT construct. Clean up the implementation to catch missing cases.