Age | Commit message (Collapse) | Author | Files | Lines |
|
Some `LangOptions` duplicate their `CodeGenOptions` counterparts. My
understanding is that this was done solely because some infrastructure
(like preprocessor initialization, serialization, module compatibility
checks, etc.) were only possible/convenient for `LangOptions`. This PR
implements the missing support for `CodeGenOptions`, which makes it
possible to remove some duplicate `LangOptions` fields and simplify the
logic. Motivated by https://github.com/llvm/llvm-project/pull/146342.
|
|
This PR resubmits the changes from #136098, which was previously
reverted due to a build failure during the linking stage:
```
undefined reference to `llvm::DebugInfoCorrelate'
undefined reference to `llvm::ProfileCorrelate'
```
The root cause was that `llvm/lib/Frontend/Driver/CodeGenOptions.cpp`
references symbols from the `Instrumentation` component, but the
`LINK_COMPONENTS` in the `llvm/lib/Frontend/CMakeLists.txt` for
`LLVMFrontendDriver` did not include it. As a result, linking failed in
configurations where these components were not transitively linked.
### Fix:
This updated patch explicitly adds `Instrumentation` to
`LINK_COMPONENTS` in the relevant `llvm/lib/Frontend/CMakeLists.txt`
file to ensure the required symbols are properly resolved.
---------
Co-authored-by: ict-ql <168183727+ict-ql@users.noreply.github.com>
Co-authored-by: Chyaka <52224511+liliumshade@users.noreply.github.com>
Co-authored-by: Tarun Prabhu <tarunprabhu@gmail.com>
|
|
compiler" (#142159)
Reverts llvm/llvm-project#136098
|
|
(#136098)
This patch implements IR-based Profile-Guided Optimization support in
Flang through the following flags:
- `-fprofile-generate` for instrumentation-based profile generation
- `-fprofile-use=<dir>/file` for profile-guided optimization
Resolves #74216 (implements IR PGO support phase)
**Key changes:**
- Frontend flag handling aligned with Clang/GCC semantics
- Instrumentation hooks into LLVM PGO infrastructure
- LIT tests verifying:
- Instrumentation metadata generation
- Profile loading from specified path
- Branch weight attribution (IR checks)
**Tests:**
- Added gcc-flag-compatibility.f90 test module verifying:
- Flag parsing boundary conditions
- IR-level profile annotation consistency
- Profile input path normalization rules
- SPEC2006 benchmark results will be shared in comments
For details on LLVM's PGO framework, refer to [Clang PGO
Documentation](https://clang.llvm.org/docs/UsersManual.html#profile-guided-optimization).
This implementation was developed by [XSCC Compiler
Team](https://github.com/orgs/OpenXiangShan/teams/xscc).
---------
Co-authored-by: ict-ql <168183727+ict-ql@users.noreply.github.com>
Co-authored-by: Tom Eccles <t@freedommail.info>
|
|
We get a lot of issues that basically boil down to "I passed malformed
LLVM IR to clang and it crashed". Clang does not perform IR verification
by default in (non-assertion-enabled) release builds, and that's
sensible for IR that Clang itself produces, which is expected to always
be valid. However, if people pass in their own handwritten IR, we should
report if it is malformed, instead of crashing. We should also report it
in a way that does not produce a crash trace and ask for a bug report,
as currently happens in assertions-enabled builds. This aligns the
behavior with how opt/llc work.
|
|
The module currently stores the target triple as a string. This means
that any code that wants to actually use the triple first has to
instantiate a Triple, which is somewhat expensive. The change in #121652
caused a moderate compile-time regression due to this. While it would be
easy enough to work around, I think that architecturally, it makes more
sense to store the parsed Triple in the module, so that it can always be
directly queried.
For this change, I've opted not to add any magic conversions between
std::string and Triple for backwards-compatibilty purses, and instead
write out needed Triple()s or str()s explicitly. This is because I think
a decent number of them should be changed to work on Triple as well, to
avoid unnecessary conversions back and forth.
The only interesting part in this patch is that the default triple is
Triple("") instead of Triple() to preserve existing behavior. The former
defaults to using the ELF object format instead of unknown object
format. We should fix that as well.
|
|
With the changes in 48d0eb518, the CodeGenOptions used to emit .pcm
files with -fmodule-format=obj (-gmodules) were the ones from the
original invocation, rather than the ones specifically crafted for
outputting the pcm. This was causing the pcm to be written with only the
debug info and without the __clangast section in some cases (e.g. -O2).
This unforunately was not covered by existing tests, because compiling
and loading a module within a single compilation load the ast content
from the in-memory module cache rather than reading it from the pcm file
that was written. This broke bootstrapping a build of clang with modules
enabled on Darwin.
rdar://143418834
|
|
The code generation time is unclear in the -ftime-report output:
* The two clang timers "Code Generation Time" and "LLVM IR Generation
Time" are in the default group "Miscellaneous Ungrouped Timers".
* There is also a "Clang front-end time" group, which actually includes
code generation time.
```
===-------------------------------------------------------------------------===
Miscellaneous Ungrouped Timers
===-------------------------------------------------------------------------===
---User Time--- --System Time-- --User+System-- ---Wall Time--- --- Name ---
0.0611 ( 1.7%) 0.0099 ( 4.4%) 0.0710 ( 1.9%) 0.0713 ( 1.9%) LLVM IR Generation Time
3.5140 ( 98.3%) 0.2165 ( 95.6%) 3.7306 ( 98.1%) 3.7342 ( 98.1%) Code Generation Time
3.5751 (100.0%) 0.2265 (100.0%) 3.8016 (100.0%) 3.8055 (100.0%) Total
...
===-------------------------------------------------------------------------===
Clang front-end time report
===-------------------------------------------------------------------------===
Total Execution Time: 3.9108 seconds (3.9146 wall clock)
---User Time--- --System Time-- --User+System-- ---Wall Time--- --- Name ---
3.6802 (100.0%) 0.2306 (100.0%) 3.9108 (100.0%) 3.9146 (100.0%) Clang front-end timer
3.6802 (100.0%) 0.2306 (100.0%) 3.9108 (100.0%) 3.9146 (100.0%) Total
```
This patch
* renames "Clang front-end time report" (FrontendAction time) to "Clang
time report",
* renames "Clang front-end" to "Front end",
* moves "LLVM IR Generation" into the group,
* replaces "Code Generation time" with "Optimizer" (middle end) and
"Machine code generation" (back end).
```
% clang -c sqlite3.i -w -ftime-report -mllvm -sort-timers=0
...
===-------------------------------------------------------------------------===
Clang time report
===-------------------------------------------------------------------------===
Total Execution Time: 1.5922 seconds (1.5972 wall clock)
---User Time--- --System Time-- --User+System-- ---Wall Time--- --- Name ---
0.5107 ( 35.9%) 0.0105 ( 6.2%) 0.5211 ( 32.7%) 0.5222 ( 32.7%) Front end
0.2464 ( 17.3%) 0.0340 ( 20.0%) 0.2804 ( 17.6%) 0.2814 ( 17.6%) LLVM IR generation
0.6240 ( 43.9%) 0.1235 ( 72.7%) 0.7475 ( 47.0%) 0.7503 ( 47.0%) Machine code generation
0.0413 ( 2.9%) 0.0018 ( 1.0%) 0.0431 ( 2.7%) 0.0433 ( 2.7%) Optimizer
1.4224 (100.0%) 0.1698 (100.0%) 1.5922 (100.0%) 1.5972 (100.0%) Total
```
Pull Request: https://github.com/llvm/llvm-project/pull/122225
|
|
Prepare for -ftime-report change (#122225).
|
|
|
|
|
|
Identified with misc-include-cleaner.
|
|
Some `FileManager` APIs still return `{File,Directory}Entry` instead of
the preferred `{File,Directory}EntryRef`. These are documented to be
deprecated, but don't have the attribute that warns on their usage. This
PR marks them as such with `LLVM_DEPRECATED()` and replaces their usage
with the recommended counterparts. NFCI.
|
|
Try to avoid excess layer of indirection when possible.
p.s. Remove a call to raw_string_ostream::flush() which is a no-op.
|
|
This reverts commit 2eeeff842f993a694159183a2834b4d305549cad.
See the post commit discussion in
https://github.com/llvm/llvm-project/commit/2eeeff842f993a694159183a2834b4d305549cad
|
|
Close https://github.com/llvm/llvm-project/issues/72383
The implementation rationale is, I don't want to pass
`-fmodules-embed-all-files` all the time since we can't test it in lit
tests (we're using `clang_cc1`). So I tried to set it in FrontendActions
for modules.
|
|
When BPF object files are linked with bpftool, every symbol must be
accompanied by BTF info. Ensure that extern functions referenced by
global variable initializers are included in BTF.
The primary motivation is "static" initialization of PROG maps:
```c
extern int elsewhere(struct xdp_md *);
struct {
__uint(type, BPF_MAP_TYPE_PROG_ARRAY);
__uint(max_entries, 1);
__type(key, int);
__type(value, int);
__array(values, int (struct xdp_md *));
} prog_map SEC(".maps") = { .values = { elsewhere } };
```
BPF backend needs debug info to produce BTF. Debug info is not
normally generated for external variables and functions. Previously, it
was solved differently for variables (collecting variable declarations
in ExternalDeclarations vector) and functions (logic invoked during
codegen in CGExpr.cpp).
This patch generalises ExternalDefclarations to include both function
and variable declarations. This change ensures that function references
are not missed no matter the context. Previously external functions
referenced in constant expressions lacked debug info.
|
|
We have reworked the bitcode linking option to no longer link twice if
post-optimization linking is requested. As such, we no longer need to
conditionally link bitcodes supplied via -mlink-bitcode-file, as there
is no danger of linking them twice
|
|
createWithDefaultAttr().
Functions created with createWithDefaultAttr() need to have the
correct target-{cpu,features} attributes to avoid miscompilations
such as using the wrong relocation type to access globals (missing
tagged-globals feature), clobbering registers specified via -ffixed-*
(missing reserve-* feature), and so on.
There's already a number of attributes copied from the module flags
onto functions created by createWithDefaultAttr(). I don't think
module flags are the right choice for the target attributes because
we don't need the conflict resolution logic between modules with
different target attributes, nor does it seem sensible to add it:
there's no unambiguously "correct" set of target attributes when
merging two modules with different attributes, and nor should there
be; it's perfectly valid for two modules to be compiled with different
target attributes, that's the whole reason why they are per-function.
This also implies that it's unnecessary to serialize the attributes in
bitcode, which implies that they shouldn't be stored on the module. We
can also observe that for the most part, createWithDefaultAttr()
is called from compiler passes such as sanitizers, coverage and
profiling passes that are part of the compile time pipeline, not
the LTO pipeline. This hints at a solution: we need to store the
attributes in a non-serialized location associated with the ambient
compilation context. Therefore in this patch I elected to store the
attributes on the LLVMContext.
There are calls to createWithDefaultAttr() in the NVPTX and AMDGPU
backends, and those calls would happen at LTO time. For those callers,
the bug still potentially exists and it would be necessary to refactor
them to create the functions at compile time if this issue is relevant
on those platforms.
Fixes #93633.
Reviewers: fmayer, MaskRay, eugenis
Reviewed By: MaskRay
Pull Request: https://github.com/llvm/llvm-project/pull/96721
|
|
After 11a6799740f8 "[clang][CodeGen] Omit pre-opt link when post-opt is
link requested (#85672)" I'm seeing a new warning:
> BackendConsumer.h:37:22: error: private field 'FileMgr' is not used
[-Werror,-Wunused-private-field]
Remove the field since it's no longer used.
|
|
Currently, when the -relink-builtin-bitcodes-postop option is used we
link builtin bitcodes twice: once before optimization, and again after
optimization.
With this change, we omit the pre-opt linking when the option is set,
and we rename the option to the following:
-Xclang -mlink-builtin-bitcodes-postopt
(-Xclang -mno-link-builtin-bitcodes-postopt)
The goal of this change is to reduce compile time. We do lose the
theoretical benefits of pre-opt linking, but in practice these are small
than the overhead of linking twice. However we may be able to address
this in a future patch by adjusting the position of the builtin-bitcode
linking pass.
Compilations not setting the option are unaffected
|
|
This is the driver part of
https://github.com/llvm/llvm-project/pull/75894.
This patch introduces '-fexperimental-modules-reduced-bmi' to enable
generating the reduced BMI.
This patch did:
- When `-fexperimental-modules-reduced-bmi` is specified but
`--precompile` is not specified for a module unit, we'll skip the
precompile phase to avoid unnecessary two-phase compilation phases. Then
if `-c` is specified, we will generate the reduced BMI in CodeGenAction
as a by-product.
- When `-fexperimental-modules-reduced-bmi` is specified and
`--precompile` is specified, we will generate the reduced BMI in
GenerateModuleInterfaceAction as a by-product.
- When `-fexperimental-modules-reduced-bmi` is specified for a
non-module unit. We don't do anything nor try to give a warn. This is
more user friendly so that the end users can try to test and experiment
with the feature without asking help from the build systems.
The core design idea is that users should be able to enable this easily
with the existing cmake mechanisms.
The future plan for the flag is:
- Add this to clang19 and make it opt-in for 1~2 releases. It depends on
the testing feedback to decide how long we like to make it opt-in.
- Then we can announce the existing BMI generating may be deprecated and
suggesting people (end users or build systems) to enable this for 1~2
releases.
- Finally we will enable this by default. When that time comes, the term
`BMI` will refer to the reduced BMI today and the existing BMI will only
be meaningful to build systems which loves to support two phase
compilations.
I'll send release notes and document in seperate commits after this get
landed.
|
|
Add missing error check. This resolves "error: variable 'Err' set but
not used" warnings
|
|
(#81693)
We recently implemented a new option allowing relinking of bitcode
modules via the "-mllvm -relink-builtin-bitcode-postop"
option.
This implementation relied on llvm::CloneModule() in order to pass
copies to modules and preserve the original modules for later relinking.
However, cloning modules has been found to be prohibitively expensive,
significantly increasing compilation time for large bitcode libraries.
In this patch, we shift the relink option implementation to instead link
the original modules initially, and reload modules from the file system
if relinking is requested. This approach results in significantly
reduced overhead.
We accomplish this by creating a new ReloadModules() routine that can be
called from a BackendConsumer class, to mimic the behavior of
ASTConsumer's loadLinkModules(), but without access to the
CompilerInstance.
Because loading the bitcodes from the filesystem requires access to the
FileManager class, we also forward a reference to the CompilerInstance
class to the BackendConsumer. This mirrors what is already done for
several CompilerInstance members, such as TargetOptions and
CodeGenOptions.
Finally, we needed to add a const specifier to the
FileManager::getBufferForFile() routine to allow it to be called using
the const reference returned from CompilerInstance::getFileManager()
|
|
|
|
In LLVMContext::diagnose, set `HasErrors` for `DS_Error` so that all
derived `DiagnosticHandler` have correct `HasErrors` information.
An alternative is to set `HasErrors` in
`DiagnosticHandler::handleDiagnostics`, but all derived
`handleDiagnostics` would have to call the base function.
|
|
(#75726)
Non-LTO compiles set the buffer name to "<inline asm>"
(`AsmPrinter::addInlineAsmDiagBuffer`) and pass diagnostics to
`ClangDiagnosticHandler` (through the `MCContext` handler in
`MachineModuleInfoWrapperPass::doInitialization`) to ensure that
the exit code is 1 in the presence of errors. In contrast, LTO compiles
spuriously succeed even if error messages are printed.
```
% cat a.c
void _start() {}
asm("unknown instruction");
% clang -c a.c
<inline asm>:1:1: error: invalid instruction mnemonic 'unknown'
1 | unknown instruction
| ^
1 error generated.
% clang -c -flto a.c; echo $? # -flto=thin is the same
error: invalid instruction mnemonic 'unknown'
unknown instruction
^~~~~~~
error: invalid instruction mnemonic 'unknown'
unknown instruction
^~~~~~~
0
```
`CollectAsmSymbols` parses inline assembly and is transitively called by
both `ModuleSummaryIndexAnalysis::run` and `WriteBitcodeToFile`, leading
to duplicate diagnostics.
This patch updates `CollectAsmSymbols` to be similar to non-LTO
compiles.
```
% clang -c -flto=thin a.c; echo $?
<inline asm>:1:1: error: invalid instruction mnemonic 'unknown'
1 | unknown instruction
| ^
1 errors generated.
1
```
The `HasErrors` check does not prevent duplicate warnings but assembler
warnings are very uncommon.
|
|
This patch replaces uses of StringRef::{starts,ends}with with
StringRef::{starts,ends}_with for consistency with
std::{string,string_view}::{starts,ends}_with in C++20.
I'm planning to deprecate and eventually remove
StringRef::{starts,ends}with.
|
|
Now that we have a commandline option dictating a second link step,
(-relink-builtin-bitcode-postop), we can condition the module creation
when linking in bitcode modules. This aims to improve performance by
avoiding unnecessary linking
|
|
In this patch, we create a new ModulePass that mimics the LinkInModules
API from CodeGenAction.cpp, and a new command line option to enable the
pass. As part of the implementation, we needed to refactor the
BackendConsumer class definition into a new separate header (instead of
embedded in CodeGenAction.cpp). With this new pass, we can now re-link
bitcodes supplied via the -mlink-built-in bitcodes as part of the
RunOptimizationPipeline.
With the re-linking pass, we now handle cases where new device library
functions are introduced as part of the optimization pipeline.
Previously, these newly introduced functions (for example a fused sincos
call) would result in a linking error due to a missing function
definition. This new pass can be initiated via:
-mllvm -relink-builtin-bitcode-postop
Also note we intentionally exclude bitcodes supplied via the
-mlink-bitcode-file option from the second linking step
|
|
When using -fsplit-lto-unit (explicitly specified or due to using
-fsanitize=cfi/-fwhole-program-vtables), the emitted LLVM IR contains a module
flag metadata `"EnableSplitLTOUnit"`. If a module contains both type metadata
and `"EnableSplitLTOUnit"`, `ThinLTOBitcodeWriter.cpp` will write two modules
into the bitcode file. Compiling the bitcode (not ThinLTO backend compilation)
will lead to an error due to `parseIR` requiring a single module.
```
% clang -flto=thin a.cc -c -o a.bc
% clang -c a.bc
% clang -fsplit-lto-unit -flto=thin a.cc -c -o a.bc
% clang -c a.bc
error: Expected a single module
1 error generated.
```
There are multiple ways to have just one module in a bitcode file
output: `-Xclang -fno-lto-unit`, not using features like `-fsanitize=cfi`,
using `-fsanitize=cfi` with `-fno-split-lto-unit`. I think whether a
bitcode input file contains 2 modules (internal implementation strategy)
should not be a criterion to require an additional driver option when
the user seek for a non-LTO compile action.
Let's place the extra module (if present) into CodeGenOptions::LinkBitcodeFiles
(originally for -cc1 -mlink-bitcode-file). Linker::linkModules will link the two
modules together. This patch makes the following commands work:
```
clang -S -emit-llvm a.bc
clang -S a.bc
clang -c a.bc
```
Reviewed By: ormris
Differential Revision: https://reviews.llvm.org/D154923
|
|
Clang provides the `-mlink-bitcode-file` and `-mlink-builtin-bitcode`
options to insert LLVM-IR into the current TU. These are usefuly
primarily for including LLVM-IR files that require special handling to
be correct and cannot be linked normally, such as GPU vendor libraries
like `libdevice.10.bc`. Currently these options can only be used if the
source input goes through the AST consumer path. This patch makes the
changes necessary to also support this when the input is LLVM-IR. This
will allow the following operation:
```
clang in.bc -Xclang -mlink-builtin-bitcode -Xclang libdevice.10.bc
```
Reviewed By: yaxunl
Differential Revision: https://reviews.llvm.org/D152391
|
|
Migration of clang tests to opaque pointers is finished, so remove
the -no-opaque-pointers flag.
Differential Revision: https://reviews.llvm.org/D152447
|
|
const std::string&
As suggested by @erichkeane in
https://reviews.llvm.org/D141451#inline-1429549
There's potential for a lot more cleanups around these APIs. This is
just a start.
Callers need to be more careful about sub-expressions producing strings
that don't outlast the expression using `llvm::demangle`. Add a
release note.
Differential Revision: https://reviews.llvm.org/D149104
|
|
CUDA support can be enabled in clang-repl with --cuda flag.
Device code linking is not yet supported. inline must be used with all
__device__ functions.
Differential Revision: https://reviews.llvm.org/D146389
|
|
This reverts commit 80e7eed6a610ab3c7289e6f9b7ec006bc7d7ae31.
|
|
CUDA support can be enabled in clang-repl with --cuda flag.
Device code linking is not yet supported. inline must be used with all
__device__ functions.
Differential Revision: https://reviews.llvm.org/D146389
|
|
const std::string&"
This reverts commit c117c2c8ba4afd45a006043ec6dd858652b2ffcc.
itaniumDemangle calls std::strlen with the results of
std::string_view::data() which may not be NUL-terminated. This causes
lld/test/wasm/why-extract.s to fail when "expensive checks" are enabled
via -DLLVM_ENABLE_EXPENSIVE_CHECKS=ON. See D149675 for further
discussion. Back this out until the individual demanglers are converted
to use std::string_view.
|
|
std::string&
As suggested by @erichkeane in
https://reviews.llvm.org/D141451#inline-1429549
There's potential for a lot more cleanups around these APIs. This is
just a start.
Callers need to be more careful about sub-expressions producing strings
that don't outlast the expression using ``llvm::demangle``. Add a
release note.
Reviewed By: MaskRay, #lld-macho
Differential Revision: https://reviews.llvm.org/D149104
|
|
This is stricter than the default "ieee", and should probably be the
default. This patch leaves the default alone. I can change this in a
future patch.
There are non-reversible transforms I would like to perform which are
legal under IEEE denormal handling, but illegal with flushing zero
behavior. Namely, conversions between llvm.is.fpclass and fcmp with
zeroes.
Under "ieee" handling, it is legal to translate between
llvm.is.fpclass(x, fcZero) and fcmp x, 0.
Under "preserve-sign" handling, it is legal to translate between
llvm.is.fpclass(x, fcSubnormal|fcZero) and fcmp x, 0.
I would like to compile and distribute some math library functions in
a mode where it's callable from code with and without denormals
enabled, which requires not changing the compares with denormals or
zeroes.
If an IEEE function transforms an llvm.is.fpclass call into an fcmp 0,
it is no longer possible to call the function from code with denormals
enabled, or write an optimization to move the function into a denormal
flushing mode. For the original function, if x was a denormal, the
class would evaluate to false. If the function compiled with denormal
handling was converted to or called from a preserve-sign function, the
fcmp now evaluates to true.
This could also be of use for strictfp handling, where code may be
changing the denormal mode.
Alternative name could be "unknown".
Replaces the old AMDGPU custom inlining logic with more conservative
logic which tries to permit inlining for callees with dynamic handling
and avoids inlining other mismatched modes.
|
|
Reported by Coverity:
Big parameter passed by value
Copying large values is inefficient, consider passing by reference; Low, medium, and high size thresholds for detection can be adjusted.
1. Inside "SemaConcept.cpp" file, in subsumes<clang::Sema::MaybeEmitAmbiguousAtomicConstraintsDiagnostic(clang::NamedDecl *, llvm::ArrayRef<clang::Expr const *>, clang::NamedDecl *, llvm::ArrayRef<clang::Expr const *>)::[lambda(clang::AtomicConstraint const &, clang::AtomicConstraint const &) (instance 2)]>(llvm::SmallVector<llvm::SmallVector<clang::AtomicConstraint *, 2u>, 4u>, llvm::SmallVector<llvm::SmallVector<clang::AtomicConstraint *, 2u>, 4u>, T1): A large function call parameter exceeding the low threshold is passed by value.
i. pass_by_value: Passing parameter PDNF of type NormalForm (size 144 bytes) by value, which exceeds the low threshold of 128 bytes.
ii. pass_by_value: Passing parameter QCNF of type NormalForm (size 144 bytes) by value, which exceeds the low threshold of 128 bytes.
2. Inside "CodeGenAction.cpp" file, in clang::reportOptRecordError(llvm::Error, clang::DiagnosticsEngine &, clang::CodeGenOptions): A very large function call parameter exceeding the high threshold is passed by value.
i. pass_by_value: Passing parameter CodeGenOpts of type clang::CodeGenOptions const (size 1560 bytes) by value, which exceeds the high threshold of 512 bytes.
3. Inside "SemaCodeComplete.cpp" file, in HandleCodeCompleteResults(clang::Sema *, clang::CodeCompleteConsumer *, clang::CodeCompletionContext, clang::CodeCompletionResult *, unsigned int): A large function call parameter exceeding the low threshold is passed by value.
i. pass_by_value: Passing parameter Context of type clang::CodeCompletionContext (size 200 bytes) by value, which exceeds the low threshold of 128 bytes.
4. Inside "SemaConcept.cpp" file, in <unnamed>::SatisfactionStackRAII::SatisfactionStackRAII(clang::Sema &, clang::NamedDecl const *, llvm::FoldingSetNodeID): A large function call parameter exceeding the low threshold is passed by value.
i. pass_by_value: Passing parameter FSNID of type llvm::FoldingSetNodeID (size 144 bytes) by value, which exceeds the low threshold of 128 bytes.
Reviewed By: erichkeane, aaron.ballman
Differential Revision: https://reviews.llvm.org/D147708
|
|
Make the access to profile data going through virtual file system so the
inputs can be remapped. In the context of the caching, it can make sure
we capture the inputs and provided an immutable input as profile data.
Reviewed By: akyrtzi, benlangmuir
Differential Revision: https://reviews.llvm.org/D139052
|
|
This patch replaces (llvm::|)Optional< with std::optional<. I'll post
a separate patch to remove #include "llvm/ADT/Optional.h".
This is part of an effort to migrate from llvm::Optional to
std::optional:
https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
|
|
This patch adds #include <optional> to those files containing
llvm::Optional<...> or Optional<...>.
I'll post a separate patch to actually replace llvm::Optional with
std::optional.
This is part of an effort to migrate from llvm::Optional to
std::optional:
https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
|
|
This patch mechanically replaces None with std::nullopt where the
compiler would warn if None were deprecated. The intent is to reduce
the amount of manual work required in migrating from Optional to
std::optional.
This is part of an effort to migrate from llvm::Optional to
std::optional:
https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
|
|
This patch replaces:
return Optional<T>();
with:
return None;
to make the migration from llvm::Optional to std::optional easier.
Specifically, I can deprecate None (in my source tree, that is) to
identify all the instances of None that should be replaced with
std::nullopt.
Note that "return None" far outnumbers "return Optional<T>();". There
are more than 2000 instances of "return None" in our source tree.
All of the instances in this patch come from functions that return
Optional<T> except Archive::findSym and ASTNodeImporter::import, where
we return Expected<Optional<T>>. Note that we can construct
Expected<Optional<T>> from any parameter convertible to Optional<T>,
which None certainly is.
This is part of an effort to migrate from llvm::Optional to
std::optional:
https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
Differential Revision: https://reviews.llvm.org/D138464
|
|
|
|
Explicitly call `LLVMContext::setOpaquePointers` in `CodeGenAction`
before loading any IR files. With this we use the mode specified on the
command-line rather than lazily initializing it based on the contents of
the IR.
This helps when using `-fthinlto-index` which may end up mixing files
with typed and opaque pointer types which fails when the first file
happened to use typed pointers since we cannot downgrade IR with opaque
pointer types to typed pointer types.
Differential Revision: https://reviews.llvm.org/D137475
|
|
Print source location info and demangle the name, compared
to the default behavior.
Several observations:
1. Specially handling this seems to give source locations
without enabling debug info, and also gives columns compared
to the backend diagnostic.
2. We're duplicating diagnostic effort in DiagnosticInfo
and clang. This feels wrong, but clang can demangle and I guess
have better debug info available? Should clang really have any of this
code? For the purposes of this diagnostic, the important piece
is just reading the source location out of the llvm::Function.
3. lld is not duplicating the same effort as clang with LTO, and
just directly printing the DiagnosticInfo as-is. e.g.
$ clang -fgpu-rdc
lld: error: local memory (480000) exceeds limit (65536) in function '_Z12use_huge_ldsIiEvv'
lld: error: local memory (960000) exceeds limit (65536) in function '_Z12use_huge_ldsIdEvv'
$ clang -fno-gpu-rdc
backend-resource-limit-diagnostics.hip:8:17: error: local memory (480000) exceeds limit (65536) in 'void use_huge_lds<int>()'
__global__ void use_huge_lds() {
^
backend-resource-limit-diagnostics.hip:8:17: error: local memory (960000) exceeds limit (65536) in 'void use_huge_lds<double>()'
2 errors generated when compiling for gfx90a.
4. Backend errors are not observed with -save-temps and -fno-gpu-rdc or -flto,
and the compile incorrectly succeeds.
5. The backend version prints error: <location info>; clang prints <location info>: error:
6. -emit-codegen-only is totally broken for AMDGPU. MC
gets a null target streamer. I do not understand why this
is a thing. This just creates a horrible edge case.
Just work around this by emitting actual code instead of blocking
this patch.
|
|
...instead of calling `llvm::sys::fs::current_path()` directly.
Differential Revision: https://reviews.llvm.org/D130443
|