Age | Commit message (Collapse) | Author | Files | Lines |
|
|
|
This patch removes spurious includes of `llvm/IR` files, and unnecessary
link components in the LLVM dialect.
The only major dependencies still coming from LLVM are
`llvm::DataLayout`, which is used by `verifyDataLayoutString` and some
`dwarf` symbols in some attributes. Both of them should likely be
removed in the future.
Finally, I also removed one constructor from `LLVM::AssumeOp` that used
[OperandBundleDefT](https://llvm.org/doxygen/classllvm_1_1OperandBundleDefT.html)
without good reason and introduced a header unnecessarily.
|
|
See https://github.com/llvm/llvm-project/pull/147168 for more info.
|
|
Reverts llvm/llvm-project#143441
Reverting due to CI failure
https://lab.llvm.org/buildbot/#/builders/53/builds/18055.
|
|
Atomic Control Options are used to specify architectural characteristics
to help lowering of atomic operations. The options used are:
`-f[no-]atomic-remote-memory`, `-f[no-]atomic-fine-grained-memory`,
`-f[no-]atomic-ignore-denormal-mode`.
Legacy option `-m[no-]unsafe-fp-atomics` is aliased to
`-f[no-]ignore-denormal-mode`.
More details can be found in
https://github.com/llvm/llvm-project/pull/102569. This PR implements the
frontend support for these options with OpenMP atomic in flang.
Backend changes are available in the draft PR:
https://github.com/llvm/llvm-project/pull/143769 which will be raised
after this merged.
|
|
-Wfatal-errors (#148748)
This PR changes how `-Werror` promotes warnings to errors so that it
interoperates with `-Wfatal-error`. It maintains the property that
warnings and other messages promoted to errors are displayed as there
original message.
|
|
Adds the flag `-Wfatal-errors` which truncates the error messages at 1 error.
|
|
algorithm (#146641)
This patch adds an option to select the method for computing complex
number division. It uses `LoweringOptions` to determine whether to lower
complex division to a runtime function call or to MLIR's `complex.div`,
and `CodeGenOptions` to select the computation algorithm for
`complex.div`. The available option values and their corresponding
algorithms are as follows:
- `full`: Lower to a runtime function call. (Default behavior)
- `improved`: Lower to `complex.div` and expand to Smith's algorithm.
- `basic`: Lower to `complex.div` and expand to the algebraic algorithm.
See also the discussion in the following discourse post:
https://discourse.llvm.org/t/optimization-of-complex-number-division/83468
---------
Co-authored-by: Tarun Prabhu <tarunprabhu@gmail.com>
|
|
Fixes `#146802`
#146582 started using the `Reserved` Frame Pointer kind for Arm64
Windows, but this revealed a bug in Flang where it copied the
`-mframe-pointer=reserved` flag from Clang, but didn't correctly handle
it in its own command line parser and subsequent compilation pipeline.
This change adds support for `-mframe-pointer=reserved` and adds a test
to make sure that functions are correctly marked when the flag is set.
|
|
Summary:
This patch is mostly an NFC that renames the existing `-fopenmp-targets`
into `--offload-targets`. Doing this early to simplify a follow-up patch
that will hopefully allow this syntax to be used more generically over
the existing `--offload` syntax (which I think is mostly unmaintained
now.). Following in the well-trodden path of trying to pull language
specific offload options into generic ones, but right now this is still
just OpenMP specific.
|
|
Adds a hint to the warning message to disable a warning and updates the
tests to expect this.
Also fixes a bug in the storage of canonical spelling of error flags so
that they are not used after free.
|
|
Reland #145901 with a fix for shared library builds.
So far flang generates runtime derived type info global definitions (as
opposed to declarations) for all the types used in the current
compilation unit even when the derived types are defined in other
compilation units. It is using linkonce_odr to achieve derived type
descriptor address "uniqueness" aspect needed to match two derived type
inside the runtime.
This comes at a big compile time cost because of all the extra globals
and their definitions in apps with many and complex derived types.
This patch adds and experimental option to only generate the rtti
definition for the types defined in the current compilation unit and to
only generate external declaration for the derived type descriptor
object of types defined elsewhere.
Note that objects compiled with this option are not compatible with
object files compiled without because files compiled without it may drop
the rtti for type they defined if it is not used in the compilation unit
because of the linkonce_odr aspect.
I am adding the option so that we can better measure the extra cost of
the current approach on apps and allow speeding up some compilation
where devirtualization does not matter (and the build config links to
all module file object anyway).
|
|
RFC:
https://discourse.llvm.org/t/rfc-removing-the-openmp-experimental-warning-for-llvm-21/86455
Fixes: #110008
|
|
For historical versions that are unsupported, emit a warning and assume
the currently default version.
For values of N that are not integers or that don't correspond to any
OpenMP version (old or newer), emit an error.
|
|
## Purpose
This patch is one in a series of code-mods that annotate LLVM’s public
interface for export. This patch ensures a few `cl::opt` declarations
are properly annotated with `LLVM_ABI`. The annotations currently have
no meaningful impact on the LLVM build; however, they are a prerequisite
to support an LLVM Windows DLL (shared library) build.
## Background
This effort is tracked in #109483. Additional context is provided in
[this
discourse](https://discourse.llvm.org/t/psa-annotating-llvm-public-interface/85307),
and documentation for `LLVM_ABI` and related annotations is found in the
LLVM repo
[here](https://github.com/llvm/llvm-project/blob/main/llvm/docs/InterfaceExportAnnotations.rst).
## Overview
- Remove local `extern` declarations of `llvm::PrintPipelinePasses`
because it is already correctly declared with an `LLVM_ABI` annotation
in `llvm\Passes\PassBuilder.h`. Leaving these declarations results in a
gcc compile warning unless they are also annotated with `LLVM_ABI`.
- Similarly, remove local `extern` declarations of
`ProfileSummaryCutoffHot` and `UseContextLessSummary` from
`llvm/tools/llvm-profgen/ProfileGenerator.cpp` since they are declared
with `LLVM_ABI` in `llvm\ProfileData\ProfileCommon.h`.
- Explicitly annotate the extern declaration of `ProfileCorrelate` in
`clang/lib/CodeGen/BackendUtil.cpp` since it is not declared in a
header. The definition of `ProfileCorrelate` in
`llvm\lib\Transforms\Instrumentation\InstrProfiling.cpp` is already
annotated with `LLVM_ABI`.
## Validation
Local builds and tests to validate cross-platform compatibility. This
included llvm, clang, and lldb on the following configurations:
- Windows with MSVC
- Windows with Clang
- Linux with GCC
- Linux with Clang
- Darwin with Clang
|
|
(#144559)
This just adds some convenience methods to feature control and rewrites
old code in terms of those methods. Also cleans up some names that I
just realize were overloads of another method.
|
|
This PR resubmits the changes from #136098, which was previously
reverted due to a build failure during the linking stage:
```
undefined reference to `llvm::DebugInfoCorrelate'
undefined reference to `llvm::ProfileCorrelate'
```
The root cause was that `llvm/lib/Frontend/Driver/CodeGenOptions.cpp`
references symbols from the `Instrumentation` component, but the
`LINK_COMPONENTS` in the `llvm/lib/Frontend/CMakeLists.txt` for
`LLVMFrontendDriver` did not include it. As a result, linking failed in
configurations where these components were not transitively linked.
### Fix:
This updated patch explicitly adds `Instrumentation` to
`LINK_COMPONENTS` in the relevant `llvm/lib/Frontend/CMakeLists.txt`
file to ensure the required symbols are properly resolved.
---------
Co-authored-by: ict-ql <168183727+ict-ql@users.noreply.github.com>
Co-authored-by: Chyaka <52224511+liliumshade@users.noreply.github.com>
Co-authored-by: Tarun Prabhu <tarunprabhu@gmail.com>
|
|
This patch adds support for the -mrecip command line option. The parsing
of this options is equivalent to Clang's and it is implemented by
setting the "reciprocal-estimates" function attribute.
Also move the ParseMRecip(...) function to CommonArgs, so that Flang is
able to make use of it as well.
---------
Co-authored-by: Cameron McInally <cmcinally@nvidia.com>
|
|
This change allows the flang CLI to accept `-W[no-]<feature>` flags matching the clang syntax and enable and disable usage and language feature warnings.
|
|
This patch moves the CommonArgs utilities into a location visible by the
Frontend Drivers, so that the Frontend Drivers may share option parsing
code with the Compiler Driver. This is useful when the Frontend Drivers
would like to verify that their incoming options are well-formed and
also not reinvent the option parsing wheel.
We already see code in the Clang/Flang Drivers that is parsing and
verifying its incoming options. E.g. OPT_ffp_contract. This option is
parsed in the Compiler Driver, Clang Driver, and Flang Driver, all with
slightly different parsing code. It would be nice if the Frontend
Drivers were not required to duplicate this Compiler Driver code. That
way there is no/low maintenance burden on keeping all these parsing
functions in sync.
Along those lines, the Frontend Drivers will now have a useful mechanism
to verify their incoming options are well-formed. Currently, the
Frontend Drivers trust that the Compiler Driver is not passing back junk
in some cases. The Language Drivers may even accept junk with no error
at all. E.g.:
`clang -cc1 -mprefer-vector-width=junk test.c'
With this patch, we'll now be able to tighten up incomming options to
the Frontend drivers in a lightweight way.
---------
Co-authored-by: Cameron McInally <cmcinally@nvidia.com>
Co-authored-by: Shafik Yaghmour <shafik.yaghmour@intel.com>
|
|
This patch adds aarch64 specific processor defines when targeting
aarch64, similar to the ones for ppc64 and x86_64
|
|
compiler" (#142159)
Reverts llvm/llvm-project#136098
|
|
(#136098)
This patch implements IR-based Profile-Guided Optimization support in
Flang through the following flags:
- `-fprofile-generate` for instrumentation-based profile generation
- `-fprofile-use=<dir>/file` for profile-guided optimization
Resolves #74216 (implements IR PGO support phase)
**Key changes:**
- Frontend flag handling aligned with Clang/GCC semantics
- Instrumentation hooks into LLVM PGO infrastructure
- LIT tests verifying:
- Instrumentation metadata generation
- Profile loading from specified path
- Branch weight attribution (IR checks)
**Tests:**
- Added gcc-flag-compatibility.f90 test module verifying:
- Flag parsing boundary conditions
- IR-level profile annotation consistency
- Profile input path normalization rules
- SPEC2006 benchmark results will be shared in comments
For details on LLVM's PGO framework, refer to [Clang PGO
Documentation](https://clang.llvm.org/docs/UsersManual.html#profile-guided-optimization).
This implementation was developed by [XSCC Compiler
Team](https://github.com/orgs/OpenXiangShan/teams/xscc).
---------
Co-authored-by: ict-ql <168183727+ict-ql@users.noreply.github.com>
Co-authored-by: Tom Eccles <t@freedommail.info>
|
|
This patch adds support for the -mprefer-vector-width= command line
option. The parsing of this options is equivalent to Clang's and it is
implemented by setting the "prefer-vector-width" function attribute.
Co-authored-by: Cameron McInally <cmcinally@nvidia.com>
|
|
|
|
Enable the use of -floop-interchange from the flang driver.
Enable in flang LLVM's loop interchange at levels -O2, -O3, -Ofast, and -Os.
|
|
This commit adds AMDLIBM support to fveclib targets. The support is
already present in clang and this patch extends it to flang.
|
|
The OpenMP version is stored in LangOptions in SemanticsContext. Use the
fallback version where SemanticsContext is unavailable (mostly in case
of debug dumps).
RFC:
https://discourse.llvm.org/t/rfc-alternative-spellings-of-openmp-directives/85507
Reland with a fix for build break in f18-parse-demo.
|
|
This reverts commit 41aa67488c3ca33334ec79fb5216145c3644277c.
Breaks build: https://lab.llvm.org/buildbot/#/builders/140/builds/22826
|
|
The OpenMP version is stored in LangOptions in SemanticsContext. Use the
fallback version where SemanticsContext is unavailable (mostly in case
of debug dumps).
RFC:
https://discourse.llvm.org/t/rfc-alternative-spellings-of-openmp-directives/85507
|
|
This is a fix for the issue https://github.com/llvm/llvm-project/issues/137126
that turned out to be a driver issue.
FrontendActions has a loop to process multiple input files and `flang -fc1`
accept multiple files, but the semantic, lowering, and llvm codegen
actions were not re-entrant, and crash or weird behaviors occurred
when processing multiple files with `-fc1`.
This patch makes the actions reentrant by cleaning-up the contexts/modules
if needed on entry.
|
|
|
|
There are various places where the -fveclib option is parsed to
determine whether its value is correct for the target. Unfortunately
these places assume case-insensitivity and subsequently use "LIBMVEC"
where the driver mandates "libmvec", thus rendering the diagnosistic
useless.
This PR corrects the naming along with similar incorrect uses within the
test files.
|
|
|
|
Evaluate (#131137)
Precompiling larger headers can save a lot of compile time across
various compilation units.
For the time being, disable precompiled headers for ccache builds on Windows
due to issues with reliably passing the appropriate options to ccache.
Selected compile time & memory improvements are as follows:
flang/lib/Parser/Fortran-parsers.cpp:
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:47.31 -> 0:41.68
Maximum resident set size (kbytes): 2062140 -> 1745584
flang/lib/Lower/Bridge.cpp:
Elapsed (wall clock) time (h:mm:ss or m:ss): 1:19.16 -> 0:45.86
Maximum resident set size (kbytes): 3849144 -> 2443476
flang/lib/Lower/PFTBuilder.cpp
Elapsed (wall clock) time (h:mm:ss or m:ss): 1:29.24 -> 1:00.99
Maximum resident set size (kbytes): 4218368 -> 2923128
flang/lib/Lower/Allocatable.cpp
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:53.03 -> 0:22.50
Maximum resident set size (kbytes): 3092840 -> 2116908
flang/lib/Semantics/Semantics.cpp
Elapsed (wall clock) time (h:mm:ss or m:ss): 1:18.75 -> 1:00.20
Maximum resident set size (kbytes): 3527744 -> 2545308
While the newly added precompiled headers are as follows:
Parser:
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:09.62
Maximum resident set size (kbytes): 1034608
Lower:
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:41.33
Maximum resident set size (kbytes): 3615240
Semantics:
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:26.69
Maximum resident set size (kbytes): 2403776
---------
Signed-off-by: Kajetan Puchalski <kajetan.puchalski@arm.com>
|
|
This patch defines `fir::SafeTempArrayCopyAttrInterface` and the
corresponding
OpenACC/OpenMP related attributes in FIR dialect. The actual
implementations
are just placeholders right now, and array repacking becomes a no-op
if `-fopenacc/-fopenmp` is used for the compilation.
|
|
I am making a CG pass to depend on `FIROpenACCSupport` in #134346.
This introduces a cyclic dependency between `FIROpenACCSupport`
and `FIRCodeGen`. This patch splits `FIRCodeGen` into
`FIRCodeGenDialect` (for FIR CG dialect definition) and `FIRCodeGen`
(for the CG passes).
Now, `FIROpenACCSupport` depends on `FIRCodeGenDialect`,
and `FIRCodeGen` depends on `FIROpenACCSupport`.
|
|
Reverts llvm/llvm-project#130600
Reverting on account of Windows issues with ccache, will bring it back
along with #131137 once those are resolved.
|
|
This patch updates flang to follow clang's behavior when processing the
`-mcode-object-version` option.
It is now used to populate an LLVM module flag called
`amdhsa_code_object_version` expected by the backend and also updates
the driver to add the `--amdhsa-code-object-version` option to the
frontend invocation for device compilation of AMDGPU targets.
|
|
Added options:
* -f[no-]repack-arrays
* -f[no-]stack-repack-arrays
* -frepack-arrays-contiguity=whole/innermost
|
|
Precise OpenMP standards support information is being documented in
#132707
Flang now has good support for OpenMP Version 3.1 and earlier.
|
|
This PR starts the effort to upstream AMD's internal implementation of
`do concurrent` to OpenMP mapping. This replaces #77285 since we
extended this WIP quite a bit on our fork over the past year.
An important part of this PR is a document that describes the current
status downstream, the upstreaming status, and next steps to make this
pass much more useful.
In addition to this document, this PR also contains the skeleton of the
pass (no useful transformations are done yet) and some testing for the
added command line options.
This looks like a huge PR but a lot of the added stuff is documentation.
It is also worth noting that the downstream pass has been validated on
https://github.com/BerkeleyLab/fiats. For the CPU mapping, this achived
performance speed-ups that match pure OpenMP, for GPU mapping we are
still working on extending our support for implicit memory mapping and
locality specifiers.
PR stack:
- https://github.com/llvm/llvm-project/pull/126026 (this PR)
- https://github.com/llvm/llvm-project/pull/127595
- https://github.com/llvm/llvm-project/pull/127633
- https://github.com/llvm/llvm-project/pull/127634
- https://github.com/llvm/llvm-project/pull/127635
|
|
function to Triple (#126956)
I'm adding support for SPIR-V, so let's consolidate these checks.
---------
Signed-off-by: Sarnie, Nick <nick.sarnie@intel.com>
|
|
Summary:
When we were first porting to COV5, this lead to some ABI issues due to
a change in how we looked up the work group size. Bitcode libraries
relied on the builtins to emit code, but this was changed between
versions. This prevented the bitcode libraries, like OpenMP or libc,
from being used for both COV4 and COV5. The solution was to have this
'none' functionality which effectively emitted code that branched off of
a global to resolve to either version.
This isn't a great solution because it forced every TU to have this
variable in it. The patch in
https://github.com/llvm/llvm-project/pull/131033 removed support for
COV4 from OpenMP, which was the only consumer of this functionality.
Other users like HIP and OpenCL did not use this because they linked the
ROCm Device Library directly which has its own handling (The name was
borrowed from it after all).
So, now that we don't need to worry about backward compatibility with
COV4, we can remove this special handling. Users can still emit COV4
code, this simply removes the special handling used to make the OpenMP
device runtime bitcode version agnostic.
|
|
Add -f[no-]slp-vectorize to the flang driver.
Add corresponding -fvectorize-slp to the flang frontend.
Enable -fslp-vectorize at -O2 and higher in flang to match the current
behaviour in clang.
---------
Signed-off-by: Kajetan Puchalski <kajetan.puchalski@arm.com>
|
|
This is an extension of #131357. Hopefully this would be the last one.
|
|
This flag provides extra commentary in the assembly output.
|
|
Most of the high memory usage and compilation time in the frontend units
is due to including large parsing headers.
This commit moves out several of the largest parsing headers into a new
precompiled header linked to flangFrontend.
The new compilation metrics for FrontendActions.cpp are as follows:
User time (seconds): 38.40
System time (seconds): 2.00
Maximum resident set size (kbytes): 2710964 (2.58 GB) (vs 3.78 GB)
ParserActions.cpp:
User time (seconds): 69.37
System time (seconds): 1.81
Maximum resident set size (kbytes): 2599456 (2.47 GB) (vs 4 GB)
Alongside the new precompiled header compilation unit
User time (seconds): 41.61
System time (seconds): 2.72
Maximum resident set size (kbytes): 3107644 (2.96 GB)
---------
Signed-off-by: Kajetan Puchalski <kajetan.puchalski@arm.com>
|
|
Adjust for #130940.
|
|
This PR addresses some of the issues described in
https://github.com/llvm/llvm-project/issues/127617. Key changes:
- Stop assuming fixed-form for `-x f95` unless the input is a `.i` file.
This change ensures compatibility with `-save-temps` workflows while
preventing unintended fixed-form assumptions.
- Ensure `-x f95-cpp-input` enables `-cpp` by default, aligning Flang's
behavior with `gfortran`.
|