Update nvptx documentation:
- Use meaningful terms: "PTX ISA target architecture" and "PTX ISA version".
- Remove invalid claim that "ISA strings must be lower-case".
- Add missing sm_xx entries.
- Fix misa default.
- Add march, copying misa doc.
- Declare misa an march alias.
- Add march-map.
- Fix "for given the specified" typo.
gcc/ChangeLog:
2022-03-29 Tom de Vries <tdevries@suse.de>
* doc/invoke.texi (misa, mptx): Update.
(march, march-map): Add.
|
|
2022-03-29 Chenghua Xu <xuchenghua@loongson.cn>
Lulu Cheng <chenglulu@loongson.cn>
gcc/ChangeLog:
* doc/install.texi: Add LoongArch options section.
* doc/invoke.texi: Add LoongArch options section.
* doc/md.texi: Add LoongArch options section.
contrib/ChangeLog:
* config-list.mk: Add LoongArch triplet.
|
|
These have been misdocumented since C++98 POD was split into C++11 trivial
and standard-layout in r149721.
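As a rough illustration of the split mentioned above, the sketch below uses the standard library traits (as stand-ins for the built-in traits) to show that a type can be trivial without being standard-layout, and vice versa. The struct names and the helper trivial_and_standard_layout are made up for this example.

```cpp
#include <type_traits>

// Hypothetical types, chosen only to separate the two properties.
struct Base { int a; };

struct TrivialOnly : Base {  // trivial, but data members in both the base
  int b;                     // and the derived class break standard-layout
};

struct LayoutOnly {          // standard-layout, but the user-provided
  LayoutOnly() : a(0) {}     // default constructor makes it non-trivial
  int a;
};

struct Both {                // trivial and standard-layout
  int a;
  int b;
};

// C++11 describes a POD type as one that has both properties.
template <typename T>
constexpr bool trivial_and_standard_layout() {
  return std::is_trivial<T>::value && std::is_standard_layout<T>::value;
}

static_assert(std::is_trivial<TrivialOnly>::value,
              "TrivialOnly should be trivial");
static_assert(!std::is_standard_layout<TrivialOnly>::value,
              "TrivialOnly should not be standard-layout");
static_assert(!std::is_trivial<LayoutOnly>::value,
              "LayoutOnly should not be trivial");
static_assert(std::is_standard_layout<LayoutOnly>::value,
              "LayoutOnly should be standard-layout");
```

A trait documented in terms of POD therefore says more than it should whenever only one of the two properties is actually required.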
PR c++/59426
gcc/ChangeLog:
* doc/extend.texi: Refer to __is_trivial instead of __is_pod.
|
|
With TeX output ("make pdf"), @gccoptlist's content ends up on a single
line, so TeX does not find the matching '@end ignore' for the
'@ignore' block and fails with a runaway error. The solution is to move
the @ignore block after the closing '}'.
(Follow-up to r12-7808-g319ba7e241e7e21f9eb481f075310796f13d2035)
gcc/
PR analyzer/103533
* doc/invoke.texi (Static Analyzer Options): Move
@ignore block after @gccoptlist's '}' for 'make pdf'.
|
|
PR analyzer/104954 tracks that -fanalyzer was taking a very long time
on a particular source file in the Linux kernel:
drivers/gpu/drm/amd/display/dc/calcs/dce_calcs.c
One issue occurs with the repeated use of dynamic debug lines e.g. via
the DC_LOG_BANDWIDTH_CALCS macro, such as in print_bw_calcs_dceip in
drivers/gpu/drm/amd/display/dc/calcs/calcs_logger.h:
DC_LOG_BANDWIDTH_CALCS("#####################################################################");
DC_LOG_BANDWIDTH_CALCS("struct bw_calcs_dceip");
DC_LOG_BANDWIDTH_CALCS("#####################################################################");
[...snip dozens of lines...]
DC_LOG_BANDWIDTH_CALCS("[bw_fixed] dmif_request_buffer_size: %d",
bw_fixed_to_int(dceip->dmif_request_buffer_size));
When this is configured to use __dynamic_pr_debug, each of these becomes
code like:
do {
static struct _ddebug __attribute__((__aligned__(8)))
__attribute__((__section__("__dyndbg"))) __UNIQUE_ID_ddebug277 = {
[...snip...]
};
if (arch_static_branch(&__UNIQUE_ID_ddebug277.key, false))
__dynamic_pr_debug(&__UNIQUE_ID_ddebug277, [...the message...]);
} while (0);
The analyzer was naively seeing each call to __dynamic_pr_debug and noting
that the __UNIQUE_ID_nnnn object escapes. As successive __UNIQUE_ID_nnnn
objects escape, the Nth call sees N escaped objects, all of which need
clobbering, so we have O(N^2) clobbering of escaped objects overall,
leading to huge amounts of pointless work: print_bw_calcs_data has 225
uses of DC_LOG_BANDWIDTH_CALCS, many of which are in loops.
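The quadratic growth can be sketched with a toy cost model. This is not analyzer code: simulate_clobbers is a hypothetical helper, and it assumes that every call clobbers each object that has escaped so far, including the one it has just exposed.

```cpp
#include <cassert>
#include <cstddef>

// Toy cost model: count how many clobber operations a straight-line
// sequence of n_calls logging calls would cause.
std::size_t simulate_clobbers(std::size_t n_calls) {
  std::size_t escaped = 0;   // objects that have escaped so far
  std::size_t clobbers = 0;  // total clobber operations performed
  for (std::size_t call = 0; call < n_calls; ++call) {
    ++escaped;               // this call's __UNIQUE_ID object escapes
    clobbers += escaped;     // the call may write to every escaped object
  }
  return clobbers;           // 1 + 2 + ... + n_calls = n_calls * (n_calls + 1) / 2
}
```

Under these assumptions, 225 straight-line uses already cost 25425 clobbers, before any loops are taken into account.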
This patch adds a way to identify declarations that aren't interesting
to the analyzer, so that we don't attempt to create binding_clusters
for them (i.e. we don't store any state for them in our state objects).
This is implemented by adding a new region::tracked_p, implemented for
declarations by walking the existing IPA data the first time the
analyzer sees a declaration, setting it to false for global vars that
have no loads/stores/aliases, and "sufficiently safe" address-of
ipa-refs.
The patch gives a large speedup of -fanalyzer on the above kernel
source file:
                           Before  After
Total cc1 wallclock time:  180s    36s
analyzer wallclock time:   162s    17s
% spent in analyzer:       90%     47%
gcc/analyzer/ChangeLog:
PR analyzer/104954
* analyzer.opt (-fdump-analyzer-untracked): New option.
* engine.cc (impl_run_checkers): Handle it.
* region-model-asm.cc (region_model::on_asm_stmt): Don't attempt
to clobber regions with !tracked_p ().
* region-model-manager.cc (dump_untracked_region): New.
(region_model_manager::dump_untracked_regions): New.
(frame_region::dump_untracked_regions): New.
* region-model.h (region_model_manager::dump_untracked_regions):
New decl.
* region.cc (ipa_ref_requires_tracking): New.
(symnode_requires_tracking_p): New.
(decl_region::calc_tracked_p): New.
* region.h (region::tracked_p): New vfunc.
(frame_region::dump_untracked_regions): New decl.
(class decl_region): Note that this is also used for SSA names.
(decl_region::decl_region): Initialize m_tracked.
(decl_region::tracked_p): New.
(decl_region::calc_tracked_p): New decl.
(decl_region::m_tracked): New.
* store.cc (store::get_or_create_cluster): Assert that we
don't try to create clusters for base regions that aren't
trackable.
(store::mark_as_escaped): Don't mark base regions that we're not
tracking.
gcc/ChangeLog:
PR analyzer/104954
* doc/invoke.texi (Static Analyzer Options): Add
-fdump-analyzer-untracked.
gcc/testsuite/ChangeLog:
PR analyzer/104954
* gcc.dg/analyzer/asm-x86-dyndbg-1.c: New test.
* gcc.dg/analyzer/asm-x86-dyndbg-2.c: New test.
* gcc.dg/analyzer/many-unused-locals.c: New test.
* gcc.dg/analyzer/untracked-1.c: New test.
* gcc.dg/analyzer/unused-local-1.c: New test.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
|
|
gcc/ChangeLog:
PR analyzer/103533
* doc/invoke.texi: Document that enabling taint analyzer
checker disables some warnings from `-fanalyzer`.
Signed-off-by: Avinash Sonawane <rootkea@gmail.com>
|
|
gcc/ChangeLog:
* doc/invoke.texi: Document min-pagesize parameter.
|
|
-march=sapphirerapids should be based on icelake-server, not cooperlake.
gcc/ChangeLog:
PR target/104963
* config/i386/i386.h (PTA_SAPPHIRERAPIDS): Change it to be based on ICX.
* doc/invoke.texi: Update documents for Intel sapphirerapids.
gcc/testsuite/ChangeLog:
PR target/104963
* gcc.target/i386/pr104963.c: New test case.
|
|
A well-formed call to std::move/forward is equivalent to a cast, but the
former being a function call means the compiler generates debug info,
which persists even after the call gets inlined, for an operation that's
never interesting to debug.
This patch addresses this problem by folding calls to std::move/forward
and other cast-like functions into simple casts as part of the frontend's
general expression folding routine. This behavior is controlled by a
new flag -ffold-simple-inlines, and otherwise by -fno-inline, so that
users can enable this folding with -O0 (which implies -fno-inline).
After this patch with -O2 and a non-checking compiler, debug info size
for some testcases from range-v3 and cmcstl2 decreases by as much as ~10%
and overall compile time and memory usage decrease by ~2%.
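The equivalence the folding relies on can be checked at compile time. The sketch below is only an illustration of "std::move is a cast"; move_matches_cast, take_with_move and take_with_cast are hypothetical names, not part of the patch.

```cpp
#include <string>
#include <type_traits>
#include <utility>

// True when std::move on an lvalue of type T yields the same type and
// value category as a plain cast to T&&.
template <typename T>
constexpr bool move_matches_cast() {
  return std::is_same<decltype(std::move(std::declval<T&>())),
                      decltype(static_cast<T&&>(std::declval<T&>()))>::value;
}

static_assert(move_matches_cast<int>(),
              "std::move<int&> should match the cast");
static_assert(move_matches_cast<std::string>(),
              "std::move<std::string&> should match the cast");

// Two spellings of the same operation: the first is a function call,
// the second is the cast it can be folded into.
std::string take_with_move(std::string& s) { return std::move(s); }
std::string take_with_cast(std::string& s) {
  return static_cast<std::string&&>(s);
}
```

Both functions move-construct the result from s. Only the first one contains a call that needs debug info unless it is folded away.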
PR c++/96780
gcc/ChangeLog:
* doc/invoke.texi (C++ Dialect Options): Document
-ffold-simple-inlines.
gcc/c-family/ChangeLog:
* c.opt: Add -ffold-simple-inlines.
gcc/cp/ChangeLog:
* cp-gimplify.cc (cp_fold) <case CALL_EXPR>: Fold calls to
std::move/forward and other cast-like functions into simple
casts.
gcc/testsuite/ChangeLog:
* g++.dg/opt/pr96780.C: New test.
|
|
|
|
exception [PR102586]
As mentioned by Jason in the PR, non-trivially-copyable types (or non-POD
for purposes of layout?) can be base classes of derived classes in
which the padding of those base classes is reused for
real data members, or the layout can even change and data members can
be moved to other positions.
__builtin_clear_padding is currently used for multiple purposes.
In <atomic>, where it isn't used yet but which was planned as its main use,
it can be used for trivially copyable types only; ditto for std::bit_cast,
where we also use it. It is used for OpenMP long double atomics too, but
long double is trivially copyable, and lastly for -ftrivial-auto-var-init=.
The following patch restricts the builtin to pointers to trivially-copyable
types, except when it is called directly on the address of a
variable; in that case the FE can already verify that it is the complete object
type, so it is safe to clear all the padding in it.
2022-03-14 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/102586
gcc/
* doc/extend.texi (__builtin_clear_padding): Clarify that for C++
argument type should be pointer to trivially-copyable type unless it
is address of a variable or parameter.
gcc/cp/
* call.cc (build_cxx_call): Diagnose __builtin_clear_padding where
first argument's type is pointer to non-trivially-copyable type unless
it is address of a variable or parameter.
gcc/testsuite/
* g++.dg/cpp2a/builtin-clear-padding1.C: New test.
|
|
gcc/c-family/ChangeLog:
* c-target.def (check_string_object_format_arg): Fix description typo.
gcc/ChangeLog:
* doc/invoke.texi: Fix typos.
* doc/tm.texi.in: Remove duplicated word.
* doc/tm.texi: Regenerate.
libgomp/ChangeLog:
* libgomp.texi: Fix typo.
|
|
|
|
-fwrapv and C++20+ [PR104711]
As mentioned in the PR, different standards have different definitions
of which left shifts are UB. They all agree on out-of-bounds (including
negative) shift counts.
The rules used by ubsan are:
C99-C2x          ((unsigned) x >> (uprecm1 - y)) != 0          then UB
C++11-C++17      x < 0 || ((unsigned) x >> (uprecm1 - y)) > 1  then UB
C++20 and later  everything is well defined
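The first two rules can be written out as predicates. This is a sketch for a 32-bit signed left operand only, so uprecm1 is fixed at 31; the function names are hypothetical and this is not the code ubsan emits.

```cpp
#include <cassert>
#include <cstdint>

// Precision of the unsigned counterpart of the left operand, minus one.
// Assumption for this sketch: the left operand is a 32-bit int.
constexpr unsigned uprecm1 = 31;

// C99-C2x rule: UB if any set bit would land in or beyond the sign bit.
bool shl_is_ub_c99(std::int32_t x, unsigned y) {
  assert(y <= uprecm1);  // out-of-range counts are UB under every rule
  return (static_cast<std::uint32_t>(x) >> (uprecm1 - y)) != 0;
}

// C++11-C++17 rule: negative x is UB; a set bit may land in the sign
// bit, but nothing may be shifted beyond it.
bool shl_is_ub_cxx11(std::int32_t x, unsigned y) {
  assert(y <= uprecm1);
  return x < 0 || (static_cast<std::uint32_t>(x) >> (uprecm1 - y)) > 1;
}
```

For example, 1 << 31 is flagged by the C99-C2x predicate but not by the C++11-C++17 one, while 2 << 31 and -1 << 1 are flagged by both.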
Now, for C++20, in the P1236R1 implementation I added an early
exit for the -Wshift-overflow* warnings so that they never warn, but apparently
-Wshift-negative-value remained as is. As it is well defined in C++20,
the following patch no longer enables -Wshift-negative-value from -Wextra
for C++20 and later; users who want the warning for compatibility with C++17
and earlier can still get it by using -Wshift-negative-value
explicitly.
Another thing is -fwrapv, which is an extension to the standards, so it is up
to us how exactly we define that case. Our ubsan code treats
TYPE_OVERFLOW_WRAPS (type0) and cxx_dialect >= cxx20 the same way, diagnosing
only out-of-bounds shift counts and nothing else, and IMHO it makes the most
sense to treat -fwrapv signed left shifts the same as C++20 treats
them, https://eel.is/c++draft/expr.shift#2
"The value of E1 << E2 is the unique value congruent to E1×2^E2 modulo 2^N,
where N is the width of the type of the result.
[Note 1: E1 is left-shifted E2 bit positions; vacated bits are zero-filled.
— end note]"
with no UB dependent on the E1 values. The UB is only
"The behavior is undefined if the right operand is negative, or greater
than or equal to the width of the promoted left operand."
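The quoted wording can be modelled by shifting in the unsigned type and converting back. This is a sketch for 32-bit int; shl_wrapping is a hypothetical helper, and the conversion back to the signed type is modular in C++20 (GCC defines it that way for earlier standards as well).

```cpp
#include <cassert>
#include <cstdint>

// E1 << E2 as the unique value congruent to E1 * 2^E2 modulo 2^32.
std::int32_t shl_wrapping(std::int32_t e1, unsigned e2) {
  assert(e2 < 32);  // the only remaining UB: out-of-range shift count
  return static_cast<std::int32_t>(static_cast<std::uint32_t>(e1) << e2);
}
```

With this definition a negative left operand is unremarkable: shl_wrapping(-1, 1) is -2, and shl_wrapping(1, 31) is INT32_MIN.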
Under the hood (except for FEs and ubsan from FEs) GCC middle-end doesn't
consider UB in left shifts dependent on the first operand's value, only
the out of bounds shifts.
While this change isn't a regression fix, I think it is useful for GCC 12:
it doesn't add new warnings, but just removes warnings that aren't
appropriate.
2022-03-09 Jakub Jelinek <jakub@redhat.com>
PR c/104711
gcc/
* doc/invoke.texi (-Wextra): Document that -Wshift-negative-value
is enabled by it only for C++11 to C++17 rather than for C++03 or
later.
(-Wshift-negative-value): Similarly (except here we stated
that it is enabled for C++11 or later).
gcc/c-family/
* c-opts.cc (c_common_post_options): Don't enable
-Wshift-negative-value from -Wextra for C++20 or later.
* c-ubsan.cc (ubsan_instrument_shift): Adjust comments.
* c-warn.cc (maybe_warn_shift_overflow): Use TYPE_OVERFLOW_WRAPS
instead of TYPE_UNSIGNED.
gcc/c/
* c-fold.cc (c_fully_fold_internal): Don't emit
-Wshift-negative-value warning if TYPE_OVERFLOW_WRAPS.
* c-typeck.cc (build_binary_op): Likewise.
gcc/cp/
* constexpr.cc (cxx_eval_check_shift_p): Use TYPE_OVERFLOW_WRAPS
instead of TYPE_UNSIGNED.
* typeck.cc (cp_build_binary_op): Don't emit
-Wshift-negative-value warning if TYPE_OVERFLOW_WRAPS.
gcc/testsuite/
* c-c++-common/Wshift-negative-value-1.c: Remove
dg-additional-options, instead in target selectors of each diagnostic
check for exact C++ versions where it should be diagnosed.
* c-c++-common/Wshift-negative-value-2.c: Likewise.
* c-c++-common/Wshift-negative-value-3.c: Likewise.
* c-c++-common/Wshift-negative-value-4.c: Likewise.
* c-c++-common/Wshift-negative-value-7.c: New test.
* c-c++-common/Wshift-negative-value-8.c: New test.
* c-c++-common/Wshift-negative-value-9.c: New test.
* c-c++-common/Wshift-negative-value-10.c: New test.
* c-c++-common/Wshift-overflow-1.c: Remove
dg-additional-options, instead in target selectors of each diagnostic
check for exact C++ versions where it should be diagnosed.
* c-c++-common/Wshift-overflow-2.c: Likewise.
* c-c++-common/Wshift-overflow-5.c: Likewise.
* c-c++-common/Wshift-overflow-6.c: Likewise.
* c-c++-common/Wshift-overflow-7.c: Likewise.
* c-c++-common/Wshift-overflow-8.c: New test.
* c-c++-common/Wshift-overflow-9.c: New test.
* c-c++-common/Wshift-overflow-10.c: New test.
* c-c++-common/Wshift-overflow-11.c: New test.
* c-c++-common/Wshift-overflow-12.c: New test.
|
|
This adds a --param to allow disabling of vectorization of
floating point inductions. On top of -Ofast this should allow
549.fotonik3d_r to not miscompare.
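Why vectorizing a floating-point induction can change results is easiest to see with a scalar model. The sketch below assumes a 4-lane scheme in which each lane starts at x0 + k*step and advances by 4*step; all function names are hypothetical, and this illustrates only the rounding effect, not GCC's implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Scalar semantics: x is updated by repeated addition, rounding each time.
void fill_scalar(float* out, std::size_t n, float x0, float step) {
  float x = x0;
  for (std::size_t i = 0; i < n; ++i) {
    out[i] = x;
    x += step;
  }
}

// Assumed 4-lane model: fewer, larger additions per lane, so the
// rounding errors accumulate differently from the scalar loop.
void fill_vector_model(float* out, std::size_t n, float x0, float step) {
  assert(n % 4 == 0);  // keep the sketch free of epilogue handling
  float lane[4];
  for (int k = 0; k < 4; ++k) lane[k] = x0 + k * step;
  const float vstep = 4 * step;
  for (std::size_t i = 0; i < n; i += 4)
    for (int k = 0; k < 4; ++k) {
      out[i + k] = lane[k];
      lane[k] += vstep;
    }
}

// Largest element-wise difference between the two schemes.
float max_abs_diff(std::size_t n, float x0, float step) {
  std::vector<float> s(n), v(n);
  fill_scalar(s.data(), n, x0, step);
  fill_vector_model(v.data(), n, x0, step);
  float worst = 0.0f;
  for (std::size_t i = 0; i < n; ++i)
    worst = std::max(worst, std::fabs(s[i] - v[i]));
  return worst;
}
```

When every intermediate value is exactly representable (for example step 0.5f) the two schemes agree exactly. For a step such as 0.1f they need not, which is the kind of difference a result comparison in a benchmark can trip over.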
2022-03-08 Richard Biener <rguenther@suse.de>
PR tree-optimization/84201
* params.opt (-param=vect-induction-float): Add.
* doc/invoke.texi (vect-induction-float): Document.
* tree-vect-loop.cc (vectorizable_induction): Honor
param_vect_induction_float.
* gcc.dg/vect/pr84201.c: New testcase.
|
|
As C++20 has already been published, we don't need to link to the draft
(which is now the C++23 draft anyway). And there's no need to say it's
part of the C++20 spec, or that there might be defect reports. That's
true for everything in C++20, so calling it out here just for Modules
isn't needed.
gcc/ChangeLog:
* doc/invoke.texi (C++ Modules): Remove anachronism.
|
|
|
|
At the same time, add -Wtrivial-auto-var-init and update documentation.
For the following test case:
1 int g(int *);
2 int f1()
3 {
4 switch (0) {
5 int x;
6 default:
7 return g(&x);
8 }
9 }
compiling with -O -ftrivial-auto-var-init causes a spurious warning:
warning: statement will never be executed [-Wswitch-unreachable]
5 | int x;
| ^
This is due to the compiler-generated initialization at the point of
the declaration.
We can avoid the warning by excluding the following cases,
when
flag_auto_var_init > AUTO_INIT_UNINITIALIZED
and the statement is one of:
1) a call to .DEFERRED_INIT
2) a call to __builtin_clear_padding if the 2nd argument is present and non-zero
3) a gimple assign store right after the .DEFERRED_INIT call that has the LHS
as RHS
However, we still need to warn users about this limitation of
-ftrivial-auto-var-init, so add a new warning option -Wtrivial-auto-var-init
to report cases where it cannot initialize the auto variable. At the same
time, update the documentation for -ftrivial-auto-var-init to connect it with
the new warning option -Wtrivial-auto-var-init, and add documentation
for -Wtrivial-auto-var-init.
gcc/ChangeLog:
PR middle-end/102276
* common.opt (-Wtrivial-auto-var-init): New option.
* doc/invoke.texi (-Wtrivial-auto-var-init): Document new option.
(-ftrivial-auto-var-init): Update option.
* gimplify.cc (emit_warn_switch_unreachable): New function.
(warn_switch_unreachable_r): Rename to ...
(warn_switch_unreachable_and_auto_init_r): This.
(maybe_warn_switch_unreachable): Rename to ...
(maybe_warn_switch_unreachable_and_auto_init): This.
(gimplify_switch_expr): Update calls to renamed function.
gcc/testsuite/ChangeLog:
PR middle-end/102276
* gcc.dg/auto-init-pr102276-1.c: New test.
* gcc.dg/auto-init-pr102276-2.c: New test.
* gcc.dg/auto-init-pr102276-3.c: New test.
* gcc.dg/auto-init-pr102276-4.c: New test.
|
|
PR gcov-profile/104677
gcc/ChangeLog:
* doc/invoke.texi: Document more .gcda file name generation.
|
|
The code generated by -mcmodel=medany is defined to be
position-independent, but is not guaranteed to function correctly when
linked into position-independent executables or libraries. See the
recent discussion at the psABI specification [1] for more details.
It would be better to reject these invalid sequences when linking, but
as pointed out in a recent LD bug [2] there may be some compatibility
issues related to the PCREL_HI20 relocations used to initialize GP.
Given the complexity here it's unlikely we'll be able to reject these
sequences any time soon, so instead just document that these may not
work.
[1]: https://github.com/riscv-non-isa/riscv-elf-psabi-doc/issues/245
[2]: https://sourceware.org/bugzilla/show_bug.cgi?id=28789
gcc/ChangeLog:
* doc/invoke.texi (RISC-V -mcmodel=medany): Document the degree
of position independence that -mcmodel=medany affords.
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
|
|
The following patch avoids infinite recursion during generic folding.
The (cmp (bswap @0) INTEGER_CST@1) simplification relies on
(bswap @1) actually being simplified, if it is not simplified, we just
move the bswap from one operand to the other and if @0 is also INTEGER_CST,
we apply the same rule next.
The reason why bswap @1 isn't folded to INTEGER_CST is that the INTEGER_CST
has TREE_OVERFLOW set on it and fold-const-call.cc predicate punts in
such cases:
static inline bool
integer_cst_p (tree t)
{
return TREE_CODE (t) == INTEGER_CST && !TREE_OVERFLOW (t);
}
The patch uses the ! modifier to ensure the bswap is simplified and
extends support to GENERIC by requiring !EXPR_P, which
is not perfect but is a conservative approximation.
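The identity behind the simplification, for equality comparisons, is that swapping the bytes of both sides does not change the outcome. The sketch below shows it for 32-bit values; bswap32, cmp_original and cmp_folded are hypothetical names used only here.

```cpp
#include <cstdint>

// Portable 32-bit byte swap.
constexpr std::uint32_t bswap32(std::uint32_t v) {
  return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
         ((v << 8) & 0x00ff0000u) | (v << 24);
}

// Original form: byte-swap the variable, compare with the constant.
constexpr bool cmp_original(std::uint32_t x, std::uint32_t c) {
  return bswap32(x) == c;
}

// Folded form: byte-swap the constant instead. This is only a win if
// bswap32(c) is itself folded to a constant; otherwise the bswap has
// merely moved to the other operand.
constexpr bool cmp_folded(std::uint32_t x, std::uint32_t c) {
  return x == bswap32(c);
}

static_assert(bswap32(0x11223344u) == 0x44332211u, "bytes are reversed");
static_assert(cmp_original(0x11223344u, 0x44332211u) ==
              cmp_folded(0x11223344u, 0x44332211u),
              "both forms agree");
```

If both operands are constants and the swapped constant is not folded, the rewritten expression matches the same rule again, which is the recursion described above.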
2022-02-22 Richard Biener <rguenther@suse.de>
PR tree-optimization/104644
* doc/match-and-simplify.texi: Amend ! documentation.
* genmatch.cc (expr::gen_transform): Code-generate ! support
for GENERIC.
(parser::parse_expr): Allow ! for GENERIC.
* match.pd (cmp (bswap @0) INTEGER_CST@1): Use ! modifier on
bswap.
* gcc.dg/pr104644.c: New test.
Co-Authored-by: Jakub Jelinek <jakub@redhat.com>
|
|
Since the ISAs listed for Intel architectures in the documentation
are inconsistent with what is actually supported, update them all.
gcc/ChangeLog:
* doc/invoke.texi: Update documents for Intel architectures.
|
|
The problem in this PR is that we call VPSEL with a mask of vector
type instead of HImode. This happens because operand 3 in vcond_mask
is the pre-computed vector comparison and has vector type.
This patch fixes it by implementing TARGET_VECTORIZE_GET_MASK_MODE,
returning the appropriate VxBI mode when targeting MVE. In turn, this
implies implementing vec_cmp<mode><MVE_vpred>,
vec_cmpu<mode><MVE_vpred> and vcond_mask_<mode><MVE_vpred>, and we can
move vec_cmp<mode><v_cmp_result>, vec_cmpu<mode><mode> and
vcond_mask_<mode><v_cmp_result> back to neon.md since they are not
used by MVE anymore. The new *<MVE_vpred> patterns listed above are
implemented in mve.md since they are only valid for MVE. However this
may make maintenance/comparison more painful than having all of them
in vec-common.md.
In the process, we can get rid of the recently added vcond_mve
parameter of arm_expand_vector_compare.
Compared to neon.md's vcond_mask_<mode><v_cmp_result> before my "arm:
Auto-vectorization for MVE: vcmp" patch (r12-834), it keeps the VDQWH
iterator added in r12-835 (to have V4HF/V8HF support), as well as the
(!<Is_float_mode> || flag_unsafe_math_optimizations) condition which
was not present before r12-834 although SF modes were enabled by VDQW
(I think this was a bug).
Using TARGET_VECTORIZE_GET_MASK_MODE has the advantage that we no
longer need to generate vpsel with vectors of 0 and 1: the masks are
now merged via scalar 'ands' instructions operating on 16-bit masks
after converting the boolean vectors.
In addition, this patch fixes a problem in arm_expand_vcond() where
the result would be a vector of 0 or 1 instead of operand 1 or 2.
Since we want to skip gcc.dg/signbit-2.c for MVE, we also add a new
arm_mve effective target.
Reducing the number of iterations in pr100757-3.c from 32 to 8, we
generate the code below:
float a[32];
float fn1(int d) {
float c = 4.0f;
for (int b = 0; b < 8; b++)
if (a[b] != 2.0f)
c = 5.0f;
return c;
}
fn1:
ldr r3, .L3+48
vldr.64 d4, .L3 // q2=(2.0,2.0,2.0,2.0)
vldr.64 d5, .L3+8
vldrw.32 q0, [r3] // q0=a(0..3)
adds r3, r3, #16
vcmp.f32 eq, q0, q2 // cmp a(0..3) == (2.0,2.0,2.0,2.0)
vldrw.32 q1, [r3] // q1=a(4..7)
vmrs r3, P0
vcmp.f32 eq, q1, q2 // cmp a(4..7) == (2.0,2.0,2.0,2.0)
vmrs r2, P0 @ movhi
ands r3, r3, r2 // r3=select(a(0..3)) & select(a(4..7))
vldr.64 d4, .L3+16 // q2=(5.0,5.0,5.0,5.0)
vldr.64 d5, .L3+24
vmsr P0, r3
vldr.64 d6, .L3+32 // q3=(4.0,4.0,4.0,4.0)
vldr.64 d7, .L3+40
vpsel q3, q3, q2 // q3=vcond_mask(4.0,5.0)
vmov.32 r2, q3[1] // keep the scalar max
vmov.32 r0, q3[3]
vmov.32 r3, q3[2]
vmov.f32 s11, s12
vmov s15, r2
vmov s14, r3
vmaxnm.f32 s15, s11, s15
vmaxnm.f32 s15, s15, s14
vmov s14, r0
vmaxnm.f32 s15, s15, s14
vmov r0, s15
bx lr
.L4:
.align 3
.L3:
.word 1073741824 // 2.0f
.word 1073741824
.word 1073741824
.word 1073741824
.word 1084227584 // 5.0f
.word 1084227584
.word 1084227584
.word 1084227584
.word 1082130432 // 4.0f
.word 1082130432
.word 1082130432
.word 1082130432
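The listing above can be modelled in scalar C++ to make the role of the 16-bit predicate explicit. This is a sketch, not MVE intrinsics: cmp_eq_mask, select_lanes and fn1_model are hypothetical helpers, and the predicate layout (four bits per 32-bit lane) follows the byte-granular layout of VPR.P0.

```cpp
#include <cassert>
#include <cstdint>

// Compare four float lanes against k; each 32-bit lane owns four bits
// of the 16-bit predicate.
std::uint16_t cmp_eq_mask(const float* v, float k) {
  assert(v != nullptr);
  unsigned mask = 0;
  for (int lane = 0; lane < 4; ++lane)
    if (v[lane] == k) mask |= 0xFu << (4 * lane);
  return static_cast<std::uint16_t>(mask);
}

// Per-lane select driven by the predicate, in the spirit of vpsel.
void select_lanes(std::uint16_t mask, const float* if_set,
                  const float* if_clear, float* out) {
  for (int lane = 0; lane < 4; ++lane)
    out[lane] = ((mask >> (4 * lane)) & 1u) ? if_set[lane] : if_clear[lane];
}

// Scalar model of fn1 for eight elements.
float fn1_model(const float* a) {
  // Merge the two predicates with a scalar AND, like the "ands" above.
  const std::uint16_t all_equal = static_cast<std::uint16_t>(
      cmp_eq_mask(a, 2.0f) & cmp_eq_mask(a + 4, 2.0f));
  const float four[4] = {4.0f, 4.0f, 4.0f, 4.0f};
  const float five[4] = {5.0f, 5.0f, 5.0f, 5.0f};
  float sel[4];
  select_lanes(all_equal, four, five, sel);
  // Max reduction over the lanes, like the vmaxnm sequence.
  float result = sel[0];
  for (int lane = 1; lane < 4; ++lane)
    if (sel[lane] > result) result = sel[lane];
  return result;
}
```

The point of the model is that the two comparison results are combined as plain integers, so no vectors of 0 and 1 need to be materialized.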
This patch adds tests that trigger an ICE without this fix.
The pr100757*.c testcases are derived from
gcc.c-torture/compile/20160205-1.c, forcing the use of MVE, and using
various types and return values different from 0 and 1 to avoid
commonalization with boolean masks. In addition, since we should not
need these masks, the tests make sure they are not present.
Most of the work of this patch series was carried out while I was
working at STMicroelectronics as a Linaro assignee.
2022-02-22 Christophe Lyon <christophe.lyon@arm.com>
PR target/100757
gcc/
* config/arm/arm-protos.h (arm_get_mask_mode): New prototype.
(arm_expand_vector_compare): Update prototype.
* config/arm/arm.cc (TARGET_VECTORIZE_GET_MASK_MODE): New.
(arm_vector_mode_supported_p): Add support for VxBI modes.
(arm_expand_vector_compare): Remove useless generation of vpsel.
(arm_expand_vcond): Fix select operands.
(arm_get_mask_mode): New.
* config/arm/mve.md (vec_cmp<mode><MVE_vpred>): New.
(vec_cmpu<mode><MVE_vpred>): New.
(vcond_mask_<mode><MVE_vpred>): New.
* config/arm/vec-common.md (vec_cmp<mode><v_cmp_result>)
(vec_cmpu<mode><mode>, vcond_mask_<mode><v_cmp_result>): Move to ...
* config/arm/neon.md (vec_cmp<mode><v_cmp_result>)
(vec_cmpu<mode><mode>, vcond_mask_<mode><v_cmp_result>): ... here
and disable for MVE.
* doc/sourcebuild.texi (arm_mve): Document new effective-target.
gcc/testsuite/
PR target/100757
* gcc.target/arm/simd/pr100757-2.c: New.
* gcc.target/arm/simd/pr100757-3.c: New.
* gcc.target/arm/simd/pr100757-4.c: New.
* gcc.target/arm/simd/pr100757.c: New.
* gcc.dg/signbit-2.c: Skip when targeting ARM/MVE.
* lib/target-supports.exp (check_effective_target_arm_mve): New.
|
|
PTX ISA versions 3.1, 6.0, 6.3 and 7.0 are currently supported internally.
However, -mptx= accepts only 3.1, 6.3 and 7.0, not the internal default 6.0.
Add -mptx=6.0 for consistency.
Tested on nvptx.
gcc/ChangeLog:
* config/nvptx/nvptx.opt (mptx): Add 6.0 alias PTX_VERSION_6_0.
* doc/invoke.texi (-mptx): Update for new values and defaults.
Co-Authored-By: Tom de Vries <tdevries@suse.de>
|
|
Shadow Call Stack can be used to protect the return address of a
function at runtime, and clang already supports this feature[1].
To enable SCS in user mode, support beyond the compiler
is also required (as discussed in [2]). This patch only adds basic
support for SCS on the compiler side, making it convenient
for users to enable SCS.
For the Linux kernel, only compiler support is required.
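The idea can be shown with a toy model in which the return address is kept on two stacks and only the shadow copy is trusted on return. ToyMachine and return_after_overwrite are hypothetical and say nothing about the actual AArch64 instruction sequences.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Toy machine with a regular stack and a separate shadow call stack.
struct ToyMachine {
  std::vector<std::uintptr_t> stack;   // regular stack, may be corrupted
  std::vector<std::uintptr_t> shadow;  // shadow call stack

  void prologue(std::uintptr_t return_address) {
    shadow.push_back(return_address);  // push onto the SCS first
    stack.push_back(return_address);   // then the usual save
  }

  std::uintptr_t epilogue() {
    assert(!stack.empty() && !shadow.empty());
    stack.pop_back();                  // discard the regular-stack copy
    const std::uintptr_t ra = shadow.back();  // return via the SCS copy
    shadow.pop_back();
    return ra;
  }
};

// Where control returns to after the saved copy on the regular stack
// has been overwritten, e.g. by a buffer overflow.
std::uintptr_t return_after_overwrite(std::uintptr_t real,
                                      std::uintptr_t forged) {
  ToyMachine m;
  m.prologue(real);
  m.stack.back() = forged;  // simulated corruption of the regular stack
  return m.epilogue();
}
```

In this model the forged value is never used, because the epilogue takes the return address from the shadow stack only.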
[1] https://clang.llvm.org/docs/ShadowCallStack.html
[2] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102768
Signed-off-by: Dan Li <ashimida@linux.alibaba.com>
gcc/ChangeLog:
* config/aarch64/aarch64.cc (SLOT_REQUIRED):
Change wb_candidate[12] to wb_push_candidate[12].
(aarch64_layout_frame): Likewise, and
change callee_adjust when scs is enabled.
(aarch64_save_callee_saves):
Change wb_candidate[12] to wb_push_candidate[12].
(aarch64_restore_callee_saves):
Change wb_candidate[12] to wb_pop_candidate[12].
(aarch64_get_separate_components):
Change wb_candidate[12] to wb_push_candidate[12].
(aarch64_expand_prologue): Push x30 onto SCS before it's
pushed onto stack.
(aarch64_expand_epilogue): Pop x30 from SCS, while
preventing it from being popped from the regular stack again.
(aarch64_override_options_internal): Add SCS compile option check.
(TARGET_HAVE_SHADOW_CALL_STACK): New hook.
* config/aarch64/aarch64.h (struct GTY): Add is_scs_enabled,
wb_pop_candidate[12], and rename wb_candidate[12] to
wb_push_candidate[12].
* config/aarch64/aarch64.md (scs_push): New template.
(scs_pop): Likewise.
* doc/invoke.texi: Document -fsanitize=shadow-call-stack.
* doc/tm.texi: Regenerate.
* doc/tm.texi.in: Add hook have_shadow_call_stack.
* flag-types.h (enum sanitize_code):
Add SANITIZE_SHADOW_CALL_STACK.
* opts.cc (parse_sanitizer_options): Add shadow-call-stack
and exclude SANITIZE_SHADOW_CALL_STACK.
* target.def: New hook.
* toplev.cc (process_options): Add SCS compile option check.
* ubsan.cc (ubsan_expand_null_ifn): Enum type conversion.
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/shadow_call_stack_1.c: New test.
* gcc.target/aarch64/shadow_call_stack_2.c: New test.
* gcc.target/aarch64/shadow_call_stack_3.c: New test.
* gcc.target/aarch64/shadow_call_stack_4.c: New test.
* gcc.target/aarch64/shadow_call_stack_5.c: New test.
* gcc.target/aarch64/shadow_call_stack_6.c: New test.
* gcc.target/aarch64/shadow_call_stack_7.c: New test.
* gcc.target/aarch64/shadow_call_stack_8.c: New test.
|
|
Resolves:
PR middle-end/104355 - Misleading and outdated -Warray-bounds documentation
gcc/ChangeLog:
PR middle-end/104355
* doc/invoke.texi (-Warray-bounds): Update documentation.
|
|
gcc:
* doc/install.texi (Specific): Change the www.bitwizard.nl
reference to use https.
|
|
Add -m[no-]direct-extern-access and the nodirect_extern_access attribute.
-mdirect-extern-access is the default. With the nodirect_extern_access
attribute, the GOT is always used to access undefined data and function
symbols that have the attribute, in both PIE and
non-PIE. With -mno-direct-extern-access:
1. Always use the GOT to access undefined data and function symbols,
in both PIE and non-PIE. This avoids copy relocations
in executables. This is compatible with existing executables and
shared libraries.
2. In executables and shared libraries, bind symbols with the STV_PROTECTED
visibility locally:
a. The address of a data symbol is the address of the data body.
b. For systems without function descriptors, the function pointer is
the address of the function body.
c. The resulting shared libraries may be incompatible with
executables which have copy relocations on protected symbols or
use executable PLT entries as function addresses for protected
functions in shared libraries.
3. Update asm_preferred_eh_data_format to select PC relative EH encoding
format with -mno-direct-extern-access to avoid copy relocation.
4. Add ix86_reloc_rw_mask for TARGET_ASM_RELOC_RW_MASK to avoid copy
relocation with -mno-direct-extern-access.
gcc/
PR target/35513
PR target/100593
* config/i386/gnu-property.cc: Include "i386-protos.h".
(file_end_indicate_exec_stack_and_gnu_property): Generate
a GNU_PROPERTY_1_NEEDED note for -mno-direct-extern-access or
nodirect_extern_access attribute.
* config/i386/i386-options.cc
(handle_nodirect_extern_access_attribute): New function.
(ix86_attribute_table): Add nodirect_extern_access attribute.
* config/i386/i386-protos.h (ix86_force_load_from_GOT_p): Add a
bool argument.
(ix86_has_no_direct_extern_access): New.
* config/i386/i386.cc (ix86_has_no_direct_extern_access): New.
(ix86_force_load_from_GOT_p): Add a bool argument to indicate
call operand. Force non-call load from GOT for
-mno-direct-extern-access or nodirect_extern_access attribute.
(legitimate_pic_address_disp_p): Avoid copy relocation in PIE
for -mno-direct-extern-access or nodirect_extern_access attribute.
(ix86_print_operand): Pass true to ix86_force_load_from_GOT_p
for call operand.
(asm_preferred_eh_data_format): Use PC-relative format for
-mno-direct-extern-access to avoid copy relocation. Check
ptr_mode instead of TARGET_64BIT when selecting DW_EH_PE_sdata4.
(ix86_binds_local_p): Set ix86_has_no_direct_extern_access to
true for -mno-direct-extern-access or nodirect_extern_access
attribute. Don't treat protected data as extern and avoid copy
relocation on common symbol with -mno-direct-extern-access or
nodirect_extern_access attribute.
(ix86_reloc_rw_mask): New to avoid copy relocation for
-mno-direct-extern-access.
(TARGET_ASM_RELOC_RW_MASK): New.
* config/i386/i386.opt: Add -mdirect-extern-access.
* doc/extend.texi: Document nodirect_extern_access attribute.
* doc/invoke.texi: Document -m[no-]direct-extern-access.
gcc/testsuite/
PR target/35513
PR target/100593
* g++.target/i386/pr35513-1.C: New file.
* g++.target/i386/pr35513-2.C: Likewise.
* gcc.target/i386/pr35513-1a.c: Likewise.
* gcc.target/i386/pr35513-1b.c: Likewise.
* gcc.target/i386/pr35513-2a.c: Likewise.
* gcc.target/i386/pr35513-2b.c: Likewise.
* gcc.target/i386/pr35513-3a.c: Likewise.
* gcc.target/i386/pr35513-3b.c: Likewise.
* gcc.target/i386/pr35513-4a.c: Likewise.
* gcc.target/i386/pr35513-4b.c: Likewise.
* gcc.target/i386/pr35513-5a.c: Likewise.
* gcc.target/i386/pr35513-5b.c: Likewise.
* gcc.target/i386/pr35513-6a.c: Likewise.
* gcc.target/i386/pr35513-6b.c: Likewise.
* gcc.target/i386/pr35513-7a.c: Likewise.
* gcc.target/i386/pr35513-7b.c: Likewise.
* gcc.target/i386/pr35513-8.c: Likewise.
* gcc.target/i386/pr35513-9a.c: Likewise.
* gcc.target/i386/pr35513-9b.c: Likewise.
* gcc.target/i386/pr35513-10a.c: Likewise.
* gcc.target/i386/pr35513-10b.c: Likewise.
* gcc.target/i386/pr35513-11a.c: Likewise.
* gcc.target/i386/pr35513-11b.c: Likewise.
* gcc.target/i386/pr35513-12a.c: Likewise.
* gcc.target/i386/pr35513-12b.c: Likewise.
|
|
We have recently updated the default for the `-misa-spec=' option, yet
we still have not documented it nor its `--with-isa-spec=' counterpart
in the GCC manuals. Fix that.
gcc/
* doc/install.texi (Configuration): Document `--with-isa-spec='
RISC-V option.
* doc/invoke.texi (Option Summary): List `-misa-spec=' RISC-V
option.
(RISC-V Options): Document it.
|
|
Due to a pasto error in the documentation, vec_replace_unaligned was
implemented with the same function prototypes as vec_replace_elt. It was
intended that vec_replace_unaligned always specify output vectors as having
type vector unsigned char, to emphasize that elements are potentially
misaligned by this built-in function. This patch corrects the
misimplementation.
2022-02-04 Bill Schmidt <wschmidt@linux.ibm.com>
gcc/
PR target/100808
* doc/extend.texi (Basic PowerPC Built-in Functions Available on ISA
3.1): Provide consistent type names. Remove unnecessary semicolons.
Fix bad line breaks.
|
|
gcc/ChangeLog:
* doc/cpp.texi (Variadic Macros): Replace C++2a with C++20.
|
|
gcc/ChangeLog:
PR analyzer/104270
* doc/invoke.texi (-ftrivial-auto-var-init=): Add reference to
-Wanalyzer-use-of-uninitialized-value to paragraph documenting that
-ftrivial-auto-var-init= doesn't suppress warnings.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
|
|
This patch boosts the analysis for complex mul, fma and fms in order to ensure
that it doesn't create incorrect output.
Essentially it adds an extra verification to check that the two nodes it's going
to combine do the same operations on compatible values. The reason it needs to
do this is that if one computation differs from the other then with the current
implementation we have no way to deal with it since we have to remove the
permute.
When we can keep the permute around we can probably handle these by unrolling.
While implementing this, since I have to do the traversal anyway, I took advantage
of it to simplify the code a bit. Previously we would determine whether
something is a conjugate, then try to figure out which conjugate it is, and
then try to see if the permutes match what we expect.
Now the code that does the traversal detects this in one go and tells us
whether the operation is something that can be combined and whether a conjugate
is present.
Secondly, because it does this, I can now simplify the checking code itself to
essentially just try to apply fixed patterns to each operation.
The patterns represent the order operations should appear in. For instance a
complex MUL operation combines:
Left 1 + Right 1
Left 2 + Right 2
with a permute on the nodes consisting of:
{ Even, Even } + { Odd, Odd }
{ Even, Odd } + { Odd, Even }
By abstracting over these patterns the checking code becomes quite simple.
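As a rough illustration of the even/odd pairing listed above, here is a scalar complex multiply over interleaved data. This is only a sketch of the kind of source computation involved, not the vectorizer's own code; the function name and the interleaved layout (real part at even indices, imaginary part at odd indices) are assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: multiplies n complex numbers stored interleaved,
// with the real part at even indices and the imaginary part at odd indices.
void complex_mul(const float *a, const float *b, float *out, std::size_t n)
{
  for (std::size_t i = 0; i < n; ++i)
    {
      // Real part: { Even, Even } and { Odd, Odd } products.
      out[2 * i] = a[2 * i] * b[2 * i] - a[2 * i + 1] * b[2 * i + 1];
      // Imaginary part: { Even, Odd } and { Odd, Even } products.
      out[2 * i + 1] = a[2 * i] * b[2 * i + 1] + a[2 * i + 1] * b[2 * i];
    }
}
```

For example, multiplying 1+2i by 3+4i this way gives -5+10i.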
As part of this I was checking the order of the operands, which was left in
"slp" order, i.e. the same order they showed up in during SLP, which means
that the accumulator is first. However, it looks like I didn't document this,
and the x86 optab was implemented assuming the same order as FMA, i.e. that
the accumulator is last.
I have thus changed the order to match that of FMA and FMS, which corrects the
x86 codegen, and will update the Arm targets. This has now also been
documented.
gcc/ChangeLog:
PR tree-optimization/102819
PR tree-optimization/103169
* doc/md.texi: Update docs for cfms, cfma.
* tree-data-ref.h (same_data_refs): Accept optional offset.
* tree-vect-slp-patterns.cc (is_linear_load_p): Fix issue with repeating
patterns.
(vect_normalize_conj_loc): Remove.
(is_eq_or_top): Change to take two nodes.
(enum _conj_status, compatible_complex_nodes_p,
vect_validate_multiplication): New.
(class complex_add_pattern, complex_add_pattern::matches,
complex_add_pattern::recognize, class complex_mul_pattern,
complex_mul_pattern::recognize, class complex_fms_pattern,
complex_fms_pattern::recognize, class complex_operations_pattern,
complex_operations_pattern::recognize, addsub_pattern::recognize): Pass
new cache.
(complex_fms_pattern::matches, complex_mul_pattern::matches): Pass new
cache and use new validation code.
* tree-vect-slp.cc (vect_match_slp_patterns_2, vect_match_slp_patterns,
vect_analyze_slp): Pass along cache.
(compatible_calls_p): Expose.
* tree-vectorizer.h (compatible_calls_p, slp_node_hash,
slp_compat_nodes_map_t): New.
(class vect_pattern): Update signatures include new cache.
gcc/testsuite/ChangeLog:
PR tree-optimization/102819
PR tree-optimization/103169
* g++.dg/vect/pr99149.cc: xfail for now.
* gcc.dg/vect/complex/pr102819-1.c: New test.
* gcc.dg/vect/complex/pr102819-2.c: New test.
* gcc.dg/vect/complex/pr102819-3.c: New test.
* gcc.dg/vect/complex/pr102819-4.c: New test.
* gcc.dg/vect/complex/pr102819-5.c: New test.
* gcc.dg/vect/complex/pr102819-6.c: New test.
* gcc.dg/vect/complex/pr102819-7.c: New test.
* gcc.dg/vect/complex/pr102819-8.c: New test.
* gcc.dg/vect/complex/pr102819-9.c: New test.
* gcc.dg/vect/complex/pr103169.c: New test.
|
|
This flips the default for the errata handling for an old version
(TL;DR: workaround: no multiply instruction last on a cache-line).
Newer versions of the CRIS cpu don't have that bug. While the impact
of the workaround is very marginal (coremark: less than .05% larger,
less than .0005% slower) it's an irritating pseudorandom factor when
assessing the impact of other changes.
Also, fix a wart requiring changes to more than TARGET_DEFAULT to flip
the default.
People building old kernels or operating systems to run on
ETRAX 100 LX are advised to pass "-mmul-bug-workaround".
gcc:
* config/cris/cris.h (TARGET_DEFAULT): Don't include MASK_MUL_BUG.
(MUL_BUG_ASM_DEFAULT): New macro.
(MAYBE_AS_NO_MUL_BUG_ABORT): Define in terms of MUL_BUG_ASM_DEFAULT.
* doc/invoke.texi (CRIS Options, -mmul-bug-workaround): Adjust
accordingly.
|
|
As reported at
https://gcc.gnu.org/pipermail/gcc/2022-February/238216.html,
multiprecision.org now uses https so this updates the documentation
to use https instead of http.
Committed as obvious.
gcc/ChangeLog:
* doc/install.texi:
|
|
As the minimal GCC version that can build the current master
is 4.8, it does not make sense mentioning something for older
versions.
gcc/ChangeLog:
* doc/install.texi: Remove option for GCC < 4.8.
|
|
gcc/ChangeLog:
* doc/invoke.texi: Update -Wbidi-chars documentation.
|
|
|
|
Stephan Bergmann reported that our -Wbidi-chars breaks the build
of LibreOffice because we warn about UCNs even when their usage
is correct: LibreOffice constructs strings piecewise, as in:
aText = u"\u202D" + aText;
and warning about that is overzealous. Since no editor (AFAIK)
interprets UCNs to show them as Unicode characters, there's less
risk in misinterpreting them, and so perhaps we shouldn't warn
about them by default. However, identifiers containing UCNs or
programs generating other programs could still cause confusion,
so I'm keeping the UCN checking. To turn it on, you just need
to use -Wbidi-chars=unpaired,ucn or -Wbidi-chars=any,ucn.
The implementation is done by using the new EnumSet feature.
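A minimal sketch of the piecewise construction described above, written as a free function. The function name is hypothetical; the UCN and the option spellings are the ones quoted in this message. Going by the description, a line like this should no longer be diagnosed by default, and should be diagnosed again when UCN checking is requested with -Wbidi-chars=unpaired,ucn or -Wbidi-chars=any,ucn.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: prefixes text with U+202D (LEFT-TO-RIGHT OVERRIDE),
// spelled as a UCN rather than as a literal bidirectional control character.
std::u16string with_lro(const std::u16string &text)
{
  return u"\u202D" + text;
}
```

The override character is deliberately left unpaired in the literal, because the matching terminator is expected to be appended elsewhere when the string is assembled.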
PR preprocessor/104030
gcc/c-family/ChangeLog:
* c.opt (Wbidi-chars): Mark as EnumSet. Also accept =ucn.
gcc/ChangeLog:
* doc/invoke.texi: Update documentation for -Wbidi-chars.
libcpp/ChangeLog:
* include/cpplib.h (enum cpp_bidirectional_level): Add
bidirectional_ucn. Set values explicitly.
* internal.h (cpp_reader): Adjust warn_bidi_p.
* lex.cc (maybe_warn_bidi_on_close): Don't warn about UCNs
unless UCN checking is on.
(maybe_warn_bidi_on_char): Likewise.
gcc/testsuite/ChangeLog:
* c-c++-common/Wbidi-chars-10.c: Turn on UCN checking.
* c-c++-common/Wbidi-chars-11.c: Likewise.
* c-c++-common/Wbidi-chars-14.c: Likewise.
* c-c++-common/Wbidi-chars-16.c: Likewise.
* c-c++-common/Wbidi-chars-17.c: Likewise.
* c-c++-common/Wbidi-chars-4.c: Likewise.
* c-c++-common/Wbidi-chars-5.c: Likewise.
* c-c++-common/Wbidi-chars-6.c: Likewise.
* c-c++-common/Wbidi-chars-7.c: Likewise.
* c-c++-common/Wbidi-chars-8.c: Likewise.
* c-c++-common/Wbidi-chars-9.c: Likewise.
* c-c++-common/Wbidi-chars-ranges.c: Likewise.
* c-c++-common/Wbidi-chars-18.c: New test.
* c-c++-common/Wbidi-chars-19.c: New test.
* c-c++-common/Wbidi-chars-20.c: New test.
* c-c++-common/Wbidi-chars-21.c: New test.
* c-c++-common/Wbidi-chars-22.c: New test.
* c-c++-common/Wbidi-chars-23.c: New test.
|
|
and feraiseexcept [PR94193]
These optimizations were originally in glibc, but were removed, with the
suggestion that they were a good fit as GCC builtins[1].
feclearexcept and feraiseexcept were extended (in comparison to the
glibc version) to accept any combination of the accepted flags, not
limited to just one flag bit at a time anymore.
The builtin expanders need knowledge of the target libc's FE_*
values, so they are limited to expand only to suitable libcs.
[1] https://sourceware.org/legacy-ml/libc-alpha/2020-03/msg00047.html
https://sourceware.org/legacy-ml/libc-alpha/2020-03/msg00080.html
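For reference, the calls in question are the standard <fenv.h> functions. The sketch below uses only the standard library interface and shows the "any combination of flags" usage mentioned above; the helper names are hypothetical, and whether the calls are expanded inline depends on the target and libc, as the message notes.

```cpp
#include <cassert>
#include <cfenv>

// Hypothetical helper: raises two exception flags in one call, clears one
// of them, and reports which of the two is still set.
int raise_two_clear_one()
{
  std::feclearexcept(FE_ALL_EXCEPT);
  std::feraiseexcept(FE_DIVBYZERO | FE_OVERFLOW);
  std::feclearexcept(FE_OVERFLOW);
  return std::fetestexcept(FE_DIVBYZERO | FE_OVERFLOW);
}

// Hypothetical helper: true if the current rounding mode is round-to-nearest.
bool rounds_to_nearest()
{
  return std::fegetround() == FE_TONEAREST;
}
```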
2020-08-13 Raoni Fassina Firmino <raoni@linux.ibm.com>
gcc/
PR target/94193
* builtins.cc (expand_builtin_fegetround): New function.
(expand_builtin_feclear_feraise_except): New function.
(expand_builtin): Add cases for BUILT_IN_FEGETROUND,
BUILT_IN_FECLEAREXCEPT and BUILT_IN_FERAISEEXCEPT.
* config/rs6000/rs6000.md (fegetroundsi): New pattern.
(feclearexceptsi): New Pattern.
(feraiseexceptsi): New Pattern.
* doc/extend.texi: Add a new introductory paragraph about the
new builtins.
* doc/md.texi: (fegetround@var{m}): Document new optab.
(feclearexcept@var{m}): Document new optab.
(feraiseexcept@var{m}): Document new optab.
* optabs.def (fegetround_optab): New optab.
(feclearexcept_optab): New optab.
(feraiseexcept_optab): New optab.
gcc/testsuite/
PR target/94193
* gcc.target/powerpc/builtin-feclearexcept-feraiseexcept-1.c: New test.
* gcc.target/powerpc/builtin-feclearexcept-feraiseexcept-2.c: New test.
* gcc.target/powerpc/builtin-fegetround.c: New test.
Signed-off-by: Raoni Fassina Firmino <raoni@linux.ibm.com>
|
|
|
|
On Sat, Jan 22, 2022 at 01:47:08AM +0100, Jakub Jelinek via Gcc-patches wrote:
> I think with the 2) patch I achieve what we want for Fortran, for 1)
> the only behavior from gcc 11 is that
> -fsanitize-coverage=trace-cmp,trace-cmp is now rejected.
> This is mainly from the desire to disallow
> -fconvert=big-endian,little-endian or -Wbidi-chars=bidirectional,any
> etc. where it would be confusing to users what exactly it means.
> But it is the only from these options that actually acts as an Enum
> bit set, each enumerator can be specified with all the others.
> So one option would be stop requiring the EnumSet implies Set properties
> must be specified and just require that either they are specified on all
> EnumValues, or on none of them; the latter case would be for
> -fsanitize-coverage= and the non-Set case would mean that all the
> EnumValues need to have disjoint Value bitmasks and that they can
> be all specified and unlike the Set case also repeated.
> Thoughts on this?
Here is an incremental patch to the first two patches of the series
that implements EnumBitSet that fully restores the -fsanitize-coverage
GCC 11 behavior.
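The bit-set semantics being restored can be sketched as plain flag arithmetic: each argument ORs its bit into the variable, repeating an argument is harmless, and the negative form clears bits. The constant names follow the ones used for this PR, but the bit values and the two helper functions are assumptions for illustration, not GCC's option-handling code.

```cpp
#include <cassert>

// Bit values are assumed for illustration only.
enum : unsigned
{
  SANITIZE_COV_TRACE_PC = 1u << 0,
  SANITIZE_COV_TRACE_CMP = 1u << 1
};

// Models -fsanitize-coverage=<bits>: OR the requested bits in.
unsigned enable(unsigned flags, unsigned bits) { return flags | bits; }

// Models -fno-sanitize-coverage=<bits>: clear the requested bits.
unsigned disable(unsigned flags, unsigned bits) { return flags & ~bits; }
```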
2022-01-24 Jakub Jelinek <jakub@redhat.com>
PR sanitizer/104158
* opt-functions.awk (var_set): Handle EnumBitSet property.
* optc-gen.awk: Don't disallow RejectNegative if EnumBitSet is
specified.
* opts.h (enum cl_enum_var_value): New type.
* opts-common.cc (decode_cmdline_option): Use CLEV_* values.
Handle CLEV_BITSET.
(cmdline_handle_error): Handle CLEV_BITSET.
* opts.cc (test_enum_sets): Also test EnumBitSet requirements.
* doc/options.texi (EnumBitSet): Document.
* common.opt (fsanitize-coverage=): Use EnumBitSet instead of
EnumSet.
(trace-pc, trace-cmp): Drop Set properties.
* gcc.dg/sancov/pr104158-7.c: Adjust for repeating of arguments
being allowed.
|
|
The following patch is infrastructure support for at least 3 different
options that need changes:
1) PR104158 talks about a regression with the -fsanitize-coverage=
option; in GCC 11 and older and on trunk prior to r12-1177, this
option behaved similarly to -f{,no-}sanitize{,-recover}= options,
namely that the option allows a negative form and the argument of the
option is a list of strings, each of which has some enumerator;
-fsanitize-coverage= enabled those bits in the underlying
flag_sanitize_coverage, while -fno-sanitize-coverage= disabled them.
So, -fsanitize-coverage=trace-pc,trace-cmp was equivalent to
-fsanitize-coverage=trace-pc -fsanitize-coverage=trace-cmp and both
set flag_sanitize_coverage to
(SANITIZE_COV_TRACE_PC | SANITIZE_COV_TRACE_CMP)
Also, e.g.
-fsanitize-coverage=trace-pc,trace-cmp -fno-sanitize-coverage=trace-pc
would in the end set flag_sanitize_coverage to
SANITIZE_COV_TRACE_CMP (first set both bits, then subtract one)
The r12-1177 change, I think done to improve argument misspelling
diagnostic, changed the option incompatibly in multiple ways,
-fno-sanitize-coverage= is now rejected, only a single argument
is allowed, not multiple and
-fsanitize-coverage=trace-pc -fsanitize-coverage=trace-cmp
enables just SANITIZE_COV_TRACE_CMP and not both (each option
overrides the previous value)
2) Thomas Koenig wants to extend Fortran -fconvert= option for the
ppc64le real(kind=16) swapping support; currently the option
accepts -fconvert={native,swap,big-endian,little-endian} and the
intent is to add support for -fconvert=r16_ibm and -fconvert=r16_ieee
(that alone is just normal Enum), but also to handle
-fconvert=swap,r16_ieee or -fconvert=r16_ieee,big-endian but not
-fconvert=big-endian,little-endian - the
native/swap/big-endian/little-endian are one mutually exclusive set
and r16_ieee/r16_ibm another one.
See https://gcc.gnu.org/pipermail/gcc-patches/2022-January/587943.html
and thread around that.
3) Similarly Marek Polacek wants to extend the -Wbidi-chars= option,
such that it will handle not just the current
-Wbidi-chars={none,bidirectional,any}, but also -Wbidi-chars=ucn
and bidirectional,ucn and ucn,any etc. Again two separate sets,
one none/bidirectional/any and another one ucn.
See https://gcc.gnu.org/pipermail/gcc-patches/2022-January/588960.html
The following patch adds framework for this and I'll post incremental
patches for 1) and 2).
As I've tried to document, such options are marked by an additional
EnumSet property on the option, and in that case all the EnumValues
in the Enum referenced from it must use a new Set property with a set
number (initially I wanted to just mark the last enumerator in each
mutually exclusive set, but optionlist is sorted and so it doesn't really
work well). So e.g. for the Fortran -fconvert=, one specifies:
fconvert=
Fortran RejectNegative Joined Enum(gfc_convert) EnumSet Var(flag_convert) Init(GFC_FLAG_CONVERT_NATIVE)
-fconvert=<big-endian|little-endian|native|swap|r16_ieee|r16_ibm> The endianness used for unformatted files.
Enum
Name(gfc_convert) Type(enum gfc_convert) UnknownError(Unrecognized option to endianness value: %qs)
EnumValue
Enum(gfc_convert) String(big-endian) Value(GFC_FLAG_CONVERT_BIG) Set(1)
EnumValue
Enum(gfc_convert) String(little-endian) Value(GFC_FLAG_CONVERT_LITTLE) Set(1)
EnumValue
Enum(gfc_convert) String(native) Value(GFC_FLAG_CONVERT_NATIVE) Set(1)
EnumValue
Enum(gfc_convert) String(swap) Value(GFC_FLAG_CONVERT_SWAP) Set(1)
EnumValue
Enum(gfc_convert) String(r16_ieee) Value(GFC_FLAG_CONVERT_R16_IEEE) Set(2)
EnumValue
Enum(gfc_convert) String(r16_ibm) Value(GFC_FLAG_CONVERT_R16_IBM) Set(2)
and this says to the option handling code that
1) if only one arg is specified to one instance of the option, it can be any
of those 6
2) if two args are specified, one has to be from the first 4 and another
from the last 2, in any order
3) at most 2 args may be specified (there are just 2 sets)
There is a requirement on the Value values checked in self-test, the
values from one set ored together must be disjoint from values from
another set ored together. In the Fortran case, the first 4 are 0-3
so mask is 3, and the last 2 are 4 and 8, so mask is 12.
When say -fconvert=big-endian is specified, it sets the first set
to GFC_FLAG_CONVERT_BIG (2) but doesn't modify whatever value the
other set had, so e.g.
-fconvert=big-endian -fconvert=r16_ieee
-fconvert=r16_ieee -fconvert=big-endian
-fconvert=r16_ieee,big-endian
-fconvert=big-endian,r16_ieee
all behave the same.
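The per-set update can be sketched as a masked assignment: only the bits of the set the argument belongs to are replaced. The masks (3 and 12) and the value 2 for big-endian come from the description above; the remaining numeric values, the shortened names, and the apply helper are assumptions for illustration.

```cpp
#include <cassert>

// First set occupies values 0-3 (mask 3); second set uses 4 and 8 (mask 12).
// Which name maps to which value is assumed, except CONVERT_BIG == 2.
enum : unsigned
{
  CONVERT_NATIVE = 0,
  CONVERT_SWAP = 1,
  CONVERT_BIG = 2,
  CONVERT_LITTLE = 3,
  CONVERT_R16_IEEE = 4,
  CONVERT_R16_IBM = 8
};

constexpr unsigned SET1_MASK = 3;
constexpr unsigned SET2_MASK = 12;

// Replace only the bits belonging to one set, leaving the other set alone.
unsigned apply(unsigned var, unsigned value, unsigned set_mask)
{
  return (var & ~set_mask) | value;
}
```

Applying big-endian and r16_ieee in either order yields the same combined value, while a later argument from the same set overrides the earlier one.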
Also, with the EnumSet support, it is now possible to allow
not specifying RejectNegative - we can set some set's value and
then clear it and set it again to some other value etc.
I think with the 2) patch I achieve what we want for Fortran, for 1)
the only behavior change from gcc 11 is that
-fsanitize-coverage=trace-cmp,trace-cmp is now rejected.
This is mainly from the desire to disallow
-fconvert=big-endian,little-endian or -Wbidi-chars=bidirectional,any
etc. where it would be confusing to users what exactly it means.
But it is the only one of these options that actually acts as an Enum
bit set, each enumerator can be specified with all the others.
So one option would be stop requiring the EnumSet implies Set properties
must be specified and just require that either they are specified on all
EnumValues, or on none of them; the latter case would be for
-fsanitize-coverage= and the non-Set case would mean that all the
EnumValues need to have disjoint Value bitmasks and that they can
be all specified and unlike the Set case also repeated.
Thoughts on this?
2022-01-24 Jakub Jelinek <jakub@redhat.com>
PR sanitizer/104158
* opt-functions.awk (var_set): Handle EnumSet property.
* optc-gen.awk: Don't disallow RejectNegative if EnumSet is
specified.
* opt-read.awk: Handle Set property.
* opts.h (CL_ENUM_SET_SHIFT, CL_ERR_ENUM_SET_ARG): Define.
(struct cl_decoded_option): Mention enum in value description.
Add mask member.
(set_option): Add mask argument defaulted to 0.
* opts.cc (test_enum_sets): New function.
(opts_cc_tests): Call it.
* opts-common.cc (enum_arg_to_value): Change return argument
from bool to int, on success return index into the cl_enum_arg
array, on failure -1. Add len argument, if non-0, use strncmp
instead of strcmp.
(opt_enum_arg_to_value): Adjust caller.
(decode_cmdline_option): Handle EnumSet represented as
CLVC_ENUM with non-zero var_value. Initialize decoded->mask.
(decode_cmdline_options_to_array): Clear opt_array[0].mask.
(handle_option): Pass decoded->mask to set_option's last argument.
(generate_option): Clear decoded->mask.
(generate_option_input_file): Likewise.
(cmdline_handle_error): Handle CL_ERR_ENUM_SET_ARG.
(set_option): Add mask argument, use it for CLVC_ENUM.
(control_warning_option): Adjust enum_arg_to_value caller.
* doc/options.texi: Document Set and EnumSet properties.
|
|
This patch resolves the P1 "ice-on-valid-code" regression bootstrapping
GCC on riscv-unknown-linux-gnu caused by my recent MULT_HIGHPART_EXPR
functionality. RISC-V differs from x86_64 and many targets by
supporting a usmulsidi3 instruction, basically a widening multiply
where one operand is signed and the other is unsigned. Alas the
final version of my patch to recognize MULT_HIGHPART_EXPR didn't
sufficiently defend against the operands of WIDEN_MULT_EXPR having
different signedness. This is fixed by the two-line change to
tree-ssa-math-opts.cc's convert_mult_to_highpart in the patch below.
The majority of the rest of the patch is to the documentation
(in tree.def and generic.texi). It turns out that WIDEN_MULT_EXPR
wasn't previously documented in generic.texi, let alone the slightly
unusual semantics of allowing mismatched (signed vs unsigned) operands.
This also clarifies that MULT_HIGHPART_EXPR currently requires the
signedness of operands to match [but this might change in a future
release of GCC to support targets with usmul<mode>3_highpart].
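In source terms, the two operations discussed here look roughly like the following. This is a sketch in plain C++ of the arithmetic only (a 32x32->64 multiply with one signed and one unsigned operand, and the high half of an unsigned 32x32 product); the function names are hypothetical and nothing here is claimed about which tree codes or instructions GCC selects for them.

```cpp
#include <cassert>
#include <cstdint>

// Widening multiply with mismatched signedness: signed 32-bit times
// unsigned 32-bit, producing a 64-bit result.
std::int64_t widen_mul_su(std::int32_t a, std::uint32_t b)
{
  return static_cast<std::int64_t>(a) * static_cast<std::int64_t>(b);
}

// High part of a multiply where both operands have the same signedness
// (both unsigned): the upper 32 bits of the 64-bit product.
std::uint32_t mul_highpart_uu(std::uint32_t a, std::uint32_t b)
{
  return static_cast<std::uint32_t>((static_cast<std::uint64_t>(a) * b) >> 32);
}
```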
The one final chunk of this patch (that is hopefully sufficiently
close to obvious for stage 4) is a similar (NULL pointer) sanity
check in riscv_cpu_cpp_builtins. Currently running cc1 from the
command line (or from gdb) without specifying -march results in a
segmentation fault (ICE). This is a minor annoyance tracking down
issues (in cross compilers) for riscv, and trivially fixed as below.
2022-01-22 Roger Sayle <roger@nextmovesoftware.com>
gcc/ChangeLog
PR middle-end/104140
* tree-ssa-math-opts.cc (convert_mult_to_highpart): Check that the
operands of the widening multiplication are either both signed or
both unsigned, and abort the conversion if mismatched.
* doc/generic.texi (WIDEN_MULT_EXPR): Describe expression node.
(MULT_HIGHPART_EXPR): Clarify that operands must have the same
signedness.
* tree.def (MULT_HIGHPART_EXPR): Document both operands must have
integer types with the same precision and signedness.
(WIDEN_MULT_EXPR): Document that operands must have integer types
with the same precision, but possibly differing signedness.
* config/riscv/riscv-c.cc (riscv_cpu_cpp_builtins): Defend against
riscv_current_subset_list returning a NULL pointer (empty list).
gcc/testsuite/ChangeLog
PR middle-end/104140
* gcc.target/riscv/pr104140.c: New test case.
|
|
Add support for accessing the stack canary value via the TLS register,
so that multiple threads running in the same address space can use
distinct canary values. This is intended for the Linux kernel running in
SMP mode, where processes entering the kernel are essentially threads
running the same program concurrently: using a global variable for the
canary in that context is problematic because it can never be rotated,
and so the OS is forced to use the same value as long as it remains up.
Using the TLS register to index the stack canary helps with this, as it
allows each CPU to context switch the TLS register along with the rest
of the process, permitting each process to use its own value for the
stack canary.
gcc/ChangeLog:
* config/arm/arm-opts.h (enum stack_protector_guard): New.
* config/arm/arm-protos.h (arm_stack_protect_tls_canary_mem):
New.
* config/arm/arm.cc (TARGET_STACK_PROTECT_GUARD): Define.
(arm_option_override_internal): Handle and put in error checks
for stack protector guard options.
(arm_option_reconfigure_globals): Likewise.
(arm_stack_protect_tls_canary_mem): New.
(arm_stack_protect_guard): New.
* config/arm/arm.md (stack_protect_set): New.
(stack_protect_set_tls): Likewise.
(stack_protect_test): Likewise.
(stack_protect_test_tls): Likewise.
(reload_tp_hard): Likewise.
* config/arm/arm.opt (-mstack-protector-guard): New.
(-mstack-protector-guard-offset): New.
* doc/invoke.texi: Document new options.
gcc/testsuite/ChangeLog:
* gcc.target/arm/stack-protector-7.c: New test.
* gcc.target/arm/stack-protector-8.c: New test.
|
|
|
|
Add a new option -mfix-cortex-a-aes for enabling the Cortex-A AES
erratum work-around and enable it automatically for the affected
products (Cortex-A57 and Cortex-A72).
gcc/ChangeLog:
* config/arm/arm-cpus.in (quirk_aes_1742098): New quirk feature.
(ALL_QUIRKS): Add it.
(cortex-a57, cortex-a72): Enable it.
(cortex-a57.cortex-a53, cortex-a72.cortex-a53): Likewise.
* config/arm/arm.opt (mfix-cortex-a57-aes-1742098): New command-line
option.
(mfix-cortex-a72-aes-1655431): New option alias.
* config/arm/arm.cc (arm_option_override): Handle default settings
for AES erratum switch.
* doc/invoke.texi (Arm Options): Document new options.
|
|
In pathological cases, the number of transitive relations being added is
potentially quadratic. Lookups for relations in a block are linear in
nature, so simply limit the number of relations to some reasonable number.
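The idea of capping per-block growth can be sketched with a small container that refuses new entries once a limit is reached. The type and member names below are hypothetical and do not mirror the value-relation.cc implementation; the limit merely stands in for the relation-block-limit parameter.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical per-block record of relations between two operands,
// identified here by plain integers.
struct block_relations
{
  std::size_t limit;                            // stand-in for the param value
  std::vector<std::pair<int, int>> relations;   // searched linearly

  // Returns false, without recording anything, once the limit is reached.
  bool add(int lhs, int rhs)
  {
    if (relations.size() >= limit)
      return false;
    relations.emplace_back(lhs, rhs);
    return true;
  }
};
```

Callers that derive further relations from a new one can stop as soon as add reports that nothing was recorded.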
PR tree-optimization/104038
* doc/invoke.texi (relation-block-limit): New.
* params.opt (relation-block-limit): New.
* value-relation.cc (dom_oracle::register_relation): Check for NULL
record before invoking transitive registry.
(dom_oracle::set_one_relation): Check limit before creating record.
(dom_oracle::register_transitives): Stop when no record created.
* value-relation.h (relation_chain_head::m_num_relations): New.
|
|
|
|
* doc/install.texi: Update prerequisites for GNAT
|