path: root/gcc/doc
Age | Commit message | Author | Files | Lines
2022-03-30[nvptx, doc] Update misa and mptx, add march and march-mapTom de Vries1-7/+20
Update nvptx documentation:
- Use meaningful terms: "PTX ISA target architecture" and "PTX ISA version".
- Remove invalid claim that "ISA strings must be lower-case".
- Add missing sm_xx entries.
- Fix misa default.
- Add march, copying misa doc.
- Declare misa an march alias.
- Add march-map.
- Fix "for given the specified" typo.

gcc/ChangeLog: 2022-03-29 Tom de Vries <tdevries@suse.de> * doc/invoke.texi (misa, mptx): Update. (march, march-map): Add.
2022-03-29LoongArch Port: Add doc.chenglulu3-5/+268
2022-03-29 Chenghua Xu <xuchenghua@loongson.cn> Lulu Cheng <chenglulu@loongson.cn> gcc/ChangeLog: * doc/install.texi: Add LoongArch options section. * doc/invoke.texi: Add LoongArch options section. * doc/md.texi: Add LoongArch options section. contrib/ChangeLog: * config-list.mk: Add LoongArch triplet.
2022-03-28c++: Fix __has_trivial_* docs [PR59426]Jason Merrill1-4/+4
These have been misdocumented since C++98 POD was split into C++11 trivial and standard-layout in r149721. PR c++/59426 gcc/ChangeLog: * doc/extend.texi: Refer to __is_trivial instead of __is_pod.
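To illustrate why the distinction matters, here is a minimal sketch (not taken from the GCC manual): since C++11 a type can be trivial without being standard-layout, so "trivial" and "POD" are no longer interchangeable. The types `Mixed` and `Plain` and the helper `is_pod_like` are hypothetical names, and the sketch uses the standard `<type_traits>` queries rather than the GCC built-ins.

```cpp
#include <type_traits>

// Trivial (all special member functions are implicit and trivial), but NOT
// standard-layout, because its data members have different access control.
struct Mixed {
    int a;
    int get_b() const { return b; }
private:
    int b;
};

// Both trivial and standard-layout, i.e. what C++98 called a POD.
struct Plain {
    int a;
    int b;
};

// Hypothetical helper: the C++11 decomposition of "POD".
template <typename T>
constexpr bool is_pod_like() {
    return std::is_trivial<T>::value && std::is_standard_layout<T>::value;
}

static_assert(std::is_trivial<Mixed>::value, "Mixed is trivial");
static_assert(!std::is_standard_layout<Mixed>::value,
              "Mixed is not standard-layout");
static_assert(is_pod_like<Plain>(), "Plain is a POD in the C++98 sense");
```

A trait documented in terms of "POD" would suggest that `Mixed` fails it, whereas a trait documented in terms of "trivial" correctly suggests that it passes.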
2022-03-25doc/invoke.texi: Move @ignore block out of @gccoptlist [PR103533]Tobias Burnus1-7/+7
With TeX output ("make pdf"), @gccoptlist's content ends up in a single line such that TeX does not find the matching '@end ignore' for the '@ignore' block – failing with a runaway error. The solution is to move the @ignore block after the closing '}'. (Follow-up to r12-7808-g319ba7e241e7e21f9eb481f075310796f13d2035.) gcc/ PR analyzer/103533 * doc/invoke.texi (Static Analyzer Options): Move @ignore block after @gccoptlist's '}' for 'make pdf'.
2022-03-24analyzer: add region::tracked_p to optimize state objects [PR104954]David Malcolm1-0/+5
PR analyzer/104954 tracks that -fanalyzer was taking a very long time on a particular source file in the Linux kernel: drivers/gpu/drm/amd/display/dc/calcs/dce_calcs.c One issue occurs with the repeated use of dynamic debug lines e.g. via the DC_LOG_BANDWIDTH_CALCS macro, such as in print_bw_calcs_dceip in drivers/gpu/drm/amd/display/dc/calcs/calcs_logger.h: DC_LOG_BANDWIDTH_CALCS("#####################################################################"); DC_LOG_BANDWIDTH_CALCS("struct bw_calcs_dceip"); DC_LOG_BANDWIDTH_CALCS("#####################################################################"); [...snip dozens of lines...] DC_LOG_BANDWIDTH_CALCS("[bw_fixed] dmif_request_buffer_size: %d", bw_fixed_to_int(dceip->dmif_request_buffer_size)); When this is configured to use __dynamic_pr_debug, each of these becomes code like: do { static struct _ddebug __attribute__((__aligned__(8))) __attribute__((__section__("__dyndbg"))) __UNIQUE_ID_ddebug277 = { [...snip...] }; if (arch_static_branch(&__UNIQUE_ID_ddebug277.key, false)) __dynamic_pr_debug(&__UNIQUE_ID_ddebug277, [...the message...]); } while (0); The analyzer was naively seeing each call to __dynamic_pr_debug, noting that the __UNIQUE_ID_nnnn object escapes. At each call, as successive __UNIQUE_ID_nnnn object escapes, there are N escaped objects, and thus N need clobbering, and so we have O(N^2) clobbering of escaped objects overall, leading to huge amounts of pointless work: print_bw_calcs_data has 225 uses of DC_LOG_BANDWIDTH_CALCS, many of which are in loops. This patch adds a way to identify declarations that aren't interesting to the analyzer, so that we don't attempt to create binding_clusters for them (i.e. we don't store any state for them in our state objects). 
This is implemented by adding a new region::tracked_p, implemented for declarations by walking the existing IPA data the first time the analyzer sees a declaration, setting it to false for global vars that have no loads/stores/aliases, and "sufficiently safe" address-of ipa-refs. The patch gives a large speedup of -fanalyzer on the above kernel source file: Before After Total cc1 wallclock time: 180s 36s analyzer wallclock time: 162s 17s % spent in analyzer: 90% 47% gcc/analyzer/ChangeLog: PR analyzer/104954 * analyzer.opt (-fdump-analyzer-untracked): New option. * engine.cc (impl_run_checkers): Handle it. * region-model-asm.cc (region_model::on_asm_stmt): Don't attempt to clobber regions with !tracked_p (). * region-model-manager.cc (dump_untracked_region): New. (region_model_manager::dump_untracked_regions): New. (frame_region::dump_untracked_regions): New. * region-model.h (region_model_manager::dump_untracked_regions): New decl. * region.cc (ipa_ref_requires_tracking): New. (symnode_requires_tracking_p): New. (decl_region::calc_tracked_p): New. * region.h (region::tracked_p): New vfunc. (frame_region::dump_untracked_regions): New decl. (class decl_region): Note that this is also used fo SSA names. (decl_region::decl_region): Initialize m_tracked. (decl_region::tracked_p): New. (decl_region::calc_tracked_p): New decl. (decl_region::m_tracked): New. * store.cc (store::get_or_create_cluster): Assert that we don't try to create clusters for base regions that aren't trackable. (store::mark_as_escaped): Don't mark base regions that we're not tracking. gcc/ChangeLog: PR analyzer/104954 * doc/invoke.texi (Static Analyzer Options): Add -fdump-analyzer-untracked. gcc/testsuite/ChangeLog: PR analyzer/104954 * gcc.dg/analyzer/asm-x86-dyndbg-1.c: New test. * gcc.dg/analyzer/asm-x86-dyndbg-2.c: New test. * gcc.dg/analyzer/many-unused-locals.c: New test. * gcc.dg/analyzer/untracked-1.c: New test. * gcc.dg/analyzer/unused-local-1.c: New test. 
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
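The quadratic growth described in this commit message can be made concrete with a small cost model. This is only a sketch under a simplifying assumption that is not in the commit: the k-th call sees exactly k escaped objects and clobbers each of them once, and loops are ignored. The function names are hypothetical.

```cpp
#include <cstdint>

// Hypothetical cost model: the k-th call clobbers the k objects that have
// escaped so far, so n calls perform 1 + 2 + ... + n clobbers in total.
constexpr std::uint64_t total_clobbers(std::uint64_t n_calls) {
    return n_calls * (n_calls + 1) / 2;
}

// If the __UNIQUE_ID_nnnn objects are not tracked at all, no state is stored
// for them, so in this model there is nothing left to clobber.
constexpr std::uint64_t total_clobbers_untracked(std::uint64_t /*n_calls*/) {
    return 0;
}
```

Under this model the 225 uses in print_bw_calcs_data alone account for `total_clobbers(225)`, i.e. 25425 clobber operations, before any repetition from loops is counted.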
2022-03-24Docs: Document that taint analyzer checker disables some warnings [PR103533]Avinash Sonawane1-6/+27
gcc/ChangeLog: PR analyzer/103533 * doc/invoke.texi: Document that enabling taint analyzer checker disables some warnings from `-fanalyzer`. Signed-off-by: Avinash Sonawane <rootkea@gmail.com>
2022-03-21docs: Document min-pagesize parameter.Martin Liska1-0/+3
gcc/ChangeLog: * doc/invoke.texi: Document min-pagesize parameter.
2022-03-18x86: Correct march=sapphirerapids to base on icelake serverCui,Lili1-5/+6
march=sapphirerapids should be based on icelake server not cooperlake. gcc/ChangeLog: PR target/104963 * config/i386/i386.h (PTA_SAPPHIRERAPIDS): change it to base on ICX. * doc/invoke.texi: Update documents for Intel sapphirerapids. gcc/testsuite/ChangeLog: PR target/104963 * gcc.target/i386/pr104963.c: New test case.
2022-03-16c++: fold calls to std::move/forward [PR96780]Patrick Palka1-0/+10
A well-formed call to std::move/forward is equivalent to a cast, but the former being a function call means the compiler generates debug info, which persists even after the call gets inlined, for an operation that's never interesting to debug. This patch addresses this problem by folding calls to std::move/forward and other cast-like functions into simple casts as part of the frontend's general expression folding routine. This behavior is controlled by a new flag -ffold-simple-inlines, and otherwise by -fno-inline, so that users can enable this folding with -O0 (which implies -fno-inline). After this patch with -O2 and a non-checking compiler, debug info size for some testcases from range-v3 and cmcstl2 decreases by as much as ~10% and overall compile time and memory usage decreases by ~2%. PR c++/96780 gcc/ChangeLog: * doc/invoke.texi (C++ Dialect Options): Document -ffold-simple-inlines. gcc/c-family/ChangeLog: * c.opt: Add -ffold-simple-inlines. gcc/cp/ChangeLog: * cp-gimplify.cc (cp_fold) <case CALL_EXPR>: Fold calls to std::move/forward and other cast-like functions into simple casts. gcc/testsuite/ChangeLog: * g++.dg/opt/pr96780.C: New test.
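The claim that a call to std::move is equivalent to a cast can be checked directly. The following is a minimal sketch, with hypothetical function names `steal_with_move` and `steal_with_cast`; it shows the source-level equivalence only and says nothing about how the front end implements the folding.

```cpp
#include <string>
#include <type_traits>
#include <utility>

// std::move(s) and static_cast<std::string&&>(s) have the same type and
// value category: both are xvalues of type std::string.
static_assert(
    std::is_same<decltype(std::move(std::declval<std::string&>())),
                 decltype(static_cast<std::string&&>(
                     std::declval<std::string&>()))>::value,
    "std::move is a cast to an rvalue reference");

// Written with the library function call.
std::string steal_with_move(std::string& s) {
    return std::string(std::move(s));
}

// Written with the cast that the call is equivalent to.
std::string steal_with_cast(std::string& s) {
    return std::string(static_cast<std::string&&>(s));
}
```

Both functions select the move constructor of std::string; the difference the commit addresses is that the first form is a function call for which debug info is generated.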
2022-03-15Merge branch 'master' into devel/sphinxMartin Liska4-7/+11
2022-03-14c++: Reject __builtin_clear_padding on non-trivially-copyable types with one exception [PR102586]Jakub Jelinek1-0/+5
As mentioned by Jason in the PR, non-trivially-copyable types (or non-POD for purposes of layout?) can be base classes of derived classes in which the padding in those non-trivially-copyable types can be reused for some real data members, or the layout can even change and data members can be moved to other positions. __builtin_clear_padding is right now used for multiple purposes: in <atomic>, where it isn't used yet but was planned as the main spot, it can be used for trivially copyable types only; ditto for std::bit_cast, where we also use it. It is used for OpenMP long double atomics too, but long double is trivially copyable, and lastly for -ftrivial-auto-var-init=. The following patch restricts the builtin to pointers to trivially-copyable types, with the exception of when it is called directly on the address of a variable; in that case the FE can already verify it is the complete object type and so it is safe to clear all the padding in it. 2022-03-14 Jakub Jelinek <jakub@redhat.com> PR tree-optimization/102586 gcc/ * doc/extend.texi (__builtin_clear_padding): Clarify that for C++ argument type should be pointer to trivially-copyable type unless it is address of a variable or parameter. gcc/cp/ * call.cc (build_cxx_call): Diagnose __builtin_clear_padding where first argument's type is pointer to non-trivially-copyable type unless it is address of a variable or parameter. gcc/testsuite/ * g++.dg/cpp2a/builtin-clear-padding1.C: New test.
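As a rough sketch of the permitted usage, the built-in can be called on the address of a local variable of trivially-copyable type. The struct `Padded` and the function `make_scrubbed` are hypothetical, and the call is guarded with `__has_builtin` so the code still compiles where the built-in is unavailable; the amount of padding in `Padded` depends on the target ABI.

```cpp
#include <cstring>

// Trivially copyable; on typical ABIs there are padding bytes after 'tag'.
struct Padded {
    char tag;
    int value;
};

// Hypothetical helper: build an object whose padding bytes are zeroed
// (when the built-in exists) while the members keep their values.
inline Padded make_scrubbed(char tag, int value) {
    Padded p;
    std::memset(&p, 0xFF, sizeof p);  // dirty every byte, padding included
    p.tag = tag;
    p.value = value;
#if defined(__has_builtin)
#  if __has_builtin(__builtin_clear_padding)
    // The argument is the address of a variable, so the object is known to
    // be complete; only padding bytes are cleared, not the members.
    __builtin_clear_padding(&p);
#  endif
#endif
    return p;
}
```

The member values are unaffected either way; only the bytes between and after the members change.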
2022-03-13texi + c-target.def: Fix typosTobias Burnus3-6/+6
gcc/c-family/ChangeLog: * c-target.def (check_string_object_format_arg): Fix description typo. gcc/ChangeLog: * doc/invoke.texi: Fix typos. * doc/tm.texi.in: Remove duplicated word. * doc/tm.texi: Regenerate. libgomp/ChangeLog: * libgomp.texi: Fix typo.
2022-03-10Merge branch 'master' into devel/sphinxMartin Liska1-5/+6
2022-03-09c, c++, c-family: -Wshift-negative-value and -Wshift-overflow* tweaks for -fwrapv and C++20+ [PR104711]Jakub Jelinek1-2/+2
As mentioned in the PR, different standards have different definitions of what is a UB left shift. They all agree on out of bounds (including negative) shift counts. The rules used by ubsan are:
C99-C2x: ((unsigned) x >> (uprecm1 - y)) != 0 then UB
C++11-C++17: x < 0 || ((unsigned) x >> (uprecm1 - y)) > 1 then UB
C++20 and later: everything is well defined
Now, for C++20, I've in the P1236R1 implementation added an early exit for the -Wshift-overflow* warning so that it never warns, but apparently -Wshift-negative-value remained as is. As it is well defined in C++20, the following patch doesn't enable -Wshift-negative-value from -Wextra anymore for C++20 and later; if users want the warning for compatibility with C++17 and earlier, they can still get it by using -Wshift-negative-value explicitly. Another thing is -fwrapv, which is an extension to the standards, so it is up to us how exactly we define that case. Our ubsan code treats TYPE_OVERFLOW_WRAPS (type0) and cxx_dialect >= cxx20 the same, diagnosing only out of bounds shift counts and nothing else, and IMHO it is most sensible to treat -fwrapv signed left shifts the same as C++20 treats them, https://eel.is/c++draft/expr.shift#2 "The value of E1 << E2 is the unique value congruent to E1×2^E2 modulo 2^N, where N is the width of the type of the result. [Note 1: E1 is left-shifted E2 bit positions; vacated bits are zero-filled. — end note]" with no UB dependent on the E1 values. The UB is only "The behavior is undefined if the right operand is negative, or greater than or equal to the width of the promoted left operand." Under the hood (except for FEs and ubsan from FEs) the GCC middle-end doesn't consider UB in left shifts dependent on the first operand's value, only the out of bounds shifts. While this change isn't a regression, I'd think it is useful for GCC 12; it doesn't add new warnings, but just removes warnings that aren't appropriate.
2022-03-09 Jakub Jelinek <jakub@redhat.com> PR c/104711 gcc/ * doc/invoke.texi (-Wextra): Document that -Wshift-negative-value is enabled by it only for C++11 to C++17 rather than for C++03 or later. (-Wshift-negative-value): Similarly (except here we stated that it is enabled for C++11 or later). gcc/c-family/ * c-opts.cc (c_common_post_options): Don't enable -Wshift-negative-value from -Wextra for C++20 or later. * c-ubsan.cc (ubsan_instrument_shift): Adjust comments. * c-warn.cc (maybe_warn_shift_overflow): Use TYPE_OVERFLOW_WRAPS instead of TYPE_UNSIGNED. gcc/c/ * c-fold.cc (c_fully_fold_internal): Don't emit -Wshift-negative-value warning if TYPE_OVERFLOW_WRAPS. * c-typeck.cc (build_binary_op): Likewise. gcc/cp/ * constexpr.cc (cxx_eval_check_shift_p): Use TYPE_OVERFLOW_WRAPS instead of TYPE_UNSIGNED. * typeck.cc (cp_build_binary_op): Don't emit -Wshift-negative-value warning if TYPE_OVERFLOW_WRAPS. gcc/testsuite/ * c-c++-common/Wshift-negative-value-1.c: Remove dg-additional-options, instead in target selectors of each diagnostic check for exact C++ versions where it should be diagnosed. * c-c++-common/Wshift-negative-value-2.c: Likewise. * c-c++-common/Wshift-negative-value-3.c: Likewise. * c-c++-common/Wshift-negative-value-4.c: Likewise. * c-c++-common/Wshift-negative-value-7.c: New test. * c-c++-common/Wshift-negative-value-8.c: New test. * c-c++-common/Wshift-negative-value-9.c: New test. * c-c++-common/Wshift-negative-value-10.c: New test. * c-c++-common/Wshift-overflow-1.c: Remove dg-additional-options, instead in target selectors of each diagnostic check for exact C++ versions where it should be diagnosed. * c-c++-common/Wshift-overflow-2.c: Likewise. * c-c++-common/Wshift-overflow-5.c: Likewise. * c-c++-common/Wshift-overflow-6.c: Likewise. * c-c++-common/Wshift-overflow-7.c: Likewise. * c-c++-common/Wshift-overflow-8.c: New test. * c-c++-common/Wshift-overflow-9.c: New test. * c-c++-common/Wshift-overflow-10.c: New test. 
* c-c++-common/Wshift-overflow-11.c: New test. * c-c++-common/Wshift-overflow-12.c: New test.
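The three ubsan rules quoted in this commit message can be written out as predicates. This is a sketch for a 32-bit signed left operand only, so `uprecm1` is fixed at 31; the function names are hypothetical, and the out-of-bounds shift count check, which all standards agree on, is folded into each predicate.

```cpp
#include <cstdint>

constexpr int kUprecM1 = 31;  // precision of the 32-bit operand, minus one

// Out of bounds (including negative) shift counts are UB in every standard.
constexpr bool count_in_range(int y) {
    return y >= 0 && y <= kUprecM1;
}

// C99-C2x: UB if any bit is shifted into or past the sign bit.
constexpr bool shl_is_ub_c99(std::int32_t x, int y) {
    return !count_in_range(y) ||
           (static_cast<std::uint32_t>(x) >> (kUprecM1 - y)) != 0;
}

// C++11-C++17: UB if x is negative or a bit is shifted past the sign bit.
constexpr bool shl_is_ub_cxx11(std::int32_t x, int y) {
    return !count_in_range(y) || x < 0 ||
           (static_cast<std::uint32_t>(x) >> (kUprecM1 - y)) > 1;
}

// C++20 and later: only the shift count can make the shift undefined.
constexpr bool shl_is_ub_cxx20(std::int32_t /*x*/, int y) {
    return !count_in_range(y);
}
```

For example, `1 << 31` is UB under the C99 rule but not under the C++11 rule, and `-1 << 1` is UB under both of those rules but well defined under the C++20 rule.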
2022-03-08tree-optimization/84201 - add --param vect-induction-floatRichard Biener1-0/+3
This adds a --param to allow disabling of vectorization of floating point inductions. On top of -Ofast this should allow 549.fotonik3d_r to not miscompare. 2022-03-08 Richard Biener <rguenther@suse.de> PR tree-optimization/84201 * params.opt (-param=vect-induction-float): Add. * doc/invoke.texi (vect-induction-float): Document. * tree-vect-loop.cc (vectorizable_induction): Honor param_vect_induction_float. * gcc.dg/vect/pr84201.c: New testcase.
2022-03-07doc: Remove redundant sentence about modules being in C++20Jonathan Wakely1-3/+1
As C++20 has already been published, we don't need to link to the draft (which is now the C++23 draft anyway). And there's no need to say it's part of the C++20 spec, or that there might be defect reports. That's true for everything in C++20, so calling it out here just for Modules isn't needed. gcc/ChangeLog: * doc/invoke.texi (C++ Modules): Remove anachronism.
2022-03-06Merge branch 'master' into devel/sphinxMartin Liska9-171/+323
2022-03-02Don't emit switch-unreachable warnings for -ftrivial-auto-var-init (PR102276)Qing Zhao1-1/+13
At the same time, add -Wtrivial-auto-var-init and update the documentation. For the following test case:

  1 int g(int *);
  2 int f1()
  3 {
  4   switch (0) {
  5     int x;
  6   default:
  7     return g(&x);
  8   }
  9 }

compiling with -O -ftrivial-auto-var-init causes a spurious warning:

  warning: statement will never be executed [-Wswitch-unreachable]
      5 | int x;
        | ^

This is due to the compiler-generated initialization at the point of the declaration. We can avoid the warning by excluding the following cases when flag_auto_var_init > AUTO_INIT_UNINITIALIZED:
1) a call to .DEFERRED_INIT
2) a call to __builtin_clear_padding if the 2nd argument is present and non-zero
3) a gimple assign store right after the .DEFERRED_INIT call that has the LHS as RHS

However, we still need to warn users about the limitation of the option -ftrivial-auto-var-init by adding a new warning option, -Wtrivial-auto-var-init, to report cases when it cannot initialize the auto variable. At the same time, update the documentation for -ftrivial-auto-var-init to connect it with the new warning option -Wtrivial-auto-var-init, and add documentation for -Wtrivial-auto-var-init. gcc/ChangeLog: PR middle-end/102276 * common.opt (-Wtrivial-auto-var-init): New option. * doc/invoke.texi (-Wtrivial-auto-var-init): Document new option. (-ftrivial-auto-var-init): Update option; * gimplify.cc (emit_warn_switch_unreachable): New function. (warn_switch_unreachable_r): Rename to ... (warn_switch_unreachable_and_auto_init_r): This. (maybe_warn_switch_unreachable): Rename to ... (maybe_warn_switch_unreachable_and_auto_init): This. (gimplify_switch_expr): Update calls to renamed function. gcc/testsuite/ChangeLog: PR middle-end/102276 * gcc.dg/auto-init-pr102276-1.c: New test. * gcc.dg/auto-init-pr102276-2.c: New test. * gcc.dg/auto-init-pr102276-3.c: New test. * gcc.dg/auto-init-pr102276-4.c: New test.
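A self-contained version of the test case from this commit message may help when reproducing the behavior. In this sketch the body of `g` is an assumption (the commit only declares it): it stores a value through the pointer so that the program has a defined result.

```cpp
// Hypothetical callee: initializes the object it is handed and returns it.
int g(int *p) {
    *p = 7;
    return *p;
}

int f1() {
    switch (0) {
        int x;  // never "executed": control jumps straight to the label below
    default:
        return g(&x);
    }
}
```

Because control never passes through the declaration of `x`, any initialization the compiler inserts at that point cannot run, which is the situation the new -Wtrivial-auto-var-init warning is meant to report.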
2022-03-01docs: Document more .gcda file name generation.Martin Liska1-1/+9
PR gcov-profile/104677 gcc/ChangeLog: * doc/invoke.texi: Document more .gcda file name generation.
2022-02-24RISC-V: Document the degree of position independence that medany affordsPalmer Dabbelt1-0/+4
The code generated by -mcmodel=medany is defined to be position-independent, but is not guaranteed to function correctly when linked into position-independent executables or libraries. See the recent discussion at the psABI specification [1] for more details. It would be better to reject these invalid sequences when linking, but as pointed out in a recent LD bug [2] there may be some compatibility issues related to the PCREL_HI20 relocations used to initialize GP. Given the complexity here it's unlikely we'll be able to reject these sequences any time soon, so instead just document that these may not work. [1]: https://github.com/riscv-non-isa/riscv-elf-psabi-doc/issues/245 [2]: https://sourceware.org/bugzilla/show_bug.cgi?id=28789 gcc/ChangeLog: * doc/invoke.texi (RISC-V -mcmodel=medany): Document the degree of position independence that -mcmodel=medany affords. Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-02-23middle-end/104644 - recursion with bswap match.pd patternRichard Biener1-2/+4
The following patch avoids infinite recursion during generic folding. The (cmp (bswap @0) INTEGER_CST@1) simplification relies on (bswap @1) actually being simplified, if it is not simplified, we just move the bswap from one operand to the other and if @0 is also INTEGER_CST, we apply the same rule next. The reason why bswap @1 isn't folded to INTEGER_CST is that the INTEGER_CST has TREE_OVERFLOW set on it and fold-const-call.cc predicate punts in such cases: static inline bool integer_cst_p (tree t) { return TREE_CODE (t) == INTEGER_CST && !TREE_OVERFLOW (t); } The patch uses ! modifier to ensure the bswap is simplified and extends support to GENERIC by means of requiring !EXPR_P which is not perfect but a conservative approximation. 2022-02-22 Richard Biener <rguenther@suse.de> PR tree-optimization/104644 * doc/match-and-simplify.texi: Amend ! documentation. * genmatch.cc (expr::gen_transform): Code-generate ! support for GENERIC. (parser::parse_expr): Allow ! for GENERIC. * match.pd (cmp (bswap @0) INTEGER_CST@1): Use ! modifier on bswap. * gcc.dg/pr104644.c: New test. Co-Authored-by: Jakub Jelinek <jakub@redhat.com>
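The simplification at the heart of this commit moves the byte swap from the variable operand to the constant. The following sketch shows why that is valid for equality comparisons, using a portable 32-bit byte swap; the function names are hypothetical and this is not the match.pd pattern itself.

```cpp
#include <cstdint>

// Portable 32-bit byte swap.
constexpr std::uint32_t bswap32(std::uint32_t v) {
    return (v >> 24) | ((v >> 8) & 0x0000FF00u) |
           ((v << 8) & 0x00FF0000u) | (v << 24);
}

// Original form: swap the variable, then compare against the constant.
constexpr bool cmp_original(std::uint32_t x, std::uint32_t c) {
    return bswap32(x) == c;
}

// Simplified form: compare the variable against the swapped constant.
// This only pays off if bswap32(c) is actually folded to a constant.
constexpr bool cmp_simplified(std::uint32_t x, std::uint32_t c) {
    return x == bswap32(c);
}
```

Since the byte swap is its own inverse, both forms agree for every input. If the swap of the constant is not folded, the rewrite merely moves the swap to the other operand, which is how repeated application could recurse.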
2022-02-23x86: Update Intel architectures ISA support in documentation.Cui,Lili1-87/+98
The ISA support listed for Intel architectures in the documentation is inconsistent with the actual support, so update all of the entries. gcc/ChangeLog: * doc/invoke.texi: Update documents for Intel architectures.
2022-02-22arm: Fix vcond_mask expander for MVE (PR target/100757)Christophe Lyon1-0/+4
The problem in this PR is that we call VPSEL with a mask of vector type instead of HImode. This happens because operand 3 in vcond_mask is the pre-computed vector comparison and has vector type. This patch fixes it by implementing TARGET_VECTORIZE_GET_MASK_MODE, returning the appropriate VxBI mode when targeting MVE. In turn, this implies implementing vec_cmp<mode><MVE_vpred>, vec_cmpu<mode><MVE_vpred> and vcond_mask_<mode><MVE_vpred>, and we can move vec_cmp<mode><v_cmp_result>, vec_cmpu<mode><mode> and vcond_mask_<mode><v_cmp_result> back to neon.md since they are not used by MVE anymore. The new *<MVE_vpred> patterns listed above are implemented in mve.md since they are only valid for MVE. However this may make maintenance/comparison more painful than having all of them in vec-common.md. In the process, we can get rid of the recently added vcond_mve parameter of arm_expand_vector_compare. Compared to neon.md's vcond_mask_<mode><v_cmp_result> before my "arm: Auto-vectorization for MVE: vcmp" patch (r12-834), it keeps the VDQWH iterator added in r12-835 (to have V4HF/V8HF support), as well as the (!<Is_float_mode> || flag_unsafe_math_optimizations) condition which was not present before r12-834 although SF modes were enabled by VDQW (I think this was a bug). Using TARGET_VECTORIZE_GET_MASK_MODE has the advantage that we no longer need to generate vpsel with vectors of 0 and 1: the masks are now merged via scalar 'ands' instructions operating on 16-bit masks after converting the boolean vectors. In addition, this patch fixes a problem in arm_expand_vcond() where the result would be a vector of 0 or 1 instead of operand 1 or 2. Since we want to skip gcc.dg/signbit-2.c for MVE, we also add a new arm_mve effective target. 
Reducing the number of iterations in pr100757-3.c from 32 to 8, we generate the code below: float a[32]; float fn1(int d) { float c = 4.0f; for (int b = 0; b < 8; b++) if (a[b] != 2.0f) c = 5.0f; return c; } fn1: ldr r3, .L3+48 vldr.64 d4, .L3 // q2=(2.0,2.0,2.0,2.0) vldr.64 d5, .L3+8 vldrw.32 q0, [r3] // q0=a(0..3) adds r3, r3, #16 vcmp.f32 eq, q0, q2 // cmp a(0..3) == (2.0,2.0,2.0,2.0) vldrw.32 q1, [r3] // q1=a(4..7) vmrs r3, P0 vcmp.f32 eq, q1, q2 // cmp a(4..7) == (2.0,2.0,2.0,2.0) vmrs r2, P0 @ movhi ands r3, r3, r2 // r3=select(a(0..3]) & select(a(4..7)) vldr.64 d4, .L3+16 // q2=(5.0,5.0,5.0,5.0) vldr.64 d5, .L3+24 vmsr P0, r3 vldr.64 d6, .L3+32 // q3=(4.0,4.0,4.0,4.0) vldr.64 d7, .L3+40 vpsel q3, q3, q2 // q3=vcond_mask(4.0,5.0) vmov.32 r2, q3[1] // keep the scalar max vmov.32 r0, q3[3] vmov.32 r3, q3[2] vmov.f32 s11, s12 vmov s15, r2 vmov s14, r3 vmaxnm.f32 s15, s11, s15 vmaxnm.f32 s15, s15, s14 vmov s14, r0 vmaxnm.f32 s15, s15, s14 vmov r0, s15 bx lr .L4: .align 3 .L3: .word 1073741824 // 2.0f .word 1073741824 .word 1073741824 .word 1073741824 .word 1084227584 // 5.0f .word 1084227584 .word 1084227584 .word 1084227584 .word 1082130432 // 4.0f .word 1082130432 .word 1082130432 .word 1082130432 This patch adds tests that trigger an ICE without this fix. The pr100757*.c testcases are derived from gcc.c-torture/compile/20160205-1.c, forcing the use of MVE, and using various types and return values different from 0 and 1 to avoid commonalization with boolean masks. In addition, since we should not need these masks, the tests make sure they are not present. Most of the work of this patch series was carried out while I was working at STMicroelectronics as a Linaro assignee. 2022-02-22 Christophe Lyon <christophe.lyon@arm.com> PR target/100757 gcc/ * config/arm/arm-protos.h (arm_get_mask_mode): New prototype. (arm_expand_vector_compare): Update prototype. * config/arm/arm.cc (TARGET_VECTORIZE_GET_MASK_MODE): New. 
(arm_vector_mode_supported_p): Add support for VxBI modes. (arm_expand_vector_compare): Remove useless generation of vpsel. (arm_expand_vcond): Fix select operands. (arm_get_mask_mode): New. * config/arm/mve.md (vec_cmp<mode><MVE_vpred>): New. (vec_cmpu<mode><MVE_vpred>): New. (vcond_mask_<mode><MVE_vpred>): New. * config/arm/vec-common.md (vec_cmp<mode><v_cmp_result>) (vec_cmpu<mode><mode, vcond_mask_<mode><v_cmp_result>): Move to ... * config/arm/neon.md (vec_cmp<mode><v_cmp_result>) (vec_cmpu<mode><mode, vcond_mask_<mode><v_cmp_result>): ... here and disable for MVE. * doc/sourcebuild.texi (arm_mve): Document new effective-target. gcc/testsuite/ PR target/100757 * gcc.target/arm/simd/pr100757-2.c: New. * gcc.target/arm/simd/pr100757-3.c: New. * gcc.target/arm/simd/pr100757-4.c: New. * gcc.target/arm/simd/pr100757.c: New. * gcc.dg/signbit-2.c: Skip when targeting ARM/MVE. * lib/target-supports.exp (check_effective_target_arm_mve): New.
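The `.word` constants in the generated code above are the bit patterns of the float literals in `fn1`. The sketch below decodes them and also gives a scalar reference version of `fn1`; it assumes `float` is IEEE-754 binary32, and `float_from_bits` is a hypothetical helper.

```cpp
#include <cstdint>
#include <cstring>

// Reinterpret a 32-bit pattern as a float (assumes IEEE-754 binary32).
inline float float_from_bits(std::uint32_t bits) {
    static_assert(sizeof(float) == sizeof(std::uint32_t),
                  "float must be 32 bits wide");
    float f;
    std::memcpy(&f, &bits, sizeof f);
    return f;
}

// Scalar reference for fn1 from the commit message (8 iterations).
float a[32];

float fn1(int /*d*/) {
    float c = 4.0f;
    for (int b = 0; b < 8; b++)
        if (a[b] != 2.0f)
            c = 5.0f;
    return c;
}
```

With this helper, 1073741824 decodes to 2.0f, 1084227584 to 5.0f and 1082130432 to 4.0f, matching the comments in the assembly listing.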
2022-02-22nvptx: Add -mptx=6.0Tobias Burnus1-3/+4
Currently supported internally are 3.1, 6.0, 6.3 and 7.0. However, -mptx= supports 3.1, 6.3, 7.0 – but not the internal default 6.0. Add -mptx=6.0 for consistency. Tested on nvptx. gcc/ChangeLog: * config/nvptx/nvptx.opt (mptx): Add 6.0 alias PTX_VERSION_6_0. * doc/invoke.texi (-mptx): Update for new values and defaults. Co-Authored-By: Tom de Vries <tdevries@suse.de>
2022-02-21aarch64: Add compiler support for Shadow Call StackDan Li3-0/+37
Shadow Call Stack can be used to protect the return address of a function at runtime, and clang already supports this feature[1]. To enable SCS in user mode, in addition to compiler, other support is also required (as discussed in [2]). This patch only adds basic support for SCS from the compiler side, and provides convenience for users to enable SCS. For linux kernel, only the support of the compiler is required. [1] https://clang.llvm.org/docs/ShadowCallStack.html [2] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102768 Signed-off-by: Dan Li <ashimida@linux.alibaba.com> gcc/ChangeLog: * config/aarch64/aarch64.cc (SLOT_REQUIRED): Change wb_candidate[12] to wb_push_candidate[12]. (aarch64_layout_frame): Likewise, and change callee_adjust when scs is enabled. (aarch64_save_callee_saves): Change wb_candidate[12] to wb_push_candidate[12]. (aarch64_restore_callee_saves): Change wb_candidate[12] to wb_pop_candidate[12]. (aarch64_get_separate_components): Change wb_candidate[12] to wb_push_candidate[12]. (aarch64_expand_prologue): Push x30 onto SCS before it's pushed onto stack. (aarch64_expand_epilogue): Pop x30 frome SCS, while preventing it from being popped from the regular stack again. (aarch64_override_options_internal): Add SCS compile option check. (TARGET_HAVE_SHADOW_CALL_STACK): New hook. * config/aarch64/aarch64.h (struct GTY): Add is_scs_enabled, wb_pop_candidate[12], and rename wb_candidate[12] to wb_push_candidate[12]. * config/aarch64/aarch64.md (scs_push): New template. (scs_pop): Likewise. * doc/invoke.texi: Document -fsanitize=shadow-call-stack. * doc/tm.texi: Regenerate. * doc/tm.texi.in: Add hook have_shadow_call_stack. * flag-types.h (enum sanitize_code): Add SANITIZE_SHADOW_CALL_STACK. * opts.cc (parse_sanitizer_options): Add shadow-call-stack and exclude SANITIZE_SHADOW_CALL_STACK. * target.def: New hook. * toplev.cc (process_options): Add SCS compile option check. * ubsan.cc (ubsan_expand_null_ifn): Enum type conversion. 
gcc/testsuite/ChangeLog: * gcc.target/aarch64/shadow_call_stack_1.c: New test. * gcc.target/aarch64/shadow_call_stack_2.c: New test. * gcc.target/aarch64/shadow_call_stack_3.c: New test. * gcc.target/aarch64/shadow_call_stack_4.c: New test. * gcc.target/aarch64/shadow_call_stack_5.c: New test. * gcc.target/aarch64/shadow_call_stack_6.c: New test. * gcc.target/aarch64/shadow_call_stack_7.c: New test. * gcc.target/aarch64/shadow_call_stack_8.c: New test.
2022-02-14Update -Warray-bounds documentation [PR104355].Martin Sebor1-10/+14
Resolves: PR middle-end/104355 - Misleading and outdated -Warray-bounds documentation gcc/ChangeLog: PR middle-end/104355 * doc/invoke.texi (-Warray-bounds): Update documentation.
2022-02-10doc: Tweak the www.bitwizard.nl referenceGerald Pfeifer1-1/+1
gcc: * doc/install.texi (Specific): Change the www.bitwizard.nl reference to use https.
2022-02-09x86: Add -m[no-]direct-extern-accessH.J. Lu2-1/+20
Add -m[no-]direct-extern-access and nodirect_extern_access attribute. -mdirect-extern-access is the default. With nodirect_extern_access attribute, GOT is always used to access undefined data and function symbols with nodirect_extern_access attribute, including in PIE and non-PIE. With -mno-direct-extern-access: 1. Always use GOT to access undefined data and function symbols, including in PIE and non-PIE. These will avoid copy relocations in executables. This is compatible with existing executables and shared libraries. 2. In executable and shared library, bind symbols with the STV_PROTECTED visibility locally: a. The address of data symbol is the address of data body. b. For systems without function descriptor, the function pointer is the address of function body. c. The resulting shared libraries may not be incompatible with executables which have copy relocations on protected symbols or use executable PLT entries as function addresses for protected functions in shared libraries. 3. Update asm_preferred_eh_data_format to select PC relative EH encoding format with -mno-direct-extern-access to avoid copy relocation. 4. Add ix86_reloc_rw_mask for TARGET_ASM_RELOC_RW_MASK to avoid copy relocation with -mno-direct-extern-access. gcc/ PR target/35513 PR target/100593 * config/i386/gnu-property.cc: Include "i386-protos.h". (file_end_indicate_exec_stack_and_gnu_property): Generate a GNU_PROPERTY_1_NEEDED note for -mno-direct-extern-access or nodirect_extern_access attribute. * config/i386/i386-options.cc (handle_nodirect_extern_access_attribute): New function. (ix86_attribute_table): Add nodirect_extern_access attribute. * config/i386/i386-protos.h (ix86_force_load_from_GOT_p): Add a bool argument. (ix86_has_no_direct_extern_access): New. * config/i386/i386.cc (ix86_has_no_direct_extern_access): New. (ix86_force_load_from_GOT_p): Add a bool argument to indicate call operand. Force non-call load from GOT for -mno-direct-extern-access or nodirect_extern_access attribute. 
(legitimate_pic_address_disp_p): Avoid copy relocation in PIE for -mno-direct-extern-access or nodirect_extern_access attribute. (ix86_print_operand): Pass true to ix86_force_load_from_GOT_p for call operand. (asm_preferred_eh_data_format): Use PC-relative format for -mno-direct-extern-access to avoid copy relocation. Check ptr_mode instead of TARGET_64BIT when selecting DW_EH_PE_sdata4. (ix86_binds_local_p): Set ix86_has_no_direct_extern_access to true for -mno-direct-extern-access or nodirect_extern_access attribute. Don't treat protected data as extern and avoid copy relocation on common symbol with -mno-direct-extern-access or nodirect_extern_access attribute. (ix86_reloc_rw_mask): New to avoid copy relocation for -mno-direct-extern-access. (TARGET_ASM_RELOC_RW_MASK): New. * config/i386/i386.opt: Add -mdirect-extern-access. * doc/extend.texi: Document nodirect_extern_access attribute. * doc/invoke.texi: Document -m[no-]direct-extern-access. gcc/testsuite/ PR target/35513 PR target/100593 * g++.target/i386/pr35513-1.C: New file. * g++.target/i386/pr35513-2.C: Likewise. * gcc.target/i386/pr35513-1a.c: Likewise. * gcc.target/i386/pr35513-1b.c: Likewise. * gcc.target/i386/pr35513-2a.c: Likewise. * gcc.target/i386/pr35513-2b.c: Likewise. * gcc.target/i386/pr35513-3a.c: Likewise. * gcc.target/i386/pr35513-3b.c: Likewise. * gcc.target/i386/pr35513-4a.c: Likewise. * gcc.target/i386/pr35513-4b.c: Likewise. * gcc.target/i386/pr35513-5a.c: Likewise. * gcc.target/i386/pr35513-5b.c: Likewise. * gcc.target/i386/pr35513-6a.c: Likewise. * gcc.target/i386/pr35513-6b.c: Likewise. * gcc.target/i386/pr35513-7a.c: Likewise. * gcc.target/i386/pr35513-7b.c: Likewise. * gcc.target/i386/pr35513-8.c: Likewise. * gcc.target/i386/pr35513-9a.c: Likewise. * gcc.target/i386/pr35513-9b.c: Likewise. * gcc.target/i386/pr35513-10a.c: Likewise. * gcc.target/i386/pr35513-10b.c: Likewise. * gcc.target/i386/pr35513-11a.c: Likewise. * gcc.target/i386/pr35513-11b.c: Likewise. 
* gcc.target/i386/pr35513-12a.c: Likewise. * gcc.target/i386/pr35513-12b.c: Likewise.
2022-02-08doc: RISC-V: Document the `-misa-spec=' optionMaciej W. Rozycki2-0/+31
We have recently updated the default for the `-misa-spec=' option, yet we still have not documented it nor its `--with-isa-spec=' counterpart in the GCC manuals. Fix that. gcc/ * doc/install.texi (Configuration): Document `--with-isa-spec=' RISC-V option. * doc/invoke.texi (Option Summary): List `-misa-spec=' RISC-V option. (RISC-V Options): Document it.
2022-02-04rs6000: Clean up ISA 3.1 documentation [PR100808]Bill Schmidt1-28/+43
Due to a pasto error in the documentation, vec_replace_unaligned was implemented with the same function prototypes as vec_replace_elt. It was intended that vec_replace_unaligned always specify output vectors as having type vector unsigned char, to emphasize that elements are potentially misaligned by this built-in function. This patch corrects the misimplementation. 2022-02-04 Bill Schmidt <wschmidt@linux.ibm.com> gcc/ PR target/100808 * doc/extend.texi (Basic PowerPC Built-in Functions Available on ISA 3.1): Provide consistent type names. Remove unnecessary semicolons. Fix bad line breaks.
2022-02-04doc: Update references to "C++2a" in cpp.texiJonathan Wakely1-4/+4
gcc/ChangeLog: * doc/cpp.texi (Variadic Macros): Replace C++2a with C++20.
2022-02-02docs: mention analyzer interaction with -ftrivial-auto-var-init [PR104270]David Malcolm1-1/+2
gcc/ChangeLog: PR analyzer/104270 * doc/invoke.texi (-ftrivial-auto-var-init=): Add reference to -Wanalyzer-use-of-uninitialized-value to paragraph documenting that -ftrivial-auto-var-init= doesn't suppress warnings. Signed-off-by: David Malcolm <dmalcolm@redhat.com>
2022-02-02vect: Simplify and extend the complex numbers validation routines.Tamar Christina1-24/+28
This patch boosts the analysis for complex mul, fma and fms in order to ensure that it doesn't create an incorrect output. Essentially it adds an extra verification to check that the two nodes it's going to combine do the same operations on compatible values. The reason it needs to do this is that if one computation differs from the other then with the current implementation we have no way to deal with it since we have to remove the permute. When we can keep the permute around we can probably handle these by unrolling. While implementing this, since I have to do the traversal anyway, I took advantage of it by simplifying the code a bit. Previously we would determine whether something is a conjugate and then try to figure out which conjugate it is and then try to see if the permutes match what we expect. Now the code that does the traversal will detect this in one go and return to us whether the operation is something that can be combined and whether a conjugate is present. Secondly, because it does this I can now simplify the checking code itself to essentially just try to apply fixed patterns to each operation. The patterns represent the order operations should appear in. For instance a complex MUL operation combines: Left 1 + Right 1 and Left 2 + Right 2, with a permute on the nodes consisting of: { Even, Even } + { Odd, Odd } and { Even, Odd } + { Odd, Even }. By abstracting over these patterns the checking code becomes quite simple. As part of this I was checking the order of the operands, which was left in "slp" order, that is, the same order they showed up in during SLP, which means that the accumulator is first. However it looks like I didn't document this and the x86 optab was implemented assuming the same order as FMA, i.e. that the accumulator is last. I have thus changed the order to match that of FMA and FMS, which corrects the x86 codegen and will update the Arm targets. This has now also been documented. 
gcc/ChangeLog: PR tree-optimization/102819 PR tree-optimization/103169 * doc/md.texi: Update docs for cfms, cfma. * tree-data-ref.h (same_data_refs): Accept optional offset. * tree-vect-slp-patterns.cc (is_linear_load_p): Fix issue with repeating patterns. (vect_normalize_conj_loc): Remove. (is_eq_or_top): Change to take two nodes. (enum _conj_status, compatible_complex_nodes_p, vect_validate_multiplication): New. (class complex_add_pattern, complex_add_pattern::matches, complex_add_pattern::recognize, class complex_mul_pattern, complex_mul_pattern::recognize, class complex_fms_pattern, complex_fms_pattern::recognize, class complex_operations_pattern, complex_operations_pattern::recognize, addsub_pattern::recognize): Pass new cache. (complex_fms_pattern::matches, complex_mul_pattern::matches): Pass new cache and use new validation code. * tree-vect-slp.cc (vect_match_slp_patterns_2, vect_match_slp_patterns, vect_analyze_slp): Pass along cache. (compatible_calls_p): Expose. * tree-vectorizer.h (compatible_calls_p, slp_node_hash, slp_compat_nodes_map_t): New. (class vect_pattern): Update signatures include new cache. gcc/testsuite/ChangeLog: PR tree-optimization/102819 PR tree-optimization/103169 * g++.dg/vect/pr99149.cc: xfail for now. * gcc.dg/vect/complex/pr102819-1.c: New test. * gcc.dg/vect/complex/pr102819-2.c: New test. * gcc.dg/vect/complex/pr102819-3.c: New test. * gcc.dg/vect/complex/pr102819-4.c: New test. * gcc.dg/vect/complex/pr102819-5.c: New test. * gcc.dg/vect/complex/pr102819-6.c: New test. * gcc.dg/vect/complex/pr102819-7.c: New test. * gcc.dg/vect/complex/pr102819-8.c: New test. * gcc.dg/vect/complex/pr102819-9.c: New test. * gcc.dg/vect/complex/pr103169.c: New test.
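The even/odd pattern described in this commit message can be illustrated with a minimal C sketch of a complex multiply over interleaved (real, imaginary) arrays. The function name cmul_interleaved and the array layout are assumptions made for this example and are not taken from the patch; whether such a loop is actually recognized as a complex MUL depends on the target and on the vectorizer.

```c
#include <stddef.h>

/* Multiply n complex numbers stored as interleaved (re, im) pairs.
   Even indices hold real parts, odd indices hold imaginary parts.
   Hypothetical helper, written only to show the access pattern.  */
void
cmul_interleaved (float *restrict c, const float *restrict a,
                  const float *restrict b, size_t n)
{
  for (size_t i = 0; i < n; i++)
    {
      /* { Even, Even } and { Odd, Odd } products form the real part.  */
      c[2 * i] = a[2 * i] * b[2 * i] - a[2 * i + 1] * b[2 * i + 1];
      /* { Even, Odd } and { Odd, Even } products form the imaginary part.  */
      c[2 * i + 1] = a[2 * i] * b[2 * i + 1] + a[2 * i + 1] * b[2 * i];
    }
}
```

The two statements in the loop body correspond to the two "Left + Right" pairs named in the message: each pairs one product with another, and the permutes select the even or odd lanes of each input.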
2022-02-02cris: Don't default to -mmul-bug-workaroundHans-Peter Nilsson1-1/+1
This flips the default for the errata handling for an old version (TL;DR: workaround: no multiply instruction last on a cache-line). Newer versions of the CRIS cpu don't have that bug. While the impact of the workaround is very marginal (coremark: less than .05% larger, less than .0005% slower) it's an irritating pseudorandom factor when assessing the impact of other changes. Also, fix a wart requiring changes to more than TARGET_DEFAULT to flip the default. People building old kernels or operating systems to run on ETRAX 100 LX are advised to pass "-mmul-bug-workaround". gcc: * config/cris/cris.h (TARGET_DEFAULT): Don't include MASK_MUL_BUG. (MUL_BUG_ASM_DEFAULT): New macro. (MAYBE_AS_NO_MUL_BUG_ABORT): Define in terms of MUL_BUG_ASM_DEFAULT. * doc/invoke.texi (CRIS Options, -mmul-bug-workaround): Adjust accordingly.
2022-02-01[COMMITTED] Change multiprecision.org to use httpsAndrew Pinski1-1/+1
As reported at https://gcc.gnu.org/pipermail/gcc/2022-February/238216.html, multiprecision.org now uses https so this updates the documentation to use https instead of http. Committed as obvious. gcc/ChangeLog: * doc/install.texi:
2022-02-01docs: remove --disable-stage1-checking from requirementsMartin Liska1-5/+0
As the minimal GCC version that can build the current master is 4.8, it does not make sense to mention something for older versions. gcc/ChangeLog: * doc/install.texi: Remove option for GCC < 4.8.
2022-01-28doc: Update -Wbidi-chars documentationMarek Polacek1-1/+3
gcc/ChangeLog: * doc/invoke.texi: Update -Wbidi-chars documentation.
2022-01-28Merge branch 'master' into devel/sphinxMartin Liska3-2/+31
2022-01-24preprocessor: -Wbidi-chars and UCNs [PR104030]Marek Polacek1-2/+6
Stephan Bergmann reported that our -Wbidi-chars breaks the build of LibreOffice because we warn about UCNs even when their usage is correct: LibreOffice constructs strings piecewise, as in: aText = u"\u202D" + aText; and warning about that is overzealous. Since no editor (AFAIK) interprets UCNs to show them as Unicode characters, there's less risk in misinterpreting them, and so perhaps we shouldn't warn about them by default. However, identifiers containing UCNs or programs generating other programs could still cause confusion, so I'm keeping the UCN checking. To turn it on, you just need to use -Wbidi-chars=unpaired,ucn or -Wbidi-chars=any,ucn. The implementation is done by using the new EnumSet feature. PR preprocessor/104030 gcc/c-family/ChangeLog: * c.opt (Wbidi-chars): Mark as EnumSet. Also accept =ucn. gcc/ChangeLog: * doc/invoke.texi: Update documentation for -Wbidi-chars. libcpp/ChangeLog: * include/cpplib.h (enum cpp_bidirectional_level): Add bidirectional_ucn. Set values explicitly. * internal.h (cpp_reader): Adjust warn_bidi_p. * lex.cc (maybe_warn_bidi_on_close): Don't warn about UCNs unless UCN checking is on. (maybe_warn_bidi_on_char): Likewise. gcc/testsuite/ChangeLog: * c-c++-common/Wbidi-chars-10.c: Turn on UCN checking. * c-c++-common/Wbidi-chars-11.c: Likewise. * c-c++-common/Wbidi-chars-14.c: Likewise. * c-c++-common/Wbidi-chars-16.c: Likewise. * c-c++-common/Wbidi-chars-17.c: Likewise. * c-c++-common/Wbidi-chars-4.c: Likewise. * c-c++-common/Wbidi-chars-5.c: Likewise. * c-c++-common/Wbidi-chars-6.c: Likewise. * c-c++-common/Wbidi-chars-7.c: Likewise. * c-c++-common/Wbidi-chars-8.c: Likewise. * c-c++-common/Wbidi-chars-9.c: Likewise. * c-c++-common/Wbidi-chars-ranges.c: Likewise. * c-c++-common/Wbidi-chars-18.c: New test. * c-c++-common/Wbidi-chars-19.c: New test. * c-c++-common/Wbidi-chars-20.c: New test. * c-c++-common/Wbidi-chars-21.c: New test. * c-c++-common/Wbidi-chars-22.c: New test. * c-c++-common/Wbidi-chars-23.c: New test.
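To make the UCN case concrete, here is a small C sketch of a string that carries U+202D (LEFT-TO-RIGHT OVERRIDE) spelled as a universal character name, in the spirit of the LibreOffice line quoted above. The helper names lro_prefix and lro_prefix_length are hypothetical, and the byte values noted in the comments assume a UTF-8 execution character set (the GCC default).

```c
#include <string.h>

/* Return a string holding U+202D, written in the source as a UCN.
   The source text itself contains only ASCII characters, so an editor
   shows the escape rather than applying the override.  */
const char *
lro_prefix (void)
{
  return "\u202D";
}

/* Number of bytes the UCN occupies in the compiled string; with a
   UTF-8 execution character set this is the three bytes E2 80 AD.  */
size_t
lro_prefix_length (void)
{
  return strlen (lro_prefix ());
}
```

Per the commit message, code like this is only diagnosed when UCN checking is requested, for example with -Wbidi-chars=unpaired,ucn or -Wbidi-chars=any,ucn.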
2022-01-24rtl: builtins: (not just) rs6000: Add builtins for fegetround, feclearexcept and feraiseexcept [PR94193]Raoni Fassina Firmino2-0/+25
These optimizations were originally in glibc, but were removed, with the suggestion that they were a good fit as gcc builtins[1]. feclearexcept and feraiseexcept were extended (in comparison to the glibc version) to accept any combination of the accepted flags, not limited to just one flag bit at a time anymore. The builtin expanders need knowledge of the target libc's FE_* values, so they are limited to expand only to suitable libcs. [1] https://sourceware.org/legacy-ml/libc-alpha/2020-03/msg00047.html https://sourceware.org/legacy-ml/libc-alpha/2020-03/msg00080.html 2020-08-13 Raoni Fassina Firmino <raoni@linux.ibm.com> gcc/ PR target/94193 * builtins.cc (expand_builtin_fegetround): New function. (expand_builtin_feclear_feraise_except): New function. (expand_builtin): Add cases for BUILT_IN_FEGETROUND, BUILT_IN_FECLEAREXCEPT and BUILT_IN_FERAISEEXCEPT. * config/rs6000/rs6000.md (fegetroundsi): New pattern. (feclearexceptsi): New Pattern. (feraiseexceptsi): New Pattern. * doc/extend.texi: Add a new introductory paragraph about the new builtins. * doc/md.texi: (fegetround@var{m}): Document new optab. (feclearexcept@var{m}): Document new optab. (feraiseexcept@var{m}): Document new optab. * optabs.def (fegetround_optab): New optab. (feclearexcept_optab): New optab. (feraiseexcept_optab): New optab. gcc/testsuite/ PR target/94193 * gcc.target/powerpc/builtin-feclearexcept-feraiseexcept-1.c: New test. * gcc.target/powerpc/builtin-feclearexcept-feraiseexcept-2.c: New test. * gcc.target/powerpc/builtin-fegetround.c: New test. Signed-off-by: Raoni Fassina Firmino <raoni@linux.ibm.com>
2022-01-24Merge branch 'master' into devel/sphinxMartin Liska3-0/+53
2022-01-24options: Add EnumBitSet property support [PR104158]Jakub Jelinek1-0/+8
On Sat, Jan 22, 2022 at 01:47:08AM +0100, Jakub Jelinek via Gcc-patches wrote: > I think with the 2) patch I achieve what we want for Fortran, for 1) > the only behavior from gcc 11 is that > -fsanitize-coverage=trace-cmp,trace-cmp is now rejected. > This is mainly from the desire to disallow > -fconvert=big-endian,little-endian or -Wbidi-chars=bidirectional,any > etc. where it would be confusing to users what exactly it means. > But it is the only from these options that actually acts as an Enum > bit set, each enumerator can be specified with all the others. > So one option would be stop requiring the EnumSet implies Set properties > must be specified and just require that either they are specified on all > EnumValues, or on none of them; the latter case would be for > -fsanitize-coverage= and the non-Set case would mean that all the > EnumValues need to have disjoint Value bitmasks and that they can > be all specified and unlike the Set case also repeated. > Thoughts on this? Here is an incremental patch to the first two patches of the series that implements EnumBitSet that fully restores the -fsanitize-coverage GCC 11 behavior. 2022-01-24 Jakub Jelinek <jakub@redhat.com> PR sanitizer/104158 * opt-functions.awk (var_set): Handle EnumBitSet property. * optc-gen.awk: Don't disallow RejectNegative if EnumBitSet is specified. * opts.h (enum cl_enum_var_value): New type. * opts-common.cc (decode_cmdline_option): Use CLEV_* values. Handle CLEV_BITSET. (cmdline_handle_error): Handle CLEV_BITSET. * opts.cc (test_enum_sets): Also test EnumBitSet requirements. * doc/options.texi (EnumBitSet): Document. * common.opt (fsanitize-coverage=): Use EnumBitSet instead of EnumSet. (trace-pc, trace-cmp): Drop Set properties. * gcc.dg/sancov/pr104158-7.c: Adjust for repeating of arguments being allowed.
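The bit-set behavior that this patch restores for -fsanitize-coverage= can be modelled with a few lines of C. This is only a sketch of the semantics, not the option-handling code: the helper names cov_enable and cov_disable are hypothetical, and the numeric bit values are assumptions (only the SANITIZE_COV_* names come from the GCC sources).

```c
/* Assumed bit values; each enumerator owns one disjoint bit.  */
enum
{
  SANITIZE_COV_TRACE_PC = 1 << 0,
  SANITIZE_COV_TRACE_CMP = 1 << 1
};

/* Model of -fsanitize-coverage=<bits>: OR the requested bits in.
   Repeating an argument is harmless because OR is idempotent.  */
unsigned int
cov_enable (unsigned int flags, unsigned int bits)
{
  return flags | bits;
}

/* Model of -fno-sanitize-coverage=<bits>: clear the requested bits.  */
unsigned int
cov_disable (unsigned int flags, unsigned int bits)
{
  return flags & ~bits;
}
```

Under this model, -fsanitize-coverage=trace-cmp,trace-cmp simply sets the same bit twice, which is why repeated arguments can be allowed for an EnumBitSet option.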
2022-01-24options: Add EnumSet and Set property support [PR104158]Jakub Jelinek1-0/+25
The following patch is infrastructure support for at least 3 different options that need changes: 1) PR104158 talks about a regression with the -fsanitize-coverage= option; in GCC 11 and older and on trunk prior to r12-1177, this option behaved similarly to the -f{,no-}sanitize{,-recover}= options, namely that the option allows a negative form and the argument of the option is a list of strings, each of which has some enumerator, and -fsanitize-coverage= enabled those bits in the underlying flag_sanitize_coverage, while -fno-sanitize-coverage= disabled them. So, -fsanitize-coverage=trace-pc,trace-cmp was equivalent to -fsanitize-coverage=trace-pc -fsanitize-coverage=trace-cmp and both set flag_sanitize_coverage to (SANITIZE_COV_TRACE_PC | SANITIZE_COV_TRACE_CMP). Also, e.g. -fsanitize-coverage=trace-pc,trace-cmp -fno-sanitize-coverage=trace-pc would in the end set flag_sanitize_coverage to SANITIZE_COV_TRACE_CMP (first set both bits, then subtract one). The r12-1177 change, I think done to improve argument misspelling diagnostic, changed the option incompatibly in multiple ways: -fno-sanitize-coverage= is now rejected, only a single argument is allowed, not multiple, and -fsanitize-coverage=trace-pc -fsanitize-coverage=trace-cmp enables just SANITIZE_COV_TRACE_CMP and not both (each option overrides the previous value). 2) Thomas Koenig wants to extend Fortran -fconvert= option for the ppc64le real(kind=16) swapping support; currently the option accepts -fconvert={native,swap,big-endian,little-endian} and the intent is to add support for -fconvert=r16_ibm and -fconvert=r16_ieee (that alone is just normal Enum), but also to handle -fconvert=swap,r16_ieee or -fconvert=r16_ieee,big-endian but not -fconvert=big-endian,little-endian - the native/swap/big-endian/little-endian are one mutually exclusive set and r16_ieee/r16_ibm another one. See https://gcc.gnu.org/pipermail/gcc-patches/2022-January/587943.html and thread around that. 
3) Similarly Marek Polacek wants to extend the -Wbidi-chars= option, such that it will handle not just the current -Wbidi-chars={none,bidirectional,any}, but also -Wbidi-chars=ucn and bidirectional,ucn and ucn,any etc. Again two separate sets, one none/bidirectional/any and another one ucn. See https://gcc.gnu.org/pipermail/gcc-patches/2022-January/588960.html The following patch adds framework for this and I'll post incremental patches for 1) and 2). As I've tried to document, such options are marked by additional EnumSet property on the option and in that case all the EnumValues in the Enum referenced from it must use a new Set property with set number (initially I wanted just mark last enumerator in each mutually exclusive set, but optionlist is sorted and so it doesn't really work well). So e.g. for the Fortran -fconvert=, one specifies: fconvert= Fortran RejectNegative Joined Enum(gfc_convert) EnumSet Var(flag_convert) Init(GFC_FLAG_CONVERT_NATIVE) -fconvert=<big-endian|little-endian|native|swap|r16_ieee|r16_ibm> The endianness used for unformatted files. 
Enum Name(gfc_convert) Type(enum gfc_convert) UnknownError(Unrecognized option to endianness value: %qs) EnumValue Enum(gfc_convert) String(big-endian) Value(GFC_FLAG_CONVERT_BIG) Set(1) EnumValue Enum(gfc_convert) String(little-endian) Value(GFC_FLAG_CONVERT_LITTLE) Set(1) EnumValue Enum(gfc_convert) String(native) Value(GFC_FLAG_CONVERT_NATIVE) Set(1) EnumValue Enum(gfc_convert) String(swap) Value(GFC_FLAG_CONVERT_SWAP) Set(1) EnumValue Enum(gfc_convert) String(r16_ieee) Value(GFC_FLAG_CONVERT_R16_IEEE) Set(2) EnumValue Enum(gfc_convert) String(r16_ibm) Value(GFC_FLAG_CONVERT_R16_IBM) Set(2) and this says to the option handling code that 1) if only one arg is specified to one instance of the option, it can be any of those 6; 2) if two args are specified, one has to be from the first 4 and another from the last 2, in any order; 3) at most 2 args may be specified (there are just 2 sets). There is a requirement on the Value values checked in self-test: the values from one set ored together must be disjoint from the values from another set ored together. In the Fortran case, the first 4 are 0-3 so mask is 3, and the last 2 are 4 and 8, so mask is 12. When say -fconvert=big-endian is specified, it sets the first set to GFC_FLAG_CONVERT_BIG (2) but doesn't modify whatever value the other set had, so e.g. "-fconvert=big-endian -fconvert=r16_ieee", "-fconvert=r16_ieee -fconvert=big-endian", "-fconvert=r16_ieee,big-endian" and "-fconvert=big-endian,r16_ieee" all behave the same. Also, with the EnumSet support, it is now possible to allow not specifying RejectNegative - we can set some set's value and then clear it and set it again to some other value etc. I think with the 2) patch I achieve what we want for Fortran; for 1) the only behavior change from gcc 11 is that -fsanitize-coverage=trace-cmp,trace-cmp is now rejected. This is mainly from the desire to disallow -fconvert=big-endian,little-endian or -Wbidi-chars=bidirectional,any etc. where it would be confusing to users what exactly it means. 
But it is the only one of these options that actually acts as an Enum bit set, where each enumerator can be specified with all the others. So one option would be to stop requiring that EnumSet implies Set properties must be specified and just require that either they are specified on all EnumValues, or on none of them; the latter case would be for -fsanitize-coverage= and the non-Set case would mean that all the EnumValues need to have disjoint Value bitmasks and that they can be all specified and unlike the Set case also repeated. Thoughts on this? 2022-01-24 Jakub Jelinek <jakub@redhat.com> PR sanitizer/104158 * opt-functions.awk (var_set): Handle EnumSet property. * optc-gen.awk: Don't disallow RejectNegative if EnumSet is specified. * opt-read.awk: Handle Set property. * opts.h (CL_ENUM_SET_SHIFT, CL_ERR_ENUM_SET_ARG): Define. (struct cl_decoded_option): Mention enum in value description. Add mask member. (set_option): Add mask argument defaulted to 0. * opts.cc (test_enum_sets): New function. (opts_cc_tests): Call it. * opts-common.cc (enum_arg_to_value): Change return argument from bool to int, on success return index into the cl_enum_arg array, on failure -1. Add len argument, if non-0, use strncmp instead of strcmp. (opt_enum_arg_to_value): Adjust caller. (decode_cmdline_option): Handle EnumSet represented as CLVC_ENUM with non-zero var_value. Initialize decoded->mask. (decode_cmdline_options_to_array): Clear opt_array[0].mask. (handle_option): Pass decoded->mask to set_option's last argument. (generate_option): Clear decoded->mask. (generate_option_input_file): Likewise. (cmdline_handle_error): Handle CL_ERR_ENUM_SET_ARG. (set_option): Add mask argument, use it for CLVC_ENUM. (control_warning_option): Adjust enum_arg_to_value caller. * doc/options.texi: Document Set and EnumSet properties.
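The per-set masking described in this commit message can be sketched in C as follows. This models the semantics only and is not the code in opts-common.cc: the helper apply_enum_set_value, the CONVERT_* names and the two mask macros are hypothetical. The numeric values follow the message (first set 0-3 with mask 3, second set 4 and 8 with mask 12, big-endian being 2); the remaining name-to-number assignments are assumptions.

```c
/* Enumerator values for the two mutually exclusive sets.  */
enum
{
  CONVERT_NATIVE = 0,   /* Set(1), assumed value */
  CONVERT_SWAP = 1,     /* Set(1), assumed value */
  CONVERT_BIG = 2,      /* Set(1), value stated in the message */
  CONVERT_LITTLE = 3,   /* Set(1), assumed value */
  CONVERT_R16_IEEE = 4, /* Set(2), assumed to be 4 rather than 8 */
  CONVERT_R16_IBM = 8   /* Set(2), assumed to be 8 rather than 4 */
};

#define ENDIAN_SET_MASK 3u /* values of Set(1) ORed together */
#define R16_SET_MASK 12u   /* values of Set(2) ORed together */

/* Apply one enumerator: replace only the bits belonging to its set and
   leave the bits of every other set untouched.  */
unsigned int
apply_enum_set_value (unsigned int flags, unsigned int value,
                      unsigned int set_mask)
{
  return (flags & ~set_mask) | value;
}
```

Because each call touches only its own mask, applying big-endian and r16_ieee in either order yields the same combined value, while a second value from the same set overrides the first.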
2022-01-21PR middle-end/104140: bootstrap ICE on riscv.Roger Sayle1-0/+9
This patch resolves the P1 "ice-on-valid-code" regression bootstrapping GCC on riscv-unknown-linux-gnu caused by my recent MULT_HIGHPART_EXPR functionality. RISC-V differs from x86_64 and many targets by supporting a usmulsidi3 instruction, basically a widening multiply where one operand is signed and the other is unsigned. Alas the final version of my patch to recognize MULT_HIGHPART_EXPR didn't sufficiently defend against the operands of WIDEN_MULT_EXPR having different signedness. This is fixed by the two-line change to tree-ssa-math-opts.cc's convert_mult_to_highpart in the patch below. The majority of the rest of the patch is to the documentation (in tree.def and generic.texi). It turns out that WIDEN_MULT_EXPR wasn't previously documented in generic.texi, let alone the slightly unusual semantics of allowing mismatched (signed vs unsigned) operands. This also clarifies that MULT_HIGHPART_EXPR currently requires the signedness of operands to match [but this might change in a future release of GCC to support targets with usmul<mode>3_highpart]. The one final chunk of this patch (that is hopefully sufficiently close to obvious for stage 4) is a similar (NULL pointer) sanity check in riscv_cpu_cpp_builtins. Currently running cc1 from the command line (or from gdb) without specifying -march results in a segmentation fault (ICE). This is a minor annoyance when tracking down issues (in cross compilers) for riscv, and trivially fixed as below. 2022-01-22 Roger Sayle <roger@nextmovesoftware.com> gcc/ChangeLog PR middle-end/104140 * tree-ssa-math-opts.cc (convert_mult_to_highpart): Check that the operands of the widening multiplication are either both signed or both unsigned, and abort the conversion if mismatched. * doc/generic.texi (WIDEN_MULT_EXPR): Describe expression node. (MULT_HIGHPART_EXPR): Clarify that operands must have the same signedness. * tree.def (MULT_HIGHPART_EXPR): Document both operands must have integer types with the same precision and signedness. 
(WIDEN_MULT_EXPR): Document that operands must have integer types with the same precision, but possibly differing signedness. * config/riscv/riscv-c.cc (riscv_cpu_cpp_builtins): Defend against riscv_current_subset_list returning a NULL pointer (empty list). gcc/testsuite/ChangeLog PR middle-end/104140 * gcc.target/riscv/pr104140.c: New test case.
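Why operand signedness matters here can be seen from a small C sketch of the three operations at the source level. The function names are hypothetical, and these helpers only illustrate the arithmetic; they are not how GCC represents MULT_HIGHPART_EXPR or WIDEN_MULT_EXPR internally.

```c
#include <stdint.h>

/* High 32 bits of an unsigned 32x32 -> 64 multiply.  */
uint32_t
umul_highpart (uint32_t a, uint32_t b)
{
  return (uint32_t) (((uint64_t) a * b) >> 32);
}

/* High 32 bits of a signed 32x32 -> 64 multiply.  Right-shifting a
   negative value is implementation-defined in ISO C; GCC performs an
   arithmetic shift, which is what this sketch relies on.  */
int32_t
smul_highpart (int32_t a, int32_t b)
{
  return (int32_t) (((int64_t) a * b) >> 32);
}

/* Widening multiply with mismatched signedness: a is signed, b is
   unsigned.  Every product fits in int64_t.  */
int64_t
usmul_widen (int32_t a, uint32_t b)
{
  return (int64_t) a * (int64_t) b;
}
```

For the bit pattern 0xFFFFFFFF in both operands, the unsigned high part is 0xFFFFFFFE, the signed high part (both operands read as -1) is 0, and the mixed product is -4294967295, so treating a mixed-sign widening multiply as a same-sign high-part multiply would give a wrong result.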
2022-01-21[ARM] Add support for TLS register based stack protector canary accessArd Biesheuvel1-0/+11
Add support for accessing the stack canary value via the TLS register, so that multiple threads running in the same address space can use distinct canary values. This is intended for the Linux kernel running in SMP mode, where processes entering the kernel are essentially threads running the same program concurrently: using a global variable for the canary in that context is problematic because it can never be rotated, and so the OS is forced to use the same value as long as it remains up. Using the TLS register to index the stack canary helps with this, as it allows each CPU to context switch the TLS register along with the rest of the process, permitting each process to use its own value for the stack canary. gcc/ChangeLog: * config/arm/arm-opts.h (enum stack_protector_guard): New. * config/arm/arm-protos.h (arm_stack_protect_tls_canary_mem): New. * config/arm/arm.cc (TARGET_STACK_PROTECT_GUARD): Define. (arm_option_override_internal): Handle and put in error checks for stack protector guard options. (arm_option_reconfigure_globals): Likewise. (arm_stack_protect_tls_canary_mem): New. (arm_stack_protect_guard): New. * config/arm/arm.md (stack_protect_set): New. (stack_protect_set_tls): Likewise. (stack_protect_test): Likewise. (stack_protect_test_tls): Likewise. (reload_tp_hard): Likewise. * config/arm/arm.opt (-mstack-protector-guard): New. (-mstack-protector-guard-offset): New. * doc/invoke.texi: Document new options. gcc/testsuite/ChangeLog: * gcc.target/arm/stack-protector-7.c: New test. * gcc.target/arm/stack-protector-8.c: New test.
2022-01-20Merge branch 'master' into devel/sphinxMartin Liska1-0/+14
2022-01-20arm: Add option for mitigating against Cortex-A CPU erratum for AESRichard Earnshaw1-0/+11
Add a new option -mfix-cortex-a-aes for enabling the Cortex-A AES erratum work-around and enable it automatically for the affected products (Cortex-A57 and Cortex-A72). gcc/ChangeLog: * config/arm/arm-cpus.in (quirk_aes_1742098): New quirk feature. (ALL_QUIRKS): Add it. (cortex-a57, cortex-a72): Enable it. (cortex-a57.cortex-a53, cortex-a72.cortex-a53): Likewise. * config/arm/arm.opt (mfix-cortex-a57-aes-1742098): New command-line option. (mfix-cortex-a72-aes-1655431): New option alias. * config/arm/arm.cc (arm_option_override): Handle default settings for AES erratum switch. * doc/invoke.texi (Arm Options): Document new options.
2022-01-18Limit the number of relations registered per basic block.Andrew MacLeod1-0/+3
In pathological cases, the number of transitive relations being added is potentially quadratic. Lookups for relations in a block are linear in nature, so simply limit the number of relations to some reasonable number. PR tree-optimization/104038 * doc/invoke.texi (relation-block-limit): New. * params.opt (relation-block-limit): New. * value-relation.cc (dom_oracle::register_relation): Check for NULL record before invoking transitive registration. (dom_oracle::set_one_relation): Check limit before creating record. (dom_oracle::register_transitives): Stop when no record created. * value-relation.h (relation_chain_head::m_num_relations): New.
2022-01-18Merge branch 'master' into devel/sphinxMartin Liska23-258/+459
2022-01-18Update prerequisites for GNATArnaud Charlet1-1/+1
* doc/install.texi: Update prerequisites for GNAT