path: root/gcc
Age  Commit message  Author  Files  Lines
2023-06-15configure: Implement --enable-host-pieMarek Polacek8-70/+130
[ This is my third attempt to add this configure option. The first version was approved but it came too late in the development cycle. The second version was also approved, but I had to revert it: <https://gcc.gnu.org/pipermail/gcc-patches/2022-November/607082.html>. I've fixed the problem (by moving $(PICFLAG) from INTERNAL_CFLAGS to ALL_COMPILERFLAGS). Another change is that since r13-4536 I no longer need to touch Makefile.def, so this patch is simplified. ] This patch implements the --enable-host-pie configure option which makes the compiler executables PIE. This can be used to enhance protection against ROP attacks, and can be viewed as part of a wider trend to harden binaries. It is similar to the option --enable-host-shared, except that --e-h-s won't add -shared to the linker flags whereas --e-h-p will add -pie. It is different from --enable-default-pie because that option just adds an implicit -fPIE/-pie when the compiler is invoked, but the compiler itself isn't PIE. Since r12-5768-gfe7c3ecf, PCH works well with PIE, so there are no PCH regressions. When building the compiler, the build process may use various in-tree libraries; these need to be built with -fPIE so that it's possible to use them when building a PIE. For instance, when --with-included-gettext is in effect, intl object files must be compiled with -fPIE. Similarly, when building in-tree gmp, isl, mpfr and mpc, they must be compiled with -fPIE. 
With this patch and --enable-host-pie used to configure gcc: $ file gcc/cc1{,plus,obj,gm2} gcc/f951 gcc/lto1 gcc/cpp gcc/go1 gcc/rust1 gcc/gnat1 gcc/cc1: ELF 64-bit LSB pie executable, x86-64, version 1 (GNU/Linux), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=98e22cde129d304aa6f33e61b1c39e144aeb135e, for GNU/Linux 3.2.0, with debug_info, not stripped gcc/cc1plus: ELF 64-bit LSB pie executable, x86-64, version 1 (GNU/Linux), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=859d1ea37e43dfe50c18fd4e3dd9a34bb1db8f77, for GNU/Linux 3.2.0, with debug_info, not stripped gcc/cc1obj: ELF 64-bit LSB pie executable, x86-64, version 1 (GNU/Linux), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=1964f8ecee6163182bc26134e2ac1f324816e434, for GNU/Linux 3.2.0, with debug_info, not stripped gcc/cc1gm2: ELF 64-bit LSB pie executable, x86-64, version 1 (GNU/Linux), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=a396672c7ff913d21855829202e7b02ecf42ff4c, for GNU/Linux 3.2.0, with debug_info, not stripped gcc/f951: ELF 64-bit LSB pie executable, x86-64, version 1 (GNU/Linux), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=59c523db893186547ac75c7a71f48be0a461c06b, for GNU/Linux 3.2.0, with debug_info, not stripped gcc/lto1: ELF 64-bit LSB pie executable, x86-64, version 1 (GNU/Linux), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=084a7b77df7be2d63c2d4c655b5bbc3fcdb6038d, for GNU/Linux 3.2.0, with debug_info, not stripped gcc/cpp: ELF 64-bit LSB pie executable, x86-64, version 1 (GNU/Linux), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=3503bf8390d219a10d6653b8560aa21158132168, for GNU/Linux 3.2.0, with debug_info, not stripped gcc/go1: ELF 64-bit LSB pie executable, x86-64, version 1 (GNU/Linux), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, 
BuildID[sha1]=988cc673af4fba5dcb482f4b34957b99050a68c5, for GNU/Linux 3.2.0, with debug_info, not stripped gcc/rust1: ELF 64-bit LSB pie executable, x86-64, version 1 (GNU/Linux), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=b6a5d3d514446c4dcdee0707f086ab9b274a8a3c, for GNU/Linux 3.2.0, with debug_info, not stripped gcc/gnat1: ELF 64-bit LSB pie executable, x86-64, version 1 (GNU/Linux), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=bb11ccdc2c366fe3fe0980476bcd8ca19b67f9dc, for GNU/Linux 3.2.0, with debug_info, not stripped I plan to add an option to link with -Wl,-z,now. Bootstrapped on x86_64-pc-linux-gnu with --with-included-gettext --enable-host-pie as well as without --enable-host-pie. Also tested on a Debian system where the system gcc was configured with --enable-default-pie. Co-Authored by: Iain Sandoe <iain@sandoe.co.uk> ChangeLog: * configure.ac (--enable-host-pie): New check. Set PICFLAG after this check. * configure: Regenerate. c++tools/ChangeLog: * Makefile.in: Rename PIEFLAG to PICFLAG. Set LD_PICFLAG. Use it. Use pic/libiberty.a if PICFLAG is set. * configure.ac (--enable-default-pie): Set PICFLAG instead of PIEFLAG. (--enable-host-pie): New check. * configure: Regenerate. fixincludes/ChangeLog: * Makefile.in: Set and use PICFLAG and LD_PICFLAG. Use the "pic" build of libiberty if PICFLAG is set. * configure.ac: * configure: Regenerate. gcc/ChangeLog: * Makefile.in: Set LD_PICFLAG. Use it. Set enable_host_pie. Remove NO_PIE_CFLAGS and NO_PIE_FLAG. Pass LD_PICFLAG to ALL_LINKERFLAGS. Use the "pic" build of libiberty if --enable-host-pie. * configure.ac (--enable-host-shared): Don't set PICFLAG here. (--enable-host-pie): New check. Set PICFLAG and LD_PICFLAG after this check. * configure: Regenerate. * doc/install.texi: Document --enable-host-pie. gcc/ada/ChangeLog: * gcc-interface/Make-lang.in (ALL_ADAFLAGS): Remove NO_PIE_CFLAGS. Add PICFLAG. 
Use PICFLAG when building ada/b_gnat1.o and ada/b_gnatb.o. * gcc-interface/Makefile.in: Use pic/libiberty.a if PICFLAG is set. Remove NO_PIE_FLAG. gcc/m2/ChangeLog: * Make-lang.in: New var, GM2_PICFLAGS. Use it. gcc/d/ChangeLog: * Make-lang.in: Remove NO_PIE_CFLAGS. intl/ChangeLog: * Makefile.in: Use @PICFLAG@ in COMPILE as well. * configure.ac (--enable-host-shared): Don't set PICFLAG here. (--enable-host-pie): New check. Set PICFLAG after this check. * configure: Regenerate. libcody/ChangeLog: * Makefile.in: Pass LD_PICFLAG to LDFLAGS. * configure.ac (--enable-host-shared): Don't set PICFLAG here. (--enable-host-pie): New check. Set PICFLAG and LD_PICFLAG after this check. * configure: Regenerate. libcpp/ChangeLog: * configure.ac (--enable-host-shared): Don't set PICFLAG here. (--enable-host-pie): New check. Set PICFLAG after this check. * configure: Regenerate. libdecnumber/ChangeLog: * configure.ac (--enable-host-shared): Don't set PICFLAG here. (--enable-host-pie): New check. Set PICFLAG after this check. * configure: Regenerate. libiberty/ChangeLog: * configure.ac: Also set shared when enable_host_pie. * configure: Regenerate. zlib/ChangeLog: * configure.ac (--enable-host-shared): Don't set PICFLAG here. (--enable-host-pie): New check. Set PICFLAG after this check. * configure: Regenerate.
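The `file` output quoted in the message above reports "pie executable" for each compiler binary. As a rough illustration of what that check rests on, the sketch below reads the `e_type` field of an ELF header from a byte buffer. The names `sketch_elf_type` and `sketch_looks_position_independent` are hypothetical and are not part of the patch. Note the simplification: `ET_DYN` is also used for ordinary shared libraries, so tools that print "pie executable" look at more than this one field.

```c
#include <assert.h>
#include <stddef.h>

/* Values from the generic ELF specification (prefixed to avoid
   clashing with <elf.h>).  */
enum { SKETCH_EI_DATA = 5, SKETCH_DATA_LSB = 1, SKETCH_DATA_MSB = 2 };
enum { SKETCH_ET_EXEC = 2, SKETCH_ET_DYN = 3 };

/* Return the e_type field of the ELF header in BUF, or -1 if BUF does
   not look like an ELF header.  e_type is the 16-bit field at byte
   offset 16 in both the 32-bit and the 64-bit header layout.  */
int
sketch_elf_type (const unsigned char *buf, size_t len)
{
  assert (buf != NULL);
  if (len < 18)
    return -1;
  if (buf[0] != 0x7f || buf[1] != 'E' || buf[2] != 'L' || buf[3] != 'F')
    return -1;
  if (buf[SKETCH_EI_DATA] == SKETCH_DATA_LSB)
    return buf[16] | (buf[17] << 8);
  if (buf[SKETCH_EI_DATA] == SKETCH_DATA_MSB)
    return (buf[16] << 8) | buf[17];
  return -1;
}

/* Hypothetical helper: nonzero if the header type is ET_DYN, which is
   what a PIE (or a shared library) carries; a non-PIE executable
   carries ET_EXEC instead.  */
int
sketch_looks_position_independent (const unsigned char *buf, size_t len)
{
  return sketch_elf_type (buf, len) == SKETCH_ET_DYN;
}
```

To try it on a real binary, read the first 18 bytes of, say, gcc/cc1 into a buffer and pass them to `sketch_looks_position_independent`.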
2023-06-15cprop_hardreg: Enable propagation of the stack pointer if possibleManolis Tsamis1-1/+6
Propagation of the stack pointer in cprop_hardreg is currently forbidden in all cases, due to maybe_mode_change returning NULL. Relax this restriction and allow propagation when no mode change is requested. gcc/ChangeLog: * regcprop.cc (maybe_mode_change): Enable stack pointer propagation.
2023-06-15Add another testcase for PR 110266Andrew Pinski1-0/+9
Since the combining of sin/cos into cexpi is dependent on the target, this adds another testcase which had failed (earlier in evrp rather than vrp2) that will fail on all targets rather than only on ones which have sincos or C99 math functions. Committed as obvious after a quick test. gcc/testsuite/ChangeLog: PR tree-optimization/110266 * gcc.c-torture/compile/pr110266.c: New test.
2023-06-15Check for integer only complex.Andrew MacLeod2-2/+24
With the expanded capabilities of range-op dispatch, floating point complex objects can appear when folding, which they couldn't before. In the processing for extracting integers from complex ints, make sure it is an integer complex. PR tree-optimization/110266 gcc/ * gimple-range-fold.cc (adjust_imagpart_expr): Check for integer complex type. (adjust_realpart_expr): Ditto. gcc/testsuite/ * gcc.dg/pr110266.c: New.
2023-06-15libcpp: Diagnose #include after failed __has_include [PR80753]Jakub Jelinek1-0/+15
As can be seen in the testcase, we don't diagnose #include/#include_next of a non-existent header if __has_include/__has_include_next is done for that header first. The problem is that we normally error the first time some header is not found, but in the _cpp_FFK_HAS_INCLUDE case obviously don't want to diagnose it, just expand it to 0. And libcpp caches both successful includes and unsuccessful ones. The following patch fixes that by remembering that we haven't diagnosed error when using __has_include* on it, and diagnosing it when using the cache entry in normal mode the first time. I think _cpp_FFK_NORMAL is the only mode in which we normally diagnose errors, for _cpp_FFK_PRE_INCLUDE that open_file_failed isn't reached and for _cpp_FFK_FAKE neither. 2023-06-15 Jakub Jelinek <jakub@redhat.com> PR preprocessor/80753 libcpp/ * files.cc (struct _cpp_file): Add deferred_error bitfield. (_cpp_find_file): When finding a file in cache with deferred_error set in _cpp_FFK_NORMAL mode, call open_file_failed and clear the flag. Set deferred_error in _cpp_FFK_HAS_INCLUDE mode if open_file_failed hasn't been called. gcc/testsuite/ * c-c++-common/missing-header-5.c: New test.
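For readers unfamiliar with the construct, here is a minimal sketch of the usual guarded `__has_include` idiom. The header name `sketch-no-such-header.h` and the macro and function names are made up for illustration and do not come from the new test. The comment in the middle shows the unguarded shape this commit is concerned with, in which the later `#include` is expected to be diagnosed even though `__has_include` already looked up (and cached) the missing file.

```c
/* __has_include is a preprocessor operator (standardized in C23 and
   C++17).  Test for it separately so that older preprocessors do not
   choke on the inner #if.  */
#if defined(__has_include)
#  if __has_include("sketch-no-such-header.h")
#    include "sketch-no-such-header.h"
#    define SKETCH_HAVE_HEADER 1
#  endif
#endif

#ifndef SKETCH_HAVE_HEADER
#  define SKETCH_HAVE_HEADER 0
#endif

/* The shape the fix is about (left in a comment because it is meant
   to produce an error):

     #if __has_include("sketch-no-such-header.h")
     #endif
     #include "sketch-no-such-header.h"

   With the patch, the #include is diagnosed the first time the cached
   "not found" entry is used in normal mode.  */

/* Report whether the optional header was found at compile time.  */
int
sketch_have_optional_header (void)
{
  return SKETCH_HAVE_HEADER;
}
```

The guarded form compiles cleanly whether or not the header exists, which is why it is the shape normally used in portable code.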
2023-06-15x86/AVX512: use VMOVDDUP for broadcast to V2DFJan Beulich1-2/+2
As is already the case for the AVX/AVX2 form, VMOVDDUP - acting on double precision floating-point values - is more appropriate to use here, and it can also result in shorter insn encodings when the source is memory or %xmm0...%xmm7, and no masking is applied (in allowing a 2-byte VEX prefix then instead of a 3-byte one). gcc/ * config/i386/sse.md (<avx512>_vec_dup<mode><mask_name>): Use vmovddup.
2023-06-15x86: add Bk and Br to comment list B's sub-charsJan Beulich1-0/+2
gcc/ * config/i386/constraints.md: Mention k and r for B.
2023-06-15LoongArch: Avoid non-returning indirect jumps through $ra [PR110136]Lulu Cheng1-2/+6
The micro-architecture unconditionally treats a "jr $ra" as "return from subroutine", hence doing "jr $ra" would interfere with both subroutine return prediction and the more general indirect branch prediction. Therefore, a problem like PR110136 can cause a significant increase in the branch misprediction rate and affect performance. The same problem exists with "indirect_jump". gcc/ChangeLog: PR target/110136 * config/loongarch/loongarch.md: Modify the register constraints for template "jumptable" and "indirect_jump" from "r" to "e". Co-authored-by: Andrew Pinski <apinski@marvell.com>
2023-06-15ada: Remove unused filesMarc Poulhiès6-28/+0
gcc/ada/ChangeLog: * vxworks7-cert-rtp-base-link.spec: Removed. * vxworks7-cert-rtp-base-link__ppc64.spec: Removed. * vxworks7-cert-rtp-base-link__x86.spec: Removed. * vxworks7-cert-rtp-base-link__x86_64.spec: Removed. * vxworks7-cert-rtp-link.spec: Removed. * vxworks7-cert-rtp-link__ppcXX.spec: Removed.
2023-06-15ada: Fix wrong code for ACATS cd1c03i on Morello targetEric Botcazou1-2/+6
gcc/ada/ * gcc-interface/utils2.cc (build_binary_op) <MODIFY_EXPR>: Do not remove a VIEW_CONVERT_EXPR on the LHS if it is also on the RHS.
2023-06-15ada: Fix wrong finalization for double subtype of bounded vectorEric Botcazou1-4/+10
The special handling of temporaries created for return values and subject to a renaming needs to be restricted to the top level, where it is needed to prevent dangling references to the frame of the elaboration routine from being created, because, at a lower level, the front-end may create implicit renamings of objects as these temporaries, so a copy is not allowed. gcc/ada/ * gcc-interface/decl.cc (gnat_to_gnu_entity) <E_Variable>: Restrict the special handling of temporaries created for return values and subject to a renaming to the top level.
2023-06-15ada: Make minor improvements to user's guideRonan Desplanques3-37/+36
gcc/ada/ * doc/gnat_ugn/about_this_guide.rst: Fix typo. Uniformize punctuation. * doc/gnat_ugn/the_gnat_compilation_model.rst: Uniformize punctuation. Fix capitalization. Fix indentation of code block. Fix RST formatting syntax errors. * gnat_ugn.texi: Regenerate.
2023-06-15ada: Reject Loop_Entry inside prefix of Loop_EntryYannick Moy1-3/+18
This rule was incompletely stated in SPARK RM and not checked. This is now fixed. gcc/ada/ * sem_attr.adb (Analyze_Attribute): Reject case of Loop_Entry inside the prefix of Loop_Entry, as per SPARK RM 5.5.3.1(4,8).
2023-06-15ada: Fix too small secondary stack allocation for returned conversionEric Botcazou2-30/+63
The previous fix did not address a latent issue whereby the allocation would be made using the (static) subtype of the conversion instead of the (dynamic) subtype of the return object, so this change rewrites the code responsible for determining the type used for the allocation, and also contains a small improvement to the Has_Tag_Of_Type predicate. gcc/ada/ * exp_ch3.adb (Make_Allocator_For_Return): Rewrite the logic that determines the type used for the allocation and add assertions. * exp_util.adb (Has_Tag_Of_Type): Also return true for extension aggregates.
2023-06-15ada: Fix internal error on loop iterator filter with -gnatVaEric Botcazou3-4/+13
The problem is that the condition of the iterator filter is expanded early, before it is integrated into an if statement of the loop body, so there is no place to attach the actions generated by this expansion. This happens only for simple loops, i.e. with a parameter specification, so the fix uses the same approach for them as for loops based on iterators. gcc/ada/ * sinfo.ads (Iterator_Filter): Document field. * sem_ch5.adb (Analyze_Iterator_Specification): Move comment around. (Analyze_Loop_Parameter_Specification): Only preanalyze the iterator filter, if any. * exp_ch5.adb (Expand_N_Loop_Statement): Analyze the new list built when an iterator filter is present.
2023-06-15ada: Revert latest change to Find_Hook_ContextEric Botcazou1-10/+0
The issue is that, if an aggregate is both below a conditional expression and above another conditional expression in the tree, we have currently no place to put the finalization actions generated by the innermost expression in the context of the aggregate before it is expanded, so they end up being placed after the outermost expression. But it is not clear whether that's really problematic because this does not seem to happen for array aggregates with multiple or others choices: in this case the aggregate is expanded first and the code path is not taken. gcc/ada/ * exp_util.adb (Find_Hook_Context): Revert latest change.
2023-06-15ada: Fix too small secondary stack allocation for returned aggregateEric Botcazou1-3/+14
This restores the specific treatment of aggregates that are returned through an extended return statement in a function returning a class-wide type, and which was incorrectly dropped in an earlier change. gcc/ada/ * exp_ch3.adb (Make_Allocator_For_Return): Deal again specifically with an aggregate returned through an object of a class-wide type.
2023-06-15ada: Remove dead code in Expand_Iterator_Loop_Over_ContainerEric Botcazou1-13/+4
The Condition_Actions field can only be populated for while loops. gcc/ada/ * exp_ch5.adb (Expand_Iterator_Loop_Over_Container): Do not insert an always empty list. Remove unused parameter Isc. (Expand_Iterator_Loop): Adjust call to above procedure.
2023-06-15ada: Add escape hatch to configurable run-timeRonan Desplanques2-0/+12
Before this patch, the fact that Restrictions pragmas had to fit on a single line in system.ads was difficult to reconcile with the 80-character line limit that is enforced in that file. The special rules for pragmas in system.ads made it impossible to use the Style_Checks pragma to allow long Restrictions pragmas. This patch relaxes those rules so the Style_Checks pragma can be used in system.ads. gcc/ada/ * targparm.adb: Allow pragma Style_Checks in some forms. * targparm.ads: Document new pragma permission.
2023-06-15ada: Fix missing finalization for aggregates nested in conditional expressionsEric Botcazou2-1/+23
The finalization actions for the components of the aggregates are blocked by Expand_Ctrl_Function_Call, which sets Is_Ignored_Transient on all the temporaries generated from within a conditional expression whatever the intermediate constructs. Now aggregates and their expansion in the form of block and loop statements are "impenetrable" as far as temporaries are concerned, i.e. the lifetime of temporaries generated within them does not extend beyond them, so their finalization must not be blocked there. gcc/ada/ * exp_util.ads (Within_Case_Or_If_Expression): Adjust description. * exp_util.adb (Find_Hook_Context): Stop the search for the topmost conditional expression, if within one, at contexts where temporaries may be contained. (Within_Case_Or_If_Expression): Return false upon first encountering contexts where temporaries may be contained.
2023-06-15ada: Adjust QNX Ada priorities to match QNX system prioritiesJohannes Kliemann2-9/+7
The Ada priority range of the QNX runtime started from 0, differing from the QNX system priorities range starting from 1. As this may cause confusion, especially if used in a mixed language environment, the Ada priority range now starts at 1. The default priority of Ada tasks as mandated is the middle of the priority range. On QNX this means the default priority of Ada tasks is 30. This is much higher than the default QNX priority of 10 and may cause unexpected system interruptions when Ada tasks take a lot of CPU time. gcc/ada/ * libgnarl/s-osinte__qnx.adb: Adjust priority conversion function. * libgnat/system-qnx-arm.ads: Adjust priority range and default priority.
2023-06-15ada: Adjust comments in targparm.adsRonan Desplanques1-20/+5
This patch removes a few dangling references to the late front-end implementation of exceptions from the comments of targparm.ads, and also fixes a thinko there. gcc/ada/ * targparm.ads: Remove references to front-end-based exceptions. Fix thinko.
2023-06-15ada: Accept aspect Always_Terminates on packagesPiotr Trojanek2-2/+21
The recently added aspect Always_Terminates is now allowed on packages and generic packages, but only when it has no arguments. The intuitive meaning is that all subprograms declared in such a package are always terminating. gcc/ada/ * contracts.adb (Add_Contract_Item): Add pragma Always_Terminates to package contract. * sem_prag.adb (Analyze_Pragma): Accept pragma Always_Terminates on packages and generic packages, but only when it has no arguments.
2023-06-15ada: Accept aspect Always_Terminates on entriesPiotr Trojanek1-0/+5
The recently added aspect Always_Terminates is allowed on both procedures and entries. gcc/ada/ * sem_prag.adb (Analyze_Pragma): Accept pragma Always_Terminates when it applies to an entry.
2023-06-15ada: Reject aspect Always_Terminates on functions and generic functionsPiotr Trojanek1-0/+13
The recently added aspect Always_Terminates is only allowed on procedures. gcc/ada/ * sem_prag.adb (Analyze_Pragma): Reject pragma Always_Terminates when it applies to a function or generic function.
2023-06-15ada: Fix missing error on function call returning incomplete viewEric Botcazou1-0/+6
Testing for the presence of Non_Limited_View is not sufficient to detect whether the nonlimited view has been analyzed because Build_Limited_Views always sets the field on the limited view. Instead the discriminant is whether this nonlimited view is itself an incomplete type. gcc/ada/ * sem_ch4.adb (Analyze_Call): Adjust the test to detect the presence of an incomplete view of a type on a function call.
2023-06-15ada: Fix minor issues in commentsRonan Desplanques1-3/+2
The package Ttypef has been removed but a reference to it was left over in a comment. This patch removes that reference, and also fixes a typo. gcc/ada/ * ttypes.ads: Remove reference to Ttypef in comment. Fix typo in comment.
2023-06-15ada: Remove Ttypes.Max_Unaligned_FieldEric Botcazou6-28/+2
This constant has been unused for ages. The corresponding getter function is also removed from the Get_Targ package, but the corresponding constant declared in Set_Targ is preserved for the sake of backward compatibility of the target file format. gcc/ada/ * get_targ.ads (Get_Max_Unaligned_Field): Delete. * ada_get_targ.adb (Get_Max_Unaligned_Field): Likewise. * get_targ.adb (Get_Max_Unaligned_Field): Likewise. * set_targ.ads (Max_Unaligned_Field): Adjust comment. * set_targ.adb: Set Max_Unaligned_Field to 1 during elaboration. * ttypes.ads (Max_Unaligned_Field): Delete.
2023-06-15ada: Fix inverted implementation of RM 8.4(10) clause for operatorsEric Botcazou1-1/+1
The comment is correct but the code implements the opposite outcome. gcc/ada/ * sem_type.adb (Disambiguate): Fix pasto in the implementation of the RM 8.4(10) clause for operators.
2023-06-15ada: Accept aspect Always_Terminates without expressionPiotr Trojanek3-51/+53
The recently added aspect Always_Terminates is now accepted without explicit boolean expression, where a missing expression implicitly means True, similar to aspects Async_Readers, Async_Writers, etc. gcc/ada/ * aspects.adb (Base_Aspect): Fix layout. * aspects.ads (Aspect_Argument): Expression for Always_Terminates is optional. * sem_prag.adb (Analyze_Always_Terminates_In_Decl_Part): Only analyze expression when pragma argument is present. (Analyze_Pragma): Argument for Always_Terminates is optional; fix whitespace for Async_Readers.
2023-06-15ada: Crash on C++ constructor of private typeJavier Miranda1-2/+18
The compiler crashes compiling a function that has pragma CPP_constructor when its return type is a private type. gcc/ada/ * sem_util.adb (Is_CPP_Constructor_Call): Add missing support for calls to functions returning a private type.
2023-06-15ada: Remove obsolete references for Build_Transient_Object_StatementsEric Botcazou1-3/+3
gcc/ada/ * exp_util.ads (Build_Transient_Object_Statements): Remove obsolete references to array and record aggregates in documentation.
2023-06-15ada: Fix aspect Linker_Section ignored on subprogram bodyEric Botcazou1-12/+22
The compiler is waiting for the freeze node of the body, but it is never generated since the freezing of the body is not delayed. The change also removes an obsolete piece of code. gcc/ada/ * sem_ch13.adb (Analyze_Aspect_Specifications): Add missing items in the list of aspects handled by means of Insert_Pragma. <Aspect_Linker_Section>: Remove obsolete code. Do not delay the processing of the aspect if the entity is already frozen.
2023-06-15ada: Cleanup analysis of iterated component associationPiotr Trojanek1-7/+5
Cleanups related to analysis of iterated component association for GNATprove. gcc/ada/ * sem_aggr.adb (Resolve_Array_Aggregate): Simplify comment. (Resolve_Iterated_Component_Association): Tune comment; change variable to constant.
2023-06-15LoongArch: Set default alignment for functions and labels with -mtuneXi Ruoyao4-0/+27
The LA464 micro-architecture is sensitive to alignment of code. The Loongson team has benchmarked various combinations of function and label alignment; the results [1] show that 16-byte label alignment together with 32-byte function alignment gives the best results in terms of SPEC score. Add a mtune-based table-driven mechanism to set the default of -falign-{functions,labels}. As LA464 is the first (and the only for now) uarch supported by GCC, the same setting is also used for the "generic" -mtune=loongarch64. In the future we may set different settings for LA{2,3,6}64 once we add the support for them. Bootstrapped and regtested on loongarch64-linux-gnu. Ok for trunk? gcc/ChangeLog: * config/loongarch/loongarch-tune.h (loongarch_align): New struct. * config/loongarch/loongarch-def.h (loongarch_cpu_align): New array. * config/loongarch/loongarch-def.c (loongarch_cpu_align): Define the array. * config/loongarch/loongarch.cc (loongarch_option_override_internal): Set the value of -falign-functions= if -falign-functions is enabled but no value is given. Likewise for -falign-labels=.
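The table-driven default described above could look roughly like the following sketch. Everything prefixed `sketch_` or `SKETCH_` is hypothetical: the actual struct layout, enumerators, and option plumbing in the LoongArch backend may differ. Only the 32-byte function and 16-byte label values are taken from the commit message.

```c
#include <stddef.h>

/* Hypothetical tuning identifiers; the real enumeration lives in the
   LoongArch backend.  */
enum sketch_tune
{
  SKETCH_TUNE_LOONGARCH64,	/* "generic" -mtune=loongarch64 */
  SKETCH_TUNE_LA464,
  SKETCH_N_TUNES
};

/* Per-tune alignment defaults, kept as strings in the same form a
   user would pass to -falign-functions= and -falign-labels=.  */
struct sketch_align
{
  const char *function;
  const char *label;
};

static const struct sketch_align sketch_cpu_align[SKETCH_N_TUNES] = {
  [SKETCH_TUNE_LOONGARCH64] = { "32", "16" },
  [SKETCH_TUNE_LA464]       = { "32", "16" },
};

/* Return the value to use for function alignment: an explicit user
   value wins, otherwise fall back to the table entry for TUNE.  */
const char *
sketch_function_align (enum sketch_tune tune, const char *user_value)
{
  if (user_value != NULL && user_value[0] != '\0')
    return user_value;
  return sketch_cpu_align[tune].function;
}

/* Likewise for label alignment.  */
const char *
sketch_label_align (enum sketch_tune tune, const char *user_value)
{
  if (user_value != NULL && user_value[0] != '\0')
    return user_value;
  return sketch_cpu_align[tune].label;
}
```

Keeping the defaults in a table indexed by the tune target means that supporting a new micro-architecture later only requires adding a row.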
2023-06-15Fix 'dg-warning' in 'c-c++-common/Wfree-nonheap-object-3.c' for C++Thomas Schwinge1-1/+1
[...]/c-c++-common/Wfree-nonheap-object-3.c:57:24: warning: 'malloc (dealloc_float)' attribute ignored with deallocation functions declared 'inline' [-Wattributes] [...]/c-c++-common/Wfree-nonheap-object-3.c:51:1: note: deallocation function declared here [...]/c-c++-common/Wfree-nonheap-object-3.c: In function 'void test_nowarn_int(int)': [...]/c-c++-common/Wfree-nonheap-object-3.c:25:20: warning: 'void __builtin_free(void*)' called on pointer 'p' with nonzero offset 4 [-Wfree-nonheap-object] [...]/c-c++-common/Wfree-nonheap-object-3.c:24:24: note: returned from 'int* alloc_int(int)' [...]/c-c++-common/Wfree-nonheap-object-3.c: In function 'void test_nowarn_long(int)': [...]/c-c++-common/Wfree-nonheap-object-3.c:45:18: warning: 'void dealloc_long(long int*)' called on pointer '<unknown>' with nonzero offset 8 [-Wfree-nonheap-object] [...]/c-c++-common/Wfree-nonheap-object-3.c:44:26: note: returned from 'long int* alloc_long(int)' In function 'void dealloc_float(float*)', inlined from 'void test_nowarn_float(int)' at [...]/c-c++-common/Wfree-nonheap-object-3.c:68:19: [...]/c-c++-common/Wfree-nonheap-object-3.c:53:18: warning: 'void __builtin_free(void*)' called on pointer '<unknown>' with nonzero offset 8 [-Wfree-nonheap-object] [...]/c-c++-common/Wfree-nonheap-object-3.c: In function 'void test_nowarn_float(int)': [...]/c-c++-common/Wfree-nonheap-object-3.c:67:28: note: returned from 'float* alloc_float(int)' PASS: c-c++-common/Wfree-nonheap-object-3.c -std=gnu++98 (test for warnings, line 25) FAIL: c-c++-common/Wfree-nonheap-object-3.c -std=gnu++98 (test for warnings, line 45) PASS: c-c++-common/Wfree-nonheap-object-3.c -std=gnu++98 (test for warnings, line 51) PASS: c-c++-common/Wfree-nonheap-object-3.c -std=gnu++98 (test for warnings, line 53) PASS: c-c++-common/Wfree-nonheap-object-3.c -std=gnu++98 (test for warnings, line 57) FAIL: c-c++-common/Wfree-nonheap-object-3.c -std=gnu++98 (test for excess errors) Excess errors: 
[...]/c-c++-common/Wfree-nonheap-object-3.c:45:18: warning: 'void dealloc_long(long int*)' called on pointer '<unknown>' with nonzero offset 8 [-Wfree-nonheap-object] ..., that is: decorated 'void dealloc_long(long int*)' instead of plain 'dealloc_long' -- similar to how all the other 'dg-warning's allow for the decorated function signature in addition to the plain one. This issue was latent since the test case was added in commit fe7f75cf16783589eedbab597e6d0b8d35d7e470 "Correct/improve maybe_emit_free_warning (PR middle-end/98166, PR c++/57111, PR middle-end/98160)", and was finally exposed by my recent commit 9c03391ba447ff86038d6a34c90ae737c3915b5f "Tighten 'dg-warning' alternatives in 'c-c++-common/Wfree-nonheap-object{,-2,-3}.c'". gcc/testsuite/ * c-c++-common/Wfree-nonheap-object-3.c: Fix 'dg-warning' for C++.
2023-06-15middle-end, i386: Pattern recognize add/subtract with carry [PR79173]Jakub Jelinek20-6/+1118
The following patch introduces {add,sub}c5_optab and pattern recognizes various forms of add with carry and subtract with carry/borrow, see pr79173-{1,2,3,4,5,6}.c tests on what is matched. Primarily forms with 2 __builtin_add_overflow or __builtin_sub_overflow calls per limb (with just one for the least significant one), for add with carry even when it is hand written in C (for subtraction reassoc seems to change it too much so that the pattern recognition doesn't work). __builtin_{add,sub}_overflow are standardized in C23 under ckd_{add,sub} names, so it isn't any longer a GNU only extension. Note, clang has for these (IMHO badly designed) __builtin_{add,sub}c{b,s,,l,ll} builtins which don't add/subtract just a single bit of carry, but basically add 3 unsigned values or subtract 2 unsigned values from one, and result in carry out of 0, 1, or 2 because of that. If we wanted to introduce those for clang compatibility, we could and lower them early to just two __builtin_{add,sub}_overflow calls and let the pattern matching in this patch recognize it later. I've added expanders for this on ix86 and in addition to that added various peephole2s (in preparation patches for this patch) to make sure we get nice (and small) code for the common cases. I think there are other PRs which request that e.g. for the _{addcarry,subborrow}_u{32,64} intrinsics, which the patch also improves. Would be nice if support for these optabs was added to many other targets, arm/aarch64 and powerpc* certainly have such instructions, I'd expect in fact that most targets do. The _BitInt support I'm working on will also need this to emit reasonable code. 2023-06-15 Jakub Jelinek <jakub@redhat.com> PR middle-end/79173 * internal-fn.def (UADDC, USUBC): New internal functions. * internal-fn.cc (expand_UADDC, expand_USUBC): New functions. (commutative_ternary_fn_p): Return true also for IFN_UADDC. * optabs.def (uaddc5_optab, usubc5_optab): New optabs. 
* tree-ssa-math-opts.cc (uaddc_cast, uaddc_ne0, uaddc_is_cplxpart, match_uaddc_usubc): New functions. (math_opts_dom_walker::after_dom_children): Call match_uaddc_usubc for PLUS_EXPR, MINUS_EXPR, BIT_IOR_EXPR and BIT_XOR_EXPR unless other optimizations have been successful for those. * gimple-fold.cc (gimple_fold_call): Handle IFN_UADDC and IFN_USUBC. * fold-const-call.cc (fold_const_call): Likewise. * gimple-range-fold.cc (adjust_imagpart_expr): Likewise. * tree-ssa-dce.cc (eliminate_unnecessary_stmts): Likewise. * doc/md.texi (uaddc<mode>5, usubc<mode>5): Document new named patterns. * config/i386/i386.md (uaddc<mode>5, usubc<mode>5): New define_expand patterns. (*setcc_qi_addqi3_cconly_overflow_1_<mode>, *setccc): Split into NOTE_INSN_DELETED note rather than nop instruction. (*setcc_qi_negqi_ccc_1_<mode>, *setcc_qi_negqi_ccc_2_<mode>): Likewise. * gcc.target/i386/pr79173-1.c: New test. * gcc.target/i386/pr79173-2.c: New test. * gcc.target/i386/pr79173-3.c: New test. * gcc.target/i386/pr79173-4.c: New test. * gcc.target/i386/pr79173-5.c: New test. * gcc.target/i386/pr79173-6.c: New test. * gcc.target/i386/pr79173-7.c: New test. * gcc.target/i386/pr79173-8.c: New test. * gcc.target/i386/pr79173-9.c: New test. * gcc.target/i386/pr79173-10.c: New test.
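To make the "2 __builtin_add_overflow calls per limb" form concrete, here is a small sketch in the spirit of the description above. It is not copied from the pr79173-*.c tests, and the function names are illustrative. Whether a given compiler and target actually turn the loop into add-with-carry instructions depends on optimization level and on the target providing the new optabs.

```c
#include <stddef.h>

/* Add X, Y and an incoming carry (0 or 1); store the outgoing carry
   (0 or 1) in *CARRY_OUT.  The two overflow checks cannot both fire:
   if X + Y wraps, the wrapped sum is at most ULONG_MAX - 1, so adding
   a carry of 1 cannot wrap again.  */
static unsigned long
sketch_uaddc (unsigned long x, unsigned long y, unsigned long carry_in,
	      unsigned long *carry_out)
{
  unsigned long r;
  unsigned long c1 = __builtin_add_overflow (x, y, &r);
  unsigned long c2 = __builtin_add_overflow (r, carry_in, &r);
  *carry_out = c1 + c2;
  return r;
}

/* P += Q over N limbs, least significant limb first.  Returns the
   carry out of the most significant limb.  */
unsigned long
sketch_add_limbs (unsigned long *p, const unsigned long *q, size_t n)
{
  unsigned long carry = 0;
  for (size_t i = 0; i < n; i++)
    p[i] = sketch_uaddc (p[i], q[i], carry, &carry);
  return carry;
}
```

For example, adding the two-limb value {1, 0} to {ULONG_MAX, 0} carries out of the low limb, leaving {0, 1} with a final carry of 0.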
2023-06-15i386: Add peephole2 patterns to improve subtract with borrow with memory ↵Jakub Jelinek1-3/+151
destination [PR79173] This patch adds subborrow<mode> alternative so that it can have memory destination and adds various peephole2s which help to match it. 2023-06-15 Jakub Jelinek <jakub@redhat.com> PR middle-end/79173 * config/i386/i386.md (subborrow<mode>): Add alternative with memory destination and add for it define_peephole2 TARGET_READ_MODIFY_WRITE/-Os patterns to prefer using memory destination in these patterns.
2023-06-15i386: Add peephole2 patterns to improve add with carry or subtract with ↵Jakub Jelinek1-0/+289
borrow with memory destination [PR79173] This patch adds various peephole2s which help to recognize add with carry or subtract with borrow with memory destination. 2023-06-14 Jakub Jelinek <jakub@redhat.com> PR middle-end/79173 * config/i386/i386.md (*sub<mode>_3, @add<mode>3_carry, addcarry<mode>, @sub<mode>3_carry, *add<mode>3_cc_overflow_1): Add define_peephole2 TARGET_READ_MODIFY_WRITE/-Os patterns to prefer using memory destination in these patterns.
2023-06-15middle-end: Move constant args folding of .UBSAN_CHECK_* and .*_OVERFLOW ↵Jakub Jelinek2-16/+41
into fold-const-call.cc Here is an incremental patch to handle constant folding of these in fold-const-call.cc rather than gimple-fold.cc. Not really sure if that is the way to go because it is replacing 28 lines of former code with 65 of new code, for the overall benefit that say int foo (long long *p) { int one = 1; long long max = __LONG_LONG_MAX__; return __builtin_add_overflow (one, max, p); } can be now fully folded already in ccp1 pass while before it was only cleaned up in forwprop1 pass right after it. On Wed, Jun 14, 2023 at 12:25:46PM +0000, Richard Biener wrote: > I think that's still very much desirable so this followup looks OK. > Maybe you can re-base it as prerequesite though? Rebased then (of course with the UADDC/USUBC handling removed from this first patch, will be added in the second one). 2023-06-15 Jakub Jelinek <jakub@redhat.com> * gimple-fold.cc (gimple_fold_call): Move handling of arg0 as well as arg1 INTEGER_CSTs for .UBSAN_CHECK_{ADD,SUB,MUL} and .{ADD,SUB,MUL}_OVERFLOW calls from here... * fold-const-call.cc (fold_const_call): ... here.
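The `foo` function quoted in the message can be tried as a standalone unit. The sketch below reproduces it and adds a hypothetical non-overflowing counterpart, `sketch_no_overflow`, for contrast. The pass in which the folding happens (ccp1 versus forwprop1) is only visible in compiler dumps, not in the program's behavior, which is the same either way.

```c
/* From the commit message: 1 + LLONG_MAX does not fit in long long,
   so the builtin reports overflow and stores the wrapped result.  */
int
foo (long long *p)
{
  int one = 1;
  long long max = __LONG_LONG_MAX__;
  return __builtin_add_overflow (one, max, p);
}

/* Hypothetical counterpart with constant operands that do fit.  */
int
sketch_no_overflow (long long *p)
{
  int one = 1;
  long long two = 2;
  return __builtin_add_overflow (one, two, p);
}
```

Here `foo` returns 1 and stores the wrapped value (the minimum `long long`) through `p`, while `sketch_no_overflow` returns 0 and stores 3.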
2023-06-15AArch64: New RTL for ABDOluwatamilore Adebayo14-2/+424
This patch adds new RTL and tests for sabd and uabd PR tree-optimization/109156 gcc/ChangeLog: * config/aarch64/aarch64-simd.md (aarch64_<su>abd<mode>): Rename to <su>abd<mode>3. * config/aarch64/aarch64-sve.md (<su>abd<mode>_3): Rename to <su>abd<mode>3. gcc/testsuite/ChangeLog: * gcc.target/aarch64/abd.h: New file. * gcc.target/aarch64/abd_2.c: New test. * gcc.target/aarch64/abd_3.c: New test. * gcc.target/aarch64/abd_4.c: New test. * gcc.target/aarch64/abd_none_2.c: New test. * gcc.target/aarch64/abd_none_3.c: New test. * gcc.target/aarch64/abd_none_4.c: New test. * gcc.target/aarch64/abd_run_1.c: New test. * gcc.target/aarch64/sve/abd_1.c: New test. * gcc.target/aarch64/sve/abd_none_1.c: New test. * gcc.target/aarch64/sve/abd_2.c: New test. * gcc.target/aarch64/sve/abd_none_2.c: New test.
2023-06-15Missed opportunity to use [SU]ABDOluwatamilore Adebayo4-31/+217
This adds a recognition pattern for the non-widening absolute difference (ABD). gcc/ChangeLog: * doc/md.texi (sabd, uabd): Document them. * internal-fn.def (ABD): Use new optab. * optabs.def (sabd_optab, uabd_optab): New optabs. * tree-vect-patterns.cc (vect_recog_absolute_difference): Recognize the idiom abs (a - b). (vect_recog_sad_pattern): Refactor to use vect_recog_absolute_difference. (vect_recog_abd_pattern): Use patterns found by vect_recog_absolute_difference to build a new ABD internal call.
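The idiom named above, abs (a - b), can be sketched as a simple element-wise loop. The function name `abs_diff` and its signature are hypothetical and chosen only for illustration. Whether the vectorizer actually turns such a loop into an ABD operation depends on the target, the element types and the optimization flags, so treat this as the shape of the source pattern rather than a guaranteed trigger.

```c
#include <stdlib.h>

/* Hypothetical kernel: out[i] = |a[i] - b[i]|, written in the
   abs (a - b) form the commit message mentions.  The caller must
   keep a[i] - b[i] within the range of int. */
void abs_diff(const int *a, const int *b, int *out, int n)
{
    for (int i = 0; i < n; i++)
        out[i] = abs(a[i] - b[i]);
}
```

For inputs `{5, -3, 7}` and `{2, 4, 7}` the output is `{3, 7, 0}`.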
2023-06-15LoongArch: Change the default value of LARCH_CALL_RATIO to 6.chenxiaolong1-1/+1
During regression testing of GCC on LoongArch, the tests in pr90883.C were found to fail. Investigation showed that the failure was caused by the macro LARCH_CALL_RATIO being set too high. Different thresholds that satisfy the test conditions were then evaluated on LoongArch with SPEC CPU 2006, and the results showed that the optimal value is 6. gcc/ChangeLog: * config/loongarch/loongarch.h (LARCH_CALL_RATIO): Modify the value of macro LARCH_CALL_RATIO on LoongArch to make it perform optimally.
2023-06-15RISC-V: Use merge approach to optimize vector permutationJuzhe-Zhong15-0/+1417
This patch optimizes the permutation cases that are suitable for the merge approach. Consider the following case: typedef int8_t vnx16qi __attribute__((vector_size (16))); void __attribute__ ((noipa)) merge0 (vnx16qi x, vnx16qi y, vnx16qi *out) { vnx16qi v = __builtin_shufflevector ((vnx16qi) x, (vnx16qi) y, MASK_16); *(vnx16qi*)out = v; } The gimple IR: v_3 = VEC_PERM_EXPR <x_1(D), y_2(D), { 0, 17, 2, 19, 4, 21, 6, 23, 8, 9, 10, 27, 12, 29, 14, 31 }>; Selector = { 0, 17, 2, 19, 4, 21, 6, 23, 8, 9, 10, 27, 12, 29, 14, 31 }, the common expression: { 0, nunits + 1, 2, nunits + 3, 4, nunits + 5, ... } For this selector, we can use vmsltu + vmerge to optimize the codegen. Before this patch: merge0: addi a5,sp,16 vl1re8.v v3,0(a5) li a5,31 vsetivli zero,16,e8,m1,ta,mu vmv.v.x v2,a5 lui a5,%hi(.LANCHOR0) addi a5,a5,%lo(.LANCHOR0) vl1re8.v v1,0(a5) vl1re8.v v4,0(sp) vand.vv v1,v1,v2 vmsgeu.vi v0,v1,16 vrgather.vv v2,v4,v1 vadd.vi v1,v1,-16 vrgather.vv v2,v3,v1,v0.t vs1r.v v2,0(a0) ret After this patch: merge0: addi a5,sp,16 vl1re8.v v1,0(a5) lui a5,%hi(.LANCHOR0) addi a5,a5,%lo(.LANCHOR0) vsetivli zero,16,e8,m1,ta,ma vl1re8.v v0,0(a5) vl1re8.v v2,0(sp) vmsltu.vi v0,v0,16 vmerge.vvm v1,v1,v2,v0 vs1r.v v1,0(a0) ret The key to this optimization is: 1. mask = vmsltu (selector, nunits) 2. result = vmerge (op0, op1, mask) gcc/ChangeLog: * config/riscv/riscv-v.cc (shuffle_merge_patterns): New pattern. (expand_vec_perm_const_1): Add merge optimization. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vls-vlmax/merge-1.c: New test. * gcc.target/riscv/rvv/autovec/vls-vlmax/merge-2.c: New test. * gcc.target/riscv/rvv/autovec/vls-vlmax/merge-3.c: New test. * gcc.target/riscv/rvv/autovec/vls-vlmax/merge-4.c: New test. * gcc.target/riscv/rvv/autovec/vls-vlmax/merge-5.c: New test. * gcc.target/riscv/rvv/autovec/vls-vlmax/merge-6.c: New test. * gcc.target/riscv/rvv/autovec/vls-vlmax/merge-7.c: New test. * gcc.target/riscv/rvv/autovec/vls-vlmax/merge_run-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls-vlmax/merge_run-2.c: New test. * gcc.target/riscv/rvv/autovec/vls-vlmax/merge_run-3.c: New test. * gcc.target/riscv/rvv/autovec/vls-vlmax/merge_run-4.c: New test. * gcc.target/riscv/rvv/autovec/vls-vlmax/merge_run-5.c: New test. * gcc.target/riscv/rvv/autovec/vls-vlmax/merge_run-6.c: New test. * gcc.target/riscv/rvv/autovec/vls-vlmax/merge_run-7.c: New test.
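The two steps named in the message above (a mask computed from the selector, then a merge of the two operands) can be modelled in scalar C. This is a sketch under stated assumptions, not the code in riscv-v.cc: `NUNITS`, `merge_by_mask` and `generic_permute` are hypothetical names, and the merge form is only valid when every selector element is either `i` or `NUNITS + i` for its own position `i`.

```c
#include <stdint.h>

#define NUNITS 16  /* elements per vector in the commit's example */

/* Scalar model of the merge approach: step 1 builds a mask by comparing
   each selector element with NUNITS, step 2 picks op0[i] or op1[i]. */
void merge_by_mask(const int8_t *op0, const int8_t *op1,
                   const uint8_t *sel, int8_t *out)
{
    for (int i = 0; i < NUNITS; i++) {
        int take_op0 = sel[i] < NUNITS;        /* mask = selector < nunits */
        out[i] = take_op0 ? op0[i] : op1[i];   /* result = merge by mask   */
    }
}

/* Reference model of a general two-input permute, for comparison:
   indices 0..NUNITS-1 select from op0, NUNITS..2*NUNITS-1 from op1. */
void generic_permute(const int8_t *op0, const int8_t *op1,
                     const uint8_t *sel, int8_t *out)
{
    for (int i = 0; i < NUNITS; i++) {
        unsigned idx = sel[i] % (2 * NUNITS);
        out[i] = idx < NUNITS ? op0[idx] : op1[idx - NUNITS];
    }
}
```

With the selector from the commit message, `{ 0, 17, 2, 19, 4, 21, 6, 23, 8, 9, 10, 27, 12, 29, 14, 31 }`, both functions produce the same result, which is why the gather-based sequence can be replaced by a compare plus a merge.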
2023-06-15RISC-V: Ensure vector args and return use function stack to pass [PR110119]Lehua Ding3-5/+64
The V2 patch addresses comments from Juzhe, thanks. Hi, The reason for this bug is that in the case where the vector register is set to a fixed length (with the `--param=riscv-autovec-preference=fixed-vlmax` option), TARGET_PASS_BY_REFERENCE thinks that variables of type vint32m1 can be passed through two scalar registers, but when GCC calls FUNCTION_VALUE (which calls riscv_get_arg_info internally) it returns NULL_RTX. These two functions are not consistent with each other. The current treatment is to pass all vector arguments and return values through the function stack; a new calling convention for vector registers will be added in the future. https://github.com/riscv-non-isa/riscv-elf-psabi-doc/ https://github.com/palmer-dabbelt/riscv-elf-psabi-doc/commit/126fa719972ff998a8a239c47d506c7809aea363 Best, Lehua gcc/ChangeLog: PR target/110119 * config/riscv/riscv.cc (riscv_get_arg_info): Return NULL_RTX for vector mode. (riscv_pass_by_reference): Return true for vector mode. gcc/testsuite/ChangeLog: PR target/110119 * gcc.target/riscv/rvv/base/pr110119-1.c: New test. * gcc.target/riscv/rvv/base/pr110119-2.c: New test.
2023-06-15RISC-V: Align the predictor style for define_insn_and_splitPan Li2-22/+22
This patch is a follow-up to the patch below. https://gcc.gnu.org/pipermail/gcc-patches/2023-June/621347.html We aligned the predictor style for the define_insn_and_split patterns, as suggested by Kito, to avoid potential issues before we hit them. Signed-off-by: Pan Li <pan2.li@intel.com> gcc/ChangeLog: * config/riscv/autovec-opt.md: Align the predictor style. * config/riscv/autovec.md: Ditto.
2023-06-15RISC-V: Bugfix for vec_init repeating auto vectorization in RV32Pan Li1-4/+12
When constructing a vector mask from individual elements we wrongly assumed that we can broadcast BITS_PER_WORD (i.e. XLEN). The maximum is actually the vector element length (i.e. ELEN). This patch fixes this. After this patch, the failures below on RV32 will be fixed. FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution test FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution test FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution test FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution test FAIL: gcc.target/riscv/rvv/autovec/vls-vlmax/repeat_run-3.c -std=c99 -O3 -ftree-vectorize --param riscv-autovec-preference=fixed-vlmax execution test Signed-off-by: Pan Li <pan2.li@intel.com> gcc/ChangeLog: * config/riscv/riscv-v.cc (rvv_builder::get_merge_scalar_mask): Take elen instead of scalar BITS_PER_WORD. (expand_vector_init_merge_repeating_sequence): Use inner_bits_size instead of scalar BITS_PER_WORD.
2023-06-15Daily bump.GCC Administrator4-1/+59
2023-06-14Remove MFWRAP_SPEC remnantJivan Hakobyan1-8/+0
This patch removes a remnant of mudflap. gcc/ChangeLog: * config/moxie/uclinux.h (MFWRAP_SPEC): Remove.
2023-06-14aarch64: Fix -Werror=sign-compare bootstrap failureKyrylo Tkachov1-3/+3
Pushing to fix bootstrap. gcc/ChangeLog: * config/aarch64/aarch64-sve-builtins-base.cc (svlast_impl::fold): Fix signed comparison warning in loop from npats to enelts.