riscv-gnu-toolchain/gcc.git

Age	Commit message	Author	Files	Lines
2021-04-26	early-remat: Handle sets of multiple candidate regs [PR94605]	Richard Sandiford	2	-1/+13
	early-remat.c:process_block wasn't handling insns that set multiple candidate registers, which led to an assertion failure at the end of the main loop. Instructions that set two pseudos aren't rematerialisation candidates in themselves, but we still need to track them if another instruction that sets the same register is a rematerialisation candidate. gcc/ PR rtl-optimization/94605 * early-remat.c (early_remat::process_block): Handle insns that set multiple candidate registers. gcc/testsuite/ PR rtl-optimization/94605 * gcc.target/aarch64/sve/pr94605.c: New test. (cherry picked from commit 3c3f12e2a7625c9a2f5d74a47dbacb2fd1ae5643)
2021-04-26	Daily bump.	GCC Administrator	1	-1/+1

2021-04-25	Daily bump.	GCC Administrator	1	-1/+1

2021-04-24	Daily bump.	GCC Administrator	1	-1/+1

2021-04-23	Daily bump.	GCC Administrator	11	-1/+920

2021-04-22	sanitizer: Fix asan against glibc 2.34 [PR100114]	Jakub Jelinek	1	-5/+8
	As mentioned in the PR, SIGSTKSZ is no longer a compile time constant in glibc 2.34 and later, so static const uptr kAltStackSize = SIGSTKSZ * 4; needs dynamic initialization, but is used by a function called indirectly from .preinit_array and therefore before the variable is constructed. This results in using 0 size instead and all asan instrumented programs die with: ==91==ERROR: AddressSanitizer failed to allocate 0x0 (0) bytes of SetAlternateSignalStack (error code: 22) Here is a cherry-pick from upstream to fix this. 2021-04-17 Jakub Jelinek <jakub@redhat.com> PR sanitizer/100114 * sanitizer_common/sanitizer_posix_libcdep.cc: Cherry-pick llvm-project revisions 82150606fb11d28813ae6da1101f5bda638165fe and b93629dd335ffee2fc4b9b619bf86c3f9e6b0023. (cherry picked from commit 950bac27d63c1c2ac3a6ed867692d6a13f21feb3)
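The initialization-order hazard described in this message can be sketched as follows. This is only an illustration, not the sanitizer runtime's code: `query_sigstksz` is a hypothetical stand-in for a `SIGSTKSZ` that is known only at run time, the value 8192 is made up, and `alt_stack_size` is an assumed name. The cherry-picked upstream change may differ in detail.

```cpp
#include <cstddef>

// Hypothetical stand-in for a SIGSTKSZ that is no longer a compile-time
// constant (as with glibc 2.34 and later). The value 8192 is an assumption.
inline std::size_t query_sigstksz() { return 8192; }

// Hazardous pattern (shown as a comment only): a namespace-scope constant
// with a non-constant initializer is dynamically initialized, so code that
// runs earlier, for example from .preinit_array, can still observe 0.
//
//   static const std::size_t kAltStackSize = query_sigstksz() * 4;

// Safer pattern: compute the size whenever it is needed, so there is no
// object whose initialization could be pending.
inline std::size_t alt_stack_size() { return query_sigstksz() * 4; }
```

Because `alt_stack_size` is an ordinary function, a caller that runs before static constructors still gets a non-zero size.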
2021-04-22	intl: Add --enable-host-shared support [PR100096]	Jakub Jelinek	3	-2/+20
	As mentioned in the PR, building gcc with jit enabled and --enable-host-shared doesn't work on NetBSD/i?86, as libgccjit.so.0 has text relocations. The r0-125846-g459260ecf8b420b029601a664cdb21c185268ecb changes added --enable-host-shared support to various libraries, but didn't add it to intl/ subdirectory; on Linux it isn't really needed, because all: all-no all-no: #nothing but on other OSes intl/libintl.a is built. The following patch makes sure it is built with -fPIC when --enable-host-shared is used. 2021-04-16 Jakub Jelinek <jakub@redhat.com> PR jit/100096 * configure.ac: Add --enable-host-shared support. * Makefile.in: Update copyright. Add @PICFLAG@ to CFLAGS. * configure: Regenerated. (cherry picked from commit a11f31102706e33f66b60367d6863613ab3bd051)
2021-04-22	c++: Fix up handling of structured bindings in extract_locals_r [PR99833]	Jakub Jelinek	2	-1/+32
	The following testcase ICEs in tsubst_decomp_names because the assumption that the structured binding artificial var is followed in DECL_CHAIN by the corresponding structured binding vars is violated. I've tracked it to extract_locals* which is done for the constexpr IF_STMT. extract_locals_r when it sees a DECL_EXPR adds that decl into a hash set so that such decls aren't returned from extract_locals, but in the case of a structured binding that just means the artificial var and not the vars corresponding to structured binding identifiers. The following patch fixes it by pushing not just the artificial var for structured bindings but also the other vars. 2021-04-16 Jakub Jelinek <jakub@redhat.com> PR c++/99833 * pt.c (extract_locals_r): When handling DECL_EXPR of a structured binding, add to data.internal also all corresponding structured binding decls. * g++.dg/cpp1z/pr99833.C: New test. (cherry picked from commit 06d50ebc9fb2761ed2bdda5e76adb4d47a8ca983)
2021-04-22	combine: Fix up expand_compound_operation [PR99905]	Jakub Jelinek	2	-5/+42
	The following testcase is miscompiled on x86_64-linux. expand_compound_operation is called on (zero_extract:DI (mem/c:TI (reg/f:DI 16 argp) [3 i+0 S16 A128]) (const_int 16 [0x10]) (const_int 63 [0x3f])) so mode is DImode, inner_mode is TImode, pos 63, len 16 and modewidth 64. A couple of lines above the problematic spot we have: if (modewidth >= pos + len) { tem = gen_lowpart (mode, XEXP (x, 0)); where the code uses gen_lowpart and then shift left/right to extract it in mode. But the guarding condition is false - 64 >= 63 + 16 and so we enter the next condition, where the code shifts XEXP (x, 0) right by pos and then adds AND. It does so incorrectly though. Given the modewidth < pos + len, inner_mode must be necessarily larger than mode and XEXP (x, 0) has the innermode, but it was calling simplify_shift_const with mode rather than inner_mode, which meant inconsistent arguments to simplify_shift_const and in this case made a DImode MEM shift out of it. The following patch fixes it, by doing the shift in inner_mode properly and then after the shift doing the lowpart subreg and masking already in mode. 2021-04-13 Jakub Jelinek <jakub@redhat.com> PR rtl-optimization/99905 * combine.c (expand_compound_operation): If pos + len > modewidth, perform the right shift by pos in inner_mode and then convert to mode, instead of trying to simplify a shift of rtx with inner_mode by pos as if it was a shift in mode. * gcc.target/i386/pr99905.c: New test. (cherry picked from commit c965254e5af9dc68444e0289250c393ae0cd6131)
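The ordering issue in this message (shift in the wide mode first, then narrow and mask) can be illustrated outside of RTL. The sketch below is not GCC code: `Wide128` is a hypothetical stand-in for a TImode operand, and the two extract functions are assumed names that only model "shift then narrow" versus "narrow then shift".

```cpp
#include <cstdint>

// A 128-bit value held as two 64-bit halves (stand-in for a TImode operand).
struct Wide128 {
  std::uint64_t lo;
  std::uint64_t hi;
};

// Logical right shift of the 128-bit value. Assumes 0 <= n < 64.
inline Wide128 shift_right(Wide128 v, unsigned n) {
  if (n == 0) return v;
  return Wide128{(v.lo >> n) | (v.hi << (64 - n)), v.hi >> n};
}

// Mask with the low len bits set. Assumes 0 < len < 64.
inline std::uint64_t low_mask(unsigned len) {
  return (std::uint64_t{1} << len) - 1;
}

// Shift in the wide type, then take the low 64 bits and mask.
inline std::uint64_t extract_shift_wide(Wide128 v, unsigned pos, unsigned len) {
  return shift_right(v, pos).lo & low_mask(len);
}

// Narrow to 64 bits first, then shift: bits above bit 63 are lost.
inline std::uint64_t extract_narrow_first(Wide128 v, unsigned pos, unsigned len) {
  return (v.lo >> pos) & low_mask(len);
}
```

With pos 63 and len 16, as in the message, the field straddles the 64-bit boundary, so only the wide shift sees all 16 bits.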
2021-04-22	combine: Don't fold away side-effects in simplify_and_const_int_1 [PR99830]	Jakub Jelinek	2	-1/+11
	Here is an alternate patch for the PR99830 bug. As discussed on IRC and in the PR, the reason why a (clobber:TI (const_int 0)) has been propagated into the debug insns is that it got optimized away during simplification from the i3 instruction pattern. And that happened because simplify_and_const_int_1 (SImode, varop, 255) with varop of (ashift:SI (subreg:SI (and:TI (clobber:TI (const_int 0 [0])) (const_int 255 [0xff])) 0) (const_int 16 [0x10])) was called and through nonzero_bits determined that (whatever << 16) & 255 is const0_rtx. It is, but if there are side-effects in varop and such clobbers are considered as such, we shouldn't optimize those away. 2021-04-13 Jakub Jelinek <jakub@redhat.com> PR debug/99830 * combine.c (simplify_and_const_int_1): Don't optimize varop away if it has side-effects. * gcc.dg/pr99830.c: New test. (cherry picked from commit 4ac7483ede91fef7cfd548ff6e30e46eeb9d95ae)
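A source-level analogue of the rule in this message: the value of `(whatever << 16) & 255` is always zero, but if `whatever` has a side effect, the operand still has to be evaluated. The sketch below uses hypothetical names (`eval_count`, `noisy_operand`, `masked`) and has nothing to do with combine's internal representation.

```cpp
// Counts how often the operand has been evaluated.
inline int &eval_count() {
  static int count = 0;
  return count;
}

// An operand with a side effect: every evaluation bumps the counter.
inline unsigned noisy_operand() {
  ++eval_count();
  return 0xABCDu;
}

// The result is always 0 because the low 16 bits of the shifted value are
// zero, yet dropping the call would lose the counter update.
inline unsigned masked() { return (noisy_operand() << 16) & 255u; }
```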
2021-04-22	c: Avoid clobbering TREE_TYPE (error_mark_node) [PR99990]	Jakub Jelinek	2	-1/+13
	The following testcase ICEs during error recovery, because finish_decl overwrites TREE_TYPE (error_mark_node), which should always remain error_mark_node. 2021-04-10 Jakub Jelinek <jakub@redhat.com> PR c/99990 * c-decl.c (finish_decl): Don't overwrite TREE_TYPE of error_mark_node. * gcc.dg/pr99990.c: New test. (cherry picked from commit 91e076f3a66c1c9f6aa51e9d53d07803606e3bf1)
2021-04-22	expand: Fix up LTO ICE with COMPOUND_LITERAL_EXPR [PR99849]	Jakub Jelinek	2	-1/+24
	The gimplifier optimizes away COMPOUND_LITERAL_EXPRs, but they can remain in the form of ADDR_EXPR of COMPOUND_LITERAL_EXPRs in static initializers. By the TREE_STATIC check I meant to check that the underlying decl of the compound literal is a global rather than automatic variable which obviously can't be referenced in static initializers, but unfortunately with LTO it might end up in another partition and thus be DECL_EXTERNAL instead. 2021-04-10 Jakub Jelinek <jakub@redhat.com> PR lto/99849 * expr.c (expand_expr_addr_expr_1): Test is_global_var rather than just TREE_STATIC on COMPOUND_LITERAL_EXPR_DECLs. * gcc.dg/lto/pr99849_0.c: New test. (cherry picked from commit 2e57bc7eedb084869d17fe07b538d907b8fee819)
2021-04-22	rtlanal: Another fix for VOIDmode MEMs [PR98601]	Jakub Jelinek	2	-2/+21
	This is a sequel to the PR85022 changes, inline-asm can (unfortunately) introduce VOIDmode MEMs and in PR85022 they have been changed so that we don't pretend we know their size (as opposed to assuming they have zero size). This time we ICE in rtx_addr_can_trap_p_1 because it assumes that all memory but BLKmode has known size. The patch just treats VOIDmode MEMs like BLKmode in that regard. And, the STRICT_ALIGNMENT change is needed because VOIDmode has GET_MODE_SIZE of 0 and we don't want to check if something is a multiple of 0. 2021-04-10 Jakub Jelinek <jakub@redhat.com> PR rtl-optimization/98601 * rtlanal.c (rtx_addr_can_trap_p_1): Allow in assert unknown size not just for BLKmode, but also for VOIDmode. For STRICT_ALIGNMENT unaligned_mems handle VOIDmode like BLKmode. * gcc.dg/torture/pr98601.c: New test. (cherry picked from commit e68ac8c2b46997af1464f2549ac520a192c928b1)
2021-04-22	dse: Fix up hard reg conflict checking in replace_read [PR99863]	Jakub Jelinek	2	-6/+38
	Since PR37922 fix RTL DSE has hard register conflict checking in replace_read, so that if the replacement sequence sets (or typically just clobbers) some hard register (usually condition codes) we verify that hard register is not live. Unfortunately, it compares the hard reg set clobbered/set by the sequence (regs_set) against the currently live hard register set, but it then emits the insn sequence not at the current insn position, but before store_insn->insn. So, we should not compare against the current live hard register set, but against the hard register live set at the point of the store insn. Fortunately, we already have that remembered in store_insn->fixed_regs_live. In addition to bootstrapping/regtesting this patch on x86_64-linux and i686-linux, I've also added statistics gathering and it seems the only place where we end up rejecting the replace_read is the newly added testcase (the PR37922 is no longer effective at that) and fixed_regs_live has been always non-NULL at the if (store_insn->fixed_regs_live) spot. Rather than having there an assert, I chose to just keep regs_set as is, which means in that hypothetical case where fixed_regs_live wouldn't be computed for some store we'd still accept sequences that don't clobber/set any hard registers and just punt on those that clobber/set those. 2021-04-03 Jakub Jelinek <jakub@redhat.com> PR rtl-optimization/99863 * dse.c (replace_read): Drop regs_live argument. Instead of regs_live, use store_insn->fixed_regs_live if non-NULL, otherwise punt if insns sequence clobbers or sets any hard registers. * gcc.target/i386/pr99863.c: New test. (cherry picked from commit 7a2f91d413eb7a3eb0ba52c7ac9618a35addd12a)
2021-04-22	c++: Fix ICE on PTRMEM_CST in lambda in inline var initializer [PR99790]	Jakub Jelinek	2	-0/+17
	The following testcase ICEs (since the addition of inline var support), because the lambda contains PTRMEM_CST but finish_function is called for the lambda quite early during parsing it (from finish_lambda_function) when the containing class is still incomplete. That means that during genericization cplus_expand_constant keeps the PTRMEM_CST unmodified, but later nothing lowers it when the class is finalized. Using sizeof etc. on the class in such contexts is rejected by both g++ and clang++, and when the PTRMEM_CST appears e.g. in static var initializers rather than in functions, we handle it correctly because c_parse_final_cleanups -> lower_var_init will handle those cplus_expand_constant when all classes are already finalized. The following patch fixes it by calling cplus_expand_constant again during gimplification, as we are now unconditionally unit at a time, I'd think everything that could be completed will be before we start gimplification. 2021-03-30 Jakub Jelinek <jakub@redhat.com> PR c++/99790 * cp-gimplify.c (cp_gimplify_expr): Handle PTRMEM_CST. * g++.dg/cpp1z/pr99790.C: New test. (cherry picked from commit 7cdd30b43a63832d6f908b2dd64bd19a0817cd7b)
2021-04-22	fold-const: Fix ICE in extract_muldiv_1 [PR99777]	Jakub Jelinek	2	-4/+48
	extract_muldiv{,_1} is apparently only prepared to handle scalar integer operations, the callers ensure it by only calling it if the divisor or one of the multiplicands is INTEGER_CST and because neither multiplication nor division nor modulo are really supported e.g. for pointer types, nullptr type etc. But the CASE_CONVERT handling doesn't really check if it isn't a cast from some other type kind, so on the testcase we end up trying to build MULT_EXPR in POINTER_TYPE which ICEs. A few years ago Marek has added ANY_INTEGRAL_TYPE_P checks to two spots, but the code uses TYPE_PRECISION which means something completely different for vector types, etc. So IMNSHO we should just punt on conversions from non-integrals or non-scalar integrals. 2021-03-29 Jakub Jelinek <jakub@redhat.com> PR tree-optimization/99777 * fold-const.c (extract_muldiv_1): For conversions, punt on casts from types other than scalar integral types. * g++.dg/torture/pr99777.C: New test. (cherry picked from commit afe9a630eae114665e77402ea083201c9d406e99)
2021-04-22	dwarf2cfi: Defer queued register saves some more [PR99334]	Jakub Jelinek	2	-4/+32
	On the testcase in the PR with -fno-tree-sink -O3 -fPIC -fomit-frame-pointer -fno-strict-aliasing -mstackrealign we have prologue:
0000000000000000 <_func_with_dwarf_issue_>:
   0:	4c 8d 54 24 08       	lea    0x8(%rsp),%r10
   5:	48 83 e4 f0          	and    $0xfffffffffffffff0,%rsp
   9:	41 ff 72 f8          	pushq  -0x8(%r10)
   d:	55                   	push   %rbp
   e:	48 89 e5             	mov    %rsp,%rbp
  11:	41 57                	push   %r15
  13:	41 56                	push   %r14
  15:	41 55                	push   %r13
  17:	41 54                	push   %r12
  19:	41 52                	push   %r10
  1b:	53                   	push   %rbx
  1c:	48 83 ec 20          	sub    $0x20,%rsp
and emit
00000000 0000000000000014 00000000 CIE
  Version:               1
  Augmentation:          "zR"
  Code alignment factor: 1
  Data alignment factor: -8
  Return address column: 16
  Augmentation data:     1b
  DW_CFA_def_cfa: r7 (rsp) ofs 8
  DW_CFA_offset: r16 (rip) at cfa-8
  DW_CFA_nop
  DW_CFA_nop

00000018 0000000000000044 0000001c FDE cie=00000000 pc=0000000000000000..00000000000001d5
  DW_CFA_advance_loc: 5 to 0000000000000005
  DW_CFA_def_cfa: r10 (r10) ofs 0
  DW_CFA_advance_loc: 9 to 000000000000000e
  DW_CFA_expression: r6 (rbp) (DW_OP_breg6 (rbp): 0)
  DW_CFA_advance_loc: 13 to 000000000000001b
  DW_CFA_def_cfa_expression (DW_OP_breg6 (rbp): -40; DW_OP_deref)
  DW_CFA_expression: r15 (r15) (DW_OP_breg6 (rbp): -8)
  DW_CFA_expression: r14 (r14) (DW_OP_breg6 (rbp): -16)
  DW_CFA_expression: r13 (r13) (DW_OP_breg6 (rbp): -24)
  DW_CFA_expression: r12 (r12) (DW_OP_breg6 (rbp): -32)
  ...
unwind info for that.
The problem is when async signal (or stepping through in the debugger) stops after the pushq %rbp instruction and before movq %rsp, %rbp, the unwind info says that caller's %rbp is saved there at %rbp, but that is not true, caller's %rbp is either still available in the %rbp register, or in %rsp, only after executing the next instruction - movq %rsp, %rbp - the location for %rbp is correct.
So, either we'd need to temporarily say:
  DW_CFA_advance_loc: 9 to 000000000000000e
  DW_CFA_expression: r6 (rbp) (DW_OP_breg7 (rsp): 0)
  DW_CFA_advance_loc: 3 to 0000000000000011
  DW_CFA_expression: r6 (rbp) (DW_OP_breg6 (rbp): 0)
  DW_CFA_advance_loc: 10 to 000000000000001b
or to me it seems more compact to just say:
  DW_CFA_advance_loc: 12 to 0000000000000011
  DW_CFA_expression: r6 (rbp) (DW_OP_breg6 (rbp): 0)
  DW_CFA_advance_loc: 10 to 000000000000001b
I've tried instead to deal with it through REG_FRAME_RELATED_EXPR from the backend, but that failed miserably as explained in the PR, dwarf2cfi.c has some rules (Rule 16 to Rule 19) that are specific to the dynamic stack realignment using drap register that only the i386 backend does right now, and by using REG_FRAME_RELATED_EXPR or REG_CFA* notes we can't emulate those rules. The following patch instead does the deferring of the hard frame pointer save rule in dwarf2cfi.c Rule 18 handling and emits it on the (set hfp sp) assignment that must appear shortly after it and adds assertion that it is the case.
The difference before/after the patch on the assembly is:
--- pr99334.s~	2021-03-26 15:42:40.881749380 +0100
+++ pr99334.s	2021-03-26 17:38:05.729161910 +0100
@@ -11,8 +11,8 @@ _func_with_dwarf_issue_:
 	andq	$-16, %rsp
 	pushq	-8(%r10)
 	pushq	%rbp
-	.cfi_escape 0x10,0x6,0x2,0x76,0
 	movq	%rsp, %rbp
+	.cfi_escape 0x10,0x6,0x2,0x76,0
 	pushq	%r15
 	pushq	%r14
 	pushq	%r13
i.e. does just what we IMHO need, after pushq %rbp %rbp still contains parent's frame value and so the save rule doesn't need to be overridden there, ditto at the start of the next insn before the side-effect took effect, and we override it only after it when %rbp already has the right value. If some other target adds dynamic stack realignment in the future and the offset 0 case wouldn't be true there, the code can be adjusted so that it works on all the drap architectures, I'm pretty sure the code would need other adjustments too.
For the rule 18 and for the (set hfp sp) after it we already have asserts for the drap cases that check whether the code looks the way i?86/x86_64 emit it currently. 2021-03-26 Jakub Jelinek <jakub@redhat.com> PR debug/99334 * dwarf2out.h (struct dw_fde_node): Add rule18 member. * dwarf2cfi.c (dwarf2out_frame_debug_expr): When handling (set hfp sp) assignment with drap_reg active, queue reg save for hfp with offset 0 and flush queued reg saves. When handling a push with rule18, defer queueing reg save for hfp and just assert the offset is 0. (scan_trace): Assert that fde->rule18 is false. (cherry picked from commit f5df18504c1790413f293bfb50d40faa7f1ea860)
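The `.cfi_escape 0x10,0x6,0x2,0x76,0` bytes moved by this patch encode the `DW_CFA_expression: r6 (rbp) (DW_OP_breg6 (rbp): 0)` rule from the dump. A minimal decoder for just this shape is sketched below. `decode_cfa_expression` is a hypothetical helper, not part of GCC or binutils, and it assumes every LEB128 field fits in one byte, which holds for these five bytes.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Decodes only this shape of escape:
//   DW_CFA_expression, register, block length 2, DW_OP_breg<N>, offset
// Assumes each LEB128 field is a single byte; returns "" for anything else.
inline std::string decode_cfa_expression(const std::vector<std::uint8_t> &bytes) {
  const std::uint8_t kDwCfaExpression = 0x10;
  const std::uint8_t kDwOpBreg0 = 0x70;  // DW_OP_breg0..DW_OP_breg31 are 0x70..0x8f

  if (bytes.size() != 5 || bytes[0] != kDwCfaExpression) return "";
  if ((bytes[1] & 0x80) != 0) return "";  // multi-byte ULEB128 register
  if (bytes[2] != 2) return "";           // expression block must be 2 bytes
  if (bytes[3] < kDwOpBreg0 || bytes[3] > kDwOpBreg0 + 31) return "";
  if ((bytes[4] & 0x80) != 0) return "";  // multi-byte SLEB128 offset

  // Single-byte SLEB128: bit 6 is the sign bit.
  const int offset = (bytes[4] & 0x40) ? int(bytes[4]) - 0x80 : int(bytes[4]);
  const int base_reg = bytes[3] - kDwOpBreg0;

  return "DW_CFA_expression: r" + std::to_string(int(bytes[1])) +
         " (DW_OP_breg" + std::to_string(base_reg) + ": " +
         std::to_string(offset) + ")";
}
```

Feeding it the escape bytes from the diff yields register 6 saved at `breg6 + 0`, which on x86_64 is `%rbp` saved at the address held in `%rbp`.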
2021-04-22	c++: Diagnose bare parameter packs in bitfield widths [PR99745]	Jakub Jelinek	2	-1/+10
	The following invalid tests ICE because we don't diagnose (and drop) bare parameter packs in bitfield widths. 2021-03-25 Jakub Jelinek <jakub@redhat.com> PR c++/99745 * decl2.c (grokbitfield): Diagnose bitfields containing bare parameter packs and don't set DECL_BIT_FIELD_REPRESENTATIVE in that case. * g++.dg/cpp0x/variadic181.C: New test. (cherry picked from commit f8780caf07340f5d5e55cf5fb1b2be07cabab1ea)
2021-04-22	c++: Diagnose references to void in structured bindings [PR99650]	Jakub Jelinek	2	-0/+25
	We ICE on the following testcase, because std::tuple_element<...,...>::type is void and for structured bindings we therefore need to create void & or void && which is invalid. We created such REFERENCE_TYPE and later ICEd in the middle-end. The following patch fixes it by diagnosing that. 2021-03-23 Jakub Jelinek <jakub@redhat.com> PR c++/99650 * decl.c (cp_finish_decomp): Diagnose void initializers when using tuple_element and get. * g++.dg/cpp1z/decomp55.C: New test. (cherry picked from commit d5e379e3fe19362442b5d0ac608fb8ddf67fecd3)
2021-04-22	dwarf2out: Fix debug info for 2 byte floats [PR99388]	Jakub Jelinek	1	-10/+20
	Aarch64, ARM and a couple of other architectures have 16-bit floats, HFmode. As can be seen e.g. on the
void
foo (void)
{
  __fp16 a = 1.0;
  asm ("nop");
  a = 2.0;
  asm ("nop");
  a = 3.0;
  asm ("nop");
}
testcase, GCC mishandles this on the dwarf2out.c side by assuming all floating point types have sizes in multiples of 4 bytes: GCC says that e.g. the DW_OP_implicit_value will be 2 bytes but then doesn't emit anything, so whatever is emitted after it is treated by consumers as the value and they get out of sync. real_to_target, which insert_float uses, indeed fills the buffer that way, putting 32 bits at a time into an array of long, but for half floats it puts everything into the least significant 16 bits of the first long no matter what endianness the host or target has. The following patch fixes it. With the patch the -g -O2 -dA output changes (in a cross without .uleb128 support):
 	.byte	0x9e	// DW_OP_implicit_value
 	.byte	0x2	// uleb128 0x2
+	.2byte	0x3c00	// fp or vector constant word 0
 	.byte	0x7	// DW_LLE_start_end (.LLST0)
 	.8byte	.LVL1	// Location list begin address (.LLST0)
 	.8byte	.LVL2	// Location list end address (.LLST0)
 	.byte	0x4	// uleb128 0x4; Location expression size
 	.byte	0x9e	// DW_OP_implicit_value
 	.byte	0x2	// uleb128 0x2
+	.2byte	0x4000	// fp or vector constant word 0
 	.byte	0x7	// DW_LLE_start_end (.LLST0)
 	.8byte	.LVL2	// Location list begin address (.LLST0)
 	.8byte	.LFE0	// Location list end address (.LLST0)
 	.byte	0x4	// uleb128 0x4; Location expression size
 	.byte	0x9e	// DW_OP_implicit_value
 	.byte	0x2	// uleb128 0x2
+	.2byte	0x4200	// fp or vector constant word 0
 	.byte	0	// DW_LLE_end_of_list (.LLST0)
Bootstrapped/regtested on x86_64-linux, aarch64-linux and armv7hl-linux-gnueabi, ok for trunk?
I fear the CONST_VECTOR case is still broken, while HFmode elements of vectors should be fine (it uses eltsize of the element sizes) and likewise SFmode could be fine, DFmode vectors are emitted as two 32-bit ints regardless of endianness and I'm afraid it can't be right on big-endian. But I haven't been able to create a testcase that emits a CONST_VECTOR, for e.g. unused vector vars with constant operands we emit CONCATN during expansion and thus ... DW_OP_piece for each element of the vector and for DW_TAG_call_site_parameter we give up (because we handle CONST_VECTOR only in loc_descriptor, not mem_loc_descriptor). 2021-03-21 Jakub Jelinek <jakub@redhat.com> PR debug/99388 * dwarf2out.c (insert_float): Change return type from void to unsigned, handle GET_MODE_SIZE (mode) == 2 and return element size. (mem_loc_descriptor, loc_descriptor, add_const_value_attribute): Adjust callers. (cherry picked from commit d3dd3703f1d42b14c88b91e51a2a775fe00a2974)
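The three `.2byte` values in the listing (0x3c00, 0x4000, 0x4200) are the IEEE binary16 encodings of 1.0, 2.0 and 3.0. The sketch below reproduces them with a deliberately simplified converter. `to_half_bits` is a hypothetical helper, not GCC's real_to_target. It assumes `float` is IEEE binary32 and handles only normal values in binary16's normal range, with mantissa truncation instead of rounding.

```cpp
#include <cstdint>
#include <cstring>

// Re-encodes a binary32 value as binary16 bits by rebiasing the exponent
// (bias 127 -> bias 15) and keeping the top 10 mantissa bits.
// Simplifications: no rounding, no infinities, NaNs, zeros or subnormals.
inline std::uint16_t to_half_bits(float value) {
  std::uint32_t bits;
  std::memcpy(&bits, &value, sizeof bits);  // well-defined type punning

  const std::uint32_t sign = (bits >> 16) & 0x8000u;
  const std::uint32_t exponent = ((bits >> 23) & 0xffu) - 127u + 15u;
  const std::uint32_t mantissa = (bits >> 13) & 0x3ffu;

  return static_cast<std::uint16_t>(sign | (exponent << 10) | mantissa);
}
```

Each result occupies exactly two bytes, which is why the `DW_OP_implicit_value` length of 2 has to be followed by a single `.2byte` directive.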
2021-04-22	c: Fix up -Wunused-but-set-* warnings for _Atomics [PR99588]	Jakub Jelinek	3	-6/+97
	As the following testcases show, compared to -D_Atomic= case we have many -Wunused-but-set-* warning false positives. When an _Atomic variable/parameter is read, we call mark_exp_read on it in convert_lvalue_to_rvalue, but build_atomic_assign does not. For consistency with the non-_Atomic case where we mark_exp_read the lhs for lhs op= ... but not for lhs = ..., this patch does that too. But furthermore we need to pattern match the trees emitted by _Atomic store, so that _Atomic store itself is not marked as being a variable read, but when the result of the store is used, we mark it. 2021-03-19 Jakub Jelinek <jakub@redhat.com> PR c/99588 * c-typeck.c (mark_exp_read): Recognize what build_atomic_assign with modifycode NOP_EXPR produces and mark the _Atomic var as read if found. (build_atomic_assign): For modifycode of NOP_EXPR, use COMPOUND_EXPRs rather than STATEMENT_LIST. Otherwise call mark_exp_read on lhs. Set TREE_SIDE_EFFECTS on the TARGET_EXPR. * gcc.dg/Wunused-var-5.c: New test. * gcc.dg/Wunused-var-6.c: New test. (cherry picked from commit b1fc1f1c4b2e9005c40ed476b067577da2d2ce84)
2021-04-22	c++: Ensure correct destruction order of local statics [PR99613]	Jakub Jelinek	1	-8/+16
	As mentioned in the PR, if the ends of the constructions of two local statics are strongly ordered, their destructors should be run in the reverse order. As we run __cxa_guard_release before calling __cxa_atexit, it is possible that we have two threads that access two local statics in the same order for the first time, one thread wins the __cxa_guard_acquire on the first one but is rescheduled in between the __cxa_guard_release and __cxa_atexit calls, then the other thread is scheduled and wins __cxa_guard_acquire on the second one and calls __cxa_guard_release and __cxa_atexit and only afterwards the first thread calls its __cxa_atexit. This means a variable whose completion of the constructor strongly happened after the completion of the other one will be destructed after the other variable is destructed. The following patch fixes that by swapping the __cxa_guard_release and __cxa_atexit calls. 2021-03-16 Jakub Jelinek <jakub@redhat.com> PR c++/99613 * decl.c (expand_static_init): For thread guards, call __cxa_atexit before calling __cxa_guard_release rather than after it. Formatting fixes. (cherry picked from commit 1703937a05b8b95bc29d2de292387dfd9eb7c9a3)
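The interleaving in this message can be modelled without threads by treating the exit-handler list as a LIFO stack. The sketch below is a toy model under that assumption. `ExitHandlers` and both scenario functions are hypothetical names, and the registration orders are written out by hand, not produced by a real race.

```cpp
#include <string>
#include <vector>

// Toy model of the exit-handler list: handlers run in reverse
// registration order, as with __cxa_atexit.
struct ExitHandlers {
  std::vector<std::string> registered;

  void register_destructor(const std::string &name) {
    registered.push_back(name);
  }

  // Returns the names in the order their destructors would run.
  std::string run_all() const {
    std::string order;
    for (auto it = registered.rbegin(); it != registered.rend(); ++it)
      order += *it;
    return order;
  }
};

// In both scenarios the construction of A completes before that of B.

// Guard released before registration: the thread that built A can be
// descheduled before registering, so B's handler is registered first.
inline std::string destroy_order_release_first() {
  ExitHandlers handlers;
  handlers.register_destructor("B");  // second thread registers first
  handlers.register_destructor("A");  // first thread registers late
  return handlers.run_all();
}

// Registration before guard release: A is registered before any other
// thread can observe A as constructed, so B's handler always comes later.
inline std::string destroy_order_register_first() {
  ExitHandlers handlers;
  handlers.register_destructor("A");
  handlers.register_destructor("B");
  return handlers.run_all();
}
```

In the first scenario A is destroyed before B even though A was constructed first. The second restores reverse order of construction.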
2021-04-22	expand: Fix ICE in store_bit_field_using_insv [PR93235]	Jakub Jelinek	2	-3/+22
	The following testcase ICEs on aarch64. The problem is that op0 is (subreg:HI (reg:HF ...) 0) and because we can't create a SUBREG of a SUBREG and aarch64 doesn't have HImode insv, only SImode insv, store_bit_field_using_insv tries to create (subreg:SI (reg:HF ...) 0) which is not valid for the target and so gen_rtx_SUBREG ICEs. The following patch fixes it by punting if the SUBREG to be created doesn't validate, callers of store_bit_field_using_insv can handle the fallback. 2021-03-04 Jakub Jelinek <jakub@redhat.com> PR middle-end/93235 * expmed.c (store_bit_field_using_insv): Return false if xop0 is a SUBREG and a SUBREG to op_mode can't be created. * gcc.target/aarch64/pr93235.c: New test. (cherry picked from commit 510ff5def87c70836fdbf832228661ae28e524b6)
2021-04-22	c++: Fix -fstrong-eval-order for operator &&, || and , [PR82959]	Jakub Jelinek	2	-0/+36
	P0145R3 added "However, the operands are sequenced in the order prescribed for the built-in operator" rule for overloaded operator calls when using the operator syntax. op_is_ordered follows that, but added just the overloaded operators added in that paper. &&, || and comma operators had rules that lhs is sequenced before rhs already in C++98. The following patch adds those cases to op_is_ordered. 2021-03-03 Jakub Jelinek <jakub@redhat.com> PR c++/82959 * call.c (op_is_ordered): Handle TRUTH_ANDIF_EXPR, TRUTH_ORIF_EXPR and COMPOUND_EXPR. * g++.dg/cpp1z/eval-order10.C: New test. (cherry picked from commit 529e3b3402bd2a97b02318bd834df72815be5f0f)
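A small way to observe the sequencing this message is about: an overloaded `operator&&` used with operator syntax, with each operand recording when it is evaluated. All names below (`Flag`, `trace`, `make_flag`, `both_flags`) are hypothetical and are not taken from the new test. Under the C++17 rule quoted in the message, the trace is expected to record the left operand first. Compilers in older modes, or without this fix, may record the operands in either order.

```cpp
#include <string>

struct Flag {
  bool value;
};

// Records the order in which operands are evaluated.
inline std::string &trace() {
  static std::string log;
  return log;
}

// Produces an operand and notes its tag in the trace as a side effect.
inline Flag make_flag(char tag, bool value) {
  trace() += tag;
  return Flag{value};
}

// Overloaded &&: unlike the built-in operator it does not short-circuit,
// so both operands are always evaluated.
inline bool operator&&(Flag lhs, Flag rhs) { return lhs.value && rhs.value; }

// Uses the operator syntax, which is the case the sequencing rule covers.
inline bool both_flags() {
  trace().clear();
  return make_flag('L', true) && make_flag('R', false);
}
```

After calling `both_flags()`, inspecting `trace()` shows which operand the compiler evaluated first.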
2021-04-22	c-family: Avoid ICE on va_arg [PR99324]	Jakub Jelinek	2	-3/+22
	build_va_arg calls the middle-end mark_addressable, which e.g. requires that cfun is non-NULL. The following patch calls instead c_common_mark_addressable_vec which is the c-family variant similarly to the FE c_mark_addressable and cxx_mark_addressable, except that it doesn't error on addresses of register variables. As the taking of the address is artificial for the .VA_ARG ifn and when that is lowered goes away, it is similar case to the vector subscripting for which c_common_mark_addressable_vec has been added. 2021-03-03 Jakub Jelinek <jakub@redhat.com> PR c/99324 * c-common.c (build_va_arg): Call c_common_mark_addressable_vec instead of mark_addressable. Fix a comment typo - neutrallly -> neutrally. * gcc.c-torture/compile/pr99324.c: New test. (cherry picked from commit 0e87dc86eb56f732a41af2590f0b807031003fbe)
2021-04-22	c++: Fix operator() lookup in lambdas [PR95451]	Jakub Jelinek	2	-1/+37
	During name lookup, name-lookup.c uses:
    if (!(!iter->type && HIDDEN_TYPE_BINDING_P (iter))
        && (bool (want & LOOK_want::HIDDEN_LAMBDA)
            || !is_lambda_ignored_entity (iter->value))
        && qualify_lookup (iter->value, want))
      binding = iter->value;
Unfortunately as the following testcase shows, this doesn't work in generic lambdas, where we on the auto b = ... lambda ICE and on the auto d = lambda reject it even when it should be valid. The problem is that the binding doesn't have a FUNCTION_DECL with LAMBDA_FUNCTION_P for the operator(), but an OVERLOAD with TEMPLATE_DECL for such FUNCTION_DECL. The following patch fixes that in is_lambda_ignored_entity, other possibility would be to do that before calling is_lambda_ignored_entity in name-lookup.c. 2021-02-26 Jakub Jelinek <jakub@redhat.com> PR c++/95451 * lambda.c (is_lambda_ignored_entity): Before checking for LAMBDA_FUNCTION_P, use OVL_FIRST. Drop FUNCTION_DECL check. * g++.dg/cpp1y/lambda-generic-95451.C: New test. (cherry picked from commit 8f9308936cf1df134d5aac1f890eb67266530ab5)
2021-04-22	fold-const: Fix up ((1 << x) & y) != 0 folding for vectors [PR99225]	Jakub Jelinek	2	-8/+39
	This optimization was written purely with scalar integers in mind, can work fine even with vectors, but we can't use build_int_cst but need to use build_one_cst instead. 2021-02-24 Jakub Jelinek <jakub@redhat.com> PR tree-optimization/99225 * fold-const.c (fold_binary_loc) <case NE_EXPR>: In (x & (1 << y)) != 0 to ((x >> y) & 1) != 0 simplifications use build_one_cst instead of build_int_cst (..., 1). Formatting fixes. * gcc.c-torture/compile/pr99225.c: New test. (cherry picked from commit 4de402ab60c54fff48cb7371644b024d10d7e5bb)
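The identity behind this folding, `(x & (1 << y)) != 0` versus `((x >> y) & 1) != 0`, can be checked directly for scalars, and a lane-wise version shows why a vector needs a "one" in every lane and not a single scalar constant. This is a plain C++ sketch with hypothetical names (`bit_set_via_mask`, `bit_set_via_shift`, `forms_agree`, `lanes_bit_set`). It does not use GCC's tree API.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Original form: test bit y of x by masking with (1 << y). Assumes y < 32.
inline bool bit_set_via_mask(std::uint32_t x, unsigned y) {
  return (x & (std::uint32_t{1} << y)) != 0;
}

// Folded form: shift bit y down to bit 0 and mask with 1. Assumes y < 32.
inline bool bit_set_via_shift(std::uint32_t x, unsigned y) {
  return ((x >> y) & 1u) != 0;
}

// Checks that both forms agree for every bit position of x.
inline bool forms_agree(std::uint32_t x) {
  for (unsigned y = 0; y < 32; ++y)
    if (bit_set_via_mask(x, y) != bit_set_via_shift(x, y)) return false;
  return true;
}

// Lane-wise version: the constant 1 is applied in each lane separately.
using Vec4 = std::array<std::uint32_t, 4>;
inline std::array<bool, 4> lanes_bit_set(const Vec4 &x, const Vec4 &y) {
  std::array<bool, 4> result{};
  for (std::size_t i = 0; i < 4; ++i)
    result[i] = ((x[i] >> y[i]) & 1u) != 0;
  return result;
}
```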
2021-04-22	fold-const: Fix ICE in fold_read_from_constant_string on invalid code [PR99204]	Jakub Jelinek	2	-1/+11
	fold_read_from_constant_string and expand_expr_real_1 have code to optimize constant reads from string (tree vs. rtl). If the STRING_CST array type has zero low bound, index is fold converted to sizetype and so the compare_tree_int works fine, but if it has some other low bound, it calls size_diffop_loc and that function from 2 sizetype operands creates a ssizetype difference. expand_expr_real_1 then uses tree_fits_uhwi_p + compare_tree_int and so works fine, but fold-const.c only checked if index is INTEGER_CST and calls compare_tree_int, which means for negative index it will succeed and result in UB in the compiler. 2021-02-23 Jakub Jelinek <jakub@redhat.com> PR tree-optimization/99204 * fold-const.c (fold_read_from_constant_string): Check that tree_fits_uhwi_p (index) rather than just that index is INTEGER_CST. * gfortran.dg/pr99204.f90: New test. (cherry picked from commit f53a9b563b5017af179f1fd900189c0ba83aa2ec)
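The difference between the two checks in this message comes down to whether a negative index is rejected before it is compared with the string length. A minimal sketch with hypothetical helper names, using plain integers in place of trees:

```cpp
#include <cstdint>

// Flawed check: only asks whether index is below length, which every
// negative index satisfies.
inline bool in_bounds_flawed(std::int64_t index, std::int64_t length) {
  return index < length;
}

// Checked version: the index must first be representable as an unsigned
// value (the role tree_fits_uhwi_p plays in the fix), then be below length.
inline bool in_bounds_checked(std::int64_t index, std::uint64_t length) {
  return index >= 0 && static_cast<std::uint64_t>(index) < length;
}
```

A caller that reads `str[index]` after the flawed check would read before the start of the string when `index` is negative.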
2021-04-22	libstdc++: Fix up constexpr std::char_traits<char>::compare [PR99181]	Jakub Jelinek	2	-1/+48
	Because of LWG 467, std::char_traits<char>::lt compares the values cast to unsigned char rather than char, so even when char is signed we get unsigned comparison. std::char_traits<char>::compare uses __builtin_memcmp and that works the same, but during constexpr evaluation we were calling __gnu_cxx::char_traits<char_type>::compare. As char_traits::lt is not virtual, __gnu_cxx::char_traits<char_type>::compare used __gnu_cxx::char_traits<char_type>::lt rather than std::char_traits<char>::lt and thus compared chars as signed if char is signed. This change fixes it by inlining __gnu_cxx::char_traits<char_type>::compare into std::char_traits<char>::compare by hand, so that it calls the right lt method. 2021-02-23 Jakub Jelinek <jakub@redhat.com> PR libstdc++/99181 * include/bits/char_traits.h (char_traits<char>::compare): For constexpr evaluation don't call __gnu_cxx::char_traits<char_type>::compare but do the comparison loop directly. * testsuite/21_strings/char_traits/requirements/char/99181.cc: New test. (cherry picked from commit 311c57f6d8f285d69e44bf94152c753900cb1a0a)
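A hand-written comparison loop of the kind this message describes might look like the sketch below. It is not the libstdc++ implementation: `lt_unsigned` and `compare_unsigned` are hypothetical names, and the code assumes C++14 or later so that a loop is allowed in a constexpr function.

```cpp
#include <cstddef>

// Compares two chars by their unsigned char values, as LWG 467 requires
// for std::char_traits<char>::lt.
constexpr bool lt_unsigned(char a, char b) {
  return static_cast<unsigned char>(a) < static_cast<unsigned char>(b);
}

// Comparison loop usable in constant evaluation; it always goes through
// lt_unsigned, so the signedness of plain char does not matter.
constexpr int compare_unsigned(const char *s1, const char *s2, std::size_t n) {
  for (std::size_t i = 0; i < n; ++i) {
    if (lt_unsigned(s1[i], s2[i])) return -1;
    if (lt_unsigned(s2[i], s1[i])) return 1;
  }
  return 0;
}

// 0x80 is 128 as unsigned char, so it sorts after 'a' (97) even where
// plain char is signed and '\x80' is negative.
static_assert(compare_unsigned("\x80", "a", 1) > 0,
              "chars must compare as unsigned char");
```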
2021-04-22	tree-cfg: Fix up gimple_merge_blocks FORCED_LABEL handling [PR99034]	Jakub Jelinek	2	-1/+34
	The verifiers require that DECL_NONLOCAL or EH_LANDING_PAD_NR labels are always the first label if there is more than one label. When merging blocks, we don't honor that though. On the following testcase, we try to merge blocks: <bb 13> [count: 0]: <L2>: S::~S (&s); and <bb 15> [count: 0]: <L0>: resx 1 where <L2> is landing pad and <L0> is FORCED_LABEL. And the code puts the FORCED_LABEL before the landing pad label, violating the verification requirements. The following patch fixes it by moving the FORCED_LABEL after the DECL_NONLOCAL or EH_LANDING_PAD_NR label if it is the first label. 2021-02-19 Jakub Jelinek <jakub@redhat.com> PR ipa/99034 * tree-cfg.c (gimple_merge_blocks): If bb a starts with eh landing pad or non-local label, put FORCED_LABELs from bb b after that label rather than before it. * g++.dg/opt/pr99034.C: New test. (cherry picked from commit 33be24d77d3d8f0c992eb344ce63f78e14cf753d)
2021-04-22	c: Fix ICE with -fexcess-precision=standard [PR99136]	Jakub Jelinek	2	-1/+12
	The following testcase ICEs on i686-linux, because c_finish_return wraps c_fully_folded retval back into EXCESS_PRECISION_EXPR, but when the function return type is void, we don't call convert_for_assignment on it that would then be fully folded again, but just put the retval into RETURN_EXPR's operand, so nothing removes it anymore and during gimplification we ICE as EXCESS_PRECISION_EXPR is not handled. This patch fixes it by not adding that EXCESS_PRECISION_EXPR in functions returning void, the return value is ignored and all we need is evaluate any side-effects of the expression. 2021-02-18 Jakub Jelinek <jakub@redhat.com> PR c/99136 * c-typeck.c (c_finish_return): Don't wrap retval into EXCESS_PRECISION_EXPR in functions that return void. * gcc.dg/pr99136.c: New test. (cherry picked from commit 3d7ce7ce6c03165ca1041b38e02428c925254968)
2021-04-22	c++: Fix up build_zero_init_1 once more [PR99106]	Jakub Jelinek	2	-1/+6
	My earlier build_zero_init_1 patch for flexible array members created an empty CONSTRUCTOR. As the following testcase shows, that doesn't work very well because the middle-end doesn't expect CONSTRUCTOR elements with incomplete type (that the empty CONSTRUCTOR at the end of outer CONSTRUCTOR had). The following patch just doesn't add any CONSTRUCTOR for the flexible array members, it doesn't seem to be needed. 2021-02-17 Jakub Jelinek <jakub@redhat.com> PR sanitizer/99106 * init.c (build_zero_init_1): For flexible array members just return NULL_TREE instead of returning empty CONSTRUCTOR with non-complete ARRAY_TYPE. * g++.dg/ubsan/pr99106.C: New test. (cherry picked from commit af868e89ec21340d1cafd26eaed356ce4b0104c3)
2021-04-22	match.pd: Fix up A % (cast) (pow2cst << B) simplification [PR99079]	Jakub Jelinek	3	-6/+82
	The (mod @0 (convert?@3 (power_of_two_cand@1 @2))) simplification uses tree_nop_conversion_p (type, TREE_TYPE (@3)) condition, but I believe it doesn't check what it was meant to check. On convert?@3 TREE_TYPE (@3) is not the type of what it has been converted from, but what it has been converted to, which needs to be (because it is operand of normal binary operation) equal or compatible to type of the modulo result and first operand - type. I could fix that by using && tree_nop_conversion_p (type, TREE_TYPE (@1)) and be done with it, but actually most of the non-nop conversions are IMHO ok and so we would regress those optimizations. In particular, if we have say narrowing conversions (foo5 and foo6 in the new testcase), I think we are fine, either the shift of the power of two constant after narrowing conversion is still that power of two (or negation of that) and then it will still work, or the result of narrowing conversion is 0 and then we would have UB which we can ignore. Similarly, widening conversions where the shift result is unsigned are fine, or even widening conversions where the shift result is signed, but we sign extend to a signed wider divisor, the problematic case of INT_MIN will become x % (long long) INT_MIN and we can still optimize that to x & (long long) INT_MAX. What doesn't work is the case in the pr99079.c testcase, widening conversion of a signed shift result to wider unsigned divisor, where if the shift is negative, we end up with x % (unsigned long long) INT_MIN which is x % 0xffffffff80000000ULL where the divisor is not a power of two and we can't optimize that to x & 0x7fffffffULL. So, the patch rejects only the single problematic case. 
Furthermore, when the shift result is signed, we were introducing UB into a program which previously didn't have one (well, left shift into the sign bit is UB in some language/version pairs, but it is definitely valid in C++20 - wonder if I shouldn't move the gcc.c-torture/execute/pr99079.c testcase to g++.dg/torture/pr99079.C and use -std=c++20), by adding that subtraction of 1, x % (1 << 31) in C++20 is well defined, but x & ((1 << 31) - 1) triggers UB on the subtraction. So, the patch performs the subtraction in the unsigned type if it isn't wrapping. 2021-02-15 Jakub Jelinek <jakub@redhat.com> PR tree-optimization/99079 * match.pd (A % (pow2pcst << N) -> A & ((pow2pcst << N) - 1)): Remove useless tree_nop_conversion_p (type, TREE_TYPE (@3)) check. Instead require both type and TREE_TYPE (@1) to be integral types and either type having smaller or equal precision, or TREE_TYPE (@1) being unsigned type, or type being signed type. If TREE_TYPE (@1) doesn't have wrapping overflow, perform the subtraction of one in unsigned type. * gcc.dg/fold-modpow2-2.c: New test. * gcc.c-torture/execute/pr99079.c: New test. (cherry picked from commit 45de8afb2d534e3b38b4d1898686b20c29cc6a94)
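	The arithmetic behind the accepted and rejected conversions can be checked with a small sketch. The helper names are hypothetical, the fixed-width types stand in for int, long long and unsigned long long, and the signed case assumes a nonnegative dividend.

```cpp
#include <cstdint>

// Hypothetical helper: true when v has exactly one bit set.
constexpr bool is_power_of_two(std::uint64_t v)
{
  return v != 0 && (v & (v - 1)) == 0;
}

// For an unsigned dividend and a power-of-two divisor, the remainder is
// the dividend masked with (divisor - 1).
constexpr std::uint64_t mod_by_mask(std::uint64_t x, std::uint64_t pow2)
{
  return x & (pow2 - 1);
}

// Problematic case: a signed 32-bit shift result of INT_MIN widened to an
// unsigned 64-bit divisor is sign-extended and is no longer a power of two.
constexpr std::uint64_t widened_unsigned = static_cast<std::uint64_t>(INT32_MIN);
static_assert(widened_unsigned == 0xffffffff80000000ULL, "sign-extended value");
static_assert(!is_power_of_two(widened_unsigned), "cannot use the mask form");

// Acceptable case: widening to a signed 64-bit divisor. For a nonnegative
// dividend, x % (long long) INT_MIN equals x & (long long) INT_MAX.
constexpr std::int64_t widened_signed = INT32_MIN;
constexpr std::int64_t sample = 5000000000LL;
static_assert(sample % widened_signed
              == (sample & static_cast<std::int64_t>(INT32_MAX)),
              "mask form still valid");
```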
2021-04-22	c++: Fix zero initialization of flexible array members [PR99033]	Jakub Jelinek	2	-9/+29
	array_type_nelts returns error_mark_node for type of flexible array members and build_zero_init_1 was placing an error_mark_node into the CONSTRUCTOR, on which e.g. varasm ICEs. I think there is nothing erroneous about zero initialization of flexible array members though, such arrays should simply get no elements, like they do if such classes are constructed (everything except when some larger initializer comes from an explicit initializer). So, this patch handles [] arrays in zero initialization like [0] arrays and fixes handling of the [0] arrays - the tree_int_cst_equal (max_index, integer_minus_one_node) check didn't do what it thought it would do, max_index is typically unsigned integer (sizetype) and so it is never equal to a -1. What the patch doesn't do and maybe would be desirable is if it returns error_mark_node for other reasons let the recursive callers not stick that into CONSTRUCTOR but return error_mark_node instead. But I don't have a testcase where that would be needed right now. 2021-02-11 Jakub Jelinek <jakub@redhat.com> PR c++/99033 * init.c (build_zero_init_1): Handle zero initialization of flexible array members like initialization of [0] arrays. Use integer_minus_onep instead of comparison to integer_minus_one_node and integer_zerop instead of comparison against size_zero_node. Formatting fixes. * g++.dg/ext/flexary38.C: New test. (cherry picked from commit ea535f59b19f65e5b313c990ee6c194a7b055bd7)
2021-04-22	varasm: Fix ICE with -fsyntax-only [PR99035]	Jakub Jelinek	2	-1/+14
	My FE change from 2 years ago uses TREE_ASM_WRITTEN in -fsyntax-only mode more aggressively to avoid "expanding" functions multiple times. With -fsyntax-only nothing is really expanded, so I think it is acceptable to adjust the assert and allow declare_weak at any time, with -fsyntax-only we know it is during parsing only anyway. 2021-02-10 Jakub Jelinek <jakub@redhat.com> PR c++/99035 * varasm.c (declare_weak): For -fsyntax-only, allow even TREE_ASM_WRITTEN function decls. * g++.dg/ext/weak6.C: New test. (cherry picked from commit a964f494cd5a90f631b8c0c01777a9899e0351ce)
2021-04-22	openmp: Temporarily disable into_ssa when gimplifying OpenMP reduction clauses [PR99007]	Jakub Jelinek	5	-0/+72
	gimplify_scan_omp_clauses was already calling gimplify_expr with false as last argument to make sure it is not an SSA_NAME, but as the testcases show, that is not enough, SSA_NAME temporaries created during that gimplification can be reused too and we can't allow SSA_NAMEs to be used across OpenMP region boundaries, as we can only firstprivatize decls. Fixed by temporarily disabling into_ssa. 2021-02-10 Jakub Jelinek <jakub@redhat.com> PR middle-end/99007 * gimplify.c (gimplify_scan_omp_clauses): For MEM_REF on reductions, temporarily disable gimplify_ctxp->into_ssa around gimplify_expr calls. * g++.dg/gomp/pr99007.C: New test. * gcc.dg/gomp/pr99007-1.c: New test. * gcc.dg/gomp/pr99007-2.c: New test. * gcc.dg/gomp/pr99007-3.c: New test. (cherry picked from commit deba6b20a3889aa23f0e4b3a5248de4172a0167d)
2021-04-22	c++: Fix ICE with structured binding initialized to incomplete array [PR97878]	Jakub Jelinek	2	-0/+30
	We ICE on the following testcase, for incomplete array a on auto [b] { a }; without giving any kind of diagnostics, with auto [c] = a; during error-recovery. The problem is that we get too far through check_initializer and e.g. store_init_value -> constexpr stuff can't deal with incomplete array types. As the type of the structured binding artificial variable is always deduced, I think it is easiest to diagnose this early, even if they have array types we'll need their deduced type to be complete rather than just its element type. 2021-02-05 Jakub Jelinek <jakub@redhat.com> PR c++/97878 * decl.c (check_array_initializer): For structured bindings, require the array type to be complete. * g++.dg/cpp1z/decomp54.C: New test. (cherry picked from commit 8b7f2d3eae16dd629ae7ae40bb76f4bb0099f441)
2021-04-22	ifcvt: Avoid ICEs trying to force_operand random RTL [PR97487]	Jakub Jelinek	3	-6/+92
	As the testcase shows, RTL ifcvt can throw random RTL (whatever it found in some insns) at expand_binop or expand_unop and expects it to do something (and then will check if it created valid insns and punts if not). These functions in the end if the operands don't match try to copy_to_mode_reg the operands, which does if (!general_operand (x, VOIDmode)) x = force_operand (x, temp); but, force_operand is far from handling all possible RTLs, it will ICE for all more unusual RTL codes. Basically handles just simple arithmetic and unary RTL operations if they have an optab and expand_simple_binop/expand_simple_unop ICE on others. The following patch fixes it by adding some operand verification (whether there is a hope that copy_to_mode_reg will succeed on those). It is added both to noce_emit_move_insn (not needed for this exact testcase, that function simply tries to recog the insn as is and if it fails, handles some simple binop/unop cases; the patch performs the verification of their operands) and noce_try_sign_mask. 2021-02-03 Jakub Jelinek <jakub@redhat.com> PR middle-end/97487 * ifcvt.c (noce_can_force_operand): New function. (noce_emit_move_insn): Use it. (noce_try_sign_mask): Likewise. Formatting fix. * gcc.dg/pr97487-1.c: New test. * gcc.dg/pr97487-2.c: New test. (cherry picked from commit 025a0ee3911c0866c69f841df24a558c7c8df0eb)
2021-04-22	expand: Fix up find_bb_boundaries [PR98331]	Jakub Jelinek	2	-0/+19
	When expansion emits some control flow insns etc. inside of a former GIMPLE basic block, find_bb_boundaries needs to split it into multiple basic blocks. The code needs to ignore debug insns in decisions how many splits to do or where in between some non-debug insns the split should be done, but it can decide where to put debug insns if they can be kept and otherwise throws them away (they can't stay outside of basic blocks). On the following testcase, we end up in the bb from expander with control flow insn debug insns barrier some other insn (the some other insn is effectively dead after __builtin_unreachable and we'll optimize that out later). Without debug insns, we'd do the split when encountering some other insn and split after PREV_INSN (some other insn), i.e. after barrier (and the splitting code then moves the barrier in between basic blocks). But if there are debug insns, we actually split before the first debug insn that appeared after the control flow insn, so after control flow insn, and get a basic block that starts with debug insns and then has a barrier in the middle that nothing moves it out of the bb. This leads to ICEs and even if it wouldn't, different behavior from -g0. The reason for treating debug insns that way is a different case, e.g. control flow insn debug insns some other insn or even control flow insn barrier debug insns some other insn where splitting before the first such debug insn allows us to keep them while otherwise we would have to drop them on the floor, and in those situations we behave the same with -g0 and -g. 
So, the following patch fixes it by resetting debug_insn not just when splitting the blocks (it is set only after seeing a control flow insn and before splitting for it if needed), but also when seeing a barrier, which effectively means we always throw away debug insns after a control flow insn and before following barrier if any, but there is no way around that, control flow insn must be the last in the bb (BB_END) and BARRIER after it, debug insns aren't allowed outside of bb. We still handle the other cases fine (when there is no barrier or when debug insns appear only after the barrier). 2021-01-29 Jakub Jelinek <jakub@redhat.com> PR debug/98331 * cfgbuild.c (find_bb_boundaries): Reset debug_insn when seeing a BARRIER. * gcc.dg/pr98331.c: New test. (cherry picked from commit ea0e1eaa30f42e108f6c716745347cc1dcfdc475)
2021-04-22	c++: Fix up handling of register ... asm ("...") vars in templates [PR33661, PR98847]	Jakub Jelinek	3	-1/+39
	As the testcase shows, for vars appearing in templates, we don't attach the asm spec string to the pattern decls, nor pass it back to cp_finish_decl during instantiation. The following patch does that. 2021-01-28 Jakub Jelinek <jakub@redhat.com> PR c++/33661 PR c++/98847 * decl.c (cp_finish_decl): For register vars with asmspec in templates call set_user_assembler_name and set DECL_HARD_REGISTER. * pt.c (tsubst_expr): When instantiating DECL_HARD_REGISTER vars, pass asmspec_tree to cp_finish_decl. * g++.dg/opt/pr98847.C: New test. (cherry picked from commit cf93f94b3498f3925895fb0bbfd4b64232b9987a)
2021-04-22	aarch64: Tighten up checks for ubfiz [PR98681]	Jakub Jelinek	2	-4/+23
	The testcase in the patch doesn't assemble, because the instruction requires that the penultimate operand (lsb) range is [0, 32] (or [0, 64]) and the last operand's range is [1, 32 - lsb] (or [1, 64 - lsb]). The INTVAL (shft_amnt) < GET_MODE_BITSIZE (mode) will accept the lsb operand to be in range [MIN, 32] (or [MIN, 64]) and then we invoke UB in the compiler and sometimes it will make it through. The patch changes all the INTVAL uses in that function to UINTVAL, which isn't strictly necessary, but can be done (e.g. after the UINTVAL (shft_amnt) < GET_MODE_BITSIZE (mode) check we know it is not negative and thus INTVAL (shft_amnt) and UINTVAL (shft_amnt) then behave the same). But, I had to add INTVAL (mask) > 0 check in that case, otherwise we risk (hypothetically) emitting instruction that doesn't assemble. The problem is with masks that have the MSB bit set: while the instruction can handle those, e.g. ubfiz w1, w0, 13, 19 will do (w0 << 13) & 0xffffe000, in RTL we represent SImode constants with MSB set as negative HOST_WIDE_INT, so it will actually be HOST_WIDE_INT_C (0xffffffffffffe000), and the instruction uses %P3 to print the last operand, which calls asm_fprintf (f, "%u", popcount_hwi (INTVAL (x))) to print that. But that will not print 19, but 51 instead, because it also counts all the copies of the sign bit. Not supporting those masks with MSB set isn't a big loss though, they really shouldn't appear normally, as both GIMPLE and RTL optimizations should optimize those away (one isn't masking any bits off with such masks, so just w0 << 13 will do too). 2021-01-26 Jakub Jelinek <jakub@redhat.com> PR target/98681 * config/aarch64/aarch64.c (aarch64_mask_and_shift_for_ubfiz_p): Use UINTVAL (shft_amnt) and UINTVAL (mask) instead of INTVAL (shft_amnt) and INTVAL (mask). Add && INTVAL (mask) > 0 condition. * gcc.c-torture/execute/pr98681.c: New test. (cherry picked from commit fb09d7242a25971b275292332337a56b86637f2c)
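	The 19 versus 51 discrepancy can be reproduced outside the compiler. The sketch below is only a model: `simode_as_hwi` and `popcount64` are hypothetical helpers, and a 64-bit signed integer stands in for HOST_WIDE_INT.

```cpp
#include <bitset>
#include <cstddef>
#include <cstdint>

// Hypothetical model of how a 32-bit constant with its top bit set is
// held in a 64-bit signed host integer: the value is sign-extended.
std::int64_t simode_as_hwi(std::uint32_t bits)
{
  const std::int64_t wide = static_cast<std::int64_t>(bits);
  return (bits & 0x80000000u) ? wide - (std::int64_t{1} << 32) : wide;
}

// Count the set bits in all 64 bits of the host integer.
std::size_t popcount64(std::int64_t v)
{
  return std::bitset<64>(static_cast<unsigned long long>(v)).count();
}
```

	With these helpers, popcount64 (simode_as_hwi (0xffffe000u)) counts the 19 mask bits plus the 32 copies of the sign bit, while the same mask held as a positive 64-bit value has only 19 bits set.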
2021-04-22	rs6000: Fix up __m64 typedef in mmintrin.h [PR97301]	Jakub Jelinek	1	-1/+2
	The x86 __m64 type is defined as: /* The Intel API is flexible enough that we must allow aliasing with other vector types, and their scalar components. */ typedef int __m64 __attribute__ ((__vector_size__ (8), __may_alias__)); and so matches the comment above it in that reads and stores through pointers to __m64 can alias anything. But in the rs6000 headers that is the case only for __m128, but not __m64. The following patch adds that attribute, which fixes the FAIL: gcc.target/powerpc/sse-movhps-1.c execution test FAIL: gcc.target/powerpc/sse-movlps-1.c execution test regressions that appeared when Honza improved ipa-modref. 2021-01-23 Jakub Jelinek <jakub@redhat.com> PR testsuite/97301 * config/rs6000/mmintrin.h (__m64): Add __may_alias__ attribute. (cherry picked from commit db9a3ce7b83ce3ed3e0ffe7eb7a918595640e161)
2021-04-22	c++: Fix up ubsan false positives on references [PR95693]	Jakub Jelinek	3	-8/+35
	Alex' 2 years old change to build_zero_init_1 to return NULL pointer with reference type for references breaks the sanitizers, the assignment of NULL to a reference typed member is then instrumented before it is overwritten with a non-NULL address later on. That change has been done to fix error recovery ICE during process_init_constructor_record, where we: if (TYPE_REF_P (fldtype)) { if (complain & tf_error) error ("member %qD is uninitialized reference", field); else return PICFLAG_ERRONEOUS; } a few lines earlier, but then continue and ICE when build_zero_init returns NULL. The following patch reverts the build_zero_init_1 change and instead creates the NULL with reference type constants during the error recovery. The pr84593.C testcase Alex' change was fixing still works as before. 2021-01-22 Jakub Jelinek <jakub@redhat.com> PR sanitizer/95693 * init.c (build_zero_init_1): Revert the 2018-03-06 change to return build_zero_cst for reference types. * typeck2.c (process_init_constructor_record): Instead call build_zero_cst here during error recovery instead of build_zero_init. * g++.dg/ubsan/pr95693.C: New test. (cherry picked from commit e5750f847158e7f9bdab770fd9c5fff58c5074d3)
2021-04-22	match.pd: Replace incorrect simplifications into copysign [PR90248]	Jakub Jelinek	3	-31/+90
	In the PR Andrew said he has implemented a simplification that has been added to LLVM, but that actually is not true, what is in there are X * (X cmp 0.0 ? +-1.0 : -+1.0) simplifications into +-abs(X) but what has been added into GCC are (X cmp 0.0 ? +-1.0 : -+1.0) simplifications into copysign(1, +-X) and then X * copysign (1, +-X) into +-abs (X). The problem is with the (X cmp 0.0 ? +-1.0 : -+1.0) simplifications, they don't work correctly when X is zero. E.g. (X > 0.0 ? 1.0 : -1.0) is -1.0 when X is either -0.0 or 0.0, but copysign will make it return 1.0 for 0.0 and -1.0 only for -0.0. (X >= 0.0 ? 1.0 : -1.0) is 1.0 when X is either -0.0 or 0.0, but copysign will make it return still 1.0 for 0.0 and -1.0 for -0.0. The simplifications were guarded on !HONOR_SIGNED_ZEROS, but as discussed in the PR, that option doesn't mean that -0.0 will not ever appear as operand of some operation, it is hard to guarantee that without compiler adding canonicalizations of -0.0 to 0.0 after most of the operations and thus making it very slow, but that the user asserts that he doesn't care if the result of operations will be 0.0 or -0.0. Not to mention that some of the transformations are incorrect even for positive 0.0. So, instead of those simplifications this patch recognizes patterns where those ?: expressions are multiplied by X, directly into +-abs. That works fine even for 0.0 and -0.0 (as long as we don't care about whether the result is exactly 0.0 or -0.0 in those cases), because whether the result of copysign is -1.0 or 1.0 doesn't matter when it is multiplied by 0.0 or -0.0. As a follow-up, maybe we should add the simplification mentioned in the PR, in particular doing copysign by hand through VIEW_CONVERT_EXPR <int, float_X> < 0 ? -float_constant : float_constant into copysign (float_constant, float_X). But I think that would need to be done in phiopt. 2021-01-22 Jakub Jelinek <jakub@redhat.com> PR tree-optimization/90248 * match.pd (X cmp 0.0 ? 1.0 : -1.0 -> copysign(1, +-X), X cmp 0.0 ? -1.0 : +1.0 -> copysign(1, -+X)): Remove simplifications. (X * (X cmp 0.0 ? 1.0 : -1.0) -> +-abs(X), X * (X cmp 0.0 ? -1.0 : 1.0) -> +-abs(X)): New simplifications. * gcc.dg/tree-ssa/copy-sign-1.c: Don't expect any copysign builtins. * gcc.dg/pr90248.c: New test. (cherry picked from commit dd92986ea6d2d363146e1726817a84910453fdc8)
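	The behaviour at zero described above can be observed directly. In this small sketch the function names are hypothetical and the code assumes IEEE 754 doubles with signed zeros.

```cpp
#include <cmath>

// The ?: expression from the commit message: -1.0 for both 0.0 and -0.0,
// because neither compares greater than 0.0.
double select_sign(double x)
{
  return x > 0.0 ? 1.0 : -1.0;
}

// What the removed simplification would have produced instead:
// 1.0 for 0.0 and -1.0 for -0.0.
double copysign_one(double x)
{
  return std::copysign(1.0, x);
}

// The pattern the patch recognizes: multiplying by X hides the difference
// at zero, so this compares equal to std::fabs(x) for every non-NaN x.
double times_select(double x)
{
  return x * select_sign(x);
}
```

	For x equal to 0.0, select_sign yields -1.0 while copysign_one yields 1.0, so the two forms disagree. times_select (0.0) is -0.0, which compares equal to std::fabs (0.0) and differs only in the sign of zero, the case the commit message explicitly tolerates.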
2021-04-22	tree-cfg: Allow enum types as result of POINTER_DIFF_EXPR [PR98556]	Jakub Jelinek	2	-1/+12
	As conversions between signed integers and signed enums with the same precision are useless in GIMPLE, it seems strange that we require that POINTER_DIFF_EXPR result must be INTEGER_TYPE. If we really wanted to require that, we'd need to change the gimplifier to ensure that, which isn't the case on the following testcase. What is going on during the gimplification is that when we have the (enum T) (p - q) cast, it is stripped through /* Strip away as many useless type conversions as possible at the toplevel. */ STRIP_USELESS_TYPE_CONVERSION (expr_p); and when the MODIFY_EXPR is gimplified, the to_p has enum T type, while from_p has intptr_t type and as there is no conversion in between, we just create GIMPLE_ASSIGN from that. 2021-01-09 Jakub Jelinek <jakub@redhat.com> PR c++/98556 * tree-cfg.c (verify_gimple_assign_binary): Allow lhs of POINTER_DIFF_EXPR to be any integral type. * c-c++-common/pr98556.c: New test. (cherry picked from commit 0188eab844eacda5edc6257771edb771844ae069)
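	Source code of roughly the following shape produces such a cast. This is a guess at the general form only, not the contents of pr98556.c: the enum `T`, its underlying type and the function name are assumptions made for illustration.

```cpp
#include <cstddef>

// Hypothetical signed enum with the same precision as the pointer
// difference type, so the cast below is a no-op at the GIMPLE level.
enum T : std::ptrdiff_t { };

// The pointer subtraction result is cast straight to the enum type,
// giving an assignment whose left-hand side has enum type.
T pointer_distance(const int* p, const int* q)
{
  return static_cast<T>(p - q);
}
```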
2021-04-22	wide-int: Fix wi::to_mpz [PR98474]	Jakub Jelinek	2	-0/+44
	The following testcase is miscompiled, because niter analysis miscomputes the number of iterations to 0. The problem is that niter analysis uses mpz_t (wonder why, wouldn't widest_int do the same job?) and when wi::to_mpz is called e.g. on the TYPE_MAX_VALUE of __uint128_t, it initializes the mpz_t result with wrong value. wi::to_mpz has code to handle negative wide_ints in signed types by inverting all bits, importing to mpz and complementing it, which is fine, but doesn't handle correctly the case when the wide_int's len (times HOST_BITS_PER_WIDE_INT) is smaller than precision when wi::neg_p. E.g. the 0xffffffffffffffffffffffffffffffff TYPE_MAX_VALUE is represented in wide_int as 0xffffffffffffffff len 1, and wi::to_mpz would create 0xffffffffffffffff mpz_t value from that. This patch handles it by adding the needed -1 host wide int words (and has also code to deal with precision that aren't multiple of HOST_BITS_PER_WIDE_INT). 2020-12-31 Jakub Jelinek <jakub@redhat.com> PR tree-optimization/98474 * wide-int.cc (wi::to_mpz): If wide_int has MSB set, but type is unsigned and excess negative, append set bits after len until precision. * gcc.c-torture/execute/pr98474.c: New test. (cherry picked from commit a4d191d08c6acb24034af4182b3524e6ef97546c)
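	The representation issue can be modelled without GCC's wide_int or GMP. The sketch below is a simplified stand-in, not the code in wide-int.cc: `expand_to_precision` is a hypothetical function, words are 64 bits and stored least significant first, and the input is assumed to hold at least one word and no more words than the precision needs.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Simplified model of a compressed wide integer: only the low
// blocks.size() words are stored and the remaining words up to
// `precision` bits are implied by the top stored bit.
std::vector<std::uint64_t>
expand_to_precision(const std::vector<std::uint64_t>& blocks, unsigned precision)
{
  const unsigned word_bits = 64;
  const std::size_t needed = (precision + word_bits - 1) / word_bits;
  const bool top_bit_set = (blocks.back() >> (word_bits - 1)) != 0;

  std::vector<std::uint64_t> out(blocks);
  // Append the implied words: all ones if the top stored bit is set,
  // otherwise zero.
  out.resize(needed, top_bit_set ? ~std::uint64_t{0} : std::uint64_t{0});

  // Clear the bits above the precision when it is not a multiple of the
  // word size.
  const unsigned rem = precision % word_bits;
  if (rem != 0)
    out.back() &= (std::uint64_t{1} << rem) - 1;
  return out;
}
```

	For the 128-bit all-ones maximum stored as a single 0xffffffffffffffff word, the expansion yields two all-ones words, whereas exporting only the stored word gives the truncated value the commit message describes.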
2021-04-22	gimplify: Gimplify value in gimplify_init_ctor_eval_range [PR98353]	Jakub Jelinek	2	-1/+22
	gimplify_init_ctor_eval_range wasn't gimplifying value, so if it wasn't a gimple val, verification at the end of gimplification would ICE (or with release checking some random pass later on would ICE or misbehave). 2020-12-21 Jakub Jelinek <jakub@redhat.com> PR c++/98353 * gimplify.c (gimplify_init_ctor_eval_range): Gimplify value before storing it into cref. * g++.dg/opt/pr98353.C: New test. (cherry picked from commit f3113a85f098df8165624321cc85d20219fb2ada)
2021-04-22	openmp: Don't optimize shared to firstprivate on task with depend clause	Jakub Jelinek	2	-0/+59
	The attached testcase is miscompiled, because we optimize shared clauses to firstprivate when task body can't modify the variable even when the task has depend clause. That is wrong, because firstprivate means the variable will be copied immediately when the task is created, while with depend clause some other task might change it later before the dependencies are satisfied and the task should observe the value only after the change. 2020-12-18 Jakub Jelinek <jakub@redhat.com> * gimplify.c (struct gimplify_omp_ctx): Add has_depend member. (gimplify_scan_omp_clauses): Set it to true if OMP_CLAUSE_DEPEND appears on OMP_TASK. (gimplify_adjust_omp_clauses_1, gimplify_adjust_omp_clauses): Force GOVD_WRITTEN on shared variables if task construct has depend clause. * testsuite/libgomp.c/task-6.c: New test. (cherry picked from commit 99ddd36e800a24fcb744a14ff9adabb0a7ef72c8)
2021-04-22	openmp, openacc: Fix up handling of data regions [PR98183]	Jakub Jelinek	4	-20/+41
	While the data regions (target data and OpenACC counterparts) aren't standalone directives, unlike most other OpenMP/OpenACC constructs we allow (apparently as an extension) exceptions and goto out of the block. During gimplification we place an end call into a finally block so that it is reached even on exceptions, goto out of the block, etc. During the omplower pass we then add paired #pragma omp return for them, but because exceptions mean the region is not SESE, we can end up with #pragma omp return appearing only conditionally in the CFG etc., which the ompexp pass can't handle. For the ompexp pass, we actually don't care about the end part or about target data nesting, so we can treat it as standalone directive. 2020-12-12 Jakub Jelinek <jakub@redhat.com> PR middle-end/98183 * omp-low.c (lower_omp_target): Don't add OMP_RETURN for data regions. * omp-expand.c (expand_omp_target): Don't try to remove OMP_RETURN for data regions. (build_omp_regions_1, omp_make_gimple_edges): Don't expect OMP_RETURN for data regions. * gcc.dg/gomp/pr98183.c: New test. * gcc.dg/goacc/pr98183.c: New test. (cherry picked from commit 8c1ed7223ad1bc19ed9c936ba496220c8ef673bc)
2021-04-22	openmp: Fix ICE with broken doacross loop [PR98205]	Jakub Jelinek	2	-7/+42
	If the loop body doesn't ever continue, we don't have a bb to insert the updates. Fixed by not adding them at all in that case. 2020-12-10 Jakub Jelinek <jakub@redhat.com> PR middle-end/98205 * omp-expand.c (expand_omp_for_generic): Fix up broken_loop handling. * c-c++-common/gomp/doacross-4.c: New test. (cherry picked from commit c925d4cebf817905c237aa2d93887f254b4a74f4)