path: root/gcc
Age  Commit message  Author  Files  Lines
2023-05-30Daily bump.GCC Administrator1-1/+1
2023-05-29Daily bump.GCC Administrator1-1/+1
2023-05-28Daily bump.GCC Administrator1-1/+1
2023-05-27Daily bump.GCC Administrator1-1/+1
2023-05-26Daily bump.GCC Administrator1-1/+1
2023-05-25Daily bump.GCC Administrator1-1/+1
2023-05-24Daily bump.GCC Administrator1-1/+1
2023-05-23Daily bump.GCC Administrator3-1/+14
2023-05-22Do not generate vmaddfp and vnmsubfpMichael Meissner2-17/+55
This is version 3 of the patch. This is essentially version 1 with the removal of changes to altivec.md, and cleanup of the comments. Version 2 generated the vmaddfp and vnmsubfp instructions if -Ofast was used, and those changes are deleted in this patch. The Altivec instructions vmaddfp and vnmsubfp have different rounding behaviors than the VSX xvmaddsp and xvnmsubsp instructions. In particular, generating these instructions seems to break Eigen on big endian systems. I have done bootstrap builds on power9 little endian (with both IEEE long double and IBM long double). I have also done the builds and test on a power8 big endian system (testing both 32-bit and 64-bit code generation). Chip has verified that it fixes the problem that Eigen encountered. Can I check this into the master GCC branch? After a burn-in period, can I check this patch into the active GCC branches? Thanks in advance. 2023-05-22 Michael Meissner <meissner@linux.ibm.com> gcc/ PR target/70243 * config/rs6000/vsx.md (vsx_fmav4sf4): Do not generate vmaddfp. Back port from master 04/10/2023. (vsx_nfmsv4sf4): Do not generate vnmsubfp. gcc/testsuite/ PR target/70243 * gcc.target/powerpc/pr70243.c: New test. Back port from master 04/10/2023.
2023-05-22Daily bump.GCC Administrator3-1/+24
2023-05-21Darwin: Update rules for handling alignment of globals.Iain Sandoe7-12/+68
The current rule was too strict and has not been required since Darwin11. This relaxes the constraint to allow up to 2^28 alignment for non-common entities. Common is still restricted to a maximum alignment of 2^15. When the host is an older version of Darwin (earlier than 11) then the existing constraint is still applied. Note that this is a host constraint not a target one (so that a compilation on 10.7 targeting 10.6 is allowed to use a greater alignment than the tools on 10.6 support). This matches the behaviour of clang. Signed-off-by: Iain Sandoe <iain@sandoe.co.uk> gcc/ChangeLog: * config.gcc: Emit L2_MAX_OFILE_ALIGNMENT with suitable values for the host. * config/darwin.c (darwin_emit_common): Error for alignment values > 32768. * config/darwin.h (MAX_OFILE_ALIGNMENT): Rework to use the configured L2_MAX_OFILE_ALIGNMENT. gcc/testsuite/ChangeLog: * gcc.dg/darwin-aligned-globals.c: New test. * gcc.dg/darwin-comm-1.c: New test. * gcc.dg/attr-aligned.c: Amend for new alignment values on Darwin. * gcc.target/i386/pr89261.c: Likewise. (cherry picked from commit 19bf83a9a068f2d5293b63c9300f99172b2d278d)
2023-05-21Daily bump.GCC Administrator1-1/+1
2023-05-20Daily bump.GCC Administrator1-1/+1
2023-05-19Daily bump.GCC Administrator1-1/+1
2023-05-18Daily bump.GCC Administrator1-1/+1
2023-05-17Daily bump.GCC Administrator1-1/+1
2023-05-16Daily bump.GCC Administrator1-1/+1
2023-05-15Daily bump.GCC Administrator1-1/+1
2023-05-14Daily bump.GCC Administrator1-1/+1
2023-05-13Daily bump.GCC Administrator1-1/+1
2023-05-12Daily bump.GCC Administrator1-1/+1
2023-05-11Daily bump.GCC Administrator1-1/+1
2023-05-10Daily bump.GCC Administrator3-1/+30
2023-05-09testsuite: Add further testcase for already fixed PR [PR109778]Jakub Jelinek2-0/+29
I came up with a testcase which reproduces all the way to r10-7469. LTO to avoid early inlining it, so that ccp handles rotates and not shifts before they are turned into rotates. 2023-05-09 Jakub Jelinek <jakub@redhat.com> PR tree-optimization/109778 * gcc.dg/lto/pr109778_0.c: New test. * gcc.dg/lto/pr109778_1.c: New file. (cherry picked from commit c2cf2dc988eb93551fa1c01d3f8d73ef21f39dc5)
2023-05-09tree-ssa-ccp, wide-int: Fix up handling of [LR]ROTATE_EXPR in bitwise ccp ↵Jakub Jelinek3-4/+35
[PR109778] The following testcase is miscompiled, because bitwise ccp2 handles a rotate with a signed type incorrectly. Seems tree-ssa-ccp.cc has the only callers of wi::[lr]rotate with 3 arguments, all other callers just rotate in the right precision and I think work correctly. ccp works with widest_ints and so rotations by the excessive precision certainly don't match what it wants when it sees a rotate in some specific bitsize. Still, if it is unsigned rotate and the widest_int is zero extended from width, the functions perform left shift and logical right shift on the value and then at the end zero extend the result of left shift and uselessly also the result of logical right shift and return | of that. On the testcase the signed char argument of the rrotate by 4 is CONSTANT -75, i.e. 0xffffffff....fffffb5, with mask 2. The mask is correctly rotated to 0x20, but because the 8-bit constant is sign extended to 192-bit one, the logical right shift by 4 doesn't yield expected 0xb, but gives 0xfffffffffff....ffffb, and then return wi::zext (left, width) | wi::zext (right, width); where left is 0xfffffff....fb50, so we return 0xfb instead of the expected 0x5b. The following patch fixes that by doing the zero extension in case of the right variable before doing wi::lrshift rather than after it. Also, wi::[lr]rotate with width < precision always zero extends the result. I'm afraid it can't do better because it doesn't know if it is done for an unsigned or signed type, but the caller in this case knows that very well, so I've done the extension based on sgn in the caller. E.g. 0x5b rotated right (or left) by 4 with width 8 previously gave 0xb5, but sgn == SIGNED in widest_int it should be 0xffffffff....fffb5 instead. 2023-05-09 Jakub Jelinek <jakub@redhat.com> PR tree-optimization/109778 * wide-int.h (wi::lrotate, wi::rrotate): Call wi::lrshift on wi::zext (x, width) rather than x if width != precision, rather than using wi::zext (right, width) after the shift.
* tree-ssa-ccp.c (bit_value_binop): Call wi::ext on the results of wi::lrotate or wi::rrotate. * gcc.c-torture/execute/pr109778.c: New test. (cherry picked from commit a8302d2a4669984c7c287d12ef5b37cde6699c80)
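The arithmetic in the message above can be replayed with a minimal sketch. It assumes a 64-bit `std::uint64_t` as a stand-in for the 192-bit widest_int, and the helper names `zext`, `sext`, `rrotate_old` and `rrotate_new` are hypothetical; this is not the wide-int.h code, only the same order of operations.

```cpp
#include <cassert>
#include <cstdint>

// Keep only the low `width` bits (zero extension); assumes 0 < width < 64.
inline std::uint64_t zext(std::uint64_t x, unsigned width) {
  return x & ((std::uint64_t{1} << width) - 1);
}

// Sign-extend from bit `width - 1`; assumes 0 < width < 64.
inline std::uint64_t sext(std::uint64_t x, unsigned width) {
  const std::uint64_t sign = std::uint64_t{1} << (width - 1);
  return (zext(x, width) ^ sign) - sign;
}

// Old order: logical right shift of the whole (possibly sign-extended)
// value first, zero extension of both halves afterwards.
inline std::uint64_t rrotate_old(std::uint64_t x, unsigned n, unsigned width) {
  const std::uint64_t left = x << (width - n);
  const std::uint64_t right = x >> n;  // drags sign-extension bits down
  return zext(left, width) | zext(right, width);
}

// New order: zero-extend the operand before the logical right shift.
inline std::uint64_t rrotate_new(std::uint64_t x, unsigned n, unsigned width) {
  const std::uint64_t left = x << (width - n);
  const std::uint64_t right = zext(x, width) >> n;
  return zext(left, width) | right;
}
```

With the signed char constant -75 sign-extended into the wide value, the old order yields 0xfb and the new order yields 0x5b. Applying `sext` to the rotate result afterwards plays the role the message assigns to the caller when sgn == SIGNED.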
2023-05-09Daily bump.GCC Administrator1-1/+1
2023-05-08Daily bump.GCC Administrator1-1/+1
2023-05-07Daily bump.GCC Administrator1-1/+1
2023-05-06Daily bump.GCC Administrator1-1/+1
2023-05-05Daily bump.GCC Administrator3-1/+26
2023-05-04tree-optimization/109724 - new testcaseRichard Biener1-0/+32
The following adds a testcase for PR109724 which was caused by backporting r13-2375-gbe1b42de9c151d and fixed by r11-199-g2b42509f8b7bdf. PR tree-optimization/109724 * g++.dg/torture/pr109724.C: New testcase. (cherry picked from commit ee99aaae4aeecd55f1d945a959652cf07e3b2e9e)
2023-05-04Revert "tree-optimization/106809 - compile time hog in VN"Richard Biener2-58/+27
This reverts commit 051f78a5c1d6994c10ee7c35453ff0ccee94e5c6.
2023-05-04Daily bump.GCC Administrator7-1/+850
2023-05-03call_summary: add missing template keywordAnthony Sharp1-2/+2
Without the 'template', this function template compares 'traverse' to 'f', and then compares the result to 'a'. Evidently it hasn't been instantiated yet. gcc/ChangeLog: * symbol-summary.h: Added missing template keyword. (cherry picked from commit fccd5b48adf568f0aabe5d5f51206a9d42da095a)
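A minimal sketch of why the keyword matters, assuming a hypothetical `table`/`summary` pair rather than the real symbol-summary.h classes: when the object has a dependent type, a member template call must be spelled with `template`, otherwise `<` parses as a less-than comparison.

```cpp
#include <cassert>
#include <vector>

// Hypothetical container with a member function template, loosely shaped
// like a map's traverse<Arg, Callback>(arg).
template <typename T>
struct table {
  std::vector<T> items;

  template <typename Arg, bool (*Callback)(const T &, Arg)>
  void traverse(Arg a) const {
    for (const T &item : items)
      if (!Callback(item, a))
        break;
  }
};

// Callback: add each item to the accumulator and keep going.
inline bool add_to(const int &item, int *sum) {
  *sum += item;
  return true;
}

template <typename T>
struct summary {
  table<T> m_table;

  int total() const {
    int sum = 0;
    // m_table has the dependent type table<T>, so 'template' is required
    // here; without it the expression reads as (m_table.traverse < ...) > ...
    m_table.template traverse<int *, add_to>(&sum);
    return sum;
  }
};
```

Instantiating `summary<int>` and calling `total()` is what forces the member call to be checked.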
2023-05-03reassoc: Fix up another ICE with returns_twice call [PR109410]Jakub Jelinek2-0/+28
The following testcase ICEs in reassoc; unlike the last case I've fixed there, here SSA_NAME_USED_IN_ABNORMAL_PHI is not the case anywhere. build_and_add_sum places new statements after the later appearing definition of an operand but if both operands are default defs or constants, we place the statement at the start of the function. If the very first statement of a function is a call to returns_twice function, this doesn't work though, because that call has to be the first thing in its basic block, so the following patch splits the entry successor edge such that the new statements are added into a different block from the returns_twice call. I think we should in stage1 reconsider such placements, I think it unnecessarily enlarges the lifetime of the new lhs if its operand(s) are used more than once in the function. Unless something sinks those again. Would be nice to place it closer to the actual uses (or where they will be placed). 2023-04-12 Jakub Jelinek <jakub@redhat.com> PR tree-optimization/109410 * tree-ssa-reassoc.c (build_and_add_sum): Split edge from entry block if first statement of the function is a call to returns_twice function. * gcc.dg/pr109410.c: New test. (cherry picked from commit 51856718a82ce60f067910d9037ca255645b37eb)
2023-05-03sanopt: Return TODO_cleanup_cfg if any .{UB,HWA,A}SAN_* calls were lowered ↵Jakub Jelinek2-1/+20
[PR106190] The following testcase ICEs, because without optimization eh lowering decides not to duplicate finally block of try/finally and so we end up with variable guarded cleanup. The sanopt pass creates a cfg that ought to be cleaned up (some IFN_UBSAN_* functions are lowered in this case with constant conditions in gcond and when not allowing recovery some bbs which end with noreturn calls actually have successor edges), but the cfg cleanup is actually (it is -O0) done only during the optimized pass. We notice there that the d[1][a] = 0; statement which has an EH edge is unreachable (because ubsan would always abort on the out of bounds d[1] access), remove the EH landing pad and block, but because that block just sets a variable and jumps to another one which tests that variable and that one is reachable from normal control flow, the __builtin_eh_pointer (1) later in there is kept in the IL and we ICE during expansion of that statement because the EH region has been removed. The following patch fixes it by doing the cfg cleanup already during sanopt pass if we create something that might need it, while the EH landing pad is then removed already during sanopt pass, there is ehcleanup later and we don't ICE anymore. 2023-03-28 Jakub Jelinek <jakub@redhat.com> PR middle-end/106190 * sanopt.c (pass_sanopt::execute): Return TODO_cleanup_cfg if any of the IFN_{UB,HWA,A}SAN_* internal fns are lowered. * gcc.dg/asan/pr106190.c: New test. (cherry picked from commit 39a43dc336561e0eba0de477b16c7355f19d84ee)
2023-05-03predict: Don't emit -Wsuggest-attribute=cold warning for functions which ↵Jakub Jelinek2-1/+22
already have that attribute [PR105685] In the following testcase, we predict baz to have cold entry regardless of the user supplied attribute (as it unconditionally calls a cold function), but still issue a -Wsuggest-attribute=cold warning despite it having that attribute already. The following patch avoids that. 2023-03-26 Jakub Jelinek <jakub@redhat.com> PR ipa/105685 * predict.c (compute_function_frequency): Don't call warn_function_cold if function already has cold attribute. * c-c++-common/cold-2.c: New test. (cherry picked from commit 7eca91d4781bb3df941f25c30b971dac66ba1b3d)
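The situation described above can be sketched as follows. The function names are hypothetical and this is not the cold-2.c testcase, which the log does not show; it only illustrates a function that already carries the cold attribute and unconditionally calls another cold function.

```cpp
#include <cassert>

static int error_count = 0;

// Rarely executed path, marked cold by hand (GNU attribute syntax).
__attribute__((cold, noinline)) void report_error() { ++error_count; }

// Unconditionally calls a cold function, so its entry is predicted cold.
// It already has the attribute, so -Wsuggest-attribute=cold has nothing
// useful to suggest for it.
__attribute__((cold, noinline)) void fail_and_report() { report_error(); }
```

Compiling such code with `-O2 -Wsuggest-attribute=cold` is the kind of invocation where the redundant suggestion would have appeared.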
2023-05-03c++: Drop TREE_READONLY on vars (possibly) initialized by tls wrapper [PR109164]Jakub Jelinek7-1/+115
The following two testcases are miscompiled, because we keep TREE_READONLY on the vars even when they are (possibly) dynamically initialized by a TLS wrapper function. Normally cp_finish_decl drops TREE_READONLY from vars which need dynamic initialization, but for TLS we do this kind of initialization upon every access to those variables. Keeping them TREE_READONLY means e.g. PRE can hoist loads from those before loops which contain the TLS wrapper calls, so we can access the TLS variables before they are initialized. 2023-03-20 Jakub Jelinek <jakub@redhat.com> PR c++/109164 * cp-tree.h (var_needs_tls_wrapper): Declare. * decl2.c (var_needs_tls_wrapper): No longer static. * decl.c (cp_finish_decl): Clear TREE_READONLY on TLS variables for which a TLS wrapper will be needed. * g++.dg/tls/thread_local13.C: New test. * g++.dg/tls/thread_local13-aux.cc: New file. * g++.dg/tls/thread_local14.C: New test. * g++.dg/tls/thread_local14-aux.cc: New file. (cherry picked from commit 0a846340b99675d57fc2f2923a0412134eed09d3)
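A minimal single-file sketch of the access pattern, with hypothetical names. The actual testcases split the variable's definition into a separate -aux.cc file, so the access really does go through the TLS wrapper; in one translation unit the compiler may see through the initialization.

```cpp
#include <cassert>

inline int compute_seed() { return 42; }

// Dynamically initialized thread_local variable: each thread initializes
// its own copy lazily, on first access.
thread_local const int seed = compute_seed();

// Reads the TLS variable inside a loop. If a load of `seed` were hoisted
// above the code that triggers initialization, the loop could observe an
// uninitialized value, which is the hazard the message describes.
int sum_seed(int n) {
  int total = 0;
  for (int i = 0; i < n; ++i)
    total += seed;
  return total;
}
```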
2023-05-03tree-inline: Fix up multiversioning with vector arguments [PR105554]Jakub Jelinek4-11/+16
The following testcase ICEs, because we call tree_function_versioning from old_decl which has target attributes not supporting V4DImode and so DECL_MODE of DECL_ARGUMENTS is BLKmode, while new_decl supports those. tree_function_versioning initially copies DECL_RESULT and DECL_ARGUMENTS from old_decl to new_decl, then calls initialize_cfun to create cfun and only when the cfun is created it can later actually remap_decl DECL_RESULT and DECL_ARGUMENTS etc. The problem is that initialize_cfun -> push_struct_function -> allocate_struct_function calls relayout_decl on DECL_RESULT and DECL_ARGUMENTS, which clobbers DECL_MODE of old_decl and we then ICE because of it. In particular, allocate_struct_function does: if (!abstract_p) { /* Now that we have activated any function-specific attributes that might affect layout, particularly vector modes, relayout each of the parameters and the result. */ relayout_decl (result); for (tree parm = DECL_ARGUMENTS (fndecl); parm; parm = DECL_CHAIN (parm)) relayout_decl (parm); /* Similarly relayout the function decl. */ targetm.target_option.relayout_function (fndecl); } if (!abstract_p && aggregate_value_p (result, fndecl)) { #ifdef PCC_STATIC_STRUCT_RETURN cfun->returns_pcc_struct = 1; #endif cfun->returns_struct = 1; } Now, in the case of tree_function_versioning, I believe all that we need from these is possibly the targetm.target_option.relayout_function (fndecl); call (arm only), we will remap DECL_RESULT and DECL_ARGUMENTS later on and copy_decl_for_dup_finish in that case will handle all we need: /* For vector typed decls make sure to update DECL_MODE according to the new function context. */ if (VECTOR_TYPE_P (TREE_TYPE (copy))) SET_DECL_MODE (copy, TYPE_MODE (TREE_TYPE (copy))); We don't need the cfun->returns_*struct either, because we override it in initialize_cfun a few lines later: /* Copy items we preserve during cloning. */ ... 
cfun->returns_struct = src_cfun->returns_struct; cfun->returns_pcc_struct = src_cfun->returns_pcc_struct; So, to avoid the clobbering of DECL_RESULT/DECL_ARGUMENTS of old_decl, the following patch arranges allocate_struct_function to be called with abstract_p true and calls targetm.target_option.relayout_function (fndecl); by hand. The DECL_RESULT/DECL_ARGUMENTS copying at the start of initialize_cfun is removed because the only caller, tree_function_versioning, does that unconditionally before. 2023-03-17 Jakub Jelinek <jakub@redhat.com> PR target/105554 * function.h (push_struct_function): Add ABSTRACT_P argument defaulted to false. * function.c (push_struct_function): Add ABSTRACT_P argument, pass it to allocate_struct_function instead of false. * tree-inline.c (initialize_cfun): Don't copy DECL_ARGUMENTS nor DECL_RESULT here. Pass true as ABSTRACT_P to push_struct_function. Call targetm.target_option.relayout_function after it. (tree_function_versioning): Formatting fix. * gcc.target/i386/pr105554.c: New test. (cherry picked from commit 24c06560a7fa39049911eeb8777325d112e0deb9)
2023-05-03c, ubsan: Instrument even shortened divisions [PR109151]Jakub Jelinek2-2/+16
On the following testcase, the C FE decides to shorten the division because it has a guarantee that INT_MIN / -1 division won't be encountered, the first operand is widened from narrower unsigned and/or the second operand is a constant other than all ones (in this case both are true). The problem is that the narrower type in this case is _Bool and ubsan_instrument_division only instruments it if op0's type is INTEGER_TYPE or REAL_TYPE. Strangely this doesn't happen in C++ FE. Anyway, we only shorten divisions if the INT_MIN / -1 case is impossible, so I think we should be fine even with -fstrict-enums in C++ in case it shortened to ENUMERAL_TYPEs. The following patch just instruments those on the ubsan_instrument_division side. Perhaps only the first hunk and testcase might be needed because we shouldn't shorten if the other case could be triggered. 2023-03-17 Jakub Jelinek <jakub@redhat.com> PR c/109151 * c-ubsan.c (ubsan_instrument_division): Handle all scalar integral types rather than just INTEGER_TYPE. * c-c++-common/ubsan/div-by-zero-8.c: New test. (cherry picked from commit 103d423f6ce72ccb03d55b7b1dfa2dabd5854371)
2023-05-03openmp: Fix up handling of doacross loops with noreturn body in loops [PR108685]Jakub Jelinek2-4/+27
The following patch fixes an ICE with doacross loops which have a single entry no exit body, at least one of the ordered > collapse loops isn't guaranteed to have at least one iteration and the whole doacross loop is inside some other loop. The OpenMP constructs aren't represented by struct loop until the omp expansions, so for a normal doacross loop which doesn't have a noreturn body the entry_bb with the GOMP_FOR statement and the first bb of the body typically have the same loop_father, and if the doacross loop isn't inside of some other loop and the body is noreturn as well, both are part of loop 0. The problematic case is when the entry_bb is inside of some deeper loop, but the body, because it falls through into EXIT, has loop 0 as loop_father. l0_bb is created by splitting the entry_bb fallthru edge into l1_bb, and because the two basic blocks have different loop_father, a common loop is found for those (which is loop 0). Now, if the doacross loop has collapse == ordered or all the ordered > collapse loops are guaranteed to iterate at least once, all is still fine, because all enter the l1_bb (body), which doesn't return and so doesn't loop further either. But, if one of those loops could loop 0 times, the user written body wouldn't be reached at all, so unlike the expectations the whole construct actually wouldn't be noreturn if entry_bb is encountered and decides to handle at least one iteration. In this case, we need to fix up, move the l0_bb into the same loop as entry_bb (initially) and for the extra added loops put them as children of that same loop, rather than of loop 0. 2023-03-17 Jakub Jelinek <jakub@redhat.com> PR middle-end/108685 * omp-expand.c (expand_omp_for_ordered_loops): Add L0_BB argument, use its loop_father rather than BODY_BB's loop_father. (expand_omp_for_generic): Adjust expand_omp_for_ordered_loops caller. 
If broken_loop with ordered > collapse and at least one of those extra loops aren't guaranteed to have at least one iteration, change l0_bb's loop_father to entry_bb's loop_father. Set cont_bb's loop_father to l0_bb's loop_father rather than l1_bb's. * c-c++-common/gomp/doacross-8.c: New test. (cherry picked from commit 713fa5db8ceb4ba8783a0d690ceb4c07f2ff03d0)
2023-05-03c++: Treat unnamed bitfields as padding for ↵Jakub Jelinek2-2/+12
__has_unique_object_representations [PR109096] As reported in the PR, for __has_unique_object_representations we were treating unnamed bitfields as named ones, which is wrong, they are actually padding. The following patch fixes that. 2023-03-14 Jakub Jelinek <jakub@redhat.com> PR c++/109096 * tree.c (record_has_unique_obj_representations): Ignore unnamed bitfields. * g++.dg/cpp1z/has-unique-obj-representations3.C: New test. (cherry picked from commit c35cf160a0ed81570cff6600dba465cf95fa80fa)
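A small sketch of the distinction, using the C++17 library trait that wraps the builtin. The struct names are hypothetical, and the sketch assumes a 32-bit `unsigned` and a compiler that contains this fix.

```cpp
#include <cassert>
#include <type_traits>

// Every bit of the object belongs to a named member.
struct all_named {
  unsigned a : 16;
  unsigned b : 16;
};

// The upper 16 bits are an unnamed bitfield, i.e. padding bits whose
// value does not take part in the object's value.
struct with_unnamed {
  unsigned a : 16;
  unsigned : 16;
};

inline bool unique_all_named() {
  return std::has_unique_object_representations_v<all_named>;
}

inline bool unique_with_unnamed() {
  return std::has_unique_object_representations_v<with_unnamed>;
}
```

Treating the unnamed bitfield as a named one would make the second struct look padding-free, which is the behaviour the message calls wrong.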
2023-05-03c++: Don't clear TREE_READONLY for -fmerge-all-constants for non-aggregates ↵Jakub Jelinek2-2/+18
[PR107558] The following testcase ICEs, because OpenMP lowering for shared clause on l variable with REFERENCE_TYPE creates POINTER_TYPE to REFERENCE_TYPE. The reason is that the automatic variable has non-trivial construction (reference to a lambda) and -fmerge-all-constants is on and so TREE_READONLY isn't set - omp-low will handle automatic TREE_READONLY vars in shared specially and only copy to the construct and not back, while !TREE_READONLY are assumed to be changeable. The PR91529 change rationale was that the gimplification can change some non-addressable automatic variables to TREE_STATIC with -fmerge-all-constants and therefore TREE_READONLY on them is undesirable. But, the gimplifier does that only for aggregate variables: switch (TREE_CODE (type)) { case RECORD_TYPE: case UNION_TYPE: case QUAL_UNION_TYPE: case ARRAY_TYPE: and not for anything else. So, I think clearing TREE_READONLY for automatic integral or reference or pointer etc. vars for -fmerge-all-constants only is unnecessary. 2023-03-10 Jakub Jelinek <jakub@redhat.com> PR c++/107558 * decl.c (cp_finish_decl): Don't clear TREE_READONLY on automatic non-aggregate variables just because of -fmerge-all-constants. * g++.dg/gomp/pr107558.C: New test. (cherry picked from commit 60b6f5c0a334db3f8f6dffaf0b9aab42fd5c54a2)
2023-05-03c-family: Incremental fix for -Wsign-compare BIT_NOT_EXPR handling [PR107465]Jakub Jelinek2-28/+30
There can be too many extensions and seems I didn't get everything right in the previously posted patch. The following incremental patch ought to fix that. The code can deal with quite a few sign/zero extensions at various spots and it is important to deal with all of them right. On the argument that contains BIT_NOT_EXPR we have: MSB bits#4 bits#3 BIT_NOT_EXPR bits#2 bits#1 LSB where bits#1 is one or more bits (TYPE_PRECISION (TREE_TYPE (arg0)) at the end of the function) we don't know anything about, for the purposes of this warning it is VARYING that is inverted with BIT_NOT_EXPR to some other VARYING bits; bits#2 is one or more bits (TYPE_PRECISION (TREE_TYPE (op0)) - TYPE_PRECISION (TREE_TYPE (arg0)) at the end of the function) which are known to be 0 before the BIT_NOT_EXPR and 1 after it. bits#3 is zero or more bits from the TYPE_PRECISION (TREE_TYPE (op0)) at the end of the function to TYPE_PRECISION (TREE_TYPE (op0)) at the start of the function, which are either zero extension or sign extension. And bits#4 is zero or more bits from the TYPE_PRECISION (TREE_TYPE (op0)) at the start of the function to TYPE_PRECISION (result_type), which again can be zero or sign extension. Now, vanilla trunk as well as the previously posted patch mishandles the case where bits#3 are sign extended (as bits#2 are known to be all set, that means bits#3 are all set too) but bits#4 are zero extended and are thus all 0. The patch fixes it by tracking the lowest bit which is known to be clear above the known to be set bits (if any, otherwise it is precision of result_type). 2023-03-04 Jakub Jelinek <jakub@redhat.com> PR c/107465 * c-warn.c (warn_for_sign_compare): Don't warn for unset bits above innermost zero extension of BIT_NOT_EXPR result. * c-c++-common/Wsign-compare-2.c (f18): New test. (cherry picked from commit 3ec9a8728086ad86a2d421e067329f305f40e005)
2023-05-03c-family: Fix up -Wsign-compare BIT_NOT_EXPR handling [PR107465]Jakub Jelinek3-31/+184
The following patch fixes multiple bugs in warn_for_sign_compare related to the BIT_NOT_EXPR related warnings. My understanding is that what those 3 warnings are meant to warn (since 1995 apparently) is the case where we have BIT_NOT_EXPR of a zero-extended value, so in result_type the value is something like: 0b11111111XXXXXXXX (e.g. ~ of a 8->16 bit zero extension) 0b000000000000000011111111XXXXXXXX (e.g. ~ of a 8->16 bit zero extension then zero extended to 32 bits) 0b111111111111111111111111XXXXXXXX (e.g. ~ of a 8->16 bit zero extension then sign extended to 32 bits) and the intention of the warning is to warn when this is compared against something that has some 0 bits at the place where the above has guaranteed 1 bits, either ensured through comparison against constant where we know the bits exactly, or through zero extension from some narrower type where again we know at least some upper bits are zero extended. The bugs in the warning code are: 1) misunderstanding of the {,c_common_}get_narrower APIs - the unsignedp it sets is only meaningful if the function actually returns something narrower (in that case it says whether the narrower value is then sign (0) or zero (1) extended to the originally passed value). Though op0 or op1 at this point might be already narrower than result_type, and if the function doesn't return anything narrower, it all depends on whether the passed in op{0,1} had TYPE_UNSIGNED type or not 2) the code didn't check at all whether the BIT_NOT_EXPR operand was actually zero extended (i.e. that it was narrower and unsignedp was set to 1 for it), all it did is check that unsignedp from the call was 1. But that isn't a well defined thing, if the argument is returned as is, the function sets unsignedp to 0, but if there is e.g.
a useless cast to the same or compatible type in between, it can return 1 if the cast is unsigned; now, if BIT_NOT_EXPR operand is not zero extended, we know nothing at all about any bits in the operand containing BIT_NOT_EXPR, so there is nothing to warn about 3) the code was actually testing both operands after calling c_common_get_narrower on them and on the one with BIT_NOT_EXPR again for constants; I think that is just wrong in case the BIT_NOT_EXPR operand wouldn't be fully folded, the warning makes sense only if the other operand not having BIT_NOT_EXPR in it is constant 4) as can be seen from the above bit pattern examples, the upper bits above (in the patch arg0) aren't always all 1s, there could be some zero extension above it and from it one would have 0s, so that needs to be taken into account for the choice which constant bits to test for being always set otherwise warning is emitted, or for the zero extension guaranteed zero bits 5) the patch also simplifies the handling, we only do it if one but not both operands are BIT_NOT_EXPR after first {,c_common_}get_narrower, so we can just use std::swap to ensure it is the first one 6) the code compared bits against HOST_BITS_PER_LONG, which made sense back in 1995 when the values were stored into long, but now that they are HOST_WIDE_INT should test HOST_BITS_PER_WIDE_INT (or we could rewrite the stuff to wide_int, not done in the patch) 2023-03-04 Jakub Jelinek <jakub@redhat.com> PR c/107465 * c-warn.c (warn_for_sign_compare): If c_common_get_narrower doesn't return a narrower result, use TYPE_UNSIGNED to set unsignedp0 and unsignedp1. For the one BIT_NOT_EXPR case vs. one without, only check for constant in the non-BIT_NOT_EXPR operand, use std::swap to simplify the code, only warn if BIT_NOT_EXPR operand is extended from narrower unsigned, fix up computation of mask for the constant cases and for unsigned other operand case handle differently BIT_NOT_EXPR result being sign vs. zero extended. 
* c-c++-common/Wsign-compare-2.c: New test. * c-c++-common/pr107465.c: New test. (cherry picked from commit daaf74a714c41c8dbaf9954bcc58462c63062b4f)
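The three bit patterns listed in the message, and the kind of comparison the warning is aimed at, can be reproduced with a short sketch. The function names are hypothetical, and the narrowing cast to `std::int16_t` assumes the usual two's-complement wraparound.

```cpp
#include <cassert>
#include <cstdint>

// ~ of an 8->16 bit zero extension: 0b11111111XXXXXXXX.
inline std::uint16_t not_zext16(std::uint8_t x) {
  return static_cast<std::uint16_t>(~static_cast<std::uint16_t>(x));
}

// The same value zero extended to 32 bits: upper 16 bits are 0.
inline std::uint32_t not_zext16_zext32(std::uint8_t x) {
  return static_cast<std::uint32_t>(not_zext16(x));
}

// The same value sign extended to 32 bits: upper 16 bits are 1.
inline std::uint32_t not_zext16_sext32(std::uint8_t x) {
  return static_cast<std::uint32_t>(
      static_cast<std::int32_t>(static_cast<std::int16_t>(not_zext16(x))));
}

// Both operands are promoted to int. ~a then has all upper bits set while
// b has them clear, so the comparison can never be true.
inline bool complement_equals(std::uint8_t a, std::uint8_t b) {
  return ~a == b;
}
```

The last function is deliberately the suspicious form. GCC is expected to diagnose it under -Wsign-compare, which is the point of the example.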
2023-05-03diagnostics: Fix up selftests with $COLUMNS < 42 [PR108973]Jakub Jelinek1-0/+1
As mentioned in the PR, GCC's diagnostics self-tests fail if $COLUMNS < 42. Guarding each self-test with if (get_terminal_width () > 41) or similar would be a maintenance nightmare (PR has a patch to do so without reformatting to make it work for $COLUMNS in [30, 41] inclusive, but I'm afraid going down to $COLUMNS 1 would mean marking everything). Furthermore, the self-tests don't really emit stuff to the terminal, but into a buffer, so using get_terminal_width () for it seems inappropriate. The following patch makes sure test_diagnostic_context constructor uses exactly 80 columns wide caret max width, of course some tests override it already if they want to test for behavior in narrower cases. 2023-03-04 Jakub Jelinek <jakub@redhat.com> PR testsuite/108973 * selftest-diagnostic.c (test_diagnostic_context::test_diagnostic_context): Set caret_max_width to 80. (cherry picked from commit 739e7ebb3d378ece25d64b39baae47c584253498)
2023-05-03c++, debug: Fix up locus of DW_TAG_imported_module [PR108716]Jakub Jelinek2-0/+16
Before IMPORTED_DECL has been introduced in PR37410, we used to emit correct DW_AT_decl_line on DW_TAG_imported_module on the testcase below, after that change we haven't emitted it at all for a while and after some time started emitting incorrect locus, in particular the location of } closing the function. The problem is that while we have correct EXPR_LOCATION on the USING_STMT, when genericizing that USING_STMT into IMPORTED_DECL we don't copy the location to DECL_SOURCE_LOCATION, so it gets whatever input_location happens to be when it is created. 2023-03-02 Jakub Jelinek <jakub@redhat.com> PR debug/108716 * cp-gimplify.c (cp_genericize_r) <case USING_STMT>: Set DECL_SOURCE_LOCATION on IMPORTED_DECL to expression location of USING_STMT or input_location. * g++.dg/debug/dwarf2/pr108716.C: New test. (cherry picked from commit 4d82022bfd15d36717bf60a11e75e9ea02204269)
2023-05-03cgraphclones: Don't share DECL_ARGUMENTS between thunk and its artificial ↵Jakub Jelinek2-1/+48
thunk [PR108854] The following testcase ICEs on x86_64-linux with -m32. The problem is we create an artificial thunk and because of -fPIC, ia32 and thunk destination which doesn't bind locally can't use a mi thunk. The ICE is because during expansion to RTL we see SSA_NAME for a PARM_DECL, but the PARM_DECL doesn't have DECL_CONTEXT of the current function. This is because duplicate_thunk_for_node creates a new DECL_ARGUMENTS chain only if some arguments need modification. The following patch fixes it by copying the DECL_ARGUMENTS list even if the arguments can stay as is, to update DECL_CONTEXT on them. While for mi thunks it doesn't really matter because we don't use those arguments in any way, for other thunks it is important. 2023-02-23 Jakub Jelinek <jakub@redhat.com> PR middle-end/108854 * cgraphclones.c (duplicate_thunk_for_node): If no parameter changes are needed, copy at least DECL_ARGUMENTS PARM_DECL nodes and adjust their DECL_CONTEXT. * g++.dg/opt/pr108854.C: New test. (cherry picked from commit 2f1691be517fcdcabae9cd671ab511eb0e08b1d5)
2023-05-03i386: Fix up builtins used in avx512bf16vlintrin.h [PR108881]Jakub Jelinek2-18/+32
The builtins used in avx512bf16vlintrin.h implementation need both avx512bf16 and avx512vl ISAs, which the header ensures for them, but the builtins weren't actually requiring avx512vl, so when used by hand with just -mavx512bf16 -mno-avx512vl it resulted in ICEs. Fixed by adding OPTION_MASK_ISA_AVX512VL to their BDESC. 2023-02-24 Jakub Jelinek <jakub@redhat.com> PR target/108881 * config/i386/i386-builtin.def (__builtin_ia32_cvtne2ps2bf16_v16hi, __builtin_ia32_cvtne2ps2bf16_v16hi_mask, __builtin_ia32_cvtne2ps2bf16_v16hi_maskz, __builtin_ia32_cvtne2ps2bf16_v8hi, __builtin_ia32_cvtne2ps2bf16_v8hi_mask, __builtin_ia32_cvtne2ps2bf16_v8hi_maskz, __builtin_ia32_cvtneps2bf16_v8sf_mask, __builtin_ia32_cvtneps2bf16_v8sf_maskz, __builtin_ia32_cvtneps2bf16_v4sf_mask, __builtin_ia32_cvtneps2bf16_v4sf_maskz, __builtin_ia32_dpbf16ps_v8sf, __builtin_ia32_dpbf16ps_v8sf_mask, __builtin_ia32_dpbf16ps_v8sf_maskz, __builtin_ia32_dpbf16ps_v4sf, __builtin_ia32_dpbf16ps_v4sf_mask, __builtin_ia32_dpbf16ps_v4sf_maskz): Require also OPTION_MASK_ISA_AVX512VL. * gcc.target/i386/avx512bf16-pr108881.c: New test. (cherry picked from commit 0ccfa3884f638816af0f5a3f0ee2695e0771ef6d)
2023-05-03c++: Handle structured bindings like anon unions in initializers [PR108474]Jakub Jelinek3-2/+72
As reported by Andrew Pinski, structured bindings (with the exception of the ones using std::tuple_{size,element} and get which are really standalone variables in addition to the binding one) also use DECL_VALUE_EXPR and needs the same treatment in static initializers. On Sun, Jan 22, 2023 at 07:19:07PM -0500, Jason Merrill wrote: > Though, actually, why not instead fix expand_expr_real_1 (and staticp) to > look through DECL_VALUE_EXPR? Doing it when emitting the initializers seems to be too late to me, we in various spots try to put parts of the static var DECL_INITIAL expressions into the IL, or e.g. for varpool purposes remember which vars are referenced there. This patch moves it to record_reference, which is called from varpool_node::analyze and so about the same time as gimplification of the bodies which also replaces DECL_VALUE_EXPRs. 2023-01-24 Jakub Jelinek <jakub@redhat.com> PR c++/108474 * cp-gimplify.c (cp_fold_r): Handle structured bindings vars like anon union artificial vars. * g++.dg/cpp1z/decomp57.C: New test. * g++.dg/cpp1z/decomp58.C: New test. (cherry picked from commit b84e21115700523b4d0ac44275443f7b9c670344)
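A minimal sketch of the kind of code affected, with hypothetical names (the decomp57.C and decomp58.C testcases are not shown in this log): namespace-scope structured bindings whose names appear in the initializers of other static variables.

```cpp
#include <cassert>

struct point {
  int x;
  int y;
};

point origin = {1, 2};

// Namespace-scope structured binding: px and py name the members of a
// hidden copy of `origin`, which is where DECL_VALUE_EXPR comes in.
auto [px, py] = origin;

// Static initializers that refer to the bindings.
int *px_ptr = &px;
int &py_ref = py;
```

Reading through `px_ptr` and `py_ref` should observe the members of the hidden copy, not of `origin` itself.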