|
We used to pointlessly set the pointer of a zero length string
constant to point to a zero byte constant. Instead, just use nil.
Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/384354
|
|
By definition, a type marked notinheap doesn't contain any pointers
that the garbage collector cares about, and neither does a pointer to
such a type. Change the type descriptors to consistently treat such
types as not being pointers, by setting ptrdata to 0 and gcdata to nil.
Change-Id: Id8466555ec493456ff5ff09f1670551414619bd2
Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/384118
Trust: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
|
|
gcc/fortran/ChangeLog:
PR fortran/66193
* arith.cc (reduce_binary_ac): When reducing binary expressions,
try simplification. Handle case of empty constructor.
(reduce_binary_ca): Likewise.
gcc/testsuite/ChangeLog:
PR fortran/66193
* gfortran.dg/array_constructor_55.f90: New test.
|
|
The upcoming Go 1.18 release requires linking against -lrt on GNU/Linux
(only) in order to call timer_create and friends.
Also change gotools to link the runtime test against -lrt.
* gospec.cc (RTLIB, RT_LIBRARY): Define.
(lang_specific_driver): Add -lrt if linking statically on
GNU/Linux.
* configure.ac (RT_LIBS): Define.
* Makefile.am (check-runtime): Set GOLIBS to $(RT_LIBS).
* configure, Makefile.in: Regenerate.
|
|
This issue was observed as a deadlock in
29_atomics/atomic/wait_notify/100334.cc on vxworks. When a wait is
"laundered" (e.g. type T* does not suffice as a waitable address for the
platform's native waiting primitive), the address waited is that of the
_M_ver member of __waiter_pool_base, so several threads may wait on the
same address for unrelated atomic<T> objects. As noted in the PR, the
implementation correctly exits the wait for the thread whose data
changed, but not for any other threads waiting on the same address:
the __waiter::_M_do_wait_v member was correctly exiting, but the other
waiters were not reloading the value of _M_ver before re-entering the
wait.
Moving the spin call inside the loop accomplishes this, and is
consistent with the predicate accepting version of __waiter::_M_do_wait.
libstdc++-v3/ChangeLog:
PR libstdc++/104442
* include/bits/atomic_wait.h (__waiter::_M_do_wait_v): Move spin
loop inside do loop so that threads failing the wait reload
_M_ver.
|
|
gcc/testsuite/ChangeLog:
* gcc.dg/Wstringop-overflow-69.c: Add -Wno-psabi.
* gcc.dg/loop-unswitch-6.c: Omit -fcompare-debug on AIX.
|
|
Compile PR target/104441 tests with -march=x86-64 to fix test failures
when GCC is configured with --with-arch=native --with-cpu=native.
PR target/104441
* gcc.target/i386/pr104441-1a.c: Compile with -march=x86-64.
* gcc.target/i386/pr104441-1b.c: Likewise.
|
|
The following testcase ICEs, because when creating PAREN_EXPR for
__builtin_assoc_barrier the FE doesn't do the usual tweaks for
EXCESS_PRECISION_EXPR or C_MAYBE_CONST_EXPR. I believe that the
declared effect of the builtin is just association barrier, so
e.g. excess precision should be still handled like if it wasn't
there.
The following patch uses build_unary_op to handle those.
2022-02-09 Jakub Jelinek <jakub@redhat.com>
PR c/104427
* c-parser.cc (c_parser_postfix_expression)
<case RID_BUILTIN_ASSOC_BARRIER>: Use parser_build_unary_op
instead of build1_loc to build PAREN_EXPR.
* c-typeck.cc (build_unary_op): Handle PAREN_EXPR.
* c-fold.cc (c_fully_fold_internal): Likewise.
* gcc.dg/pr104427.c: New test.
|
|
2022-02-09 Uroš Bizjak <ubizjak@gmail.com>
gcc/ChangeLog:
PR target/104462
* common/config/i386/i386-common.cc (OPTION_MASK_ISA2_XSAVE_UNSET):
Also include OPTION_MASK_ISA2_AVX2_UNSET.
gcc/testsuite/ChangeLog:
PR target/104462
* gcc.target/i386/pr104462.c: New test.
|
|
Input operands can be in the form of:
(subreg:DI (reg:V2SF 96) 0)
which chokes lowpart_subreg. Force inputs to a register, which is
preferable even when the input operand is from memory.
2022-02-09 Uroš Bizjak <ubizjak@gmail.com>
gcc/ChangeLog:
PR target/104458
* config/i386/i386-expand.cc (ix86_split_idivmod):
Force operands[2] and operands[3] into a register.
gcc/testsuite/ChangeLog:
PR target/104458
* gcc.target/i386/pr104458.c: New test.
|
|
This isn't technically a regression, but it only impacts the v850 target and
fixes a long standing code correctness issue.
As outlined in slightly more detail in the PR, the v850 is using the pattern
name "fnmasf4" and "fnmssf4" to generate fnmaf.s and fnmsf.s instructions
respectively.
Unfortunately fnmasf4 is expected to produce (-a * b) + c and
fnmssf4 (-a * b) - c. Those v850 instructions actually negate the entire
result.
The fix is trivial. Use a different pattern name so that the combiner can
still generate those instructions, but prevent those instructions from being
used to implement GCC's notion of what fnmas and fnmss should be.
This fixes pr97040 as well as a handful of testsuite failures for the v3e5
multilib.
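As an illustration of the mismatch described above, the following sketch writes both sets of formulas as plain C++ functions. The function names are hypothetical and exist only for this example; they are not part of the v850 port, and the arithmetic here is unfused, so rounding can differ from a real fused instruction.

```cpp
// What GCC expects from a pattern named fnmasf4: (-a * b) + c.
float gcc_fnma(float a, float b, float c) { return (-a * b) + c; }

// What GCC expects from a pattern named fnmssf4: (-a * b) - c.
float gcc_fnms(float a, float b, float c) { return (-a * b) - c; }

// What fnmaf.s computes according to the description above:
// the entire multiply-add result is negated, -((a * b) + c).
float v850_fnmaf(float a, float b, float c) { return -((a * b) + c); }

// Likewise for fnmsf.s: -((a * b) - c).
float v850_fnmsf(float a, float b, float c) { return -((a * b) - c); }
```

With a = 2, b = 3 and c = 1, gcc_fnma gives -5 while v850_fnmaf gives -7, so using the instruction under the standard pattern name produces a wrong result.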
gcc/
PR target/97040
* config/v850/v850.md (*v850_fnmasf4): Renamed from fnmasf4.
(*v850_fnmssf4): Renamed from fnmssf4.
|
|
* godump.cc (go_force_record_alignment): Really name the alignment
field "_" (complete 2021-12-29 change).
* gcc.misc-tests/godump-1.c: Adjust for alignment field rename.
|
|
Due to a copy-and-paste error in the documentation, vec_replace_unaligned was
implemented with the same function prototypes as vec_replace_elt. It was
intended that vec_replace_unaligned always specify output vectors as having
type vector unsigned char, to emphasize that elements are potentially
misaligned by this built-in function. This patch corrects the
misimplementation.
2022-02-04 Bill Schmidt <wschmidt@linux.ibm.com>
gcc/
* config/rs6000/rs6000-builtins.def (VREPLACE_UN_UV2DI): Change
function prototype.
(VREPLACE_UN_UV4SI): Likewise.
(VREPLACE_UN_V2DF): Likewise.
(VREPLACE_UN_V2DI): Likewise.
(VREPLACE_UN_V4SF): Likewise.
(VREPLACE_UN_V4SI): Likewise.
* config/rs6000/rs6000-overload.def (VEC_REPLACE_UN): Change all
function prototypes.
* config/rs6000/vsx.md (vreplace_un_<mode>): Remove define_expand.
(vreplace_un_<mode>): New define_insn.
gcc/testsuite/
* gcc.target/powerpc/vec-replace-word-runnable.c: Handle expected
prototypes for each call to vec_replace_unaligned.
|
|
This patch extends the previous support for 16-byte vec_concat
so that it supports pairs of 4-byte elements. This too isn't
strictly a regression fix, since the 8-byte forms weren't affected
by the same problems as the 16-byte forms, but it leaves things in
a more consistent state.
gcc/
* config/aarch64/iterators.md (VDCSIF): New mode iterator.
(VDBL): Handle SF.
(single_wx, single_type, single_dtype, dblq): New mode attributes.
* config/aarch64/aarch64-simd.md (load_pair_lanes<mode>): Extend
from VDC to VDCSIF.
(store_pair_lanes<mode>): Likewise.
(*aarch64_combine_internal<mode>): Likewise.
(*aarch64_combine_internal_be<mode>): Likewise.
(*aarch64_combinez<mode>): Likewise.
(*aarch64_combinez_be<mode>): Likewise.
* config/aarch64/aarch64.cc (aarch64_classify_address): Handle
8-byte modes for ADDR_QUERY_LDP_STP_N.
(aarch64_print_operand): Likewise for %y.
gcc/testsuite/
* gcc.target/aarch64/vec-init-13.c: New test.
* gcc.target/aarch64/vec-init-14.c: Likewise.
* gcc.target/aarch64/vec-init-15.c: Likewise.
* gcc.target/aarch64/vec-init-16.c: Likewise.
* gcc.target/aarch64/vec-init-17.c: Likewise.
|
|
This patch is the second of two to remove the old
move_lo/hi_quad expanders and move_hi_quad insns.
gcc/
* config/aarch64/aarch64-simd.md (@aarch64_split_simd_mov<mode>):
Use aarch64_combine instead of move_lo/hi_quad. Tabify.
(move_lo_quad_<mode>, aarch64_simd_move_hi_quad_<mode>): Delete.
(aarch64_simd_move_hi_quad_be_<mode>, move_hi_quad_<mode>): Delete.
(vec_pack_trunc_<mode>): Take general_operand elements and use
aarch64_combine rather than move_lo/hi_quad to combine them.
(vec_pack_trunc_df): Likewise.
|
|
After previous patches, we have a (mostly new) group of vec_concat
patterns as well as vestiges of the old move_lo/hi_quad patterns.
(A previous patch removed the move_lo_quad insns, but we still
have the move_hi_quad insns and both sets of expanders.)
This patch is the first of two to remove the old move_lo/hi_quad
stuff. It isn't technically a regression fix, but it seemed
better to make the changes now rather than leave things in
a half-finished and inconsistent state.
This patch defines an aarch64_vec_concat expander that coerces the
element operands into a valid form, including the ones added by the
previous patch. This in turn lets us get rid of one move_lo/hi_quad
pair.
As a side-effect, it also means that vcombines of 2 vectors make
better use of the available forms, like vec_inits of 2 scalars
already do.
gcc/
* config/aarch64/aarch64-protos.h (aarch64_split_simd_combine):
Delete.
* config/aarch64/aarch64-simd.md (@aarch64_combinez<mode>): Rename
to...
(*aarch64_combinez<mode>): ...this.
(@aarch64_combinez_be<mode>): Rename to...
(*aarch64_combinez_be<mode>): ...this.
(@aarch64_vec_concat<mode>): New expander.
(aarch64_combine<mode>): Use it.
(@aarch64_simd_combine<mode>): Delete.
* config/aarch64/aarch64.cc (aarch64_split_simd_combine): Delete.
(aarch64_expand_vector_init): Use aarch64_vec_concat.
gcc/testsuite/
* gcc.target/aarch64/vec-init-12.c: New test.
|
|
vec_combine is really one instruction on aarch64, provided that
the lowpart element is in the same register as the destination
vector. This patch adds patterns for that.
The patch fixes a regression from GCC 8. Before the patch:
int64x2_t s64q_1(int64_t a0, int64_t a1) {
if (__BYTE_ORDER__ == __ORDER_BIG_ENDIAN__)
return (int64x2_t) { a1, a0 };
else
return (int64x2_t) { a0, a1 };
}
generated:
fmov d0, x0
ins v0.d[1], x1
ins v0.d[1], x1
ret
whereas GCC 8 generated the more respectable:
dup v0.2d, x0
ins v0.d[1], x1
ret
gcc/
* config/aarch64/predicates.md (aarch64_reg_or_mem_pair_operand):
New predicate.
* config/aarch64/aarch64-simd.md (*aarch64_combine_internal<mode>)
(*aarch64_combine_internal_be<mode>): New patterns.
gcc/testsuite/
* gcc.target/aarch64/vec-init-9.c: New test.
* gcc.target/aarch64/vec-init-10.c: Likewise.
* gcc.target/aarch64/vec-init-11.c: Likewise.
|
|
move_lo_quad_internal_<mode> and move_lo_quad_internal_be_<mode>
partially duplicate the later aarch64_combinez{,_be}<mode> patterns.
The duplication itself is a regression.
The only substantive differences between the two are:
* combinez uses vector MOV (ORR) instead of element MOV (DUP).
The former seems more likely to be handled via renaming.
* combinez disparages the GPR->FPR alternative whereas move_lo_quad
gave it equal cost. The new test gives a token example of when
the combinez behaviour helps.
gcc/
* config/aarch64/aarch64-simd.md (move_lo_quad_internal_<mode>)
(move_lo_quad_internal_be_<mode>): Delete.
(move_lo_quad_<mode>): Use aarch64_combine<Vhalf> instead of the above.
gcc/testsuite/
* gcc.target/aarch64/vec-init-8.c: New test.
|
|
This patch generalises the load_pair_lanes<mode> guard so that
it uses aarch64_check_consecutive_mems to check for consecutive
mems. It also allows the pattern to be used for STRICT_ALIGNMENT
targets if the alignment is high enough.
The main aim is to avoid an inline test, for the sake of a later patch
that needs to repeat it. Reusing aarch64_check_consecutive_mems seemed
simpler than writing an entirely new function.
gcc/
* config/aarch64/aarch64-protos.h (aarch64_mergeable_load_pair_p):
Declare.
* config/aarch64/aarch64-simd.md (load_pair_lanes<mode>): Use
aarch64_mergeable_load_pair_p instead of inline check.
* config/aarch64/aarch64.cc (aarch64_expand_vector_init): Likewise.
(aarch64_check_consecutive_mems): Allow the reversed parameter
to be null.
(aarch64_mergeable_load_pair_p): New function.
|
|
The aarch64_simd_vec_set<mode> define_insn takes memory operands,
so this patch makes the vec_set<mode> optab expander do the same.
gcc/
* config/aarch64/aarch64-simd.md (vec_set<mode>): Allow the
element to be an aarch64_simd_nonimmediate_operand.
|
|
This patch fixes some cases in which *general_operand was used over
*nonimmediate_operand by patterns that don't accept immediates.
This avoids some complication with later patches.
gcc/
* config/aarch64/aarch64-simd.md (aarch64_simd_vec_set<mode>): Use
aarch64_simd_nonimmediate_operand instead of
aarch64_simd_general_operand.
(@aarch64_combinez<mode>): Use nonimmediate_operand instead of
general_operand.
(@aarch64_combinez_be<mode>): Likewise.
|
|
In filter_memfn_lookup, we weren't correctly recognizing and matching up
member functions introduced via a non-dependent using-decl. This caused
us to crash in the below testcases in which we correctly pruned the
overload set for the non-dependent call ahead of time, but then at
instantiation time filter_memfn_lookup failed to match the selected
function (introduced in each case by a non-dependent using-decl) to the
corresponding function from the new lookup set. Such member functions
need special handling in filter_memfn_lookup because they look exactly
the same in the old and new lookup sets, whereas ordinary member
functions that're defined in the (dependent) current class become more
specialized in the new lookup set.
This patch reworks the matching logic in filter_memfn_lookup so that it
handles (member functions introduced by) non-dependent using-decls
correctly, and is hopefully simpler overall.
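A hypothetical sketch of the kind of code involved: a class template whose member function calls a function brought into scope from a non-dependent base by a using-declaration, alongside an ordinary member that depends on the template parameter. All names are invented for this example, and it is not one of the new testcases.

```cpp
struct Base {
  int f() { return 1; }
  int f(int x) { return x + 1; }
};

template <typename T>
struct Derived : Base {
  using Base::f;              // non-dependent using-declaration
  int f(T, T) { return 3; }   // ordinary member of the dependent class

  // The arguments of this call do not depend on T, and overload
  // resolution picks Base::f(), which was introduced by the
  // using-declaration above.
  int call() { return f(); }
};
```

At instantiation time the function chosen for call() has to be matched against the fresh lookup set for f in Derived<int>, which is the step the commit message describes.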
PR c++/104432
gcc/cp/ChangeLog:
* call.cc (build_new_method_call): When a non-dependent call
resolves to a specialization of a member template, always build
the pruned overload set using the member template, not the
specialization.
* pt.cc (filter_memfn_lookup): New parameter newtype. Simplify
and correct how members from the new lookup set are matched to
those from the old one.
(tsubst_baselink): Pass binfo_type as newtype to
filter_memfn_lookup.
gcc/testsuite/ChangeLog:
* g++.dg/template/non-dependent19.C: New test.
* g++.dg/template/non-dependent19a.C: New test.
* g++.dg/template/non-dependent20.C: New test.
|
|
We weren't streaming a C++20 dependent explicit-specifier.
PR c++/103752
gcc/cp/ChangeLog:
* module.cc (trees_out::core_vals): Stream explicit specifier.
(trees_in::core_vals): Likewise.
* pt.cc (store_explicit_specifier): No longer static.
(tsubst_function_decl): Clear DECL_HAS_DEPENDENT_EXPLICIT_SPEC_P.
* cp-tree.h (lookup_explicit_specifier): Declare.
gcc/testsuite/ChangeLog:
* g++.dg/modules/explicit-bool-1_b.C: New test.
* g++.dg/modules/explicit-bool-1_a.H: New test.
|
|
The following adjusts the earlier change to still allow an
uncritical replacement.
2022-02-09 Richard Biener <rguenther@suse.de>
PR middle-end/104464
* gimple-isel.cc (gimple_expand_vec_cond_expr): Postpone
throwing check to after unproblematic replacement.
* gcc.dg/pr104464.c: New testcase.
|
|
The C++ committee just updated the values of these macros to reflect some
late C++20 papers that we implement but others don't yet; see PR103891.
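A small sketch for inspecting what a given compiler and -std= setting report for the two macros named in the ChangeLog below. The variable names are made up for this example, and no particular values are assumed; a macro that is not defined is reported as 0.

```cpp
// Value of __cpp_constexpr seen by this translation unit, or 0 if the
// compiler does not define the macro in the selected language mode.
#ifdef __cpp_constexpr
const long constexpr_macro_value = __cpp_constexpr;
#else
const long constexpr_macro_value = 0;
#endif

// Same for __cpp_concepts, which is typically only defined when
// concepts are enabled.
#ifdef __cpp_concepts
const long concepts_macro_value = __cpp_concepts;
#else
const long concepts_macro_value = 0;
#endif
```

Printing both constants once with -std=c++20 on compilers from before and after this change is one way to observe the update.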
gcc/c-family/ChangeLog:
* c-cppbuiltin.cc (c_cpp_builtins): Update values
of __cpp_constexpr and __cpp_concepts for C++20.
gcc/testsuite/ChangeLog:
* g++.dg/cpp23/feat-cxx2b.C: Adjust.
* g++.dg/cpp2a/feat-cxx2a.C: Adjust.
|
|
This patch resolves PR tree-optimization/104420, which is a P1 regression
where, as observed by Jakub Jelinek, the conditions for constant folding
x*0.0 are incorrect (following my patch for PR tree-optimization/96392).
The multiplication x*0.0 may yield a negative zero result, -0.0, if X is
negative (not just if x may be negative zero). Hence (without -ffast-math)
(int)x*0.0 can't be optimized to 0.0, but (unsigned)x*0.0 can be constant
folded. This adds a bunch of test cases to confirm the desired behaviour,
and removes an incorrect test from gcc.dg/pr96392.c which checked for the
wrong behaviour.
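The sign behaviour described above can be checked with a short sketch. It assumes IEEE 754 doubles and a build without -ffast-math; the function names are hypothetical, and the volatile zero is only there to keep the multiplication from being evaluated at compile time.

```cpp
#include <cmath>

// True when (double)x * 0.0 is a negative zero for a signed x.
bool signed_times_zero_is_negative_zero(int x) {
  volatile double zero = 0.0;
  double r = static_cast<double>(x) * zero;
  return r == 0.0 && std::signbit(r);
}

// Same check for an unsigned x, which can never be negative.
bool unsigned_times_zero_is_negative_zero(unsigned x) {
  volatile double zero = 0.0;
  double r = static_cast<double>(x) * zero;
  return r == 0.0 && std::signbit(r);
}
```

A negative int input yields -0.0, so the product cannot be replaced by the constant 0.0 when signed zeros are honoured, whereas the unsigned variant always yields +0.0.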
2022-02-09 Roger Sayle <roger@nextmovesoftware.com>
gcc/ChangeLog
PR tree-optimization/104420
* match.pd (mult @0 real_zerop): Tweak conditions for constant
folding X*0.0 (or X*-0.0) to HONOR_SIGNED_ZEROS when appropriate.
gcc/testsuite/ChangeLog
PR tree-optimization/104420
* gcc.dg/pr104420-1.c: New test case.
* gcc.dg/pr104420-2.c: New test case.
* gcc.dg/pr104420-3.c: New test case.
* gcc.dg/pr104420-4.c: New test case.
* gcc.dg/pr96392.c: Remove incorrect test.
|
|
As mentioned in the PR, since PR96690 r11-2834 we call rtl_for_decl_init
which can call expand_expr already during early_dwarf. The comment and PR
explain that the intent is to ensure the referenced vars and functions
are properly mangled because free_lang_data doesn't cover everything, like
template parameters etc. It doesn't work well though, because expand_expr
can set DECL_RTLs e.g. on referenced vars and keep them there, and they can
be created e.g. with different MEM_ALIGN compared to what they would be
created with if they were emitted later.
So, the following patch stops calling rtl_for_decl_init and instead
for cases for which rtl_for_decl_init does anything at all walks the
initializer and ensures referenced vars or functions are mangled.
2022-02-09 Jakub Jelinek <jakub@redhat.com>
PR debug/104407
* dwarf2out.cc (mangle_referenced_decls): New function.
(tree_add_const_value_attribute): Don't call rtl_for_decl_init if
early_dwarf. Instead walk the initializer and try to mangle vars or
functions referenced from it.
* g++.dg/debug/dwarf2/pr104407.C: New test.
|
|
This patch adjusts uses of nonnull to accurately reflect "somewhere in block".
It also adds the ability to register statement side effects within a block
for ranger which will apply for the rest of the block.
PR tree-optimization/104288
gcc/
* gimple-range-cache.cc (non_null_ref::set_nonnull): New.
(non_null_ref::adjust_range): Move to header.
(ranger_cache::range_of_def): Don't check non-null.
(ranger_cache::entry_range): Don't check non-null.
(ranger_cache::range_on_edge): Check for nonnull on normal edges.
(ranger_cache::update_to_nonnull): New.
(non_null_loadstore): New.
(ranger_cache::block_apply_nonnull): New.
* gimple-range-cache.h (class non_null_ref): Update prototypes.
(non_null_ref::adjust_range): Move to here and inline.
(class ranger_cache): Update prototypes.
* gimple-range-path.cc (path_range_query::range_defined_in_block): Do
not search dominators.
(path_range_query::adjust_for_non_null_uses): Ditto.
* gimple-range.cc (gimple_ranger::range_of_expr): Check on-entry for
def overrides. Do not check nonnull.
(gimple_ranger::range_on_entry): Check dominators for nonnull.
(gimple_ranger::range_on_edge): Check for nonnull on normal edges.
(gimple_ranger::register_side_effects): New.
* gimple-range.h (gimple_ranger::register_side_effects): New.
* tree-vrp.cc (rvrp_folder::fold_stmt): Call register_side_effects.
gcc/testsuite/
* gcc.dg/pr104288.c: New.
|
|
This adds a missing check to epilogue reduction re-use, namely
that we can do hi/lo extracts from the vector when demoting it
to the epilogue vector size.
I've chosen to add a can_vec_extract helper to optabs-query.h;
in the future we might want to simplify the vectorizer's life by
handling vector-from-vector extraction via BIT_FIELD_REFs during
RTL expansion via the mode punning when the vec_extract is not
directly supported.
I'm not 100% sure we can always do the punning of the
vec_extract result to a vector mode of the same size, but then
I'm also not sure how to check for that (the vectorizer doesn't
check in the other places where it does that at the moment, but I
suppose we eventually just go through memory there)?
2022-02-09 Richard Biener <rguenther@suse.de>
PR tree-optimization/104445
PR tree-optimization/102832
* optabs-query.h (can_vec_extract): New.
* optabs-query.cc (can_vec_extract): Likewise.
* tree-vect-loop.cc (vect_find_reusable_accumulator): Check
we can extract a hi/lo part from the larger vector, rework
check iteration from larger to smaller sizes.
* gcc.dg/vect/pr104445.c: New testcase.
|
|
Add -m[no-]direct-extern-access and nodirect_extern_access attribute.
-mdirect-extern-access is the default. GOT is always used to access
undefined data and function symbols with the nodirect_extern_access
attribute, including in PIE and non-PIE. With -mno-direct-extern-access:
1. Always use GOT to access undefined data and function symbols,
including in PIE and non-PIE. These will avoid copy relocations
in executables. This is compatible with existing executables and
shared libraries.
2. In executable and shared library, bind symbols with the STV_PROTECTED
visibility locally:
a. The address of data symbol is the address of data body.
b. For systems without function descriptor, the function pointer is
the address of function body.
c. The resulting shared libraries may be incompatible with
executables which have copy relocations on protected symbols or
use executable PLT entries as function addresses for protected
functions in shared libraries.
3. Update asm_preferred_eh_data_format to select PC relative EH encoding
format with -mno-direct-extern-access to avoid copy relocation.
4. Add ix86_reloc_rw_mask for TARGET_ASM_RELOC_RW_MASK to avoid copy
relocation with -mno-direct-extern-access.
gcc/
PR target/35513
PR target/100593
* config/i386/gnu-property.cc: Include "i386-protos.h".
(file_end_indicate_exec_stack_and_gnu_property): Generate
a GNU_PROPERTY_1_NEEDED note for -mno-direct-extern-access or
nodirect_extern_access attribute.
* config/i386/i386-options.cc
(handle_nodirect_extern_access_attribute): New function.
(ix86_attribute_table): Add nodirect_extern_access attribute.
* config/i386/i386-protos.h (ix86_force_load_from_GOT_p): Add a
bool argument.
(ix86_has_no_direct_extern_access): New.
* config/i386/i386.cc (ix86_has_no_direct_extern_access): New.
(ix86_force_load_from_GOT_p): Add a bool argument to indicate
call operand. Force non-call load from GOT for
-mno-direct-extern-access or nodirect_extern_access attribute.
(legitimate_pic_address_disp_p): Avoid copy relocation in PIE
for -mno-direct-extern-access or nodirect_extern_access attribute.
(ix86_print_operand): Pass true to ix86_force_load_from_GOT_p
for call operand.
(asm_preferred_eh_data_format): Use PC-relative format for
-mno-direct-extern-access to avoid copy relocation. Check
ptr_mode instead of TARGET_64BIT when selecting DW_EH_PE_sdata4.
(ix86_binds_local_p): Set ix86_has_no_direct_extern_access to
true for -mno-direct-extern-access or nodirect_extern_access
attribute. Don't treat protected data as extern and avoid copy
relocation on common symbol with -mno-direct-extern-access or
nodirect_extern_access attribute.
(ix86_reloc_rw_mask): New to avoid copy relocation for
-mno-direct-extern-access.
(TARGET_ASM_RELOC_RW_MASK): New.
* config/i386/i386.opt: Add -mdirect-extern-access.
* doc/extend.texi: Document nodirect_extern_access attribute.
* doc/invoke.texi: Document -m[no-]direct-extern-access.
gcc/testsuite/
PR target/35513
PR target/100593
* g++.target/i386/pr35513-1.C: New file.
* g++.target/i386/pr35513-2.C: Likewise.
* gcc.target/i386/pr35513-1a.c: Likewise.
* gcc.target/i386/pr35513-1b.c: Likewise.
* gcc.target/i386/pr35513-2a.c: Likewise.
* gcc.target/i386/pr35513-2b.c: Likewise.
* gcc.target/i386/pr35513-3a.c: Likewise.
* gcc.target/i386/pr35513-3b.c: Likewise.
* gcc.target/i386/pr35513-4a.c: Likewise.
* gcc.target/i386/pr35513-4b.c: Likewise.
* gcc.target/i386/pr35513-5a.c: Likewise.
* gcc.target/i386/pr35513-5b.c: Likewise.
* gcc.target/i386/pr35513-6a.c: Likewise.
* gcc.target/i386/pr35513-6b.c: Likewise.
* gcc.target/i386/pr35513-7a.c: Likewise.
* gcc.target/i386/pr35513-7b.c: Likewise.
* gcc.target/i386/pr35513-8.c: Likewise.
* gcc.target/i386/pr35513-9a.c: Likewise.
* gcc.target/i386/pr35513-9b.c: Likewise.
* gcc.target/i386/pr35513-10a.c: Likewise.
* gcc.target/i386/pr35513-10b.c: Likewise.
* gcc.target/i386/pr35513-11a.c: Likewise.
* gcc.target/i386/pr35513-11b.c: Likewise.
* gcc.target/i386/pr35513-12a.c: Likewise.
* gcc.target/i386/pr35513-12b.c: Likewise.
|
|
commit 9775e465c1fbfc32656de77c618c61acf5bd905d
Author: H.J. Lu <hjl.tools@gmail.com>
Date: Tue Jul 27 07:46:04 2021 -0700
x86: Don't set AVX_U128_DIRTY when zeroing YMM/ZMM register
called ix86_check_avx_upper_register to check mode on source operand.
But ix86_check_avx_upper_register doesn't work on source operand like
(vec_select:V2DI (reg/v:V4DI 23 xmm3 [orig:91 ymm ] [91])
(parallel [
(const_int 2 [0x2])
(const_int 3 [0x3])
]))
Add ix86_avx_u128_mode_source to check mode for each component of source
operand.
gcc/
PR target/104441
* config/i386/i386.cc (ix86_avx_u128_mode_source): New function.
(ix86_avx_u128_mode_needed): Return AVX_U128_ANY for debug INSN.
Call ix86_avx_u128_mode_source to check mode for each component
of source operand.
gcc/testsuite/
PR target/104441
* gcc.target/i386/pr104441-1a.c: New test.
* gcc.target/i386/pr104441-1b.c: Likewise.
|
|
ashlv16qi3.
ix86_expand_vector_init expects vals to be a parallel containing
values of individual fields which should be either element mode of the
vector mode, or a vector mode with the same element mode and smaller
number of elements.
But in the expander ashlv16qi3, the second operand is SImode which
can't be directly passed to gen_vec_initv16qiqi.
gcc/ChangeLog:
PR target/104451
* config/i386/sse.md (<insn><mode>3): lowpart_subreg
operands[2] from SImode to QImode.
gcc/testsuite/ChangeLog:
PR target/104451
* gcc.target/i386/pr104451.c: New test.
|
|
The following avoids merging a vector compare with EH with a
VEC_COND_EXPR. We should be able to do fallback expansion, and if
we really want the optimization we need quite some shuffling
to arrange for the proper EH redirection in all cases, which is IMHO
not worth it.
2022-02-09 Richard Biener <rguenther@suse.de>
PR middle-end/104450
* gimple-isel.cc: Pass cfun around.
(gimple_expand_vec_cond_expr): Do not combine a throwing
comparison with the select.
* g++.dg/torture/pr104450.C: New testcase.
|
|
This guards shift builtin folding to do nothing when there is
no LHS, similar to what other foldings do.
2022-02-09 Richard Biener <rguenther@suse.de>
PR target/104453
* config/i386/i386.cc (ix86_gimple_fold_builtin): Guard shift
folding for NULL LHS.
* gcc.target/i386/pr104453.c: New testcase.
|
|
The Go 1.18 library introduces specific types in runtime/internal/atomic.
Recognize and optimize the methods on those types, as we do with the
functions in runtime/internal/atomic.
While we're here avoid getting confused by methods in any other
package that we recognize specially.
* go-gcc.cc (Gcc_backend::Gcc_backend): Define builtins
__atomic_load_1 and __atomic_store_1.
Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/383654
|
|
The Go 1.18 standard library uses an internal/abi package with two
functions that are implemented in the compiler. This patch implements
them in the gofrontend, to support the upcoming update to 1.18.
Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/383514
|
|
In recent versions of glibc fopen has __attribute__((malloc)).
Since we cannot detect whether this attribute is present or not,
we avoid including stdio.h and instead forward declare what we
need in each test.
Signed-off-by: Joel Teichroeb <joel@teichroeb.net>
gcc/testsuite/ChangeLog:
PR analyzer/101081
* gcc.dg/analyzer/analyzer-verbosity-2a.c: Replace #include of
stdio.h with declarations needed by the test.
* gcc.dg/analyzer/analyzer-verbosity-3a.c: Likewise.
* gcc.dg/analyzer/edges-1.c: Likewise.
* gcc.dg/analyzer/file-1.c: Likewise.
* gcc.dg/analyzer/file-2.c: Likewise.
* gcc.dg/analyzer/file-paths-1.c: Likewise.
* gcc.dg/analyzer/file-pr58237.c: Likewise.
* gcc.dg/analyzer/pr99716-1.c: Likewise.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
|
|
gcc/analyzer/ChangeLog:
PR analyzer/104452
* region-model.cc (selftest::test_bit_range_regions): New.
(selftest::analyzer_region_model_cc_tests): Call it.
* region.h (bit_range_region::key_t::hash): Fix hashing of m_bits
to avoid using uninitialized data.
gcc/testsuite/ChangeLog:
PR analyzer/104452
* gcc.dg/analyzer/pr104452.c: New test.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
|
|
This is a case missed by my recent fixes to aggregate initialization and
exception cleanup for PR94041 et al: we also need to clean up members with
constant initialization if initialization of a later member throws.
It also occurs to me that we needn't bother building the cleanups if
-fno-exceptions; build_vec_init already doesn't.
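A minimal sketch of the situation described above, with hypothetical names: an aggregate whose first member has a constant initializer and a non-trivial destructor, and whose second member is initialized by a call that throws. It is not one of the new tests.

```cpp
int destroyed = 0;  // counts destructor calls for Tracked objects

struct Tracked {
  int value;
  ~Tracked() { ++destroyed; }
};

int throwing_initializer() { throw 42; }

struct Pair {
  Tracked first;  // constant initialization
  int second;     // initialization throws
};

// Returns true if the exception from the second initializer was caught.
bool init_throws() {
  try {
    Pair p = { {1}, throwing_initializer() };
    (void)p;
  } catch (int) {
    return true;
  }
  return false;
}
```

Since first is fully constructed before the exception is thrown, its destructor should run during unwinding and destroyed should end up as 1; a compiler without this fix may leave it at 0.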
PR c++/96876
gcc/cp/ChangeLog:
* typeck2.cc (split_nonconstant_init_1): Push cleanups for
preceding members with constant initialization.
(maybe_push_temp_cleanup): Do nothing if -fno-exceptions.
gcc/testsuite/ChangeLog:
* g++.dg/cpp1z/aggr-base11.C: New test.
* g++.dg/eh/aggregate2.C: New test.
|
|
This replaces the _Dir constructor that takes ownership of an existing
DIR* resource with one that takes a _Dir_base rvalue instead. This means
a raw DIR* is never passed around, but is always owned by a _Dir_base
object.
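The ownership idea can be sketched with stand-in types. HandleBase and Dir below are hypothetical simplifications, not the libstdc++ classes: the real _Dir_base owns a DIR*, which is modelled here by a flag and a counter of live resources.

```cpp
#include <string>
#include <utility>

int open_handles = 0;  // number of live (hypothetical) resources

// Stand-in for _Dir_base: the sole owner of one resource.
class HandleBase {
 public:
  HandleBase() : live_(true) { ++open_handles; }
  HandleBase(HandleBase&& other) noexcept : live_(other.live_) {
    other.live_ = false;  // the moved-from object no longer owns anything
  }
  HandleBase(const HandleBase&) = delete;
  HandleBase& operator=(const HandleBase&) = delete;
  ~HandleBase() { if (live_) --open_handles; }

 private:
  bool live_;
};

// Stand-in for _Dir: takes ownership from an rvalue owner object
// instead of accepting a raw handle.
class Dir : HandleBase {
 public:
  Dir(HandleBase&& base, std::string path)
      : HandleBase(std::move(base)), path_(std::move(path)) {}
  const std::string& path() const { return path_; }

 private:
  std::string path_;
};
```

Because the constructor only accepts an owner object, there is no point at which the resource exists without something responsible for releasing it.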
libstdc++-v3/ChangeLog:
* src/c++17/fs_dir.cc (_Dir(DIR*, const path&)): Change first
parameter to _Dir_base&&.
* src/filesystem/dir-common.h (_Dir_base(DIR*)): Remove.
* src/filesystem/dir.cc (_Dir(DIR*, const path&)): Change first
parameter to _Dir_base&&.
|
|
This is a bugfix for r12-6747-gaa8cfe785953a0 which caused an ICE
on or1k (PR104153) and broke SPARC bootstrap (PR104198).
cond_exec_get_condition () returns the jump condition directly and we
now pass it to the backend. The or1k backend modified the condition
in-place (other backends do that as well) but this modification is not
reverted when the sequence in question is discarded. Therefore we copy
the RTX instead of using it directly.
The SPARC problem is due to the SPARC backend recreating the initial
condition when being passed a CC comparison. This causes the sequence
to read from an already overwritten condition operand. Generally, this
could also happen on other targets. The workaround is to always first
emit to a temporary. In a second run of noce_convert_multiple_sets_1
we know which sequences actually require the comparison and will use no
temporaries if all sequences after the current one do not require it.
PR rtl-optimization/104198
PR rtl-optimization/104153
gcc/ChangeLog:
* ifcvt.cc (noce_convert_multiple_sets_1): Copy rtx instead of
using it directly. Rework comparison handling and always
perform a second pass.
gcc/testsuite/ChangeLog:
* gcc.dg/pr104198.c: New test.
|
|
The following patch suppresses extraneous -Wshadow warnings.
On the testcase without the patch we emit 14 -Wshadow warnings,
with the patch just 4. It is enough to warn once e.g. during parsing of the
template or the abstract ctor, while previously we'd warn also on the clones
of the ctors and on instantiation.
In GCC 8 and earlier we didn't warn because check_local_shadow did
/* Inline decls shadow nothing. */
if (DECL_FROM_INLINE (decl))
return;
2022-02-08 Jakub Jelinek <jakub@redhat.com>
PR c++/104379
* name-lookup.cc (check_local_shadow): When diagnosing shadowing
of a member or global declaration, add warning suppression for
the decl and don't warn again on it.
* g++.dg/warn/Wshadow-18.C: New test.
|
|
I've added the assert because start_decl diagnoses such vars for C++20 and
earlier:
if (current_function_decl && VAR_P (decl)
&& DECL_DECLARED_CONSTEXPR_P (current_function_decl)
&& cxx_dialect < cxx23)
but as can be seen, we can trigger the assert in older standards e.g. during
non-manifestly constant evaluation. Rather than refining the assert that
DECL_EXPRs for such vars don't appear for C++20 and older if they are inside
of functions declared constexpr this patch just removes the assert, the
code rejects encountering those vars in constant expressions anyway.
2022-02-08 Jakub Jelinek <jakub@redhat.com>
PR c++/104403
* constexpr.cc (cxx_eval_constant_expression): Don't assert DECL_EXPRs
of TREE_STATIC vars may only appear in -std=c++23.
* g++.dg/cpp0x/lambda/lambda-104403.C: New test.
|
|
The following testcase ICEs, because
(const_vector:V4SI [
(const_int 0 [0]) repeated x3
(const_int -2147483648 [0xffffffff80000000])
])
is recognized as valid easy_vector_constant in between split1 pass and
end of RA.
The problem is that such constants need to be split, and the only
splitter for that is:
(define_split
[(set (match_operand:VM 0 "altivec_register_operand")
(match_operand:VM 1 "easy_vector_constant_vsldoi"))]
"VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode) && can_create_pseudo_p ()"
There is only a single splitting pass before RA, so after that finishes,
if something gets matched in between that and end of RA (after that
can_create_pseudo_p () would be no longer true), it will never be
successfully split and we ICE at final.cc time or earlier.
The i386 backend (and a few others) already use
(cfun->curr_properties & PROP_rtl_split_insns)
as a test for split1 pass finished, so that some insns that should be split
during split1 and shouldn't be matched afterwards are properly guarded.
So, the following patch does that for vspltis_shifted too.
2022-02-08 Jakub Jelinek <jakub@redhat.com>
PR target/102140
* config/rs6000/rs6000.cc (vspltis_shifted): Return false also if
split1 pass has finished already.
* gcc.dg/pr102140.c: New test.
|
|
2022-02-08 Bill Schmidt <wschmidt@linux.ibm.com>
gcc/
* config/rs6000/rs6000-builtins.def (VMSUMCUD): New.
* config/rs6000/rs6000-overload.def (VEC_MSUMC): New.
* config/rs6000/vsx.md (UNSPEC_VMSUMCUD): New constant.
(vmsumcud): New define_insn.
gcc/testsuite/
* gcc.target/powerpc/vec-msumc.c: New test.
|
|
Fixed by r12-1829.
PR c++/104425
gcc/testsuite/ChangeLog:
* g++.dg/template/partial-specialization10.C: New test.
|
|
With the commit "[nvptx] Choose -mptx default based on -misa" I introduced a
use of PTX_ISA_SM70, without adding it first.
Add it, as well as the corresponding TARGET_SM70.
Build for x86_64 with nvptx accelerator.
gcc/ChangeLog:
2022-02-08 Tom de Vries <tdevries@suse.de>
* config/nvptx/nvptx-opts.h (enum ptx_isa): Add PTX_ISA_SM70.
* config/nvptx/nvptx.h (TARGET_SM70): Define.
|
|
This patch changes the costs for a load on condition from 5 to 6 in
order to ensure that we only if-convert two and not three or more SETS like
if (cond)
{
a = b;
c = d;
e = f;
}
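For readers unfamiliar with if-conversion, the sketch below shows at source level what converting such a block means: each guarded assignment becomes a conditional select. This is only a conceptual illustration with hypothetical names; it says nothing about the instructions the s390 backend actually emits.

```cpp
struct Values {
  int a, c, e;
};

// Branching form, mirroring the snippet above.
Values with_branch(bool cond, Values v, int b, int d, int f) {
  if (cond) {
    v.a = b;
    v.c = d;
    v.e = f;
  }
  return v;
}

// If-converted form: one conditional select per assignment, no branch.
Values with_selects(bool cond, Values v, int b, int d, int f) {
  v.a = cond ? b : v.a;
  v.c = cond ? d : v.c;
  v.e = cond ? f : v.e;
  return v;
}
```

Each select corresponds to one load on condition, which is why the per-instruction cost determines how many assignments are worth converting.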
In the movqicc expander we emit a paradoxical subreg directly that
combine would otherwise try to create by using a non-optimal sequence
(which would be too expensive).
Also, fix two oversights in ifcvt testcases.
gcc/ChangeLog:
* config/s390/s390.cc (s390_rtx_costs): Increase costs for load
on condition.
* config/s390/s390.md: Use paradoxical subreg.
gcc/testsuite/ChangeLog:
* gcc.target/s390/ifcvt-two-insns-int.c: Fix array size.
* gcc.target/s390/ifcvt-two-insns-long.c: Ditto.
|
|
This adds a check for a paradoxical subreg in reg_subword_p ()
in order to prevent an ICE on s390 in try_combine () triggered
by the movqicc expander.
gcc/ChangeLog:
* combine.cc (reg_subword_p): Check for paradoxical subreg.
|