Age | Commit message (Collapse) | Author | Files | Lines |
|
Include V_REGS for ALL_REGS.
gcc/ChangeLog:
* config/riscv/riscv.h (REG_CLASS_CONTENTS): Fix ALL_REGS.
|
|
|
|
'libobjc/thr.c' includes 'gthr.h'. While all the other gthread headers
have `#ifdef _LIBOBJC` checks, and provide a different set of inline
functions, I think having one header provide two completely unrelated
set of APIs is unsatisfactory, complicates maintenance, and hinders
further development.
This commit references a new header for libobjc, and adds a copyright
notice, as in other headers.
libgcc/ChangeLog:
* config/i386/gthr-mcf.h: Include 'gthr_libobjc.h' when building
libobjc, instead of 'gthr.h'
|
|
|
|
Check for use of previously uninitialized variables; call gcc_unreachable().
Replace abort() with gcc_unreachable().
2022-10-22 Michael Eager <eager@eagercon.com>
gcc/
* config/microblaze/microblaze.cc
(microblaze_legitimize_address): Initialize 'reg' to NULL, check for NULL.
(microblaze_address_insns): Replace abort() with gcc_unreachable().
(print_operand_address): Same.
(microblaze_expand_move): Initialize 'p1' to NULL, check for NULL.
(get_branch_target): Replace abort() with gcc_unreachable().
|
|
[-Inf, +Inf] +-NAN gets normalized as VARYING. There is a test that
drops the NAN possibility, and tests that the range is no longer
VARYING but [-Inf, +Inf]. However, for -ffinite-math-only targets
(Vax, RX, etc) the range would still be VARYING because the VARYING
range never had a NAN to begin with. This fixes the test.
I have a precommit hook that does self-tests with
-fno-finite-math-only, -ffinite-math-only, and -ffast-math as a sanity
check, but my precommit hook last week was disabled because there was
a tree-ssa.exp in mainline failing which was throwing off my scripts.
My apologies.
gcc/ChangeLog:
* value-range.cc (range_tests_floats): Predicate [-Inf, +Inf] test
with !flag_finite_math_only.
|
|
This patch offers an additional allocable register by RA for the CALL0
ABI.
> Register a0 holds the return address upon entry to a function, but
> unlike the windowed register ABI, it is not reserved for this purpose
> and may hold other values after the return address has been saved.
- Xtensa ISA Reference Manual,
8.1.2 "CALL0 Register Usage and Stack Layout" [p.589]
gcc/ChangeLog:
* config/xtensa/xtensa.cc (xtensa_conditional_register_usage):
Remove register A0 from FIXED_REGS if the CALL0 ABI.
(xtensa_expand_epilogue): Change to emit '(use (reg:SI A0_REG))'
unconditionally after restoring callee-saved registers for
sibling-call functions, in order to prevent misleading that
register A0 is free to use.
|
|
|
|
gcc/fortran/ChangeLog:
PR fortran/100097
PR fortran/100098
* trans-array.cc (gfc_trans_class_array): New function to
initialize class descriptor's TKR information.
* trans-array.h (gfc_trans_class_array): Add function prototype.
* trans-decl.cc (gfc_trans_deferred_vars): Add calls to the new
function for both pointers and allocatables.
gcc/testsuite/ChangeLog:
PR fortran/100097
PR fortran/100098
* gfortran.dg/PR100097.f90: New test.
* gfortran.dg/PR100098.f90: New test.
|
|
As the testcase shows, when cbranchbf4/cstorebf4 patterns are defined,
we can get ICEs for conditional moves.
The problem is that the generic conditional move expansion just calls
prepare_cmp_insn which just checks that such a cbranch<mode>4 exists
and returns directly such comparison and passes it down to the conditional
move optabs.
The following patch fixes it by punting if the comparisons aren't
ix86_fp_comparison_operator (to tell the generic code it should separately
compare) and to handle the promotion of BFmode comparison operands to
SFmode such that comparison is performed in SFmode.
2022-10-21 Jakub Jelinek <jakub@redhat.com>
PR target/107322
* config/i386/i386-expand.cc (ix86_prepare_fp_compare_args): For
BFmode comparisons promote arguments to SFmode and recurse.
(ix86_expand_int_movcc, ix86_expand_fp_movcc): Return false early
if comparison operands are BFmode and operands[1] is not
ix86_fp_comparison_operator.
* gcc.target/i386/pr107322.c: New test.
|
|
cxx_eval_constant_expression [PR107295]
The excess precision support broke building skia (dependency of firefox)
on ia32 (it has something like the a constexpr variable), but as the other
cases show, it is actually a preexisting problem if one uses casts from
constants with wider floating point types.
The problem is that cxx_eval_constant_expression tries to short-cut
processing of TREE_CONSTANT CONSTRUCTORs if they satisfy
reduced_constant_expression_p - instead of calling cxx_eval_bare_aggregate
on them it just verifies flags and if they are TREE_CONSTANT even after
that, just fold.
Now, on the testcase we have a TREE_CONSTANT CONSTRUCTOR containing
TREE_CONSTANT NOP_EXPR of REAL_CST. And, fold, which isn't recursive,
doesn't optimize that into VECTOR_CST, while later on we are only able
to optimize VECTOR_CST arithmetics, not arithmetics with vector
CONSTRUCTORs.
The following patch fixes that by rejecting CONSTRUCTORs with vector type
in reduced_constant_expression_p regardless of whether they have
CONSTRUCTOR_NO_CLEARING set or not, folding result in cxx_eval_bare_aggregate
even if nothing has changed but it wasn't non-constant and removing folding
from the TREE_CONSTANT reduced_constant_expression_p short-cut.
2022-10-21 Jakub Jelinek <jakub@redhat.com>
PR c++/107295
* constexpr.cc (reduced_constant_expression_p) <case CONSTRUCTOR>:
Return false for VECTOR_TYPE CONSTRUCTORs even without
CONSTRUCTOR_NO_CLEARING set on them.
(cxx_eval_bare_aggregate): If constant but !changed, fold before
returning VECTOR_TYPE_P CONSTRUCTOR.
(cxx_eval_constant_expression) <case CONSTRUCTOR>: Don't fold
TREE_CONSTANT CONSTRUCTOR, just return it.
* g++.dg/ext/vector42.C: New test.
|
|
2022-09-28 Tejas Joshi <TejasSanjay.Joshi@amd.com>
gcc/ChangeLog:
* common/config/i386/cpuinfo.h (get_amd_cpu): Recognize znver4.
* common/config/i386/i386-common.cc (processor_names): Add znver4.
(processor_alias_table): Add znver4 and modularize old znvers.
* common/config/i386/i386-cpuinfo.h (processor_subtypes):
AMDFAM19H_ZNVER4.
* config.gcc (x86_64-*-* |...): Likewise.
* config/i386/driver-i386.cc (host_detect_local_cpu): Let
-march=native recognize znver4 cpus.
* config/i386/i386-c.cc (ix86_target_macros_internal): Add znver4.
* config/i386/i386-options.cc (m_ZNVER4): New definition.
(m_ZNVER): Include m_ZNVER4.
(processor_cost_table): Add znver4.
* config/i386/i386.cc (ix86_reassociation_width): Likewise.
* config/i386/i386.h (processor_type): Add PROCESSOR_ZNVER4.
(PTA_ZNVER1): New definition.
(PTA_ZNVER2): Likewise.
(PTA_ZNVER3): Likewise.
(PTA_ZNVER4): Likewise.
* config/i386/i386.md (define_attr "cpu"): Add znver4 and rename
md file.
* config/i386/x86-tune-costs.h (znver4_cost): New definition.
* config/i386/x86-tune-sched.cc (ix86_issue_rate): Add znver4.
(ix86_adjust_cost): Likewise.
* config/i386/znver1.md: Rename to znver.md.
* config/i386/znver.md: Add new reservations for znver4.
* doc/extend.texi: Add details about znver4.
* doc/invoke.texi: Likewise.
gcc/testsuite/ChangeLog:
* gcc.target/i386/funcspec-56.inc: Handle new march.
* g++.target/i386/mv29.C: Likewise.
|
|
This saves us a build flag when building for freestanding targets.
libstdc++-v3/ChangeLog:
* acinclude.m4: Default hosted to off if building without
headers and without newlib.
* configure: Regenerate.
|
|
The std::move_only_function::__param_t alias template attempts to
optimize argument passing for the invoker, by passing by rvalue
reference for types that are non-trivial or large. However, the
precondition for is_trivally_copyable makes it unsuitable for using
here, and can cause ODR violations. Just use is_scalar instead, and pass
all class types (even small, trivial ones) by value.
libstdc++-v3/ChangeLog:
* include/bits/mofunc_impl.h (move_only_function::__param_t):
Use __is_scalar instead of is_trivially_copyable.
* testsuite/20_util/move_only_function/call.cc: Check parameters
involving incomplete types.
|
|
[PR107195, PR107344]
That is, adjust for optimization introduced with recent
commit r13-3217-gc4d15dddf6b9eacb36f535807ad2ee364af46e04
"[PR107195] Set range to zero when nonzero mask is 0", where GCC now
understands that after 'r *= 2;', 'r & 1' will never hold here, and thus
transforms/optimizes/"disturbs" the original code such that GCC/nvptx's later
"Neuter whole SESE regions" optimization no longer is applicable to it:
UNSUPPORTED: libgomp.oacc-c/../libgomp.oacc-c-c++-common/nvptx-sese-1.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none -O0
PASS: libgomp.oacc-c/../libgomp.oacc-c-c++-common/nvptx-sese-1.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none -O2 (test for excess errors)
PASS: libgomp.oacc-c/../libgomp.oacc-c-c++-common/nvptx-sese-1.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none -O2 execution test
[-PASS:-]{+FAIL:+} libgomp.oacc-c/../libgomp.oacc-c-c++-common/nvptx-sese-1.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none -O2 scan-nvptx-none-offload-rtl-dump mach "SESE regions:.* [0-9]+{[0-9]+->[0-9]+(\\.[0-9]+)+}"
Same for C++.
It's unclear to me if this is an actual "problem", which optimization is "more
important", so I've filed PR107344 "GCC/nvptx SESE region optimization" to
capture this question, and here restore what we intend to be testing (to my
understanding) in 'libgomp.oacc-c-c++-common/nvptx-sese-1.c'.
PR tree-optimization/107195
PR target/107344
libgomp/
* testsuite/libgomp.oacc-c-c++-common/nvptx-sese-1.c: Restore SESE
regions checking.
|
|
... to display optimization performed as of recent
commit r13-3217-gc4d15dddf6b9eacb36f535807ad2ee364af46e04
"[PR107195] Set range to zero when nonzero mask is 0".
PR tree-optimization/107195
gcc/testsuite/
* gcc.dg/tree-ssa/pr107195-3.c: New.
|
|
The following reverts part of the PR94125 fix which causes us to
use a bogus partition ordering after applying versioning for
alias to the testcase in PR107323. Instead PR94125 is fixed by
appropriately considering to be merged SCCs when skipping edges
we want to ignore because of the alias versioning.
PR tree-optimization/107323
* tree-loop-distribution.cc (pg_unmark_merged_alias_ddrs):
New function.
(loop_distribution::break_alias_scc_partitions): Revert
postorder save/restore from the PR94125 fix. Instead
make sure to not ignore edges from SCCs we are going to
merge.
* gcc.dg/tree-ssa/pr107323.c: New testcase.
|
|
gcc/ChangeLog:
* config/riscv/riscv.md: Add atomic type attribute.
* config/riscv/sync.md: Add atomic type for atomic instructions.
|
|
The pr54346.c testcase FAILs on i686-linux (without -msse*) for multiple
reasons. One is the trivial missing -Wno-psabi which the following patch
adds, but that isn't enough. The thing is that without native vector
support, we have VEC_PERM_EXPRs in the IL and are actually considering
the nested VEC_PERM_EXPRs into one VEC_PERM_EXPR optimization, but punt
because can_vec_perm_const_p (result_mode, op_mode, sel2, false) is false.
Such a test makes sense to prevent "optimizing" two VEC_PERM_EXPRs
that can be handled by the backend natively into one VEC_PERM_EXPR
that can't be handled. But if both of the original VEC_PERM_EXPRs
can't be handled natively either, having just one VEC_PERM_EXPR that will be
lowered by generic vec lowering is IMHO still better than 2.
Or even if we trade just one VEC_PERM_EXPR that can't be handled plus
one that can to one that can't be handled.
Also, removing the testcase's executable permissions...
2022-10-21 <jakub@redhat.com>
PR tree-optimization/54346
* match.pd ((vec_perm (vec_perm@0 @1 @2 VECTOR_CST) @0 VECTOR_CST)):
Optimize nested VEC_PERM_EXPRs even if target can't handle the
new one provided we don't increase number of VEC_PERM_EXPRs the
target can't handle.
* gcc.dg/pr54346.c: Add -Wno-psabi to dg-options.
|
|
We ICE on the following testcase during mangling, finish_compound_literal
returns for void{} void_node and the mangler doesn't handle it.
Handling void_node in the mangler seems problematic to me, because
we don't know for which case it has been created.
The following patch arranges to mangle it as other compound literals
with no operands, so it demangles as void{}, by returning a void type
COMPOUND_LITERAL_P with no elements if processing_template_decl.
Otherwise it keeps returning void_node.
2022-10-21 Jakub Jelinek <jakub@redhat.com>
PR c++/106863
* semantics.cc (finish_compound_literal): For void{}, if
processing_template_decl return a COMPOUND_LITERAL_P
CONSTRUCTOR rather than void_node.
* g++.dg/cpp0x/dr2351-2.C: New test.
|
|
https://sourceware.org/bugzilla/show_bug.cgi?id=18632
The bundled libreadline is always built, even if the system is
./configure'd --with-system-readline and the build libreadline.a is not
used.
Proposed patch:
Fix ./configure.ac not to proceed readline/, when --with-system-
readline is provided
* configure.ac: Don't configure readline if --with-system-readline is
used.
* configure: Re-generate.
|
|
gcc/ChangeLog:
* config.gcc: Add riscv-vector-builtins-bases.o and riscv-vector-builtins-shapes.o
* config/riscv/riscv-vector-builtins.cc (DEF_RVV_I_OPS): New macro.
(DEF_RVV_FUNCTION): Ditto.
(handle_pragma_vector): Add intrinsic framework.
* config/riscv/riscv.cc (riscv_print_operand): Add operand print for vsetvl/vsetvlmax.
* config/riscv/riscv.md: include vector.md.
* config/riscv/t-riscv: Add riscv-vector-builtins-bases.o and riscv-vector-builtins-shapes.o
* config/riscv/riscv-vector-builtins-bases.cc: New file.
* config/riscv/riscv-vector-builtins-bases.h: New file.
* config/riscv/riscv-vector-builtins-functions.def: New file.
* config/riscv/riscv-vector-builtins-shapes.cc: New file.
* config/riscv/riscv-vector-builtins-shapes.h: New file.
* config/riscv/riscv-vector-builtins-types.def: New file.
* config/riscv/vector.md: New file.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/base/vsetvl-1.c: New test.
|
|
gcc/ChangeLog:
* config.gcc: Add gt files since function_instance is GTY ((user)).
* config/riscv/riscv-builtins.cc (riscv_init_builtins): Add RVV intrinsic framework.
(riscv_builtin_decl): Ditto.
(riscv_expand_builtin): Ditto.
* config/riscv/riscv-protos.h (builtin_decl): New function.
(expand_builtin): Ditto.
(enum riscv_builtin_class): New enum to classify RVV intrinsic and RISC-V general built-in.
* config/riscv/riscv-vector-builtins.cc (class GTY): New declaration.
(struct registered_function_hasher): New struct.
(DEF_RVV_OP_TYPE): New macro.
(DEF_RVV_TYPE): Ditto.
(DEF_RVV_PRED_TYPE): Ditto.
(GTY): New declaration.
(add_attribute): New function.
(check_required_extensions): Ditto.
(rvv_arg_type_info::get_tree_type): Ditto.
(function_instance::function_instance): Ditto.
(function_instance::operator==): Ditto.
(function_instance::any_type_float_p): Ditto.
(function_instance::get_return_type): Ditto.
(function_instance::get_arg_type): Ditto.
(function_instance::hash): Ditto.
(function_instance::call_properties): Ditto.
(function_instance::reads_global_state_p): Ditto.
(function_instance::modifies_global_state_p): Ditto.
(function_instance::could_trap_p): Ditto.
(function_builder::function_builder): Ditto.
(function_builder::~function_builder): Ditto.
(function_builder::allocate_argument_types): Ditto.
(function_builder::register_function_group): Ditto.
(function_builder::append_name): Ditto.
(function_builder::finish_name): Ditto.
(function_builder::get_attributes): Ditto.
(function_builder::add_function): Ditto.
(function_builder::add_unique_function): Ditto.
(function_call_info::function_call_info): Ditto.
(function_expander::function_expander): Ditto.
(function_expander::add_input_operand): Ditto.
(function_expander::generate_insn): Ditto.
(registered_function_hasher::hash): Ditto.
(registered_function_hasher::equal): Ditto.
(builtin_decl): Ditto.
(expand_builtin): Ditto.
(gt_ggc_mx): Define for using GCC garbage collect.
(gt_pch_nx): Define for using GCC garbage collect.
* config/riscv/riscv-vector-builtins.def (DEF_RVV_OP_TYPE): New macro.
(DEF_RVV_PRED_TYPE): Ditto.
(vbool64_t): Add suffix.
(vbool32_t): Ditto.
(vbool16_t): Ditto.
(vbool8_t): Ditto.
(vbool4_t): Ditto.
(vbool2_t): Ditto.
(vbool1_t): Ditto.
(vint8mf8_t): Ditto.
(vuint8mf8_t): Ditto.
(vint8mf4_t): Ditto.
(vuint8mf4_t): Ditto.
(vint8mf2_t): Ditto.
(vuint8mf2_t): Ditto.
(vint8m1_t): Ditto.
(vuint8m1_t): Ditto.
(vint8m2_t): Ditto.
(vuint8m2_t): Ditto.
(vint8m4_t): Ditto.
(vuint8m4_t): Ditto.
(vint8m8_t): Ditto.
(vuint8m8_t): Ditto.
(vint16mf4_t): Ditto.
(vuint16mf4_t): Ditto.
(vint16mf2_t): Ditto.
(vuint16mf2_t): Ditto.
(vint16m1_t): Ditto.
(vuint16m1_t): Ditto.
(vint16m2_t): Ditto.
(vuint16m2_t): Ditto.
(vint16m4_t): Ditto.
(vuint16m4_t): Ditto.
(vint16m8_t): Ditto.
(vuint16m8_t): Ditto.
(vint32mf2_t): Ditto.
(vuint32mf2_t): Ditto.
(vint32m1_t): Ditto.
(vuint32m1_t): Ditto.
(vint32m2_t): Ditto.
(vuint32m2_t): Ditto.
(vint32m4_t): Ditto.
(vuint32m4_t): Ditto.
(vint32m8_t): Ditto.
(vuint32m8_t): Ditto.
(vint64m1_t): Ditto.
(vuint64m1_t): Ditto.
(vint64m2_t): Ditto.
(vuint64m2_t): Ditto.
(vint64m4_t): Ditto.
(vuint64m4_t): Ditto.
(vint64m8_t): Ditto.
(vuint64m8_t): Ditto.
(vfloat32mf2_t): Ditto.
(vfloat32m1_t): Ditto.
(vfloat32m2_t): Ditto.
(vfloat32m4_t): Ditto.
(vfloat32m8_t): Ditto.
(vfloat64m1_t): Ditto.
(vfloat64m2_t): Ditto.
(vfloat64m4_t): Ditto.
(vfloat64m8_t): Ditto.
(vv): Ditto.
(vx): Ditto.
(v): Ditto.
(wv): Ditto.
(wx): Ditto.
(x_x_v): Ditto.
(vf2): Ditto.
(vf4): Ditto.
(vf8): Ditto.
(vvm): Ditto.
(vxm): Ditto.
(x_x_w): Ditto.
(v_v): Ditto.
(v_x): Ditto.
(vs): Ditto.
(mm): Ditto.
(m): Ditto.
(vf): Ditto.
(vm): Ditto.
(wf): Ditto.
(vfm): Ditto.
(v_f): Ditto.
(ta): Ditto.
(tu): Ditto.
(ma): Ditto.
(mu): Ditto.
(tama): Ditto.
(tamu): Ditto.
(tuma): Ditto.
(tumu): Ditto.
(tam): Ditto.
(tum): Ditto.
* config/riscv/riscv-vector-builtins.h (GCC_RISCV_VECTOR_BUILTINS_H): New macro.
(RVV_REQUIRE_RV64BIT): Ditto.
(RVV_REQUIRE_ZVE64): Ditto.
(RVV_REQUIRE_ELEN_FP_32): Ditto.
(RVV_REQUIRE_ELEN_FP_64): Ditto.
(enum operand_type_index): New enum.
(DEF_RVV_OP_TYPE): New macro.
(enum predication_type_index): New enum.
(DEF_RVV_PRED_TYPE): New macro.
(enum rvv_base_type): New enum.
(struct rvv_builtin_suffixes): New struct.
(struct rvv_arg_type_info): Ditto.
(struct rvv_type_info): Ditto.
(struct rvv_op_info): Ditto.
(class registered_function): New class.
(class function_base): Ditto.
(class function_shape): Ditto.
(struct function_group_info): New struct.
(class GTY): New class.
(class function_builder): Ditto.
(class function_call_info): Ditto.
(function_call_info::function_returns_void_p): New function.
(class function_expander): New class.
(function_instance::operator!=): New function.
(function_expander::expand): Ditto.
(function_expander::add_input_operand): Ditto.
(function_base::call_properties): Ditto.
|
|
gcc/ChangeLog:
* config/i386/sse.md (ssedvecmode): Rename from VI1SI.
(ssedvecmodelower): Rename from vi1si.
(sdot_prod<mode>): New define_expand.
(udot_prod<mode>): Ditto.
gcc/testsuite/ChangeLog:
* gcc.target/i386/vnniint8-auto-vectorize-1.c: New test.
* gcc.target/i386/vnniint8-auto-vectorize-2.c: Ditto.
|
|
gcc/ChangeLog
* common/config/i386/cpuinfo.h (get_available_features): Detect
avxvnniint8.
* common/config/i386/i386-common.cc
(OPTION_MASK_ISA2_AVXVNNIINT8_SET): New.
(OPTION_MASK_ISA2_AVXVNNIINT8_UNSET): Ditto.
(ix86_handle_option): Handle -mavxvnniint8.
* common/config/i386/i386-cpuinfo.h (enum processor_features):
Add FEATURE_AVXVNNIINT8.
* common/config/i386/i386-isas.h: Add ISA_NAME_TABLE_ENTRY for
avxvnniint8.
* config.gcc: Add avxvnniint8intrin.h.
* config/i386/avxvnniint8intrin.h: New file.
* config/i386/cpuid.h (bit_AVXVNNIINT8): New.
* config/i386/i386-builtin.def: Add new builtins.
* config/i386/i386-c.cc (ix86_target_macros_internal): Define
__AVXVNNIINT8__.
* config/i386/i386-options.cc (isa2_opts): Add -mavxvnniint8.
(ix86_valid_target_attribute_inner_p): Handle avxvnniint8.
* config/i386/i386-isa.def: Add DEF_PTA(AVXVNNIINT8) New..
* config/i386/i386.opt: Add option -mavxvnniint8.
* config/i386/immintrin.h: Include avxvnniint8intrin.h.
* config/i386/sse.md (UNSPEC_VPMADDUBSWACCD
UNSPEC_VPMADDUBSWACCSSD,UNSPEC_VPMADDWDACCD,
UNSPEC_VPMADDWDACCSSD): Rename according to new style.
(vpdp<vpdotprodtype>_<mode>): New define_insn.
* doc/extend.texi: Document avxvnniint8.
* doc/invoke.texi: Document -mavxvnniint8.
* doc/sourcebuild.texi: Document target avxvnniint8.
gcc/testsuite/ChangeLog
* g++.dg/other/i386-2.C: Add -mavxvnniint8.
* g++.dg/other/i386-3.C: Ditto.
* gcc.target/i386/avx-check.h: Add avxvnniint8 check.
* gcc.target/i386/sse-12.c: Add -mavxvnniint8.
* gcc.target/i386/sse-13.c: Ditto.
* gcc.target/i386/sse-14.c: Ditto.
* gcc.target/i386/sse-22.c: Ditto.
* gcc.target/i386/sse-23.c: Ditto.
* gcc.target/i386/funcspec-56.inc: Add new target attribute.
* lib/target-supports.exp
(check_effective_target_avxvnniint8): New.
* gcc.target/i386/avxvnniint8-1.c: Ditto.
* gcc.target/i386/avxvnniint8-vpdpbssd-2.c: Ditto.
* gcc.target/i386/avxvnniint8-vpdpbssds-2.c: Ditto.
* gcc.target/i386/avxvnniint8-vpdpbsud-2.c: Ditto.
* gcc.target/i386/avxvnniint8-vpdpbsuds-2.c: Ditto.
* gcc.target/i386/avxvnniint8-vpdpbuud-2.c: Ditto.
* gcc.target/i386/avxvnniint8-vpdpbuuds-2.c: Ditto.
Co-authored-by: Hongyu Wang <hongyu.wang@intel.com>
Co-authored-by: Haochen Jiang <haochen.jiang@intel.com>
|
|
gcc/
* common/config/i386/i386-common.cc
(OPTION_MASK_ISA_AVXIFMA_SET, OPTION_MASK_ISA2_AVXIFMA_UNSET,
OPTION_MASK_ISA2_AVX2_UNSET): New macro.
(ix86_handle_option): Handle -mavxifma.
* common/config/i386/i386-cpuinfo.h (processor_types): Add
FEATURE_AVXIFMA.
* common/config/i386/i386-isas.h: Add ISA_NAME_TABLE_ENTRY for
avxifma.
* common/config/i386/cpuinfo.h (get_available_features):
Detect avxifma.
* config.gcc: Add avxifmaintrin.h
* config/i386/avx512ifmavlintrin.h: (_mm_madd52lo_epu64): Change
to macro.
(_mm_madd52hi_epu64): Likewise.
(_mm256_madd52lo_epu64): Likewise.
(_mm256_madd52hi_epu64): Likewise.
* config/i386/avxifmaintrin.h: New header.
* config/i386/cpuid.h (bit_AVXIFMA): New.
* config/i386/i386-builtin.def: Add new builtins, and correct
pattern names for AVX512IFMA.
* config/i386/i386-builtins.cc (def_builtin): Handle AVX-IFMA
builtins like AVX-VNNI.
* config/i386/i386-c.cc (ix86_target_macros_internal): Define
__AVXIFMA__.
* config/i386/i386-expand.cc (ix86_check_builtin_isa_match):
Relax ISA masks for AVXIFMA.
* config/i386/i386-isa.def: Add AVXIFMA.
* config/i386/i386-options.cc (isa2_opts): Add -mavxifma.
(ix86_valid_target_attribute_inner_p): Handle avxifma.
* config/i386/i386.md (isa): Add attr avxifma and avxifmavl.
* config/i386/i386.opt: Add option -mavxifma.
* config/i386/immintrin.h: Inculde avxifmaintrin.h.
* config/i386/sse.md (avx_vpmadd52<vpmadd52type>_<mode>):
Remove.
(vpamdd52<vpmadd52type><mode><sd_maskz_name>): Remove.
(vpamdd52huq<mode>_maskz): Rename to ...
(vpmadd52huq<mode>_maskz): ... this.
(vpamdd52luq<mode>_maskz): Rename to ...
(vpmadd52luq<mode>_maskz): ... this.
(vpmadd52<vpmadd52type><mode>): New define_insn.
(vpmadd52<vpmadd52type>v8di): Likewise.
(vpmadd52<vpmadd52type><mode>_maskz_1): Likewise.
(vpamdd52<vpmadd52type><mode>_mask): Rename to ...
(vpmadd52<vpmadd52type><mode>_mask): ... this.
* doc/invoke.texi: Document -mavxifma.
* doc/extend.texi: Document avxifma.
* doc/sourcebuild.texi: Document target avxifma.
gcc/testsuite/
* gcc.target/i386/avx-check.h: Add avxifma check.
* gcc.target/i386/avx512ifma-vpmaddhuq-1.c: Remane..
* gcc.target/i386/avx512ifma-vpmaddhuq-1a.c: To this.
* gcc.target/i386/avx512ifma-vpmaddluq-1.c: Ditto.
* gcc.target/i386/avx512ifma-vpmaddluq-1a.c: Ditto.
* gcc.target/i386/avx512ifma-vpmaddhuq-1b.c: New Test.
* gcc.target/i386/avx512ifma-vpmaddluq-1b.c: Ditto.
* gcc.target/i386/avx-ifma-1.c: Ditto.
* gcc.target/i386/avx-ifma-2.c: Ditto.
* gcc.target/i386/avx-ifma-3.c: Ditto.
* gcc.target/i386/avx-ifma-4.c: Ditto.
* gcc.target/i386/avx-ifma-5.c: Ditto.
* gcc.target/i386/avx-ifma-6.c: Ditto.
* gcc.target/i386/avx-ifma-vpmaddhuq-2.c: Ditto.
* gcc.target/i386/avx-ifma-vpmaddluq-2.c: Ditto.
* gcc.target/i386/sse-12.c: Add -mavxifma.
* gcc.target/i386/sse-13.c: Ditto.
* gcc.target/i386/sse-14.c: Ditto.
* gcc.target/i386/sse-22.c: Ditto.
* gcc.target/i386/sse-23.c: Ditto.
* g++.dg/other/i386-2.C: Ditto.
* g++.dg/other/i386-3.C: Ditto.
* gcc.target/i386/funcspec-56.inc: Add new target attribute.
* lib/target-supports.exp
(check_effective_target_avxifma): New.
|
|
|
|
gcc/fortran/ChangeLog:
PR fortran/105633
* expr.cc (find_array_section): Move check for NULL pointers so
that both subscript triplets and vector subscripts are covered.
gcc/testsuite/ChangeLog:
PR fortran/105633
* gfortran.dg/pr105633.f90: New test.
Co-authored-by: Steven G. Kargl <kargl@gcc.gnu.org>
|
|
With the upcoming [[assume]] work, Andrew has pointed out that
non-irange ranges are not handled in get_range_global for
SSA_NAME_IS_DEFAULT_DEF. This patch fixes the oversight.
PR c++/106654
gcc/ChangeLog:
* value-query.cc (get_range_global): Handle non integer ranges for
default def SSA names.
|
|
gcc/ChangeLog:
* range-op-float.cc (foperator_unordered_lt::op1_range): New.
(foperator_unordered_lt::op2_range): New.
|
|
This patch stops reporting fails for Arm targets with single
precision floating point unit for types wider than 32 bits (the width
of float on arm-none-eabi).
As reported in PR102017, fenv is reported as supported in recent
versions of newlib. At the same time, for some Arm targets, the
implementation in libgcc does not support exceptions and thus, the
test fails with a call to abort().
gcc/testsuite/ChangeLog:
* lib/target-supports.exp
(check_effective_target_fenv_exceptions_double): New.
(check_effective_target_fenv_exceptions_long_double): New.
* gcc.dg/c2x-float-7.c: Split into 3 tests...
* gcc.dg/c2x-float-7a.c: Float part of c2x-float-7.c.
* gcc.dg/c2x-float-7b.c: Double part of c2x-float-7.c.
* gcc.dg/c2x-float-7c.c: Long double part of c2x-float-7.c.
* gcc.dg/pr95115.c: Switch to fenv_exceptions_double.
* gcc.dg/torture/float32x-nan-floath.c: Likewise.
* gcc.dg/torture/float32x-nan.c: Likewise.
* gcc.dg/torture/float64-nan-floath.c: Likewise.
* gcc.dg/torture/float64-nan.c: Likewise.
* gcc.dg/torture/inf-compare-1.c: Likewise.
* gcc.dg/torture/inf-compare-2.c: Likewise.
* gcc.dg/torture/inf-compare-3.c: Likewise.
* gcc.dg/torture/inf-compare-4.c: Likewise.
* gcc.dg/torture/inf-compare-5.c: Likewise.
* gcc.dg/torture/inf-compare-6.c: Likewise.
* gcc.dg/torture/inf-compare-7.c: Likewise.
* gcc.dg/torture/inf-compare-8.c: Likewise.
* gcc.dg/torture/pr52451.c: Likewise.
* gcc.dg/torture/pr82692.c: Likewise.
* gcc.dg/torture/inf-compare-1-float.c: New test.
* gcc.dg/torture/inf-compare-2-float.c: New test.
* gcc.dg/torture/inf-compare-3-float.c: New test.
* gcc.dg/torture/inf-compare-4-float.c: New test.
* gcc.dg/torture/inf-compare-5-float.c: New test.
* gcc.dg/torture/inf-compare-6-float.c: New test.
* gcc.dg/torture/inf-compare-7-float.c: New test.
* gcc.dg/torture/inf-compare-8-float.c: New test.
Co-Authored-By: Yvan ROUX <yvan.roux@foss.st.com>
Signed-off-by: Torbjörn SVENSSON <torbjorn.svensson@foss.st.com>
|
|
Here we're crashing during constraint matching for the instantiated
hidden friends due to two issues with dependent substitution into a
TEMPLATE_ID_EXPR that names a template from the current instantiation
(as for C<1> with T=T from maybe_substitute_reqs_for):
* tsubst_copy substitutes into such a TEMPLATE_DECL by looking it up
from the substituted class scope. But for this lookup to work when
the args are dependent, we need to substitute the class scope with
entering_scope=true so that we obtain the primary template type
A<T> (which has TYPE_BINFO) instead of the implicit instantiation
A<T> (which doesn't).
* lookup_and_finish_template_variable shouldn't instantiate a
TEMPLATE_ID_EXPR that names a TEMPLATE_DECL which has more than
one level of (unsubstituted) parameters (such as A<T>::C).
gcc/cp/ChangeLog:
* pt.cc (lookup_and_finish_template_variable): Don't
instantiate if the template's scope is dependent.
(tsubst_copy) <case TEMPLATE_DECL>: Pass entering_scope=true
when substituting the class scope.
gcc/testsuite/ChangeLog:
* g++.dg/cpp2a/concepts-friend10.C: New test.
|
|
Fix PR99619, which asks to optimize TLS model based on visibility.
The fix is implemented as an IPA optimization: this allows to take
optimized visibility status into account (as well as avoid modifying
all language frontends).
2022-04-17 Artem Klimov <jakmobius@gmail.com>
gcc/ChangeLog:
PR middle-end/99619
* ipa-visibility.cc (function_and_variable_visibility): Promote
TLS access model afer visibility optimizations.
* varasm.cc (have_optimized_refs): New helper.
(optimize_dyn_tls_for_decl_p): New helper. Use it ...
(decl_default_tls_model): ... here in place of 'optimize' check.
gcc/testsuite/ChangeLog:
PR middle-end/99619
* gcc.dg/tls/vis-attr-gd.c: New test.
* gcc.dg/tls/vis-attr-hidden-gd.c: New test.
* gcc.dg/tls/vis-attr-hidden.c: New test.
* gcc.dg/tls/vis-flag-hidden-gd.c: New test.
* gcc.dg/tls/vis-flag-hidden.c: New test.
* gcc.dg/tls/vis-pragma-hidden-gd.c: New test.
* gcc.dg/tls/vis-pragma-hidden.c: New test.
Co-Authored-By: Alexander Monakov <amonakov@gcc.gnu.org>
Signed-off-by: Artem Klimov <jakmobius@gmail.com>
|
|
The false side of UNORDERED_<cond> means neither operand can be a NAN.
Adjust all the op[12]_range entries for the UNORDERED operators such
that a known NAN on one operands means the other operands is
undefined.
gcc/ChangeLog:
* range-op-float.cc (foperator_unordered_le::op1_range): Adjust
false side with a NAN operand.
(foperator_unordered_le::op2_range): Same.
(foperator_unordered_gt::op1_range): Same.
(foperator_unordered_gt::op2_range): Same.
(foperator_unordered_ge::op1_range): Same.
(foperator_unordered_ge::op2_range): Same.
(foperator_unordered_equal::op1_range): Same.
|
|
Here node_template_info is overlooking that CONCEPT_DECL has TEMPLATE_INFO
too, which causes get_originating_module_decl for the CONCEPT_DECL to not
return the corresponding TEMPLATE_DECL, which leads to an ICE from
import_entity_index while pretty printing the CONCEPT_DECL's module
suffix as part of the static assert failure elaboration.
PR c++/102963
gcc/cp/ChangeLog:
* module.cc (node_template_info): Handle CONCEPT_DECL.
gcc/testsuite/ChangeLog:
* g++.dg/modules/concept-7_a.C: New test.
* g++.dg/modules/concept-7_b.C: New test.
|
|
The 'vect_recog_bitfield_ref_pattern' was not correctly adapting the vectype
when widening the container.
gcc/ChangeLog:
PR tree-optimization/107326
* tree-vect-patterns.cc (vect_recog_bitfield_ref_pattern): Change
vectype when widening container.
gcc/testsuite/ChangeLog:
* gcc.dg/vect/pr107326.c: New test.
* gcc.dg/vect/vect-bitfield-read-7.c: New test.
|
|
After the C++23 constexpr <charconv> patch r13-3313-g378a0f1840e694 we
have some modules testsuite regressions:
FAIL: g++.dg/modules/xtreme-header-4_b.C -std=c++2b (test for excess errors)
FAIL: g++.dg/modules/xtreme-header_b.C -std=c++2b (test for excess errors)
Like with PR105297, the cause seems to be the deduced type of __table
resolving ahead of time to a local class type, which trips up modules.
And unfortunately that PR's minimal workaround of making __tables's
initializer value dependent doesn't help in this case.
So this patch works around this by avoiding using a local class for the
table type. And I suppose we should use a static data member to define
the table once for all dialects (including C++14) instead of having to
define it twice in C++23 mode (once as a static local variable and again
as a variable template for sake of constexpr evaluation).
libstdc++-v3/ChangeLog:
* include/std/charconv (__detail::__from_chars_alnum_to_val_table):
Redefine as a class template containing the members type, value and
_S_make_table. Don't use a local class as the table type.
(__detail::__table): Remove.
(__detail::__from_chars_alnum_to_val): Adjust after the above.
|
|
Since NANs can't appear in ranges for !HONOR_NANS, there's no reason
to set them in a VARYING range.
gcc/ChangeLog:
* value-range.h (frange::set_varying): Do not set NAN flags for
!HONOR_NANS.
* value-range.cc (frange::normalize_kind): Adjust for no NAN when
!HONOR_NANS.
(frange::verify_range): Same.
* range-op-float.cc (maybe_isnan): Remove flag_finite_math_only check.
|
|
The finite_operands_p function was incorrectly named, as it only
returned TRUE when !NAN. This was leftover from the initial
implementation of frange. Using the maybe_isnan() nomenclature is
more consistent and easier to understand.
gcc/ChangeLog:
* range-op-float.cc (finite_operand_p): Remove.
(finite_operands_p): Rename to...
(maybe_isnan): ...this.
(frelop_early_resolve): Use maybe_isnan instead of finite_operands_p.
(foperator_equal::fold_range): Same.
(foperator_equal::op1_range): Same.
(foperator_not_equal::fold_range): Same.
(foperator_lt::fold_range): Same.
(foperator_le::fold_range): Same.
(foperator_gt::fold_range): Same.
(foperator_ge::fold_range): Same.
|
|
The following testcases FAIL on i686-linux due to excess diagnostics
for -Wpsabi.
2022-10-20 Jakub Jelinek <jakub@redhat.com>
* gcc.target/i386/pr107271.c: Add -Wno-psabi to dg-options.
* gcc.dg/debug/btf/btf-function-3.c: Likewise.
|
|
This patch fixes a single typo in comment.
2022-10-20 Jakub Jelinek <jakub@redhat.com>
* passes.cc (pass_manager::register_pass): Fix a comment
typo - copmilation -> compilation.
|
|
Duplicate libgomp.c-c++-common/requires-4.c (as ...-4a.c) but
with using a heap-allocated instead of static memory for a variable.
This change and the added offload_device_gcn check prepare for
pseudo-USM, where the device hardware cannot access all host
memory but only managed and pinned memory; for those, requires-4.c
will fail and the new check permits to add
target { ! { offload_device_nvptx || offload_device_gcn } }
to requires-4.c; however, it has not been added yet as pseuo-USM
support is not yet on mainline. (Review is pending for the USM
patches.)
include/ChangeLog:
* gomp-constants.h (GOMP_DEVICE_HSA): Comment out unused define.
libgomp/ChangeLog:
* testsuite/lib/libgomp.exp (check_effective_target_offload_device_gcn):
New.
* testsuite/libgomp.c-c++-common/on_device_arch.h (device_arch_gcn,
on_device_arch_gcn): New.
* testsuite/libgomp.c-c++-common/requires-4a.c: New test; copied from
requires-4.c but using heap-allocated memory.
|
|
The reported regression of libgomp loop-14.C shows that there isn't
generally a good reliable place to insert the permute upfront so
the following simply restricts recurrence vectorization to the cases
where the latch value isn't defined by a PHI.
* tree-vect-loop.cc (vect_phi_first_order_recurrence_p):
Disallow latch PHI defs.
(vectorizable_recurr): Revert previous change.
|
|
After commit r13-3404-g7c55755d4c760de326809636531478fd7419e1e5
"amdgcn: Use FLAT addressing for all functions with pointer arguments [PR105421]",
"big" private data now works for GCN offloading, too.
PR target/105421
libgomp/
* testsuite/libgomp.oacc-c-c++-common/private-big-1.c: New.
|
|
The GCN backend uses a heuristic to determine whether to use FLAT or
GLOBAL addressing in a particular (offload) function: namely, if a
function takes a pointer-to-scalar parameter, it is assumed that the
pointer may refer to "flat scratch" space, and thus FLAT addressing must
be used instead of GLOBAL.
I came up with this heuristic initially whilst working on support for
moving OpenACC gang-private variables into local-data share (scratch)
memory. The assumption that only scalar variables would be transformed in
that way turned out to be wrong. For example, prior to the next patch in
the series, Fortran compiler-generated temporary structures were treated
as gang private and moved to LDS space, typically overflowing the region
allocated for such variables. That will no longer happen after that
patch is applied, but there may be other cases of structs moving to LDS
space now or in the future that this patch may be needed for.
2022-10-14 Julian Brown <julian@codesourcery.com>
PR target/105421
gcc/
* config/gcn/gcn.cc (gcn_detect_incoming_pointer_arg): Any pointer
argument forces FLAT addressing mode, not just
pointer-to-non-aggregate.
|
|
With that, we may then run plain 'autoreconf' for all of GCC's subpackages,
instead of for some of those (that don't use Automake) manually having to run
the applicable combination of 'aclocal', 'autoconf', 'autoheader'.
See also 'AC_CONFIG_MACRO_DIRS'/'AC_CONFIG_MACRO_DIR' usage elsewhere.
gcc/
* configure.ac (AC_CONFIG_MACRO_DIRS): Instantiate.
* configure: Regenerate.
libobjc/
* configure.ac (AC_CONFIG_MACRO_DIRS): Instantiate.
* configure: Regenerate.
|
|
Add an aarch64_sve::gimple_folder helper for folding calls
to integer constants. SME will make more use of this.
gcc/
* config/aarch64/aarch64-sve-builtins.h
(gimple_folder::fold_to_cstu): New member function.
* config/aarch64/aarch64-sve-builtins.cc
(gimple_folder::fold_to_cstu): Define.
* config/aarch64/aarch64-sve-builtins-base.cc
(svcnt_bhwd_impl::fold): Use it.
|
|
Now that the codebase is C++11, we can use using directives
to inherit constructors from base classes.
gcc/
* config/aarch64/aarch64-sve-builtins-functions.h (quiet)
(rtx_code_function, rtx_code_function_rotated, unspec_based_function)
(unspec_based_function_rotated, unspec_based_function_exact_insn)
(unspec_based_fused_function, unspec_based_fused_lane_function):
Replace constructors with using directives.
* config/aarch64/aarch64-sve-builtins-base.cc (svcnt_bhwd_pat_impl)
(svcreate_impl, svdotprod_lane_impl, svget_impl, svld1_extend_impl)
(svld1_gather_extend_impl, svld234_impl, svldff1_gather_extend)
(svset_impl, svst1_scatter_truncate_impl, svst1_truncate_impl)
(svst234_impl, svundef_impl): Likewise.
* config/aarch64/aarch64-sve-builtins-sve2.cc
(svldnt1_gather_extend_impl, svmovl_lb_impl): Likewise.
(svstnt1_scatter_truncate_impl): Likewise.
|
|
Move away from the pre-C++11 compatibility macro CONSTEXPR.
gcc/
* config/aarch64/aarch64-sve-builtins-base.cc: Replace CONSTEXPR
with constexpr throughout.
* config/aarch64/aarch64-sve-builtins-functions.h: Likewise.
* config/aarch64/aarch64-sve-builtins-shapes.cc: Likewise.
* config/aarch64/aarch64-sve-builtins-sve2.cc: Likewise.
* config/aarch64/aarch64-sve-builtins.cc: Likewise.
|
|
Bit of a brown-paper-bag bug, but: GCC was generating
non-existent merging forms of BRKAS and BRKBS. Those
instructions only support zero predication (although
BRKA and BRKB support both).
gcc/
* config/aarch64/aarch64-sve.md (*aarch64_brk<brk_op>_cc): Remove
merging alternative.
(*aarch64_brk<brk_op>_ptest): Likewise.
gcc/testsuite/
* gcc.target/aarch64/sve/acle/general/brka_1.c: Expect a separate
PTEST instruction.
* gcc.target/aarch64/sve/acle/general/brkb_1.c: Likewise.
|