Age | Commit message (Collapse) | Author | Files | Lines |
|
The problem here is that the threader is coming up with a path where
the only valid result is a NAN. When the abs op1_range entry is
trying to add the negative posibility, it attempts to get the bounds
of the working range. NANs don't have bounds so they need to be
special cased.
PR tree-optimization/107355
gcc/ChangeLog:
* range-op-float.cc (foperator_abs::op1_range): Handle NAN.
gcc/testsuite/ChangeLog:
* gcc.dg/tree-ssa/pr107355.c: New test.
|
|
For 'target parallel' and similarly nested directives, cgraph_node's
calls_declare_variant_alt was not set in the parent region node but in
cfun->decl. Hence, pass_omp_device_lower did not process handle the
internal function GOMP_TARGET_REV. - Solution is to set it to the
DECL_CONTEXT, which is set in adjust_context_and_scope.
The cgraph_node::create_clone issue is exposed with -O2 for the existing
libgomp.fortran/reverse-offload-1.f90.
PR middle-end/107236
gcc/ChangeLog:
* omp-expand.cc (expand_omp_target): Set calls_declare_variant_alt
in DECL_CONTEXT and not to cfun->decl.
* cgraphclones.cc (cgraph_node::create_clone): Copy also the
node's calls_declare_variant_alt value.
gcc/testsuite/ChangeLog:
* gfortran.dg/gomp/target-device-ancestor-6.f90: New test.
|
|
gcc/ChangeLog:
* common/config/riscv/riscv-common.cc (riscv_tunes): New.
(riscv_get_valid_option_values): New.
(TARGET_GET_VALID_OPTION_VALUES): New.
* config/riscv/riscv-cores.def (RISCV_TUNE): New, define options
for tune here.
(RISCV_CORE): Fix comment.
* config/riscv/riscv.cc (riscv_tune_info_table): Move definition to
riscv-cores.def.
|
|
[Jakub and other FP experts, would this be OK, or am I missing
something?]
Vax does not seem to have !flag_finite_math_only, but float_type_node
does not HONOR_NANS. The check in frange::verify_range dependend on
flag_finite_math_only, which is technically not correct since
frange::set_varying() checks HONOR_NANS instead of
flag_finite_math_only.
I'm actually getting tired of flag_finite_math_only and
!flag_finite_math_only discrepancies in the selftests (Vax and rx-elf
come to mind). I think we should just test both alternatives in the
selftests as in this patch.
We could also check flag_finite_math_only=0 with a float_type_node
that does not HONOR_NANs, but I have no idea how to twiddle
FLOAT_MODE_FORMAT temporarily, and that may be over thinking it.
PR tree-optimization/107365
gcc/ChangeLog:
* value-range.cc (frange::verify_range): Predicate NAN check in
VARYING range on HONOR_NANS instead of flag_finite_math_only.
(range_tests_floats): Same.
(range_tests_floats_various): New.
(range_tests): Call range_tests_floats_various.
|
|
When generating the makefile, make sure that the paths are quoted so
that a native Windows path works within Cygwin.
Without this patch, this error is reported by the DejaGNU test suite:
make: [T:\ccMf0kI3.mk:3: T:\ccGEvdDp.ltrans0.ltrans.o] Error 1 (ignored)
The generated makefile fragment without the patch:
T:\ccGEvdDp.ltrans0.ltrans.o:
@T:\build\bin\arm-none-eabi-g++.exe '-xlto' ... '-o' 'T:\ccGEvdDp.ltrans0.ltrans.o' 'T:\ccGEvdDp.ltrans0.o'
@-touch -r T:\ccGEvdDp.ltrans0.o T:\ccGEvdDp.ltrans0.o.tem > /dev/null 2>&1 && mv T:\ccGEvdDp.ltrans0.o.tem T:\ccGEvdDp.ltrans0.o
.PHONY: all
all: \
T:\ccGEvdDp.ltrans0.ltrans.o
With the patch, the touch line would be replace with:
@-touch -r "T:\ccGEvdDp.ltrans0.o" "T:\ccGEvdDp.ltrans0.o.tem" > /dev/null 2>&1 && mv "T:\ccGEvdDp.ltrans0.o.tem" "T:\ccGEvdDp.ltrans0.o"
gcc/ChangeLog:
* lto-wrapper.cc: Quote paths in makefile.
Co-Authored-By: Yvan ROUX <yvan.roux@foss.st.com>
Signed-off-by: Torbjörn SVENSSON <torbjorn.svensson@foss.st.com>
|
|
gcc/ChangeLog:
* config/riscv/riscv.cc (riscv_legitimize_move): Support (set (mem) (const_poly_int)).
|
|
Move away from the pre-C++11 compatibility macro CONSTEXPR.
This patch is inspired by aarch64:
https://gcc.gnu.org/pipermail/gcc-patches/2022-October/603974.html.
gcc/ChangeLog:
* config/riscv/riscv-vector-builtins-bases.cc: Replace CONSTEXPR
with constexpr throughout.
* config/riscv/riscv-vector-builtins-shapes.cc (SHAPE): Likewise.
* config/riscv/riscv-vector-builtins.cc
(struct registered_function_hasher): Likewise.
* config/riscv/riscv-vector-builtins.h (struct rvv_arg_type_info):
Likewise.
|
|
gcc/ChangeLog:
* config/riscv/riscv-vector-switch.def (ENTRY): Remove unused TI/TF vector modes.
|
|
Include V_REGS for ALL_REGS.
gcc/ChangeLog:
* config/riscv/riscv.h (REG_CLASS_CONTENTS): Fix ALL_REGS.
|
|
|
|
|
|
Check for use of previously uninitialized variables; call gcc_unreachable().
Replace abort() with gcc_unreachable().
2022-10-22 Michael Eager <eager@eagercon.com>
gcc/
* config/microblaze/microblaze.cc
(microblaze_legitimize_address): Initialize 'reg' to NULL, check for NULL.
(microblaze_address_insns): Replace abort() with gcc_unreachable().
(print_operand_address): Same.
(microblaze_expand_move): Initialize 'p1' to NULL, check for NULL.
(get_branch_target): Replace abort() with gcc_unreachable().
|
|
[-Inf, +Inf] +-NAN gets normalized as VARYING. There is a test that
drops the NAN possibility, and tests that the range is no longer
VARYING but [-Inf, +Inf]. However, for -ffinite-math-only targets
(Vax, RX, etc) the range would still be VARYING because the VARYING
range never had a NAN to begin with. This fixes the test.
I have a precommit hook that does self-tests with
-fno-finite-math-only, -ffinite-math-only, and -ffast-math as a sanity
check, but my precommit hook last week was disabled because there was
a tree-ssa.exp in mainline failing which was throwing off my scripts.
My apologies.
gcc/ChangeLog:
* value-range.cc (range_tests_floats): Predicate [-Inf, +Inf] test
with !flag_finite_math_only.
|
|
This patch offers an additional allocable register by RA for the CALL0
ABI.
> Register a0 holds the return address upon entry to a function, but
> unlike the windowed register ABI, it is not reserved for this purpose
> and may hold other values after the return address has been saved.
- Xtensa ISA Reference Manual,
8.1.2 "CALL0 Register Usage and Stack Layout" [p.589]
gcc/ChangeLog:
* config/xtensa/xtensa.cc (xtensa_conditional_register_usage):
Remove register A0 from FIXED_REGS if the CALL0 ABI.
(xtensa_expand_epilogue): Change to emit '(use (reg:SI A0_REG))'
unconditionally after restoring callee-saved registers for
sibling-call functions, in order to prevent misleading that
register A0 is free to use.
|
|
|
|
gcc/fortran/ChangeLog:
PR fortran/100097
PR fortran/100098
* trans-array.cc (gfc_trans_class_array): New function to
initialize class descriptor's TKR information.
* trans-array.h (gfc_trans_class_array): Add function prototype.
* trans-decl.cc (gfc_trans_deferred_vars): Add calls to the new
function for both pointers and allocatables.
gcc/testsuite/ChangeLog:
PR fortran/100097
PR fortran/100098
* gfortran.dg/PR100097.f90: New test.
* gfortran.dg/PR100098.f90: New test.
|
|
As the testcase shows, when cbranchbf4/cstorebf4 patterns are defined,
we can get ICEs for conditional moves.
The problem is that the generic conditional move expansion just calls
prepare_cmp_insn which just checks that such a cbranch<mode>4 exists
and returns directly such comparison and passes it down to the conditional
move optabs.
The following patch fixes it by punting if the comparisons aren't
ix86_fp_comparison_operator (to tell the generic code it should separately
compare) and to handle the promotion of BFmode comparison operands to
SFmode such that comparison is performed in SFmode.
2022-10-21 Jakub Jelinek <jakub@redhat.com>
PR target/107322
* config/i386/i386-expand.cc (ix86_prepare_fp_compare_args): For
BFmode comparisons promote arguments to SFmode and recurse.
(ix86_expand_int_movcc, ix86_expand_fp_movcc): Return false early
if comparison operands are BFmode and operands[1] is not
ix86_fp_comparison_operator.
* gcc.target/i386/pr107322.c: New test.
|
|
cxx_eval_constant_expression [PR107295]
The excess precision support broke building skia (dependency of firefox)
on ia32 (it has something like the a constexpr variable), but as the other
cases show, it is actually a preexisting problem if one uses casts from
constants with wider floating point types.
The problem is that cxx_eval_constant_expression tries to short-cut
processing of TREE_CONSTANT CONSTRUCTORs if they satisfy
reduced_constant_expression_p - instead of calling cxx_eval_bare_aggregate
on them it just verifies flags and if they are TREE_CONSTANT even after
that, just fold.
Now, on the testcase we have a TREE_CONSTANT CONSTRUCTOR containing
TREE_CONSTANT NOP_EXPR of REAL_CST. And, fold, which isn't recursive,
doesn't optimize that into VECTOR_CST, while later on we are only able
to optimize VECTOR_CST arithmetics, not arithmetics with vector
CONSTRUCTORs.
The following patch fixes that by rejecting CONSTRUCTORs with vector type
in reduced_constant_expression_p regardless of whether they have
CONSTRUCTOR_NO_CLEARING set or not, folding result in cxx_eval_bare_aggregate
even if nothing has changed but it wasn't non-constant and removing folding
from the TREE_CONSTANT reduced_constant_expression_p short-cut.
2022-10-21 Jakub Jelinek <jakub@redhat.com>
PR c++/107295
* constexpr.cc (reduced_constant_expression_p) <case CONSTRUCTOR>:
Return false for VECTOR_TYPE CONSTRUCTORs even without
CONSTRUCTOR_NO_CLEARING set on them.
(cxx_eval_bare_aggregate): If constant but !changed, fold before
returning VECTOR_TYPE_P CONSTRUCTOR.
(cxx_eval_constant_expression) <case CONSTRUCTOR>: Don't fold
TREE_CONSTANT CONSTRUCTOR, just return it.
* g++.dg/ext/vector42.C: New test.
|
|
2022-09-28 Tejas Joshi <TejasSanjay.Joshi@amd.com>
gcc/ChangeLog:
* common/config/i386/cpuinfo.h (get_amd_cpu): Recognize znver4.
* common/config/i386/i386-common.cc (processor_names): Add znver4.
(processor_alias_table): Add znver4 and modularize old znvers.
* common/config/i386/i386-cpuinfo.h (processor_subtypes):
AMDFAM19H_ZNVER4.
* config.gcc (x86_64-*-* |...): Likewise.
* config/i386/driver-i386.cc (host_detect_local_cpu): Let
-march=native recognize znver4 cpus.
* config/i386/i386-c.cc (ix86_target_macros_internal): Add znver4.
* config/i386/i386-options.cc (m_ZNVER4): New definition.
(m_ZNVER): Include m_ZNVER4.
(processor_cost_table): Add znver4.
* config/i386/i386.cc (ix86_reassociation_width): Likewise.
* config/i386/i386.h (processor_type): Add PROCESSOR_ZNVER4.
(PTA_ZNVER1): New definition.
(PTA_ZNVER2): Likewise.
(PTA_ZNVER3): Likewise.
(PTA_ZNVER4): Likewise.
* config/i386/i386.md (define_attr "cpu"): Add znver4 and rename
md file.
* config/i386/x86-tune-costs.h (znver4_cost): New definition.
* config/i386/x86-tune-sched.cc (ix86_issue_rate): Add znver4.
(ix86_adjust_cost): Likewise.
* config/i386/znver1.md: Rename to znver.md.
* config/i386/znver.md: Add new reservations for znver4.
* doc/extend.texi: Add details about znver4.
* doc/invoke.texi: Likewise.
gcc/testsuite/ChangeLog:
* gcc.target/i386/funcspec-56.inc: Handle new march.
* g++.target/i386/mv29.C: Likewise.
|
|
... to display optimization performed as of recent
commit r13-3217-gc4d15dddf6b9eacb36f535807ad2ee364af46e04
"[PR107195] Set range to zero when nonzero mask is 0".
PR tree-optimization/107195
gcc/testsuite/
* gcc.dg/tree-ssa/pr107195-3.c: New.
|
|
The following reverts part of the PR94125 fix which causes us to
use a bogus partition ordering after applying versioning for
alias to the testcase in PR107323. Instead PR94125 is fixed by
appropriately considering to be merged SCCs when skipping edges
we want to ignore because of the alias versioning.
PR tree-optimization/107323
* tree-loop-distribution.cc (pg_unmark_merged_alias_ddrs):
New function.
(loop_distribution::break_alias_scc_partitions): Revert
postorder save/restore from the PR94125 fix. Instead
make sure to not ignore edges from SCCs we are going to
merge.
* gcc.dg/tree-ssa/pr107323.c: New testcase.
|
|
gcc/ChangeLog:
* config/riscv/riscv.md: Add atomic type attribute.
* config/riscv/sync.md: Add atomic type for atomic instructions.
|
|
The pr54346.c testcase FAILs on i686-linux (without -msse*) for multiple
reasons. One is the trivial missing -Wno-psabi which the following patch
adds, but that isn't enough. The thing is that without native vector
support, we have VEC_PERM_EXPRs in the IL and are actually considering
the nested VEC_PERM_EXPRs into one VEC_PERM_EXPR optimization, but punt
because can_vec_perm_const_p (result_mode, op_mode, sel2, false) is false.
Such a test makes sense to prevent "optimizing" two VEC_PERM_EXPRs
that can be handled by the backend natively into one VEC_PERM_EXPR
that can't be handled. But if both of the original VEC_PERM_EXPRs
can't be handled natively either, having just one VEC_PERM_EXPR that will be
lowered by generic vec lowering is IMHO still better than 2.
Or even if we trade just one VEC_PERM_EXPR that can't be handled plus
one that can to one that can't be handled.
Also, removing the testcase's executable permissions...
2022-10-21 <jakub@redhat.com>
PR tree-optimization/54346
* match.pd ((vec_perm (vec_perm@0 @1 @2 VECTOR_CST) @0 VECTOR_CST)):
Optimize nested VEC_PERM_EXPRs even if target can't handle the
new one provided we don't increase number of VEC_PERM_EXPRs the
target can't handle.
* gcc.dg/pr54346.c: Add -Wno-psabi to dg-options.
|
|
We ICE on the following testcase during mangling, finish_compound_literal
returns for void{} void_node and the mangler doesn't handle it.
Handling void_node in the mangler seems problematic to me, because
we don't know for which case it has been created.
The following patch arranges to mangle it as other compound literals
with no operands, so it demangles as void{}, by returning a void type
COMPOUND_LITERAL_P with no elements if processing_template_decl.
Otherwise it keeps returning void_node.
2022-10-21 Jakub Jelinek <jakub@redhat.com>
PR c++/106863
* semantics.cc (finish_compound_literal): For void{}, if
processing_template_decl return a COMPOUND_LITERAL_P
CONSTRUCTOR rather than void_node.
* g++.dg/cpp0x/dr2351-2.C: New test.
|
|
gcc/ChangeLog:
* config.gcc: Add riscv-vector-builtins-bases.o and riscv-vector-builtins-shapes.o
* config/riscv/riscv-vector-builtins.cc (DEF_RVV_I_OPS): New macro.
(DEF_RVV_FUNCTION): Ditto.
(handle_pragma_vector): Add intrinsic framework.
* config/riscv/riscv.cc (riscv_print_operand): Add operand print for vsetvl/vsetvlmax.
* config/riscv/riscv.md: include vector.md.
* config/riscv/t-riscv: Add riscv-vector-builtins-bases.o and riscv-vector-builtins-shapes.o
* config/riscv/riscv-vector-builtins-bases.cc: New file.
* config/riscv/riscv-vector-builtins-bases.h: New file.
* config/riscv/riscv-vector-builtins-functions.def: New file.
* config/riscv/riscv-vector-builtins-shapes.cc: New file.
* config/riscv/riscv-vector-builtins-shapes.h: New file.
* config/riscv/riscv-vector-builtins-types.def: New file.
* config/riscv/vector.md: New file.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/base/vsetvl-1.c: New test.
|
|
gcc/ChangeLog:
* config.gcc: Add gt files since function_instance is GTY ((user)).
* config/riscv/riscv-builtins.cc (riscv_init_builtins): Add RVV intrinsic framework.
(riscv_builtin_decl): Ditto.
(riscv_expand_builtin): Ditto.
* config/riscv/riscv-protos.h (builtin_decl): New function.
(expand_builtin): Ditto.
(enum riscv_builtin_class): New enum to classify RVV intrinsic and RISC-V general built-in.
* config/riscv/riscv-vector-builtins.cc (class GTY): New declaration.
(struct registered_function_hasher): New struct.
(DEF_RVV_OP_TYPE): New macro.
(DEF_RVV_TYPE): Ditto.
(DEF_RVV_PRED_TYPE): Ditto.
(GTY): New declaration.
(add_attribute): New function.
(check_required_extensions): Ditto.
(rvv_arg_type_info::get_tree_type): Ditto.
(function_instance::function_instance): Ditto.
(function_instance::operator==): Ditto.
(function_instance::any_type_float_p): Ditto.
(function_instance::get_return_type): Ditto.
(function_instance::get_arg_type): Ditto.
(function_instance::hash): Ditto.
(function_instance::call_properties): Ditto.
(function_instance::reads_global_state_p): Ditto.
(function_instance::modifies_global_state_p): Ditto.
(function_instance::could_trap_p): Ditto.
(function_builder::function_builder): Ditto.
(function_builder::~function_builder): Ditto.
(function_builder::allocate_argument_types): Ditto.
(function_builder::register_function_group): Ditto.
(function_builder::append_name): Ditto.
(function_builder::finish_name): Ditto.
(function_builder::get_attributes): Ditto.
(function_builder::add_function): Ditto.
(function_builder::add_unique_function): Ditto.
(function_call_info::function_call_info): Ditto.
(function_expander::function_expander): Ditto.
(function_expander::add_input_operand): Ditto.
(function_expander::generate_insn): Ditto.
(registered_function_hasher::hash): Ditto.
(registered_function_hasher::equal): Ditto.
(builtin_decl): Ditto.
(expand_builtin): Ditto.
(gt_ggc_mx): Define for using GCC garbage collect.
(gt_pch_nx): Define for using GCC garbage collect.
* config/riscv/riscv-vector-builtins.def (DEF_RVV_OP_TYPE): New macro.
(DEF_RVV_PRED_TYPE): Ditto.
(vbool64_t): Add suffix.
(vbool32_t): Ditto.
(vbool16_t): Ditto.
(vbool8_t): Ditto.
(vbool4_t): Ditto.
(vbool2_t): Ditto.
(vbool1_t): Ditto.
(vint8mf8_t): Ditto.
(vuint8mf8_t): Ditto.
(vint8mf4_t): Ditto.
(vuint8mf4_t): Ditto.
(vint8mf2_t): Ditto.
(vuint8mf2_t): Ditto.
(vint8m1_t): Ditto.
(vuint8m1_t): Ditto.
(vint8m2_t): Ditto.
(vuint8m2_t): Ditto.
(vint8m4_t): Ditto.
(vuint8m4_t): Ditto.
(vint8m8_t): Ditto.
(vuint8m8_t): Ditto.
(vint16mf4_t): Ditto.
(vuint16mf4_t): Ditto.
(vint16mf2_t): Ditto.
(vuint16mf2_t): Ditto.
(vint16m1_t): Ditto.
(vuint16m1_t): Ditto.
(vint16m2_t): Ditto.
(vuint16m2_t): Ditto.
(vint16m4_t): Ditto.
(vuint16m4_t): Ditto.
(vint16m8_t): Ditto.
(vuint16m8_t): Ditto.
(vint32mf2_t): Ditto.
(vuint32mf2_t): Ditto.
(vint32m1_t): Ditto.
(vuint32m1_t): Ditto.
(vint32m2_t): Ditto.
(vuint32m2_t): Ditto.
(vint32m4_t): Ditto.
(vuint32m4_t): Ditto.
(vint32m8_t): Ditto.
(vuint32m8_t): Ditto.
(vint64m1_t): Ditto.
(vuint64m1_t): Ditto.
(vint64m2_t): Ditto.
(vuint64m2_t): Ditto.
(vint64m4_t): Ditto.
(vuint64m4_t): Ditto.
(vint64m8_t): Ditto.
(vuint64m8_t): Ditto.
(vfloat32mf2_t): Ditto.
(vfloat32m1_t): Ditto.
(vfloat32m2_t): Ditto.
(vfloat32m4_t): Ditto.
(vfloat32m8_t): Ditto.
(vfloat64m1_t): Ditto.
(vfloat64m2_t): Ditto.
(vfloat64m4_t): Ditto.
(vfloat64m8_t): Ditto.
(vv): Ditto.
(vx): Ditto.
(v): Ditto.
(wv): Ditto.
(wx): Ditto.
(x_x_v): Ditto.
(vf2): Ditto.
(vf4): Ditto.
(vf8): Ditto.
(vvm): Ditto.
(vxm): Ditto.
(x_x_w): Ditto.
(v_v): Ditto.
(v_x): Ditto.
(vs): Ditto.
(mm): Ditto.
(m): Ditto.
(vf): Ditto.
(vm): Ditto.
(wf): Ditto.
(vfm): Ditto.
(v_f): Ditto.
(ta): Ditto.
(tu): Ditto.
(ma): Ditto.
(mu): Ditto.
(tama): Ditto.
(tamu): Ditto.
(tuma): Ditto.
(tumu): Ditto.
(tam): Ditto.
(tum): Ditto.
* config/riscv/riscv-vector-builtins.h (GCC_RISCV_VECTOR_BUILTINS_H): New macro.
(RVV_REQUIRE_RV64BIT): Ditto.
(RVV_REQUIRE_ZVE64): Ditto.
(RVV_REQUIRE_ELEN_FP_32): Ditto.
(RVV_REQUIRE_ELEN_FP_64): Ditto.
(enum operand_type_index): New enum.
(DEF_RVV_OP_TYPE): New macro.
(enum predication_type_index): New enum.
(DEF_RVV_PRED_TYPE): New macro.
(enum rvv_base_type): New enum.
(struct rvv_builtin_suffixes): New struct.
(struct rvv_arg_type_info): Ditto.
(struct rvv_type_info): Ditto.
(struct rvv_op_info): Ditto.
(class registered_function): New class.
(class function_base): Ditto.
(class function_shape): Ditto.
(struct function_group_info): New struct.
(class GTY): New class.
(class function_builder): Ditto.
(class function_call_info): Ditto.
(function_call_info::function_returns_void_p): New function.
(class function_expander): New class.
(function_instance::operator!=): New function.
(function_expander::expand): Ditto.
(function_expander::add_input_operand): Ditto.
(function_base::call_properties): Ditto.
|
|
gcc/ChangeLog:
* config/i386/sse.md (ssedvecmode): Rename from VI1SI.
(ssedvecmodelower): Rename from vi1si.
(sdot_prod<mode>): New define_expand.
(udot_prod<mode>): Ditto.
gcc/testsuite/ChangeLog:
* gcc.target/i386/vnniint8-auto-vectorize-1.c: New test.
* gcc.target/i386/vnniint8-auto-vectorize-2.c: Ditto.
|
|
gcc/ChangeLog
* common/config/i386/cpuinfo.h (get_available_features): Detect
avxvnniint8.
* common/config/i386/i386-common.cc
(OPTION_MASK_ISA2_AVXVNNIINT8_SET): New.
(OPTION_MASK_ISA2_AVXVNNIINT8_UNSET): Ditto.
(ix86_handle_option): Handle -mavxvnniint8.
* common/config/i386/i386-cpuinfo.h (enum processor_features):
Add FEATURE_AVXVNNIINT8.
* common/config/i386/i386-isas.h: Add ISA_NAME_TABLE_ENTRY for
avxvnniint8.
* config.gcc: Add avxvnniint8intrin.h.
* config/i386/avxvnniint8intrin.h: New file.
* config/i386/cpuid.h (bit_AVXVNNIINT8): New.
* config/i386/i386-builtin.def: Add new builtins.
* config/i386/i386-c.cc (ix86_target_macros_internal): Define
__AVXVNNIINT8__.
* config/i386/i386-options.cc (isa2_opts): Add -mavxvnniint8.
(ix86_valid_target_attribute_inner_p): Handle avxvnniint8.
* config/i386/i386-isa.def: Add DEF_PTA(AVXVNNIINT8) New..
* config/i386/i386.opt: Add option -mavxvnniint8.
* config/i386/immintrin.h: Include avxvnniint8intrin.h.
* config/i386/sse.md (UNSPEC_VPMADDUBSWACCD
UNSPEC_VPMADDUBSWACCSSD,UNSPEC_VPMADDWDACCD,
UNSPEC_VPMADDWDACCSSD): Rename according to new style.
(vpdp<vpdotprodtype>_<mode>): New define_insn.
* doc/extend.texi: Document avxvnniint8.
* doc/invoke.texi: Document -mavxvnniint8.
* doc/sourcebuild.texi: Document target avxvnniint8.
gcc/testsuite/ChangeLog
* g++.dg/other/i386-2.C: Add -mavxvnniint8.
* g++.dg/other/i386-3.C: Ditto.
* gcc.target/i386/avx-check.h: Add avxvnniint8 check.
* gcc.target/i386/sse-12.c: Add -mavxvnniint8.
* gcc.target/i386/sse-13.c: Ditto.
* gcc.target/i386/sse-14.c: Ditto.
* gcc.target/i386/sse-22.c: Ditto.
* gcc.target/i386/sse-23.c: Ditto.
* gcc.target/i386/funcspec-56.inc: Add new target attribute.
* lib/target-supports.exp
(check_effective_target_avxvnniint8): New.
* gcc.target/i386/avxvnniint8-1.c: Ditto.
* gcc.target/i386/avxvnniint8-vpdpbssd-2.c: Ditto.
* gcc.target/i386/avxvnniint8-vpdpbssds-2.c: Ditto.
* gcc.target/i386/avxvnniint8-vpdpbsud-2.c: Ditto.
* gcc.target/i386/avxvnniint8-vpdpbsuds-2.c: Ditto.
* gcc.target/i386/avxvnniint8-vpdpbuud-2.c: Ditto.
* gcc.target/i386/avxvnniint8-vpdpbuuds-2.c: Ditto.
Co-authored-by: Hongyu Wang <hongyu.wang@intel.com>
Co-authored-by: Haochen Jiang <haochen.jiang@intel.com>
|
|
gcc/
* common/config/i386/i386-common.cc
(OPTION_MASK_ISA_AVXIFMA_SET, OPTION_MASK_ISA2_AVXIFMA_UNSET,
OPTION_MASK_ISA2_AVX2_UNSET): New macro.
(ix86_handle_option): Handle -mavxifma.
* common/config/i386/i386-cpuinfo.h (processor_types): Add
FEATURE_AVXIFMA.
* common/config/i386/i386-isas.h: Add ISA_NAME_TABLE_ENTRY for
avxifma.
* common/config/i386/cpuinfo.h (get_available_features):
Detect avxifma.
* config.gcc: Add avxifmaintrin.h
* config/i386/avx512ifmavlintrin.h: (_mm_madd52lo_epu64): Change
to macro.
(_mm_madd52hi_epu64): Likewise.
(_mm256_madd52lo_epu64): Likewise.
(_mm256_madd52hi_epu64): Likewise.
* config/i386/avxifmaintrin.h: New header.
* config/i386/cpuid.h (bit_AVXIFMA): New.
* config/i386/i386-builtin.def: Add new builtins, and correct
pattern names for AVX512IFMA.
* config/i386/i386-builtins.cc (def_builtin): Handle AVX-IFMA
builtins like AVX-VNNI.
* config/i386/i386-c.cc (ix86_target_macros_internal): Define
__AVXIFMA__.
* config/i386/i386-expand.cc (ix86_check_builtin_isa_match):
Relax ISA masks for AVXIFMA.
* config/i386/i386-isa.def: Add AVXIFMA.
* config/i386/i386-options.cc (isa2_opts): Add -mavxifma.
(ix86_valid_target_attribute_inner_p): Handle avxifma.
* config/i386/i386.md (isa): Add attr avxifma and avxifmavl.
* config/i386/i386.opt: Add option -mavxifma.
* config/i386/immintrin.h: Inculde avxifmaintrin.h.
* config/i386/sse.md (avx_vpmadd52<vpmadd52type>_<mode>):
Remove.
(vpamdd52<vpmadd52type><mode><sd_maskz_name>): Remove.
(vpamdd52huq<mode>_maskz): Rename to ...
(vpmadd52huq<mode>_maskz): ... this.
(vpamdd52luq<mode>_maskz): Rename to ...
(vpmadd52luq<mode>_maskz): ... this.
(vpmadd52<vpmadd52type><mode>): New define_insn.
(vpmadd52<vpmadd52type>v8di): Likewise.
(vpmadd52<vpmadd52type><mode>_maskz_1): Likewise.
(vpamdd52<vpmadd52type><mode>_mask): Rename to ...
(vpmadd52<vpmadd52type><mode>_mask): ... this.
* doc/invoke.texi: Document -mavxifma.
* doc/extend.texi: Document avxifma.
* doc/sourcebuild.texi: Document target avxifma.
gcc/testsuite/
* gcc.target/i386/avx-check.h: Add avxifma check.
* gcc.target/i386/avx512ifma-vpmaddhuq-1.c: Remane..
* gcc.target/i386/avx512ifma-vpmaddhuq-1a.c: To this.
* gcc.target/i386/avx512ifma-vpmaddluq-1.c: Ditto.
* gcc.target/i386/avx512ifma-vpmaddluq-1a.c: Ditto.
* gcc.target/i386/avx512ifma-vpmaddhuq-1b.c: New Test.
* gcc.target/i386/avx512ifma-vpmaddluq-1b.c: Ditto.
* gcc.target/i386/avx-ifma-1.c: Ditto.
* gcc.target/i386/avx-ifma-2.c: Ditto.
* gcc.target/i386/avx-ifma-3.c: Ditto.
* gcc.target/i386/avx-ifma-4.c: Ditto.
* gcc.target/i386/avx-ifma-5.c: Ditto.
* gcc.target/i386/avx-ifma-6.c: Ditto.
* gcc.target/i386/avx-ifma-vpmaddhuq-2.c: Ditto.
* gcc.target/i386/avx-ifma-vpmaddluq-2.c: Ditto.
* gcc.target/i386/sse-12.c: Add -mavxifma.
* gcc.target/i386/sse-13.c: Ditto.
* gcc.target/i386/sse-14.c: Ditto.
* gcc.target/i386/sse-22.c: Ditto.
* gcc.target/i386/sse-23.c: Ditto.
* g++.dg/other/i386-2.C: Ditto.
* g++.dg/other/i386-3.C: Ditto.
* gcc.target/i386/funcspec-56.inc: Add new target attribute.
* lib/target-supports.exp
(check_effective_target_avxifma): New.
|
|
|
|
gcc/fortran/ChangeLog:
PR fortran/105633
* expr.cc (find_array_section): Move check for NULL pointers so
that both subscript triplets and vector subscripts are covered.
gcc/testsuite/ChangeLog:
PR fortran/105633
* gfortran.dg/pr105633.f90: New test.
Co-authored-by: Steven G. Kargl <kargl@gcc.gnu.org>
|
|
With the upcoming [[assume]] work, Andrew has pointed out that
non-irange ranges are not handled in get_range_global for
SSA_NAME_IS_DEFAULT_DEF. This patch fixes the oversight.
PR c++/106654
gcc/ChangeLog:
* value-query.cc (get_range_global): Handle non integer ranges for
default def SSA names.
|
|
gcc/ChangeLog:
* range-op-float.cc (foperator_unordered_lt::op1_range): New.
(foperator_unordered_lt::op2_range): New.
|
|
This patch stops reporting fails for Arm targets with single
precision floating point unit for types wider than 32 bits (the width
of float on arm-none-eabi).
As reported in PR102017, fenv is reported as supported in recent
versions of newlib. At the same time, for some Arm targets, the
implementation in libgcc does not support exceptions and thus, the
test fails with a call to abort().
gcc/testsuite/ChangeLog:
* lib/target-supports.exp
(check_effective_target_fenv_exceptions_double): New.
(check_effective_target_fenv_exceptions_long_double): New.
* gcc.dg/c2x-float-7.c: Split into 3 tests...
* gcc.dg/c2x-float-7a.c: Float part of c2x-float-7.c.
* gcc.dg/c2x-float-7b.c: Double part of c2x-float-7.c.
* gcc.dg/c2x-float-7c.c: Long double part of c2x-float-7.c.
* gcc.dg/pr95115.c: Switch to fenv_exceptions_double.
* gcc.dg/torture/float32x-nan-floath.c: Likewise.
* gcc.dg/torture/float32x-nan.c: Likewise.
* gcc.dg/torture/float64-nan-floath.c: Likewise.
* gcc.dg/torture/float64-nan.c: Likewise.
* gcc.dg/torture/inf-compare-1.c: Likewise.
* gcc.dg/torture/inf-compare-2.c: Likewise.
* gcc.dg/torture/inf-compare-3.c: Likewise.
* gcc.dg/torture/inf-compare-4.c: Likewise.
* gcc.dg/torture/inf-compare-5.c: Likewise.
* gcc.dg/torture/inf-compare-6.c: Likewise.
* gcc.dg/torture/inf-compare-7.c: Likewise.
* gcc.dg/torture/inf-compare-8.c: Likewise.
* gcc.dg/torture/pr52451.c: Likewise.
* gcc.dg/torture/pr82692.c: Likewise.
* gcc.dg/torture/inf-compare-1-float.c: New test.
* gcc.dg/torture/inf-compare-2-float.c: New test.
* gcc.dg/torture/inf-compare-3-float.c: New test.
* gcc.dg/torture/inf-compare-4-float.c: New test.
* gcc.dg/torture/inf-compare-5-float.c: New test.
* gcc.dg/torture/inf-compare-6-float.c: New test.
* gcc.dg/torture/inf-compare-7-float.c: New test.
* gcc.dg/torture/inf-compare-8-float.c: New test.
Co-Authored-By: Yvan ROUX <yvan.roux@foss.st.com>
Signed-off-by: Torbjörn SVENSSON <torbjorn.svensson@foss.st.com>
|
|
Here we're crashing during constraint matching for the instantiated
hidden friends due to two issues with dependent substitution into a
TEMPLATE_ID_EXPR that names a template from the current instantiation
(as for C<1> with T=T from maybe_substitute_reqs_for):
* tsubst_copy substitutes into such a TEMPLATE_DECL by looking it up
from the substituted class scope. But for this lookup to work when
the args are dependent, we need to substitute the class scope with
entering_scope=true so that we obtain the primary template type
A<T> (which has TYPE_BINFO) instead of the implicit instantiation
A<T> (which doesn't).
* lookup_and_finish_template_variable shouldn't instantiate a
TEMPLATE_ID_EXPR that names a TEMPLATE_DECL which has more than
one level of (unsubstituted) parameters (such as A<T>::C).
gcc/cp/ChangeLog:
* pt.cc (lookup_and_finish_template_variable): Don't
instantiate if the template's scope is dependent.
(tsubst_copy) <case TEMPLATE_DECL>: Pass entering_scope=true
when substituting the class scope.
gcc/testsuite/ChangeLog:
* g++.dg/cpp2a/concepts-friend10.C: New test.
|
|
Fix PR99619, which asks to optimize TLS model based on visibility.
The fix is implemented as an IPA optimization: this allows to take
optimized visibility status into account (as well as avoid modifying
all language frontends).
2022-04-17 Artem Klimov <jakmobius@gmail.com>
gcc/ChangeLog:
PR middle-end/99619
* ipa-visibility.cc (function_and_variable_visibility): Promote
TLS access model afer visibility optimizations.
* varasm.cc (have_optimized_refs): New helper.
(optimize_dyn_tls_for_decl_p): New helper. Use it ...
(decl_default_tls_model): ... here in place of 'optimize' check.
gcc/testsuite/ChangeLog:
PR middle-end/99619
* gcc.dg/tls/vis-attr-gd.c: New test.
* gcc.dg/tls/vis-attr-hidden-gd.c: New test.
* gcc.dg/tls/vis-attr-hidden.c: New test.
* gcc.dg/tls/vis-flag-hidden-gd.c: New test.
* gcc.dg/tls/vis-flag-hidden.c: New test.
* gcc.dg/tls/vis-pragma-hidden-gd.c: New test.
* gcc.dg/tls/vis-pragma-hidden.c: New test.
Co-Authored-By: Alexander Monakov <amonakov@gcc.gnu.org>
Signed-off-by: Artem Klimov <jakmobius@gmail.com>
|
|
The false side of UNORDERED_<cond> means neither operand can be a NAN.
Adjust all the op[12]_range entries for the UNORDERED operators such
that a known NAN on one operands means the other operands is
undefined.
gcc/ChangeLog:
* range-op-float.cc (foperator_unordered_le::op1_range): Adjust
false side with a NAN operand.
(foperator_unordered_le::op2_range): Same.
(foperator_unordered_gt::op1_range): Same.
(foperator_unordered_gt::op2_range): Same.
(foperator_unordered_ge::op1_range): Same.
(foperator_unordered_ge::op2_range): Same.
(foperator_unordered_equal::op1_range): Same.
|
|
Here node_template_info is overlooking that CONCEPT_DECL has TEMPLATE_INFO
too, which causes get_originating_module_decl for the CONCEPT_DECL to not
return the corresponding TEMPLATE_DECL, which leads to an ICE from
import_entity_index while pretty printing the CONCEPT_DECL's module
suffix as part of the static assert failure elaboration.
PR c++/102963
gcc/cp/ChangeLog:
* module.cc (node_template_info): Handle CONCEPT_DECL.
gcc/testsuite/ChangeLog:
* g++.dg/modules/concept-7_a.C: New test.
* g++.dg/modules/concept-7_b.C: New test.
|
|
The 'vect_recog_bitfield_ref_pattern' was not correctly adapting the vectype
when widening the container.
gcc/ChangeLog:
PR tree-optimization/107326
* tree-vect-patterns.cc (vect_recog_bitfield_ref_pattern): Change
vectype when widening container.
gcc/testsuite/ChangeLog:
* gcc.dg/vect/pr107326.c: New test.
* gcc.dg/vect/vect-bitfield-read-7.c: New test.
|
|
Since NANs can't appear in ranges for !HONOR_NANS, there's no reason
to set them in a VARYING range.
gcc/ChangeLog:
* value-range.h (frange::set_varying): Do not set NAN flags for
!HONOR_NANS.
* value-range.cc (frange::normalize_kind): Adjust for no NAN when
!HONOR_NANS.
(frange::verify_range): Same.
* range-op-float.cc (maybe_isnan): Remove flag_finite_math_only check.
|
|
The finite_operands_p function was incorrectly named, as it only
returned TRUE when !NAN. This was leftover from the initial
implementation of frange. Using the maybe_isnan() nomenclature is
more consistent and easier to understand.
gcc/ChangeLog:
* range-op-float.cc (finite_operand_p): Remove.
(finite_operands_p): Rename to...
(maybe_isnan): ...this.
(frelop_early_resolve): Use maybe_isnan instead of finite_operands_p.
(foperator_equal::fold_range): Same.
(foperator_equal::op1_range): Same.
(foperator_not_equal::fold_range): Same.
(foperator_lt::fold_range): Same.
(foperator_le::fold_range): Same.
(foperator_gt::fold_range): Same.
(foperator_ge::fold_range): Same.
|
|
The following testcases FAIL on i686-linux due to excess diagnostics
for -Wpsabi.
2022-10-20 Jakub Jelinek <jakub@redhat.com>
* gcc.target/i386/pr107271.c: Add -Wno-psabi to dg-options.
* gcc.dg/debug/btf/btf-function-3.c: Likewise.
|
|
This patch fixes a single typo in comment.
2022-10-20 Jakub Jelinek <jakub@redhat.com>
* passes.cc (pass_manager::register_pass): Fix a comment
typo - copmilation -> compilation.
|
|
The reported regression of libgomp loop-14.C shows that there isn't
generally a good reliable place to insert the permute upfront so
the following simply restricts recurrence vectorization to the cases
where the latch value isn't defined by a PHI.
* tree-vect-loop.cc (vect_phi_first_order_recurrence_p):
Disallow latch PHI defs.
(vectorizable_recurr): Revert previous change.
|
|
The GCN backend uses a heuristic to determine whether to use FLAT or
GLOBAL addressing in a particular (offload) function: namely, if a
function takes a pointer-to-scalar parameter, it is assumed that the
pointer may refer to "flat scratch" space, and thus FLAT addressing must
be used instead of GLOBAL.
I came up with this heuristic initially whilst working on support for
moving OpenACC gang-private variables into local-data share (scratch)
memory. The assumption that only scalar variables would be transformed in
that way turned out to be wrong. For example, prior to the next patch in
the series, Fortran compiler-generated temporary structures were treated
as gang private and moved to LDS space, typically overflowing the region
allocated for such variables. That will no longer happen after that
patch is applied, but there may be other cases of structs moving to LDS
space now or in the future that this patch may be needed for.
2022-10-14 Julian Brown <julian@codesourcery.com>
PR target/105421
gcc/
* config/gcn/gcn.cc (gcn_detect_incoming_pointer_arg): Any pointer
argument forces FLAT addressing mode, not just
pointer-to-non-aggregate.
|
|
With that, we may then run plain 'autoreconf' for all of GCC's subpackages,
instead of for some of those (that don't use Automake) manually having to run
the applicable combination of 'aclocal', 'autoconf', 'autoheader'.
See also 'AC_CONFIG_MACRO_DIRS'/'AC_CONFIG_MACRO_DIR' usage elsewhere.
gcc/
* configure.ac (AC_CONFIG_MACRO_DIRS): Instantiate.
* configure: Regenerate.
libobjc/
* configure.ac (AC_CONFIG_MACRO_DIRS): Instantiate.
* configure: Regenerate.
|
|
Add an aarch64_sve::gimple_folder helper for folding calls
to integer constants. SME will make more use of this.
gcc/
* config/aarch64/aarch64-sve-builtins.h
(gimple_folder::fold_to_cstu): New member function.
* config/aarch64/aarch64-sve-builtins.cc
(gimple_folder::fold_to_cstu): Define.
* config/aarch64/aarch64-sve-builtins-base.cc
(svcnt_bhwd_impl::fold): Use it.
|
|
Now that the codebase is C++11, we can use using directives
to inherit constructors from base classes.
gcc/
* config/aarch64/aarch64-sve-builtins-functions.h (quiet)
(rtx_code_function, rtx_code_function_rotated, unspec_based_function)
(unspec_based_function_rotated, unspec_based_function_exact_insn)
(unspec_based_fused_function, unspec_based_fused_lane_function):
Replace constructors with using directives.
* config/aarch64/aarch64-sve-builtins-base.cc (svcnt_bhwd_pat_impl)
(svcreate_impl, svdotprod_lane_impl, svget_impl, svld1_extend_impl)
(svld1_gather_extend_impl, svld234_impl, svldff1_gather_extend)
(svset_impl, svst1_scatter_truncate_impl, svst1_truncate_impl)
(svst234_impl, svundef_impl): Likewise.
* config/aarch64/aarch64-sve-builtins-sve2.cc
(svldnt1_gather_extend_impl, svmovl_lb_impl): Likewise.
(svstnt1_scatter_truncate_impl): Likewise.
|
|
Move away from the pre-C++11 compatibility macro CONSTEXPR.
gcc/
* config/aarch64/aarch64-sve-builtins-base.cc: Replace CONSTEXPR
with constexpr throughout.
* config/aarch64/aarch64-sve-builtins-functions.h: Likewise.
* config/aarch64/aarch64-sve-builtins-shapes.cc: Likewise.
* config/aarch64/aarch64-sve-builtins-sve2.cc: Likewise.
* config/aarch64/aarch64-sve-builtins.cc: Likewise.
|
|
Bit of a brown-paper-bag bug, but: GCC was generating
non-existent merging forms of BRKAS and BRKBS. Those
instructions only support zero predication (although
BRKA and BRKB support both).
gcc/
* config/aarch64/aarch64-sve.md (*aarch64_brk<brk_op>_cc): Remove
merging alternative.
(*aarch64_brk<brk_op>_ptest): Likewise.
gcc/testsuite/
* gcc.target/aarch64/sve/acle/general/brka_1.c: Expect a separate
PTEST instruction.
* gcc.target/aarch64/sve/acle/general/brkb_1.c: Likewise.
|