Age | Commit message | Author | Files | Lines |
|
Here we ICE trying to get DECL_SOURCE_LOCATION of the parm that happens
to be error_mark_node in this ill-formed test. I kept running into this
while reducing code, so it'd be good to have it fixed.
PR c++/106311
gcc/cp/ChangeLog:
* pt.cc (redeclare_class_template): Check DECL_P before accessing
DECL_SOURCE_LOCATION.
gcc/testsuite/ChangeLog:
* g++.dg/template/redecl5.C: New test.
|
|
Technically iranges only exist in constant form, but we allow symbolic
ones before arriving in the ranger, so legacy VRP can work. This fixes the
ICE when attempting to print symbolic iranges in the pretty printer.
For consistency's sake, I have made sure irange::get_nonzero_bits does
not similarly ICE on a symbolic range, even though no one should be
querying nonzero bits on such a range. This should all melt away
when legacy disappears, because all these methods are slated for
removal (min, max, kind, symbolic_p, constant_p, etc).
Finally, Richi suggested using pp_wide_int in the pretty printer
instead of going through trees. I've adapted a test, since
dump_generic_node seems to work slightly differently.
PR tree-optimization/106444
gcc/ChangeLog:
* value-range-pretty-print.cc (vrange_printer::visit): Handle
legacy ranges.
(vrange_printer::print_irange_bound): Work on wide_int's.
* value-range-pretty-print.h (print_irange_bound): Same.
* value-range.cc (irange::get_nonzero_bits): Handle legacy ranges.
gcc/testsuite/ChangeLog:
* gcc.dg/tree-ssa/evrp4.c: Adjust.
|
|
When the first pointer happens to be a pointer to a STRING_CST we
give up too early since the 2nd pointer handling could still end
up with a DECL for example which can disambiguate against a STRING_CST
just fine.
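As a rough illustration of the situation described above, the following sketch has one pointer referring to a string literal and the other to a declared object. It is a hypothetical example, not the test case behind this change; the names buffer and first_char_after_store are made up for illustration.

```c
#include <assert.h>

/* Hypothetical example: p1 points into a string literal (a STRING_CST
   in GCC's IL) and p2 points to a declared object (a DECL).  The two
   can never refer to the same memory, so the value loaded before the
   store is still valid after it.  */
static char buffer[4];

int first_char_after_store (void)
{
  const char *p1 = "abc";   /* pointer to a string constant */
  char *p2 = buffer;        /* pointer to a declared object */
  char before = p1[0];
  p2[0] = 'x';              /* cannot modify the string literal */
  assert (p1[0] == before);
  return p1[0];
}
```

Whether a given compiler version actually reuses the earlier load here depends on its alias analysis; the sketch only shows why doing so is safe.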
* tree-ssa-alias.cc (ptr_derefs_may_alias_p): If ptr1
points to a constant continue checking ptr2.
|
|
This removes a significant number of intrinsic definitions from the arm_neon.h
header file, and reduces the amount of code duplication. The new macros and
data structures are intended to also facilitate moving other intrinsic
definitions out of the header file in future.
There is a slight change in the behaviour of the bf16 vreinterpret intrinsics
when compiling without bf16 support. Expressions like:
b = vreinterpretq_s32_bf16(vreinterpretq_bf16_s64(a))
are now compiled successfully, instead of causing a 'target specific option
mismatch' during inlining.
gcc/ChangeLog:
* config/aarch64/aarch64-builtins.cc
(MODE_d_bf16, MODE_d_f16, MODE_d_f32, MODE_d_f64, MODE_d_s8)
(MODE_d_s16, MODE_d_s32, MODE_d_s64, MODE_d_u8, MODE_d_u16)
(MODE_d_u32, MODE_d_u64, MODE_d_p8, MODE_d_p16, MODE_d_p64)
(MODE_q_bf16, MODE_q_f16, MODE_q_f32, MODE_q_f64, MODE_q_s8)
(MODE_q_s16, MODE_q_s32, MODE_q_s64, MODE_q_u8, MODE_q_u16)
(MODE_q_u32, MODE_q_u64, MODE_q_p8, MODE_q_p16, MODE_q_p64)
(MODE_q_p128): Define macro to map to corresponding mode name.
(QUAL_bf16, QUAL_f16, QUAL_f32, QUAL_f64, QUAL_s8, QUAL_s16)
(QUAL_s32, QUAL_s64, QUAL_u8, QUAL_u16, QUAL_u32, QUAL_u64)
(QUAL_p8, QUAL_p16, QUAL_p64, QUAL_p128): Define macro to map to
corresponding qualifier name.
(LENGTH_d, LENGTH_q): Define macro to map to "" or "q" suffix.
(SIMD_INTR_MODE, SIMD_INTR_QUAL, SIMD_INTR_LENGTH_CHAR): Macro
functions for the above mappings
(VREINTERPRET_BUILTIN2, VREINTERPRET_BUILTINS1, VREINTERPRET_BUILTINS)
(VREINTERPRETQ_BUILTIN2, VREINTERPRETQ_BUILTINS1)
(VREINTERPRETQ_BUILTINS, VREINTERPRET_BUILTIN)
(AARCH64_SIMD_VREINTERPRET_BUILTINS): New macros to create definitions
for all vreinterpret intrinsics
(enum aarch64_builtins): Add vreinterpret function codes
(aarch64_init_simd_intrinsics): New
(handle_arm_neon_h): Improved comment.
(aarch64_general_fold_builtin): Fold vreinterpret calls
* config/aarch64/arm_neon.h
(vreinterpret_p8_f16, vreinterpret_p8_f64, vreinterpret_p8_s8)
(vreinterpret_p8_s16, vreinterpret_p8_s32, vreinterpret_p8_s64)
(vreinterpret_p8_f32, vreinterpret_p8_u8, vreinterpret_p8_u16)
(vreinterpret_p8_u32, vreinterpret_p8_u64, vreinterpret_p8_p16)
(vreinterpret_p8_p64, vreinterpretq_p8_f64, vreinterpretq_p8_s8)
(vreinterpretq_p8_s16, vreinterpretq_p8_s32, vreinterpretq_p8_s64)
(vreinterpretq_p8_f16, vreinterpretq_p8_f32, vreinterpretq_p8_u8)
(vreinterpretq_p8_u16, vreinterpretq_p8_u32, vreinterpretq_p8_u64)
(vreinterpretq_p8_p16, vreinterpretq_p8_p64, vreinterpretq_p8_p128)
(vreinterpret_p16_f16, vreinterpret_p16_f64, vreinterpret_p16_s8)
(vreinterpret_p16_s16, vreinterpret_p16_s32, vreinterpret_p16_s64)
(vreinterpret_p16_f32, vreinterpret_p16_u8, vreinterpret_p16_u16)
(vreinterpret_p16_u32, vreinterpret_p16_u64, vreinterpret_p16_p8)
(vreinterpret_p16_p64, vreinterpretq_p16_f64, vreinterpretq_p16_s8)
(vreinterpretq_p16_s16, vreinterpretq_p16_s32, vreinterpretq_p16_s64)
(vreinterpretq_p16_f16, vreinterpretq_p16_f32, vreinterpretq_p16_u8)
(vreinterpretq_p16_u16, vreinterpretq_p16_u32, vreinterpretq_p16_u64)
(vreinterpretq_p16_p8, vreinterpretq_p16_p64, vreinterpretq_p16_p128)
(vreinterpret_p64_f16, vreinterpret_p64_f64, vreinterpret_p64_s8)
(vreinterpret_p64_s16, vreinterpret_p64_s32, vreinterpret_p64_s64)
(vreinterpret_p64_f32, vreinterpret_p64_u8, vreinterpret_p64_u16)
(vreinterpret_p64_u32, vreinterpret_p64_u64, vreinterpret_p64_p8)
(vreinterpret_p64_p16, vreinterpretq_p64_f64, vreinterpretq_p64_s8)
(vreinterpretq_p64_s16, vreinterpretq_p64_s32, vreinterpretq_p64_s64)
(vreinterpretq_p64_f16, vreinterpretq_p64_f32, vreinterpretq_p64_p128)
(vreinterpretq_p64_u8, vreinterpretq_p64_u16, vreinterpretq_p64_p16)
(vreinterpretq_p64_u32, vreinterpretq_p64_u64, vreinterpretq_p64_p8)
(vreinterpretq_p128_p8, vreinterpretq_p128_p16, vreinterpretq_p128_f16)
(vreinterpretq_p128_f32, vreinterpretq_p128_p64, vreinterpretq_p128_s64)
(vreinterpretq_p128_u64, vreinterpretq_p128_s8, vreinterpretq_p128_s16)
(vreinterpretq_p128_s32, vreinterpretq_p128_u8, vreinterpretq_p128_u16)
(vreinterpretq_p128_u32, vreinterpret_f16_f64, vreinterpret_f16_s8)
(vreinterpret_f16_s16, vreinterpret_f16_s32, vreinterpret_f16_s64)
(vreinterpret_f16_f32, vreinterpret_f16_u8, vreinterpret_f16_u16)
(vreinterpret_f16_u32, vreinterpret_f16_u64, vreinterpret_f16_p8)
(vreinterpret_f16_p16, vreinterpret_f16_p64, vreinterpretq_f16_f64)
(vreinterpretq_f16_s8, vreinterpretq_f16_s16, vreinterpretq_f16_s32)
(vreinterpretq_f16_s64, vreinterpretq_f16_f32, vreinterpretq_f16_u8)
(vreinterpretq_f16_u16, vreinterpretq_f16_u32, vreinterpretq_f16_u64)
(vreinterpretq_f16_p8, vreinterpretq_f16_p128, vreinterpretq_f16_p16)
(vreinterpretq_f16_p64, vreinterpret_f32_f16, vreinterpret_f32_f64)
(vreinterpret_f32_s8, vreinterpret_f32_s16, vreinterpret_f32_s32)
(vreinterpret_f32_s64, vreinterpret_f32_u8, vreinterpret_f32_u16)
(vreinterpret_f32_u32, vreinterpret_f32_u64, vreinterpret_f32_p8)
(vreinterpret_f32_p16, vreinterpret_f32_p64, vreinterpretq_f32_f16)
(vreinterpretq_f32_f64, vreinterpretq_f32_s8, vreinterpretq_f32_s16)
(vreinterpretq_f32_s32, vreinterpretq_f32_s64, vreinterpretq_f32_u8)
(vreinterpretq_f32_u16, vreinterpretq_f32_u32, vreinterpretq_f32_u64)
(vreinterpretq_f32_p8, vreinterpretq_f32_p16, vreinterpretq_f32_p64)
(vreinterpretq_f32_p128, vreinterpret_f64_f16, vreinterpret_f64_f32)
(vreinterpret_f64_p8, vreinterpret_f64_p16, vreinterpret_f64_p64)
(vreinterpret_f64_s8, vreinterpret_f64_s16, vreinterpret_f64_s32)
(vreinterpret_f64_s64, vreinterpret_f64_u8, vreinterpret_f64_u16)
(vreinterpret_f64_u32, vreinterpret_f64_u64, vreinterpretq_f64_f16)
(vreinterpretq_f64_f32, vreinterpretq_f64_p8, vreinterpretq_f64_p16)
(vreinterpretq_f64_p64, vreinterpretq_f64_s8, vreinterpretq_f64_s16)
(vreinterpretq_f64_s32, vreinterpretq_f64_s64, vreinterpretq_f64_u8)
(vreinterpretq_f64_u16, vreinterpretq_f64_u32, vreinterpretq_f64_u64)
(vreinterpret_s64_f16, vreinterpret_s64_f64, vreinterpret_s64_s8)
(vreinterpret_s64_s16, vreinterpret_s64_s32, vreinterpret_s64_f32)
(vreinterpret_s64_u8, vreinterpret_s64_u16, vreinterpret_s64_u32)
(vreinterpret_s64_u64, vreinterpret_s64_p8, vreinterpret_s64_p16)
(vreinterpret_s64_p64, vreinterpretq_s64_f64, vreinterpretq_s64_s8)
(vreinterpretq_s64_s16, vreinterpretq_s64_s32, vreinterpretq_s64_f16)
(vreinterpretq_s64_f32, vreinterpretq_s64_u8, vreinterpretq_s64_u16)
(vreinterpretq_s64_u32, vreinterpretq_s64_u64, vreinterpretq_s64_p8)
(vreinterpretq_s64_p16, vreinterpretq_s64_p64, vreinterpretq_s64_p128)
(vreinterpret_u64_f16, vreinterpret_u64_f64, vreinterpret_u64_s8)
(vreinterpret_u64_s16, vreinterpret_u64_s32, vreinterpret_u64_s64)
(vreinterpret_u64_f32, vreinterpret_u64_u8, vreinterpret_u64_u16)
(vreinterpret_u64_u32, vreinterpret_u64_p8, vreinterpret_u64_p16)
(vreinterpret_u64_p64, vreinterpretq_u64_f64, vreinterpretq_u64_s8)
(vreinterpretq_u64_s16, vreinterpretq_u64_s32, vreinterpretq_u64_s64)
(vreinterpretq_u64_f16, vreinterpretq_u64_f32, vreinterpretq_u64_u8)
(vreinterpretq_u64_u16, vreinterpretq_u64_u32, vreinterpretq_u64_p8)
(vreinterpretq_u64_p16, vreinterpretq_u64_p64, vreinterpretq_u64_p128)
(vreinterpret_s8_f16, vreinterpret_s8_f64, vreinterpret_s8_s16)
(vreinterpret_s8_s32, vreinterpret_s8_s64, vreinterpret_s8_f32)
(vreinterpret_s8_u8, vreinterpret_s8_u16, vreinterpret_s8_u32)
(vreinterpret_s8_u64, vreinterpret_s8_p8, vreinterpret_s8_p16)
(vreinterpret_s8_p64, vreinterpretq_s8_f64, vreinterpretq_s8_s16)
(vreinterpretq_s8_s32, vreinterpretq_s8_s64, vreinterpretq_s8_f16)
(vreinterpretq_s8_f32, vreinterpretq_s8_u8, vreinterpretq_s8_u16)
(vreinterpretq_s8_u32, vreinterpretq_s8_u64, vreinterpretq_s8_p8)
(vreinterpretq_s8_p16, vreinterpretq_s8_p64, vreinterpretq_s8_p128)
(vreinterpret_s16_f16, vreinterpret_s16_f64, vreinterpret_s16_s8)
(vreinterpret_s16_s32, vreinterpret_s16_s64, vreinterpret_s16_f32)
(vreinterpret_s16_u8, vreinterpret_s16_u16, vreinterpret_s16_u32)
(vreinterpret_s16_u64, vreinterpret_s16_p8, vreinterpret_s16_p16)
(vreinterpret_s16_p64, vreinterpretq_s16_f64, vreinterpretq_s16_s8)
(vreinterpretq_s16_s32, vreinterpretq_s16_s64, vreinterpretq_s16_f16)
(vreinterpretq_s16_f32, vreinterpretq_s16_u8, vreinterpretq_s16_u16)
(vreinterpretq_s16_u32, vreinterpretq_s16_u64, vreinterpretq_s16_p8)
(vreinterpretq_s16_p16, vreinterpretq_s16_p64, vreinterpretq_s16_p128)
(vreinterpret_s32_f16, vreinterpret_s32_f64, vreinterpret_s32_s8)
(vreinterpret_s32_s16, vreinterpret_s32_s64, vreinterpret_s32_f32)
(vreinterpret_s32_u8, vreinterpret_s32_u16, vreinterpret_s32_u32)
(vreinterpret_s32_u64, vreinterpret_s32_p8, vreinterpret_s32_p16)
(vreinterpret_s32_p64, vreinterpretq_s32_f64, vreinterpretq_s32_s8)
(vreinterpretq_s32_s16, vreinterpretq_s32_s64, vreinterpretq_s32_f16)
(vreinterpretq_s32_f32, vreinterpretq_s32_u8, vreinterpretq_s32_u16)
(vreinterpretq_s32_u32, vreinterpretq_s32_u64, vreinterpretq_s32_p8)
(vreinterpretq_s32_p16, vreinterpretq_s32_p64, vreinterpretq_s32_p128)
(vreinterpret_u8_f16, vreinterpret_u8_f64, vreinterpret_u8_s8)
(vreinterpret_u8_s16, vreinterpret_u8_s32, vreinterpret_u8_s64)
(vreinterpret_u8_f32, vreinterpret_u8_u16, vreinterpret_u8_u32)
(vreinterpret_u8_u64, vreinterpret_u8_p8, vreinterpret_u8_p16)
(vreinterpret_u8_p64, vreinterpretq_u8_f64, vreinterpretq_u8_s8)
(vreinterpretq_u8_s16, vreinterpretq_u8_s32, vreinterpretq_u8_s64)
(vreinterpretq_u8_f16, vreinterpretq_u8_f32, vreinterpretq_u8_u16)
(vreinterpretq_u8_u32, vreinterpretq_u8_u64, vreinterpretq_u8_p8)
(vreinterpretq_u8_p16, vreinterpretq_u8_p64, vreinterpretq_u8_p128)
(vreinterpret_u16_f16, vreinterpret_u16_f64, vreinterpret_u16_s8)
(vreinterpret_u16_s16, vreinterpret_u16_s32, vreinterpret_u16_s64)
(vreinterpret_u16_f32, vreinterpret_u16_u8, vreinterpret_u16_u32)
(vreinterpret_u16_u64, vreinterpret_u16_p8, vreinterpret_u16_p16)
(vreinterpret_u16_p64, vreinterpretq_u16_f64, vreinterpretq_u16_s8)
(vreinterpretq_u16_s16, vreinterpretq_u16_s32, vreinterpretq_u16_s64)
(vreinterpretq_u16_f16, vreinterpretq_u16_f32, vreinterpretq_u16_u8)
(vreinterpretq_u16_u32, vreinterpretq_u16_u64, vreinterpretq_u16_p8)
(vreinterpretq_u16_p16, vreinterpretq_u16_p64, vreinterpretq_u16_p128)
(vreinterpret_u32_f16, vreinterpret_u32_f64, vreinterpret_u32_s8)
(vreinterpret_u32_s16, vreinterpret_u32_s32, vreinterpret_u32_s64)
(vreinterpret_u32_f32, vreinterpret_u32_u8, vreinterpret_u32_u16)
(vreinterpret_u32_u64, vreinterpret_u32_p8, vreinterpret_u32_p16)
(vreinterpret_u32_p64, vreinterpretq_u32_f64, vreinterpretq_u32_s8)
(vreinterpretq_u32_s16, vreinterpretq_u32_s32, vreinterpretq_u32_s64)
(vreinterpretq_u32_f16, vreinterpretq_u32_f32, vreinterpretq_u32_u8)
(vreinterpretq_u32_u16, vreinterpretq_u32_u64, vreinterpretq_u32_p8)
(vreinterpretq_u32_p16, vreinterpretq_u32_p64, vreinterpretq_u32_p128)
(vreinterpretq_f64_p128, vreinterpretq_p128_f64, vreinterpret_bf16_u8)
(vreinterpret_bf16_u16, vreinterpret_bf16_u32, vreinterpret_bf16_u64)
(vreinterpret_bf16_s8, vreinterpret_bf16_s16, vreinterpret_bf16_s32)
(vreinterpret_bf16_s64, vreinterpret_bf16_p8, vreinterpret_bf16_p16)
(vreinterpret_bf16_p64, vreinterpret_bf16_f16, vreinterpret_bf16_f32)
(vreinterpret_bf16_f64, vreinterpretq_bf16_u8, vreinterpretq_bf16_u16)
(vreinterpretq_bf16_u32, vreinterpretq_bf16_u64, vreinterpretq_bf16_s8)
(vreinterpretq_bf16_s16, vreinterpretq_bf16_s32, vreinterpretq_bf16_s64)
(vreinterpretq_bf16_p8, vreinterpretq_bf16_p16, vreinterpretq_bf16_p64)
(vreinterpretq_bf16_p128, vreinterpretq_bf16_f16)
(vreinterpretq_bf16_f32, vreinterpretq_bf16_f64, vreinterpret_s8_bf16)
(vreinterpret_s16_bf16, vreinterpret_s32_bf16, vreinterpret_s64_bf16)
(vreinterpret_u8_bf16, vreinterpret_u16_bf16, vreinterpret_u32_bf16)
(vreinterpret_u64_bf16, vreinterpret_f16_bf16, vreinterpret_f32_bf16)
(vreinterpret_f64_bf16, vreinterpret_p8_bf16, vreinterpret_p16_bf16)
(vreinterpret_p64_bf16, vreinterpretq_s8_bf16, vreinterpretq_s16_bf16)
(vreinterpretq_s32_bf16, vreinterpretq_s64_bf16, vreinterpretq_u8_bf16)
(vreinterpretq_u16_bf16, vreinterpretq_u32_bf16, vreinterpretq_u64_bf16)
(vreinterpretq_f16_bf16, vreinterpretq_f32_bf16, vreinterpretq_f64_bf16)
(vreinterpretq_p8_bf16, vreinterpretq_p16_bf16, vreinterpretq_p64_bf16)
(vreinterpretq_p128_bf16): Delete
|
|
There were several similarly-named functions, which each built or looked up an
operand type using a different subset of valid modes or qualifiers.
This change provides a single function to return operand types, which can
additionally handle const and pointer qualifiers. For clarity, the existing
functionality is kept in separate helper functions.
gcc/ChangeLog:
* config/aarch64/aarch64-builtins.cc
(aarch64_simd_builtin_std_type): Rename to...
(aarch64_int_or_fp_type): ...this, and allow irrelevant qualifiers.
(aarch64_lookup_simd_builtin_type): Rename to...
(aarch64_simd_builtin_type): ...this. Add const/pointer
support, and extract table lookup to...
(aarch64_lookup_simd_type_in_table): ...this function.
(aarch64_init_crc32_builtins): Update to use aarch64_simd_builtin_type.
(aarch64_init_fcmla_laneq_builtins): Ditto.
(aarch64_init_simd_builtin_functions): Ditto.
|
|
This lowers vcombine intrinsics to a GIMPLE vector constructor, which enables
better optimisation during GIMPLE passes.
gcc/
* config/aarch64/aarch64-builtins.cc
(aarch64_general_gimple_fold_builtin): Add combine.
gcc/testsuite/
* gcc.target/aarch64/advsimd-intrinsics/combine.c:
New test.
|
|
The diagnostic code can end up with zero sized array elements
with T[][0] and the wide-int code nicely avoids exceptions when
dividing by zero in one codepath but not in another. The following
fixes the exception by using wide-int in both paths.
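The hazard can be sketched in plain C, where integer division by zero is undefined behaviour and has to be guarded explicitly. This is only a minimal sketch of the problem, not the wide-int or offset_int code used in the fix; offset_to_index is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: convert a byte offset into an element index.
   For an array such as T[][0] the element size is zero, so the helper
   reports failure instead of dividing.  Returns 1 on success and
   stores the index, or 0 if no index can be computed.  */
int offset_to_index (size_t byte_offset, size_t elt_size, size_t *index)
{
  assert (index != NULL);
  if (elt_size == 0)
    return 0;               /* no meaningful index for zero-sized elements */
  *index = byte_offset / elt_size;
  return 1;
}
```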
PR tree-optimization/106189
* gimple-array-bounds.cc (array_bounds_checker::check_mem_ref):
Divide using offset_ints.
* gcc.dg/pr106189.c: New testcase.
|
|
Add the compilation option '-mexplicit-relocs'. When '-mexplicit-relocs' is enabled,
the symbolic address load instruction 'la.*' is split into two instructions.
This compilation option is enabled by default.
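The general idea of splitting an address into a high part and a low part (HIGH and LO_SUM in RTL terms) can be sketched as below. This is a generic illustration that assumes a signed 12-bit low part; it does not model the LoongArch relocation operators or the instructions the port emits, and struct hi_lo and split_hi_lo are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Generic HIGH/LO_SUM-style split, assuming a signed 12-bit low part.
   The invariant is hi * 4096 + lo == value.  */
struct hi_lo { int32_t hi; int32_t lo; };

struct hi_lo split_hi_lo (int32_t value)
{
  struct hi_lo r;
  /* Low 12 bits, interpreted as a signed value in [-2048, 2047].  */
  r.lo = (int32_t) (((uint32_t) value & 0xfff) ^ 0x800) - 0x800;
  /* The high part absorbs the adjustment when lo is negative.  The
     subtraction is done in 64 bits to avoid signed overflow, and the
     difference is always an exact multiple of 4096.  */
  r.hi = (int32_t) (((int64_t) value - r.lo) / 4096);
  assert ((int64_t) r.hi * 4096 + r.lo == value);
  return r;
}
```

The low part is kept signed because it is meant to stand for an immediate that the second instruction sign-extends, which is why the high part sometimes has to be one larger than a plain shift would give.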
gcc/ChangeLog:
* common/config/loongarch/loongarch-common.cc:
Enable '-fsection-anchors' when O1 and more advanced optimization.
* config/loongarch/genopts/loongarch.opt.in: Add new option
'-mexplicit-relocs', and enable by default.
* config/loongarch/loongarch-protos.h (loongarch_split_move_insn_p):
Delete function declaration.
(loongarch_split_move_insn): Delete function declaration.
(loongarch_split_symbol_type): Add function declaration.
* config/loongarch/loongarch.cc (enum loongarch_address_type):
Add new address type 'ADDRESS_LO_SUM'.
(loongarch_classify_symbolic_expression): New function definitions.
Classify the base of symbolic expression X, given that X appears in
context CONTEXT.
(loongarch_symbol_insns): Add a judgment condition TARGET_EXPLICIT_RELOCS.
(loongarch_split_symbol_type): New function definitions.
Determines whether the symbol load should be split into two instructions.
(loongarch_valid_lo_sum_p): New function definitions.
Return true if a LO_SUM can address a value of mode MODE when the LO_SUM
symbol has type SYMBOL_TYPE.
(loongarch_classify_address): Add handling of 'LO_SUM'.
(loongarch_address_insns): Add handling of 'ADDRESS_LO_SUM'.
(loongarch_signed_immediate_p): Sort code.
(loongarch_12bit_offset_address_p): Return true if address type is ADDRESS_LO_SUM.
(loongarch_const_insns): Add handling of 'HIGH'.
(loongarch_split_move_insn_p): Add the static attribute to the function.
(loongarch_emit_set): New function definitions.
(loongarch_call_tls_get_addr): Add symbol handling when defining TARGET_EXPLICIT_RELOCS.
(loongarch_legitimize_tls_address): Add symbol handling when defining the
TARGET_EXPLICIT_RELOCS macro.
(loongarch_split_symbol): New function definitions. Split symbol.
(loongarch_legitimize_address): Add code to see if the address can be split into a high part
and a LO_SUM.
(loongarch_legitimize_const_move): Add code to split moves of symbolic constants into
high and low.
(loongarch_split_move_insn): Delete function definitions.
(loongarch_output_move): Add support for HIGH and LO_SUM.
(loongarch_print_operand_reloc): New function definitions.
Print symbolic operand OP, which is part of a HIGH or LO_SUM in context CONTEXT.
(loongarch_memmodel_needs_release_fence): Sort code.
(loongarch_print_operand): Rearrange alphabetical order and add H and L to support HIGH
and LOW output.
(loongarch_print_operand_address): Add handling of 'ADDRESS_LO_SUM'.
(TARGET_MIN_ANCHOR_OFFSET): Define macro to -IMM_REACH/2.
(TARGET_MAX_ANCHOR_OFFSET): Define macro to IMM_REACH/2-1.
* config/loongarch/loongarch.md (movti): Delete the template.
(*movti): Delete the template.
(movtf): Delete the template.
(*movtf): Delete the template.
(*low<mode>): New template of normal symbol low address.
(@tls_low<mode>): New template of tls symbol low address.
(@ld_from_got<mode>): New template load address from got table.
(@ori_l_lo12<mode>): New template.
* config/loongarch/loongarch.opt: Update from loongarch.opt.in.
* config/loongarch/predicates.md: Add support for symbol_type HIGH.
gcc/testsuite/ChangeLog:
* gcc.target/loongarch/func-call-1.c: Add build option '-mno-explicit-relocs'.
* gcc.target/loongarch/func-call-2.c: Add build option '-mno-explicit-relocs'.
* gcc.target/loongarch/func-call-3.c: Add build option '-mno-explicit-relocs'.
* gcc.target/loongarch/func-call-4.c: Add build option '-mno-explicit-relocs'.
* gcc.target/loongarch/func-call-5.c: New test.
* gcc.target/loongarch/func-call-6.c: New test.
* gcc.target/loongarch/func-call-7.c: New test.
* gcc.target/loongarch/func-call-8.c: New test.
* gcc.target/loongarch/relocs-symbol-noaddend.c: New test.
|
|
1. Remove cModel type support other than normal.
2. The method for calling global functions changed from 'la.global + jirl' to 'bl'
when compiled with '-fplt'.
gcc/ChangeLog:
* config/loongarch/constraints.md (a): Delete the constraint.
(b): A constant call not local address.
(h): Delete the constraint.
(t): Delete the constraint.
* config/loongarch/loongarch-opts.cc (loongarch_config_target):
Remove cModel type support other than normal.
* config/loongarch/loongarch-protos.h (enum loongarch_symbol_type):
Add new symbol type 'SYMBOL_PCREL', 'SYMBOL_TLS_IE' and 'SYMBOL_TLS_LE'.
(loongarch_split_symbol): Delete useless function declarations.
(loongarch_split_symbol_type): Delete useless function declarations.
* config/loongarch/loongarch.cc (enum loongarch_address_type):
Delete unnecessary comment information.
(loongarch_symbol_binds_local_p): Modified the judgment order of label
and symbol.
(loongarch_classify_symbol): Return symbol type. If the symbol is a label
or a local symbol, return SYMBOL_PCREL. If it is a TLS symbol,
return SYMBOL_TLS. If it is a non-local symbol, return SYMBOL_GOT_DISP.
(loongarch_symbolic_constant_p): Add handling of 'SYMBOL_TLS_IE'
'SYMBOL_TLS_LE' and 'SYMBOL_PCREL'.
(loongarch_symbol_insns): Add handling of 'SYMBOL_TLS_IE' 'SYMBOL_TLS_LE'
and 'SYMBOL_PCREL'.
(loongarch_address_insns): Sort code.
(loongarch_12bit_offset_address_p): Sort code.
(loongarch_14bit_shifted_offset_address_p): Sort code.
(loongarch_call_tls_get_addr): Sort code.
(loongarch_legitimize_tls_address): Sort code.
(loongarch_output_move): Remove schema support for cmodel other than normal.
(loongarch_memmodel_needs_release_fence): Sort code.
(loongarch_print_operand): Sort code.
* config/loongarch/loongarch.h (LARCH_U12BIT_OFFSET_P):
Rename to LARCH_12BIT_OFFSET_P.
(LARCH_12BIT_OFFSET_P): New macro.
* config/loongarch/loongarch.md: Reimplement the function call. Remove schema
support for cmodel other than normal.
* config/loongarch/predicates.md (is_const_call_weak_symbol): Delete this predicate.
(is_const_call_plt_symbol): Delete this predicate.
(is_const_call_global_noplt_symbol): Delete this predicate.
(is_const_call_no_local_symbol): New predicate, determines whether it is a local
symbol or label.
gcc/testsuite/ChangeLog:
* gcc.target/loongarch/func-call-1.c: New test.
* gcc.target/loongarch/func-call-2.c: New test.
* gcc.target/loongarch/func-call-3.c: New test.
* gcc.target/loongarch/func-call-4.c: New test.
|
|
As the test case in PR106091 shows, the rs6000-specific swaps pass
doesn't preserve the REG_EH_REGION reg_note when replacing
a load insn at the end of a basic block, which causes the
flow info verification to fail unexpectedly. Since a memory
reference rtx may trap, this patch ensures we copy the
REG_EH_REGION reg_note when replacing a swapped aligned load
or store.
PR target/106091
gcc/ChangeLog:
* config/rs6000/rs6000-p8swap.cc (replace_swapped_aligned_store): Copy
REG_EH_REGION when replacing one store insn having it.
(replace_swapped_aligned_load): Likewise.
gcc/testsuite/ChangeLog:
* gcc.target/powerpc/pr106091.c: New test.
|
|
|
|
Since my PR94041 work on temporary lifetime in aggregate initialization, we
end up calling build_vec_init to initialize the reference-extended temporary
for the artificial __for_range variable. And build_vec_init uses
finish_for_stmt to implement its loop. That function assumes that if
__for_range is in current_binding_level, we're finishing a range-for, and we
should fix up the variable as it goes out of scope. But when called from
build_vec_init we aren't finishing a range-for, and do_poplevel doesn't
remove the variable from scope because stmts_are_full_exprs_p is false. So
let's check that here as well, and leave the DECL_NAME alone.
PR c++/106230
gcc/cp/ChangeLog:
* semantics.cc (finish_for_stmt): Check stmts_are_full_exprs_p.
gcc/testsuite/ChangeLog:
* g++.dg/cpp0x/range-for38.C: New test.
|
|
This modifies the range-op dispatch code to handle floats. Also
provided are the stub routines for the floating point range-ops, as we
need something to dispatch to ;-).
I am not ecstatic about the dispatch code, but there's no getting
around having to switch on the tree code and type in some manner. All
the other alternatives I played with ended up being slower, or harder
to maintain. At least, this one is self-contained in the
range_op_handler API, and less than 0.16% slower for VRP in our
benchmarks.
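The shape of such a dispatch, selecting a handler from the type class and then the operation code, might look roughly like the sketch below. It is written in C with hypothetical enums and handlers and is not the range_op_handler implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for tree codes and type classes.  */
enum op_code { OP_PLUS, OP_MINUS, OP_LAST };
enum type_class { TYPE_INTEGRAL, TYPE_FLOAT, TYPE_OTHER };

typedef const char *(*handler_fn) (void);

static const char *int_plus (void)   { return "int plus"; }
static const char *int_minus (void)  { return "int minus"; }
static const char *float_plus (void) { return "float plus (stub)"; }

/* One table per type class, indexed by operation code.  A null entry
   means there is no handler for that combination.  */
static const handler_fn int_table[OP_LAST]   = { int_plus, int_minus };
static const handler_fn float_table[OP_LAST] = { float_plus, NULL };

/* Pick a handler from the type class first, then the operation code.  */
handler_fn lookup_handler (enum op_code code, enum type_class type)
{
  assert (code < OP_LAST);
  switch (type)
    {
    case TYPE_INTEGRAL: return int_table[code];
    case TYPE_FLOAT:    return float_table[code];
    default:            return NULL;
    }
}
```

Keeping the switch inside one lookup function means callers only see a handler or a null pointer, which is the self-contained property the message refers to.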
Tested on x86-64 Linux.
gcc/ChangeLog:
* Makefile.in (OBJS): Add range-op-float.o.
* range-op.cc (get_float_handler): New.
(range_op_handler::range_op_handler): Save code and type for
delayed querying.
(range_op_handler::operator bool): Move from header file, and
add support for floats.
(range_op_handler::fold_range): Add support for floats.
(range_op_handler::op1_range): Same.
(range_op_handler::op2_range): Same.
(range_op_handler::lhs_op1_relation): Same.
(range_op_handler::lhs_op2_relation): Same.
(range_op_handler::op1_op2_relation): Same.
* range-op.h (class range_operator_float): New.
(class floating_op_table): New.
* value-query.cc (range_query::get_tree_range): Add case for
REAL_CST.
* range-op-float.cc: New file.
|
|
gcc/testsuite/ChangeLog:
* gcc.dg/analyzer/fd-2.c: Convert Windows endlines to Unix
style.
* gcc.dg/analyzer/fd-3.c: Likewise.
* gcc.dg/analyzer/fd-4.c: Likewise.
* gcc.dg/analyzer/fd-5.c: Likewise.
* c-c++-common/attr-fd.c: Likewise.
|
|
gcc/analyzer/ChangeLog:
* sm-fd.cc: Run dos2unix and fix coding style issues.
|
|
Technically, PR target/91681 has already been resolved; we now recognize the
highpart multiplication at the tree-level, we no longer use the stack, and
we currently generate the same number of instructions as LLVM. However, it
is still possible to do better: the current x86_64 code to generate a double
word addition of a zero extended operand looks like:
xorl %r11d, %r11d
addq %r10, %rax
adcq %r11, %rdx
when it's possible (as LLVM does) to use an immediate constant:
addq %r10, %rax
adcq $0, %rdx
This is implemented by introducing a zero_extendditi2 pattern,
for zero extension from DImode to TImode on TARGET_64BIT that is
split after reload. With zero extension now visible to combine,
we add two new define_insn_and_split that add/subtract a zero
extended operand in double word mode. These apply to both 32-bit
and 64-bit code generation, to produce adc $0 and sbb $0.
One consequence of this is that these new patterns interfere with
the optimization that recognizes DW:DI = (HI:SI<<32)+LO:SI as a pair
of register moves, or more accurately the combine splitter no longer
triggers as we're now converting two instructions into two instructions
(not three instructions into two instructions). This is easily
repaired (and extended to handle TImode) by changing from a pair
of define_split (that handle operand commutativity) to a set of
four define_insn_and_split (again to handle operand commutativity).
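The arithmetic behind the two operations discussed above can be modelled in portable C with the double word kept as two halves. This is a sketch of the semantics only, with a hypothetical struct dword type; it says nothing about which instructions a particular compiler emits.

```c
#include <assert.h>
#include <stdint.h>

/* A 128-bit value kept as two 64-bit halves (hypothetical type).  */
struct dword { uint64_t lo; uint64_t hi; };

/* Build a double-word value from two word values: setting the two
   halves is all that (hi << 64) + lo requires.  */
struct dword concat (uint64_t hi, uint64_t lo)
{
  struct dword r = { lo, hi };
  return r;
}

/* Add a zero-extended 64-bit operand to a double-word value.  The high
   half of the extended operand is known to be zero, so only the carry
   out of the low half reaches the high half.  */
struct dword add_zext (struct dword x, uint64_t y)
{
  struct dword r;
  r.lo = x.lo + y;
  r.hi = x.hi + (r.lo < y);   /* carry from the low-half addition */
  return r;
}

/* Small usage example: a carry out of the low half bumps the high half.  */
void add_zext_example (void)
{
  struct dword r = add_zext (concat (1, UINT64_MAX), 1);
  assert (r.lo == 0 && r.hi == 2);
}
```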
2022-07-25 Roger Sayle <roger@nextmovesoftware.com>
Uroš Bizjak <ubizjak@gmail.com>
gcc/ChangeLog
PR target/91681
* config/i386/i386-expand.cc (split_double_concat): A new helper
function for setting a double word value from two word values.
* config/i386/i386-protos.h (split_double_concat): Prototype here.
* config/i386/i386.md (zero_extendditi2): New define_insn_and_split.
(*add<dwi>3_doubleword_zext): New define_insn_and_split.
(*sub<dwi>3_doubleword_zext): New define_insn_and_split.
(*concat<mode><dwi>3_1): New define_insn_and_split replacing
previous define_split for implementing DST = (HI<<32)|LO as
pair of move instructions, setting lopart and hipart.
(*concat<mode><dwi>3_2): Likewise.
(*concat<mode><dwi>3_3): Likewise, where HI is zero_extended.
(*concat<mode><dwi>3_4): Likewise, where HI is zero_extended.
gcc/testsuite/ChangeLog
PR target/91681
* g++.target/i386/pr91681.C: New test case (from the PR).
* gcc.target/i386/pr91681-1.c: New int128 test case.
* gcc.target/i386/pr91681-2.c: Likewise.
* gcc.target/i386/pr91681-3.c: Likewise, but for ia32.
|
|
A cleaner approach to fix this PR has been suggested by Andrew, which
is to just return false on range_on_edge for unsupported range types.
Tested on x86-64 Linux.
PR middle-end/106432
gcc/ChangeLog:
* gimple-range.cc (gimple_ranger::range_on_edge): Return false
when the result range type is unsupported.
|
|
My attempt to shortcut unnecessary checking after finding a match was
also wrong for multiple inheritance, so let's give up on it.
PR c++/87729
gcc/cp/ChangeLog:
* class.cc (warn_hidden): Remove shortcut.
gcc/testsuite/ChangeLog:
* g++.dg/warn/Woverloaded-virt4.C: New test.
|
|
gcc/ChangeLog:
* config/rs6000/rtems.h (CPLUSPLUS_CPP_SPEC): Undef.
|
|
When compares are integer typed the inversion with ~ isn't properly
preserved by the equality comparison even when converting the
result properly. The following fixes this by restricting the
input precisions accordingly.
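A small C sketch of why the precision matters: with one-bit operands the inverted XOR agrees with the equality test, but with int operands it does not, even when both operands are 0 or 1. The function names are hypothetical and the code is not taken from the new test cases.

```c
#include <assert.h>
#include <stdbool.h>

/* For single-bit operands, the inverted XOR reduced to one bit gives
   the same answer as the equality test.  */
bool xnor_1bit (bool x, bool y)
{
  return (~(x ^ y)) & 1;
}

/* For wider integers the inversion flips every bit, so the result is
   not the 0 or 1 that x == y would produce.  */
int xnor_int (int x, int y)
{
  return ~(x ^ y);
}

/* Usage example: the one-bit form matches equality, the int form
   does not.  */
void xnor_example (void)
{
  assert (xnor_1bit (true, true) == (true == true));
  assert (xnor_int (1, 1) != (1 == 1));
}
```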
PR middle-end/106414
* match.pd (~(x ^ y) -> x == y): Restrict to single bit
precision types.
* gcc.dg/torture/pr106414-1.c: New testcase.
* gcc.dg/torture/pr106414-2.c: Likewise.
|
|
This patch adds support for the ACLE Data Intrinsics to the AArch64 port.
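For readers unfamiliar with these operations, the following portable C models show what a 32-bit rotate right, bit reversal, and per-halfword byte reversal compute. The ref_ names are hypothetical and the code is written from the general meaning of the operations; it is not the arm_acle.h implementation.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate right by n bits (n is reduced modulo 32).  */
uint32_t ref_ror32 (uint32_t x, uint32_t n)
{
  n &= 31;
  return n == 0 ? x : (x >> n) | (x << (32 - n));
}

/* Reverse the bit order of a 32-bit value.  */
uint32_t ref_rbit32 (uint32_t x)
{
  uint32_t r = 0;
  for (int i = 0; i < 32; i++)
    {
      r = (r << 1) | (x & 1);
      x >>= 1;
    }
  return r;
}

/* Swap the two bytes inside each 16-bit halfword.  */
uint32_t ref_rev16_32 (uint32_t x)
{
  return ((x & 0x00ff00ffu) << 8) | ((x & 0xff00ff00u) >> 8);
}

/* Usage example: reversing the bits twice restores the input.  */
void ref_rbit32_example (void)
{
  assert (ref_rbit32 (ref_rbit32 (0x12345678u)) == 0x12345678u);
}
```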
gcc/ChangeLog:
2022-07-25 Andre Vieira <andre.simoesdiasvieira@arm.com>
* config/aarch64/aarch64.md (rbit<mode>2): Rename this ...
(@aarch64_rbit<mode>): ... to this and change it in...
(ffs<mode>2,ctz<mode>2): ... here.
(@aarch64_rev16<mode>): New.
* config/aarch64/aarch64-builtins.cc: (aarch64_builtins):
Define the following enum AARCH64_REV16, AARCH64_REV16L,
AARCH64_REV16LL, AARCH64_RBIT, AARCH64_RBITL, AARCH64_RBITLL.
(aarch64_init_data_intrinsics): New.
(aarch64_general_init_builtins): Add call to
aarch64_init_data_intrinsics.
(aarch64_expand_builtin_data_intrinsic): New.
(aarch64_general_expand_builtin): Add call to
aarch64_expand_builtin_data_intrinsic.
* config/aarch64/arm_acle.h (__clz, __clzl, __clzll, __cls, __clsl,
__clsll, __rbit, __rbitl, __rbitll, __rev, __revl, __revll, __rev16,
__rev16l, __rev16ll, __ror, __rorl, __rorll, __revsh): New.
gcc/testsuite/ChangeLog:
2022-07-25 Andre Vieira <andre.simoesdiasvieira@arm.com>
* gcc.target/aarch64/acle/data-intrinsics.c: New test.
|
|
gcc/ChangeLog:
* doc/extend.texi: Remove trailing whitespaces.
* doc/invoke.texi: Likewise.
|
|
This implements a basic frange class to represent floating point
ranges. Although it is meant to be a base for further development, it
is enough to handle relations and propagate NAN and other properties.
For ranger clients to become floating point aware, we still need the
range-op entries, which I will submit later this week. Since those
entries require specialized FP knowledge, I will ask for a review from
the FP experts before committing.
Once range-op entries come live, all ranger clients that have been
converted to the type agnostic vrange API will become FP aware: evrp,
DOM, the threaders, loop-ch, etc. (Still missing is loop unswitching,
as a lot of the int_range* temporaries should be Value_Range. I don't
have enough cycles to convert loop unswitching, but could gladly give
guidance. It should be straightforward for those familiar with the
code ;-)).
Sample things we handle:
* We can set the FP properties (!NAN, !INF, etc) at assignment from
constants (and propagate them throughout the CFG):
float z = 0.0;
if (__builtin_isnan (z))
link_error ();
* The relation oracle works in tandem with the FP ranges:
if (x > y)
;
else if (!__builtin_isnan (x) && !__builtin_isnan (y))
{
// If x and y are not NAN, the x <= y relationship holds, and the
// following conditional can be folded away.
if (x <= y)
bar ();
}
* We know the true side of all ordered conditionals (except !=)
implies !NAN:
if (x > y)
{
if (__builtin_isnan (x) || __builtin_isnan (y))
link_error ();
}
Range-ops also work correctly with -ffinite-math-only, and avoid
checking for NANs, etc.
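The floating point facts that the samples above rely on can be checked directly in C. This sketch uses isnan from math.h rather than __builtin_isnan, assumes default (non -ffinite-math-only) semantics, and uses hypothetical function names.

```c
#include <assert.h>
#include <math.h>
#include <stdbool.h>

/* If an ordered comparison other than != is true, neither operand can
   be a NaN, because every such comparison involving a NaN is false.  */
bool greater_implies_not_nan (double x, double y)
{
  if (x > y)
    return !isnan (x) && !isnan (y);   /* always true on this path */
  return true;                         /* nothing is implied here */
}

/* On the false side of x > y, x <= y holds only when neither operand
   is a NaN.  Returns false when x > y, since that is the true side.  */
bool le_holds_on_false_side (double x, double y)
{
  if (x > y)
    return false;
  return x <= y;
}

/* Usage example: a NaN operand makes both x > y and x <= y false.  */
void nan_example (void)
{
  assert (!(NAN > 1.0));
  assert (!le_holds_on_false_side (NAN, 1.0));
}
```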
I believe this is enough to get a fully fleshed out floating point
support for evrp and friends, but doing so is beyond my limited FP
knowledge. For example, frange could be enhanced to track constant
endpoints, and we could track other FP properties aside from NAN.
Further discussion is gladly welcome.
Tested on x86-64 Linux.
gcc/ChangeLog:
* value-range-pretty-print.cc (vrange_printer::visit): New.
(vrange_printer::print_frange_prop): New.
* value-range-pretty-print.h (class vrange_printer): Add visit and
print_frange_prop.
* value-range-storage.h (vrange_allocator::alloc_vrange): Handle frange.
(vrange_allocator::alloc_frange): New.
* value-range.cc (vrange::operator=): Handle frange.
(vrange::operator==): Same.
(frange::accept): New.
(frange::set): New.
(frange::normalize_kind): New.
(frange::union_): New.
(frange::intersect): New.
(frange::operator=): New.
(frange::operator==): New.
(frange::supports_type_p): New.
(frange::verify_range): New.
* value-range.h (enum value_range_discriminator): Handle frange.
(class fp_prop): New.
(FP_PROP_ACCESSOR): New.
(class frange_props): New.
(FRANGE_PROP_ACCESSOR): New.
(class frange): New.
(Value_Range::init): Handle frange.
(Value_Range::operator=): Same.
(Value_Range::supports_type_p): Same.
(frange_props::operator==): New.
(frange_props::union_): New.
(frange_props::intersect): New.
(frange::frange): New.
(frange::type): New.
(frange::set_varying): New.
(frange::set_undefined): New.
|
|
As PR106345 shows, configuring the compiler with an explicit
--with-tune=<value> option causes some test cases to fail if
their test points are sensitive to the tune setting, such as
group_ending_nop, loop align, etc. Even specifying an explicit
-mcpu= does not help.
This patch adjusts the behavior of -mdejagnu-cpu so that it
filters out all -mcpu= and -mtune= options; test cases then
tune for the <cpu> specified by -mdejagnu-cpu.
2022-07-25 Peter Bergner <bergner@linux.ibm.com>
Kewen Lin <linkw@linux.ibm.com>
PR testsuite/106345
gcc/ChangeLog:
* config/rs6000/rs6000.h (DRIVER_SELF_SPECS): Adjust -mdejagnu-cpu
to filter out all -mtune options.
|
|
gcc/testsuite/ChangeLog:
* g++.dg/cpp0x/nsdmi-union7.C: Fix PR number.
|
|
|
|
The legacy code in vr_values mostly works on integral types (with a few
exceptions such as some conversions from float). This patch makes
vr_values::range_of_expr not die when asked for a range of an
unsupported type. It also keeps the min/max simplification code from
being called on non-integrals, similar to what much of the other
assignment code does.
This is all a nop on the current code, but will keep us from
misbehaving when VRP starts working on non-integrals.
Tested on x86-64 Linux.
gcc/ChangeLog:
* value-query.cc (range_query::get_value_range): Add assert.
* vr-values.cc (vr_values::range_of_expr): Make sure we don't ICE
on unsupported types in vr_values.
(simplify_using_ranges::simplify): Same.
|
|
The global get_nonzero_bits was previously returning -1 for
unsupported types. I dropped this in the conversion to global ranges
and it's causing a problem in the frange work, where CCP is asking for
the nonzero bits of non-integral types. CCP may require further
tweaks, but for now, restore the original behavior.
Also, I'm removing old checks for precision that no longer hold, now
that we handle various types for global ranges.
Tested on x86-64 Linux.
gcc/ChangeLog:
* tree-ssanames.cc (get_nonzero_bits): Return -1 for unsupported
types.
* value-query.cc (get_ssa_name_range_info): Remove precision check.
|
|
Similarly to what we did for the relation oracle, but for the path
oracle. This was found while working on frange, where we can test for
x == x while checking for NANness.
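For context, comparing a value against itself is the classic NAN test, which is how a relation with the same name on both sides can come up. A small sketch in plain C (the function names are illustrative only and do not appear in GCC):

```c
#include <assert.h>
#include <math.h>
#include <stdbool.h>

/* x == x is false only when x is a NAN, so a self-comparison carries
   information for floating point even though both operands are the
   same value.  */
bool
self_compare_is_nan (double x)
{
  return !(x == x);
}

/* Cross-check the self-comparison against isnan.  */
void
check_self_compare (void)
{
  assert (self_compare_is_nan (NAN));
  assert (!self_compare_is_nan (1.5));
  assert (!self_compare_is_nan (INFINITY));
  assert (self_compare_is_nan (0.0) == (isnan (0.0) != 0));
}
```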
Tested on x86-64 Linux.
gcc/ChangeLog:
* value-relation.cc (value_relation::set_relation): Remove assert.
(path_oracle::register_relation): Exit when trying to register
same SSA name relations.
|
|
Here are a few conversions to type agnostic vrange I found while
working on frange.
Tested on x86-64 Linux.
gcc/ChangeLog:
* gimple-range-cache.cc (ranger_cache::edge_range): Convert to vrange.
(ranger_cache::range_from_dom): Same.
* tree-ssa-dom.cc
(dom_opt_dom_walker::set_global_ranges_from_unreachable_edges): Same.
|
|
This patch resolves PR target/106303 (and the related PRs 106347,
106404, 106407) which are ICEs caused by my improvements to x86_64's
128-bit TImode to V1TImode Scalar to Vector (STV) pass. My apologies
for the breakage. The issue is that data flow analysis is used to
partition usage of each TImode pseudo into "chains", where each
chain is analyzed and if suitable converted to vector operations.
The problem appears when some chains for a pseudo are converted
and others aren't, as RTL sharing can result in some mode changes
leaking into other instructions that aren't/shouldn't/can't be
converted, which eventually leads to an ICE for mismatched modes.
My first approach to a fix was to unify more of the STV infrastructure,
reasoning that if TImode STV was exhibiting these problems, but DImode
and SImode STV weren't, the issue was likely to be caused/resolved by
these remaining differences. This appeared to fix some but not all of
the reported PRs. A better solution was then proposed by H.J. Lu in
Bugzilla, that we need to iterate the removal of candidates in the
function timode_remove_non_convertible_regs until there are no further
changes. As each chain is removed from consideration, it in turn may
affect whether other insns/chains can safely be converted.
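The iterate-until-stable idea can be sketched independently of the RTL data structures. The following is only an illustration under simplifying assumptions: the candidate array, the dependency table and the function name are invented here and do not mirror the actual code in i386-features.cc.

```c
#include <assert.h>
#include <stdbool.h>

#define NUM_CANDS 4

/* Hypothetical model: depends_on[i][j] is true when candidate i can
   only be converted if candidate j is converted as well.  Clears every
   candidate that depends on a removed one and returns the number of
   passes that were needed to reach a fixed point.  */
int
remove_non_convertible (bool candidate[NUM_CANDS],
			bool depends_on[NUM_CANDS][NUM_CANDS])
{
  int passes = 0;
  bool changed;

  do
    {
      changed = false;
      passes++;
      /* Every pass but the last removes at least one candidate.  */
      assert (passes <= NUM_CANDS + 1);
      for (int i = 0; i < NUM_CANDS; i++)
	for (int j = 0; candidate[i] && j < NUM_CANDS; j++)
	  if (depends_on[i][j] && !candidate[j])
	    {
	      /* Dropping i may invalidate others, so go round again.  */
	      candidate[i] = false;
	      changed = true;
	    }
    }
  while (changed);

  return passes;
}
```

With a dependency chain 0 -> 1 -> 2 -> 3 where only candidate 3 starts out as non-convertible, a single pass in this order removes just candidate 2; the repeated passes are what remove candidates 1 and 0.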
2022-07-24 Roger Sayle <roger@nextmovesoftware.com>
H.J. Lu <hjl.tools@gmail.com>
gcc/ChangeLog
PR target/106303
PR target/106347
* config/i386/i386-features.cc (make_vector_copies): Move from
general_scalar_chain to scalar_chain.
(convert_reg): Likewise.
(convert_insn_common): New scalar_chain method split out from
general_scalar_chain convert_insn.
(convert_registers): Move from general_scalar_chain to
scalar_chain.
(scalar_chain::convert): Call convert_insn_common before calling
convert_insn.
(timode_remove_non_convertible_regs): Iterate until there are
no further changes to the candidates.
* config/i386/i386-features.h (scalar_chain::hash_map): Move
from general_scalar_chain.
(scalar_chain::convert_reg): Likewise.
(scalar_chain::convert_insn_common): New shared method.
(scalar_chain::make_vector_copies): Move from general_scalar_chain.
(scalar_chain::convert_registers): Likewise. No longer virtual.
(general_scalar_chain::hash_map): Delete. Moved to scalar_chain.
(general_scalar_chain::convert_reg): Likewise.
(general_scalar_chain::make_vector_copies): Likewise.
(general_scalar_chain::convert_registers): Delete virtual method.
(timode_scalar_chain::convert_registers): Likewise.
gcc/testsuite/ChangeLog
PR target/106303
PR target/106347
* gcc.target/i386/pr106303.c: New test case.
* gcc.target/i386/pr106347.c: New test case.
|
|
|
|
This patch adds three new function attributes to GCC that
are used for static analysis of usage of file descriptors:
1) __attribute__ ((fd_arg(N))): The attribute may be applied to a function that
takes an open file descriptor at referenced argument N.
It indicates that the passed file descriptor must not have been closed.
Therefore, when the analyzer is enabled with -fanalyzer, the
analyzer may emit a -Wanalyzer-fd-use-after-close diagnostic
if it detects a code path in which a function with this attribute is
called with a closed file descriptor.
The attribute also indicates that the file descriptor must have been checked for
validity before usage. Therefore, the analyzer may emit a
-Wanalyzer-fd-use-without-check diagnostic if it detects a code path in
which a function with this attribute is called with a file descriptor that has
not been checked for validity.
2) __attribute__((fd_arg_read(N))): The attribute is identical to
fd_arg, but with the additional requirement that it might read from
the file descriptor, and thus, the file descriptor must not have been opened
as write-only.
The analyzer may emit a -Wanalyzer-fd-access-mode-mismatch
diagnostic if it detects a code path in which a function with this
attribute is called on a file descriptor opened with O_WRONLY.
3) __attribute__((fd_arg_write(N))): The attribute is identical to fd_arg_read
except that the analyzer may emit a -Wanalyzer-fd-access-mode-mismatch diagnostic if
it detects a code path in which a function with this attribute is called on a
file descriptor opened with O_RDONLY.
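A sketch of how user code might annotate its own wrappers with these attributes. The wrapper names and the FD_ARG_* macros are hypothetical; only the attribute names come from the patch. The macros expand to nothing on compilers that do not know the attributes, so the example should still build there.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

#ifndef __has_attribute
# define __has_attribute(x) 0
#endif

#if __has_attribute (fd_arg_read) && __has_attribute (fd_arg_write)
# define FD_ARG_READ(n)  __attribute__ ((fd_arg_read (n)))
# define FD_ARG_WRITE(n) __attribute__ ((fd_arg_write (n)))
#else
# define FD_ARG_READ(n)
# define FD_ARG_WRITE(n)
#endif

/* Hypothetical wrapper: argument 1 must be open, checked and not
   opened with O_RDONLY.  */
FD_ARG_WRITE (1) ssize_t
put_bytes (int fd, const void *buf, size_t len)
{
  assert (buf != NULL);
  return write (fd, buf, len);
}

/* Hypothetical wrapper: argument 1 must be open, checked and not
   opened with O_WRONLY.  */
FD_ARG_READ (1) ssize_t
get_bytes (int fd, void *buf, size_t len)
{
  assert (buf != NULL);
  return read (fd, buf, len);
}

/* Well-behaved use: the descriptors come from a successful pipe call,
   are used in the matching direction and are closed only afterwards.
   Returns 0 on success and -1 on failure.  */
int
pipe_round_trip (void)
{
  int fds[2];
  char c = 0;

  if (pipe (fds) != 0)
    return -1;
  ssize_t put = put_bytes (fds[1], "x", 1);
  ssize_t got = get_bytes (fds[0], &c, 1);
  close (fds[0]);
  close (fds[1]);
  return (put == 1 && got == 1 && c == 'x') ? 0 : -1;
}
```

By contrast, passing a descriptor obtained from open (path, O_RDONLY) to put_bytes, or calling get_bytes after close, is the kind of path the analyzer may report when the code is built with -fanalyzer.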
gcc/analyzer/ChangeLog:
* sm-fd.cc (fd_param_diagnostic): New diagnostic class.
(fd_access_mode_mismatch): Change inheritance from fd_diagnostic
to fd_param_diagnostic. Add new overloaded constructor.
(fd_use_after_close): Likewise.
(unchecked_use_of_fd): Likewise and also change name to fd_use_without_check.
(double_close): Change name to fd_double_close.
(enum access_directions): New.
(fd_state_machine::on_stmt): Handle calls to function with the
new three function attributes.
(fd_state_machine::check_for_fd_attrs): New.
(fd_state_machine::on_open): Use the new overloaded constructors
of diagnostic classes.
gcc/c-family/ChangeLog:
* c-attribs.cc (c_common_attribute_table): Add three new attributes
namely: fd_arg, fd_arg_read and fd_arg_write.
(handle_fd_arg_attribute): New.
gcc/ChangeLog:
* doc/extend.texi: Add fd_arg, fd_arg_read and fd_arg_write under
"Common Function Attributes" section.
* doc/invoke.texi: Add docs to -Wanalyzer-fd-access-mode-mismatch,
-Wanalyzer-fd-use-after-close, -Wanalyzer-fd-use-without-check that these
warnings may be emitted through usage of three function attributes used
for static analysis of file descriptors namely fd_arg, fd_arg_read and
fd_arg_write.
gcc/testsuite/ChangeLog:
* gcc.dg/analyzer/fd-5.c: New test.
* gcc.dg/analyzer/fd-4.c: Remove quotes around 'read-only' and
'write-only'.
* c-c++-common/attr-fd.c: New test.
Signed-off-by: Immad Mir <mirimmad17@gmail.com>
|
|
|
|
Fix state explosion on va_arg when the call to va_start is in the
top-level function of the analysis.
gcc/analyzer/ChangeLog:
PR analyzer/106413
* varargs.cc (region_model::impl_call_va_start): Avoid iterating
through non-existent variadic arguments by initializing the
impl_region to "UNKNOWN" if the va_start occurs in the top-level
function to the analysis.
gcc/testsuite/ChangeLog:
PR analyzer/106413
* gcc.dg/analyzer/torture/stdarg-4.c: New test.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
|
|
gcc/analyzer/ChangeLog:
PR analyzer/106401
* store.cc (binding_cluster::binding_cluster): Remove overzealous
assertion; we're checking for tracked_p in
store::get_or_create_cluster.
gcc/testsuite/ChangeLog:
PR analyzer/106401
* gcc.dg/analyzer/memcpy-2.c: New test.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
|
|
During CTAD, we currently perform the first phase of overload resolution
from [over.match.list] only if the class template has a list constructor.
But according to [over.match.class.deduct]/4 it should be enough to just
have a guide that looks like a list constructor (which is a more general
criterion in light of user-defined guides).
PR c++/106366
gcc/cp/ChangeLog:
* pt.cc (do_class_deduction): Don't consider TYPE_HAS_LIST_CTOR
when setting try_list_ctor. Reset args even when try_list_ctor
is true and there are no list candidates. Call resolve_args on
the reset args. Rename try_list_ctor to try_list_cand.
gcc/testsuite/ChangeLog:
* g++.dg/cpp1z/class-deduction112.C: New test.
|
|
|
|
|
|
This patch unifies the handling of zero capacity regions for structs
and other types in the allocation size checker.
Regression-tested on x86_64 Linux.
2022-07-22 Tim Lange <mail@tim-lange.me>
gcc/analyzer/ChangeLog:
PR analyzer/106394
* region-model.cc (capacity_compatible_with_type): Always return true
if alloc_size is zero.
gcc/testsuite/ChangeLog:
PR analyzer/106394
* gcc.dg/analyzer/pr106394.c: New test.
|
|
equal to zero"
The RTL combiner will transform "if ((x & C) == C) goto label;"
into "if ((~x & C) == 0) goto label;" and will try to match it with
the insn patterns.
/* example */
void test_0(int a) {
if ((char)a == 255)
foo();
}
void test_1(int a) {
if ((unsigned short)a == 0xFFFF)
foo();
}
void test_2(int a) {
if ((a & 0x00003F80) != 0x00003F80)
foo();
}
;; before
test_0:
extui a2, a2, 0, 8
movi a3, 0xff
bne a2, a3, .L1
j.l foo, a9
.L1:
ret.n
test_1:
movi.n a3, -1
extui a2, a2, 0, 16
extui a3, a3, 16, 16
bne a2, a3, .L3
j.l foo, a9
.L3:
ret.n
test_2:
movi a3, 0x80
extui a2, a2, 7, 7
addmi a3, a3, 0x3f00
slli a2, a2, 7
beq a2, a3, .L5
j.l foo, a9
.L5:
ret.n
;; after
test_0:
movi a3, 0xff
bnall a2, a3, .L1
j.l foo, a9
.L1:
ret.n
test_1:
movi.n a3, -1
extui a3, a3, 16, 16
bnall a2, a3, .L3
j.l foo, a9
.L3:
ret.n
test_2:
movi a3, 0x80
addmi a3, a3, 0x3f00
ball a2, a3, .L5
j.l foo, a9
.L5:
ret.n
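The rewrite performed by the combiner relies on a simple bitwise identity: all bits of C are set in x exactly when no bit of C is clear in x. A small C check of that identity (the function names are made up for this sketch):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Form written in the source: all bits of MASK are set in X.  */
bool
all_bits_set (uint32_t x, uint32_t mask)
{
  return (x & mask) == mask;
}

/* Form produced by the combiner: no bit of MASK is clear in X.  */
bool
no_bit_clear (uint32_t x, uint32_t mask)
{
  return (~x & mask) == 0;
}

/* Exhaustively compare the two forms over all 8-bit values and masks.  */
void
check_mask_forms (void)
{
  for (uint32_t x = 0; x < 256; x++)
    for (uint32_t m = 0; m < 256; m++)
      assert (all_bits_set (x, m) == no_bit_clear (x, m));
}
```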
gcc/ChangeLog:
* config/xtensa/xtensa.md (*masktrue_const_bitcmpl):
Add a new insn_and_split pattern, and a few split patterns for
special cases.
|
|
Avoid bash-specific ((expression)) syntax. As the bash syntax
converts a non-zero value to a zero status (and a zero value to a 1
status), and POSIX arithmetic expansion does not, we have to negate
the result.
Based on patch by Sören Tempel.
Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/419154
|
|
graphds_scc says that it uses Tarjan's algorithm, but it looks like
it uses Kosaraju's algorithm instead (dfs one way followed by dfs
the other way).
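For readers unfamiliar with the distinction, here is a compact sketch of Kosaraju's scheme: one DFS along the edges to record finishing order, then a second DFS along reversed edges in decreasing finishing order. It uses a hypothetical adjacency-matrix graph for brevity and is not the graphds.cc implementation, which has its own graph representation.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_V 16

/* Hypothetical adjacency-matrix graph used only for this sketch.  */
struct digraph
{
  int n;
  bool edge[MAX_V][MAX_V];
};

/* First pass: forward DFS, recording vertices in order of completion.  */
static void
dfs_forward (const struct digraph *g, int v, bool *seen,
	     int *order, int *count)
{
  seen[v] = true;
  for (int w = 0; w < g->n; w++)
    if (g->edge[v][w] && !seen[w])
      dfs_forward (g, w, seen, order, count);
  order[(*count)++] = v;
}

/* Second pass: DFS along reversed edges, labelling one component.  */
static void
dfs_backward (const struct digraph *g, int v, int label, int *component)
{
  component[v] = label;
  for (int w = 0; w < g->n; w++)
    if (g->edge[w][v] && component[w] < 0)
      dfs_backward (g, w, label, component);
}

/* Stores a component number for each vertex in COMPONENT and returns
   the number of strongly connected components.  */
int
kosaraju_scc (const struct digraph *g, int *component)
{
  bool seen[MAX_V] = { false };
  int order[MAX_V];
  int count = 0, ncomp = 0;

  assert (g->n >= 0 && g->n <= MAX_V);
  for (int v = 0; v < g->n; v++)
    if (!seen[v])
      dfs_forward (g, v, seen, order, &count);
  for (int v = 0; v < g->n; v++)
    component[v] = -1;
  /* Visit vertices in decreasing order of completion time.  */
  for (int i = count - 1; i >= 0; i--)
    if (component[order[i]] < 0)
      dfs_backward (g, order[i], ncomp++, component);
  return ncomp;
}
```

Tarjan's algorithm, by contrast, finds the components in a single DFS using a stack and low-link values, which is why the two-pass structure is the telltale sign here.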
gcc/
* graphds.cc (graphds_scc): Fix algorithm attribution.
|
|
While .STORE_LANES is not supported by the recent VN patch we were
still accessing the stored value and valueizing it - but
internal_fn_stored_value_index does not support .STORE_LANES and
we failed to honor that case. Fixed by simply moving the affected
code below the check for the actual supported internal functions.
PR tree-optimization/106403
* tree-ssa-sccvn.cc (vn_reference_lookup_3): Move stored
value valueization after check for IFN_MASKED_STORE or
IFN_LEN_STORE.
|
|
The following fixes maintaining LC SSA when array prefetch inserts
mfence instructions on loop exits that do not use memory. It also
fixes the latent issue that it might split exit edges for this
which will break LC SSA for non-virtuals as well. It should also
make the process cheaper by accumulating the required (LC) SSA
update until the end of the pass.
PR tree-optimization/106397
* tree-ssa-loop-prefetch.cc (emit_mfence_after_loop): Do
not update SSA form here.
(mark_nontemporal_stores): Return whether we marked any
non-temporal stores and inserted mfence.
(loop_prefetch_arrays): Note when we need to update SSA.
(tree_ssa_prefetch_arrays): Perform required (LC) SSA update
at the end of the pass.
* gcc.dg/pr106397.c: New testcase.
|
|
The following fixes an oversight triggering after the recent change
to bump_vector_ptr.
PR tree-optimization/106387
* tree-vect-stmts.cc (vectorizable_load): Use make_ssa_name
if ptr is not an SSA name.
|
|
PR other/106370
gcc/cp/ChangeLog:
* init.cc (sort_mem_initializers): Remove continue as last stmt
in a loop.
libiberty/ChangeLog:
* _doprnt.c: Remove continue as last stmt
in a loop.
|
|
r13-1762-gf9d4c3b45c5ed5f45c8089c990dbd4e181929c3d lowers complex type
moves to scalars, but testcase pr23911 is supposed to scan for a
__complex__ constant, which is now never available, so adjust the
testcase to scan for the IMAGPART/REALPART_EXPR constants separately.
gcc/testsuite/ChangeLog
PR tree-optimization/106010
* gcc.dg/pr23911.c: Scan IMAGPART/REALPART_EXPR = ** instead
of __complex__ since COMPLEX_CST is lowered to scalars.
|
|
And split it after reload.
gcc/ChangeLog:
PR target/106038
* config/i386/mmx.md (<code><mode>3): New define_expand, it's
original "<code><mode>3".
(*<code><mode>3): New define_insn; it's the original
"<code><mode>3", extended to handle memory and immediate
operands with ix86_binary_operator_ok. Also adjust define_split
after it.
(mmxinsnmode): New mode attribute.
(*mov<mode>_imm): Refactor with mmxinsnmode.
* config/i386/predicates.md
(register_or_x86_64_const_vector_operand): New predicate.
gcc/testsuite/ChangeLog:
* gcc.target/i386/pr106038-1.c: New test.
|
|
This cleans up some of the naming around the vstrir and vstril
instruction definitions, with some cosmetic changes for consistency.
No functional changes.
Regtested just in case, no regressions.
[V2] Used 'direct' instead of 'internal', and cosmetically reworked
the changelog.
gcc/
* config/rs6000/altivec.md:
(vstrir_code_<mode>): Rename to...
(vstrir_direct_<mode>): ... this.
(vstrir_p_code_<mode>): Rename to...
(vstrir_p_direct_<mode>): ... this.
(vstril_code_<mode>): Rename to...
(vstril_direct_<mode>): ... this.
(vstril_p_code_<mode>): Rename to...
(vstril_p_direct_<mode>): ... this.
|