Age | Commit message | Author | Files | Lines |
|
This implements the conversion from Big_Integer to Long_Long_Unsigned on
32-bit platforms and to Long_Long_Long_{Integer,Unsigned} on 64-bit ones.
gcc/ada/
* libgnat/s-genbig.ads (From_Bignum): New overloaded declarations.
* libgnat/s-genbig.adb (LLLI): New subtype.
(LLLI_Is_128): New boolean constant.
(From_Bignum): Change the return type of the signed implementation
to Long_Long_Long_Integer and add support for the case where its
size is 128 bits. Add a wrapper around it for Long_Long_Integer.
Add an unsigned implementation returning Unsigned_128 and a wrapper
around it for Unsigned_64.
(To_Bignum): Test LLLI_Is_128 instead of its size.
(To_String.Image): Add qualification to calls to From_Bignum.
* libgnat/a-nbnbin.adb (To_Big_Integer): Likewise.
(Signed_Conversions.From_Big_Integer): Likewise.
(Unsigned_Conversions): Likewise.
|
|
This fixes a spurious error on an imported function with a precondition
and a parameter declared with a 'Base formal type, and even a crash in
the case where this function is declared in a generic package.
gcc/ada/
* freeze.adb (Wrap_Imported_Subprogram): Use Copy_Subprogram_Spec
to copy the spec from the subprogram to the generated subprogram
body.
(Freeze_Entity): Do not wrap imported subprograms inside generics.
|
|
gcc/ada/
* sem_ch4.adb (Analyze_Expression_With_Actions.Check_Action_Ok):
If Comes_From_Source (A) is False, then look at Original_Node (A)
instead of A. In particular, if an (illegal) expression function
is transformed into a "vanilla" function, we don't want to allow
it just because Comes_From_Source is now False.
|
|
When a feature that is legal in Ada 2022 but not in earlier Ada versions
is used, we typically want to call Error_Msg_Ada_2022_Feature in order to
generate an informative message in the error case. Specifying No_Return
for a function (as opposed to a procedure) is no exception to this rule.
gcc/ada/
* sem_prag.adb (Analyze_Pragma): In Check_No_Return, call
Error_Msg_Ada_2022_Feature in the case of a function. Remove code
outside of Check_No_Return that was querying Ada_Version.
|
|
The temporary is first finalized through its enclosing block.
gcc/ada/
* exp_ch4.adb (Expand_N_Expression_With_Actions.Process_Action): Do
not look into nested blocks.
|
|
They need to go through Constrain_Array or else they do not really work.
gcc/ada/
* sem_ch3.adb (Find_Type_Of_Object): In a spec expression, also set
the Scope of the type, and call Constrain_Array for array subtypes.
|
|
When getting the rightmost node of a pretty-printed expression we
incorrectly traversed some composite nodes, which caused the expression
image to be chopped.
gcc/ada/
* pprint.adb (Expression_Image): Reduce scope of local variables; inline
local uncommented constant From_Source; concatenate string with a single
character, as it is likely to execute faster; add missing cases to
traversal for the rightmost node and assertion to demonstrate that the
??? comment is no longer relevant.
|
|
When pretty-printing expressions with CASE alternatives we can qualify
the call to Nkind using N_Subexpr, so that we will get compile-time
errors when new node kinds are added (e.g. Ada 2022 case expressions).
gcc/ada/
* pprint.adb (Expr_Name): Qualify CASE expression with N_Subexpr; add
missing alternative for N_Raise_Storage_Error; remove dead alternatives;
explicitly list unsupported alternatives.
|
|
When printing expression images, e.g. for GNATprove counterexamples,
it seems better to print DEL not directly but with its numeric code.
gcc/ada/
* pprint.adb (Expr_Name): Exclude DEL from printable range.
|
|
When copying the AST we need to update fields that carry semantic
meaning and not just copy them. We already updated some of them,
e.g. the First/Next_Named_Association chain, but failed to update
the Controlling_Argument.
This fix doesn't appear to change anything for the compiler, but it is
needed for GNATprove, where we no longer want to expand expression
functions and instead we want to copy their preanalyzed expressions.
gcc/ada/
* sem_util.ads (New_Copy_Tree): Update comment.
* sem_util.adb (New_Copy_Tree): Update Controlling_Argument, very
much like we update the First/Next_Named_Association.
|
|
Remove Ada_With_Extensions, which is not used on the C side.
Do not add Ada_With_Core_Extensions and Ada_With_All_Extensions,
which are also not used on the C side, and on the Ada side
are always used via functions All_Extensions_Allowed and
Core_Extensions_Allowed. Explain this in comments.
Move the functions closer to the type declaration,
so the usage style is clearer.
Cleanup only -- no change in compiler behavior.
gcc/ada/
* fe.h: Remove Ada_With_Extensions and add commentary.
* opt.ads: Rearrange code and add commentary.
|
|
In (illegal) mutually-dependent type declarations, it is possible for
Etype (Etype (Typ)) to point back to Typ. This patch stops the recursion
in such cases.
gcc/ada/
* sem_util.adb (Process_Type): Stop the recursion.
* exp_aggr.adb (Build_Record_Aggr_Code): Add assertion.
|
|
Following Richard's comments, this fix is split out into its own patch.
It fixes the handling of multiple rgroups when the length counts
elements.
Before this patch, the multiple-rgroup run tests fail:
FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c
execution test
FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c
execution test
After this patch, all of these tests pass.
gcc/ChangeLog:
* tree-vect-loop.cc (vect_get_loop_len): Fix issue for
multiple-rgroup of length.
* tree-vect-stmts.cc (vectorizable_store): Ditto.
(vectorizable_load): Ditto.
* tree-vectorizer.h (vect_get_loop_len): Ditto.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-1.c: New
test.
* gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-1.h: New
test.
* gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-2.c: New
test.
* gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-2.h: New
test.
* gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c:
New test.
* gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c:
New test.
|
|
satisfies_constraint_vi (x) is only relevant to RVV, so this condition is
moved inside the riscv_v_ext_vector_mode_p check to make the code
more reasonable.
gcc/ChangeLog:
* config/riscv/riscv.cc (riscv_const_insns): Reorganize the
code.
|
|
This patch refactors the handling of the case (index
== count) in a loop of vect_transform_slp_perm_load_1, in
order to prepare for a subsequent adjustment of *nperm. This
patch doesn't make any functional changes.
Basically, it rewrites the two ifs below:
  if (index == count && !noop_p)
    {
      // A ...
      // ++*n_perms;
    }
  if (index == count)
    {
      if (!analyze_only)
        {
          if (!noop_p)
            // B1 ...
          // B2 ...
          for ...
            {
              if (!noop_p)
                // B3 building VEC_PERM_EXPR
              else
                // B4 building nothing (no uses for B2 and its seq)
            }
        }
      // B5
    }
into one hunk below:
  if (index == count)
    {
      if (!noop_p)
        {
          // A ...
          // ++*n_perms;
          if (!analyze_only)
            {
              // B1 ...
              // B2 ...
              for ...
                // B3 building VEC_PERM_EXPR
            }
        }
      else if (!analyze_only)
        {
          // no B2 since it has no further uses here.
          for ...
            // B4 building nothing
        }
      // B5 ...
    }
gcc/ChangeLog:
* tree-vect-slp.cc (vect_transform_slp_perm_load_1): Refactor the
handling for the case index == count.
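The two control-flow shapes above can be compared with a small stand-alone model. This is only a sketch: the step names (STEP_A ... STEP_B5) and the functions old_shape, new_shape and shapes_agree are hypothetical, the loop is modelled as a single iteration, and B2 is ignored in the no-op case because the message says it has no uses there.

```c
#include <stdbool.h>

/* One bit per step named in the commit message (hypothetical model).  */
enum { STEP_A = 1, STEP_B1 = 2, STEP_B2 = 4, STEP_B3 = 8,
       STEP_B4 = 16, STEP_B5 = 32 };

/* Original shape: two separate ifs that both test index == count.  */
static unsigned old_shape (bool at_count, bool noop_p, bool analyze_only)
{
  unsigned done = 0;
  if (at_count && !noop_p)
    done |= STEP_A;
  if (at_count)
    {
      if (!analyze_only)
        {
          if (!noop_p)
            done |= STEP_B1;
          done |= STEP_B2;
          done |= noop_p ? STEP_B4 : STEP_B3;  /* loop body, one iteration */
        }
      done |= STEP_B5;
    }
  return done;
}

/* Refactored shape: a single if on index == count.  */
static unsigned new_shape (bool at_count, bool noop_p, bool analyze_only)
{
  unsigned done = 0;
  if (at_count)
    {
      if (!noop_p)
        {
          done |= STEP_A;
          if (!analyze_only)
            done |= STEP_B1 | STEP_B2 | STEP_B3;
        }
      else if (!analyze_only)
        done |= STEP_B4;                       /* B2 dropped: unused here */
      done |= STEP_B5;
    }
  return done;
}

/* True if both shapes run the same steps for every input combination,
   not counting B2 when noop_p is set.  */
static bool shapes_agree (void)
{
  for (int m = 0; m < 8; m++)
    {
      bool at_count = (m & 1) != 0;
      bool noop_p = (m & 2) != 0;
      bool analyze_only = (m & 4) != 0;
      unsigned o = old_shape (at_count, noop_p, analyze_only);
      unsigned n = new_shape (at_count, noop_p, analyze_only);
      if (noop_p)
        o &= ~(unsigned) STEP_B2;
      if (o != n)
        return false;
    }
  return true;
}
```

Calling shapes_agree () walks all eight combinations of the three flags, which is enough to cover every path of this model.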
|
|
|
|
If just one bit is inserted in the same position like with:
__builtin_avr_insert_bits (0xFFFFF2FF, src, dst);
a BLD/BST sequence is better than XOR/AND/XOR. Thus, don't fold that
case to the latter sequence.
gcc/
PR target/90622
* config/avr/avr.cc (avr_fold_builtin) [AVR_BUILTIN_INSERT_BITS]:
Don't fold to XOR / AND / XOR if just one bit is copied to the
same position.
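A portable sketch of the single-bit case, assuming the map semantics given in the GCC manual for __builtin_avr_insert_bits: nibble n of the map selects the source of result bit n, where 0xF keeps bit n of the third argument and 0..7 picks that bit of the second argument. The function names are hypothetical, and returning 0 for nibbles 8..0xE is an assumption of this model only.

```c
#include <stdint.h>

/* Software model of the map-driven bit insertion (not the AVR builtin).  */
static uint8_t insert_bits_model (uint32_t map, uint8_t bits, uint8_t val)
{
  uint8_t result = 0;
  for (int n = 0; n < 8; n++)
    {
      unsigned nib = (map >> (4 * n)) & 0xFu;
      unsigned bit;
      if (nib == 0xFu)
        bit = (val >> n) & 1u;       /* keep bit n of val */
      else if (nib < 8u)
        bit = (bits >> nib) & 1u;    /* take bit nib of bits */
      else
        bit = 0u;                    /* assumption: model these as 0 */
      result |= (uint8_t) (bit << n);
    }
  return result;
}

/* The XOR / AND / XOR form for copying bit 2 of src into dst.  */
static uint8_t copy_bit2_xor_and_xor (uint8_t src, uint8_t dst)
{
  return (uint8_t) (((src ^ dst) & 0x04u) ^ dst);
}

/* True if both forms agree for every src/dst pair with map 0xFFFFF2FF.  */
static int single_bit_forms_agree (void)
{
  for (unsigned s = 0; s < 256; s++)
    for (unsigned d = 0; d < 256; d++)
      if (insert_bits_model (0xFFFFF2FFu, (uint8_t) s, (uint8_t) d)
          != copy_bit2_xor_and_xor ((uint8_t) s, (uint8_t) d))
        return 0;
  return 1;
}
```

Both forms compute the same value; the commit is about which instruction sequence is cheaper on AVR, which this model does not measure.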
|
|
This patch adds support for (a pair of) bit reversal intrinsics
__builtin_nvptx_brev and __builtin_nvptx_brevll which perform 32-bit
and 64-bit bit reversal (using nvptx's brev instruction) matching
the __brev and __brevll intrinsics provided by NVIDIA's nvcc compiler.
https://docs.nvidia.com/cuda/cuda-math-api/group__CUDA__MATH__INTRINSIC__INT.html
2023-05-21 Roger Sayle <roger@nextmovesoftware.com>
gcc/ChangeLog
* config/nvptx/nvptx.cc (nvptx_expand_brev): Expand target
builtin for bit reversal using brev instruction.
(enum nvptx_builtins): Add NVPTX_BUILTIN_BREV and
NVPTX_BUILTIN_BREVLL.
(nvptx_init_builtins): Define "brev" and "brevll".
(nvptx_expand_builtin): Expand NVPTX_BUILTIN_BREV and
NVPTX_BUILTIN_BREVLL via nvptx_expand_brev function.
* doc/extend.texi (Nvidia PTX Built-in Functions): New
section, document __builtin_nvptx_brev{,ll}.
gcc/testsuite/ChangeLog
* gcc.target/nvptx/brev-1.c: New 32-bit test case.
* gcc.target/nvptx/brev-2.c: Likewise.
* gcc.target/nvptx/brevll-1.c: New 64-bit test case.
* gcc.target/nvptx/brevll-2.c: Likewise.
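For readers unfamiliar with bit reversal, here is a portable reference sketch of what a 32-bit and a 64-bit reversal compute. The names brev32_ref and brev64_ref are hypothetical; this is plain C for illustration and not the nvptx builtins themselves.

```c
#include <stdint.h>

/* Reverse the 32 bits of x: bit 0 swaps with bit 31, bit 1 with bit 30, ...  */
static uint32_t brev32_ref (uint32_t x)
{
  uint32_t r = 0;
  for (int i = 0; i < 32; i++)
    {
      r = (r << 1) | (x & 1u);  /* append the current lowest bit of x */
      x >>= 1;
    }
  return r;
}

/* Same idea for 64 bits.  */
static uint64_t brev64_ref (uint64_t x)
{
  uint64_t r = 0;
  for (int i = 0; i < 64; i++)
    {
      r = (r << 1) | (x & 1u);
      x >>= 1;
    }
  return r;
}
```

Reversing twice returns the original value, which makes a convenient sanity check for any implementation.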
|
|
On the following testcase we hang, because POLY_INT_CST is CONSTANT_CLASS_P,
but BIT_AND_EXPR with it and INTEGER_CST doesn't simplify and the
(x | CST1) & CST2 -> (x & CST2) | (CST1 & CST2)
simplification actually relies on the (CST1 & CST2) simplification,
otherwise it is a deoptimization, trading 2 ops for 3 and furthermore
running into
/* Given a bit-wise operation CODE applied to ARG0 and ARG1, see if both
operands are another bit-wise operation with a common input. If so,
distribute the bit operations to save an operation and possibly two if
constants are involved. For example, convert
(A | B) & (A | C) into A | (B & C)
Further simplification will occur if B and C are constants. */
simplification which simplifies that
(x & CST2) | (CST1 & CST2) back to
CST2 & (x | CST1).
I went through all other places I could find where we have a simplification
with 2 CONSTANT_CLASS_P operands and perform some operation on those two.
While the other spots aren't that severe (they just trade 2 operations for
another 2 if the two constants don't simplify, rather than trading 2 ops
for 3 as in the above case), I still think all those spots really intend
to optimize only if the 2 constants simplify.
So, the following patch adds a ! modifier to those to ensure that;
even at GENERIC that modifier means !EXPR_P, which is exactly what we want
IMHO.
2023-05-21 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/109505
* match.pd ((x | CST1) & CST2 -> (x & CST2) | (CST1 & CST2),
Combine successive equal operations with constants,
(A +- CST1) +- CST2 -> A + CST3, (CST1 - A) +- CST2 -> CST3 - A,
CST1 - (CST2 - A) -> CST3 + A): Use ! on ops with 2 CONSTANT_CLASS_P
operands.
* gcc.target/aarch64/sve/pr109505.c: New test.
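The identity behind the first simplification can be checked in plain C. This is a sketch with arbitrarily chosen example constants (CST1 and CST2 here are hypothetical values, not taken from the PR); it shows that the rewrite only pays off when CST1 & CST2 folds to a single constant.

```c
#include <stdint.h>

#define CST1 0x0Fu
#define CST2 0x3Cu

/* Original form: two operations.  */
static uint32_t before_fold (uint32_t x)
{
  return (x | CST1) & CST2;
}

/* Rewritten form: still two operations, but only because CST1 & CST2
   is folded to the constant 0x0C at compile time.  If that inner AND
   could not be folded, this would be three operations.  */
static uint32_t after_fold (uint32_t x)
{
  return (x & CST2) | (CST1 & CST2);
}
```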
|
|
I had thought extract_bit_field bitpos argument was the shifted position
and not the bitposition like BIT_FIELD_REF so I had removed the code which
would use the correct bitposition for BYTES_BIG_ENDIAN.
Committed as obvious; I checked big-endian MIPS to make sure we are now
producing the correct code.
gcc/ChangeLog:
* expr.cc (expand_single_bit_test): Correct bitpos for big-endian.
|
|
This patch supports the RVV VREINTERPRET from integer types to
vbool[2|4|8|16|32|64]_t, i.e.:
vbool[2|4|8|16|32|64]_t __riscv_vreinterpret_x_x(v{u}int[8|16|32|64]_t);
These APIs let users convert LMUL=1 integer vectors to
vbool[2-64]_t. According to the RVV intrinsic spec linked below,
the reinterpret intrinsics only change the type of the underlying
contents.
https://github.com/riscv-non-isa/rvv-intrinsic-doc/blob/master/rvv-intrinsic-rfc.md#reinterpret-vbool-o-vintm1
For example, given the code below:
vbool64_t test_vreinterpret_v_u8m1_b64 (vuint8m1_t src) {
  return __riscv_vreinterpret_v_u8m1_b64 (src);
}
it generates assembly code similar to the following:
vsetvli a5,zero,e8,mf8,ta,ma
vlm.v v1,0(a1)
vsm.v v1,0(a0)
ret
Please note that the test files don't cover all possible combinations
of the intrinsic APIs introduced by this patch, because there are too many.
The reinterpret from vbool*_t to v{u}int*_t with LMUL=1 will be covered
in another patch.
Signed-off-by: Pan Li <pan2.li@intel.com>
gcc/ChangeLog:
* config/riscv/genrvv-type-indexer.cc (BOOL_SIZE_LIST): Add the
remaining bool sizes, i.e. 2, 4, 8, 16, 32, 64.
* config/riscv/riscv-vector-builtins-functions.def (vreinterpret):
Register vbool[2|4|8|16|32|64] interpret function.
* config/riscv/riscv-vector-builtins-types.def (DEF_RVV_BOOL2_INTERPRET_OPS):
New macro for vbool2_t.
(DEF_RVV_BOOL4_INTERPRET_OPS): Likewise.
(DEF_RVV_BOOL8_INTERPRET_OPS): Likewise.
(DEF_RVV_BOOL16_INTERPRET_OPS): Likewise.
(DEF_RVV_BOOL32_INTERPRET_OPS): Likewise.
(DEF_RVV_BOOL64_INTERPRET_OPS): Likewise.
(vint8m1_t): Add the type to bool[2|4|8|16|32|64]_interpret_ops.
(vint16m1_t): Likewise.
(vint32m1_t): Likewise.
(vint64m1_t): Likewise.
(vuint8m1_t): Likewise.
(vuint16m1_t): Likewise.
(vuint32m1_t): Likewise.
(vuint64m1_t): Likewise.
* config/riscv/riscv-vector-builtins.cc (DEF_RVV_BOOL2_INTERPRET_OPS):
New macro for vbool2_t.
(DEF_RVV_BOOL4_INTERPRET_OPS): Likewise.
(DEF_RVV_BOOL8_INTERPRET_OPS): Likewise.
(DEF_RVV_BOOL16_INTERPRET_OPS): Likewise.
(DEF_RVV_BOOL32_INTERPRET_OPS): Likewise.
(DEF_RVV_BOOL64_INTERPRET_OPS): Likewise.
(required_extensions_p): Add vbool[2|4|8|16|32|64] interpret case.
* config/riscv/riscv-vector-builtins.def (bool2_interpret): Add
vbool2_t interpret to base type.
(bool4_interpret): Likewise.
(bool8_interpret): Likewise.
(bool16_interpret): Likewise.
(bool32_interpret): Likewise.
(bool64_interpret): Likewise.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/base/misc_vreinterpret_vbool_vint.c: Add
test cases for vbool[2|4|8|16|32|64]_t.
|
|
The problem is that I used expand_expr with the target, but
we don't want to use the target here as it has the wrong
mode for the original expression. The testcase would ICE
deep down while trying to do a move to use the target.
Anyway, just calling expand_expr with NULL_EXPR fixes
the issue.
Committed as obvious after a bootstrap/test on x86_64-linux-gnu.
PR middle-end/109919
gcc/ChangeLog:
* expr.cc (expand_single_bit_test): Don't use the
target for expand_expr.
gcc/testsuite/ChangeLog:
* gcc.c-torture/compile/pr109919-1.c: New test.
|
|
|
|
gcc/ChangeLog:
* doc/install.texi (Specific): Remove de facto empty alpha*-*-*
section.
|
|
There are two local arrays in the function optimize_mode_switching. They are
initialized conditionally at the beginning but then always consumed in
another loop. This may trigger the maybe-uninitialized warning, and may
result in a build failure when -Werror (warnings as errors) is enabled.
This patch initializes the local arrays to zero explicitly at their
declaration.
Signed-off-by: Pan Li <pan2.li@intel.com>
gcc/ChangeLog:
* mode-switching.cc (entity_map): Initialize the array to zero.
(bb_info): Ditto.
|
|
This patch removes the superfluous parallel in [u]divmod patterns in
the AVR backend. The effect of the extra parallel is that add_clobbers reaches
gcc_unreachable() because the clobbers for [u]divmod are missing.
If an insn has multiple parts like clobbers, the parallel around the
parts of the insn pattern is implicit.
gcc/
PR target/105753
* config/avr/avr.md (divmodpsi, udivmodpsi, divmodsi, udivmodsi):
Remove superfluous "parallel" in insn pattern.
([u]divmod<mode>4): Tidy code. Use gcc_unreachable() instead of
printing error text to assembly.
gcc/testsuite/
PR target/105753
* gcc.target/avr/torture/pr105753.c: New test.
|
|
Instead of creating trees for the expansion,
just expand directly, which makes the code a little simpler
and also reduces how much GC memory is used during the expansion.
gcc/ChangeLog:
* expr.cc (fold_single_bit_test): Rename to ...
(expand_single_bit_test): This and expand directly.
(do_store_flag): Update for the renamed function.
|
|
Instead of depending on combine to do the extraction,
let's create a tree which will expand directly into
the extraction. This improves code generation on some
targets.
gcc/ChangeLog:
* expr.cc (fold_single_bit_test): Use BIT_FIELD_REF
instead of shift/and.
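To make the forms concrete, here is a plain C sketch of a single-bit test written three ways: the source-level mask test, the shift/and form, and a generic field extraction of width 1 at the bit position. The helper names are hypothetical, and in C all three compute the same value; the commit is about which form the expander sees, which C cannot show directly. The sketch assumes pos and bitnum are below 32 and width is between 1 and 32.

```c
#include <stdint.h>

/* Source-level form: (A & C) != 0 with C a single-bit constant.  */
static int single_bit_source (uint32_t a, unsigned bitnum)
{
  return (a & (UINT32_C (1) << bitnum)) != 0;
}

/* Shift/and form.  */
static int single_bit_shift_and (uint32_t a, unsigned bitnum)
{
  return (int) ((a >> bitnum) & 1u);
}

/* Generic extraction of WIDTH bits starting at bit POS, in the spirit
   of a bit-field reference.  */
static uint32_t extract_field (uint32_t a, unsigned width, unsigned pos)
{
  uint32_t mask = (width < 32 ? (UINT32_C (1) << width) : 0u) - 1u;
  return (a >> pos) & mask;
}
```

With width 1, extract_field (a, 1, bitnum) gives the same 0 or 1 result as the other two helpers.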
|
|
Since we know that fold_single_bit_test is now only passed
NE_EXPR or EQ_EXPR, we can simplify it and just use a gcc_assert
to assert that this is the code being passed.
gcc/ChangeLog:
* expr.cc (fold_single_bit_test): Add an assert
and simplify based on code being NE_EXPR or EQ_EXPR.
|
|
Now that the only use of fold_single_bit_test is in do_store_flag,
we can change it to take the inner arg and bitnum
instead of building a tree. There are no code generation changes
due to this change, only a decrease in the GC memory that is produced
during expansion.
gcc/ChangeLog:
* expr.cc (fold_single_bit_test): Take inner and bitnum
instead of arg0 and arg1. Update the code.
(do_store_flag): Don't create a tree when calling
fold_single_bit_test; instead just call it with the bitnum
and the inner tree.
|
|
The code in fold_single_bit_test checks whether
the inner was a right shift and improves the bitnum
based on that. But since the inner will always be an
SSA_NAME at this point, the code is dead. Move it over
to use the helper function get_def_for_expr instead.
gcc/ChangeLog:
* expr.cc (fold_single_bit_test): Use get_def_for_expr
instead of checking the inner's code.
|
|
fold_single_bit_test
Since the last use of fold_single_bit_test_into_sign_test is in fold_single_bit_test,
we can inline it and even simplify the inlined version. This has
no behavior change.
gcc/ChangeLog:
* expr.cc (fold_single_bit_test_into_sign_test): Inline into ...
(fold_single_bit_test): This and simplify.
|
|
This is part 1 of N patch set that will change the expansion
of `(A & C) != 0` from using trees to directly expanding so later
on we can do some cost analysis.
Since the only user of fold_single_bit_test is now
expand, move it to there.
gcc/ChangeLog:
* fold-const.cc (fold_single_bit_test_into_sign_test): Move to
expr.cc.
(fold_single_bit_test): Likewise.
* expr.cc (fold_single_bit_test_into_sign_test): Move from fold-const.cc
(fold_single_bit_test): Likewise and make static.
* fold-const.h (fold_single_bit_test): Remove declaration.
|
|
Two issues have been observed in the current riscv_expand_conditional_move
implementation.
1. Before the introduction of TARGET_XTHEADCONDMOV, op0 of the comparison expression
was used for the mode comparison with word_mode, but after TARGET_XTHEADCONDMOV was
merged with TARGET_SFB_ALU, dest of the if-then-else is used for the mode comparison with
word_mode, and per the md file the mode of dest is DI or SI, which can differ from
word_mode on RV64.
2. TARGET_XTHEADCONDMOV cannot be generated when the mode of the comparison is E_VOID.
This patch solves the issues above.
Here is an example from the newly added test case.
Testcase:
int ConNmv_reg_reg_reg(int x, int y, int z, int n){
  if (x != y) return z;
  return n;
}
Cflags:
-O2 -march=rv64gc_xtheadcondmov -mabi=lp64d
before patch:
ConNmv_reg_reg_reg:
bne a0,a1,.L23
mv a2,a3
.L23:
mv a0,a2
ret
after patch:
ConNmv_reg_reg_reg:
sub a1,a0,a1
th.mveqz a2,zero,a1
th.mvnez a3,zero,a1
or a0,a2,a3
ret
Co-Authored by: Fei Gao <gaofei@eswincomputing.com>
Signed-off-by: Die Li <lidie@eswincomputing.com>
gcc/ChangeLog:
* config/riscv/riscv.cc (riscv_expand_conditional_move): Fix mode
checking.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/xtheadcondmov-indirect-rv32.c: New test.
* gcc.target/riscv/xtheadcondmov-indirect-rv64.c: New test.
|
|
Changes since v1:
- Removed name clash change.
- Fix new pattern indentation.
-- >8 --
When (a & (1 << bit_no)) is tested inside an IF we can use a bit extract.
gcc/ChangeLog:
* config/riscv/bitmanip.md (branch<X:mode>_bext): New split pattern.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/zbs-bext-02.c: New test.
|
|
Changes since v1:
- Remove subreg from operand 1.
-- >8 --
We were not able to match the CTZ sign extend pattern on RISC-V
because it gets optimized to zero extend and/or to ANDI patterns.
For the ANDI case, combine scrambles the RTL and generates the
extension by using subregs.
gcc/ChangeLog:
PR target/106888
* config/riscv/bitmanip.md
(<bitmanip_optab>disi2): Match with any_extend.
(<bitmanip_optab>disi2_sext): New pattern to match
with sign extend using an ANDI instruction.
gcc/testsuite/ChangeLog:
PR target/106888
* gcc.target/riscv/pr106888.c: New test.
* gcc.target/riscv/zbbw.c: Check for ANDI.
|
|
|
|
Defer dump option parsing until plugins are initialized. This allows one to
use plugin names for dumps.
PR other/99451
gcc/
* opts.h (handle_deferred_dump_options): Declare.
* opts-global.cc (handle_common_deferred_options): Do not handle
dump options here.
(handle_deferred_dump_options): New.
* toplev.cc (toplev::main): Call it after plugin init.
|
|
Sorry, I forgot the ChangeLog entry for my patch and missed the [v2]
part of the subject.
2023-05-18 Joern Rennecke <joern.rennecke@embecosm.com>
gcc/ChangeLog:
* config/riscv/constraints.md (DsS, DsD): Restore agreement
with shiftm1 mode attribute.
|
|
Code to detect struct/unions across the same TU is not needed
anymore. Code for determining compatibility of tagged types is
preserved as it will be used for C2X. Some errors in the unused
code are fixed.
Bootstrapped with no regressions for x86_64-pc-linux-gnu.
gcc/c/
* c-decl.cc (set_type_context): Remove.
(pop_scope, diagnose_mismatched_decls, pushdecl):
Remove dead code.
* c-typeck.cc (comptypes_internal): Remove dead code.
(same_translation_unit_p): Remove.
(tagged_types_tu_compatible_p): Some fixes.
|
|
gcc/fortran/ChangeLog:
* expr.cc (gfc_get_corank): Use CLASS_DATA from gfortran.h.
* resolve.cc (resolve_component): Same.
(resolve_fl_derived0): Same.
* simplify.cc (gfc_simplify_extends_type_of): Same.
(simplify_cobound): Same.
|
|
So the problem here is that in the spec files, we were not marking the pch
output file to be removed on error.
The way to fix this is to mark the --output-pch argument as the output
file argument.
For the C++ specs file, we had to move around where the %V was located
such that it would be after the %w marker, as the %V marker clears the output files.
OK? Bootstrapped and tested on x86_64-linux-gnu.
gcc/cp/ChangeLog:
PR driver/33980
* lang-specs.h ("@c++-header"): Add %w after
the --output-pch.
("@c++-system-header"): Likewise.
("@c++-user-header"): Likewise.
gcc/ChangeLog:
PR driver/33980
* gcc.cc (default_compilers["@c-header"]): Add %w
after the --output-pch.
|
|
[part #2 of PR/109279]
SPEC2017 deepsjeng uses large constants which currently generate less than
ideal code. This fix improves codegen for large constants which have the
same low and high parts, e.g.
long long f(void) { return 0x0101010101010101ull; }
Before
li a5,0x1010000
addi a5,a5,0x101
mv a0,a5
slli a5,a5,32
add a0,a5,a0
ret
With patch
li a5,0x1010000
addi a5,a5,0x101
slli a0,a5,32
add a0,a0,a5
ret
This is testsuite clean.
gcc/ChangeLog:
* config/riscv/riscv.cc (riscv_split_integer): If loval is equal
to hival, ASHIFT the corresponding regs.
Signed-off-by: Vineet Gupta <vineetg@rivosinc.com>
|
|
We can avoid performing two norm_cache lookups during normalization of a
concept-id by allocating and inserting a norm_entry* before rather than
after the fact, which is simpler and cheaper.
gcc/cp/ChangeLog:
* constraint.cc (normalize_concept_check): Avoid having to do
two norm_cache lookups. Remove unnecessary early exit for an
ill-formed concept definition.
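The general idea of doing one cache lookup instead of two can be sketched with a tiny fixed-size table: look up the slot once, and if it is empty, fill that same slot rather than searching again to insert. Everything here (the table layout, find_slot, cached_square, the lookups counter) is hypothetical and unrelated to the actual norm_cache data structure; the sketch also assumes fewer than DEMO_TABLE_SIZE distinct keys so that probing always terminates.

```c
#include <stddef.h>

#define DEMO_TABLE_SIZE 16

struct demo_entry { int used; int key; int value; };

static struct demo_entry demo_table[DEMO_TABLE_SIZE];
static int lookups;   /* counts how many table searches were made */

/* Linear-probing search: returns the entry for KEY, or the empty slot
   where KEY would be inserted.  */
static struct demo_entry *find_slot (int key)
{
  lookups++;
  size_t i = (size_t) (unsigned) key % DEMO_TABLE_SIZE;
  while (demo_table[i].used && demo_table[i].key != key)
    i = (i + 1) % DEMO_TABLE_SIZE;
  return &demo_table[i];
}

/* Stand-in for an expensive computation.  */
static int expensive (int key)
{
  return key * key;
}

/* One lookup per call: claim the slot first, then store the result.  */
static int cached_square (int key)
{
  struct demo_entry *e = find_slot (key);
  if (!e->used)
    {
      e->used = 1;
      e->key = key;
      e->value = expensive (key);
    }
  return e->value;
}
```

A find-then-insert design would search the table twice on a miss; here the counter goes up by exactly one per call, hit or miss.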
|
|
lookup_and_finish_template_variable calls convert_from_reference, which
means for a variable template-id of reference type the function wraps
the corresponding VAR_DECL in an INDIRECT_REF. But the downstream logic
of two callers, tsubst_qualified_id and finish_class_member_access_expr,
expects a DECL_P result, and this unexpected INDIRECT_REF leads to an ICE
resolving such a (dependently scoped) template-id as in the first testcase.
(Note these two callers eventually call convert_from_reference on the
result anyway, so calling it earlier seems redundant in this case.)
This patch fixes this by pulling out the convert_from_reference call
from lookup_and_finish_template_variable and into the callers that
actually need it, which turns out to only be tsubst_copy_and_build
(if we got rid of the call there we'd mishandle the second testcase).
PR c++/97340
gcc/cp/ChangeLog:
* pt.cc (lookup_and_finish_template_variable): Don't call
convert_from_reference.
(tsubst_copy_and_build) <case TEMPLATE_ID_EXPR>: Call
convert_from_reference on the result of
lookup_and_finish_template_variable.
gcc/testsuite/ChangeLog:
* g++.dg/cpp1y/var-templ80.C: New test.
* g++.dg/cpp1y/var-templ81.C: New test.
|
|
This patch removes empty run template files and one redundant stdio.h
include.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/binop/shift-run.c: Do not include
<stdio.h>.
* gcc.target/riscv/rvv/autovec/binop/shift-run-template.h: Removed.
* gcc.target/riscv/rvv/autovec/binop/vadd-run-template.h: Removed.
* gcc.target/riscv/rvv/autovec/binop/vand-run-template.h: Removed.
* gcc.target/riscv/rvv/autovec/binop/vdiv-run-template.h: Removed.
* gcc.target/riscv/rvv/autovec/binop/vmax-run-template.h: Removed.
* gcc.target/riscv/rvv/autovec/binop/vmin-run-template.h: Removed.
* gcc.target/riscv/rvv/autovec/binop/vmul-run-template.h: Removed.
* gcc.target/riscv/rvv/autovec/binop/vor-run-template.h: Removed.
* gcc.target/riscv/rvv/autovec/binop/vrem-run-template.h: Removed.
* gcc.target/riscv/rvv/autovec/binop/vsub-run-template.h: Removed.
* gcc.target/riscv/rvv/autovec/binop/vxor-run-template.h: Removed.
|
|
This patch fixes the recent vmv patch in order to allow loading
of constants via vmv.vi with the "fixed-vlmax" vectorization flavor.
gcc/ChangeLog:
* config/riscv/riscv.cc (riscv_const_insns): Remove else.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/vmv-imm-fixed-rv32.c: New test.
* gcc.target/riscv/rvv/autovec/vmv-imm-fixed-rv64.c: New test.
|
|
This patch re-implements Strings.Delete and also supplies
some runtime test code.
gcc/m2/ChangeLog:
PR modula2/109908
* gm2-libs-iso/Strings.mod (Delete): Re-implement.
gcc/testsuite/ChangeLog:
PR modula2/109908
* gm2/isolib/run/pass/testdelete.mod: New test.
Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>
|
|
signed __builtin_mul_overflow{,_p} [PR105776]
In the pattern recognition of signed __builtin_mul_overflow{,_p} we
check for result of unsigned division (which follows unsigned
multiplication) being equality compared against one of the multiplication's
argument (the one not used in the division) and check for the comparison
to be done against same precision cast of the argument (because
division's result is unsigned and the argument is signed).
But as shown in this PR, one can write it equally as comparison done in
the signed type, i.e. compare division's result cast to corresponding
signed type against the argument.
The following patch handles even those cases.
2023-05-19 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/105776
* tree-ssa-math-opts.cc (arith_overflow_check_p): If cast_stmt is
non-NULL, allow division statement to have a cast as single imm use
rather than comparison/condition.
(match_arith_overflow): In that case remove the cast stmt in addition
to the division statement.
* gcc.target/i386/pr105776.c: New test.
|
|
with same unsigned types even when target just has highpart umul [PR101856]
As can be seen on the following testcase, we pattern recognize it on
i?86/x86_64 as return __builtin_mul_overflow_p (x, y, 0UL) and avoid
that way the extra division, but don't do it e.g. on aarch64 or ppc64le,
even when return __builtin_mul_overflow_p (x, y, 0UL); actually produces
there better code. The reason for testing the presence of the optab
handler is to make sure the generated code for it is short to ensure
we don't actually pessimize code instead of optimizing it.
But, we have one case that the internal-fn.cc .MUL_OVERFLOW expansion
handles nicely, and that is when arguments/result is the same mode
TYPE_UNSIGNED type, we only use IMAGPART_EXPR of it (i.e.
__builtin_mul_overflow_p rather than __builtin_mul_overflow) and
umul_highpart_optab supports the particular mode, in that case
we emit comparison of the highpart umul result against zero.
So, the following patch matches what we do in internal-fn.cc and
also pattern matches __builtin_mul_overflow_p if
1) we only need the flag whether it overflowed (i.e. !use_seen)
2) it is unsigned (i.e. !cast_stmt)
3) umul_highpart is supported for the mode
2023-05-19 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/101856
* tree-ssa-math-opts.cc (match_arith_overflow): Pattern detect
unsigned __builtin_mul_overflow_p even when umulv4_optab doesn't
support it but umul_highpart_optab does.
* gcc.dg/tree-ssa/pr101856.c: New test.
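For illustration, this is the kind of source idiom being discussed: an unsigned multiplication followed by a division to detect overflow, next to a builtin-based check. It is a sketch with hypothetical function names and is not the PR's testcase; it uses __builtin_mul_overflow with a discarded result instead of __builtin_mul_overflow_p so that it also builds with compilers lacking the _p variant.

```c
#include <stdbool.h>

/* Division-based check: x * y wrapped iff x != 0 and (x * y) / x != y.  */
static bool overflow_by_division (unsigned long x, unsigned long y)
{
  unsigned long z = x * y;          /* unsigned multiply wraps on overflow */
  return x != 0 && z / x != y;
}

/* Builtin-based check; the product itself is not used.  */
static bool overflow_by_builtin (unsigned long x, unsigned long y)
{
  unsigned long z;
  return __builtin_mul_overflow (x, y, &z);
}
```

Both functions return the same answer for every input; the pattern recognition described above lets the compiler avoid the division in the first form.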
|