define_cond_exec does not support the special @@ syntax
and so can't support {@. As such just remove support
for it.
gcc/ChangeLog:
PR bootstrap/110324
* gensupport.cc (convert_syntax): Explicitly check for RTX code.
|
|
OpenMP permits '(first)private' for C++ member variables, which GCC handles
by tagging those by DECL_OMP_PRIVATIZED_MEMBER, adding a temporary VAR_DECL
and DECL_VALUE_EXPR pointing to the 'this->member_var' in the C++ front end.
The idea is that in omp-low.cc, the DECL_VALUE_EXPR is used before the
region (for 'firstprivate'; ignored for 'private') while in the region,
the DECL itself is used.
In gimplify, the value expansion is suppressed and deferred if the
lang_hooks.decls.omp_disregard_value_expr (decl, shared)
returns true - which is never the case if 'shared' is true. In OpenMP 4.5,
only 'map' and 'use_device_ptr' were permitted for the 'target' directive.
When OpenMP 5.0's 'private'/'firstprivate' clauses were added, the
update that the 'shared' argument could now be false was missed. The
respective check has now been added.
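As an illustration of the kind of code this affects, here is a minimal sketch of a C++ member variable used in a 'firstprivate' clause on a 'target' region. The class name Counter and the function read_in_target are hypothetical and are not taken from the added tests. Without OpenMP enabled the pragma is ignored and the function returns the same value.

```cpp
#include <cassert>

// Hypothetical class, for illustration only.
struct Counter {
  int value = 42;

  // Reads the member inside a target region.  'firstprivate (value)' gives
  // the region its own copy, initialized from this->value before the region.
  int read_in_target() {
    int result = 0;
    #pragma omp target firstprivate(value) map(from: result)
    {
      result = value + 1;  // uses the region's private copy of the member
    }
    return result;
  }
};
```

Calling read_in_target() on a default-constructed Counter is expected to return 43 and to leave the member itself unchanged.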
2024-03-01 Jakub Jelinek <jakub@redhat.com>
Tobias Burnus <tburnus@baylibre.com>
PR c++/110347
gcc/ChangeLog:
* gimplify.cc (omp_notice_variable): Fix 'shared' arg to
lang_hooks.decls.omp_disregard_value_expr for
(first)private in target regions.
libgomp/ChangeLog:
* testsuite/libgomp.c++/target-lambda-3.C: Moved from
gcc/testsuite/g++.dg/gomp/ and fixed is-mapped handling.
* testsuite/libgomp.c++/target-lambda-1.C: Modify to also
work without offloading.
* testsuite/libgomp.c++/firstprivate-1.C: New test.
* testsuite/libgomp.c++/firstprivate-2.C: New test.
* testsuite/libgomp.c++/private-1.C: New test.
* testsuite/libgomp.c++/private-2.C: New test.
* testsuite/libgomp.c++/target-lambda-4.C: New test.
* testsuite/libgomp.c++/use_device_ptr-1.C: New test.
gcc/testsuite/ChangeLog:
* g++.dg/gomp/target-lambda-1.C: Moved to become a
run-time test under testsuite/libgomp.c++.
Co-authored-by: Tobias Burnus <tburnus@baylibre.com>
(cherry picked from commit 4f82d5a95a244d0aa4f8b2541b47a21bce8a191b)
|
|
The OpenACC reduction clause on compute construct implies a copy clause
for each reduction variable [1]. This patch adds tests to check if the
implied copy is being generated. The check covers various types and
operators as described in the specification.
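A minimal sketch of what the implied copy means in user code, assuming hypothetical function names that do not come from the new tests: the two functions below are intended to behave the same, the first relying on the implied copy clause and the second spelling it out. Without OpenACC enabled the pragmas are ignored and both loops run sequentially.

```cpp
#include <cassert>

// Reduction on the compute construct; a 'copy (sum)' is implied.
long sum_implied_copy(const int* data, int n) {
  assert(n >= 0);
  long sum = 0;
  #pragma acc parallel reduction(+:sum) copyin(data[0:n])
  {
    #pragma acc loop reduction(+:sum)
    for (int i = 0; i < n; ++i)
      sum += data[i];
  }
  return sum;
}

// Same computation with the copy clause written explicitly.
long sum_explicit_copy(const int* data, int n) {
  assert(n >= 0);
  long sum = 0;
  #pragma acc parallel reduction(+:sum) copy(sum) copyin(data[0:n])
  {
    #pragma acc loop reduction(+:sum)
    for (int i = 0; i < n; ++i)
      sum += data[i];
  }
  return sum;
}
```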
gcc/testsuite/ChangeLog:
* c-c++-common/goacc/implied-copy-1.c: New test.
* c-c++-common/goacc/implied-copy-2.c: New test.
* g++.dg/goacc/implied-copy.C: New test.
* gcc.dg/goacc/implied-copy.c: New test.
* gfortran.dg/goacc/implied-copy-1.f90: New test.
* gfortran.dg/goacc/implied-copy-2.f90: New test.
[1] OpenACC 2.7 Specification section 2.5.13
|
|
This patch fixes two bugs related to polymorphic class assignment in the
Fortran front-end. One (described in PR110415) is an issue with the malloc
and realloc calls using the size from the old vptr rather than the new one.
The other is caused by the return value from the realloc call being ignored.
Testcases are added for these issues.
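The two bugs can be pictured with a small C++ analogue. This is only a sketch under assumed names (Buffer, assign); it is not the code generated by the Fortran front end. The size passed to realloc comes from the right-hand side, and the pointer returned by realloc is the one that gets stored.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical stand-in for a polymorphic object: payload plus dynamic size.
struct Buffer {
  void* data;
  std::size_t size;
};

// Assign RHS to LHS, growing or shrinking LHS as needed.
bool assign(Buffer& lhs, const Buffer& rhs) {
  assert(rhs.data != nullptr && rhs.size > 0);
  // Bug 1 analogue: the new size must come from RHS, not from the old LHS.
  void* p = std::realloc(lhs.data, rhs.size);
  if (p == nullptr)
    return false;  // lhs.data is still valid and unchanged
  // Bug 2 analogue: realloc may move the block, so its result must be used.
  std::memcpy(p, rhs.data, rhs.size);
  lhs.data = p;
  lhs.size = rhs.size;
  return true;
}
```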
2023-11-28 Andrew Jenner <andrew@codesourcery.com>
gcc/fortran/
PR fortran/110415
* trans-expr.cc (trans_class_vptr_len_assignment): Add
from_vptrp parameter. Populate it. Don't check for DECL_P
when deciding whether to create temporary.
(trans_class_pointer_fcn, gfc_trans_pointer_assignment): Add
NULL argument to trans_class_vptr_len_assignment calls.
(trans_class_assignment): Get rhs_vptr from
trans_class_vptr_len_assignment and use it for determining size
for allocation/reallocation. Use return value from realloc.
gcc/testsuite/
PR fortran/110415
* gfortran.dg/pr110415.f90: New test.
* gfortran.dg/asan/pr110415-2.f90: New test.
* gfortran.dg/asan/pr110415-3.f90: New test.
Co-Authored-By: Tobias Burnus <tobias@codesourcery.com>
|
|
We don't support it and it doesn't happen without vector extensions, so
just remove the unhandled case.
Fixes gcc.dg/pr78575.c failure.
gcc/ChangeLog:
* config/gcn/gcn.cc (gcn_vectorize_vec_perm_const): Disallow TImode.
(cherry picked from commit e7d3414dffc98efc8424818dedac138c99c9ca79)
|
|
I've only observed the problem on the devel/omp/gcc-13 branch, but this
could theoretically affect mainline also. The mov insns for the other modes
already have '$', so this completes the set.
gcc/ChangeLog:
* config/gcn/gcn-valu.md (*mov<mode>_4reg): Disparage AVGPR use when a
reload is required.
(cherry picked from commit ecb22ddbe2b676484d04e7979f7991f7eec93470)
|
|
Add the new CDNA register file. We don't support any of the specialized
instructions that use these registers, but they're useful to relieve
register pressure without spilling to stack.
Co-authored-by: Andrew Jenner <andrew@codesourcery.com>
gcc/ChangeLog:
* config/gcn/constraints.md: Add "a" AVGPR constraint.
* config/gcn/gcn-valu.md (*mov<mode>): Add AVGPR alternatives.
(*mov<mode>_4reg): Likewise.
(@mov<mode>_sgprbase): Likewise.
(gather<mode>_insn_1offset<exec>): Likewise.
(gather<mode>_insn_1offset_ds<exec>): Likewise.
(gather<mode>_insn_2offsets<exec>): Likewise.
(scatter<mode>_expr<exec_scatter>): Likewise.
(scatter<mode>_insn_1offset_ds<exec_scatter>): Likewise.
(scatter<mode>_insn_2offsets<exec_scatter>): Likewise.
* config/gcn/gcn.cc (MAX_NORMAL_AVGPR_COUNT): Define.
(gcn_class_max_nregs): Handle AVGPR_REGS and ALL_VGPR_REGS.
(gcn_hard_regno_mode_ok): Likewise.
(gcn_regno_reg_class): Likewise.
(gcn_spill_class): Allow spilling to AVGPRs on TARGET_CDNA1_PLUS.
(gcn_sgpr_move_p): Handle AVGPRs.
(gcn_secondary_reload): Reload AVGPRs via VGPRs.
(gcn_conditional_register_usage): Handle AVGPRs.
(gcn_vgpr_equivalent_register_operand): New function.
(gcn_valid_move_p): Check for validity of AVGPR moves.
(gcn_compute_frame_offsets): Handle AVGPRs.
(gcn_memory_move_cost): Likewise.
(gcn_register_move_cost): Likewise.
(gcn_vmem_insn_p): Handle TYPE_VOP3P_MAI.
(gcn_md_reorg): Handle AVGPRs.
(gcn_hsa_declare_function_name): Likewise.
(print_reg): Likewise.
(gcn_dwarf_register_number): Likewise.
* config/gcn/gcn.h (FIRST_AVGPR_REG): Define.
(AVGPR_REGNO): Define.
(LAST_AVGPR_REG): Define.
(SOFT_ARG_REG): Update.
(FRAME_POINTER_REGNUM): Update.
(DWARF_LINK_REGISTER): Update.
(FIRST_PSEUDO_REGISTER): Update.
(AVGPR_REGNO_P): Define.
(enum reg_class): Add AVGPR_REGS and ALL_VGPR_REGS.
(REG_CLASS_CONTENTS): Add new register classes and add entries for
AVGPRs to all classes.
(REGISTER_NAMES): Add AVGPRs.
* config/gcn/gcn.md (FIRST_AVGPR_REG, LAST_AVGPR_REG): Define.
(AP_REGNUM, FP_REGNUM): Update.
(define_attr "type"): Add vop3p_mai.
(define_attr "unit"): Handle vop3p_mai.
(define_attr "gcn_version"): Add "cdna2".
(define_attr "enabled"): Handle cdna2.
(*mov<mode>_insn): Add AVGPR alternatives.
(*movti_insn): Likewise.
* config/gcn/mkoffload.cc (isa_has_combined_avgprs): New.
(process_asm): Process avgpr_count.
* config/gcn/predicates.md (gcn_avgpr_register_operand): New.
(gcn_avgpr_hard_register_operand): New.
* doc/md.texi: Document the "a" constraint.
gcc/testsuite/ChangeLog:
* gcc.target/gcn/avgpr-mem-double.c: New test.
* gcc.target/gcn/avgpr-mem-int.c: New test.
* gcc.target/gcn/avgpr-mem-long.c: New test.
* gcc.target/gcn/avgpr-mem-short.c: New test.
* gcc.target/gcn/avgpr-spill-double.c: New test.
* gcc.target/gcn/avgpr-spill-int.c: New test.
* gcc.target/gcn/avgpr-spill-long.c: New test.
* gcc.target/gcn/avgpr-spill-short.c: New test.
libgomp/ChangeLog:
* plugin/plugin-gcn.c (max_isa_vgprs): New.
(run_kernel): CDNA2 devices have more VGPRs.
(cherry picked from commit ae0d2c240213c5a7f6959c032bfc9f0703cab787)
|
|
Remove some unnecessary complexity; no functional change is intended,
although LRA appears to use the constraints from the reload_in/out
patterns, so it's probably an improvement for it to see the real sgprbase
constraints.
gcc/ChangeLog:
* config/gcn/gcn-valu.md (mov<mode>_sgprbase): Add @ modifier.
(reload_in<mode>): Delete.
(reload_out<mode>): Delete.
* config/gcn/gcn.cc (CODE_FOR): Delete.
(get_code_for_##PREFIX##vN##SUFFIX): Delete.
(CODE_FOR_OP): Delete.
(get_code_for_##PREFIX): Delete.
(gcn_secondary_reload): Replace "get_code_for" with "code_for".
(cherry picked from commit a0e6306b7ee16ce4ef067c00609d1303fed71c74)
|
|
The DImode min/max instructions need a clobber that SImode does not, so
add the special case to the reduction expand code.
gcc/ChangeLog:
* config/gcn/gcn.cc (gcn_expand_reduc_scalar): Add clobber to DImode
min/max instructions.
|
|
The operands really should be VOIDmode, so the warnings are false.
gcc/ChangeLog:
* config/gcn/gcn-valu.md
(vec_extract<V_1REG:mode><V_1REG_ALT:mode>_nop): Mention "operands" in
condition to silence the warnings.
(vec_extract<V_2REG:mode><V_2REG_ALT:mode>_nop): Likewise.
* config/gcn/gcn.md (*movti_insn): Likewise.
|
|
The move instructions typically have many alternatives (and I'm about to add
more) so are good candidates for the new syntax.
This patch only converts the patterns where there are no significant changes to
the generated files. The other patterns can be converted another time.
gcc/ChangeLog:
* config/gcn/gcn-valu.md (*mov<mode>): Convert to compact syntax.
(mov<mode>_exec): Likewise.
(mov<mode>_sgprbase): Likewise.
* config/gcn/gcn.md (*mov<mode>_insn): Likewise.
(*movti_insn): Likewise.
(cherry picked from commit ddfa43933ebc9e6508f0df9e748a765a74e01809)
|
|
gcc/ChangeLog:
* config/gcn/gcn.cc (print_operand): Adjust xcode type to fix warning.
(cherry picked from commit eb239c7f22b646873e93f8115ee992c4fb4d878a)
|
|
This patch adds support for a compact syntax for specifying constraints in
instruction patterns. Credit for the idea goes to Richard Earnshaw.
With this new syntax we want a clean break from the current limitations to make
something that is hopefully easier to use and maintain.
The idea behind this compact syntax is that it is often quite hard to
correlate the entries in the constraints list, attributes and instruction lists.
One has to count, and this is often tedious. Additionally, when changing a single
line in the insn, multiple lines in a diff change, making it harder to see what's
going on.
This new syntax takes into account many of the common things that are done in MD
files. It's also worth saying that this version is intended to deal with the
common case of string-based alternatives. For C chunks we have some ideas,
but those are not intended to be addressed here.
It's easiest to explain with an example:
normal syntax:
(define_insn_and_split "*movsi_aarch64"
[(set (match_operand:SI 0 "nonimmediate_operand" "=r,k,r,r,r,r, r,w, m, m, r, r, r, w,r,w, w")
(match_operand:SI 1 "aarch64_mov_operand" " r,r,k,M,n,Usv,m,m,rZ,w,Usw,Usa,Ush,rZ,w,w,Ds"))]
"(register_operand (operands[0], SImode)
|| aarch64_reg_or_zero (operands[1], SImode))"
"@
mov\\t%w0, %w1
mov\\t%w0, %w1
mov\\t%w0, %w1
mov\\t%w0, %1
#
* return aarch64_output_sve_cnt_immediate (\"cnt\", \"%x0\", operands[1]);
ldr\\t%w0, %1
ldr\\t%s0, %1
str\\t%w1, %0
str\\t%s1, %0
adrp\\t%x0, %A1\;ldr\\t%w0, [%x0, %L1]
adr\\t%x0, %c1
adrp\\t%x0, %A1
fmov\\t%s0, %w1
fmov\\t%w0, %s1
fmov\\t%s0, %s1
* return aarch64_output_scalar_simd_mov_immediate (operands[1], SImode);"
"CONST_INT_P (operands[1]) && !aarch64_move_imm (INTVAL (operands[1]), SImode)
&& REG_P (operands[0]) && GP_REGNUM_P (REGNO (operands[0]))"
[(const_int 0)]
"{
aarch64_expand_mov_immediate (operands[0], operands[1]);
DONE;
}"
;; The "mov_imm" type for CNT is just a placeholder.
[(set_attr "type" "mov_reg,mov_reg,mov_reg,mov_imm,mov_imm,mov_imm,load_4,
load_4,store_4,store_4,load_4,adr,adr,f_mcr,f_mrc,fmov,neon_move")
(set_attr "arch" "*,*,*,*,*,sve,*,fp,*,fp,*,*,*,fp,fp,fp,simd")
(set_attr "length" "4,4,4,4,*, 4,4, 4,4, 4,8,4,4, 4, 4, 4, 4")
]
)
New syntax:
(define_insn_and_split "*movsi_aarch64"
[(set (match_operand:SI 0 "nonimmediate_operand")
(match_operand:SI 1 "aarch64_mov_operand"))]
"(register_operand (operands[0], SImode)
|| aarch64_reg_or_zero (operands[1], SImode))"
{@ [cons: =0, 1; attrs: type, arch, length]
[r , r ; mov_reg , * , 4] mov\t%w0, %w1
[k , r ; mov_reg , * , 4] ^
[r , k ; mov_reg , * , 4] ^
[r , M ; mov_imm , * , 4] mov\t%w0, %1
[r , n ; mov_imm , * ,16] #
/* The "mov_imm" type for CNT is just a placeholder. */
[r , Usv; mov_imm , sve , 4] << aarch64_output_sve_cnt_immediate ("cnt", "%x0", operands[1]);
[r , m ; load_4 , * , 4] ldr\t%w0, %1
[w , m ; load_4 , fp , 4] ldr\t%s0, %1
[m , rZ ; store_4 , * , 4] str\t%w1, %0
[m , w ; store_4 , fp , 4] str\t%s1, %0
[r , Usw; load_4 , * , 8] adrp\t%x0, %A1;ldr\t%w0, [%x0, %L1]
[r , Usa; adr , * , 4] adr\t%x0, %c1
[r , Ush; adr , * , 4] adrp\t%x0, %A1
[w , rZ ; f_mcr , fp , 4] fmov\t%s0, %w1
[r , w ; f_mrc , fp , 4] fmov\t%w0, %s1
[w , w ; fmov , fp , 4] fmov\t%s0, %s1
[w , Ds ; neon_move, simd, 4] << aarch64_output_scalar_simd_mov_immediate (operands[1], SImode);
}
"CONST_INT_P (operands[1]) && !aarch64_move_imm (INTVAL (operands[1]), SImode)
&& REG_P (operands[0]) && GP_REGNUM_P (REGNO (operands[0]))"
[(const_int 0)]
{
aarch64_expand_mov_immediate (operands[0], operands[1]);
DONE;
}
)
The main syntax rules are as follows (See docs for full rules):
- Template must start with "{@" and end with "}" to use the new syntax.
- "{@" is followed by a layout in square brackets which is "cons:" followed by
a list of match_operand/match_scratch IDs, then a semicolon, then the
same for attributes ("attrs:"). Both sections are optional (so you can
use only cons, or only attrs, or both), and cons must come before attrs
if present.
- Each alternative begins with any amount of whitespace.
- Following the whitespace is a comma-separated list of constraints and/or
attributes within brackets [], with sections separated by a semicolon.
- Following the closing ']' is any amount of whitespace, and then the actual
asm output.
- Spaces are allowed in the list (they will simply be removed).
- All alternatives should be specified: a blank list should be
"[,,]", "[,,;,]" etc., not "[]" or "" (however genattr may segfault if
you leave certain attributes empty, I have found).
- The actual constraint string in the match_operand or match_scratch, and
the attribute string in the set_attr, must be blank or an empty string
(you can't combine the old and new syntaxes).
- The common idiom * return can be shortened by using <<.
- Any unexpanded iterators left during processing will result in an error at
compile time. If for some reason <> is needed in the output then these
must be escaped using \.
- Within an {@ block both multiline and singleline C comments are allowed, but
when used outside of a C block they must be the only non-whitespace blocks on
the line.
- Inside an {@ block any unexpanded iterators will result in a compile time
fault instead of incorrect assembly being generated at runtime. If the
literal <> is needed in the output this needs to be escaped with \<\>.
- This check is not performed inside C blocks (lines starting with *).
- Instead of copying the previous instruction again in the next pattern, one
can use ^ to refer to the previous asm string.
This patch works by blindly transforming the new syntax into the old syntax,
so it doesn't do extensive checking. However, it does verify that:
- The correct number of constraints/attributes are specified.
- You haven't mixed old and new syntax.
- The specified operand IDs/attribute names actually exist.
- You don't have duplicate cons.
If something goes wrong, it may write invalid constraints/attributes/template
back into the rtx. But this shouldn't matter because error_at will cause the
program to fail on exit anyway.
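To make the transformation concrete, here is a simplified sketch of how rows of the compact syntax can be folded back into old-style strings. It is not the gensupport.cc implementation: the names OldSyntax, split_list and convert_rows are hypothetical, the "{@ [cons: ...; attrs: ...]" header is not parsed, operand modifiers such as "=" are ignored, and every row is assumed to carry both a constraint section and an attribute section.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Old-style result: one comma-joined string per operand and per attribute,
// plus a multi-alternative "@" output template.
struct OldSyntax {
  std::vector<std::string> constraints;
  std::vector<std::string> attrs;
  std::string output_template;
};

// Split S on SEP, dropping spaces (they are ignored inside the lists).
static std::vector<std::string> split_list(const std::string& s, char sep) {
  std::vector<std::string> fields(1);
  for (char c : s) {
    if (c == sep)
      fields.emplace_back();
    else if (c != ' ')
      fields.back() += c;
  }
  return fields;
}

OldSyntax convert_rows(const std::vector<std::string>& rows,
                       std::size_t n_cons, std::size_t n_attrs) {
  OldSyntax out;
  out.constraints.resize(n_cons);
  out.attrs.resize(n_attrs);
  out.output_template = "@";
  std::string prev_asm;
  bool first = true;
  for (const std::string& row : rows) {
    std::size_t open = row.find('[');
    std::size_t close = row.find(']');
    if (open == std::string::npos || close == std::string::npos || close < open)
      throw std::invalid_argument("missing [...] list");
    std::string list = row.substr(open + 1, close - open - 1);
    std::size_t semi = list.find(';');
    if (semi == std::string::npos)
      throw std::invalid_argument("expected 'constraints ; attributes'");
    std::vector<std::string> cons = split_list(list.substr(0, semi), ',');
    std::vector<std::string> attrs = split_list(list.substr(semi + 1), ',');
    // Mirrors the check that the right number of entries was given.
    if (cons.size() != n_cons || attrs.size() != n_attrs)
      throw std::invalid_argument("wrong number of constraints/attributes");
    for (std::size_t i = 0; i < n_cons; ++i)
      out.constraints[i] += (first ? "" : ",") + cons[i];
    for (std::size_t i = 0; i < n_attrs; ++i)
      out.attrs[i] += (first ? "" : ",") + attrs[i];
    // The asm output follows the closing ']' after optional whitespace.
    std::size_t start = row.find_first_not_of(" \t", close + 1);
    std::string text = start == std::string::npos ? "" : row.substr(start);
    if (text == "^")
      text = prev_asm;                       // reuse the previous asm string
    else if (text.compare(0, 2, "<<") == 0)
      text = "* return" + text.substr(2);    // expand the << shorthand
    prev_asm = text;
    out.output_template += "\n" + text;
    first = false;
  }
  return out;
}
```

Given three rows with two operands and two attributes each, the sketch yields one constraint string per operand (for example "r,k,r") and an "@" template with one line per alternative, which is the shape the old syntax expects.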
Because this transformation occurs as early as possible (before patterns are
queued), the rest of the compiler can completely ignore the new syntax and
assume that the old syntax will always be used.
This doesn't seem to have any measurable effect on the runtime of gen*
programs.
gcc/ChangeLog:
* gensupport.cc (class conlist, add_constraints, add_attributes,
skip_spaces, expect_char, preprocess_compact_syntax,
parse_section_layout, parse_section, convert_syntax): New.
(process_rtx): Check for conversion.
* genoutput.cc (process_template): Check for unresolved iterators.
(class data): Add compact_syntax_p.
(gen_insn): Use it.
* gensupport.h (compact_syntax): New.
(hash-set.h): Include.
* doc/md.texi: Document it.
Co-Authored-By: Omar Tahir <Omar.Tahir2@arm.com>
(cherry picked from commit 957ae90406591739b68e95ad49a0232faeb74217)
|
|
Probably a fallout of the backport of r14-4471-g6a8edd50a149f1
Fortran/OpenMP: Fix handling of strictly structured blocks
This showed up as parsing error/fail with
libgomp.fortran/metadirective-1.f90
libgomp.fortran/metadirective-6.f90
gcc/fortran/
* decl.cc (gfc_match_end): Handle unnamed END BLOCK with
metadirectives.
|
|
Without this change, we get an ICE in verify_gimple_call for
GOMP_allocate when doing a late replacement in omp-low.cc
gcc/fortran/ChangeLog:
* trans-openmp.cc (gfc_trans_omp_clauses): Avoid gfc_evaluate_now
for allocator with indirect ref for better diagnostic.
gcc/ChangeLog:
* gcc/gimplify.cc (gimplify_omp_allocate): Gimplify allocator.
* omp-low.cc (lower_omp_allocate): Simplify; GOMP_free can also
take a plain 0 as allocator argument (arg is unused in libgomp).
libgomp/ChangeLog:
* testsuite/libgomp.fortran/allocate-8a.f90: New test.
|
|
gcc/ChangeLog:
* gimplify.cc (gimplify_bind_expr): Remove "omp allocate" attribute
so that the auxiliary statement list does not reach LTO.
gcc/testsuite/ChangeLog:
* gfortran.dg/gomp/allocate-13a.f90: New test.
(cherry picked from commit af4bb221153359f5948da917d5ef2df738bb1e61)
|
|
This was accidentally left out of commit
59175e6f088 Fortran: Support OpenMP's 'allocate' directive for stack vars
gcc/testsuite/
* gfortran.dg/gomp/allocate-4a.f90: Update dg-error.
|
|
gcc/c/ChangeLog:
* c-parser.cc (c_parser_omp_allocate): Change error
wording.
gcc/cp/ChangeLog:
* cp-tree.h (finish_omp_allocate): New prototype.
* parser.cc (struct cp_omp_loc_tree,
cp_check_omp_allocate_allocator_r): New.
(cp_parser_omp_allocate): Call it; remove sorry,
improve checks, call finish_omp_allocate.
* pt.cc (tsubst_stmt): Call finish_omp_allocate.
* semantics.cc (finish_omp_allocate): New.
libgomp/ChangeLog:
* libgomp.texi (OpenMP Impl. Status): Document that 'omp allocate'
is now supported for C++ stack/automatic variables.
* testsuite/libgomp.c-c++-common/allocate-4.c: Renamed from ...
* testsuite/libgomp.c/allocate-4.c: ... this.
* testsuite/libgomp.c-c++-common/allocate-5.c: Renamed from ...
* testsuite/libgomp.c/allocate-5.c: ... this.
* testsuite/libgomp.c-c++-common/allocate-6.c: Renamed from ...
* testsuite/libgomp.c/allocate-6.c: ... this.
* testsuite/libgomp.c++/allocate-2.C: New test.
gcc/testsuite/ChangeLog:
* c-c++-common/gomp/allocate-5.c: Remove C++ 'sorry'; minor updates.
* c-c++-common/gomp/allocate-9.c: Likewise.
* c-c++-common/gomp/allocate-17.c: Likewise.
* c-c++-common/gomp/directive-1.c: Likewise.
* g++.dg/gomp/allocate-5.C: New test.
|
|
gcc/fortran/ChangeLog:
* gfortran.h (ext_attr_t): Add omp_allocate flag.
* match.cc (gfc_free_omp_namelist): Avoid deleting the same
u2.allocator multiple times now that a sequence can use
the same one.
* openmp.cc (gfc_match_omp_clauses, gfc_match_omp_allocate): Use
same allocator expr multiple times.
(is_predefined_allocator): Make static.
(gfc_resolve_omp_allocate): Update/extend restriction checks;
remove sorry message.
(resolve_omp_clauses): Reject coarrays in allocate/allocators
directive.
* parse.cc (check_omp_allocate_stmt): Permit procedure pointers
here (rejected later) for less misleading diagnostic.
* trans-array.cc (gfc_trans_auto_array_allocation): Propagate
size for GOMP_alloc and location to which it should be added to.
* trans-decl.cc (gfc_trans_deferred_vars): Handle 'omp allocate'
for stack variables; sorry for static variables/common blocks.
* trans-openmp.cc (gfc_trans_omp_clauses): Evaluate 'allocate'
clause's allocator only once; fix adding expressions to the
block.
(gfc_trans_omp_single): Pass a block to gfc_trans_omp_clauses.
gcc/ChangeLog:
* gimplify.cc (gimplify_bind_expr): Handle Fortran's
'omp allocate' for stack variables.
libgomp/ChangeLog:
* libgomp.texi (OpenMP Impl. Status): Mention that Fortran now
supports the allocate directive for stack variables.
* testsuite/libgomp.fortran/allocate-5a.f90: Renamed from
testsuite/libgomp.fortran/allocate-5.f90.
* testsuite/libgomp.fortran/allocate-5.f90: New test.
* testsuite/libgomp.fortran/allocate-6.f90: New test.
* testsuite/libgomp.fortran/allocate-7.f90: New test.
* testsuite/libgomp.fortran/allocate-8.f90: New test.
gcc/testsuite/ChangeLog:
* c-c++-common/gomp/allocate-14.c: Fix directive name.
* c-c++-common/gomp/allocate-15.c: Likewise.
* c-c++-common/gomp/allocate-9.c: Fix comment typo.
* gfortran.dg/gomp/allocate-4.f90: Remove sorry dg-error.
* gfortran.dg/gomp/allocate-7.f90: Likewise.
* gfortran.dg/gomp/allocate-10.f90: New test.
* gfortran.dg/gomp/allocate-11.f90: New test.
* gfortran.dg/gomp/allocate-12.f90: New test.
* gfortran.dg/gomp/allocate-13.f90: New test.
* gfortran.dg/gomp/allocate-14.f90: New test.
* gfortran.dg/gomp/allocate-15.f90: New test.
* gfortran.dg/gomp/allocate-8.f90: New test.
* gfortran.dg/gomp/allocate-9.f90: New test.
(cherry picked from commit 969f5c3eaa7f073f532206ced0f177b4eb58aee2)
|
|
Commit r14-1301-gd64e8e1224708e added u2.allocator to gfc_omp_namelist
for better readability and to permit using namelist->expr for code
like the following:
!$omp allocators allocate(align(32) : dt%alloc_comp)
allocate (dt%alloc_comp(5))
!$omp allocate(dt%alloc_comp2) align(64)
allocate (dt%alloc_comp2(10))
However, for the parse-tree dump the change was incomplete.
gcc/fortran/ChangeLog:
* dump-parse-tree.cc (show_omp_namelist): Fix dump of the allocator
modifier of OMP_LIST_ALLOCATE.
(cherry picked from commit 99e3214f582b08b69b11b53eb3fc73b0919ef4f1)
|
|
Merge up to r13-7985-g319e887bdddfb8b9244f9310a54c1f08b7e8f0e8 (26th Oct 2023)
|
|
This merges mainline's
r14-1301-gd64e8e1224708e
Fortran/OpenMP: Add parsing support for allocators/allocate directives
into OG13.
In theory, that just replaces (reverts) the OG13-only commit
6ce5ee77f73 Add parsing support for allocate directive (OpenMP 5.0)
by Hafiz Abid Qadeer <abidh@codesourcery.com>
However, the replacement is not full - thus, this commit does much more:
While mainline supports 'omp allocators' besides 'omp allocate' and
the latter also for stack and pointer variables,
the OG13 branch supports actual allocation with the directive version.
* Thus, this commit wires mainline's executable allocate + new allocators
to the OG13 GOMP_alloc/GOMP_free handling.
* It now also handles the 'align' clause/modifier.
* By defaulting to 'omp allocators', it re-uses the ALLOCATE clause in the
ME instead of using the new ALLOCATOR clause, which has now been removed
from the ME code.
On the FE side, it also includes a modified version of:
https://gcc.gnu.org/pipermail/gcc-patches/2023-October/634096.html
[Patch] OpenMP/Fortran: Handle unlisted items in 'omp allocators' + exec. 'omp allocate'
The whole patch cannot be easily disentangled and parts of
6ce5ee77f73 remain in a revised form. For those bits, the
credits belong to Abid.
gcc/fortran/ChangeLog:
* dump-parse-tree.cc (show_omp_namelist, show_omp_node,
show_code_node): Dump EXEC_OMP_ALLOCATORS update for struct changes.
* gfortran.h (enum gfc_statement): Add ST_OMP_(END_)ALLOCATORS.
(gfc_omp_namelist): Add allocator to union u2.
(enum): Remove OMP_LIST_ALLOCATOR.
(gfc_namespace): Add omp_allocate.
(enum gfc_exec_op): Add EXEC_OMP_ALLOCATORS.
(gfc_resolve_omp_allocate): New prototype.
* match.cc (gfc_free_omp_namelist): Free u2.allocator.
* match.h (gfc_match_omp_allocators): New prototype.
* openmp.cc (gfc_omp_directives): Enable allocate/allocator.
(gfc_match_omp_variable_list): Add reject_common_vars Boolean arg.
(enum omp_mask2): Remove OMP_CLAUSE_ALLOCATOR.
(gfc_match_omp_clauses): Update for struct change.
(OMP_ALLOCATE_CLAUSES): Remove.
(OMP_ALLOCATORS_CLAUSES): Add.
(gfc_match_omp_allocate): Replace by upstream version.
(gfc_match_omp_allocators, is_predefined_allocator,
gfc_resolve_omp_allocate): New.
(verify_omp_clauses_symbol_dups): Update for allocate/allocator.
(resolve_omp_clauses): Update from mainline.
(omp_code_to_statement): Add EXEC_OMP_ALLOCATORS.
(prepare_omp_allocated_var_list_for_cleanup,
check_allocate_directive_restrictions, EMPTY_VAR_LIST,
gfc_resolve_omp_allocate): Remove.
(gfc_resolve_omp_directive): Add EXEC_OMP_ALLOCATORS.
* parse.cc (check_omp_allocate_stmt, decode_omp_directive,
next_statement, case_omp_decl, gfc_ascii_statement,
verify_st_order, parse_openmp_allocate_block,
parse_omp_structured_block, parse_executable): Update from mainline.
* resolve.cc (gfc_resolve_blocks, resolve_codes): Handle allocate
and allocator.
* st.cc (gfc_free_statement): Add EXEC_OMP_ALLOCATORS.
* trans-decl.cc (gfc_trans_deferred_vars): Use OMP_LIST_ALLOCATE
instead of OMP_LIST_ALLOCATOR.
* trans-openmp.cc (gfc_trans_omp_clauses): Remove OMP_LIST_ALLOCATOR,
update allocate handling.
(gfc_trans_omp_allocate): Renamed to ...
(gfc_trans_omp_allocators): ... this.
(gfc_split_omp_clauses, gfc_trans_omp_directive): Update.
* trans.cc (trans_code): Handle EXEC_OMP_ALLOCATORS.
gcc/ChangeLog:
* omp-low.cc (scan_sharing_clauses): Update message for directive use.
(lower_omp_allocate, lower_omp_1): Update for clause changes; handle
alignment.
* tree-core.h (enum omp_clause_code): Remove OMP_CLAUSE_ALLOCATOR.
* tree-pretty-print.cc (dump_omp_clause): Likewise.
* tree.cc (omp_clause_num_ops): Likewise.
* tree.h (OMP_ALLOCATE_DECL, OMP_ALLOCATE_ALLOCATOR): Remove.
libgomp/ChangeLog:
* testsuite/libgomp.fortran/allocate-2a.f90: Add exec stmt.
* testsuite/libgomp.fortran/allocate-4.f90: Update dg-error
* testsuite/libgomp.fortran/allocate-5.f90: Likewise; add exec stmt.
gcc/testsuite/ChangeLog:
* gfortran.dg/gomp/allocate-2.f90: Update dg-warning.
* gfortran.dg/gomp/allocate-4.f90: New upstream test.
* gfortran.dg/gomp/allocate-5.f90: New upstream test.
* gfortran.dg/gomp/allocate-6.f90: New upstream test.
* gfortran.dg/gomp/allocate-7.f90: New upstream test.
* gfortran.dg/gomp/allocate-4a.f90: Renamed from allocate-4.f90
* gfortran.dg/gomp/allocate-6a.f90: Renamed from allocate-6.f90
* gfortran.dg/gomp/allocate-7a.f90: Renamed from allocate-7.f90
* gfortran.dg/gomp/allocate-9a.f90: New test.
* gfortran.dg/gomp/allocators-1.f90: New upstream test.
* gfortran.dg/gomp/allocators-2.f90: New upstream test.
|
|
LoongArch's microarchitecture ensures cache coherency in hardware.
Due to out-of-order execution, "ibar" is required to ensure that stores
(which invalidate the icache) executed by this CPU before the "ibar" are
visible to instruction fetch. "ibar" does not itself invalidate the icache,
so the start and end parameters do not affect "ibar" performance.
gcc/ChangeLog:
* config/loongarch/loongarch.h (CLEAR_INSN_CACHE): New definition.
(cherry picked from commit 5697ed0327f23d2e2ec4f7beec3b3d02f463173c)
|
|
gcc/ChangeLog:
* config/loongarch/loongarch.md (get_thread_pointer<mode>): Add the
instruction template corresponding to the __builtin_thread_pointer
function.
* doc/extend.texi: Add the __builtin_thread_pointer function support
description to the documentation.
gcc/testsuite/ChangeLog:
* gcc.target/loongarch/builtin_thread_pointer.c: New test.
(cherry picked from commit 1b30ef7cea773e0af527dbf821e0be42b6a264f8)
|
|
* opts.cc (debug_type_names): Remove stabs and xcoff.
(df_set_names): Adjust.
(cherry picked from commit 724badcadf889b5798321620aeed16b61e91fe72)
|
|
gcc/ChangeLog:
PR target/111001
* config/sh/sh_treg_combine.cc (sh_treg_combine::record_set_of_reg):
Skip over nop move insns.
|
|
As PR111367 shows, when prefixed insns are supported, some of the
checks conclude that a prefixed insn can be used for the stack-protect
related load/store. But since we don't actually change the emitted
assembly for 32 bit, this can cause the assembler error shown in the PR.
Mike's commit r10-4547-gce6a6c007e5a98 already handled the 64 bit case
(DImode). This patch handles the 32 bit case (SImode) by making use of
the mode iterator P and the ptrload attribute iterator, and also fixes
the constraints to match the emitted operand formats.
PR target/111367
gcc/ChangeLog:
* config/rs6000/rs6000.md (stack_protect_setsi): Support prefixed
instruction emission and incorporate to stack_protect_set<mode>.
(stack_protect_setdi): Rename to ...
(stack_protect_set<mode>): ... this, adjust constraint.
(stack_protect_testsi): Support prefixed instruction emission and
incorporate to stack_protect_test<mode>.
(stack_protect_testdi): Rename to ...
(stack_protect_test<mode>): ... this, adjust constraint.
gcc/testsuite/ChangeLog:
* g++.target/powerpc/pr111367.C: New test.
(cherry picked from commit 530babc2058be5f2b06b1541384e7b730c368b93)
|
|
gcc/fortran/ChangeLog:
PR fortran/111837
* frontend-passes.cc (traverse_io_block): Dependency check of loop
nest shall be triangular, not banded.
gcc/testsuite/ChangeLog:
PR fortran/111837
* gfortran.dg/implied_do_io_8.f90: New test.
(cherry picked from commit 5ac63ec5da2e93226457bea4dbb3a4f78d5d82c2)
|
|
In the PR, Joseph says that in C char8_t is not a distinct type. So
we should behave as if it can alias anything, like ordinary char.
In C, unsigned_char_type_node == char8_type_node, so with this patch
we return 0 instead of -1. And the following comment says:
/* The C standard guarantees that any object may be accessed via an
lvalue that has narrow character type (except char8_t). */
if (t == char_type_node
|| t == signed_char_type_node
|| t == unsigned_char_type_node)
return 0;
Which appears to be wrong, so I'm adjusting that as well.
PR c/111884
gcc/c-family/ChangeLog:
* c-common.cc (c_common_get_alias_set): Return -1 for char8_t only
in C++.
gcc/testsuite/ChangeLog:
* c-c++-common/alias-1.c: New test.
(cherry picked from commit 281699fbff6262766674ab13087d37db751cd40a)
|
|
Fix accidentally inverted comparison.
gcc/ChangeLog:
PR target/101177
* config/sh/sh.md (unnamed split pattern): Fix comparison of
find_regno_note result.
|
|
As noted on the PR, commit r13-1544, the fix for PR53431, did not handle
the specific case of -Wunknown-pragmas, because that warning is issued
during preprocessing, but not by libcpp directly (it comes from the
cb_def_pragma callback). Address that by handling this pragma in
addition to libcpp pragmas during the early pragma handler.
gcc/c-family/ChangeLog:
PR c++/89038
* c-pragma.cc (handle_pragma_diagnostic_impl): Handle
-Wunknown-pragmas during early processing.
gcc/testsuite/ChangeLog:
PR c++/89038
* c-c++-common/cpp/Wunknown-pragmas-1.c: New test.
|
|
While backporting another patch to an earlier release, I hit a
situation in which lra_eliminate_regs_1 would eliminate an address to:
(plus (reg:P R) (const_int 0))
This address compared not-equal to plain:
(reg:P R)
which caused an ICE in a later peephole2. (The ICE showed up in
gfortran.fortran-torture/compile/pr80464.f90 on the branch but seems
to be latent on trunk.)
These unfolded PLUSes shouldn't occur in the insn stream, and later code
in the same function tried to avoid them.
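The difference between the two constructors can be shown with a toy model. This sketch does not use GCC's RTL API: Expr, make_plus_raw and make_plus_simplified are hypothetical stand-ins for an rtx, gen_rtx_PLUS and simplify_gen_binary respectively, and only the "plus zero" folding is modelled.

```cpp
#include <cassert>
#include <memory>

// Toy expression: either a register or PLUS (base, constant offset).
struct Expr {
  enum Kind { Reg, Plus };
  Kind kind;
  int regno;                   // valid for Reg
  std::shared_ptr<Expr> base;  // valid for Plus
  long offset;                 // valid for Plus
};
using ExprPtr = std::shared_ptr<Expr>;

ExprPtr make_reg(int regno) {
  return std::make_shared<Expr>(Expr{Expr::Reg, regno, nullptr, 0});
}

// Always builds a PLUS node, even for a zero offset.
ExprPtr make_plus_raw(ExprPtr base, long offset) {
  assert(base != nullptr);
  return std::make_shared<Expr>(Expr{Expr::Plus, 0, base, offset});
}

// Folds "base + 0" back to plain "base" before building a node.
ExprPtr make_plus_simplified(ExprPtr base, long offset) {
  assert(base != nullptr);
  if (offset == 0)
    return base;
  return make_plus_raw(base, offset);
}

// Structural comparison, as a later pass might perform on two addresses.
bool structurally_equal(const ExprPtr& a, const ExprPtr& b) {
  if (a->kind != b->kind)
    return false;
  if (a->kind == Expr::Reg)
    return a->regno == b->regno;
  return a->offset == b->offset && structurally_equal(a->base, b->base);
}
```

With the raw constructor, "R + 0" compares not-equal to "R"; with the simplifying one, the two are the same expression.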
gcc/
PR target/111528
* lra-eliminations.cc (lra_eliminate_regs_1): Use simplify_gen_binary
rather than gen_rtx_PLUS.
(cherry picked from commit 10d59b802a7db9ae908291fb20627c1493cfa26c)
|
|
When modula-2 is enabled in a development tree and there are any
changes that trigger rebuilds in m2/, doing a 'make all-gcc' in the
build directory might fail due to the lack of dependency tracking.
This patch introduces build dependencies into gcc/m2/Make-lang.in using
-M* options.
gcc/m2/ChangeLog:
PR modula2/111756
* Make-lang.in (CM2DEP): New define conditionally set if
($(CXXDEPMODE),depmode=gcc3).
(m2/gm2-gcc/%.o): Ensure $(@D)/$(DEPDIR) is created.
Add $(CM2DEP) to the $(COMPILER) command and use $(POSTCOMPILE).
(m2/gm2-gcc/m2configure.o): Ditto.
(m2/gm2-lang.o): Ditto.
(m2/m2pp.o): Ditto.
(m2/gm2-gcc/rtegraph.o): Ditto.
(m2/mc-boot/$(SRC_PREFIX)%.o): Ditto.
(m2/mc-boot-ch/$(SRC_PREFIX)%.o): Ditto.
(m2/mc-boot-ch/$(SRC_PREFIX)%.o): Ditto.
(m2/mc-boot/main.o): Ditto.
(mcflex.o): Ditto.
(m2/gm2-libs-boot/M2RTS.o): Ditto.
(m2/gm2-libs-boot/%.o): Ditto.
(m2/gm2-libs-boot/%.o): Ditto.
(m2/gm2-libs-boot/RTcodummy.o): Ditto.
(m2/gm2-libs-boot/RTintdummy.o): Ditto.
(m2/gm2-libs-boot/wrapc.o): Ditto.
(m2/gm2-libs-boot/UnixArgs.o): Ditto.
(m2/gm2-libs-boot/choosetemp.o): Ditto.
(m2/gm2-libs-boot/errno.o): Ditto.
(m2/gm2-libs-boot/dtoa.o): Ditto.
(m2/gm2-libs-boot/ldtoa.o): Ditto.
(m2/gm2-libs-boot/termios.o): Ditto.
(m2/gm2-libs-boot/SysExceptions.o): Ditto.
(m2/gm2-libs-boot/SysStorage.o): Ditto.
(m2/gm2-compiler-boot/M2GCCDeclare.o): Ditto.
(m2/gm2-compiler-boot/M2Error.o): Ditto.
(m2/gm2-compiler-boot/%.o): Ditto.
(m2/gm2-compiler-boot/%.o): Ditto.
(m2/gm2-compiler-boot/m2flex.o): Ditto.
(m2/gm2-compiler/m2flex.o): Ditto.
(m2/gm2-libs/choosetemp.o): Ditto.
(m2/boot-bin/mklink$(exeext)): Ditto.
(m2/pge-boot/%.o): Ditto.
(m2/pge-boot/%.o): Ditto.
* README: Remove out of date info.
* gm2-compiler/M2Quads.mod (BuildStringAdrParam): Correct
procedure end name.
* gm2-compiler/SymbolTable.mod (GetVarPointerCheck): Add
default FALSE return value.
Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>
|
|
|
|
and GENERAL_REGS.
For testcase
void __cond_swap(double* __x, double* __y) {
  bool __r = (*__x < *__y);
  auto __tmp = __r ? *__x : *__y;
  *__y = __r ? *__y : *__x;
  *__x = __tmp;
}
GCC-14 with -O2 and -march=x86-64 options generates the following code:
__cond_swap(double*, double*):
        movsd xmm1, QWORD PTR [rdi]
        movsd xmm0, QWORD PTR [rsi]
        comisd xmm0, xmm1
        jbe .L2
        movq rax, xmm1
        movapd xmm1, xmm0
        movq xmm0, rax
.L2:
        movsd QWORD PTR [rsi], xmm1
        movsd QWORD PTR [rdi], xmm0
        ret
rax is used to save and restore a DFmode value. In RA, both GENERAL_REGS
and SSE_REGS cost zero since we didn't disparage the alternative in the
movdf_internal pattern, so, according to the register allocation order,
GENERAL_REGS is allocated. The patch adds ? for alternatives (r,v) and
(v,r), just as we did for the movsf/hf/bf_internal patterns; after that
we get optimal RA.
__cond_swap:
.LFB0:
        .cfi_startproc
        movsd (%rdi), %xmm1
        movsd (%rsi), %xmm0
        comisd %xmm1, %xmm0
        jbe .L2
        movapd %xmm1, %xmm2
        movapd %xmm0, %xmm1
        movapd %xmm2, %xmm0
.L2:
        movsd %xmm1, (%rsi)
        movsd %xmm0, (%rdi)
        ret
gcc/ChangeLog:
PR target/110170
* config/i386/i386.md (movdf_internal): Disparage slightly for
2 alternatives (r,v) and (v,r) by adding constraint modifier
'?'.
gcc/testsuite/ChangeLog:
* gcc.target/i386/pr110170-3.c: New test.
(cherry picked from commit 37a231cc7594d12ba0822077018aad751a6fb94e)
|
|
|
|
Merge up to r13-7954-gcc87aaeceea58389b681e3a6a63f95e54f2b59cd (16th Oct 2023)
|
|
gfc_match ("... %s ...", ...) matches a gfc_symbol but with
host_assoc = 0. This commit adds '%S' as a variant that matches
with host_assoc = 1.
gcc/fortran/ChangeLog:
* match.cc (gfc_match_char): Match with '%S' a symbol
with host_assoc = 1.
(cherry picked from commit 0607e93490058ec31b6ab57078c54771f139b870)
|
|
As PR111380 (and the discussion in related PRs) shows, the way
function rs6000_can_inline_p currently treats a callee without
any target option node is wrong. It considers it always safe
to inline this kind of callee, but the callee's target flags
actually come from the command line options
(target_option_default_node). Those flags may not satisfy the
condition for inlining, yet the callee is still inlined, with
unexpected consequences.
As the associated test case pr111380-1.c shows, the caller
main is attributed with power8, but the callee foo is
compiled with power9 from the command line. It is unexpected
for main to inline foo, since foo can contain something that
requires power9 capability. Without this patch, with LTO
(-flto) we get an error message (as LTO forces the callee to
have a target option node), but without LTO foo is inlined
unexpectedly.
This patch makes the callee adopt target_option_default_node
when it doesn't have a target option node. This avoids the
wrong inlining decision and fixes the inconsistency between
LTO and non-LTO. It also aligns with what the other ports do.
PR target/111380
gcc/ChangeLog:
* config/rs6000/rs6000.cc (rs6000_can_inline_p): Adopt
target_option_default_node when the callee has no option
attributes, also simplify the existing code accordingly.
gcc/testsuite/ChangeLog:
* gcc.target/powerpc/pr111380-1.c: New test.
* gcc.target/powerpc/pr111380-2.c: New test.
(cherry picked from commit 266dfed68b881702e9660889f63408054b7fa9c0)
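
The change in the decision can be sketched with a toy model. This is not the code in rs6000.cc: IsaFlags, the flag bits, and both functions are hypothetical, and the model assumes a plain "callee flags must be a subset of caller flags" rule, whereas the real check is more involved.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical ISA feature bits, for illustration only.
using IsaFlags = std::uint32_t;
constexpr IsaFlags kPower8 = 1u << 0;
constexpr IsaFlags kPower9 = 1u << 1;

// Modelled old behaviour: a callee with no target option node
// (std::nullopt here) is always treated as safe to inline.
bool can_inline_old(IsaFlags caller, std::optional<IsaFlags> callee) {
    if (!callee)
        return true;
    return (*callee & ~caller) == 0;
}

// Modelled patched behaviour: a callee with no node adopts the
// command-line defaults, then the same subset rule is applied.
bool can_inline_new(IsaFlags caller, std::optional<IsaFlags> callee,
                    IsaFlags command_line_defaults) {
    const IsaFlags effective = callee.value_or(command_line_defaults);
    return (effective & ~caller) == 0;
}
```

With a power8 caller, a callee that has no node, and power9 command-line defaults, the old model allows inlining and the new model rejects it, which is the pr111380-1.c situation described in the message.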
|
|
PR111366 exposes one thing that can be improved in function
rs6000_update_ipa_fn_target_info: skipping an empty inline asm
string, since an empty string cannot use any hardware
features (so far HTM).
Since this rs6000_update_ipa_fn_target_info related approach
exists in GCC12 and later, the affected project highway has
updated its target pragma with ",htm", see the link:
https://github.com/google/highway/commit/15e63d61eb535f478bc
I'd not bother to consider an inline asm parser for now but
will file a separate PR for further enhancement.
PR target/111366
gcc/ChangeLog:
* config/rs6000/rs6000.cc (rs6000_update_ipa_fn_target_info): Skip
empty inline asm.
gcc/testsuite/ChangeLog:
* g++.target/powerpc/pr111366.C: New test.
(cherry picked from commit a65b38e361320e0aa45adbc969c704385ab1f45b)
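
As a rough illustration of what an empty inline asm string looks like in source, here is a minimal sketch using GNU extended asm. It is not the reproducer from the PR; compiler_barrier and sum_with_barrier are hypothetical names. The asm template is the empty string, so it emits no instructions and only acts as a compiler-level memory barrier.

```cpp
// Sketch only: an asm statement whose template string is empty.
// It contains no instructions, so it cannot require any ISA feature.
inline void compiler_barrier() {
    asm volatile("" ::: "memory");
}

// Ordinary function that happens to contain the empty asm.
int sum_with_barrier(int a, int b) {
    int s = a + b;
    compiler_barrier();
    return s;
}
```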
|
|
|
|
|