Age | Commit message (Collapse) | Author | Files | Lines |
|
This patch implements the missing vstrq_p128 intrinsic.
It just performs a store of the poly128_t argument to a memory location.
Bootstrapped and tested on aarch64-none-linux-gnu.
gcc/
PR target/71233
* config/aarch64/arm_neon.h (vstrq_p128): Define.
gcc/testsuite/
PR target/71233
* gcc.target/aarch64/simd/vstrq_p128_1.c: New test.
(cherry picked from commit d23ea1e865301cd45f14ccbdb0bca49251fde9e1)
|
|
This patch implements some missing intrinsics that perform a CLS on unsigned SIMD types.
Bootstrapped and tested on aarch64-none-linux-gnu.
gcc/
PR target/71233
* config/aarch64/arm_neon.h (vcls_u8, vcls_u16, vcls_u32,
vclsq_u8, vclsq_u16, vclsq_u32): Define.
gcc/testsuite/
PR target/71233
* gcc.target/aarch64/simd/vcls_unsigned_1.c: New test.
(cherry picked from commit 30957092db46d8798e632feefb5df634488dbb33)
|
|
This patch implements some missing vceq* intrinsics on poly types.
The behaviour is to produce the appropriate CMEQ instruction as for the unsigned types.
Bootstrapped and tested on aarch64-none-linux-gnu.
gcc/
PR target/71233
* config/aarch64/arm_neon.h (vceqq_p64, vceqz_p64, vceqzq_p64): Define.
gcc/testsuite/
PR target/71233
* gcc.target/aarch64/simd/vceq_poly_1.c: New test.
(cherry picked from commit d4703be185b422f637deebd3bb9222a41c8023d6)
|
|
This implements the vadd[p]_p* intrinsics.
In terms of functionality they are aliases of veor operations on the relevant unsigned types.
Bootstrapped and tested on aarch64-none-linux-gnu.
gcc/
PR target/71233
* config/aarch64/arm_neon.h (vadd_p8, vadd_p16, vadd_p64, vaddq_p8,
vaddq_p16, vaddq_p64, vaddq_p128): Define.
gcc/testsuite/
PR target/71233
* gcc.target/aarch64/simd/vadd_poly_1.c: New test.
(cherry picked from commit fa9ad35dae03dcb20c4ccb50ba1b351a8ab77970)
|
|
This reverts commit 4a67941a956003dcce8866604ba25f5a0bfd16cf.
|
|
The compiler gives a bogus alignment warning on an address clause and
a discriminated record type with variable size.
gcc/ada/ChangeLog:
* gcc-interface/decl.c (maybe_saturate_size): Add ALIGN parameter
and round down the result to ALIGN.
(gnat_to_gnu_entity): Adjust calls to maybe_saturate_size.
gcc/testsuite/ChangeLog:
* gnat.dg/addr16.adb: New test.
* gnat.dg/addr16_pkg.ads: New helper.
|
|
|
|
operand(s) [PR97073]
The following testcase is miscompiled on i686-linux, because
we try to expand a double-word bitwise logic operation with op0
being a (mem:DI u) and target (mem:DI u+4), i.e. partial overlap, and
thus end up with:
movl 4(%esp), %eax
andl u, %eax
movl %eax, u+4
! movl u+4, %eax optimized out
andl 8(%esp), %eax
movl %eax, u+8
rather than with the desired:
movl 4(%esp), %edx
movl 8(%esp), %eax
andl u, %edx
andl u+4, %eax
movl %eax, u+8
movl %edx, u+4
because the store of the first word to target overwrites the second word of
the operand.
expand_binop for this (and several similar places) already check for target
== op0 or target == op1, this patch just adds reg_overlap_mentioned_p calls
next to it.
Pedantically, at least for some of these it might be sufficient to force
a different target if there is overlap but target is not rtx_equal_p to
the operand (e.g. in this bitwise logical case, but e.g. not in the shift
cases where there is reordering), though that would go against the
preexisting target == op? checks and the rationale that REG_EQUAL notes in
that case isn't correct.
2020-09-27 Jakub Jelinek <jakub@redhat.com>
PR middle-end/97073
* optabs.c (expand_binop, expand_absneg_bit, expand_unop,
expand_copysign_bit): Check reg_overlap_mentioned_p between target
and operand(s) and if it returns true, force a pseudo as target.
* gcc.c-torture/execute/pr97073.c: New test.
(cherry picked from commit a4b31d5807f2bc67c8999b3d53369cf2a5c6e1ec)
|
|
Local identifiers can not be the same as a module name. Original
patch by Steve Kargl resulted in name clashes between common block
names and local identifiers. A local identifier can be the same as
a global identier if that identifier represents a common. The patch
was modified to allow global identifiers that represent a common
block.
2020-09-27 Steven G. Kargl <kargl@gcc.gnu.org>
Mark Eggleston <markeggleston@gcc.gnu.org>
gcc/fortran/
PR fortran/95614
* decl.c (gfc_get_common): Use gfc_match_common_name instead
of match_common_name.
* decl.c (gfc_bind_idents): Use gfc_match_common_name instead
of match_common_name.
* match.c : Rename match_common_name to gfc_match_common_name.
* match.c (gfc_match_common): Use gfc_match_common_name instead
of match_common_name.
* match.h : Rename match_common_name to gfc_match_common_name.
* resolve.c (resolve_common_vars): Check each symbol in a
common block has a global symbol. If there is a global symbol
issue an error if the symbol type is known as is not a common
block name.
2020-09-27 Mark Eggleston <markeggleston@gcc.gnu.org>
gcc/testsuite/
PR fortran/95614
* gfortran.dg/pr95614_1.f90: New test.
* gfortran.dg/pr95614_2.f90: New test.
(cherry picked from commit e5a76af3a2f3324efc60b4b2778ffb29d5c377bc)
|
|
|
|
|
|
2020-06-04 Vladimir Makarov <vmakarov@redhat.com>
PR middle-end/95464
* gcc.target/i386/pr95464.c: New.
(cherry picked from commit e7ef9a40cd0c688cd331bc26224d1fbe360c1fe6)
|
|
2020-06-04 Vladimir Makarov <vmakarov@redhat.com>
PR middle-end/95464
* lra.c (lra_emit_move): Add processing STRICT_LOW_PART.
* lra-constraints.c (match_reload): Use STRICT_LOW_PART in output
reload if the original insn has it too.
(cherry picked from commit 5261cf8ce824bfc75eb6f12ad5e3716c085b6f9a)
|
|
Previously, the machine description patterns for vst1q accepted a generic memory
operand for the destination, which could lead to an unrecognised builtin when
expanding vst1q* intrinsics. This change fixes the pattern to only accept MVE
memory operands.
gcc/ChangeLog:
PR target/96683
* config/arm/mve.md (mve_vst1q_f<mode>): Require MVE memory operand for
destination.
(mve_vst1q_<supf><mode>): Likewise.
gcc/testsuite/ChangeLog:
PR target/96683
* gcc.target/arm/mve/intrinsics/vst1q_f16.c: New test.
* gcc.target/arm/mve/intrinsics/vst1q_s16.c: New test.
* gcc.target/arm/mve/intrinsics/vst1q_s8.c: New test.
* gcc.target/arm/mve/intrinsics/vst1q_u16.c: New test.
* gcc.target/arm/mve/intrinsics/vst1q_u8.c: New test.
(cherry picked from commit 91d206adfe39ce063f6a5731b92a03c05e82e94a)
|
|
|
|
Add sp_is_clobbered_by_asm to rtl_data to inform backends that the stack
pointer is clobbered by asm statement.
gcc/
PR target/97032
* cfgexpand.c (asm_clobber_reg_kind): Set sp_is_clobbered_by_asm
to true if the stack pointer is clobbered by asm statement.
* emit-rtl.h (rtl_data): Add sp_is_clobbered_by_asm.
* config/i386/i386.c (ix86_get_drap_rtx): Set need_drap to true
if the stack pointer is clobbered by asm statement.
gcc/testsuite/
PR target/97032
* gcc.target/i386/pr97032.c: New test.
(cherry picked from commit 453a20c65722719b9e2d84339f215e7ec87692dc)
|
|
Power10 pc-relative code doesn't use or preserve r2 as a TOC pointer.
That means calling between pc-relative and TOC using code can't be
done without intervening linker stubs, and a call from TOC code to
pc-relative code must have a nop after the bl in order to restore r2.
Now the PowerPC libffi assembly code doesn't use r2 except for the
implicit use when making calls back to C, ffi_closure_helper_LINUX64
and ffi_prep_args64. So changing the assembly to interoperate with
pc-relative code without stubs is easily done.
PR target/97166
* src/powerpc/linux64.S (ffi_call_LINUX64): Don't emit global
entry when __PCREL__. Call using @notoc. Add nops.
* src/powerpc/linux64_closure.S (ffi_closure_LINUX64): Likewise.
(ffi_go_closure_linux64): Likewise.
(cherry picked from commit 08cd8d5929eac84b27788d8483fd75ab7ad13129)
(cherry picked from commit fff56af6421a1a3e357bcaad99f2ea084d72a7a8)
|
|
Useful in assembly to know details of power10 function calls.
PR target/97166
* config/rs6000/rs6000-c.c (rs6000_target_modify_macros):
Conditionally define __PCREL__.
(cherry picked from commit 677b9150f54a0483d3de1182ac40717b7c4431a5)
|
|
2020-09-17 Andrea Corallo <andrea.corallo@arm.com>
* config/aarch64/aarch64-builtins.c
(aarch64_general_expand_builtin): Use expand machinery not to
alter the value of an rtx returned by force_reg.
(cherry picked from commit 2c62952f8160bdc8d4111edb34a4bc75096c1e05)
|
|
This patch backports the AArch64 support for Arm's Neoverse V1 CPU to
GCC 10.
gcc/ChangeLog:
* config/aarch64/aarch64-cores.def: Add Neoverse V1.
* config/aarch64/aarch64-tune.md: Regenerate.
* doc/invoke.texi: Document support for Neoverse V1.
|
|
|
|
|
|
libstdc++-v3/ChangeLog:
PR libstdc++/97167
* src/c++17/fs_path.cc (path::_Parser::root_path()): Check
for empty string before inspecting the first character.
* testsuite/27_io/filesystem/path/append/source.cc: Append
empty string_view to path.
(cherry picked from commit 49ff88bd0d8a36a9e903f01ce05685cfe07dee5d)
|
|
The 'mod' and 'div' operators in eBPF are unsigned, with no signed
counterpart. xBPF adds two new ALU operations, sdiv and smod, for
signed division and modulus, respectively. Update bpf.md with
'define_insn' blocks for signed div and mod to use them when targetting
xBPF, and add new tests to ensure they are used appropriately.
2020-09-17 David Faust <david.faust@oracle.com>
gcc/
* config/bpf/bpf.md: Add defines for signed div and mod operators.
gcc/testsuite/
* gcc.target/bpf/diag-sdiv.c: New test.
* gcc.target/bpf/diag-smod.c: New test.
* gcc.target/bpf/xbpf-sdiv-1.c: New test.
* gcc.target/bpf/xbpf-smod-1.c: New test.
(cherry picked from commit 7c8ba5da80d5d95a8521010d6731d0d83036145d)
|
|
The necessary adjustments to support CFI in ROCGDB (ROCm 3.8+).
The -fomit-frame-pointer option now has different defaults because it has now
become useful. Otherwise the only change in output is in the debug info.
gcc/
* common/config/gcn/gcn-common.c
(gcn_option_optimization_table): Change OPT_fomit_frame_pointer to -O3.
* config/gcn/gcn.c (move_callee_saved_registers): Emit CFI notes for
prologue register saves.
(gcn_expand_prologue): Prefer the frame pointer when emitting CFI.
(gcn_frame_pointer_rqd): New function.
(gcn_debug_unwind_info): Use UI_DWARF2.
(gcn_dwarf_register_number): Map DWARF_LINK_REGISTER to DWARF PC.
(gcn_dwarf_register_span): DWARF_LINK_REGISTER doesn't span.
(TARGET_FRAME_POINTER_REQUIRED): Define new hook.
* config/gcn/gcn.h (DWARF_FRAME_RETURN_COLUMN): New define.
(DWARF_LINK_REGISTER): New define.
(FIRST_PSEUDO_REGISTER): Increment.
(FIXED_REGISTERS): Add entry for DWARF_LINK_REGISTER.
(CALL_USED_REGISTERS): Likewise.
(REGISTER_NAMES): Likewise.
|
|
While backporting 5494edae83ad33c769bd1ebc98f0c492453a6417 I noticed
that it's still not correct. I made the allocator-extended constructor
use the right type for the uses-allocator construction detection, but I
used an rvalue when it should be a const lvalue.
This should fix it properly this time.
libstdc++-v3/ChangeLog:
PR libstdc++/96803
* include/std/tuple
(_Tuple_impl(allocator_arg_t, Alloc, const _Tuple_impl<U...>&)):
Use correct value category in __use_alloc call.
* testsuite/20_util/tuple/cons/96803.cc: Check with constructors
that require correct value category to be used.
(cherry picked from commit 7825399092d572ce8ea82c4aa8dfeb65076b0e52)
|
|
The _Tuple_impl constructor for allocator-extended construction from a
different tuple type uses the _Tuple_impl's own _Head type in the
__use_alloc test. That is incorrect, because the argument tuple could
have a different type. Using the wrong type might select the
leading-allocator convention when it should use the trailing-allocator
convention, or vice versa.
libstdc++-v3/ChangeLog:
PR libstdc++/96803
* include/std/tuple
(_Tuple_impl(allocator_arg_t, Alloc, const _Tuple_impl<U...>&)):
Replace parameter pack with a type parameter and a pack and pass
the first type to __use_alloc.
* testsuite/20_util/tuple/cons/96803.cc: New test.
(cherry picked from commit 5494edae83ad33c769bd1ebc98f0c492453a6417)
|
|
|
|
libstdc++-v3/ChangeLog:
PR libstdc++/94681
* src/c++17/fs_ops.cc (read_symlink): Use posix::lstat instead
of calling ::lstat directly.
* src/filesystem/ops.cc (read_symlink): Likewise.
(cherry picked from commit 5b065f0563262a0d6cd1fea8426913bfdd841301)
|
|
The configure switch should only affect the optional Filesystem TS, not
the std::filesystem features of C++17.
libstdc++-v3/ChangeLog:
PR libstdc++/94681
* acinclude.m4 (GLIBCXX_CHECK_FILESYSTEM_DEPS): Do not depend on
$enable_libstdcxx_filesystem_ts.
* configure: Regenerate.
(cherry picked from commit 90f7636bf8df50940e0f749af60a6b374a8f09b4)
|
|
libstdc++-v3/ChangeLog:
PR libstdc++/97101
* include/std/functional (bind_front): Fix order of parameters
in is_nothrow_constructible_v specialization.
* testsuite/20_util/function_objects/bind_front/97101.cc: New test.
(cherry picked from commit 3c755b428e188228d0bad90625c995fd25a02322)
|
|
This ensures that internal/goroot.IsStandardPackage does not treat
golang.org packages as being in the standard library.
For golang/go#41368
Fixes golang/go#41499
Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/256319
|
|
When a pool resource is constructed with max_blocks_per_chunk=1 it ends
up creating a pool with blocks_per_chunk=0 which means it never
allocates anything. Instead it returns null pointers, which should be
impossible.
To avoid this problem, round the max_blocks_per_chunk value to a
multiple of four, so it's never smaller than four.
libstdc++-v3/ChangeLog:
PR libstdc++/94160
* src/c++17/memory_resource.cc (munge_options): Round
max_blocks_per_chunk to a multiple of four.
(__pool_resource::_M_alloc_pools()): Simplify slightly.
* testsuite/20_util/unsynchronized_pool_resource/allocate.cc:
Check that valid pointers are returned when small values are
used for max_blocks_per_chunk.
(cherry picked from commit 30b41cfbb2dade63e52465234a725d1d02fe70aa)
|
|
Add support for architectures such as AMD GCN, in which the pointer size is
larger than the register size. This allows the CFI information to include
multi-register locations for the stack pointer, frame pointer, and return
address.
Note that this uses a newly proposed DWARF operator DW_OP_LLVM_piece_end,
which is currently only recognized by the ROCGDB debugger from AMD. The exact
name and encoding for this operator is subject to change if and when the DWARF
standard accepts it.
gcc/ChangeLog:
* dwarf2cfi.c (dw_stack_pointer_regnum): Change type to struct cfa_reg.
(dw_frame_pointer_regnum): Likewise.
(new_cfi_row): Use set_by_dwreg.
(get_cfa_from_loc_descr): Use set_by_dwreg. Support register spans
with DW_OP_piece and DW_OP_LLVM_piece_end. Support DW_OP_lit*,
DW_OP_const*, DW_OP_minus, and DW_OP_plus.
(lookup_cfa_1): Use set_by_dwreg.
(def_cfa_0): Update for cfa_reg and support register spans.
(reg_save): Change sreg parameter to struct cfa_reg. Support register
spans.
(dwf_cfa_reg): New function.
(dwarf2out_flush_queued_reg_saves): Use dwf_cfa_reg instead of
dwf_regno.
(dwarf2out_frame_debug_def_cfa): Likewise.
(dwarf2out_frame_debug_adjust_cfa): Likewise.
(dwarf2out_frame_debug_cfa_offset): Likewise. Update reg_save usage.
(dwarf2out_frame_debug_cfa_register): Likewise.
(dwarf2out_frame_debug_expr): Likewise.
(create_pseudo_cfg): Use set_by_dwreg.
(initial_return_save): Use set_by_dwreg and dwf_cfa_reg,
(create_cie_data): Use dwf_cfa_reg.
(execute_dwarf2_frame): Use dwf_cfa_reg.
(dump_cfi_row): Use set_by_dwreg.
* dwarf2out.c (build_span_loc): New function.
(build_cfa_loc): Support register spans.
(build_cfa_aligned_loc): Update cfa_reg usage.
(convert_cfa_to_fb_loc_list): Use set_by_dwreg.
* dwarf2out.h (struct cfa_reg): New type.
(struct dw_cfa_location): Use struct cfa_reg.
(build_span_loc): New prototype.
* gengtype.c (main): Accept poly_uint16_pod type.
include/ChangeLog:
* dwarf2.def (DW_OP_LLVM_piece_end): New extension operator.
|
|
|
|
2020-09-20 John David Anglin < danglin@gcc.gnu.org>
gcc/ChangeLog
* config/pa/pa-hpux11.h (LINK_GCC_C_SEQUENCE_SPEC): Delete.
* config/pa/pa64-hpux.h (LINK_GCC_C_SEQUENCE_SPEC): Likewise.
(ENDFILE_SPEC): Link with libgcc_stub.a and mill.a.
* config/pa/pa32-linux.h (ENDFILE_SPEC): Link with libgcc.a.
|
|
|
|
An edit got botched. Fixed now.
libgomp/
* config/nvptx/bar.c (gomp_team_barrier_wake): Remove rogue asm.
|
|
|
|
gcc/fortran/
PR fortran/96041
PR fortran/93423
* decl.c (gfc_match_submod_proc): Avoid later double-free
in the error case.
(cherry picked from commit c12facd22881517127ebbe213d7ecc7fc1fcea4e)
|
|
When recovering from an error, a NULL pointer dereference could occur.
Check for that situation and punt.
gcc/fortran/
PR fortran/93423
* resolve.c (resolve_symbol): Avoid NULL pointer dereference.
(cherry picked from commit b88744905a46be44ffa3c57d46080f601ae832b8)
|
|
This pass only had an optimization for obtaining team/thread numbers in it,
and that turns out to be invalid in the presence of nested parallel regions,
so we can simply delete the whole thing.
Of course, it would be nice to apply the optimization where it is valid, but
that will take more effort than I have to spend right now.
gcc/ChangeLog:
* config/gcn/gcn-tree.c (execute_omp_gcn): Delete.
(make_pass_omp_gcn): Delete.
* config/gcn/t-gcn-hsa (PASSES_EXTRA): Delete.
* config/gcn/gcn-passes.def: Removed.
(cherry picked from commit 220724c311473b8b0f2418350c2b64e796e92bda)
|
|
Both GCN and NVPTX allow nested parallel regions, but the barrier
implementation did not allow the nested teams to run independently of each
other (due to hardware limitations). This patch fixes that, under the
assumption that each thread will create a new subteam of one thread, by
simply not using barriers when there's no other thread to synchronise.
libgomp/ChangeLog:
* config/gcn/bar.c (gomp_barrier_wait_end): Skip the barrier if the
total number of threads is one.
(gomp_team_barrier_wake): Likewise.
(gomp_team_barrier_wait_end): Likewise.
(gomp_team_barrier_wait_cancel_end): Likewise.
* config/nvptx/bar.c (gomp_barrier_wait_end): Likewise.
(gomp_team_barrier_wake): Likewise.
(gomp_team_barrier_wait_end): Likewise.
(gomp_team_barrier_wait_cancel_end): Likewise.
* testsuite/libgomp.c-c++-common/nested-parallel-unbalanced.c: New test.
|
|
libgomp/
* testsuite/libgomp.c-c++-common/pr96390.c: XFAIL on nvptx.
* testsuite/libgomp.c++/pr96390.C: Likewise.
|
|
|
|
PR middle-end/96390
* omp-offload.c (omp_discover_declare_target_tgt_fn_r): Fix
offloadable setting with target_alias.
|
|
Merge up to d01c3def63caf59d1d156aec8d52a1540e5ffe27 (2020-09-17)
|
|
Submitted to mainline at
https://gcc.gnu.org/pipermail/gcc-patches/2020-September/554142.html
gcc/ChangeLog:
PR middle-end/96390
* omp-offload.c (omp_discover_declare_target_tgt_fn_r): Handle
alias nodes.
libgomp/ChangeLog:
PR middle-end/96390
* testsuite/libgomp.c++/pr96390.C: New test.
* testsuite/libgomp.c-c++-common/pr96390.c: New test.
|
|
target regions
OpenMP 5.0 also specifies that functions referenced from target regions
(except for target regions with device(ancestor:)) are also implicitly declare target to.
This patch implements that.
2020-05-14 Jakub Jelinek <jakub@redhat.com>
* function.h (struct function): Add has_omp_target bit.
* omp-offload.c (omp_discover_declare_target_fn_r): New function,
old renamed to ...
(omp_discover_declare_target_tgt_fn_r): ... this.
(omp_discover_declare_target_var_r): Call
omp_discover_declare_target_tgt_fn_r instead of
omp_discover_declare_target_fn_r.
(omp_discover_implicit_declare_target): Also queue functions with
has_omp_target bit set, for those walk with
omp_discover_declare_target_fn_r, for declare target to functions
walk with omp_discover_declare_target_tgt_fn_r.
gcc/c/
* c-parser.c (c_parser_omp_target): Set cfun->has_omp_target.
gcc/cp/
* cp-gimplify.c (cp_genericize_r): Set cfun->has_omp_target.
gcc/fortran/
* trans-openmp.c: Include function.h.
(gfc_trans_omp_target): Set cfun->has_omp_target.
libgomp/
* testsuite/libgomp.c-c++-common/target-40.c: New test.
(cherry picked from commit 49ddde69fc8e1e4c48d4b0027ea37ac862da0f1f)
|
|
This attempts to implement what the OpenMP 5.0 spec in declare target section
says as ammended by the 5.1 changes so far (related to device_type(host)), except
that it doesn't have the device(ancestor: ...) handling yet because we do not
support it yet, and I've left so far out the except lambda note, because I need
that clarified.
2020-05-12 Jakub Jelinek <jakub@redhat.com>
* omp-offload.h (omp_discover_implicit_declare_target): Declare.
* omp-offload.c: Include context.h.
(omp_declare_target_fn_p, omp_declare_target_var_p,
omp_discover_declare_target_fn_r, omp_discover_declare_target_var_r,
omp_discover_implicit_declare_target): New functions.
* cgraphunit.c (analyze_functions): Call
omp_discover_implicit_declare_target.
* testsuite/libgomp.c/target-39.c: New test.
(cherry picked from commit dc703151d4f4560e647649506d5b4ceb0ee11e90)
|