Age | Commit message (Collapse) | Author | Files | Lines |
|
-ffrontend-optimize)
2017-12-26 Thomas Koenig <tkoenig@gcc.gnu.org>
PR fortran/83540
* frontend-passes.c (create_var): If an array to be created
has unknown size and -fno-realloc-lhs is in effect,
return NULL.
2017-12-26 Thomas Koenig <tkoenig@gcc.gnu.org>
PR fortran/83540
* gfortran.dg/inline_matmul_20.f90: New test.
From-SVN: r256003
|
|
2017-12-26 Tom de Vries <tom@codesourcery.com>
* c-c++-common/unroll-5.c: Use relative line number.
From-SVN: r256002
|
|
PR rtl-optimization/83513
* sel-sched.c (sel_rank_for_schedule): Order by non-zero usefulness
before priority comparison.
From-SVN: r256001
|
|
From-SVN: r256000
|
|
PR target/83488
* config/i386/i386.opt (-mavx512vpopcntdq, -mmavx512bitalg): Move from
ix86_isa_flags2 to ix86_isa_flags.
* config/i386/i386-c.c (ix86_target_macros_internal): Test
OPTION_MASK_ISA_AVX512BITALG and OPTION_MASK_ISA_AVX512VPOPCNTDQ in
isa_flags rather than isa_flags2.
* config/i386/i386.c (ix86_target_string): Move -mavx512vpopcntdq
and -mavx512bitalg from isa2_opts to isa_opts.
(ix86_option_override_internal): Test OPTION_MASK_ISA_AVX512VPOPCNTDQ
in x_ix86_isa_flags_explicit rather than x_ix86_isa_flags2_explicit
and set it in x_ix86_isa_flags rather than x_ix86_isa_flags2.
Formatting fixes.
(def_builtin): Treat OPTION_MASK_ISA_AVX512BW or
OPTION_MASK_ISA_AVX512F ored with another option similarly to
OPTION_MASK_ISA_AVX512VL. Even for OPTION_MASK_ISA_AVX512VL don't
clear it if mask is just OPTION_MASK_ISA_AVX512VL itself.
(ix86_expand_builtin): Don't handle OPTION_MASK_ISA_GFNI and
OPTION_MASK_ISA_VPCLMULQDQ specially, instead handle
OPTION_MASK_ISA_AVX512BW and OPTION_MASK_ISA_AVX512F that way.
* config/i386/i386-builtin.def: Move AVX512VPOPCNTDQ and AVX512BITALG
builtins from bdesc_args2 to bdesc_args section.
(__builtin_ia32_compressstoreuqi512_mask,
__builtin_ia32_compressstoreuhi512_mask,
__builtin_ia32_compressstoreuqi256_mask,
__builtin_ia32_expandloadqi512_mask,
__builtin_ia32_expandloadqi512_maskz,
__builtin_ia32_expandloadhi512_mask,
__builtin_ia32_expandloadhi512_maskz,
__builtin_ia32_compressqi512_mask, __builtin_ia32_compresshi512_mask,
__builtin_ia32_compressqi256_mask, __builtin_ia32_expandqi512_mask,
__builtin_ia32_expandqi512_maskz, __builtin_ia32_expandhi512_mask,
__builtin_ia32_expandhi512_maskz, __builtin_ia32_expandqi256_mask,
__builtin_ia32_expandqi256_maskz, __builtin_ia32_vpshrd_v32hi_mask,
__builtin_ia32_vpshld_v32hi_mask, __builtin_ia32_vpshrdv_v32hi_mask,
__builtin_ia32_vpshrdv_v32hi_maskz, __builtin_ia32_vpshldv_v32hi_mask,
__builtin_ia32_vpshldv_v32hi_maskz,
__builtin_ia32_vpopcountb_v64qi_mask,
__builtin_ia32_vpopcountw_v32hi_mask,
__builtin_ia32_vpshufbitqmb512_mask,
__builtin_ia32_vpshufbitqmb256_mask): Add
" | OPTION_MASK_ISA_AVX512BW".
(__builtin_ia32_expandloadqi256_mask,
__builtin_ia32_expandloadqi256_maskz,
__builtin_ia32_vpopcountb_v32qi_mask): Add
" | OPTION_MASK_ISA_AVX512VL | OPTION_MASK_ISA_AVX512BW".
(__builtin_ia32_expandloadhi256_mask,
__builtin_ia32_expandloadhi256_maskz,
__builtin_ia32_expandloadqi128_mask,
__builtin_ia32_expandloadqi128_maskz,
__builtin_ia32_expandloadhi128_mask,
__builtin_ia32_expandloadhi128_maskz,
__builtin_ia32_vpshrd_v16hi, __builtin_ia32_vpshrd_v16hi_mask,
__builtin_ia32_vpshrd_v8hi, __builtin_ia32_vpshrd_v8hi_mask,
__builtin_ia32_vpshrd_v8si, __builtin_ia32_vpshrd_v8si_mask,
__builtin_ia32_vpshrd_v4si, __builtin_ia32_vpshrd_v4si_mask,
__builtin_ia32_vpshrd_v4di, __builtin_ia32_vpshrd_v4di_mask,
__builtin_ia32_vpshrd_v2di, __builtin_ia32_vpshrd_v2di_mask,
__builtin_ia32_vpshld_v16hi, __builtin_ia32_vpshld_v16hi_mask,
__builtin_ia32_vpshld_v8hi, __builtin_ia32_vpshld_v8hi_mask,
__builtin_ia32_vpshld_v8si, __builtin_ia32_vpshld_v8si_mask,
__builtin_ia32_vpshld_v4si, __builtin_ia32_vpshld_v4si_mask,
__builtin_ia32_vpshld_v4di, __builtin_ia32_vpshld_v4di_mask,
__builtin_ia32_vpshld_v2di, __builtin_ia32_vpshld_v2di_mask,
__builtin_ia32_vpshrdv_v16hi, __builtin_ia32_vpshrdv_v16hi_mask,
__builtin_ia32_vpshrdv_v16hi_maskz, __builtin_ia32_vpshrdv_v8hi,
__builtin_ia32_vpshrdv_v8hi_mask, __builtin_ia32_vpshrdv_v8hi_maskz,
__builtin_ia32_vpshrdv_v8si, __builtin_ia32_vpshrdv_v8si_mask,
__builtin_ia32_vpshrdv_v8si_maskz, __builtin_ia32_vpshrdv_v4si,
__builtin_ia32_vpshrdv_v4si_mask, __builtin_ia32_vpshrdv_v4si_maskz,
__builtin_ia32_vpshrdv_v4di, __builtin_ia32_vpshrdv_v4di_mask,
__builtin_ia32_vpshrdv_v4di_maskz, __builtin_ia32_vpshrdv_v2di,
__builtin_ia32_vpshrdv_v2di_mask, __builtin_ia32_vpshrdv_v2di_maskz,
__builtin_ia32_vpshldv_v16hi, __builtin_ia32_vpshldv_v16hi_mask,
__builtin_ia32_vpshldv_v16hi_maskz, __builtin_ia32_vpshldv_v8hi,
__builtin_ia32_vpshldv_v8hi_mask, __builtin_ia32_vpshldv_v8hi_maskz,
__builtin_ia32_vpshldv_v8si, __builtin_ia32_vpshldv_v8si_mask,
__builtin_ia32_vpshldv_v8si_maskz, __builtin_ia32_vpshldv_v4si,
__builtin_ia32_vpshldv_v4si_mask, __builtin_ia32_vpshldv_v4si_maskz,
__builtin_ia32_vpshldv_v4di, __builtin_ia32_vpshldv_v4di_mask,
__builtin_ia32_vpshldv_v4di_maskz, __builtin_ia32_vpshldv_v2di,
__builtin_ia32_vpshldv_v2di_mask, __builtin_ia32_vpshldv_v2di_maskz,
__builtin_ia32_vpopcountb_v32qi, __builtin_ia32_vpopcountb_v16qi,
__builtin_ia32_vpopcountb_v16qi_mask, __builtin_ia32_vpopcountw_v16hi,
__builtin_ia32_vpopcountw_v16hi_mask, __builtin_ia32_vpopcountw_v8hi,
__builtin_ia32_vpopcountw_v8hi_mask): Add
" | OPTION_MASK_ISA_AVX512VL".
* config/i386/avx512vbmi2intrin.h (_mm512_shrdi_epi16,
_mm512_shrdi_epi32, _mm512_mask_shrdi_epi32, _mm512_maskz_shrdi_epi32,
_mm512_shrdi_epi64, _mm512_mask_shrdi_epi64, _mm512_maskz_shrdi_epi64,
_mm512_shldi_epi16, _mm512_shldi_epi32, _mm512_mask_shldi_epi32,
_mm512_maskz_shldi_epi32, _mm512_shldi_epi64, _mm512_mask_shldi_epi64,
_mm512_maskz_shldi_epi64, _mm512_shrdv_epi16, _mm512_shrdv_epi32,
_mm512_mask_shrdv_epi32, _mm512_maskz_shrdv_epi32, _mm512_shrdv_epi64,
_mm512_mask_shrdv_epi64, _mm512_maskz_shrdv_epi64, _mm512_shldv_epi16,
_mm512_shldv_epi32, _mm512_mask_shldv_epi32, _mm512_maskz_shldv_epi32,
_mm512_shldv_epi64, _mm512_mask_shldv_epi64,
_mm512_maskz_shldv_epi64): Don't require avx512bw for these intrinsics.
* config/i386/avx512bitalgintrin.h (_mm_bitshuffle_epi64_mask,
_mm_mask_bitshuffle_epi64_mask): Likewise.
* common/config/i386/i386-common.c
(OPTION_MASK_ISA_AVX512VPOPCNTDQ_SET,
OPTION_MASK_ISA_AVX512BITALG_SET): Or in OPTION_MASK_ISA_AVX512F_SET.
(OPTION_MASK_ISA_AVX512F_UNSET): Or in
OPTION_MASK_ISA_AVX512VPOPCNTDQ_UNSET and
OPTION_MASK_ISA_AVX512BITALG_UNSET.
(OPTION_MASK_ISA2_AVX512F_UNSET,
OPTION_MASK_ISA2_GENERAL_REGS_ONLY_UNSET): Define.
(ix86_handle_option): For -mno-general-regs-only, clear from
ix86_isa_flags2 OPTION_MASK_ISA2_GENERAL_REGS_ONLY_UNSET rather than
just OPTION_MASK_ISA_MPX. For -mno-sse{,2,3,4,4.1,4.2,avx,avx2} and
-mno-ssse3 clear OPTION_MASK_ISA2_AVX512F_UNSET bits from
ix86_isa_flags2. For -mno-avx512f likewise, instead of masking
individually listed ISAs. For -m{,no-}avx512{vpopcntdq,bitalg} adjust
for moving from ix86_isa_flags2 to ix86_isa_flags.
From-SVN: r255997
|
|
From-SVN: r255996
|
|
From-SVN: r255991
|
|
case label inside)
PR c++/83553
* fold-const.c (struct contains_label_data): New type.
(contains_label_1): Return non-NULL even for CASE_LABEL_EXPR, unless
inside of a SWITCH_BODY seen during the walk.
(contains_label_p): Use walk_tree instead of
walk_tree_without_duplicates, prepare data for contains_label_1 and
provide own pset.
* c-c++-common/torture/pr83553.c: New test.
From-SVN: r255987
|
|
From-SVN: r255986
|
|
declaration since r224161)
PR debug/83550
* c-decl.c (finish_struct): Set DECL_SOURCE_LOCATION on
TYPE_STUB_DECL and call rest_of_type_compilation before processing
incomplete vars rather than after it.
* c-c++-common/dwarf2/pr83550.c: New test.
From-SVN: r255981
|
|
ought to be)
PR debug/83547
* tree-iterator.c (alloc_stmt_list): Start with cleared
TREE_SIDE_EFFECTS regardless whether a new STATEMENT_LIST is allocated
or old one reused.
c/
* c-typeck.c (c_finish_stmt_expr): Ignore !TREE_SIDE_EFFECTS as
indicator of ({ }), instead skip all trailing DEBUG_BEGIN_STMTs first,
and consider empty ones if there are no other stmts. For
-Wunused-value walk all statements before the one only followed by
DEBUG_BEGIN_STMTs.
testsuite/
* gcc.c-torture/compile/pr83547.c: New test.
From-SVN: r255980
|
|
PR target/83488
* config/i386/avx512vnniintrin.h: Don't check for __AVX512F__ nor
enable avx512f explicitly in #pragma GCC target.
* config/i386/i386-builtin.def (__builtin_ia32_vpdpbusd_v8si,
__builtin_ia32_vpdpbusd_v8si_mask, __builtin_ia32_vpdpbusd_v8si_maskz,
__builtin_ia32_vpdpbusd_v4si, __builtin_ia32_vpdpbusd_v4si_mask,
__builtin_ia32_vpdpbusd_v4si_maskz, __builtin_ia32_vpdpbusds_v8si,
__builtin_ia32_vpdpbusds_v8si_mask,
__builtin_ia32_vpdpbusds_v8si_maskz, __builtin_ia32_vpdpbusds_v4si,
__builtin_ia32_vpdpbusds_v4si_mask,
__builtin_ia32_vpdpbusds_v4si_maskz, __builtin_ia32_vpdpwssd_v8si,
__builtin_ia32_vpdpwssd_v8si_mask, __builtin_ia32_vpdpwssd_v8si_maskz,
__builtin_ia32_vpdpwssd_v4si, __builtin_ia32_vpdpwssd_v4si_mask,
__builtin_ia32_vpdpwssd_v4si_maskz, __builtin_ia32_vpdpwssds_v8si,
__builtin_ia32_vpdpwssds_v8si_mask,
__builtin_ia32_vpdpwssds_v8si_maskz, __builtin_ia32_vpdpwssds_v4si,
__builtin_ia32_vpdpwssds_v4si_mask,
__builtin_ia32_vpdpwssds_v4si_maskz): Use
OPTION_MASK_ISA_AVX512VNNI | OPTION_MASK_ISA_AVX512VL instead of
just OPTION_MASK_ISA_AVX512VNNI.
* gcc.target/i386/pr83488-2.c: New test.
* gcc.target/i386/pr83488-3.c: New test.
From-SVN: r255979
|
|
2017-12-22 Martin Jambor <mjambor@suse.cz>
PR lto/82027
* lto-cgraph.c (output_cgraph_opt_summary_p): Also check former
clones.
testsuite/
* g++.dg/lto/pr82027_0.C: New test.
From-SVN: r255978
|
|
Array_index_expression may be used for indexing/slicing array or
slice. If a slice element is address taken, the slice itself is
not necessarily address taken. Only propagate address-taken for
arrays.
Reviewed-on: https://go-review.googlesource.com/83877
From-SVN: r255977
|
|
This CL ports the latest (~Go 1.10) escape analysis code from
the gc compiler. Changes include:
- In the gc compiler, the variable expression is represented
with the variable node itself (ONAME). It is the same node
used in the AST for multiple var expressions for the same
variable. In our case, the var expressions nodes are distinct
nodes. We need to propagate the escape state from/to the
underlying variable in getter and setter. We already do it in
the setter. Do it in the getter as well.
- At the point of escape analysis, some AST constructs have not
been lowered to runtime calls, for example, map literal
construction and some builtin calls. Change the analysis to
work on the non-lowered AST constructs instead of call
expressions for them. For this to work, the analysis needs to
look into Builtin_call_expression. Move its class definition
from expressions.cc to expressions.h, and add necessary
accessors. Also fix bugs in other runtime call handlings
(selectsend, ifaceX2Y2, etc.).
- Handle closures properly. The analysis tracks the function
reference expression, and the escape state is propagated to
the underlying heap expression for get_backend to do stack
allocation for non-escaping closures.
- Fix add_dereference. Before, this was doing expr->deref(),
which undoes an indirection instead of add one. In the gc
compiler, it adds a level of indirection, which is modeled as
an OIND node regardless of the type of the expression. We
can't do this for non-pointer typed expression, otherwise it
will result in a type error. Instead, we model it with a
special flavor of Node, "indirect". The flood phase handles
this by incrementing its level.
- Slicing of an array was not handled correctly. The gc compiler
has an implicit (compiler inserted) OADDR node for the array,
so the analysis is actually performed on the address of the
array. We don't have this implicit address-of expression in
the AST. Instead, we model this by adding an implicit child to
the Node of the Array_index_expression representing slicing of
an array.
- Array_index_expression may represent indexing or slicing. The
code distinguishes them by looking at whether the type of the
expression is a slice. This does not work if the slice element
is a slice. Instead, check whether its end() is NULL.
- Temporary references was handled only in a limited case, as
part of address-of expression. This CL handles it in general.
The analysis uses the Temporary_statement as the point of
tracking, and forwards Temporary_reference_expression to the
underlying statement when needed.
- Handle call return value flows, escpecially multiple return
values. This includes porting part of CL 8202, CL 20102, and
other fixes.
- Support go:noescape pragma.
- Add special handling for self assignment like
b.buf = b.buf[m:n]. (CL 3162)
- Remove ESCAPE_SCOPE, which was treated essentially the same as
ESCAPE_HEAP, and was removed from the gc compiler. (CL 32130)
- Run flood phase until fix point. (CL 30693)
- Unnamed parameters do not escape. (CL 38600)
- Various small bug fixes and improvements.
"make check-go" passes except the one test in math/big, when the
escape analysis is on. The escape analysis is still not run by
default.
Reviewed-on: https://go-review.googlesource.com/83876
From-SVN: r255976
|
|
gcc/
* common/config/i386/i386-common.c (OPTION_MASK_ISA_AVX512BITALG_SET,
OPTION_MASK_ISA_AVX512BITALG_UNSET): New.
(ix86_handle_option): Handle -mavx512bitalg, fix 4VNNIW formatting.
* config.gcc: Add avx512vpopcntdqvlintrin.h and avx512bitalgintrin.h.
* config/i386/avx512bitalgintrin.h (_mm512_popcnt_epi8, _mm512_popcnt_epi16,
_mm512_mask_popcnt_epi8, _mm512_maskz_popcnt_epi8, _mm512_mask_popcnt_epi16,
_mm512_maskz_popcnt_epi16, _mm512_bitshuffle_epi64_mask, _mm256_popcnt_epi8,
_mm512_mask_bitshuffle_epi64_mask, _mm256_mask_popcnt_epi8, _mm_popcnt_epi8,
_mm256_maskz_popcnt_epi8, _mm_bitshuffle_epi64_mask, _mm256_popcnt_epi16,
_mm_mask_bitshuffle_epi64_mask, _mm256_bitshuffle_epi64_mask,
_mm256_mask_bitshuffle_epi64_mask, _mm_popcnt_epi16, _mm_maskz_popcnt_epi8,
_mm256_mask_popcnt_epi16, _mm256_maskz_popcnt_epi16, _mm_mask_popcnt_epi8,
_mm_mask_popcnt_epi16, _mm_maskz_popcnt_epi16): New intrinsics.
* config/i386/avx512vpopcntdqvlintrin.h (_mm_popcnt_epi32, _mm_popcnt_epi64,
_mm_mask_popcnt_epi32, _mm_maskz_popcnt_epi32, _mm256_popcnt_epi32,
_mm256_mask_popcnt_epi32, _mm256_maskz_popcnt_epi32, _mm_mask_popcnt_epi64,
_mm_maskz_popcnt_epi64, _mm256_popcnt_epi64, _mm256_mask_popcnt_epi64,
_mm256_maskz_popcnt_epi64): New intrinsics.
* config/i386/cpuid.h (bit_AVX512BITALG): New bit.
* config/i386/driver-i386.c (host_detect_local_cpu): Detect -mavx512bitalg.
* config/i386/i386-builtin-types.def (V64QI_FTYPE_V64QI, V64QI_FTYPE_V64QI,
V4DI_FTYPE_V4DI, UHI_FTYPE_V2DI_V2DI_UHI, USI_FTYPE_V4DI_V4DI_USI,
V4SI_FTYPE_V4SI_V4SI_UHI, V8SI_FTYPE_V8SI_V8SI_UHI): New types.
* config/i386/i386-builtin.def (__builtin_ia32_vpopcountq_v4di,
__builtin_ia32_vpopcountq_v4di_mask, __builtin_ia32_vpopcountq_v2di,
__builtin_ia32_vpopcountq_v2di_mask, __builtin_ia32_vpopcountd_v4si,
__builtin_ia32_vpopcountd_v4si_mask, __builtin_ia32_vpopcountd_v8si,
__builtin_ia32_vpopcountd_v8si_mask, __builtin_ia32_vpopcountb_v64qi,
__builtin_ia32_vpopcountb_v64qi_mask, __builtin_ia32_vpopcountb_v32qi,
__builtin_ia32_vpopcountb_v32qi_mask, __builtin_ia32_vpopcountb_v16qi,
__builtin_ia32_vpopcountb_v16qi_mask, __builtin_ia32_vpopcountw_v32hi,
__builtin_ia32_vpopcountw_v32hi_mask, __builtin_ia32_vpopcountw_v16hi,
__builtin_ia32_vpopcountw_v16hi_mask, __builtin_ia32_vpopcountw_v8hi,
__builtin_ia32_vpopcountw_v8hi_mask, __builtin_ia32_vpshufbitqmb128_mask,
__builtin_ia32_vpshufbitqmb256_mask,
__builtin_ia32_vpshufbitqmb512_mask): New builtins.
* config/i386/i386-c.c (__AVX512BITALG__): New.
* config/i386/i386.c (isa2_opts): Add -mavx512bitalg.
(ix86_valid_target_attribute_inner_p): Ditto.
(ix86_expand_args_builtin): Handle new types.
* config/i386/i386.h (TARGET_AVX512BITALG, TARGET_AVX512BITALG_P): New.
* config/i386/i386.opt: Add -mavx512bitalg.
* config/i386/immintrin.h: Add avx512vpopcntdqvlintrin.h and
avx512bitalgintrin.h.
* config/i386/sse.md (VI48_AVX512VLBW): New iterator.
(vpopcount<mode><mask_name>): Add more types.
(avx512vl_vpshufbitqmb<mode><mask_scalar_merge_name>): New.
* doc/invoke.texi: Add -mavx512bitalg and -mavx512vpopcntdq.
gcc/testsuite/
* g++.dg/other/i386-2.C: Add new options.
* g++.dg/other/i386-3.C: Ditto.
* gcc.target/i386/sse-12.c: Ditto.
* gcc.target/i386/sse-13.c: Ditto.
* gcc.target/i386/sse-22.c: Ditto.
* gcc.target/i386/sse-23.c: Ditto.
* gcc.target/i386/avx512-check.h: Handle bit_AVX512BITALG.
* gcc.target/i386/avx512bitalg-vpopcntb-1.c: New.
* gcc.target/i386/avx512bitalg-vpopcntb.c: Ditto.
* gcc.target/i386/avx512bitalg-vpopcntbvl.c: Ditto.
* gcc.target/i386/avx512bitalg-vpopcntw-1.c: Ditto.
* gcc.target/i386/avx512bitalg-vpopcntw.c: Ditto.
* gcc.target/i386/avx512bitalg-vpopcntwvl.c: Ditto.
* gcc.target/i386/avx512bitalg-vpshufbitqmb-1.c: Ditto.
* gcc.target/i386/avx512bitalg-vpshufbitqmb.c: Ditto.
* gcc.target/i386/avx512bitalgvl-vpopcntb-1.c: Ditto.
* gcc.target/i386/avx512bitalgvl-vpopcntw-1.c: Ditto.
* gcc.target/i386/avx512bitalgvl-vpshufbitqmb-1.c: Ditto.
* gcc.target/i386/avx512vpopcntdqvl-vpopcntd-1.c: Ditto.
* gcc.target/i386/avx512vpopcntdqvl-vpopcntq-1.c: Ditto.
* gcc.target/i386/i386.exp (check_effective_target_avx512bitalg): New.
* gcc.target/i386/avx512vpopcntdq-vpopcntd-1.c: Add more types.
* gcc.target/i386/avx512vpopcntdq-vpopcntd.c: Handle new intrinsics.
* gcc.target/i386/avx512vpopcntdq-vpopcntq-1.c: Ditto.
* gcc.target/i386/avx512vpopcntdq-vpopcntq.c: Ditto.
Co-Authored-By: Sebastian Peryt <sebastian.peryt@intel.com>
From-SVN: r255975
|
|
This is a follow up patch for pr83488 to fix an error in setting
OPTION_MASK_ISA_AVX512VNNI_SET and OPTION_MASK_ISA_AVX512F_SET bits.
There were both set in ix86_isa_flags2 while being defined in
different ISA sets. Additionally move OPTION_MASK_ISA_AVX512VNNI_SET
to ix86_isa_flags as it can be used with OPTION_MASK_ISA_AVX512VL_SET.
gcc/
* common/config/i386/i386-common.c (OPTION_MASK_ISA_AVX512VNNI_SET):
Or in OPTION_MASK_ISA_AVX512F_SET.
(OPTION_MASK_ISA_AVX512F_UNSET): Or in
OPTION_MASK_ISA_AVX512VNNI_UNSET.
(ix86_handle_option): Adjust for
OPTION_MASK_ISA_AVX512VNNI_*SET being in ix86_isa_flags.
* config/i386/i386-builtin.def: Move VNNI builtins from ARGS2
section to ARGS.
* config/i386/i386-c.c: Check for OPTION_MASK_ISA_AVX512VNNI in
isa_flag instead of isa_flag2.
* config/i386/i386.c (ix86_target_string): Move -mavx512vnni from
isa_opts2 to isa_opts.
* config/i386/i386.opt (mavx512vnni): Move from ix86_isa_flags2
to ix86_isa_flags.
From-SVN: r255974
|
|
* doc/extend.texi (Loop-Specific Pragmas): Document pragma GCC unroll.
c-family/
* c-pragma.c (init_pragma): Register pragma GCC unroll.
* c-pragma.h (enum pragma_kind): Add PRAGMA_UNROLL.
c/
* c-parser.c (c_parser_while_statement): Add unroll parameter and
build ANNOTATE_EXPR if present. Add 3rd operand to ANNOTATE_EXPR.
(c_parser_do_statement): Likewise.
(c_parser_for_statement): Likewise.
(c_parser_statement_after_labels): Adjust calls to above.
(c_parse_pragma_ivdep): New static function.
(c_parser_pragma_unroll): Likewise.
(c_parser_pragma) <PRAGMA_IVDEP>: Add support for pragma Unroll.
<PRAGMA_UNROLL>: New case.
cp/
* constexpr.c (cxx_eval_constant_expression) <ANNOTATE_EXPR>: Remove
assertion on 2nd operand.
(potential_constant_expression_1): Likewise.
* cp-tree.def (RANGE_FOR_STMT): Take a 5th operand.
* cp-tree.h (RANGE_FOR_UNROLL): New macro.
(cp_convert_range_for): Adjust prototype.
(finish_while_stmt_cond): Likewise.
(finish_do_stmt): Likewise.
(finish_for_cond): Likewise.
* init.c (build_vec_init): Adjut call to finish_for_cond.
* parser.c (cp_parser_statement): Adjust call to
cp_parser_iteration_statement.
(cp_parser_for): Add unroll parameter and pass it in calls to
cp_parser_range_for and cp_parser_c_for.
(cp_parser_c_for): Add unroll parameter and pass it in call to
finish_for_cond.
(cp_parser_range_for): Add unroll parameter, set in on RANGE_FOR_STMT
and pass it in call to cp_convert_range_for.
(cp_convert_range_for): Add unroll parameter and pass it in call to
finish_for_cond.
(cp_parser_iteration_statement): Add unroll parameter and pass it in
calls to finish_while_stmt_cond, finish_do_stmt and cp_parser_for.
(cp_parser_pragma_ivdep): New static function.
(cp_parser_pragma_unroll): Likewise.
(cp_parser_pragma) <PRAGMA_IVDEP>: Add support for pragma Unroll.
<PRAGMA_UNROLL>: New case.
* pt.c (tsubst_expr) <FOR_STMT>: Adjust call to finish_for_cond.
<RANGE_FOR_STMT>: Pass unrolling factor to cp_convert_range_for.
<WHILE_STMT>: Adjust call to finish_while_stmt_cond.
<DO_STMT>: Adjust call to finish_do_stmt.
* semantics.c (finish_while_stmt_cond): Add unroll parameter and
build ANNOTATE_EXPR if present.
(finish_do_stmt): Likewise.
(finish_for_cond): Likewise.
(begin_range_for_stmt): Build RANGE_FOR_STMT with 5th operand.
fortran/
* array.c (gfc_copy_iterator): Copy unroll field.
* decl.c (directive_unroll): New global variable.
(gfc_match_gcc_unroll): New function.
* gfortran.h (gfc_iterator]): Add unroll field.
(directive_unroll): Declare:
* match.c (gfc_match_do): Use memset to initialize the iterator.
* match.h (gfc_match_gcc_unroll): New prototype.
* parse.c (decode_gcc_attribute): Match "unroll".
(parse_do_block): Set iterator's unroll.
(parse_executable): Diagnose misplaced unroll directive.
* trans-stmt.c (gfc_trans_simple_do) Annotate loop condition with
annot_expr_unroll_kind.
(gfc_trans_do): Likewise.
* gfortran.texi (GNU Fortran Compiler Directives): Split section into
subections 'ATTRIBUTES directive' and 'UNROLL directive'.
From-SVN: r255973
|
|
This CL brings escape analysis diagnostics closer to the gc
compiler's. This makes porting and debugging escape analysis
code easier. A few changes:
- In the gc compiler, the variable expression is represented
with the variable node itself (ONAME), the location of which
is the location of definition. We add a definition_location
method to Node, and make use of it when the gc compiler emits
diagnostics at the definition locations.
- In the gc compiler, methods are named T.M or (*T).M. Add the
type to the method name when possible.
- Print "moved to heap" messages only for variables.
- Reduce some duplicated diagnostics.
- Print "does not escape" messages in more situations which the
gc compiler does.
- Remove the special handling for closure numbers. In gofrontend,
closures are named "$nested#" where # is a global counter
starting from 0, whereas in the gc compiler they are named
"outer.func#" where # is a per-function counter starting from
1. We tried to adjust the closure name to better matching the
ones in the gc compiler, however, it cannot match exactly
because of the difference of the counter. Instead, just print
"outer.$nested#".
Reviewed-on: https://go-review.googlesource.com/83875
From-SVN: r255967
|
|
for gcc/c-family/ChangeLog
PR debug/83527
PR debug/83419
* c-semantics.c (only_debug_stmts_after_p): New.
(pop_stmt_list): Clear side effects in debug-only stmt list.
Check for single nondebug stmt followed by debug stmts only.
for gcc/testsuite/ChangeLog
PR debug/83527
PR debug/83419
* gcc.dg/pr83527.c: New.
From-SVN: r255966
|
|
From-SVN: r255965
|
|
gcc/testsuite/ChangeLog
* c-c++-common/Warray-bounds-3.c: Adjust dg-warning grep pattern.
From-SVN: r255962
|
|
PR middle-end/83487
* config/i386/i386.c (ix86_function_arg_boundary): Return
PARM_BOUNDARY for TYPE_EMPTY_P types.
* gcc.c-torture/compile/pr83487.c: New test.
* gcc.dg/compat/pr83487-1.h: New file.
* gcc.dg/compat/pr83487-1_main.c: New test.
* gcc.dg/compat/pr83487-1_x.c: New file.
* gcc.dg/compat/pr83487-1_y.c: New file.
* gcc.dg/compat/pr83487-2_main.c: New test.
* gcc.dg/compat/pr83487-2_x.c: New file.
* gcc.dg/compat/pr83487-2_y.c: New file.
* g++.dg/abi/pr83487.C: New test.
* g++.dg/compat/abi/pr83487-1_main.C: New test.
* g++.dg/compat/abi/pr83487-1_x.C: New file.
* g++.dg/compat/abi/pr83487-1_y.C: New file.
* g++.dg/compat/abi/pr83487-2_main.C: New test.
* g++.dg/compat/abi/pr83487-2_x.C: New file.
* g++.dg/compat/abi/pr83487-2_y.C: New file.
From-SVN: r255961
|
|
PR c/83448
* gimple-ssa-sprintf.c (maybe_warn): Don't call set_caret_index
if navail is >= dir.len.
* gcc.c-torture/compile/pr83448.c: New test.
* gcc.dg/tree-ssa/builtin-snprintf-warn-4.c: New test.
From-SVN: r255960
|
|
From-SVN: r255959
|
|
* gcc-interface/decl.c (gnat_to_gnu_entity) <E_Variable>: Always take
into account the Esize if it is known.
From-SVN: r255958
|
|
2017-12-21 Steve Ellcey <sellcey@cavium.com>
* config/aarch64/t-aarch64-linux (MULTILIB_OSDIRNAMES): Fix
triplet for ilp32.
From-SVN: r255957
|
|
From-SVN: r255955
|
|
ICE when compiled with options "-fprofile-use -freorder-blocks-and-partition")
PR rtl-optimization/80747
PR rtl-optimization/83512
* cfgrtl.c (force_nonfallthru_and_redirect): When splitting
succ edge from ENTRY, copy partition from e->dest to the newly
created bb.
* bb-reorder.c (reorder_basic_blocks_simple): If last_tail is
ENTRY, use BB_PARTITION of its successor block as current_partition.
Don't copy partition when splitting succ edge from ENTRY.
* gcc.dg/pr80747.c: New test.
* gcc.dg/pr83512.c: New test.
From-SVN: r255954
|
|
marked for throw, but doesn't))
PR tree-optimization/83523
* tree-ssa-math-opts.c (is_widening_mult_p): Return false if
for INTEGER_TYPE TYPE_OVERFLOW_TRAPS.
(convert_mult_to_fma): Likewise.
* g++.dg/tree-ssa/pr83523.C: New test.
From-SVN: r255953
|
|
operand in unary operation))
PR tree-optimization/83521
* tree-ssa-phiopt.c (factor_out_conditional_conversion): Use
gimple_build_assign without code on result of
fold_build1 (VIEW_CONVERT_EXPR, ...), as it might not create
a VIEW_CONVERT_EXPR.
* gcc.dg/pr83521.c: New test.
From-SVN: r255952
|
|
2017-12-21 Andrew Pinski <apinski@cavium.com>
Steve Ellcey <sellcey@cavium.com>
* config/aarch64/t-aarch64-linux (MULTILIB_OSDIRNAMES): Handle
multi-arch for ilp32.
Co-Authored-By: Steve Ellcey <sellcey@cavium.com>
From-SVN: r255951
|
|
https://gcc.gnu.org/ml/gcc-patches/2017-12/msg01432.html
PR c++/83406
* parser.c (cp_parser_lambda_body): Remove obsolete
single-return-statement handling.
PR c++/83406
* g++.dg/cpp0x/lambda/lambda-ice15.C: Adjust error.
* g++.dg/cpp1y/pr83406.C: New.
From-SVN: r255950
|
|
to find a register to spill with -flive-range-shrinkage -m8bit-idiv)
PR target/83467
* config/i386/i386.md (*ashl<mode>3_mask): Add operand
constraints to operand 2.
(*ashl<mode>3_mask_1): Ditto.
(*<shift_insn><mode>3_mask): Ditto.
(*<shift_insn><mode>3_mask_1): Ditto.
(*<rotate_insn><mode>3_mask): Ditto.
(*<rotate_insn><mode>3_mask_1): Ditto.
testsuite/ChangeLog:
PR target/83467
* gcc.target/i386/pr83467-1.c: New test.
* gcc.target/i386/pr83467-2.c: Ditto.
From-SVN: r255949
|
|
A number of -fcompare-debug errors on sparc arise as we split a dbr
SEQUENCE back into separate insns to turn the branch into a return.
If we just take the location from the PREV_INSN, it might be a debug
insn without INSN_LOCATION, or an insn with an unrelated location.
But that's silly: each of the SEQUENCEd insns is still an insn with
its own INSN_LOCATION, so use that instead, even though some may have
been adjusted while constructing the SEQUENCE.
for gcc/ChangeLog
* reorg.c (make_return_insns): Reemit each insn with its own
location.
From-SVN: r255948
|
|
Statements without side effects, preceded by debug begin stmt markers,
would become a statement list with side effects, although the stmt on
its own would be extracted from the list and remain not having side
effects. This causes debug info and possibly codegen differences.
This patch fixes it, identifying the situation in which the stmt would
have been extracted from the stmt list, and propagating the side
effects flag from the stmt to the list.
for gcc/ChangeLog
PR debug/83419
* c-family/c-semantics.c (pop_stmt_list): Propagate side
effects from single nondebug stmt to container list.
for gcc/testsuite/ChangeLog
PR debug/83419
* gcc.dg/pr83419.c: New.
From-SVN: r255947
|
|
it is not useful
Our current vector initialisation code will first duplicate
the first element to both lanes, then overwrite the top lane with a new
value.
This duplication can be clunky and wasteful.
Better would be to simply use the fact that we will always be overwriting
the remaining bits, and simply move the first element to the corrcet place
(implicitly zeroing all other bits).
We also need a new pattern in simplify-rtx.c:simplify_ternary_operation ,
to ensure we can still simplify:
(vec_merge:OUTER
(vec_duplicate:OUTER x:INNER)
(subreg:OUTER y:INNER 0)
(const_int N))
To:
(vec_concat:OUTER x:INNER y:INNER) or (vec_concat y x)
---
gcc/
* config/aarch64/aarch64.c (aarch64_expand_vector_init): Modify code
generation for cases where splatting a value is not useful.
* simplify-rtx.c (simplify_ternary_operation): Simplify vec_merge
across a vec_duplicate and a paradoxical subreg forming a vector
mode to a vec_concat.
gcc/testsuite/
* gcc.target/aarch64/vect-slp-dup.c: New.
From-SVN: r255946
|
|
scalar int mode
gcc/
* combine.c (simplify_set): Do not transform subregs to zero_extends
if the destination is not a scalar int mode.
From-SVN: r255945
|
|
PR c++/82872
* convert.c (convert_to_integer_1) <POINTER_TYPE>: Do not return the
shared zero if the input has overflowed.
From-SVN: r255944
|
|
system detection
Since support for -mcpu=cortex-a55 and -mcpu=cortex-a75
was added we added support for the +dotprod extension
which these CPUs support.
We already specify as such in the arm-cpus.in entries for
these processors. However the table in driver-arm.c was
not adding +dotproct to the -march string that it generates.
This patch fixes that oversight.
In the future I'd like to get the arm_cpu_table in driver-arm.c
be auto-generated somehow from the arm-cpus.in data so
that we don't have to keep track of discrepancies explicitly...
Bootstrapped and tested on arm-none-linux-gnueabihf.
* config/arm/driver-arm.c (arm_cpu_table): Specify dotprod
support for Cortex-A55 and Cortex-A75.
From-SVN: r255943
|
|
* common/config/arm/arm-common.c (compare_opt_names): Add function
comment. Use strcmp instead of manual loop.
From-SVN: r255942
|
|
2017-12-21 Martin Liska <mliska@suse.cz>
PR gcov-profile/83509
* gcov-dump.c (dump_gcov_file): Do not read info about
support_unexecuted_blocks for gcda files.
From-SVN: r255941
|
|
varasm.c:3896 on aarch64)
PR rtl-optimization/82973
* emit-rtl.h (valid_for_const_vec_duplicate_p): Rename to ...
(valid_for_const_vector_p): ... this.
* emit-rtl.c (valid_for_const_vec_duplicate_p): Rename to ...
(valid_for_const_vector_p): ... this. Adjust function comment.
(gen_vec_duplicate): Adjust caller.
* optabs.c (expand_vector_broadcast): Likewise.
* simplify-rtx.c (simplify_const_unary_operation): Don't optimize into
CONST_VECTOR if some element isn't simplified valid_for_const_vector_p
constant.
(simplify_const_binary_operation): Likewise. Use CONST_FIXED_P macro
instead of GET_CODE == CONST_FIXED.
(simplify_subreg): Use CONST_FIXED_P macro instead of
GET_CODE == CONST_FIXED.
* gfortran.dg/pr82973.f90: New test.
From-SVN: r255939
|
|
varasm.c:3896 on aarch64)
PR rtl-optimization/82973
* emit-rtl.h (valid_for_const_vec_duplicate_p): Rename to ...
(valid_for_const_vector_p): ... this.
* emit-rtl.c (valid_for_const_vec_duplicate_p): Rename to ...
(valid_for_const_vector_p): ... this. Adjust function comment.
(gen_vec_duplicate): Adjust caller.
* optabs.c (expand_vector_broadcast): Likewise.
* simplify-rtx.c (simplify_const_unary_operation): Don't optimize into
CONST_VECTOR if some element isn't simplified valid_for_const_vector_p
constant.
(simplify_const_binary_operation): Likewise. Use CONST_FIXED_P macro
instead of GET_CODE == CONST_FIXED.
(simplify_subreg): Use CONST_FIXED_P macro instead of
GET_CODE == CONST_FIXED.
* gfortran.dg/pr82973.f90: New test.
From-SVN: r255938
|
|
PR target/83488
* config/i386/i386.c (ix86_target_string): Move -mavx512vbmi2 and
-mshstk entries from isa_opts2 to isa_opts and -mhle, -mmovbe,
-mclzero and -mmwaitx entries from isa_opts to isa_opts2.
(ix86_option_override_internal): Adjust for
OPTION_MASK_ISA_{HLE,MOVBE,CLZERO,MWAITX} moving to ix86_isa_flags2
and OPTION_MASK_ISA_SHSTK moving to ix86_isa_flags.
(BDESC_VERIFYS): Remove SPECIAL_ARGS2 related checks.
(ix86_init_mmx_sse_builtins): Remove bdesc_special_args2 handling.
Use def_builtin2 instead of def_builtin for OPTION_MASK_ISA_MWAITX
and OPTION_MASK_ISA_CLZERO builtins. Use def_builtin instead of
def_builtin2 for CET builtins.
(ix86_expand_builtin): Remove bdesc_special_args2 handling. Fix
up formatting in IX86_BUILTIN_RDPID code.
* config/i386/i386-builtin.def: Move VBMI2 builtins from SPECIAL_ARGS2
section to SPECIAL_ARGS and from ARGS2 section to ARGS.
* config/i386/i386.opt (mavx512vbmi2, mshstk): Move from
ix86_isa_flags2 to ix86_isa_flags.
(mhle, mmovbe, mclzero, mmwaitx): Move from ix86_isa_flags to
ix86_isa_flags2.
* config/i386/i386-c.c (ix86_target_macros_internal): Check for
OPTION_MASK_ISA_{CLZERO,MWAITX} in isa_flag2 instead of isa_flag.
Check for OPTION_MASK_ISA_{SHSTK,AVX512VBMI2} in isa_flag instead
of isa_flag2.
* common/config/i386/i386-common.c (OPTION_MASK_ISA_AVX512VBMI2_SET):
Or in OPTION_MASK_ISA_AVX512F_SET.
(OPTION_MASK_ISA_AVX512F_UNSET): Or in
OPTION_MASK_ISA_AVX512VBMI2_UNSET.
(ix86_handle_option): Adjust for
OPTION_MASK_ISA_{SHSTK,AVX512VBMI2}_*SET being in ix86_isa_flags
and OPTION_MASK_ISA_{MOVBE,MWAITX,CLZERO}_*SET in ix86_isa_flags2.
* gcc.target/i386/pr83488.c: New test.
From-SVN: r255937
|
|
This patch makes prune_runtime_alias_test_list take the iteration
factor as a poly_int and tracks polynomial offsets internally
as well.
2017-12-21 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* tree-data-ref.h (prune_runtime_alias_test_list): Take the
factor as a poly_uint64 rather than an unsigned HOST_WIDE_INT.
* tree-data-ref.c (prune_runtime_alias_test_list): Likewise.
Track polynomial offsets.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r255936
|
|
This patch makes vect_compute_data_ref_alignment treat DR_INIT as a
poly_int and handles cases in which the calculated misalignment might
not be constant.
2017-12-21 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* tree-vect-data-refs.c (vect_compute_data_ref_alignment):
Treat drb->init as a poly_int. Fail if its misalignment wrt
vector_alignment isn't known.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r255935
|
|
This patch splits the loop versioning threshold out from the
cost model threshold so that the former can become a poly_uint64.
We still use a single test to enforce both limits where possible.
2017-12-21 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* tree-vectorizer.h (_loop_vec_info): Add a versioning_threshold
field.
(LOOP_VINFO_VERSIONING_THRESHOLD): New macro
(vect_loop_versioning): Take the loop versioning threshold as a
separate parameter.
* tree-vect-loop-manip.c (vect_loop_versioning): Likewise.
* tree-vect-loop.c (_loop_vec_info::_loop_vec_info): Initialize
versioning_threshold.
(vect_analyze_loop_2): Compute the loop versioning threshold
whenever loop versioning is needed, and store it in the new
field rather than combining it with the cost model threshold.
(vect_transform_loop): Update call to vect_loop_versioning.
Try to combine the loop versioning and cost thresholds here.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r255934
|
|
This patch makes ivopts handle polynomial address offsets
when recording potential IV uses.
2017-12-21 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* tree-ssa-loop-ivopts.h (strip_offset): Return the offset as
poly_uint64_pod rather than an unsigned HOST_WIDE_INT.
* tree-loop-distribution.c (classify_builtin_st): Update accordingly.
* tree-ssa-loop-ivopts.c (iv_use::addr_offset): Change from
an unsigned HOST_WIDE_INT to a poly_uint64_pod.
(group_compare_offset): Update accordingly.
(split_small_address_groups_p): Likewise.
(record_use): Take addr_offset as a poly_uint64 rather than
an unsigned HOST_WIDE_INT.
(strip_offset): Return the offset as a poly_uint64 rather than
an unsigned HOST_WIDE_INT.
(record_group_use, split_address_groups): Track polynomial offsets.
(add_iv_candidate_for_use): Likewise.
(addr_offset_valid_p): Take the offset as a poly_int64 rather
than a HOST_WIDE_INT.
(strip_offset_1): Return the offset as a poly_int64 rather than
a HOST_WIDE_INT.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r255933
|
|
This patch changes the offset parameter to get_binfo_at_offset
from HOST_WIDE_INT to poly_int64. This function probably doesn't
need to handle polynomial offsets in practice, but it's easy
to do and avoids forcing the caller to check first.
2017-12-21 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* tree.h (get_binfo_at_offset): Take the offset as a poly_int64
rather than a HOST_WIDE_INT.
* tree.c (get_binfo_at_offset): Likewise.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r255932
|