04-Nov-2021 Sandra Loosemore <sandra@codesourcery.com>
Bernhard Reutner-Fischer <aldot@gcc.gnu.org>
PR fortran/101337
gcc/fortran/ChangeLog:
* interface.c (gfc_compare_actual_formal): Continue checking
all arguments after encountering an error.
* intrinsic.c (do_ts29113_check): Likewise.
* resolve.c (resolve_operator): Continue resolving on op2 error.
gcc/testsuite/ChangeLog:
* gfortran.dg/bessel_3.f90: Expect additional diagnostics from
multiple bad arguments in the call.
* gfortran.dg/pr24823.f: Likewise.
* gfortran.dg/pr39937.f: Likewise.
* gfortran.dg/pr41011.f: Likewise.
* gfortran.dg/pr61318.f90: Likewise.
* gfortran.dg/c-interop/c407b-2.f90: Remove xfails.
* gfortran.dg/c-interop/c535b-2.f90: Likewise.
|
|
!binds_to_current_def_p
While proofreading the code for handling EAF flags of !binds_to_current_def_p I
noticed that the interprocedural dataflow actually ignores the flag, possibly
introducing wrong code on quite complex interposable functions in non-trivial
recursion cycles (or at an ltrans partition boundary).
This patch unifies the flag changes in a single place (remove_useless_eaf_flags)
and extends modref_merge_call_site_flags to do the right thing.
lto-bootstrapped/regtested x86_64-linux. Plan to commit it today after a bit
more testing (firefox/clang build).
gcc/ChangeLog:
* gimple.c (gimple_call_arg_flags): Use interposable_eaf_flags.
(gimple_call_retslot_flags): Likewise.
(gimple_call_static_chain_flags): Likewise.
* ipa-modref.c (remove_useless_eaf_flags): Do not remove everything for
NOVOPS.
(modref_summary::useful_p): Likewise.
(modref_summary_lto::useful_p): Likewise.
(analyze_parms): Do not give up on NOVOPS.
(analyze_function): When dumping, report changes in EAF flags
between IPA and local pass.
(modref_merge_call_site_flags): Compute implicit eaf flags
based on callee ecf_flags and fnspec; if the function does not
bind to current defs use interposable_eaf_flags.
(modref_propagate_flags_in_scc): Update.
* ipa-modref.h (interposable_eaf_flags): New function.
|
|
This patch forms the meat of the improvements for this patch series.
We develop a replacement for rs6000_expand_builtin and its supporting
functions, which are inefficient and difficult to maintain.
Differences between the old and new support in this patch include:
- Make use of the new builtin data structures, directly looking up
a function's information rather than searching for the function
multiple times;
- Test for enablement of builtins at expand time, to support #pragma
target changes within a compilation unit;
- Use the builtin function attributes (e.g., bif_is_cpu) to control
special handling;
- Refactor common code into one place; and
- Provide common error handling in one place for operands that are
restricted to specific values or ranges.
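The direct-lookup, expand-time enablement, and shared range-check points above can be pictured with a small C++ sketch. Everything in it (`bif_code`, `bif_info`, `bif_table`, `check_builtin`, the attribute and feature bits) is hypothetical and chosen for illustration only; it is not the real rs6000 data structure or API.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical builtin codes; the real enumeration is far larger.
enum bif_code { BIF_NONE, BIF_VEC_SPLAT, BIF_CPU_IS, BIF_COUNT };

// Hypothetical attribute and target-feature bits.
constexpr std::uint32_t ATTR_CPU = 1u << 0;
constexpr std::uint32_t FEAT_ALTIVEC = 1u << 0;

struct bif_info {
  const char *name;
  std::uint32_t attrs;     // special-handling attributes (not modeled further here)
  std::uint32_t required;  // target features the builtin needs
  int restricted_operand;  // index of a range-restricted operand, or -1
  int lo, hi;              // permitted inclusive range for that operand
};

// One table indexed directly by builtin code, so no repeated searching.
static const bif_info bif_table[BIF_COUNT] = {
  {"none", 0, 0, -1, 0, 0},
  {"vec_splat", 0, FEAT_ALTIVEC, 1, 0, 15},
  {"cpu_is", ATTR_CPU, 0, -1, 0, 0},
};

// Returns an empty string on success, otherwise a diagnostic.  The
// enablement test uses the features in effect at the call, which is
// what allows a per-function target change to take effect.
std::string check_builtin(bif_code code, std::uint32_t enabled,
                          const std::vector<int> &operands) {
  assert(code >= 0 && code < BIF_COUNT);
  const bif_info &info = bif_table[code];
  if ((info.required & ~enabled) != 0)
    return std::string(info.name) + ": required target feature is not enabled";
  if (info.restricted_operand >= 0) {
    assert(info.restricted_operand < (int) operands.size());
    int v = operands[info.restricted_operand];
    if (v < info.lo || v > info.hi)
      return std::string(info.name) + ": operand out of range";
  }
  return std::string();
}
```

Keeping the range and enablement checks in one function is what lets every builtin share the same diagnostics.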
2021-11-07 Bill Schmidt <wschmidt@linux.ibm.com>
gcc/
* config/rs6000/rs6000-call.c (rs6000_expand_new_builtin): New
forward decl.
(rs6000_invalid_new_builtin): New function.
(rs6000_expand_builtin): Call rs6000_expand_new_builtin.
(rs6000_expand_ldst_mask): New function.
(new_cpu_expand_builtin): Likewise.
(elemrev_icode): Likewise.
(ldv_expand_builtin): Likewise.
(lxvrse_expand_builtin): Likewise.
(lxvrze_expand_builtin): Likewise.
(stv_expand_builtin): Likewise.
(new_mma_expand_builtin): Likewise.
(new_htm_spr_num): Likewise.
(new_htm_expand_builtin): Likewise.
(rs6000_expand_new_builtin): Likewise.
(rs6000_init_builtins): Initialize altivec_builtin_mask_for_load.
|
|
implement the (long promised) intraprocedural dataflow for
propagating eaf flags, so we can handle parameters that participate
in loops in SSA graphs. Typical examples are accessors that walk linked
lists.
I implemented dataflow using the standard iteration over BBs in RPO some time
ago, but did not like it because it had a measurable compile time impact with
a very small code quality effect. This is why I kept mainline doing the DFS walk
instead. The reason is that we care about flags of SSA names that correspond
to parameters, and those can often be determined from a small fraction of the
SSA graph, so solving dataflow for all SSA names in a function is a waste.
This patch implements dataflow more carefully. The DFS walk is kept in place to
solve acyclic cases and to discover the relevant part of the SSA graph into a new
graph (which is similar to the one used for inter-procedural dataflow - we only
need to know the edges and whether the access is direct or dereferenced). The RPO
iterative dataflow then works on this simplified graph.
This seems to be fast in practice. For GCC linktime we do dataflow for 4881
functions. Out of those, 4726 finish in one iteration, 144 in two and 10 in 3.
Overall 31979 functions are analysed, so we do dataflow only for a bit over
10% of cases. 131123 edges are visited by the solver. I measured no compile
time impact of this.
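A minimal C++ sketch of iterating such a simplified graph to a fixed point follows. It assumes each node carries a bitmask of flags and that merging is a bitwise AND (a flag survives only if every node it depends on also has it). The names `node` and `propagate` are hypothetical, nodes are visited in index order rather than RPO, and the direct/dereferenced distinction on edges is left out.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

using flags_t = std::uint32_t;

struct node {
  flags_t flags;          // current lattice value; a set bit means the flag still holds
  std::vector<int> uses;  // indices of nodes whose flags constrain this one
};

// Repeats passes over all nodes until nothing changes.  Returns the
// number of passes, counting the final pass that detects stability.
int propagate(std::vector<node> &g) {
  int passes = 0;
  bool changed = true;
  while (changed) {
    changed = false;
    ++passes;
    for (node &n : g) {
      flags_t f = n.flags;
      for (int u : n.uses) {
        assert(u >= 0 && u < (int) g.size());
        f &= g[u].flags;  // merge: keep only flags common to all uses
      }
      if (f != n.flags) {
        n.flags = f;
        changed = true;
      }
    }
  }
  return passes;
}
```

Because flags can only be cleared, never set, the loop is guaranteed to terminate; cycles in the graph simply need extra passes.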
gcc/ChangeLog:
* ipa-modref.c (modref_lattice): Add do_dataflow,
changed and propagate_to fields.
(modref_lattice::release): Free propagate_to
(modref_lattice::merge): Do not give up early on unknown
lattice values.
(modref_lattice::merge_deref): Likewise.
(modref_eaf_analysis): Update toplevel comment.
(modref_eaf_analysis::analyze_ssa_name): Record postponed ssa names;
do optimistic dataflow initialization.
(modref_eaf_analysis::merge_with_ssa_name): Build dataflow graph.
(modref_eaf_analysis::propagate): New member function.
(analyze_parms): Update to new API of modref_eaf_analysis.
|
|
|
|
gcc/ChangeLog:
* cgraph.h (cgraph_node::can_be_discarded_p): Do not
return true on functions from other partition.
gcc/lto/ChangeLog:
PR ipa/103070
PR ipa/103058
* lto-partition.c (must_not_rename): Update comment.
(promote_symbol): Set resolution to LDPR_PREVAILING_DEF_IRONLY.
|
|
gcc/fortran/ChangeLog:
PR fortran/102715
* decl.c (add_init_expr_to_sym): Reject rank mismatch between
array and its initializer.
gcc/testsuite/ChangeLog:
PR fortran/102715
* gfortran.dg/pr68019.f90: Adjust error message.
* gfortran.dg/pr102715.f90: New test.
|
|
Tamar's recent patch to teach CSE to perform vector extract exercises
VSX splat more frequently, which exposed a constraint error for the
vsx_splat patterns. The pattern could be created for Power9, but
the "we" constraint only provided alternatives in 64 bit mode. The
instructions are valid in 32 bit mode and SImode is allowed in VSX
registers. This patch updates the constraints from "we" to "wa" to
allow the pattern and fix the failing testcases.
gcc/ChangeLog:
* config/rs6000/vsx.md (vsx_splat_v4si): Change constraints to "wa".
(vsx_splat_v4si_di): Change constraint to "wa".
|
|
The problem here is that we are incorrectly threading 41->20->21 here:
<bb 35> [local count: 56063504182]:
_134 = M.10_120 + 1;
if (_71 <= _134)
goto <bb 19>; [11.00%]
else
goto <bb 41>; [89.00%]
...
...
...
<bb 41> [local count: 49896518755]:
<bb 20> [local count: 56063503181]:
# lb_75 = PHI <_134(41), 1(18)>
_117 = mstep_49 + lb_75;
_118 = _117 + -1;
_119 = mstep_49 + _118;
M.10_120 = MIN_EXPR <_119, _71>;
if (lb_75 > M.10_120)
goto <bb 21>; [11.00%]
else
goto <bb 22>; [89.00%]
First, lb_75 == _134 because of the PHI.
Second, _134 > M.10_120 because of _134 = M.10_120 + 1.
We then assume that lb_75 > M.10_120, but this is incorrect because
M.10_120 was killed along the path.
This incorrect thread causes the miscompilation in 527.cam4_r.
Tested on x86-64 and ppc64le Linux.
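The idea behind the fix can be sketched in C++ as follows: a path-local oracle falls back to the root oracle only for names that were not redefined (killed) along the path. The types `root_oracle` and `path_oracle_sketch` and their members are hypothetical simplifications, not the real classes in value-relation.cc.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <utility>

enum relation { REL_NONE, REL_GT, REL_EQ };
using key = std::pair<std::string, std::string>;

// Relations known to hold on entry to the path.
struct root_oracle {
  std::map<key, relation> rels;
  relation query(const std::string &a, const std::string &b) const {
    auto it = rels.find(key(a, b));
    return it == rels.end() ? REL_NONE : it->second;
  }
};

// Relations valid along one candidate path.
struct path_oracle_sketch {
  const root_oracle &root;
  std::map<key, relation> path_rels;
  std::set<std::string> killed;  // names redefined along the path

  explicit path_oracle_sketch(const root_oracle &r) : root(r) {}

  void record(const std::string &a, const std::string &b, relation r) {
    path_rels[key(a, b)] = r;
  }

  // A new definition of NAME invalidates everything known about it.
  void killing_def(const std::string &name) {
    killed.insert(name);
    for (auto it = path_rels.begin(); it != path_rels.end();) {
      if (it->first.first == name || it->first.second == name)
        it = path_rels.erase(it);
      else
        ++it;
    }
  }

  relation query(const std::string &a, const std::string &b) const {
    auto it = path_rels.find(key(a, b));
    if (it != path_rels.end())
      return it->second;
    // Do not consult the root oracle for names killed on this path.
    if (killed.count(a) || killed.count(b))
      return REL_NONE;
    return root.query(a, b);
  }
};
```

In the example above, the root relation `_134 > M.10_120` stops being visible once `M.10_120` is redefined in bb 20, so the bogus `lb_75 > M.10_120` conclusion can no longer be drawn.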
gcc/ChangeLog:
PR tree-optimization/103061
* value-relation.cc (path_oracle::path_oracle): Initialize
m_killed_defs.
(path_oracle::killing_def): Set m_killed_defs.
(path_oracle::query_relation): Do not look at the root oracle for
killed defs.
* value-relation.h (class path_oracle): Add m_killed_defs.
|
|
AIX does not provide memalign, so the testcases must use
posix_memalign for portability on AIX.
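For reference, a portable aligned allocation with posix_memalign looks roughly like the sketch below. The helper name `aligned_buffer` is hypothetical and is not the code used in tsvc.h.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdlib.h>  // posix_memalign and free on POSIX systems

// Returns SIZE bytes aligned to ALIGNMENT, or nullptr on failure.
// ALIGNMENT must be a power of two and a multiple of sizeof(void *).
void *aligned_buffer(std::size_t alignment, std::size_t size) {
  void *p = nullptr;
  if (posix_memalign(&p, alignment, size) != 0)
    return nullptr;
  return p;  // release with free()
}
```

Unlike memalign, posix_memalign reports failure through its return value and leaves errno alone, so the result has to be checked explicitly.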
gcc/testsuite/ChangeLog:
* gcc.dg/vect/tsvc/tsvc.h (init): Use posix_memalign on AIX.
|
|
The main path discovery function was due for a cleanup. First,
there's a nagging goto and second, my bitmap use was sloppy. Hopefully
this makes the code easier for others to read.
Regstrapped on x86-64 Linux. I also made sure there were no differences
in the number of threads with this patch.
No functional changes.
gcc/ChangeLog:
* tree-ssa-threadbackward.c (back_threader::find_paths_to_names):
Remove gotos and other cleanups.
|
|
|
|
gcc/fortran/ChangeLog:
PR fortran/102817
* expr.c (simplify_parameter_variable): Copy shape of referenced
subobject when simplifying.
gcc/testsuite/ChangeLog:
PR fortran/102817
* gfortran.dg/pr102817.f90: New test.
|
|
gcc/ChangeLog:
PR ipa/103073
* ipa-modref-tree.h (modref_tree::insert): Do nothing for
paradoxical and zero sized accesses.
gcc/testsuite/ChangeLog:
PR ipa/103073
* g++.dg/torture/pr103073.C: New test.
* gcc.dg/tree-ssa/modref-11.c: New test.
|
|
gcc/ChangeLog:
PR ipa/103082
* ipa-modref-tree.h (struct modref_access_node): Avoid left shift
of negative value
|
|
gcc/fortran/ChangeLog:
PR fortran/69419
* match.c (gfc_match_common): Check array spec of a symbol in a
COMMON object list and reject it if it is a coarray.
gcc/testsuite/ChangeLog:
PR fortran/69419
* gfortran.dg/pr69419.f90: New test.
|
|
These declarations should be noexcept after I added it to the
definitions in <valarray>.
libstdc++-v3/ChangeLog:
* include/bits/range_access.h (begin(valarray), end(valarray)):
Add noexcept.
|
|
libstdc++-v3/ChangeLog:
* include/std/tuple (tuple_size_v): Fix pack expansion.
|
|
gcc/fortran/ChangeLog:
PR fortran/100972
* decl.c (gfc_match_implicit_none): Fix typo in warning.
* resolve.c (resolve_unknown_f): Reject external procedures
without explicit EXTERNAL attribute when IMPLICIT none (external)
is in effect.
gcc/testsuite/ChangeLog:
PR fortran/100972
* gfortran.dg/implicit_14.f90: Adjust error.
* gfortran.dg/external_implicit_none_3.f08: New test.
|
|
gcc/fortran/ChangeLog:
* decl.c (gfc_insert_kind_parameter_exprs): Make static.
* expr.c (gfc_build_init_expr): Make static
(gfc_build_default_init_expr): Move below its static helper.
* gfortran.h (gfc_insert_kind_parameter_exprs, gfc_add_saved_common,
gfc_add_common, gfc_use_derived_tree, gfc_free_charlen,
gfc_get_ultimate_derived_super_type,
gfc_resolve_oacc_parallel_loop_blocks, gfc_build_init_expr,
gfc_iso_c_sub_interface): Delete.
* symbol.c (gfc_new_charlen, gfc_get_derived_super_type): Make
static.
|
|
2021-11-05 Sandra Loosemore <sandra@codesourcery.com>
PR fortran/35276
gcc/fortran/
* gfortran.texi (Mixed-Language Programming): Talk about C++,
and how to link.
|
|
Currently all the tsvc tests fail to build on Darwin because
they assume that <malloc.h> and memalign() are available.
For Darwin, <stdlib.h> is sufficient to obtain the declarations
for malloc and the port has posix_memalign () but not memalign.
Fixed as below.
Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
gcc/testsuite/ChangeLog:
* gcc.dg/vect/tsvc/tsvc.h: Do not try to include malloc.h
on Darwin; also use posix_memalign ().
|
|
For aarch64, the alignment of the LTRAMPn symbols matters.
Actually, the LTRAMPn symbols _are_ 8 byte aligned, but because
they are Local, the linker doesn't know that this guarantee can be met.
It assumes that they are not necessarily more aligned than the
containing section (ld64 atoms strike again).
The fix is to publish the trampoline symbol for the linker to access
directly - it can then see that the atom is suitably aligned.
Fixes issue #11 on the development branch.
Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
gcc/ChangeLog:
* config/darwin.h (ASM_GENERATE_INTERNAL_LABEL): Add LTRAMP
to the list of symbol prefixes that must be made linker-
visible.
|
|
This will allow someone (with an existing Ada compiler on the
platform - which can be provided by the experimental aarch64-darwin
branch) - to build the host tools (gnatmake and friends) for a
non-native cross.
The existing provisions for iOS are OK for cross-compilation from
an x86-64-darwin platform, but we need some adjustments so that these
host tools can be built to run on aarch64-darwin.
Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
gcc/ada/
* gcc-interface/Make-lang.in: Use iOS signal trampoline code
for hosted Ada tools.
* sigtramp-ios.c: Wrap the declarations in extern "C" when
the code is built by a C++ compiler.
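The usual pattern for making C declarations safe under a C++ compiler is shown below. The function `example_sigtramp_hook` is a hypothetical stand-in; the real declarations in sigtramp-ios.c are not reproduced here.

```cpp
#include <cassert>

#ifdef __cplusplus
extern "C" {
#endif

/* Hypothetical declaration; with the wrapper it keeps an unmangled
   C symbol name even when this file is compiled as C++.  */
int example_sigtramp_hook (int signo);

#ifdef __cplusplus
}
#endif

/* The definition inherits C linkage from the declaration above.  */
int example_sigtramp_hook (int signo)
{
  return signo;
}
```

Without the wrapper, a C++ build would emit a mangled symbol and callers expecting the C name would fail to link.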
|
|
At present, there is no special action needed for aarch64-darwin;
this just pulls in generic Darwin code.
Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
gcc/ChangeLog:
* config.host: Add support for aarch64-*-darwin.
* config/aarch64/host-aarch64-darwin.c: New file.
* config/aarch64/x-darwin: New file.
|
|
We have a shim crt for Darwin10 that implements functionality
missing in libSystem. Provide this with a prototype to silence the
warning about this.
libgcc/ChangeLog:
* config/darwin10-unwind-find-enc-func.c: Include libgcc_tm.h.
* config/i386/darwin-lib.h: Declare Darwin10 crt function.
Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
|
|
libstdc++-v3/ChangeLog:
* src/c++11/random.cc (__x86_rdrand, __x86_rdseed): Add
[[unlikely]] attribute.
|
|
The ISA-3.0 instruction set includes DARN ("deliver a random number")
which can be used similarly to the existing support for RDRAND and RDSEED.
libstdc++-v3/ChangeLog:
* src/c++11/random.cc [__powerpc__] (USE_DARN): Define.
(__ppc_darn): New function to use POWER9 DARN instruction.
(Which): Add 'darn' enumerator.
(which_source): Check for __ppc_darn.
(random_device::_M_init): Support "darn" and "hw" tokens.
(random_device::_M_getentropy): Add darn to switch.
* testsuite/26_numerics/random/random_device/cons/token.cc:
Check "darn" token.
* testsuite/26_numerics/random/random_device/entropy.cc:
Likewise.
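From the user's side, the new token would be requested through the std::random_device constructor. The sketch below assumes that an implementation or target without DARN support rejects the "darn" token by throwing, and falls back to the default source in that case; the helper name `draw_random` is hypothetical.

```cpp
#include <cassert>
#include <exception>
#include <random>
#include <string>

// Draws one value, preferring the "darn" source when it is accepted.
// USED_DARN reports which source was actually used.
unsigned int draw_random(bool &used_darn) {
  try {
    std::random_device rd("darn");  // accepted only where DARN is available
    used_darn = true;
    return rd();
  } catch (const std::exception &) {
    used_darn = false;
    std::random_device rd;  // implementation-defined default source
    return rd();
  }
}
```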
|
|
|
|
|
|
|
|
When the IL has changed, any new ssa-names import calculations may not jive
with existing ssa-names, so just remove the assert.
gcc/
PR tree-optimization/103093
* gimple-range-gori.cc (range_def_chain::get_imports): Remove assert.
gcc/testsuite/
* gcc.dg/pr103093.c: New.
|
|
Make it more efficient by removing the call to vec::contains.
PR tree-optimization/102943
* gimple-range-cache.cc (class update_list): New.
(update_list::add): Replace add_to_update.
(update_list::pop): New.
(ranger_cache::ranger_cache): Adjust.
(ranger_cache::~ranger_cache): Adjust.
(ranger_cache::add_to_update): Delete.
(ranger_cache::propagate_cache): Adjust to new class.
(ranger_cache::propagate_updated_value): Ditto.
(ranger_cache::fill_block_cache): Ditto.
* gimple-range-cache.h (class ranger_cache): Adjust to update class.
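The log does not say how update_list avoids the linear membership scan. One common approach, sketched below under that assumption, pairs the worklist with a per-block "already queued" bit so that add is constant time. The class `update_list_sketch` and its members are hypothetical and are not the GCC implementation.

```cpp
#include <cassert>
#include <vector>

class update_list_sketch {
  std::vector<int> m_stack;    // pending block indices
  std::vector<bool> m_queued;  // m_queued[i] is true while block i is pending

public:
  explicit update_list_sketch(int n_blocks) : m_queued(n_blocks, false) {}

  // Queue BB unless it is already pending; no scan of m_stack needed.
  void add(int bb) {
    assert(bb >= 0 && bb < (int) m_queued.size());
    if (m_queued[bb])
      return;
    m_queued[bb] = true;
    m_stack.push_back(bb);
  }

  bool empty_p() const { return m_stack.empty(); }

  // Remove and return the most recently added pending block.
  int pop() {
    assert(!m_stack.empty());
    int bb = m_stack.back();
    m_stack.pop_back();
    m_queued[bb] = false;
    return bb;
  }
};
```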
|
|
I forgot to commit the changes made in response to Richard's review
before committing.
2021-11-05 Richard Biener <rguenther@suse.de>
* tree-vect-loop.c (vect_analyze_loop): Remove obsolete
comment and expand on another one. Combine nested if.
|
|
This change implements TI mode on PA64. Various new patterns are
added to pa.md. The libgcc build needed modification to build both
DI and TI routines. We also need various softfp routines to
convert to and from TImode.
I added full softfp for the -msoft-float option. At the moment,
this doesn't completely eliminate all use of the floating-point
co-processor. For this, libgcc needs to be built with -msoft-mult.
The floating-point exception support also needs a soft option.
2021-11-05 John David Anglin <danglin@gcc.gnu.org>
PR libgomp/96661
gcc/ChangeLog:
* config/pa/pa-modes.def: Add OImode integer type.
* config/pa/pa.c (pa_scalar_mode_supported_p): Allow TImode
for TARGET_64BIT.
* config/pa/pa.h (MIN_UNITS_PER_WORD): Define to
UNITS_PER_WORD if IN_LIBGCC2.
* config/pa/pa.md (addti3, addvti3, subti3, subvti3, negti2,
negvti2, ashlti3, shrpd_internal): New patterns.
Change some multi instruction types to multi.
libgcc/ChangeLog:
* config.host (hppa*64*-*-linux*): Revise tmake_file.
(hppa*64*-*-hpux11*): Likewise.
* config/pa/sfp-exceptions.c: New.
* config/pa/sfp-machine.h: New.
* config/pa/t-dimode: New.
* config/pa/t-softfp-sfdftf: New.
|
|
> Several older compilers fail to build modern GCC because of missing
> or incomplete C++11 support.
>
> * config/i386/i386.h (struct stringop_algs): Define a CTOR for
> this type.
Unfortunately, as mentioned in my
https://gcc.gnu.org/pipermail/gcc-patches/2021-November/583289.html
mail, without the new dyninit pass this causes dynamic initialization of
many variables, 6.5KB _GLOBAL__sub_I_* on x86_64 and 12.5KB on i686.
The following patch makes the ctor constexpr so that already the FE
is able to statically initialize all those.
I have tested on godbolt a reduced testcase without a constructor,
with constructor and with constexpr constructor.
clang before 3.3 is unhappy about all the 3 cases, clang 3.3 and 3.4
is ok with ctor and ctor with constexpr and optimizes it into static
initialization, clang 3.5+ is ok with all 3 versions and optimizes,
gcc 4.8 and 5+ is ok with all 3 versions and no ctor and ctor with constexpr
is optimized, gcc 4.9 is unhappy about the no ctor case and happy with the
other two.
2021-11-05 Jakub Jelinek <jakub@redhat.com>
PR bootstrap/100246
* config/i386/i386.h
(stringop_algs::stringop_strategy::stringop_strategy): Make the ctor
constexpr.
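The effect of a constexpr constructor on initialization can be illustrated with a reduced C++ sketch. The types `strategy_sketch` and `algs_sketch` and the values used are hypothetical; they only loosely mirror the shape of stringop_algs.

```cpp
#include <cassert>

// Hypothetical element type with a user-provided constructor.  Making
// the constructor constexpr lets objects be constant-initialized.
struct strategy_sketch {
  int max;
  int alg;
  bool noalign;
  constexpr strategy_sketch(int m, int a, bool n)
    : max(m), alg(a), noalign(n) {}
};

struct algs_sketch {
  int unknown_size;
  strategy_sketch size[2];
};

// Declaring the table constexpr requires static initialization, so no
// dynamic initializer is needed for it at program startup.
constexpr algs_sketch example_cost = {
  1, { strategy_sketch(16, 2, false), strategy_sketch(-1, 3, true) }
};

static_assert(example_cost.size[0].max == 16,
              "table is usable in constant expressions");
```

If the constructor were not constexpr, the same table would have to be a plain const object and the compiler would be free to initialize it dynamically.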
|
|
contrib/ChangeLog:
* testsuite-management/validate_failures.py: 2to3
|
|
The stack protector implementation hides symbols in a const unspec, which means
movdi/movsi patterns must always support const on symbol operands and
explicitly strip away the unspec. Do this for the recently added GOT
alternatives. Add a test to ensure stack-protector tests GOT accesses as well.
2021-11-05 Wilco Dijkstra <wdijkstr@arm.com>
PR target/103085
* config/aarch64/aarch64.c (aarch64_mov_operand_p): Strip the salt
first.
* config/aarch64/constraints.md: Support const in Usw.
gcc/testsuite/
PR target/103085
* gcc.target/aarch64/pr103085.c: New test
|
|
This fixes D language build on hppa64-hpux11.
2021-11-05 John David Anglin <danglin@gcc.gnu.org>
gcc/ChangeLog:
* config/pa/pa.h (PREFERRED_DEBUGGING_TYPE): Define to DWARF2_DEBUG.
* config/pa/pa64-hpux.h (PREFERRED_DEBUGGING_TYPE): Remove define.
|
|
PR gcov-profile/102945
gcc/testsuite/ChangeLog:
* gcc.dg/gcov-info-to-gcda.c: Filter supported targets.
|
|
As discussed this splits the analysis loop into two, first settling
on a vector mode used for the main loop and only then analyzing
the epilogue of that for possible vectorization. That makes it
easier to put in support for unrolled main loops.
On the way I've realized some cleanup opportunities, namely caching
n_stmts in vec_info_shared (it's computed by dataref analysis)
to avoid passing it around, and no longer setting/clearing loop->aux
during analysis - try_vectorize_loop_1 will ultimately set it
on those we vectorize.
This also gets rid of the previously introduced callback in
vect_analyze_loop_1 in favor of making that advance the mode iterator.
I'm now pushing VOIDmode explicitly into the vector_modes array
which makes the re-start on the epilogue side a bit more
straight-forward. Note that will now use auto-detection of the
vector mode in case the main loop used it and we want to try
LOOP_VINFO_EPIL_USING_PARTIAL_VECTORS_P and the first mode from
the target array if not. I've added a comment that says we may
want to make sure we don't try vectorizing the epilogue with a
bigger vector size than the main loop but the situation isn't
very likely to appear in practice I guess (and it was also present
before this change).
In principle this change should not change vectorization decisions
but the way we handled re-analyzing epilogues as main loops makes
me only 99% sure that it does.
2021-11-05 Richard Biener <rguenther@suse.de>
* tree-vectorizer.h (vec_info_shared::n_stmts): Add.
(LOOP_VINFO_N_STMTS): Likewise.
(vec_info_for_bb): Remove unused function.
* tree-vectorizer.c (vec_info_shared::vec_info_shared):
Initialize n_stmts member.
* tree-vect-loop.c: Remove INCLUDE_FUNCTIONAL.
(vect_create_loop_vinfo): Do not set loop->aux.
(vect_analyze_loop_2): Do not get n_stmts as argument,
instead use LOOP_VINFO_N_STMTS. Set LOOP_VINFO_VECTORIZABLE_P
here.
(vect_analyze_loop_1): Remove callback, get the mode iterator
and autodetected_vector_mode as argument, advancing the
iterator and initializing autodetected_vector_mode here.
(vect_analyze_loop): Split analysis loop into two, first
processing main loops only and then epilogues.
|
|
The check this patch removes is left over from times when ancestor
jump functions were only used for devirtualization and also
contained BINFOs. It is not necessary now and should have been
removed a long time ago.
gcc/ChangeLog:
2021-11-04 Martin Jambor <mjambor@suse.cz>
* ipa-prop.c (compute_complex_assign_jump_func): Remove
unnecessary check for RECORD_TYPE.
|
|
For some reason the type printer for std::string doesn't work in C++20
mode, so std::basic_string<char, char_traits<char>, allocator<char> > is
printed out in full rather than being shown as std::string. It's
probably related to the fact that the extern template declarations are
disabled for C++20, but I don't know why that affects GDB.
For now I'm just marking the relevant tests as XFAIL. That requires
adding support for target selectors to individual GDB directives such as
note-test and whatis-regexp-test.
libstdc++-v3/ChangeLog:
* testsuite/lib/gdb-test.exp: Add target selector support to the
dg-final directives.
* testsuite/libstdc++-prettyprinters/80276.cc: Add xfail for
C++20.
* testsuite/libstdc++-prettyprinters/libfundts.cc: Likewise.
* testsuite/libstdc++-prettyprinters/prettyprinters.exp: Tweak
comment.
|
|
This came up in the context of libsanitizer, where platform-specific
support for FreeBSD relies on aspects provided by FreeBSD's own md5.h.
Address this by allowing GCC's md5.h to pull in the system header
instead, controlled by a new macro USE_SYSTEM_MD5.
2021-11-05 Gerald Pfeifer <gerald@pfeifer.com>
Jakub Jelinek <jakub@redhat.com>
include/
* md5.h (USE_SYSTEM_MD5): Introduce.
|
|
Commit 431d26e1dd18c1146d3d4dcd3b45a3b04f7f7d59 removed
doc/install-old.texi, alas we still tried to generate the
associated web page old.html - which then turned out empty.
Simply remove this from the list of pages to be generated.
gcc:
* doc/install.texi2html: Do not generate old.html any longer.
|
|
PR debug/102955
gcc/ChangeLog:
* opts.c (finish_options): Reset flag_gtoggle when it is used.
gcc/testsuite/ChangeLog:
* g++.dg/pr102955.C: New test.
|
|
My last change to CONST_WIDE_INT handling in add_const_value_attribute broke
handling of CONST_WIDE_INT constants like ((__uint128_t) 1 << 120).
wi::min_precision (w1, UNSIGNED) is in that case 121, but wide_int::from
creates a wide_int that has 0 and 0xff00000000000000ULL in its elts and
precision 121. When we output that, we output both elements and thus emit
0, 0xff00000000000000 instead of the desired 0, 0x0100000000000000.
IMHO we should actually pass machine_mode to add_const_value_attribute from
callers, so that we know exactly what precision we want. Because
hypothetically, if say mode is OImode and the CONST_WIDE_INT value fits into
128 bits or 192 bits, we'd emit just those 128 or 192 bits but debug info
users would expect 256 bits.
On
typedef unsigned __int128 U;
int
main ()
{
U a = (U) 1 << 120;
U b = 0xffffffffffffffffULL;
U c = ((U) 0xffffffff00000000ULL) << 64;
return 0;
}
vanilla gcc incorrectly emits 0, 0xff00000000000000 for a,
0xffffffffffffffff alone (DW_FORM_data8) for b and 0, 0xffffffff00000000
for c. gcc with the previously posted PR103046 patch emits
0, 0x0100000000000000 for a, 0xffffffffffffffff alone for b and
0, 0xffffffff00000000 for c. And with this patch we emit
0, 0x0100000000000000 for a, 0xffffffffffffffff, 0 for b and
0, 0xffffffff00000000 for c.
So, the patch below certainly causes larger debug info (well, 128-bit
integers are pretty rare), but in this case the question is if it isn't
more correct, as debug info consumers generally will not know if they
should sign or zero extend the value in DW_AT_const_value.
The previous code assumes they will always zero extend it...
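The 0xff00000000000000 versus 0x0100000000000000 difference for `a` can be reproduced with plain integer arithmetic. The sketch below is not the dwarf2out.c code: `elts`, `split_zero_extended` and `split_sign_extended` are hypothetical helpers that mimic splitting a 128-bit value into two 64-bit elements, once at full width and once as if the value had only 121 bits of precision with the top element sign-extended. It relies on the GCC/Clang `unsigned __int128` extension and on arithmetic right shift of negative values.

```cpp
#include <cassert>
#include <cstdint>

typedef unsigned __int128 U;  // GCC/Clang extension, as in the testcase

struct elts {
  std::uint64_t lo, hi;
};

// Split V into two 64-bit elements at full 128-bit precision.
elts split_zero_extended(U v) {
  return { (std::uint64_t) v, (std::uint64_t) (v >> 64) };
}

// Split V as if it had PRECISION bits (65..128), sign-extending the
// top element from bit PRECISION - 1.
elts split_sign_extended(U v, unsigned precision) {
  assert(precision > 64 && precision <= 128);
  elts e = split_zero_extended(v);
  unsigned unused = 128 - precision;  // bits above the sign bit
  // Move the sign bit to the top, then shift back arithmetically.
  e.hi = (std::uint64_t) ((std::int64_t) (e.hi << unused) >> unused);
  return e;
}
```

With `(U) 1 << 120` and precision 121, bit 120 acts as the sign bit, which is how the upper element ends up as 0xff00000000000000 instead of 0x0100000000000000.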
2021-11-05 Jakub Jelinek <jakub@redhat.com>
PR debug/103046
* dwarf2out.c (add_const_value_attribute): Add MODE argument, use it
in CONST_WIDE_INT handling. Adjust recursive calls.
(add_location_or_const_value_attribute): Pass DECL_MODE (decl) to
new add_const_value_attribute argument.
(tree_add_const_value_attribute): Pass TYPE_MODE (type) to new
add_const_value_attribute argument.
|
|
The macro TARGET_VXWORKS7 is always defined (see vxworks-dummy.h).
Thus we need to test its value, not its definedness.
Fixes aca124df (define NO_DOT_IN_LABEL only in vxworks6).
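The difference between testing definedness and testing value is easy to show with a macro that is always defined. In the sketch below `EXAMPLE_TARGET_FLAG` is a hypothetical macro standing in for TARGET_VXWORKS7, and defining it to 0 is an assumption made for illustration.

```cpp
#include <cassert>

// Hypothetical macro that is always defined, here with the value 0.
#define EXAMPLE_TARGET_FLAG 0

// Testing definedness: true even though the value is 0.
#ifdef EXAMPLE_TARGET_FLAG
constexpr bool by_definedness = true;
#else
constexpr bool by_definedness = false;
#endif

// Testing the value: false, which is the intended answer.
#if EXAMPLE_TARGET_FLAG
constexpr bool by_value = true;
#else
constexpr bool by_value = false;
#endif
```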
gcc/ChangeLog:
* config/vx-common.h: Test value of TARGET_VXWORKS7 rather
than definedness.
|
|
This refactors the main loop analysis part in vect_analyze_loop,
re-purposing the existing vect_reanalyze_as_main_loop for this
to reduce code duplication. Failure flow is a bit tricky since
we want to extract info from the analyzed loop but I wanted to
share the destruction part. Thus I add some std::function and
lambda to funnel post-analysis for the case we want that
(when analyzing from the main iteration but not when re-analyzing
an epilogue as main).
In addition I split vect_analyze_loop_form into analysis and
vinfo creation so we can do the analysis only once, simplifying
the new vect_analyze_loop_1.
As discussed we probably want to change the loop over vector
modes to first only analyze things as the main loop, picking
the best (or simd VF) mode for the main loop and then analyze
for a vectorized epilogue. The unroll would then integrate
with the main loop vectorization. I think that currently
we may fail to analyze the epilogue with the same mode as
the main loop when using partial vectors since we increment
mode_i before doing that.
2021-11-04 Richard Biener <rguenther@suse.de>
* tree-vectorizer.h (struct vect_loop_form_info): New.
(vect_analyze_loop_form): Adjust.
(vect_create_loop_vinfo): New.
* tree-parloops.c (gather_scalar_reductions): Adjust for
vect_analyze_loop_form API change.
* tree-vect-loop.c: Include <functional>.
(vect_analyze_loop_form_1): Rename to vect_analyze_loop_form,
take struct vect_loop_form_info as output parameter and adjust.
(vect_analyze_loop_form): Rename to vect_create_loop_vinfo and
split out call to the original vect_analyze_loop_form_1.
(vect_reanalyze_as_main_loop): Rename to...
(vect_analyze_loop_1): ... this, factor out the call to
vect_analyze_loop_form and generalize to be able to use it twice ...
(vect_analyze_loop): ... here. Perform vect_analyze_loop_form
once only and here.
|
|
gcc/ChangeLog:
2021-11-05 Xionghu Luo <luoxhu@linux.ibm.com>
PR target/102991
* config/rs6000/fusion.md: Regenerate.
* config/rs6000/genfusion.pl: Fix incorrect clobber constraint.
|