Age | Commit message (Collapse) | Author | Files | Lines |
|
Hi,
As reported, the recently added pr96139 tests will fail on older targets
because the tests are missing the appropriate -mvsx or -maltivec options.
This adds the options and clarifies the dg-require statements.
The pr96139-c.c test needs -maltivec to work, but does not actually use
vectors, so does not require -mvsx like the others.
Sniff-regtested OK when specifying older targets on a power7 host.
--target_board=unix/'{-mcpu=power4,-mcpu=power5,-mcpu=power6,-mcpu=power7,
-mcpu=power8,-mcpu=power9}''{-m64,-m32}'"
gcc/testsuite/ChangeLog:
* gcc.target/powerpc/pr96139-a.c: Specify -mvsx option and update the
dg-require stanza to match.
* gcc.target/powerpc/pr96139-b.c: Same.
* gcc.target/powerpc/pr96139-c.c: Specify -maltivec option and update
the dg-require stanza to match.
|
|
These tests are written for 256 bit vector. For -march=cascadelake,
vector size changed to 512 bit. It doubles the number of fma
instruction and test fail. Fix is to explicitly disable 512 bit
vector by passing additional option -mno-avx512f.
Tested on x86-64.
gcc/testsuite/ChangeLog:
PR target/97018
* gcc.target/i386/l_fma_double_1.c: Add option -mno-avx512f.
* gcc.target/i386/l_fma_double_2.c: Likewise.
* gcc.target/i386/l_fma_double_3.c: Likewise.
* gcc.target/i386/l_fma_double_4.c: Likewise.
* gcc.target/i386/l_fma_double_5.c: Likewise.
* gcc.target/i386/l_fma_double_6.c: Likewise.
* gcc.target/i386/l_fma_float_1.c: Likewise.
* gcc.target/i386/l_fma_float_2.c: Likewise.
* gcc.target/i386/l_fma_float_3.c: Likewise.
* gcc.target/i386/l_fma_float_4.c: Likewise.
* gcc.target/i386/l_fma_float_5.c: Likewise.
* gcc.target/i386/l_fma_float_6.c: Likewise.
|
|
Resolves:
PR middle-end/96903 - bogus warning on memcpy at negative offset from array end
gcc/ChangeLog:
PR middle-end/96903
* builtins.c (compute_objsize): Remove incorrect offset adjustment.
(compute_objsize): Adjust offset range here instead.
gcc/testsuite/ChangeLog:
PR middle-end/96903
* gcc.dg/Wstringop-overflow-42.c:: Add comment.
* gcc.dg/Wstringop-overflow-43.c: New test.
|
|
Syntax errors in method definition lists could leave us in a function
scope. My recent change for block scope externs didn't like that.
This reimplements the parsing loop to finish the method definition we
started. AFAICT the original code was attempting to provide some
error recovery. Also while there, simply do the token peeking at the
top of the loop, rather than at the two(!) ends.
gcc/cp/
* parser.c (cp_parser_objc_method_definition_list): Reimplement
loop, make sure we pop scope.
gcc/testsuite/
* obj-c++.dg/syntax-error-9.mm: Adjust expected errors.
|
|
Since we now have DECL_DECLARED_CONSTINIT_P, we no longer need
LOOKUP_CONSTINIT.
gcc/cp/ChangeLog:
* cp-tree.h (LOOKUP_CONSTINIT): Remove.
(LOOKUP_REWRITTEN): Adjust.
* decl.c (duplicate_decls): Set DECL_DECLARED_CONSTINIT_P.
(check_initializer): Use DECL_DECLARED_CONSTINIT_P instead of
LOOKUP_CONSTINIT.
(cp_finish_decl): Don't set DECL_DECLARED_CONSTINIT_P. Use
DECL_DECLARED_CONSTINIT_P instead of LOOKUP_CONSTINIT.
(grokdeclarator): Set DECL_DECLARED_CONSTINIT_P.
* decl2.c (grokfield): Don't handle LOOKUP_CONSTINIT.
* parser.c (cp_parser_decomposition_declaration): Remove
LOOKUP_CONSTINIT handling.
(cp_parser_init_declarator): Likewise.
* pt.c (tsubst_expr): Likewise.
(instantiate_decl): Likewise.
* typeck2.c (store_init_value): Use DECL_DECLARED_CONSTINIT_P instead
of LOOKUP_CONSTINIT.
|
|
The previous re-org made the cost of SLP vector stmts in loop
vectorization ignored. The following rectifies this mistake.
2020-09-11 Richard Biener <rguenther@suse.de>
PR tree-optimization/97020
* tree-vect-slp.c (vect_slp_analyze_operations): Apply
SLP costs when doing loop vectorization.
|
|
This avoids an ICE on amdgcn.
gcc/testsuite/ChangeLog:
* gcc.dg/gimplefe-44.c: Require exceptions.
|
|
gcc/jit/ChangeLog
2020-08-01 Andrea Corallo <andrea.corallo@arm.com>
* docs/topics/compatibility.rst (LIBGCCJIT_ABI_14): New ABI tag.
* docs/topics/expressions.rst (gcc_jit_global_set_initializer):
Document new entry point in section 'Global variables'.
* jit-playback.c (global_new_decl, global_finalize_lvalue): New
method.
(playback::context::new_global): Make use of global_new_decl,
global_finalize_lvalue.
(load_blob_in_ctor): New template function in use by the
following.
(playback::context::new_global_initialized): New method.
* jit-playback.h (class context): Decl 'new_global_initialized',
'global_new_decl', 'global_finalize_lvalue'.
(lvalue::set_initializer): Add implementation.
* jit-recording.c (recording::memento_of_get_pointer::get_size)
(recording::memento_of_get_type::get_size): Add implementation.
(recording::global::write_initializer_reproducer): New function in
use by 'recording::global::write_reproducer'.
(recording::global::replay_into)
(recording::global::write_to_dump)
(recording::global::write_reproducer): Handle
initialized case.
* jit-recording.h (class type): Decl 'get_size' and
'num_elements'.
* libgccjit++.h (class lvalue): Declare new 'set_initializer'
method.
(class lvalue): Decl 'is_global' and 'set_initializer'.
(class global) Decl 'write_initializer_reproducer'. Add
'm_initializer', 'm_initializer_num_bytes' fields. Implement
'set_initializer'. Add a destructor to free 'm_initializer'.
* libgccjit.c (gcc_jit_global_set_initializer): New function.
* libgccjit.h (gcc_jit_global_set_initializer): New function
declaration.
* libgccjit.map (LIBGCCJIT_ABI_14): New ABI tag.
gcc/testsuite/ChangeLog
2020-08-01 Andrea Corallo <andrea.corallo@arm.com>
* jit.dg/all-non-failing-tests.h: Add test-blob.c.
* jit.dg/test-global-set-initializer.c: New testcase.
|
|
Add nvptx support to libatomic.
Given that atomic_test_and_set is not implemented for nvptx (PR96964), the
compiler translates __atomic_test_and_set falling back onto the "Failing all
else, assume a single threaded environment and simply perform the operation"
case in expand_atomic_test_and_set, so it doesn't map onto an actual atomic
operation.
Still, that counts as supported for the configure test of libatomic, so we
end up with HAVE_ATOMIC_TAS_1/2/4/8/16 == 1, and the corresponding
__atomic_test_and_set_1/2/4/8/16 in libatomic all using that non-atomic
implementation.
Fix this by adding an atomic_test_and_set expansion for nvptx, that uses
libatomics __atomic_test_and_set_1.
This again makes the configure tests for HAVE_ATOMIC_TAS_1/2/4/8/16 fail, so
instead we use this case in tas_n.c:
...
/* If this type is smaller than word-sized, fall back to a word-sized
compare-and-swap loop. */
bool
SIZE(libat_test_and_set) (UTYPE *mptr, int smodel)
...
which for __atomic_test_and_set_8 uses INVERT_MASK_8.
Add INVERT_MASK_8 in libatomic_i.h, as well as MASK_8.
Tested libatomic testsuite on nvptx.
gcc/ChangeLog:
PR target/96964
* config/nvptx/nvptx.md (define_expand "atomic_test_and_set"): New
expansion.
libatomic/ChangeLog:
PR target/96898
* configure.tgt: Add nvptx.
* libatomic_i.h (MASK_8, INVERT_MASK_8): New macro definition.
* config/nvptx/host-config.h: New file.
* config/nvptx/lock.c: New file.
|
|
This prevents execution failures caused by partially overlapping input and
output registers. This is the same solution already used for DImode.
gcc/ChangeLog:
* config/gcn/gcn.c (gcn_hard_regno_mode_ok): Align TImode registers.
* config/gcn/gcn.md: Assert that TImode registers do not early clobber.
|
|
This tries to improve BB vectorization dumps by providing more
precise locations. Currently the vect_location is simply the
very last stmt in a basic-block that has a location. So for
double a[4], b[4];
int x[4], y[4];
void foo()
{
a[0] = b[0]; // line 5
a[1] = b[1];
a[2] = b[2];
a[3] = b[3];
x[0] = y[0]; // line 9
x[1] = y[1];
x[2] = y[2];
x[3] = y[3];
} // line 13
we show the user with -O3 -fopt-info-vec
t.c:13:1: optimized: basic block part vectorized using 16 byte vectors
while with the patch we point to both independently vectorized
opportunities:
t.c:5:8: optimized: basic block part vectorized using 16 byte vectors
t.c:9:8: optimized: basic block part vectorized using 16 byte vectors
there's the possibility that the location regresses in case the
root stmt in the SLP instance has no location. For a SLP subgraph
with multiple entries the location also chooses one entry at random,
not sure in which case we want to dump both.
Still as the plan is to extend the basic-block vectorization
scope from single basic-block to multiple ones this is a first
step to preserve something sensible.
Implementation-wise this makes both costing and code-generation
happen on the subgraphs as analyzed.
2020-09-11 Richard Biener <rguenther@suse.de>
* tree-vectorizer.h (_slp_instance::location): New method.
(vect_schedule_slp): Adjust prototype.
* tree-vectorizer.c (vec_info::remove_stmt): Adjust
the BB region begin if we removed the stmt it points to.
* tree-vect-loop.c (vect_transform_loop): Adjust.
* tree-vect-slp.c (_slp_instance::location): Implement.
(vect_analyze_slp_instance): For BB vectorization set
vect_location to that of the instance.
(vect_slp_analyze_operations): Likewise.
(vect_bb_vectorization_profitable_p): Remove wrapper.
(vect_slp_analyze_bb_1): Remove cost check here.
(vect_slp_region): Cost check and code generate subgraphs separately,
report optimized locations and missed optimizations due to
profitability for each of them.
(vect_schedule_slp): Get the vector of SLP graph entries to
vectorize as argument.
|
|
This is a regression present on the mainline and 10 branch: the compiler
aborts on code accessing a component of a packed record type whose type
is a packed discriminated record type with variant part.
gcc/ada/ChangeLog:
* gcc-interface/utils.c (type_has_variable_size): New function.
(create_field_decl): In the packed case, also force byte alignment
when the type of the field has variable size.
gcc/testsuite/ChangeLog:
* gnat.dg/pack27.adb: New test.
* gnat.dg/pack27_pkg.ads: New helper.
|
|
This adds a missing stride entry for bit-packed arrays of record types.
gcc/ada/ChangeLog:
* gcc-interface/misc.c (get_array_bit_stride): Return TYPE_ADA_SIZE
for record and union types.
|
|
GDB can now deal with the DWARF representation just fine.
gcc/ada/ChangeLog:
* gcc-interface/misc.c (gnat_get_fixed_point_type): Bail out only
when the GNAT encodings are specifically used.
|
|
This is a regression present on mainline, 10 and 9 branches: the compiler
goes into an infinite recursion eventually exhausting the stack for the
declaration of a discriminated record type with an array component having
a discriminant as bound and an index type that is an enumeration type with
a non-standard representation clause.
gcc/ada/ChangeLog:
* gcc-interface/decl.c (gnat_to_gnu_entity) <E_Array_Subtype>: Only
create extra subtypes for discriminants if the RM size of the base
type of the index type is lower than that of the index type.
gcc/testsuite/ChangeLog:
* gnat.dg/specs/discr7.ads: New test.
|
|
|
|
|
|
|
|
This avoids dumping 'vectorization is not profitable' one more time
if none of the opportunities in a BB is profitable.
2020-09-11 Richard Biener <rguenther@suse.de>
PR tree-optimization/97013
* tree-vect-slp.c (vect_slp_analyze_bb_1): Remove duplicate dumping.
|
|
This fixes random things found when doing SLP discovery from
arbitrary sets of stmts.
2020-09-10 Richard Biener <rguenther@suse.de>
* tree-vect-slp.c (vect_build_slp_tree_1): Check vector
types for all lanes are compatible.
(vect_analyze_slp_instance): Appropriately check for stores.
(vect_schedule_slp): Likewise.
|
|
When nvptx_assemble_value is called with size == 16, this bitshift runs
into UB:
...
val &= ((unsigned HOST_WIDE_INT)2 << (size * BITS_PER_UNIT - 1)) - 1;
...
Fix this by checking the shift amount.
Tested on nvptx.
gcc/ChangeLog:
* config/nvptx/nvptx.c (nvptx_assemble_value): Fix undefined
behaviour.
|
|
For this code:
...
__int128 min_one = -1;
...
we currently generate:
...
.visible .global .align 8 .u64 min_one[2] = { -1, 0 };
...
Fix this in nvptx_assemble_value, such that we have instead:
...
.visible .global .align 8 .u64 min_one[2] = { -1, -1 };
...
gcc/ChangeLog:
* config/nvptx/nvptx.c (nvptx_assemble_value): Handle negative
__int128.
gcc/testsuite/ChangeLog:
* gcc.target/nvptx/int128.c: New test.
|
|
This is a (hopefully temporary) fix to PR96791. This will make
the default be -mno-block-ops-vector-pair even on power10, so we will
not hit the issue of DSE trying to truncate a POImode register. I am
still concerned it will be possible to hit this because the MMA builtins
will also generate POImode stores, but I think any example of that will
be somewhat more contrived.
gcc/ChangeLog:
* config/rs6000/rs6000.c (rs6000_option_override_internal):
Change default.
|
|
Amongst other things PR analyzer/96798 notes that
region_model::on_call_pre treats any builtin that hasn't been coded
yet as a no-op (albeit with an unknown return value), which is wrong
for non-pure builtins.
This patch updates that function's handling of such builtins so that it
instead conservatively assumes that any escaped/reachable regions can
be affected by the call, and implements enough handling of specific
builtins to avoid regressing the testsuite (I hope).
gcc/analyzer/ChangeLog:
PR analyzer/96798
* region-model-impl-calls.cc (region_model::impl_call_memcpy):
New.
(region_model::impl_call_strcpy): New.
* region-model.cc (region_model::on_call_pre): Flag unhandled
builtins that are non-pure as having unknown side-effects.
Implement BUILT_IN_MEMCPY, BUILT_IN_MEMCPY_CHK, BUILT_IN_STRCPY,
BUILT_IN_STRCPY_CHK, BUILT_IN_FPRINTF, BUILT_IN_FPRINTF_UNLOCKED,
BUILT_IN_PUTC, BUILT_IN_PUTC_UNLOCKED, BUILT_IN_FPUTC,
BUILT_IN_FPUTC_UNLOCKED, BUILT_IN_FPUTS, BUILT_IN_FPUTS_UNLOCKED,
BUILT_IN_FWRITE, BUILT_IN_FWRITE_UNLOCKED, BUILT_IN_PRINTF,
BUILT_IN_PRINTF_UNLOCKED, BUILT_IN_PUTCHAR,
BUILT_IN_PUTCHAR_UNLOCKED, BUILT_IN_PUTS, BUILT_IN_PUTS_UNLOCKED,
BUILT_IN_VFPRINTF, BUILT_IN_VPRINTF.
* region-model.h (region_model::impl_call_memcpy): New decl.
(region_model::impl_call_strcpy): New decl.
gcc/testsuite/ChangeLog:
PR analyzer/96798
* gcc.dg/analyzer/memcpy-1.c: New test.
* gcc.dg/analyzer/strcpy-1.c: New test.
|
|
|
|
In doing the other work for adding ISA 3.1 128-bit minimum, maximum, and
conditional move support, I noticed the two functions that process conditional
moves return 'int' instead of 'bool'. This patch changes these functions to
return 'bool'.
gcc/
2020-09-10 Michael Meissner <meissner@linux.ibm.com>
* config/rs6000/rs6000-protos.h (rs6000_emit_cmove): Change return
type to bool.
(rs6000_emit_int_cmove): Change return type to bool.
* config/rs6000/rs6000.c (rs6000_emit_cmove): Change return type
to bool.
(rs6000_emit_int_cmove): Change return type to bool.
|
|
Currently, for this code from c-c++-common/spec-barrier-1.c:
...
__int128 g = 9;
...
we generate:
...
// BEGIN GLOBAL VAR DEF: g
.visible .global .align 8 .u64 g[2] = { 9, 9 };
...
and consequently the test-case fails in execution.
The problem is caused by a shift in nvptx_assemble_value:
...
val >>= part * BITS_PER_UNIT;
...
where the shift amount is equal to the number of bits in val, which is
undefined behaviour.
Fix this by detecting the situation and setting val to 0.
Tested on nvptx.
gcc/ChangeLog:
PR target/97004
* config/nvptx/nvptx.c (nvptx_assemble_value): Handle shift by
number of bits in shift operand.
|
|
When I've tried to backport recent LTO changes of mine, I've ran into
FAIL: g++.dg/ubsan/align-3.C -O2 -flto -fno-use-linker-plugin -flto-partition=none output pattern test
FAIL: g++.dg/ubsan/align-3.C -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects output pattern test
regressions that don't for some reason show up on the trunk.
I've tracked it down to input_location_and_block being called recursively,
first on some UBSAN_NULL ifn call's location which needs to stream a BLOCK
that hasn't been streamed yet, which in turn needs to stream some locations
for the decls in the BLOCK.
Looking at that align-3.C testcase on the trunk, I also see the recursive
lto_output_location_1 calls as well. First on block:
$18 = <block 0x7fffea738480>
which is in the block tree from what I can see just fine (see the end of
mail).
Now, output_function has code that should stream the BLOCK tree leafs:
/* Output DECL_INITIAL for the function, which contains the tree of
lexical scopes. */
stream_write_tree (ob, DECL_INITIAL (function), true);
/* As we do not recurse into BLOCK_SUBBLOCKS but only BLOCK_SUPERCONTEXT
collect block tree leafs and stream those. */
auto_vec<tree> block_tree_leafs;
if (DECL_INITIAL (function))
collect_block_tree_leafs (DECL_INITIAL (function), block_tree_leafs);
streamer_write_uhwi (ob, block_tree_leafs.length ());
for (unsigned i = 0; i < block_tree_leafs.length (); ++i)
stream_write_tree (ob, block_tree_leafs[i], true);
static void
collect_block_tree_leafs (tree root, vec<tree> &leafs)
{
for (root = BLOCK_SUBBLOCKS (root); root; root = BLOCK_CHAIN (root))
if (! BLOCK_SUBBLOCKS (root))
leafs.safe_push (root);
else
collect_block_tree_leafs (BLOCK_SUBBLOCKS (root), leafs);
}
but the problem is that it is broken, it doesn't cover all block leafs,
but only leafs with an odd depth from DECL_INITIAL (and only some of those).
The following patch fixes that, but I guess we are going to stream at that
point significantly more blocks than before (though I guess most of the time
we'd stream them later on when streaming the gimple_locations that refer to
them).
2020-09-10 Jakub Jelinek <jakub@redhat.com>
* lto-streamer-out.c (collect_block_tree_leafs): Recurse on
root rather than BLOCK_SUBBLOCKS (root).
|
|
We need to record whether template function-scopestatic decls are
constinit. That's currently held on the var's TEMPLATE_INFO data.
But I want to get rid of such decl's template header as they're not
really templates, and they're never instantiated separately from their
containing function's definition. (Just like auto vars, which don't
get them for instance).
This patch moves the flag into a spare decl_lang_flag.
gcc/cp/
* cp-tree.h (TINFO_VAR_DECLARED_CONSTINIT): Replace with ...
(DECL_DECLARED_CONSTINIT_P): ... this.
* decl.c (start_decl): No need to retrofit_lang_decl for constinit
flag.
(cp_finish_decl): Use DECL_DECLARED_CONSTINIT_P.
* pt.c (tsubst_decl): No need to handle constinit flag
propagation.
(tsubst_expr): Or here.
|
|
This adds support for Arm's Cortex-R82 CPU to GCC. For more information about
this CPU, see [0].
[0] : https://developer.arm.com/ip-products/processors/cortex-r/cortex-r82
gcc/ChangeLog:
* config/aarch64/aarch64-cores.def: Add Cortex-R82.
* config/aarch64/aarch64-tune.md: Regenerate.
* doc/invoke.texi: Add entry for Cortex-R82.
|
|
This adds support for Armv8-R AArch64 to GCC. It adds the -march value
armv8-r and sets the ACLE feature macro __ARM_ARCH_PROFILE correctly
when -march is set to armv8-r.
gcc/ChangeLog:
* common/config/aarch64/aarch64-common.c
(aarch64_get_extension_string_for_isa_flags): Don't force +crc for
Armv8-R.
* config/aarch64/aarch64-arches.def: Add entry for Armv8-R.
* config/aarch64/aarch64-c.c (aarch64_define_unconditional_macros): Set
__ARM_ARCH_PROFILE correctly for Armv8-R.
* config/aarch64/aarch64.h (AARCH64_FL_V8_R): New.
(AARCH64_FL_FOR_ARCH8_R): New.
(AARCH64_ISA_V8_R): New.
* doc/invoke.texi: Add Armv8-R to architecture table.
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/acle/armv8-r.c: New test.
|
|
These warnings are handled outside of the D core language front-end, so
shouldn't be enabled by -Wall.
gcc/d/ChangeLog:
* lang.opt (Waddress): Enable warning by -Wextra.
(Wcast-result): Likewise.
(Wunknown-pragmas): Likewise.
|
|
There is no problem with using `T var = void', it is if the variable
remains uninitialized on first use that a warning should be issued.
gcc/d/ChangeLog:
* decl.cc (DeclVisitor::visit (VarDeclaration *)): Don't warn about
variables initialized with 'void'.
|
|
Before, the warning was only issued when casting in the other direction.
Now a warning is printed for both directions.
gcc/d/ChangeLog:
* d-convert.cc (convert_expr): Warn when casting from a D class to a
C++ class.
gcc/testsuite/ChangeLog:
* gdc.dg/Waddress.d: New test.
* gdc.dg/Wcastresult1.d: New test.
* gdc.dg/Wcastresult2.d: New test.
|
|
This is a regression present on the mainline and 10 branch: the compiler
rejects a Value_Size clause on a discriminated record type with variant.
gcc/ada/ChangeLog:
* gcc-interface/decl.c (set_rm_size): Do not take into account the
Value_Size clause if it is not for the entity itself.
gcc/testsuite/ChangeLog:
* gnat.dg/specs/size_clause5.ads: New test.
|
|
This fixes a wrong code issue with nested variant record types: the
compiler generates move instructions that depend on an uninitialized
variable, which was initially a SAVE_EXPR not instantiated early enough.
gcc/ada/ChangeLog:
* gcc-interface/decl.c (build_subst_list): For a definition, make
sure to instantiate the SAVE_EXPRs generated by the elaboration of
the constraints in front of the elaboration of the type itself.
gcc/testsuite/ChangeLog:
* gnat.dg/discr59.adb: New test.
* gnat.dg/discr59_pkg1.ads: New helper.
* gnat.dg/discr59_pkg2.ads: Likewise.
|
|
This is only for internal debugging purposes.
gcc/ada/ChangeLog:
* gcc-interface/misc.c: Include tree-pass.h.
(internal_error_function): Call emergency_dump_function.
|
|
Looking further at arm_override_options_after_change_1, it also seems to be
incorrect, rather than testing
!opts->x_str_align_functions
it should be really testing
!opts_set->x_str_align_functions
and get &global_options_set or similar passed to it as additional opts_set
argument. That is because otherwise the decision will be sticky, while it
should be done whenever use provided -falign-functions but didn't provide
-falign-functions= (either on the command line, or through optimize
attribute or pragma).
Here is a fix for that (incremental change on top of the previous patch).
2020-09-10 Jakub Jelinek <jakub@redhat.com>
* config/arm/arm.c (arm_override_options_after_change_1): Add opts_set
argument, test opts_set->x_str_align_functions rather than
opts->x_str_align_functions.
(arm_override_options_after_change, arm_option_override_internal,
arm_set_current_function): Adjust callers.
|
|
As mentioned in the PR, the testcase fails to link, because when set_cfun is
being called on the crc function, arm_override_options_after_change is
called from set_cfun -> invoke_set_current_function_hook:
/* Change optimization options if needed. */
if (optimization_current_node != opts)
{
optimization_current_node = opts;
cl_optimization_restore (&global_options, TREE_OPTIMIZATION (opts));
}
and at that point target_option_default_node actually matches even the
current state of options, so this means armv7 (or whatever) arch is set as
arm_active_target, then
targetm.set_current_function (fndecl);
is called later in that function, which because the crc function's
DECL_FUNCTION_SPECIFIC_TARGET is different from the current one will do:
cl_target_option_restore (&global_options, TREE_TARGET_OPTION (new_tree));
which calls arm_option_restore and sets arm_active_target to armv8-a+crc
(so far so good).
Later arm_set_current_function calls:
save_restore_target_globals (new_tree);
which in this case calls:
/* Call target_reinit and save the state for TARGET_GLOBALS. */
TREE_TARGET_GLOBALS (new_tree) = save_target_globals_default_opts ();
which because optimization_current_node != optimization_default_node
(the testcase is LTO, so all functions have their
DECL_FUNCTION_SPECIFIC_TARGET and TREE_OPTIMIZATION nodes) will call:
cl_optimization_restore
(&global_options,
TREE_OPTIMIZATION (optimization_default_node));
and
cl_optimization_restore (&global_options,
TREE_OPTIMIZATION (opts));
The problem is that these call arm_override_options_after_change again,
and that one uses the target_option_default_node as what to set the
arm_active_target to (i.e. back to armv7 or whatever, but not to the
armv8-a+crc that should be the active target for the crc function).
That means we then error on the builtin call in that function.
Now, the targetm.override_options_after_change hook is called always at the
end of cl_optimization_restore, i.e. when we change the Optimization marked
generic options. So it seems unnecessary to call arm_configure_build_target
at that point (nothing it depends on changed), and additionally incorrect
(because it uses the target_option_default_node, rather than the current
set of options; we'd need to revert
https://gcc.gnu.org/legacy-ml/gcc-patches/2016-12/msg01390.html
otherwise so that it works again with global_options otherwise).
The options that arm_configure_build_target cares about will change only
during option parsing (which is where it is called already), or during
arm_set_current_function, where it is done during the
cl_target_option_restore.
Now, arm_override_options_after_change_1 wants to adjust the
str_align_functions, which depends on the current Optimization options (e.g.
optimize_size and flag_align_options and str_align_functions) as well as
the target options target_flags, so IMHO needs to be called both
when the Optimization options (possibly) change, i.e. from
the targetm.override_options_after_change hook, and from when the target
options change (set_current_function hook).
2020-09-10 Jakub Jelinek <jakub@redhat.com>
PR target/96939
* config/arm/arm.c (arm_override_options_after_change): Don't call
arm_configure_build_target here.
(arm_set_current_function): Call arm_override_options_after_change_1
at the end.
* gcc.target/arm/lto/pr96939_0.c: New test.
* gcc.target/arm/lto/pr96939_1.c: New file.
|
|
I noticed that some of the VSR<->GPR move instructions are not typed correctly. This patch fixes those instructions so that the scheduler treats them with the correct latency.
2020-09-10 Pat Haugen <pthaugen@linux.ibm.com>
gcc/
* config/rs6000/rs6000.md
(lfiwzx, floatunssi<mode>2_lfiwzx, p8_mtvsrwz, p8_mtvsrd_sf): Fix insn
type.
* config/rs6000/vsx.md
(vsx_concat_<mode>, vsx_splat_<mode>_reg, vsx_splat_v4sf): Likewise.
|
|
Our handling of block-scope extern decls is insufficient for modern
C++, in particular modules, (but also constexprs). We mark such local
function decls, and this patch extends that to marking local var decls
too, so mainly a macro rename. Also, we set this flag earlier, rather
than learning about it when pushing the decl. This is a step towards
handling these properly.
gcc/cp/
* cp-tree.h (DECL_LOCAL_FUNCTION_P): Rename to ...
(DECL_LOCAL_DECL_P): ... here. Accept both fns and vars.
* decl.c (start_decl): Set DECL_LOCAL_DECL_P for local externs.
(omp_declare_variant_finalize_one): Use DECL_LOCAL_DECL_P.
(local_variable_p): Simplify.
* name-lookup.c (set_decl_context_in_fn): Assert DECL_LOCAL_DECL_P
is as expected. Simplify.
(do_pushdecl): Don't set decl_context_in_fn for friends.
(is_local_extern): Simplify.
* call.c (equal_functions): Use DECL_LOCAL_DECL_P.
* parser.c (cp_parser_postfix_expression): Likewise.
(cp_parser_omp_declare_reduction): Likewise.
* pt.c (check_default_tmpl_args): Likewise.
(tsubst_expr): Assert nested reduction function is local.
(type_dependent_expression_p): Use DECL_LOCAL_DECL_P.
* semantics.c (finish_call_expr): Likewise.
libcc1/
* libcp1plugin.cc (plugin_build_call_expr): Use DECL_LOCAL_DECL_P.
|
|
Add a missing require-effect-target alloca directive.
Tested on nvptx.
gcc/testsuite/ChangeLog:
* gcc.dg/analyzer/vla-1.c: Add require-effective-target alloca.
|
|
GCC on Linux already uses liblto_plugin.so directly without
the libtool version suffix, adjust windows GCC to do the same.
gcc/ChangeLog:
* config.host: Adjust plugin name for Windows.
lto-plugin/ChangeLog:
* Makefile.am: drop versioning from libtool completely.
* Makefile.in: regenerate.
|
|
There's an invariant for IFN_UNIQUE, listed here in
gimple_call_initialize_ctrl_altering:
...
/* IFN_UNIQUE should be the last insn, to make checking for it
as cheap as possible. */
|| (gimple_call_internal_p (stmt)
&& gimple_call_internal_unique_p (stmt)))
gimple_call_set_ctrl_altering (stmt, true);
...
Recent commit fab77644842 "tree-optimization/96931 - clear ctrl-altering flag
more aggressively" breaks this invariant, causing an ICE triggered during
libgomp testing for x86_64 with nvptx accelerator:
...
during RTL pass: mach
asyncwait-1.f90: In function ‘MAIN__._omp_fn.0’:
asyncwait-1.f90:19: internal compiler error: in nvptx_find_par, at \
config/nvptx/nvptx.c:3293
...
Fix this by listing IFN_UNIQUE as exception in
cleanup_call_ctrl_altering_flag.
Build for x86_64 with nvptx accelerator, tested libgomp.
gcc/ChangeLog:
PR tree-optimization/97000
* tree-cfgcleanup.c (cleanup_call_ctrl_altering_flag): Don't clear
flag for IFN_UNIQUE.
|
|
and adjust relative paths [PR93865]
If the gcc -c -flto ... commands to compile some or all objects are run in a
different directory (or in different directories) from the directory in
which the gcc -flto link line is invoked, then the .debug_line will be
incorrect if there are any relative filenames, it will use those relative
filenames while .debug_info will contain a different DW_AT_comp_dir.
The following patch streams (at most once after each clear_line_info)
the current working directory (what we record in DW_AT_comp_dir) when
encountering the first relative pathname, and when reading the location info
reads it back and if the current working directory at that point is
different from the saved one, adjusts relative paths by adding a relative
prefix how to go from the current working directory to the previously saved
path (with a fallback e.g. for DOS e:\\foo vs. d:\\bar change to use
absolute directory).
2020-09-10 Jakub Jelinek <jakub@redhat.com>
PR debug/93865
* lto-streamer.h (struct output_block): Add emit_pwd member.
* lto-streamer-out.c: Include toplev.h.
(clear_line_info): Set emit_pwd.
(lto_output_location_1): Encode the ob->current_file != xloc.file
bit directly into the location number. If changing file, emit
additionally a bit whether pwd is emitted and emit it before the
first relative pathname since clear_line_info.
(output_function, output_constructor): Don't call clear_line_info
here.
* lto-streamer-in.c (struct string_pair_map): New type.
(struct string_pair_map_hasher): New type.
(string_pair_map_hasher::hash): New method.
(string_pair_map_hasher::equal): New method.
(path_name_pair_hash_table, string_pair_map_allocator): New variables.
(relative_path_prefix, canon_relative_path_prefix,
canon_relative_file_name): New functions.
(canon_file_name): Add relative_prefix argument, if non-NULL
and string is a relative path, return canon_relative_file_name.
(lto_location_cache::input_location_and_block): Decode file change
bit from the location number. If changing file, unpack bit whether
pwd is streamed and stream in pwd. Adjust canon_file_name caller.
(lto_free_file_name_hash): Delete path_name_pair_hash_table
and string_pair_map_allocator.
|
|
This makes the BB vectorizer cost independent SLP subgraphs
separately. While on pristine trunk and for x86_64 I failed to
distill a testcase where the vectorizer would think _any_
basic-block vectorization opportunity is not profitable I do
have pending work that would make the cost savings of a
profitable opportunity make another independently not
profitable opportunity vectorized.
2020-09-08 Richard Biener <rguenther@suse.de>
PR tree-optimization/96043
* tree-vectorizer.h (_slp_instance::cost_vec): New.
(_slp_instance::subgraph_entries): Likewise.
(BB_VINFO_TARGET_COST_DATA): Remove.
* tree-vect-slp.c (vect_free_slp_instance): Free
cost_vec and subgraph_entries.
(vect_analyze_slp_instance): Initialize them.
(vect_slp_analyze_operations): Defer passing costs to
the target, instead record them in the SLP graph entry.
(get_ultimate_leader): New helper for graph partitioning.
(vect_bb_partition_graph_r): Likewise.
(vect_bb_partition_graph): New function to partition the
SLP graph into independently costable parts.
(vect_bb_vectorization_profitable_p): Adjust to work on
a subgraph.
(vect_bb_vectorization_profitable_p): New wrapper,
discarding non-profitable vectorization of subgraphs.
(vect_slp_analyze_bb_1): Call vect_bb_partition_graph before
costing.
* gcc.dg/vect/costmodel/x86_64/costmodel-pr69297.c: Adjust.
|
|
|
|
This patch corrects our handling of array new-expression with ()-init:
new int[4](1, 2, 3, 4);
should work even with the explicit array bound, and
new char[3]("so_sad");
should cause an error, but we weren't giving any.
Fixed by handling array new-expressions with ()-init in the same spot
where we deduce the array bound in array new-expression. I'm now
always passing STRING_CSTs to build_new_1 wrapped in { } which allowed
me to remove the special handling of STRING_CSTs in build_new_1. And
since the DIRECT_LIST_INIT_P block in build_new_1 calls digest_init, we
report errors about too short arrays. reshape_init now does the {"foo"}
-> "foo" transformation even for CONSTRUCTOR_IS_PAREN_INIT, so no need
to do it in build_new.
I took a stab at cp_complete_array_type's "FIXME: this code is duplicated
from reshape_init", but calling reshape_init there, I ran into issues
with has_designator_problem: when we reshape an already reshaped
CONSTRUCTOR again, d.cur.index has been filled, so we think that we
have a user-provided designator (though there was no designator in the
source code), and report an error.
gcc/cp/ChangeLog:
PR c++/77841
* decl.c (reshape_init): If we're initializing a char array from
a string-literal that is enclosed in braces, unwrap it.
* init.c (build_new_1): Don't handle string-initializers here.
(build_new): Handle new-expression with paren-init when the
array bound is known. Always pass string constants to build_new_1
enclosed in braces. Don't handle string-initializers in any
special way.
gcc/testsuite/ChangeLog:
PR c++/77841
* g++.old-deja/g++.ext/arrnew2.C: Expect the error only in C++17
and less.
* g++.old-deja/g++.robertl/eb58.C: Adjust dg-error.
* g++.old-deja/g++.robertl/eb63.C: Expect the error only in C++17
and less.
* g++.dg/cpp2a/new-array5.C: New test.
* g++.dg/cpp2a/paren-init36.C: New test.
* g++.dg/cpp2a/paren-init37.C: New test.
* g++.dg/pr84729.C: Adjust dg-error.
|
|
This patch fixes a long-standing bug in reshape_init_r. Since r209314
we implement DR 1467 which handles list-initialization with a single
initializer of the same type as the target. In this test this causes
a crash in reshape_init_r when we're processing a constructor that has
undergone the DR 1467 transformation.
Take e.g. the
foo({{1, {H{k}}}});
line in the attached test. {H{k}} initializes the field b of H in I.
H{k} is a functional cast, so has TREE_HAS_CONSTRUCTOR set, so is
COMPOUND_LITERAL_P. We perform the DR 1467 transformation and turn
{H{k}} into H{k}. Then we attempt to reshape H{k} again and since
first_initializer_p is null and it's COMPOUND_LITERAL_P, we go here:
else if (COMPOUND_LITERAL_P (stripped_init))
gcc_assert (!BRACE_ENCLOSED_INITIALIZER_P (stripped_init));
then complain about the missing braces, go to reshape_init_class and ICE
on
gcc_checking_assert (d->cur->index
== get_class_binding (type, id));
because due to the missing { } we're looking for 'b' in H, but that's
not found.
So we have to be prepared to handle an initializer whose outer braces
have been removed due to DR 1467.
gcc/cp/ChangeLog:
PR c++/95164
* decl.c (reshape_init_r): When initializing an aggregate member
with an initializer from an initializer-list, also consider
COMPOUND_LITERAL_P.
gcc/testsuite/ChangeLog:
PR c++/95164
* g++.dg/cpp0x/initlist123.C: New test.
|
|
This patch generalizes the state machine in sm-malloc.c to support
multiple allocator APIs, and adds just enough support for C++ new and
delete to demonstrate the feature, allowing for detection of code
paths where the result of new in C++ can leak - for some crude examples,
at least (bearing in mind that the analyzer doesn't yet know about
e.g. vfuncs, exceptions, inheritance, RTTI, etc)
It also implements a new warning: -Wanalyzer-mismatching-deallocation.
For example:
demo.cc: In function 'void test()':
demo.cc:8:8: warning: 'f' should have been deallocated with 'delete'
but was deallocated with 'free' [CWE-762] [-Wanalyzer-mismatching-deallocation]
8 | free (f);
| ~~~~~^~~
'void test()': events 1-2
|
| 7 | foo *f = new foo;
| | ^~~
| | |
| | (1) allocated here (expects deallocation with 'delete')
| 8 | free (f);
| | ~~~~~~~~
| | |
| | (2) deallocated with 'free' here; allocation at (1) expects deallocation with 'delete'
|
The patch also adds just enough knowledge of exception-handling to
suppress a false positive from -Wanalyzer-malloc-leak on
g++.dg/analyzer/pr96723.C on the exception-handling CFG edge after
operator new. It does this by adding a constraint that the result is
NULL if an exception was thrown from operator new, since the result from
operator new is lost when following that exception-handling CFG edge.
gcc/analyzer/ChangeLog:
PR analyzer/94355
* analyzer.opt (Wanalyzer-mismatching-deallocation): New warning.
* region-model-impl-calls.cc
(region_model::impl_call_operator_new): New.
(region_model::impl_call_operator_delete): New.
* region-model.cc (region_model::on_call_pre): Detect operator new
and operator delete.
(region_model::on_call_post): Likewise.
(region_model::maybe_update_for_edge): Detect EH edges and call...
(region_model::apply_constraints_for_exception): New function.
* region-model.h (region_model::impl_call_operator_new): New decl.
(region_model::impl_call_operator_delete): New decl.
(region_model::apply_constraints_for_exception): New decl.
* sm-malloc.cc (enum resource_state): New.
(struct allocation_state): New state subclass.
(enum wording): New.
(struct api): New.
(malloc_state_machine::custom_data_t): New typedef.
(malloc_state_machine::add_state): New decl.
(malloc_state_machine::m_unchecked)
(malloc_state_machine::m_nonnull)
(malloc_state_machine::m_freed): Delete these states in favor
of...
(malloc_state_machine::m_malloc)
(malloc_state_machine::m_scalar_new)
(malloc_state_machine::m_vector_new): ...this new api instances,
which own their own versions of these states.
(malloc_state_machine::on_allocator_call): New decl.
(malloc_state_machine::on_deallocator_call): New decl.
(api::api): New ctor.
(dyn_cast_allocation_state): New.
(as_a_allocation_state): New.
(get_rs): New.
(unchecked_p): New.
(nonnull_p): New.
(freed_p): New.
(malloc_diagnostic::describe_state_change): Use unchecked_p and
nonnull_p.
(class mismatching_deallocation): New.
(double_free::double_free): Add funcname param for initializing
m_funcname.
(double_free::emit): Use m_funcname in warning message rather
than hardcoding "free".
(double_free::describe_state_change): Likewise. Use freed_p.
(double_free::describe_call_with_state): Use freed_p.
(double_free::describe_final_event): Use m_funcname in message
rather than hardcoding "free".
(double_free::m_funcname): New field.
(possible_null::describe_state_change): Use unchecked_p.
(possible_null::describe_return_of_state): Likewise.
(use_after_free::use_after_free): Add param for initializing m_api.
(use_after_free::emit): Use m_api->m_dealloc_funcname in message
rather than hardcoding "free".
(use_after_free::describe_state_change): Use freed_p. Change the
wording of the message based on the API.
(use_after_free::describe_final_event): Use
m_api->m_dealloc_funcname in message rather than hardcoding
"free". Change the wording of the message based on the API.
(use_after_free::m_api): New field.
(malloc_leak::describe_state_change): Use unchecked_p. Update
for renaming of m_malloc_event to m_alloc_event.
(malloc_leak::describe_final_event): Update for renaming of
m_malloc_event to m_alloc_event.
(malloc_leak::m_malloc_event): Rename...
(malloc_leak::m_alloc_event): ...to this.
(free_of_non_heap::free_of_non_heap): Add param for initializing
m_funcname.
(free_of_non_heap::emit): Use m_funcname in message rather than
hardcoding "free".
(free_of_non_heap::describe_final_event): Likewise.
(free_of_non_heap::m_funcname): New field.
(allocation_state::dump_to_pp): New.
(allocation_state::get_nonnull): New.
(malloc_state_machine::malloc_state_machine): Update for changes
to state fields and new api fields.
(malloc_state_machine::add_state): New.
(malloc_state_machine::on_stmt): Move malloc/calloc handling to
on_allocator_call and call it, passing in the API pointer.
Likewise for free, moving it to on_deallocator_call. Handle calls
to operator new and delete in an analogous way. Use unchecked_p
when testing for possibly-null-arg and possibly-null-deref, and
transition to the non-null for the correct API. Remove redundant
node param from call to on_zero_assignment. Use freed_p for
use-after-free check, and pass in API.
(malloc_state_machine::on_allocator_call): New, based on code in
on_stmt.
(malloc_state_machine::on_deallocator_call): Likewise.
(malloc_state_machine::on_phi): Mark node param with
ATTRIBUTE_UNUSED; don't pass it to on_zero_assignment.
(malloc_state_machine::on_condition): Mark node param with
ATTRIBUTE_UNUSED. Replace on_transition calls with get_state and
set_next_state pairs, transitioning to the non-null state for the
appropriate API.
(malloc_state_machine::can_purge_p): Port to new state approach.
(malloc_state_machine::on_zero_assignment): Replace on_transition
calls with get_state and set_next_state pairs. Drop redundant
node param.
* sm.h (state_machine::add_custom_state): New.
gcc/ChangeLog:
PR analyzer/94355
* doc/invoke.texi: Document -Wanalyzer-mismatching-deallocation.
gcc/testsuite/ChangeLog:
PR analyzer/94355
* g++.dg/analyzer/new-1.C: New test.
* g++.dg/analyzer/new-vs-malloc.C: New test.
|