|
An entity can be defined within an expression (the best example is probably a
declare expression, but a quantified expression is another; there are others).
When making a deep copy of an expression, the Entity nodes for such entities
were sometimes not copied, apparently for performance reasons. This caused
correctness problems in some cases, so do not perform that "optimization".
gcc/ada/
* sem_util.adb
(New_Copy_Tree.Visit_Entity): Delete code that prevented copying some entities.
|
|
Minor rewording of a warning.
Disallow positional notation for <> (but disable this check),
and fix resulting errors.
Copy use clauses.
gcc/ada/
* sem_ch12.adb (Check_Fixed_Point_Actual): Minor rewording; it seems
more proper to say "operator" rather than "operation".
(Matching_Actual): Give an error for <> in positional notation.
This is a syntax error. Disable this for now.
(Analyze_Associations): Copy the use clause in all cases.
The "mustn't recopy" comment seems wrong, because New_Copy_Tree
preserves Slocs.
* libgnat/a-ticoau.ads: Fix violation of new position-box error.
* libgnat/a-wtcoau.ads: Likewise.
* libgnat/a-ztcoau.ads: Likewise.
|
|
This message provides only inner details of how the compiler
handles this kind of construct and does not provide meaningful
information that the user can act on.
gcc/ada/
* par-labl.adb (Rewrite_As_Loop): Remove info message
|
|
Remove warning insertion characters without switch characters
from info messages.
gcc/ada/
* par-ch7.adb: Remove warning characters from info message
* par-endh.adb: Remove warning characters from info message
* sem_res.adb: Remove warning characters from info message
|
|
The info message about the freeze point should be considered
a continuation of the error message about the change of visibility
after the freeze point. This improves the error layout for formatted
error messages with the -gnatdF switch.
gcc/ada/
* sem_ch13.adb (Check_Aspect_At_End_Of_Declarations): change the
info message to a continuation message.
|
|
gcc/ada/
* inline.adb (Cannot_Inline): Simplify string handling logic.
|
|
Add entities of kind E_Subprogram_Body to the list of entities associated
with a given scope. This ensures that representation information is
correctly output for object and type declarations inside these subprogram
bodies. This is useful for outputting that information from the compiler
with the switch -gnatR, as well as for getting precise representation
information inside GNATprove.
Remove ad-hoc code inside repinfo.adb that retrieved this information
in only some cases.
gcc/ada/
* exp_ch5.adb (Expand_Iterator_Loop_Over_Container): Skip entities
of kind E_Subprogram_Body.
* repinfo.adb (List_Entities): Remove special case for subprogram
bodies.
* sem_ch6.adb (Analyze_Subprogram_Body_Helper): List subprogram
body entities in the enclosing scope.
|
|
When the first formal parameter of a subprogram is a class-wide
interface type (or an access to a class-wide interface type),
changing the order of the interface types implemented by a
type declaration T enables or disables the ability to use the
prefix notation to call it with objects of type T. When the
call is disabled the compiler rejects it reporting an error.
gcc/ada/
* sem_ch4.adb (Traverse_Interfaces): Add missing support
for climbing to parents of interface types.
|
|
The GNAT-defined Super attribute was formerly disallowed for an object of a
derived tagged type having an abstract parent type. This rule has been relaxed;
an abstract parent type is now permitted as long as it is not an interface type.
Update the GNAT RM accordingly.
gcc/ada/
* doc/gnat_rm/implementation_defined_attributes.rst:
Update Super attribute documentation.
* gnat_rm.texi: Regenerate.
* gnat_ugn.texi: Regenerate.
|
|
System.Tasking.Protected_Objects.Lock can raise exceptions, but that
wasn't taken into account by the expansion of protected subprogram
bodies before this patch. More precisely, there were cases where
calls to System.Tasking.Initialization.Abort_Undefer were
incorrectly omitted. This patch fixes this.
gcc/ada/
* exp_ch7.adb (Build_Cleanup_Statements): Adapt to changes
made to Build_Protected_Subprogram_Call_Cleanup.
* exp_ch9.adb (Make_Unlock_Statement, Wrap_Unprotected_Call):
New functions.
(Build_Protected_Subprogram_Body): Fix resource management in
generated code.
(Build_Protected_Subprogram_Call_Cleanup): Make use of newly
introduced Make_Unlock_Statement.
|
|
The Defining_Identifier of a renaming may be an E_Constant in the context.
gcc/ada/
PR ada/114710
* exp_util.adb (Find_Renamed_Object): Recurse for any renaming.
|
|
We already checked that a global item of mode Output is not an Input of
the enclosing subprograms. With this change we also check that if this
global item is a constituent, then none of its encapsulating abstract
states is an Input of the enclosing subprograms.
gcc/ada/
* sem_prag.adb (Check_Mode_Restriction_In_Enclosing_Context):
Iterate over encapsulating abstract states.
|
|
This set of changes is aimed at streamlining the code generated for the
elaboration of local tagged types. The dispatch tables and other related
data structures are built dynamically on the stack for them and a few of
the patterns used for this turn out to be problematic for the optimizer:
1. the array of primitives in the dispatch table is default-initialized to
null values by calling the initialization routine of an unconstrained
array type, and then immediately assigned an aggregate made up of the
same null values.
2. the external tag is initialized by means of a dynamic concatenation
involving the secondary stack, but all the elements have a fixed size.
3. the _size primitive is saved in the TSD by means of the dereference of
the address of the TSD that was previously saved in the dispatch table.
gcc/ada/
* Makefile.rtl (GNATRTL_NONTASKING_OBJS): Add s-imad32$(objext),
s-imad64$(objext) and s-imagea$(objext).
* exp_atag.ads (Build_Set_Size_Function): Replace Tag_Node parameter
with Typ parameter.
* exp_atag.adb: Add clauses for Sinfo.Utils.
(Build_Set_Size_Function): Retrieve the TSD object statically.
* exp_disp.adb: Add clauses for Ttypes.
(Make_DT): Call Address_Image{32,64} instead of Address_Image.
(Register_Primitive): Pass Tag_Typ to Build_Set_Size_Function.
* rtsfind.ads (RTU_Id): Remove System_Address_Image and add
System_Img_Address_{32,64}.
(RE_Id): Remove entry for RE_Address_Image and add entries for
RE_Address_Image{32,64}.
* rtsfind.adb (System_Descendant): Adjust to above changes.
* libgnat/a-tags.ads (Address_Array): Suppress initialization.
* libgnat/s-addima.adb (System.Address_Image): Call the appropriate
routine based on the address size.
* libgnat/s-imad32.ads: New file.
* libgnat/s-imad64.ads: Likewise.
* libgnat/s-imagea.ads: Likewise.
* libgnat/s-imagea.adb: Likewise.
* gcc-interface/Make-lang.in (GNAT_ADA_OBJS) [$(STAGE1)=False]: Add
ada/libgnat/s-imad32.o and ada/libgnat/s-imad64.o.
|
|
Inlining in GNATprove a subprogram containing a constant declaration with
an address clause/aspect might lead to a spurious error if the address
expression is based on a constant view of a mutable object at call site.
Do not allow such inlining in GNATprove.
gcc/ada/
* inline.adb (Can_Be_Inlined_In_GNATprove_Mode): Do not inline
when constant with address clause is found.
|
|
This patch fixes code in gnatlink that incorrectly assumed that the
lower bound of a particular string was always 1.
gcc/ada/
* gnatlink.adb (Gnatlink): Fix incorrect lower bound assumption.
(Is_Prefix): New function.
|
|
In some cases the compiler incorrectly concludes that a package body is
required for a package specification that includes the implicit declaration
of one or more inherited subprograms for an explicitly declared derived type.
Spurious error messages (e.g., "cannot generate code for file") may result.
gcc/ada/
* sem_ch7.adb
(Requires_Completion_In_Body): Modify the Comes_From_Source test so that
the implicit declaration of an inherited subprogram does not cause
an incorrect result of True.
|
|
gcc/ada/
* exp_ch6.adb (Expand_Ctrl_Function_Call): Inline if -gnatn in
CCG mode even if -O0.
|
|
Now that Is_Finalizable_Transient only looks at the renamings coming from
nontransient objects serviced by transient scopes, it must find the object
ultimately renamed by them through a chain of renamings.
gcc/ada/
PR ada/114710
* exp_util.adb (Find_Renamed_Object): Recurse if the renamed object
is itself a renaming.
|
|
The compiler reports an error when the prefix of 'Old is
a call to an overloaded function that has no parameters.
gcc/ada/
* sem_attr.adb (Analyze_Attribute): Enhance support for
using 'Old with a prefix that references an overloaded
function that has no parameters; add missing support
for the use of 'Old within qualified expressions.
* sem_util.ads (Preanalyze_And_Resolve_Without_Errors):
New subprogram.
* sem_util.adb (Preanalyze_And_Resolve_Without_Errors):
New subprogram.
|
|
Where possible, we can use high-level wrapper routines instead of the
low-level Get_Attribute_Definition_Clause.
Code cleanup; semantics is unaffected.
gcc/ada/
* layout.adb (Layout_Type): Use high-level wrapper routine.
* sem_ch13.adb (Inherit_Delayed_Rep_Aspects): Likewise.
* sem_ch3.adb (Analyze_Object_Declaration): Likewise.
|
|
This puts Windows on par with Linux as far as backtraces are concerned.
gcc/ada/
* libgnat/s-tsmona__linux.adb (Get): Move down descriptive comment.
* libgnat/s-tsmona__mingw.adb: Add with clause and use clause for
System.Storage_Elements.
(Get): Pass GET_MODULE_HANDLE_EX_FLAG_UNCHANGED_REFCOUNT in the call
to GetModuleHandleEx and remove the subsequent call to FreeLibrary.
Upon success, set Load_Addr to the base address of the module.
* libgnat/s-win32.ads (GET_MODULE_HANDLE_EX_FLAG_FROM_ADDRESS): Use
shorter literal.
(GET_MODULE_HANDLE_EX_FLAG_UNCHANGED_REFCOUNT): New constant.
|
|
The problem is that Is_Finalizable_Transient returns false when a transient
object is subject to a renaming by another transient object present in the
same transient scope, thus forcing its finalization to be deferred to the
enclosing scope. That's not necessary, as only renamings by nontransient
objects serviced by transient scopes need to be rejected by the predicate.
The change also removes now dead code in the finalization machinery.
gcc/ada/
PR ada/114710
* exp_ch7.adb (Build_Finalizer.Process_Declarations): Remove dead
code dealing with renamings.
* exp_util.ads (Is_Finalizable_Transient): Rename Rel_Node to N.
* exp_util.adb (Is_Finalizable_Transient): Likewise.
(Is_Aliased): Remove obsolete code dealing with EWA nodes and only
consider renamings present in N itself.
(Requires_Cleanup_Actions): Remove dead code dealing with renamings.
|
|
The compiler does not generate dynamic predicate checks when
they are enabled for one type declaration and ignored for
other type declarations defined in the same scope.
gcc/ada/
* sem_ch13.adb (Analyze_One_Aspect): Set the applicable policy
of a type declaration when its aspect Dynamic_Predicate is
analyzed.
* sem_prag.adb (Handle_Dynamic_Predicate_Check): New subprogram
that enables or ignores dynamic predicate checks depending on
whether dynamic checks are enabled in the context where the
associated type declaration is defined; used in the analysis
of pragma check. In addition, for pragma Predicate, do not
disable it when the aspect was internally built as part of
processing a dynamic predicate aspect.
|
|
LWG 3860 added this alias template. Both libc++ and MSVC treat this as a
DR for C++20, so this change does so too.
libstdc++-v3/ChangeLog:
* include/bits/ranges_base.h (range_common_reference_t): New
alias template, as per LWG 3860.
* testsuite/std/ranges/range.cc: Check it.
|
|
The P2278R4 additions for C++23 are currently guarded by a check for
__cplusplus > 202002L but can use __glibcxx_ranges_as_const instead.
libstdc++-v3/ChangeLog:
* include/bits/ranges_base.h (const_iterator_t): Change
preprocessor condition to use __glibcxx_ranges_as_const.
(const_sentinel_t, range_const_reference_t): Likewise.
(__access::__possibly_const_range, cbegin, cend, crbegin)
(crend, cdata): Likewise.
* include/bits/stl_iterator.h (iter_const_reference_t)
(basic_const_iterator, const_iterator, const_sentinel)
(make_const_iterator): Likewise.
|
|
When using a key type without a valid std::hash specialization the
unordered containers give confusing diagnostics about the default
constructor being deleted. Add a static_assert that will fail for
disabled std::hash specializations (and for a subset of custom hash
functions).
libstdc++-v3/ChangeLog:
PR libstdc++/115420
* include/bits/hashtable.h (_Hashtable): Add static_assert to
check that hash function is copy constructible.
* testsuite/23_containers/unordered_map/115420.cc: New test.
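
The idea of an early static_assert can be sketched as below. This is not
the _Hashtable code; DemoHashSet and NoHashKey are hypothetical names, and
the sketch only assumes that a disabled std::hash specialization is not
copy constructible, as the standard requires.

```cpp
#include <functional>
#include <string>
#include <type_traits>

// Hypothetical key type with no std::hash specialization.
struct NoHashKey { int id; };

// Hypothetical container skeleton; only the early check is shown.
template<class Key, class Hash = std::hash<Key>>
struct DemoHashSet {
    static_assert(std::is_copy_constructible_v<Hash>,
                  "hash function must be copy constructible; "
                  "is std::hash specialized for the key type?");
};

// A key with an enabled std::hash specialization passes the check.
// Taking sizeof forces the class template to be instantiated.
static_assert(sizeof(DemoHashSet<std::string>) > 0);

// A disabled specialization is not copy constructible, so instantiating
// DemoHashSet<NoHashKey> would trip the static_assert above with a
// readable message instead of a deleted-constructor diagnostic.
static_assert(!std::is_copy_constructible_v<std::hash<NoHashKey>>);
```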
|
|
When we rebased the PSTL on upstream, in r14-2109-g3162ca09dbdc2e, a
change to how _PSTL_USAGE_WARNINGS is set was missed out, but the change
to how it's tested was included. This means that the macro is always
defined, so testing it with #ifdef (instead of using #if to test its
value) doesn't work as intended.
Revert the test to use #if again, since that part of the upstream change
was unnecessary in the first place (the macro is always defined, so
there's no need to use #ifdef to avoid -Wundef warnings).
libstdc++-v3/ChangeLog:
PR libstdc++/113376
* include/pstl/pstl_config.h: Use #if instead of #ifdef to test
the _PSTL_USAGE_WARNINGS macro.
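
The difference between the two tests can be seen in a minimal sketch.
DEMO_USAGE_WARNINGS is a hypothetical stand-in for a macro that, like
_PSTL_USAGE_WARNINGS, is always defined and carries its setting in its
value.

```cpp
// Hypothetical macro: always defined, with 0 meaning "disabled".
#define DEMO_USAGE_WARNINGS 0

// #ifdef only asks whether the macro exists, so it is true even
// though the feature is meant to be off.
#ifdef DEMO_USAGE_WARNINGS
constexpr bool seen_by_ifdef = true;
#else
constexpr bool seen_by_ifdef = false;
#endif

// #if tests the value, which is what an always-defined macro needs.
#if DEMO_USAGE_WARNINGS
constexpr bool seen_by_if = true;
#else
constexpr bool seen_by_if = false;
#endif

static_assert(seen_by_ifdef, "#ifdef fires for a macro defined to 0");
static_assert(!seen_by_if, "#if respects the value 0");
```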
|
|
This patch optimizes the compilation performance of
std::is_nothrow_invocable by dispatching to the new
__is_nothrow_invocable built-in trait.
libstdc++-v3/ChangeLog:
* include/std/type_traits (is_nothrow_invocable): Use
__is_nothrow_invocable built-in trait.
* testsuite/20_util/is_nothrow_invocable/incomplete_args_neg.cc:
Handle the new error from __is_nothrow_invocable.
* testsuite/20_util/is_nothrow_invocable/incomplete_neg.cc:
Likewise.
Signed-off-by: Ken Matsui <kmatsui@gcc.gnu.org>
Reviewed-by: Patrick Palka <ppalka@redhat.com>
Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
|
|
This patch optimizes the compilation performance of std::is_invocable
by dispatching to the new __is_invocable built-in trait.
libstdc++-v3/ChangeLog:
* include/std/type_traits (is_invocable): Use __is_invocable
built-in trait.
* testsuite/20_util/is_invocable/incomplete_args_neg.cc: Handle
the new error from __is_invocable.
* testsuite/20_util/is_invocable/incomplete_neg.cc: Likewise.
Signed-off-by: Ken Matsui <kmatsui@gcc.gnu.org>
Reviewed-by: Patrick Palka <ppalka@redhat.com>
Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
|
|
This patch optimizes the compilation performance of std::rank
by dispatching to the new __array_rank built-in trait.
libstdc++-v3/ChangeLog:
* include/std/type_traits (rank): Use __array_rank built-in
trait.
(rank_v): Likewise.
Signed-off-by: Ken Matsui <kmatsui@gcc.gnu.org>
Reviewed-by: Patrick Palka <ppalka@redhat.com>
Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
|
|
This patch optimizes the compilation performance of std::decay
by dispatching to the new __decay built-in trait.
libstdc++-v3/ChangeLog:
* include/std/type_traits (decay): Use __decay built-in trait.
Signed-off-by: Ken Matsui <kmatsui@gcc.gnu.org>
Reviewed-by: Patrick Palka <ppalka@redhat.com>
Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
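
For reference, these are the transformations that std::decay performs and
that any replacement implementation has to reproduce. The checks below use
only the public std::decay_t alias and assume a C++17 compiler.

```cpp
#include <type_traits>

// References and top-level cv-qualifiers are stripped.
static_assert(std::is_same_v<std::decay_t<const int&>, int>);

// Arrays decay to a pointer to their element type.
static_assert(std::is_same_v<std::decay_t<int[3]>, int*>);
static_assert(std::is_same_v<std::decay_t<const int[2][3]>,
                             const int (*)[3]>);

// Function types decay to pointers to functions.
static_assert(std::is_same_v<std::decay_t<int(double)>, int (*)(double)>);
```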
|
|
This patch optimizes the compilation performance of
std::remove_all_extents by dispatching to the new
__remove_all_extents built-in trait.
libstdc++-v3/ChangeLog:
* include/std/type_traits (remove_all_extents): Use
__remove_all_extents built-in trait.
Signed-off-by: Ken Matsui <kmatsui@gcc.gnu.org>
Reviewed-by: Patrick Palka <ppalka@redhat.com>
Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
|
|
This patch optimizes the compilation performance of std::remove_extent
by dispatching to the new __remove_extent built-in trait.
libstdc++-v3/ChangeLog:
* include/std/type_traits (remove_extent): Use __remove_extent
built-in trait.
Signed-off-by: Ken Matsui <kmatsui@gcc.gnu.org>
Reviewed-by: Patrick Palka <ppalka@redhat.com>
Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
|
|
This patch optimizes the compilation performance of std::add_pointer
by dispatching to the new __add_pointer built-in trait.
libstdc++-v3/ChangeLog:
* include/std/type_traits (add_pointer): Use __add_pointer
built-in trait.
Signed-off-by: Ken Matsui <kmatsui@gcc.gnu.org>
Reviewed-by: Patrick Palka <ppalka@redhat.com>
Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
|
|
This patch optimizes the compilation performance of
std::is_unbounded_array by dispatching to the new
__is_unbounded_array built-in trait.
libstdc++-v3/ChangeLog:
* include/std/type_traits (is_unbounded_array_v): Use
__is_unbounded_array built-in trait.
Signed-off-by: Ken Matsui <kmatsui@gcc.gnu.org>
Reviewed-by: Patrick Palka <ppalka@redhat.com>
Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
|
|
This patch optimizes the compilation performance of std::is_volatile
by dispatching to the new __is_volatile built-in trait.
libstdc++-v3/ChangeLog:
* include/std/type_traits (is_volatile): Use __is_volatile
built-in trait.
(is_volatile_v): Likewise.
Signed-off-by: Ken Matsui <kmatsui@gcc.gnu.org>
Reviewed-by: Patrick Palka <ppalka@redhat.com>
Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
|
|
This patch optimizes the compilation performance of std::is_const
by dispatching to the new __is_const built-in trait.
libstdc++-v3/ChangeLog:
* include/std/type_traits (is_const): Use __is_const built-in
trait.
(is_const_v): Likewise.
Signed-off-by: Ken Matsui <kmatsui@gcc.gnu.org>
Reviewed-by: Patrick Palka <ppalka@redhat.com>
Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
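
The dispatch pattern used by this series of trait patches can be sketched
as follows. This is a minimal sketch, not the libstdc++ source: the
DEMO_HAS_IS_CONST_BUILTIN macro and demo_is_const_v are hypothetical names,
and the fallback branch is the usual partial-specialization formulation.
A C++17 compiler is assumed.

```cpp
// Detect the built-in without assuming the compiler provides it.
#if defined(__has_builtin)
# if __has_builtin(__is_const)
#  define DEMO_HAS_IS_CONST_BUILTIN 1
# endif
#endif
#ifndef DEMO_HAS_IS_CONST_BUILTIN
# define DEMO_HAS_IS_CONST_BUILTIN 0
#endif

#if DEMO_HAS_IS_CONST_BUILTIN
// Built-in path: the compiler answers directly, with no template
// specializations to instantiate.
template<class T>
inline constexpr bool demo_is_const_v = __is_const(T);
#else
// Library path: primary template plus a partial specialization.
template<class T>
inline constexpr bool demo_is_const_v = false;
template<class T>
inline constexpr bool demo_is_const_v<const T> = true;
#endif

static_assert(demo_is_const_v<const int>);
static_assert(!demo_is_const_v<int>);
static_assert(demo_is_const_v<int* const>);
// A reference type is never itself const-qualified.
static_assert(!demo_is_const_v<const int&>);
```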
|
|
The testcase extracts one arm_neon.h vector from a pair (one subreg)
and then reinterprets the result as an SVE vector (another subreg).
Each subreg makes sense individually, but we can't fold them together
into a single subreg: it's 32 bytes -> 16 bytes -> 16*N bytes,
but the interpretation of 32 bytes -> 16*N bytes depends on
whether N==1 or N>1.
Since the second subreg makes sense individually, simplify_subreg
should bail out rather than ICE on it. simplify_gen_subreg will
then do the same (because it already checks validate_subreg).
This leaves simplify_gen_subreg returning null, requiring the
caller to take appropriate action.
I think this is relatively likely to occur elsewhere, so the patch
adds a helper for forcing a subreg, allowing a temporary pseudo to
be created where necessary.
I'll follow up by using force_subreg in more places. This patch
is intended to be a minimal backportable fix for the PR.
gcc/
PR target/115464
* simplify-rtx.cc (simplify_context::simplify_subreg): Don't try
to fold two subregs together if their relationship isn't known
at compile time.
* explow.h (force_subreg): Declare.
* explow.cc (force_subreg): New function.
* config/aarch64/aarch64-sve-builtins-base.cc
(svset_neonq_impl::expand): Use it instead of simplify_gen_subreg.
gcc/testsuite/
PR target/115464
* gcc.target/aarch64/sve/acle/general/pr115464.c: New test.
|
|
We have a vec_extract pattern which takes ZVFHMIN as the mode
iterator of the VLS mode, a.k.a. V_VLS. But it expands to the
pred_extract_first pattern, which takes ZVFH as the mode
iterator of the VLS mode, a.k.a. V_VLSF. The mismatch
results in an ICE similar to the one below:
error: unrecognizable insn:
27 | }
| ^
(insn 19 18 20 2 (set (reg:HF 150 [ _13 ])
(unspec:HF [
(vec_select:HF (reg:V4HF 134 [ _1 ])
(parallel [
(const_int 0 [0])
]))
(reg:SI 67 vtype)
] UNSPEC_VPREDICATE)) "compress_run-2.c":24:5 -1
(nil))
during RTL pass: vregs
compress_run-2.c:27:1: internal compiler error: in extract_insn, at
recog.cc:2812
0x1a627ef _fatal_insn(char const*, rtx_def const*, char const*, int,
char const*)
../../../gcc/gcc/rtl-error.cc:108
0x1a62834 _fatal_insn_not_found(rtx_def const*, char const*, int, char
const*)
../../../gcc/gcc/rtl-error.cc:116
0x1a0f356 extract_insn(rtx_insn*)
../../../gcc/gcc/recog.cc:2812
0x159ee61 instantiate_virtual_regs_in_insn
../../../gcc/gcc/function.cc:1612
0x15a04aa instantiate_virtual_regs
../../../gcc/gcc/function.cc:1995
0x15a058e execute
../../../gcc/gcc/function.cc:2042
This patch fixes the issue by aligning the mode iterator
restriction to ZVFH.
The test suites below pass with this patch.
1. The rv64gcv full regression test.
2. The rv64gcv build with glibc.
PR target/115456
gcc/ChangeLog:
* config/riscv/autovec.md: Take the ZVFH mode iterator instead of
ZVFHMIN for the alignment.
* config/riscv/vector-iterators.md: Add 2 new iterators,
V_VLS_ZVFH and VLS_ZVFH.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/base/pr115456-1.c: New test.
Signed-off-by: Pan Li <pan2.li@intel.com>
|
|
For CTEST, we don't have a conditional AND, so there's no optimization
opportunity that would justify a new ctest pattern. Emit ctest when ccmp
compares against constant 0, to save bytes.
gcc/ChangeLog:
* config/i386/i386.md (@ccmp<mode>): Add new alternative
<r>,C and adjust output templates. Also adjust UNSPEC mode
to CCmode.
gcc/testsuite/ChangeLog:
* gcc.target/i386/apx-ccmp-1.c: Adjust output to scan ctest.
* gcc.target/i386/apx-ccmp-2.c: Adjust some conditions to
compare with 0.
|
|
No need to talk about potential implementation bugs in older versions
than what we require. And no need to talk about building GCC 3.3 and
earlier at this point.
gcc:
PR other/69374
* doc/install.texi (Prerequisites): Simplify note on the C++
compiler required. Drop requirements for versions of GCC prior
to 3.4. Fix grammar.
|
|
This avoids falling back to elementwise accesses for strided SLP
loads when the group size is not a multiple of the vector element
size. Instead we can use a smaller vector or integer type for the load.
For stores we can do the same, though restrictions on the stores we
handle and the fact that store-merging covers up for them make this
mostly effective for cost modeling. This shows for
gcc.target/i386/vect-strided-3.c, which we now vectorize with V4SI
vectors rather than just V2SI ones.
For all of this there's still the opportunity to use non-uniform
accesses, say for a 6-element group with a VF of two do
V4SI, { V2SI, V2SI }, V4SI. But that's for a possible followup.
* tree-vect-stmts.cc (get_group_load_store_type): Consistently
use VMAT_STRIDED_SLP for strided SLP accesses and not
VMAT_ELEMENTWISE.
(vectorizable_store): Adjust VMAT_STRIDED_SLP handling to
allow not only half-size but also smaller accesses.
(vectorizable_load): Likewise.
* gcc.target/i386/vect-strided-1.c: New testcase.
* gcc.target/i386/vect-strided-2.c: Likewise.
* gcc.target/i386/vect-strided-3.c: Likewise.
* gcc.target/i386/vect-strided-4.c: Likewise.
|
|
The following makes peeling of a single scalar iteration handle more
gaps, including non-power-of-two cases. This can be done by rounding
up the remaining access to the next power-of-two which ensures that
the next scalar iteration will pick at least the number of excess
elements we access.
I've added a correctness testcase and one x86 specific scanning for
the optimization.
PR tree-optimization/115385
* tree-vect-stmts.cc (get_group_load_store_type): Peeling
of a single scalar iteration is sufficient if we can narrow
the access to the next power of two of the bits in the last
access.
(vectorizable_load): Ensure that the last access is narrowed.
* gcc.dg/vect/pr115385.c: New testcase.
* gcc.target/i386/vect-pr115385.c: Likewise.
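
The rounding arithmetic described above can be illustrated in isolation.
This is only a sketch of the arithmetic, not the vectorizer code;
next_pow2 and excess_elements are hypothetical helper names, and the
element counts are made-up examples.

```cpp
// Smallest power of two that is >= n. Assumes 0 < n <= 2^31 so the
// shift cannot overflow. A hand-rolled equivalent of C++20 std::bit_ceil.
constexpr unsigned next_pow2(unsigned n) {
    unsigned p = 1;
    while (p < n)
        p <<= 1;
    return p;
}

// Number of elements read beyond the ones actually needed when the
// last access is narrowed to the next power of two.
constexpr unsigned excess_elements(unsigned remaining) {
    return next_pow2(remaining) - remaining;
}

// 3 remaining elements: narrow the access to 4, reading 1 excess element.
static_assert(next_pow2(3) == 4 && excess_elements(3) == 1);
// 5 remaining elements: narrow the access to 8, reading 3 excess elements.
static_assert(next_pow2(5) == 8 && excess_elements(5) == 3);
// An exact power of two needs no rounding and has no excess.
static_assert(next_pow2(4) == 4 && excess_elements(4) == 0);
```

Because next_pow2(n) is always below 2*n, the excess computed this way is
always smaller than the number of remaining elements.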
|
|
The following refactors the code to detect necessary peeling for
gaps, in particular the PR103116 case when there is no gap but
the group size is smaller than the vector size. The testcase in
PR114107 shows we fail to SLP
  for (int i=0; i<n; i++)
    for (int k=0; k<4; k++)
      data[4*i+k] *= factor[i];
because peeling one scalar iteration isn't enough to cover a gap
of 3 elements of factor[i]. But the code detecting this is placed
after the logic that detects cases we handle properly already as
we'd code generate { factor[i], 0., 0., 0. } for V4DFmode vectorization
already. In fact the check to detect when peeling a single iteration
isn't enough seems improperly guarded as it should apply to all cases.
I'm not sure we correctly handle VMAT_CONTIGUOUS_REVERSE but I
checked that VMAT_STRIDED_SLP and VMAT_ELEMENTWISE correctly avoid
touching excess elements.
With this change we can use SLP for the above testcase and the
PR103116 testcases no longer require an epilogue on x86-64. It
might be different on other targets so I made those testcases
FAIL only at runtime instead of relying on dump scanning, which
there's currently no easy way to properly constrain.
PR tree-optimization/114107
PR tree-optimization/110445
* tree-vect-stmts.cc (get_group_load_store_type): Refactor
contiguous access case. Make sure peeling-for-gap constraints
are always tested and consistently relaxed when we know we can
avoid touching excess elements during code generation. Rewrite
the check to be poly-int aware.
* gcc.dg/vect/pr114107.c: New testcase.
* gcc.dg/vect/pr103116-1.c: Adjust.
* gcc.dg/vect/pr103116-2.c: Likewise.
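
For experimenting with the loop nest quoted in this message, it can be
wrapped in a self-contained function. This is a minimal sketch, not the
PR114107 testcase itself; scale_groups and scale_groups_demo are
hypothetical names, and the data values are made up.

```cpp
// Each factor[i] scales a group of four consecutive data elements,
// so the load of factor has a group size of 1 and a gap of 3 when
// viewed against a four-element vector.
constexpr void scale_groups(double* data, const double* factor, int n) {
    for (int i = 0; i < n; i++)
        for (int k = 0; k < 4; k++)
            data[4 * i + k] *= factor[i];
}

// Small compile-time check of the scalar semantics.
constexpr bool scale_groups_demo() {
    double data[8] = {1, 1, 1, 1, 2, 2, 2, 2};
    double factor[2] = {3, 5};
    scale_groups(data, factor, 2);
    return data[0] == 3 && data[3] == 3 && data[4] == 10 && data[7] == 10;
}

static_assert(scale_groups_demo(), "scalar reference result");
```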
|
|
gcc/cp/ChangeLog:
* parser.cc (cp_parser_asm_string_expression): Use correct error
message.
gcc/testsuite/ChangeLog:
* g++.dg/cpp1z/constexpr-asm-3.C: Adjust for new message.
|
|
To get better error recovery.
gcc/cp/ChangeLog:
* parser.cc (cp_parser_asm_string_expression): Parse close
paren when constexpr extraction fails.
|
|
asm constexpr now only accepts the same string types as C++26
static_assert, e.g. string_view and string. Adjust test suite and
documentation.
gcc/cp/ChangeLog:
* parser.cc (cp_parser_asm_string_expression): Remove support
for const char * for asm constexpr.
gcc/ChangeLog:
* doc/extend.texi: Use std::string_view in asm constexpr
example.
gcc/testsuite/ChangeLog:
* g++.dg/cpp1z/constexpr-asm-1.C: Use std::string_view.
* g++.dg/cpp1z/constexpr-asm-3.C: Ditto.
|
|
Use reg_or_subregno instead.
gcc/ChangeLog:
PR target/115452
* config/i386/i386-features.cc (scalar_chain::convert_op): Use
reg_or_subregno instead of REGNO to avoid ICE.
gcc/testsuite/ChangeLog:
* gcc.target/i386/pr115452.c: New test.
|
|
The test cases of pr115387 are target independent; at least x86
and riscv are able to reproduce the issue. Thus, move these cases
to gcc.dg/torture.
The test suites below pass.
1. The rv64gcv full regression test.
2. The x86 full regression test.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/pr115387-1.c: Move to...
* gcc.dg/torture/pr115387-1.c: ...here.
* gcc.target/riscv/pr115387-2.c: Move to...
* gcc.dg/torture/pr115387-2.c: ...here.
Signed-off-by: Pan Li <pan2.li@intel.com>
|
|
Jeff's commit r15-831-g05daf617ea22e1 changed the instruction we expected
for this test case into an equivalent instruction. Modify the test case
so it will accept any of three instructions we could get depending on the
options used.
2024-06-12 Peter Bergner <bergner@linux.ibm.com>
gcc/testsuite/
PR testsuite/115262
* gcc.target/powerpc/pr66144-3.c (dg-do): Compile for all targets.
(dg-options): Add -fno-unroll-loops and remove -mvsx.
(scan-assembler): Change from this...
(scan-assembler-times): ...to this. Tweak regex to accept multiple
allowable instructions.
|