|
FORTRAN currently has a pragma NOVECTOR for indicating that vectorization should
not be applied to a particular loop.
ICC/ICX also has such a pragma for C and C++ called #pragma novector.
As part of this patch series I need a way to easily turn off vectorization of
particular loops, particularly for testsuite reasons.
This patch proposes a #pragma GCC novector that does the same for C++
as gfortran does for FORTRAN and what ICC/ICX does for C++.
I added only some basic tests here, but the next patch in the series uses this
in about 800 tests in the testsuite.
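As an illustration only (not one of the tests added here), the pragma is placed directly before the loop it applies to. The sketch below is written in the common subset of C and C++; the function name is made up, and acceptance of the pragma by the C front end is an assumption, since this patch covers C++ only. A compiler that does not know the pragma is expected to ignore it, so the result is the same either way.
```c
#include <stddef.h>

/* Sum an array while asking the vectorizer to leave this loop alone.
   sum_novector is a hypothetical name used only for this sketch.  */
int
sum_novector (const int *a, size_t n)
{
  int sum = 0;
  /* The pragma must immediately precede the loop it applies to.  */
#pragma GCC novector
  for (size_t i = 0; i < n; i++)
    sum += a[i];
  return sum;
}
```
Whether the loop was actually left unvectorized can be checked in the vectorizer dump (for example with -fdump-tree-vect-details), not from the program's result.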
gcc/cp/ChangeLog:
* cp-tree.h (RANGE_FOR_NOVECTOR): New.
(cp_convert_range_for, finish_while_stmt_cond, finish_do_stmt,
finish_for_cond): Add novector param.
* init.cc (build_vec_init): Default novector to false.
* method.cc (build_comparison_op): Likewise.
* parser.cc (cp_parser_statement): Likewise.
(cp_parser_for, cp_parser_c_for, cp_parser_range_for,
cp_convert_range_for, cp_parser_iteration_statement,
cp_parser_omp_for_loop, cp_parser_pragma): Support novector.
(cp_parser_pragma_novector): New.
* pt.cc (tsubst_expr): Likewise.
* semantics.cc (finish_while_stmt_cond, finish_do_stmt,
finish_for_cond): Likewise.
gcc/ChangeLog:
* doc/extend.texi: Document it.
gcc/testsuite/ChangeLog:
* g++.dg/vect/vect.exp (support vect- prefix).
* g++.dg/vect/vect-novector-pragma.cc: New test.
|
|
ATtiny42*, ATtiny82*, ATtiny162*, ATtiny322*, ATtiny10*.
gcc/
* config/avr/avr-mcus.def (avr64dd14, avr64dd20, avr64dd28, avr64dd32)
(avr64ea28, avr64ea32, avr64ea48, attiny424, attiny426, attiny427)
(attiny824, attiny826, attiny827, attiny1624, attiny1626, attiny1627)
(attiny3224, attiny3226, attiny3227, avr16dd14, avr16dd20, avr16dd28)
(avr16dd32, avr32dd14, avr32dd20, avr32dd28, avr32dd32)
(attiny102, attiny104): New devices.
* doc/avr-mmcu.texi: Regenerate.
|
|
This patch updates the support for the BPF CO-RE builtins
__builtin_preserve_access_index and __builtin_preserve_field_info,
and adds support for the CO-RE builtins __builtin_btf_type_id,
__builtin_preserve_type_info and __builtin_preserve_enum_value.
These CO-RE relocations are now converted to __builtin_core_reloc which
abstracts all of the original builtins in a polymorphic relocation
specific builtin.
The builtin processing is now split into two stages: the first (pack) is
executed right after the front end and the second (process) right before
the asm output.
In the expand pass, __builtin_core_reloc is converted to an
unspec:UNSPEC_CORE_RELOC rtx entry.
The data required to process the builtin is now collected in the packing
stage (after the front end), which prevents the compiler from optimizing
away any of the information required to compose the relocation when it
is needed.
At expansion, that information is recovered and CTF/BTF is queried to
construct the information that will be used in the relocation.
At this point the relocation is added to a specific section and the
builtin is expanded to the expected default value for the builtin.
In order to process __builtin_preserve_enum_value, it was necessary to
hook the front end to collect the original enum value reference.
This is needed since the parser folds all the enum values to their
integer_cst representation.
More details can be found in core-builtins.cc.
Regtested in host x86_64-linux-gnu and target bpf-unknown-none.
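For readers unfamiliar with the builtins, here is a minimal usage sketch of __builtin_preserve_access_index. The structure, macro and function names are hypothetical, and the non-BPF fallback exists only so the snippet builds on a host compiler. When built for BPF, CO-RE support and BTF debug info (-gbtf) are probably required.
```c
/* Hypothetical kernel-like structure, used only for illustration.  */
struct task_info
{
  int pid;
  int prio;
};

#if defined (__BPF__)
/* On a BPF target the builtin records a CO-RE relocation for the
   field access and otherwise evaluates to the expression itself.  */
# define READ_FIELD(ptr, field) \
  __builtin_preserve_access_index ((ptr)->field)
#else
/* Host fallback (an assumption of this sketch): a plain access.  */
# define READ_FIELD(ptr, field) ((ptr)->field)
#endif

/* Read one field through the relocatable access macro.  */
int
get_prio (const struct task_info *t)
{
  return READ_FIELD (t, prio);
}
```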
gcc/ChangeLog:
PR target/107844
PR target/107479
PR target/107480
PR target/107481
* config.gcc: Added core-builtins.cc and .o files.
* config/bpf/bpf-passes.def: Removed file.
* config/bpf/bpf-protos.h (bpf_add_core_reloc,
bpf_replace_core_move_operands): New prototypes.
* config/bpf/bpf.cc (enum bpf_builtins, is_attr_preserve_access,
maybe_make_core_relo, bpf_core_field_info, bpf_core_compute,
bpf_core_get_index, bpf_core_new_decl, bpf_core_walk,
bpf_is_valid_preserve_field_info_arg, is_attr_preserve_access,
handle_attr_preserve, pass_data_bpf_core_attr, pass_bpf_core_attr):
Removed.
(def_builtin, bpf_expand_builtin, bpf_resolve_overloaded_builtin): Changed.
* config/bpf/bpf.md (define_expand mov<MM:mode>): Changed.
(mov_reloc_core<mode>): Added.
* config/bpf/core-builtins.cc (struct cr_builtin, enum
cr_decision struct cr_local, struct cr_final, struct
core_builtin_helpers, enum bpf_plugin_states): Added types.
(builtins_data, core_builtin_helpers, core_builtin_type_defs):
Added variables.
(allocate_builtin_data, get_builtin-data, search_builtin_data,
remove_parser_plugin, compare_same_kind, compare_same_ptr_expr,
compare_same_ptr_type, is_attr_preserve_access, core_field_info,
bpf_core_get_index, compute_field_expr,
pack_field_expr_for_access_index, pack_field_expr_for_preserve_field,
process_field_expr, pack_enum_value, process_enum_value, pack_type,
process_type, bpf_require_core_support, make_core_relo, read_kind,
kind_access_index, kind_preserve_field_info, kind_enum_value,
kind_type_id, kind_preserve_type_info, get_core_builtin_fndecl_for_type,
bpf_handle_plugin_finish_type, bpf_init_core_builtins,
construct_builtin_core_reloc, bpf_resolve_overloaded_core_builtin,
bpf_expand_core_builtin, bpf_add_core_reloc,
bpf_replace_core_move_operands): Added functions.
* config/bpf/core-builtins.h (enum bpf_builtins): Added.
(bpf_init_core_builtins, bpf_expand_core_builtin,
bpf_resolve_overloaded_core_builtin): Added functions.
* config/bpf/coreout.cc (struct bpf_core_extra): Added.
(bpf_core_reloc_add, output_asm_btfext_core_reloc): Changed.
* config/bpf/coreout.h (bpf_core_reloc_add) Changed prototype.
* config/bpf/t-bpf: Added core-builtins.o.
* doc/extend.texi: Added documentation for new BPF builtins.
|
|
The .spec files used for linking on ppc-vx6, when the rtp-smp runtime
is selected, add -L flags for /lib_smp/ and /lib/.
There was a problem, though: although /lib_smp/ and /lib/ were to be
searched in this order, and the specs files do that correctly, the
compiler would search /lib/ first regardless, because
STARTFILE_PREFIX_SPEC said so, and specs files cannot override that.
With this patch, we arrange for the presence of -msmp to affect
STARTFILE_PREFIX_SPEC, so that the compiler searches /lib_smp/ rather
than /lib/ for crt files. A separate patch for GNAT ensures that when
the rtp-smp runtime is selected, -msmp is passed to the compiler
driver for linking, along with the --specs flags.
for gcc/ChangeLog
* config/vxworks-smp.opt: New. Introduce -msmp.
* config.gcc: Enable it on powerpc* vxworks prior to 7r*.
* config/rs6000/vxworks.h (STARTFILE_PREFIX_SPEC): Choose
lib_smp when -msmp is present in the command line.
* doc/invoke.texi: Document it.
|
|
Fix spelling mistakes introduced by my previous patch in this area.
2023-08-01 Christophe Lyon <christophe.lyon@linaro.org>
gcc/
* doc/sourcebuild.texi (arm_v8_1m_main_cde_mve_fp): Fix spelling.
|
|
Resolves:
PR c/65213 - Extend -Wmissing-declarations to variables [i.e. add
-Wmissing-variable-declarations]
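A small sketch of the kind of code the new option is aimed at. The variable and function names are invented for illustration, and the comments describe the diagnostics I would expect rather than verified compiler output.
```c
/* Declared before its definition (normally in a header): no warning
   is expected for this one.  */
extern int shared_counter;
int shared_counter = 1;

/* Internal linkage: the option is not expected to apply.  */
static int private_counter = 2;

/* Defined with external linkage and no previous declaration: this is
   the case -Wmissing-variable-declarations is meant to report.  */
int undeclared_counter = 3;

int
counters_total (void)
{
  return shared_counter + private_counter + undeclared_counter;
}
```
The usual fixes are to add an extern declaration in a shared header or to make the variable static.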
gcc/c-family/ChangeLog:
PR c/65213
* c.opt (-Wmissing-variable-declarations): New option.
gcc/c/ChangeLog:
PR c/65213
* c-decl.cc (start_decl): Handle
-Wmissing-variable-declarations.
gcc/ChangeLog:
PR c/65213
* doc/invoke.texi (-Wmissing-variable-declarations): Document
new option.
gcc/testsuite/ChangeLog:
PR c/65213
* gcc.dg/Wmissing-variable-declarations.c: New test.
Signed-off-by: Hamza Mahfooz <someguy@effective-light.com>
|
|
This patch adds support for embedding profiling information about the
compiler itself into the SARIF output.
Specifically, if SARIF diagnostic output is requested, via
-fdiagnostics-format=sarif-file or -fdiagnostics-format=sarif-stderr,
then any -ftime-report output is written in JSON form into the SARIF
output, rather than to stderr.
In earlier versions of this patch I extended -ftime-report so that
*as well* as writing to stderr, it would embed the information in any
SARIF output. This turned out to be awkward to use, in that I found
myself needing to get the data in JSON form without also having it
emitted on stderr (which was fouling my build scripts).
The timing information is written to the SARIF as a "gcc/timeReport"
property within a property bag of the "invocation" object.
Here's an example of the output:
    "invocations": [
      {
        "executionSuccessful": true,
        "toolExecutionNotifications": [],
        "properties": {
          "gcc/timeReport": {
            "timevars": [
              {
                "name": "phase setup",
                "elapsed": {
                  "user": 0.04,
                  "sys": 0,
                  "wall": 0.04,
                  "ggc_mem": 1863472
                }
              },
              [...snip...]
              {
                "name": "analyzer: processing worklist",
                "elapsed": {
                  "user": 0.06,
                  "sys": 0,
                  "wall": 0.06,
                  "ggc_mem": 48
                }
              },
              {
                "name": "analyzer: emitting diagnostics",
                "elapsed": {
                  "user": 0.01,
                  "sys": 0,
                  "wall": 0.01,
                  "ggc_mem": 0
                }
              },
              {
                "name": "TOTAL",
                "elapsed": {
                  "user": 0.21,
                  "sys": 0.03,
                  "wall": 0.24,
                  "ggc_mem": 3368736
                }
              }
            ],
            "CHECKING_P": true,
            "flag_checking": true
          }
        }
      }
    ]
The documentation notes that the precise output format is subject
to change.
I have successfully used this in my analyzer integration tests to get
timing information about which source files get slowed down by the
analyzer. I've validated the generated .sarif files against the SARIF
schema.
gcc/ChangeLog:
PR analyzer/109361
* diagnostic-client-data-hooks.h (class sarif_object): New forward
decl.
(diagnostic_client_data_hooks::add_sarif_invocation_properties):
New vfunc.
* diagnostic-format-sarif.cc: Include "diagnostic-format-sarif.h".
(class sarif_invocation): Inherit from sarif_object rather than
json::object.
(class sarif_result): Likewise.
(class sarif_ice_notification): Likewise.
(sarif_object::get_or_create_properties): New.
(sarif_invocation::prepare_to_flush): Add "context" param. Use it
to call the context's add_sarif_invocation_properties hook.
(sarif_builder::flush_to_file): Pass m_context to
sarif_invocation::prepare_to_flush.
* diagnostic-format-sarif.h: New header.
* doc/invoke.texi (Developer Options): Clarify that -ftime-report
writes to stderr. Document that if SARIF diagnostic output is
requested then any timing information is written in JSON form as
part of the SARIF output, rather than to stderr.
* timevar.cc: Include "json.h".
(timer::named_items::m_hash_map): Split out type into...
(timer::named_items::hash_map_t): ...this new typedef.
(timer::named_items::make_json): New function.
(timevar_diff): New function.
(make_json_for_timevar_time_def): New function.
(timer::timevar_def::make_json): New function.
(timer::make_json): New function.
* timevar.h (class json::value): New forward decl.
(timer::make_json): New decl.
(timer::timevar_def::make_json): New decl.
* tree-diagnostic-client-data-hooks.cc: Include
"diagnostic-format-sarif.h" and "timevar.h".
(compiler_data_hooks::add_sarif_invocation_properties): New vfunc
implementation.
gcc/testsuite/ChangeLog:
PR analyzer/109361
* c-c++-common/diagnostic-format-sarif-file-timevars-1.c: New test.
* c-c++-common/diagnostic-format-sarif-file-timevars-2.c: New test.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
|
|
BPF ISA V4 introduces sign-extending move and load operations. This
patch makes the BPF backend generate those instructions, when enabled
and useful.
A new option, -m[no-]smov, gates generation of these instructions, and is
enabled by default for -mcpu=v4 and above. Tests for the new
instructions and documentation for the new options are included.
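As a rough illustration of the source patterns involved (the function names are invented, and this is not copied from the new tests), these are the kinds of sign extensions that could map to the new instructions when -msmov is in effect:
```c
#include <stdint.h>

/* Register-to-register sign extension: a candidate for a
   sign-extending move when the target provides one.  */
int64_t
widen_reg (int32_t x)
{
  return x;
}

/* Sign extension of a value loaded from memory: a candidate for a
   sign-extending load.  */
int64_t
widen_mem (const int16_t *p)
{
  return *p;
}
```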
PR target/110782
PR target/110784
gcc/
* config/bpf/bpf.opt (msmov): New option.
* config/bpf/bpf.cc (bpf_option_override): Handle it here.
* config/bpf/bpf.md (*extendsidi2): New.
(extendhidi2): New.
(extendqidi2): New.
(extendsisi2): New.
(extendhisi2): New.
(extendqisi2): New.
* doc/invoke.texi (Option Summary): Add -msmov eBPF option.
(eBPF Options): Add -m[no-]smov. Document that -mcpu=v4
also enables -msmov.
gcc/testsuite/
* gcc.target/bpf/sload-1.c: New test.
* gcc.target/bpf/sload-pseudoc-1.c: New test.
* gcc.target/bpf/smov-1.c: New test.
* gcc.target/bpf/smov-pseudoc-1.c: New test.
|
|
This patch makes some minor cleanups to eBPF options documented in
invoke.texi:
- Delete some vestigial docs for removed -mkernel option
- Add -mbswap and -msdiv to the option summary
- Note the negative versions of several options
- Note that -mcpu=v4 also enables -msdiv.
gcc/
* doc/invoke.texi (Option Summary): Remove -mkernel eBPF option.
Add -mbswap and -msdiv eBPF options.
(eBPF Options): Remove -mkernel. Add -mno-{jmpext, jmp32,
alu32, v3-atomics, bswap, sdiv}. Document that -mcpu=v4 also
enables -msdiv.
|
|
This patch adds support for the general atomic operations introduced in
eBPF v3. In addition to the existing atomic add instruction, this adds:
- Atomic and, or, xor
- Fetching versions of these operations (including add)
- Atomic exchange
- Atomic compare-and-exchange
To control emission of these instructions, a new target option
-m[no-]v3-atomics is added. This option is enabled by -mcpu=v3
and above.
Support for these instructions was recently added in binutils.
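The operations listed above correspond to GCC's generic __atomic builtins. A minimal sketch follows, with helper names invented for illustration; which BPF instruction each one selects depends on the target options and is not shown here.
```c
/* Fetching AND: returns the value held before the operation.  */
long
fetch_and (long *p, long mask)
{
  return __atomic_fetch_and (p, mask, __ATOMIC_RELAXED);
}

/* Atomic exchange: stores V and returns the previous value.  */
long
exchange (long *p, long v)
{
  return __atomic_exchange_n (p, v, __ATOMIC_SEQ_CST);
}

/* Atomic compare-and-exchange: stores DESIRED only if *P equals
   EXPECTED; returns nonzero on success.  */
int
compare_exchange (long *p, long expected, long desired)
{
  return __atomic_compare_exchange_n (p, &expected, desired, 0,
				      __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
}
```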
gcc/
* config/bpf/bpf.opt (mv3-atomics): New option.
* config/bpf/bpf.cc (bpf_option_override): Handle it here.
* config/bpf/bpf.h (enum_reg_class): Add R0 class.
(REG_CLASS_NAMES): Likewise.
(REG_CLASS_CONTENTS): Likewise.
(REGNO_REG_CLASS): Handle R0.
* config/bpf/bpf.md (UNSPEC_XADD): Rename to UNSPEC_AADD.
(UNSPEC_AAND): New unspec.
(UNSPEC_AOR): Likewise.
(UNSPEC_AXOR): Likewise.
(UNSPEC_AFADD): Likewise.
(UNSPEC_AFAND): Likewise.
(UNSPEC_AFOR): Likewise.
(UNSPEC_AFXOR): Likewise.
(UNSPEC_AXCHG): Likewise.
(UNSPEC_ACMPX): Likewise.
(atomic_add<mode>): Use UNSPEC_AADD and atomic type attribute.
Move to...
* config/bpf/atomic.md: ...Here. New file.
* config/bpf/constraints.md (t): New constraint for R0.
* doc/invoke.texi (eBPF Options): Document -mv3-atomics.
gcc/testsuite/
* gcc.target/bpf/atomic-cmpxchg-1.c: New test.
* gcc.target/bpf/atomic-cmpxchg-2.c: New test.
* gcc.target/bpf/atomic-fetch-op-1.c: New test.
* gcc.target/bpf/atomic-fetch-op-2.c: New test.
* gcc.target/bpf/atomic-fetch-op-3.c: New test.
* gcc.target/bpf/atomic-op-1.c: New test.
* gcc.target/bpf/atomic-op-2.c: New test.
* gcc.target/bpf/atomic-op-3.c: New test.
* gcc.target/bpf/atomic-xchg-1.c: New test.
* gcc.target/bpf/atomic-xchg-2.c: New test.
|
|
We used to support signed division and signed modulus instructions in
the XBPF GCC-specific extensions to BPF. However, BPF caught up by
adding these instructions in V4 of the ISA.
This patch changes GCC to use the sdiv/smod instructions when
-mcpu=v4 or higher is selected. The testsuite and the manual have been updated
accordingly.
Tested in bpf-unknown-none.
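For reference, the source-level operations concerned are plain signed division and modulus, as in this sketch (function names invented for illustration). C semantics truncate toward zero regardless of which instructions end up implementing them.
```c
/* Signed division: a candidate for sdiv on -mcpu=v4.  */
int
signed_div (int a, int b)
{
  return a / b;
}

/* Signed modulus: a candidate for smod on -mcpu=v4.  */
int
signed_mod (int a, int b)
{
  return a % b;
}
```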
gcc/ChangeLog
PR target/110783
* config/bpf/bpf.opt: New command-line option -msdiv.
* config/bpf/bpf.md: Conditionalize sdiv/smod on bpf_has_sdiv.
* config/bpf/bpf.cc (bpf_option_override): Initialize
bpf_has_sdiv.
* doc/invoke.texi (eBPF Options): Document -msdiv.
gcc/testsuite/ChangeLog
PR target/110783
* gcc.target/bpf/xbpf-sdiv-1.c: Renamed to sdiv-1.c
* gcc.target/bpf/xbpf-smod-1.c: Renamed to smod-1.c
* gcc.target/bpf/sdiv-1.c: Renamed from xbpf-sdiv-1.c, use -mcpu=v4.
* gcc.target/bpf/smod-1.c: Renamed from xbpf-smod-1.c, use -mcpu=v4.
* gcc.target/bpf/diag-sdiv.c: Use -mcpu=v3.
* gcc.target/bpf/diag-smod.c: Likewise.
|
|
This patch makes the BPF backend use the new V4 bswap{16,32,64}
instructions in order to implement the __builtin_bswap{16,32,64}
built-ins. It also adds support for the -mcpu=v4 and -m[no-]bswap
command-line options. Tests and doc updates are included.
Tested in bpf-unknown-none.
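A short sketch of the built-ins in question. The wrapper names are invented for illustration, and the built-ins themselves are target independent, so this also builds on a host compiler.
```c
#include <stdint.h>

/* Thin wrappers around the byte-swap built-ins.  */
uint16_t
swap16 (uint16_t x)
{
  return __builtin_bswap16 (x);
}

uint32_t
swap32 (uint32_t x)
{
  return __builtin_bswap32 (x);
}

uint64_t
swap64 (uint64_t x)
{
  return __builtin_bswap64 (x);
}
```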
gcc/ChangeLog
PR target/110786
* config/bpf/bpf.opt (mcpu): Add ISA_V4 and make it the default.
(mbswap): New option.
* config/bpf/bpf-opts.h (enum bpf_isa_version): New value ISA_V4.
* config/bpf/bpf.cc (bpf_option_override): Set bpf_has_bswap.
* config/bpf/bpf.md: Use bswap instructions if available for
bswap* insn, and fix constraint.
* doc/invoke.texi (eBPF Options): Document -mcpu=v4 and -mbswap.
gcc/testsuite/ChangeLog
PR target/110786
* gcc.target/bpf/bswap-1.c: Pass -mcpu=v3 to build test.
* gcc.target/bpf/bswap-2.c: New test.
|
|
Add the pseudo-C BPF assembly dialect, which is already supported by clang and
widely used in the Linux kernel.
gcc/ChangeLog:
PR target/110770
* config/bpf/bpf.opt: Added option -masm=<dialect>.
* config/bpf/bpf-opts.h (enum bpf_asm_dialect): New type.
* config/bpf/bpf.cc (bpf_print_register): New function.
(bpf_print_register): Support pseudo-c syntax for registers.
(bpf_print_operand_address): Likewise.
* config/bpf/bpf.h (ASM_SPEC): handle -msasm.
(ASSEMBLER_DIALECT): Define.
* config/bpf/bpf.md: Added pseudo-c templates.
* doc/invoke.texi (-masm=): New eBPF option item.
|
|
and len
This patch depends on:
https://gcc.gnu.org/pipermail/gcc-patches/2023-July/625121.html
Hi, Richard and Richi.
This patch aligns the order of mask and len.
Currently, according to this piece of code:
  if (final_len && final_mask)
    call = gimple_build_call_internal (
      IFN_LEN_MASK_GATHER_LOAD, 7, dataref_ptr,
      vec_offset, scale, zero, final_mask, final_len,
      bias);
the order of mask and len is {mask,len,bias}: "mask" comes before "len".
The reason for this order is that we want to reuse the existing code for
MASK_GATHER_LOAD/MASK_SCATTER_STORE.
The same applies to COND_LEN_*, where we want to reuse the code for COND_*.
Reusing the existing MASK_* and COND_* code keeps the changes small and
keeps the code easy to read and maintain.
To avoid confusion in auto-vectorization patterns that include both mask and len,
this patch aligns the order of mask and len in both the GIMPLE IR and the RTL patterns
to {mask, len, bias}.
Bootstrap and regression testing are in progress.
gcc/ChangeLog:
* config/riscv/autovec.md: Align order of mask and len.
* config/riscv/riscv-v.cc (expand_load_store): Ditto.
(expand_gather_scatter): Ditto.
* doc/md.texi: Ditto.
* internal-fn.cc (add_len_and_mask_args): Ditto.
(add_mask_and_len_args): Ditto.
(expand_partial_load_optab_fn): Ditto.
(expand_partial_store_optab_fn): Ditto.
(expand_scatter_store_optab_fn): Ditto.
(expand_gather_load_optab_fn): Ditto.
(internal_fn_len_index): Ditto.
(internal_fn_mask_index): Ditto.
(internal_len_load_store_bias): Ditto.
* tree-vect-stmts.cc (vectorizable_store): Ditto.
(vectorizable_load): Ditto.
|
|
Hi.
Starting with the LEN_MASK_GATHER_LOAD/LEN_MASK_SCATTER_STORE and COND_LEN_* patterns,
the order of len and mask is {mask,len,bias}.
The "mask" argument comes before "len" because we want to keep
"mask" in the same position as in the mask_* and cond_* patterns, so that the existing
code paths for mask_* and cond_* can be reused. Otherwise we would need to change much
more code, which would make it harder to maintain.
Now that we already have COND_LEN_*, it is natural to rename "LEN_MASK" to "MASK_LEN"
to keep the naming scheme consistent.
This patch only changes the name "LEN_MASK" to "MASK_LEN".
No functional change.
gcc/ChangeLog:
* config/riscv/autovec.md (len_maskload<mode><vm>): Change LEN_MASK into MASK_LEN.
(mask_len_load<mode><vm>): Ditto.
(len_maskstore<mode><vm>): Ditto.
(mask_len_store<mode><vm>): Ditto.
(len_mask_gather_load<RATIO64:mode><RATIO64I:mode>): Ditto.
(mask_len_gather_load<RATIO64:mode><RATIO64I:mode>): Ditto.
(len_mask_gather_load<RATIO32:mode><RATIO32I:mode>): Ditto.
(mask_len_gather_load<RATIO32:mode><RATIO32I:mode>): Ditto.
(len_mask_gather_load<RATIO16:mode><RATIO16I:mode>): Ditto.
(mask_len_gather_load<RATIO16:mode><RATIO16I:mode>): Ditto.
(len_mask_gather_load<RATIO8:mode><RATIO8I:mode>): Ditto.
(mask_len_gather_load<RATIO8:mode><RATIO8I:mode>): Ditto.
(len_mask_gather_load<RATIO4:mode><RATIO4I:mode>): Ditto.
(mask_len_gather_load<RATIO4:mode><RATIO4I:mode>): Ditto.
(len_mask_gather_load<RATIO2:mode><RATIO2I:mode>): Ditto.
(mask_len_gather_load<RATIO2:mode><RATIO2I:mode>): Ditto.
(len_mask_gather_load<RATIO1:mode><RATIO1:mode>): Ditto.
(mask_len_gather_load<RATIO1:mode><RATIO1:mode>): Ditto.
(len_mask_scatter_store<RATIO64:mode><RATIO64I:mode>): Ditto.
(mask_len_scatter_store<RATIO64:mode><RATIO64I:mode>): Ditto.
(len_mask_scatter_store<RATIO32:mode><RATIO32I:mode>): Ditto.
(mask_len_scatter_store<RATIO32:mode><RATIO32I:mode>): Ditto.
(len_mask_scatter_store<RATIO16:mode><RATIO16I:mode>): Ditto.
(mask_len_scatter_store<RATIO16:mode><RATIO16I:mode>): Ditto.
(len_mask_scatter_store<RATIO8:mode><RATIO8I:mode>): Ditto.
(mask_len_scatter_store<RATIO8:mode><RATIO8I:mode>): Ditto.
(len_mask_scatter_store<RATIO4:mode><RATIO4I:mode>): Ditto.
(mask_len_scatter_store<RATIO4:mode><RATIO4I:mode>): Ditto.
(len_mask_scatter_store<RATIO2:mode><RATIO2I:mode>): Ditto.
(mask_len_scatter_store<RATIO2:mode><RATIO2I:mode>): Ditto.
(len_mask_scatter_store<RATIO1:mode><RATIO1:mode>): Ditto.
(mask_len_scatter_store<RATIO1:mode><RATIO1:mode>): Ditto.
* doc/md.texi: Ditto.
* genopinit.cc (main): Ditto.
(CMP_NAME): Ditto. Ditto.
* gimple-fold.cc (arith_overflowed_p): Ditto.
(gimple_fold_partial_load_store_mem_ref): Ditto.
(gimple_fold_call): Ditto.
* internal-fn.cc (len_maskload_direct): Ditto.
(mask_len_load_direct): Ditto.
(len_maskstore_direct): Ditto.
(mask_len_store_direct): Ditto.
(expand_call_mem_ref): Ditto.
(expand_len_maskload_optab_fn): Ditto.
(expand_mask_len_load_optab_fn): Ditto.
(expand_len_maskstore_optab_fn): Ditto.
(expand_mask_len_store_optab_fn): Ditto.
(direct_len_maskload_optab_supported_p): Ditto.
(direct_mask_len_load_optab_supported_p): Ditto.
(direct_len_maskstore_optab_supported_p): Ditto.
(direct_mask_len_store_optab_supported_p): Ditto.
(internal_load_fn_p): Ditto.
(internal_store_fn_p): Ditto.
(internal_gather_scatter_fn_p): Ditto.
(internal_fn_len_index): Ditto.
(internal_fn_mask_index): Ditto.
(internal_fn_stored_value_index): Ditto.
(internal_len_load_store_bias): Ditto.
* internal-fn.def (LEN_MASK_GATHER_LOAD): Ditto.
(MASK_LEN_GATHER_LOAD): Ditto.
(LEN_MASK_LOAD): Ditto.
(MASK_LEN_LOAD): Ditto.
(LEN_MASK_SCATTER_STORE): Ditto.
(MASK_LEN_SCATTER_STORE): Ditto.
(LEN_MASK_STORE): Ditto.
(MASK_LEN_STORE): Ditto.
* optabs-query.cc (supports_vec_gather_load_p): Ditto.
(supports_vec_scatter_store_p): Ditto.
* optabs-tree.cc (target_supports_mask_load_store_p): Ditto.
(target_supports_len_load_store_p): Ditto.
* optabs.def (OPTAB_CD): Ditto.
* tree-ssa-alias.cc (ref_maybe_used_by_call_p_1): Ditto.
(call_may_clobber_ref_p_1): Ditto.
* tree-ssa-dse.cc (initialize_ao_ref_for_dse): Ditto.
(dse_optimize_stmt): Ditto.
* tree-ssa-loop-ivopts.cc (get_mem_type_for_internal_fn): Ditto.
(get_alias_ptr_type_for_ptr_address): Ditto.
* tree-vect-data-refs.cc (vect_gather_scatter_fn_p): Ditto.
* tree-vect-patterns.cc (vect_recog_gather_scatter_pattern): Ditto.
* tree-vect-stmts.cc (check_load_store_for_partial_vectors): Ditto.
(vect_get_strided_load_store_ops): Ditto.
(vectorizable_store): Ditto.
(vectorizable_load): Ditto.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-1.c: Ditto.
* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-10.c: Ditto.
* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-11.c: Ditto.
* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-12.c: Ditto.
* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-3.c: Ditto.
* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-4.c: Ditto.
* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-5.c: Ditto.
* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-6.c: Ditto.
* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-7.c: Ditto.
* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-8.c: Ditto.
* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-9.c: Ditto.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-1.c: Ditto.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-10.c: Ditto.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-11.c: Ditto.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-3.c: Ditto.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-4.c: Ditto.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-5.c: Ditto.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-6.c: Ditto.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-7.c: Ditto.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-8.c: Ditto.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-9.c: Ditto.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-1.c: Ditto.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-10.c: Ditto.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-3.c: Ditto.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-4.c: Ditto.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-5.c: Ditto.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-6.c: Ditto.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-7.c: Ditto.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-8.c: Ditto.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-9.c: Ditto.
* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-1.c: Ditto.
* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-10.c: Ditto.
* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-3.c: Ditto.
* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-4.c: Ditto.
* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-5.c: Ditto.
* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-6.c: Ditto.
* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-7.c: Ditto.
* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-8.c: Ditto.
* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-9.c: Ditto.
* gcc.target/riscv/rvv/autovec/gather-scatter/strided_load-1.c: Ditto.
* gcc.target/riscv/rvv/autovec/gather-scatter/strided_load-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/gather-scatter/strided_store-1.c: Ditto.
* gcc.target/riscv/rvv/autovec/gather-scatter/strided_store-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/partial/gimple_fold-1.c: Ditto.
|
|
This patch also documents, in gcc/doc/invoke.texi, the analyzer parameters
introduced in r14-2029-g0e466e978c7286.
2023-07-20 Martin Jambor <mjambor@suse.cz>
* doc/invoke.texi (analyzer-text-art-string-ellipsis-threshold): New.
(analyzer-text-art-ideal-canvas-width): Likewise.
(analyzer-text-art-string-ellipsis-head-len): Likewise.
(analyzer-text-art-string-ellipsis-tail-len): Likewise.
|
|
iseqsig() is a C2x library function, for signaling floating-point
equality checks. Provide a GCC-builtin for it, which is folded to
a series of comparisons.
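A usage sketch, with a fallback for compilers that lack the builtin. The wrapper name and the HAVE_BUILTIN_ISEQSIG macro are invented for illustration, and the fallback pair of comparisons is my assumption about an equivalent formulation, not a quotation of the folding code.
```c
#if defined (__has_builtin)
# if __has_builtin (__builtin_iseqsig)
#  define HAVE_BUILTIN_ISEQSIG 1
# endif
#endif

/* Return nonzero if X equals Y, using comparisons that are allowed to
   raise the invalid exception when an operand is a NaN.  */
int
equal_signaling (double x, double y)
{
#ifdef HAVE_BUILTIN_ISEQSIG
  return __builtin_iseqsig (x, y);
#else
  /* Fallback assumed for this sketch: two ordinary relational
     comparisons that are both true only when x equals y.  */
  return x <= y && x >= y;
#endif
}
```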
2022-09-01 Francois-Xavier Coudert <fxcoudert@gcc.gnu.org>
PR middle-end/77928
gcc/
* doc/extend.texi: Document iseqsig builtin.
* builtins.cc (fold_builtin_iseqsig): New function.
(fold_builtin_2): Handle BUILT_IN_ISEQSIG.
(is_inexpensive_builtin): Handle BUILT_IN_ISEQSIG.
* builtins.def (BUILT_IN_ISEQSIG): New built-in.
gcc/c-family/
* c-common.cc (check_builtin_function_arguments):
Handle BUILT_IN_ISEQSIG.
gcc/testsuite/
* gcc.dg/torture/builtin-iseqsig-1.c: New test.
* gcc.dg/torture/builtin-iseqsig-2.c: New test.
* gcc.dg/torture/builtin-iseqsig-3.c: New test.
|
|
gcc/ChangeLog:
* doc/invoke.texi: Remove AVX512VP2INTERSECT in
Granite Rapids{, D} from documentation.
|
|
Hi, Richard and Richi.
This patch adds the mask_len_fold_left_plus pattern to support in-order floating-point
reduction for targets that support len loop control.
Consider the following case:
double
foo2 (double *__restrict a,
      double init,
      int *__restrict cond,
      int n)
{
  for (int i = 0; i < n; i++)
    if (cond[i])
      init += a[i];
  return init;
}
ARM SVE:
...
vec_mask_and_60 = loop_mask_54 & mask__23.33_57;
vect__ifc__35.37_64 = .VCOND_MASK (vec_mask_and_60, vect__8.36_61, { 0.0, ... });
_36 = .MASK_FOLD_LEFT_PLUS (init_20, vect__ifc__35.37_64, loop_mask_54);
...
For RVV, we want to see:
...
_36 = .MASK_LEN_FOLD_LEFT_PLUS (init_20, vect__ifc__35.37_64, control_mask, loop_len, bias);
...
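To make the intended semantics concrete, here is a scalar reference model of what I understand .MASK_LEN_FOLD_LEFT_PLUS to compute: a strictly in-order sum over the first len + bias elements whose mask bit is set. The function name and parameter types are assumptions of this sketch, not part of the patch.
```c
/* Scalar model: accumulate VEC[i] into INIT, in order, for every
   i < LEN + BIAS with MASK[i] set.  No reassociation is performed.  */
double
mask_len_fold_left_plus_ref (double init, const double *vec,
			     const unsigned char *mask, int len, int bias)
{
  double acc = init;
  for (int i = 0; i < len + bias; i++)
    if (mask[i])
      acc += vec[i];
  return acc;
}
```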
gcc/ChangeLog:
* doc/md.texi: Add mask_len_fold_left_plus.
* internal-fn.cc (mask_len_fold_left_direct): Ditto.
(expand_mask_len_fold_left_optab_fn): Ditto.
(direct_mask_len_fold_left_optab_supported_p): Ditto.
* internal-fn.def (MASK_LEN_FOLD_LEFT_PLUS): Ditto.
* optabs.def (OPTAB_D): Ditto.
|
|
This patch fixes many limitations of the uninitialized variable static analysis.
NEW is now understood, and local variable pointers and non-VAR parameters
are tracked.
gcc/ChangeLog:
* doc/gm2.texi (Semantic checking): Change example testwithptr
to testnew6.
gcc/m2/ChangeLog:
* Make-lang.in: Minor formatting change.
* gm2-compiler/M2GCCDeclare.mod
(DeclareUnboundedProcedureParameters): Rename local variables.
(WalkUnboundedProcedureParameters): Rename local variables.
(DoVariableDeclaration): Avoid declaration of a variable if
it is on the heap (used by static analysis only).
* gm2-compiler/M2GenGCC.mod: Formatting.
* gm2-compiler/M2Quads.def (GetQuadTrash): New procedure function.
* gm2-compiler/M2Quads.mod (GetQuadTrash): New procedure function.
(QuadFrame): Add Trash field.
(BuildRealFuncProcCall): Detect ALLOCATE and DEALLOCATE and create
a heap variable for parameter 1 saving it as the trashed variable
for static analysis.
(GenQuadOTrash): New procedure.
(DisplayQuadRange): Bugfix. Write the scope number.
* gm2-compiler/M2SymInit.mod: Rewritten to separate LValue
equivalence from LValue to RValue pairings. Comprehensive
detection of variant record implemented. Allow dereferencing
of pointers through LValue/RValue chains.
* gm2-compiler/SymbolTable.def (PutVarHeap): New procedure.
(IsVarHeap): New procedure function.
(ForeachParamSymDo): New procedure.
* gm2-compiler/SymbolTable.mod (PutVarHeap): New procedure.
(IsVarHeap): New procedure function.
(ForeachParamSymDo): New procedure.
(MakeVariableForParam): Reformatted.
(CheckForUnknownInModule): Reformatted.
(SymVar): Add field Heap.
(MakeVar): Assign Heap to FALSE.
gcc/testsuite/ChangeLog:
* gm2/switches/uninit-variable-checking/pass/assignparam.mod: New test.
* gm2/switches/uninit-variable-checking/pass/tiny.mod: New test.
* gm2/switches/uninit-variable-checking/procedures/fail/switches-uninit-variable-checking-procedures-fail.exp:
New test.
* gm2/switches/uninit-variable-checking/procedures/fail/testnew.mod: New test.
* gm2/switches/uninit-variable-checking/procedures/fail/testnew2.mod: New test.
* gm2/switches/uninit-variable-checking/procedures/fail/testnew3.mod: New test.
* gm2/switches/uninit-variable-checking/procedures/fail/testnew4.mod: New test.
* gm2/switches/uninit-variable-checking/procedures/fail/testnew5.mod: New test.
* gm2/switches/uninit-variable-checking/procedures/fail/testnew6.mod: New test.
* gm2/switches/uninit-variable-checking/procedures/fail/testptrptr.mod: New test.
* gm2/switches/uninit-variable-checking/procedures/pass/assignparam2.mod: New test.
* gm2/switches/uninit-variable-checking/procedures/pass/switches-uninit-variable-checking-procedures-pass.exp:
New test.
* gm2/switches/uninit-variable-checking/procedures/pass/testnew5.mod: New test.
* gm2/switches/uninit-variable-checking/procedures/pass/testnew6.mod: New test.
* gm2/switches/uninit-variable-checking/procedures/pass/testparamlvalue.mod: New test.
* gm2/switches/uninit-variable-checking/procedures/pass/testparamrvalue.mod: New test.
* gm2/switches/uninit-variable-checking/procedures/pass/testproc.mod: New test.
* gm2/switches/uninit-variable-checking/procedures/pass/testptrptr.mod: New test.
Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>
|
|
gcc/ChangeLog:
* doc/extend.texi: Add @cindex on __auto_type.
|
|
gcc/ChangeLog:
* common/config/i386/cpuinfo.h (get_intel_cpu): Handle Lunar Lake,
Arrow Lake and Arrow Lake S.
* common/config/i386/i386-common.cc:
(processor_name): Add arrowlake.
(processor_alias_table): Add arrow lake, arrow lake s and lunar
lake.
* common/config/i386/i386-cpuinfo.h (enum processor_subtypes):
Add INTEL_COREI7_ARROWLAKE and INTEL_COREI7_ARROWLAKE_S.
* config.gcc: Add -march=arrowlake and -march=arrowlake-s.
* config/i386/driver-i386.cc (host_detect_local_cpu): Handle
arrowlake-s.
* config/i386/i386-c.cc (ix86_target_macros_internal): Add
arrowlake.
* config/i386/i386-options.cc (m_ARROWLAKE): New.
(processor_cost_table): Add arrowlake.
* config/i386/i386.h (enum processor_type):
Add PROCESSOR_ARROWLAKE.
* config/i386/x86-tune.def: Add m_ARROWLAKE.
* doc/extend.texi: Add arrowlake and arrowlake-s.
* doc/invoke.texi: Ditto.
gcc/testsuite/ChangeLog:
* g++.target/i386/mv16.C: Add arrowlake and arrowlake-s.
* gcc.target/i386/funcspec-56.inc: Handle new march.
|
|
gcc/ChangeLog:
* common/config/i386/cpuinfo.h (get_available_features):
Detect SM4.
* common/config/i386/i386-common.cc (OPTION_MASK_ISA2_SM4_SET,
OPTION_MASK_ISA2_SM4_UNSET): New.
(OPTION_MASK_ISA2_AVX_UNSET): Add SM4.
(ix86_handle_option): Handle -msm4.
* common/config/i386/i386-cpuinfo.h (enum processor_features):
Add FEATURE_SM4.
* common/config/i386/i386-isas.h: Add ISA_NAME_TABLE_ENTRY for
sm4.
* config.gcc: Add sm4intrin.h.
* config/i386/cpuid.h (bit_SM4): New.
* config/i386/i386-builtin.def (BDESC): Add new builtins.
* config/i386/i386-c.cc (ix86_target_macros_internal): Define
__SM4__.
* config/i386/i386-isa.def (SM4): Add DEF_PTA(SM4).
* config/i386/i386-options.cc (isa2_opts): Add -msm4.
(ix86_valid_target_attribute_inner_p): Handle sm4.
* config/i386/i386.opt: Add option -msm4.
* config/i386/immintrin.h: Include sm4intrin.h.
* config/i386/sse.md (vsm4key4_<mode>): New define insn.
(vsm4rnds4_<mode>): Ditto.
* doc/extend.texi: Document sm4.
* doc/invoke.texi: Document -msm4.
* doc/sourcebuild.texi: Document target sm4.
* config/i386/sm4intrin.h: New file.
gcc/testsuite/ChangeLog:
* g++.dg/other/i386-2.C: Add -msm4.
* g++.dg/other/i386-3.C: Ditto.
* gcc.target/i386/funcspec-56.inc: Add new target attribute.
* gcc.target/i386/sse-12.c: Add -msm4.
* gcc.target/i386/sse-13.c: Ditto.
* gcc.target/i386/sse-14.c: Ditto.
* gcc.target/i386/sse-22.c: Add sm4.
* gcc.target/i386/sse-23.c: Ditto.
* lib/target-supports.exp (check_effective_target_sm4): New.
* gcc.target/i386/sm4-1.c: New test.
* gcc.target/i386/sm4-check.h: Ditto.
* gcc.target/i386/sm4key4-2.c: Ditto.
* gcc.target/i386/sm4rnds4-2.c: Ditto.
|
|
gcc/ChangeLog:
* common/config/i386/cpuinfo.h (get_available_features):
Detect SHA512.
* common/config/i386/i386-common.cc (OPTION_MASK_ISA2_SHA512_SET,
OPTION_MASK_ISA2_SHA512_UNSET): New.
(OPTION_MASK_ISA2_AVX_UNSET): Add SHA512.
(ix86_handle_option): Handle -msha512.
* common/config/i386/i386-cpuinfo.h (enum processor_features):
Add FEATURE_SHA512.
* common/config/i386/i386-isas.h: Add ISA_NAME_TABLE_ENTRY for
sha512.
* config.gcc: Add sha512intrin.h.
* config/i386/cpuid.h (bit_SHA512): New.
* config/i386/i386-builtin-types.def:
Add DEF_FUNCTION_TYPE (V4DI, V4DI, V4DI, V2DI).
* config/i386/i386-builtin.def (BDESC): Add new builtins.
* config/i386/i386-c.cc (ix86_target_macros_internal): Define
__SHA512__.
* config/i386/i386-expand.cc (ix86_expand_args_builtin): Handle
V4DI_FTYPE_V4DI_V4DI_V2DI and V4DI_FTYPE_V4DI_V2DI.
* config/i386/i386-isa.def (SHA512): Add DEF_PTA(SHA512).
* config/i386/i386-options.cc (isa2_opts): Add -msha512.
(ix86_valid_target_attribute_inner_p): Handle sha512.
* config/i386/i386.opt: Add option -msha512.
* config/i386/immintrin.h: Include sha512intrin.h.
* config/i386/sse.md (vsha512msg1): New define insn.
(vsha512msg2): Ditto.
(vsha512rnds2): Ditto.
* doc/extend.texi: Document sha512.
* doc/invoke.texi: Document -msha512.
* doc/sourcebuild.texi: Document target sha512.
* config/i386/sha512intrin.h: New file.
gcc/testsuite/ChangeLog:
* g++.dg/other/i386-2.C: Add -msha512.
* g++.dg/other/i386-3.C: Ditto.
* gcc.target/i386/funcspec-56.inc: Add new target attribute.
* gcc.target/i386/sse-12.c: Add -msha512.
* gcc.target/i386/sse-13.c: Ditto.
* gcc.target/i386/sse-14.c: Ditto.
* gcc.target/i386/sse-22.c: Add sha512.
* gcc.target/i386/sse-23.c: Ditto.
* lib/target-supports.exp (check_effective_target_sha512): New.
* gcc.target/i386/sha512-1.c: New test.
* gcc.target/i386/sha512-check.h: Ditto.
* gcc.target/i386/sha512msg1-2.c: Ditto.
* gcc.target/i386/sha512msg2-2.c: Ditto.
* gcc.target/i386/sha512rnds2-2.c: Ditto.
|
|
gcc/ChangeLog:
* common/config/i386/cpuinfo.h (get_available_features):
Detect SM3.
* common/config/i386/i386-common.cc (OPTION_MASK_ISA2_SM3_SET,
OPTION_MASK_ISA2_SM3_UNSET): New.
(OPTION_MASK_ISA2_AVX_UNSET): Add SM3.
(ix86_handle_option): Handle -msm3.
* common/config/i386/i386-cpuinfo.h (enum processor_features):
Add FEATURE_SM3.
* common/config/i386/i386-isas.h: Add ISA_NAME_TABLE_ENTRY for
SM3.
* config.gcc: Add sm3intrin.h.
* config/i386/cpuid.h (bit_SM3): New.
* config/i386/i386-builtin-types.def:
Add DEF_FUNCTION_TYPE (V4SI, V4SI, V4SI, V4SI, INT).
* config/i386/i386-builtin.def (BDESC): Add new builtins.
* config/i386/i386-c.cc (ix86_target_macros_internal): Define
__SM3__.
* config/i386/i386-expand.cc (ix86_expand_args_builtin): Handle
V4SI_FTYPE_V4SI_V4SI_V4SI_INT.
* config/i386/i386-isa.def (SM3): Add DEF_PTA(SM3).
* config/i386/i386-options.cc (isa2_opts): Add -msm3.
(ix86_valid_target_attribute_inner_p): Handle sm3.
* config/i386/i386.opt: Add option -msm3.
* config/i386/immintrin.h: Include sm3intrin.h.
* config/i386/sse.md (vsm3msg1): New define insn.
(vsm3msg2): Ditto.
(vsm3rnds2): Ditto.
* doc/extend.texi: Document sm3.
* doc/invoke.texi: Document -msm3.
* doc/sourcebuild.texi: Document target sm3.
* config/i386/sm3intrin.h: New file.
gcc/testsuite/ChangeLog:
* g++.dg/other/i386-2.C: Add -msm3.
* g++.dg/other/i386-3.C: Ditto.
* gcc.target/i386/avx-1.c: Add new define for immediate.
* gcc.target/i386/funcspec-56.inc: Add new target attribute.
* gcc.target/i386/sse-12.c: Add -msm3.
* gcc.target/i386/sse-13.c: Ditto.
* gcc.target/i386/sse-14.c: Ditto.
* gcc.target/i386/sse-22.c: Add sm3.
* gcc.target/i386/sse-23.c: Ditto.
* lib/target-supports.exp (check_effective_target_sm3): New.
* gcc.target/i386/sm3-1.c: New test.
* gcc.target/i386/sm3-check.h: Ditto.
* gcc.target/i386/sm3msg1-2.c: Ditto.
* gcc.target/i386/sm3msg2-2.c: Ditto.
* gcc.target/i386/sm3rnds2-2.c: Ditto.
|
|
gcc/ChangeLog:
* common/config/i386/cpuinfo.h (get_available_features): Detect
avxvnniint16.
* common/config/i386/i386-common.cc
(OPTION_MASK_ISA2_AVXVNNIINT16_SET): New.
(OPTION_MASK_ISA2_AVXVNNIINT16_UNSET): Ditto.
(ix86_handle_option): Handle -mavxvnniint16.
* common/config/i386/i386-cpuinfo.h (enum processor_features):
Add FEATURE_AVXVNNIINT16.
* common/config/i386/i386-isas.h: Add ISA_NAME_TABLE_ENTRY for
avxvnniint16.
* config.gcc: Add avxvnniint16intrin.h.
* config/i386/avxvnniint16intrin.h: New file.
* config/i386/cpuid.h (bit_AVXVNNIINT16): New.
* config/i386/i386-builtin.def: Add new builtins.
* config/i386/i386-c.cc (ix86_target_macros_internal): Define
__AVXVNNIINT16__.
* config/i386/i386-options.cc (isa2_opts): Add -mavxvnniint16.
(ix86_valid_target_attribute_inner_p): Handle avxvnniint16.
* config/i386/i386-isa.def: Add DEF_PTA(AVXVNNIINT16).
* config/i386/i386.opt: Add option -mavxvnniint16.
* config/i386/immintrin.h: Include avxvnniint16intrin.h.
* config/i386/sse.md
(vpdp<vpdpwprodtype>_<mode>): New define_insn.
* doc/extend.texi: Document avxvnniint16.
* doc/invoke.texi: Document -mavxvnniint16.
* doc/sourcebuild.texi: Document target avxvnniint16.
gcc/testsuite/ChangeLog:
* g++.dg/other/i386-2.C: Add -mavxvnniint16.
* g++.dg/other/i386-3.C: Ditto.
* gcc.target/i386/avx-check.h: Add avxvnniint16 check.
* gcc.target/i386/sse-12.c: Add -mavxvnniint16.
* gcc.target/i386/sse-13.c: Ditto.
* gcc.target/i386/sse-14.c: Ditto.
* gcc.target/i386/sse-22.c: Ditto.
* gcc.target/i386/sse-23.c: Ditto.
* gcc.target/i386/funcspec-56.inc: Add new target attribute.
* lib/target-supports.exp
(check_effective_target_avxvnniint16): New.
* gcc.target/i386/avxvnniint16-1.c: Ditto.
* gcc.target/i386/avxvnniint16-vpdpwusd-2.c: Ditto.
* gcc.target/i386/avxvnniint16-vpdpwusds-2.c: Ditto.
* gcc.target/i386/avxvnniint16-vpdpwsud-2.c: Ditto.
* gcc.target/i386/avxvnniint16-vpdpwsuds-2.c: Ditto.
* gcc.target/i386/avxvnniint16-vpdpwuud-2.c: Ditto.
* gcc.target/i386/avxvnniint16-vpdpwuuds-2.c: Ditto.
Co-authored-by: Haochen Jiang <haochen.jiang@intel.com>
|
|
Committed as obvious after making sure the documentation still builds.
gcc/ChangeLog:
* doc/contrib.texi: Update my entry.
|
|
Change the return value from void to double for __builtin_set_fpscr_rn.
The return value consists of the FPSCR fields DRN, VE, OE, UE, ZE, XE, NI,
RN bit positions. A new test file, test powerpc/test_fpscr_rn_builtin_2.c,
is added to test the new return value for the built-in.
The value __SET_FPSCR_RN_RETURNS_FPSCR__ is defined if
__builtin_set_fpscr_rn returns a double.
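A minimal sketch of how user code might key off the new macro. The wrapper name set_rn_and_get_fpscr is hypothetical; only __builtin_set_fpscr_rn and __SET_FPSCR_RN_RETURNS_FPSCR__ come from this patch. When the macro is absent (an older compiler or a non-PowerPC target), this sketch does not touch the rounding mode at all.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical wrapper: set the rounding mode to RN and, when the
   compiler provides the double-returning built-in, copy the raw bits
   of the returned FPSCR fields into *FPSCR_BITS.
   Returns 1 if the fields were captured, 0 otherwise.  */
int
set_rn_and_get_fpscr (int rn, uint64_t *fpscr_bits)
{
#ifdef __SET_FPSCR_RN_RETURNS_FPSCR__
  /* The built-in returns the FPSCR fields packed in a double.  */
  double fields = __builtin_set_fpscr_rn (rn);
  memcpy (fpscr_bits, &fields, sizeof fields);
  return 1;
#else
  /* No double-returning built-in here; nothing is set or captured.  */
  (void) rn;
  *fpscr_bits = 0;
  return 0;
#endif
}
```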
gcc/ChangeLog:
* config/rs6000/rs6000-builtins.def (__builtin_set_fpscr_rn): Update
built-in definition return type.
* config/rs6000/rs6000-c.cc (rs6000_target_modify_macros): Add check,
define __SET_FPSCR_RN_RETURNS_FPSCR__ macro.
* config/rs6000/rs6000.md (rs6000_set_fpscr_rn): Add return
argument to return FPSCR fields.
* doc/extend.texi (__builtin_set_fpscr_rn): Update description for
the return value. Add description for
__SET_FPSCR_RN_RETURNS_FPSCR__ macro.
gcc/testsuite/ChangeLog:
* gcc.target/powerpc/test_fpscr_rn_builtin.c: Rename to
test_fpscr_rn_builtin_1.c. Add comment.
* gcc.target/powerpc/test_fpscr_rn_builtin_2.c: New test for the
return value of __builtin_set_fpscr_rn builtin.
|
|
gcc/ChangeLog:
* common/config/i386/cpuinfo.h
(get_intel_cpu): Handle Granite Rapids D.
* common/config/i386/i386-common.cc:
(processor_alias_table): Add graniterapids-d.
* common/config/i386/i386-cpuinfo.h
(enum processor_subtypes): Add INTEL_COREI7_GRANITERAPIDS_D.
* config.gcc: Add -march=graniterapids-d.
* config/i386/driver-i386.cc (host_detect_local_cpu):
Handle graniterapids-d.
* config/i386/i386.h: (PTA_GRANITERAPIDS_D): New.
* doc/extend.texi: Add graniterapids-d.
* doc/invoke.texi: Ditto.
gcc/testsuite/ChangeLog:
* g++.target/i386/mv16.C: Add graniterapids-d.
* gcc.target/i386/funcspec-56.inc: Handle new march.
|
|
This patch combines basic blocks for static analysis of uninitialized
variables providing that they are not the top of a loop, are not reached
by a conditional and are not reached after a procedure call. It also
avoids checking array accesses for static analysis. Finally the patch
adds switch modifiers to allow static analysis to include conditional
branches for subsequent basic block analysis.
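The combining rule described above can be sketched in C as follows. This is only an illustration: struct bb_info, its fields and can_combine are hypothetical names, not code from M2BasicBlock or M2SymInit, and modelling the switch modifier as a single boolean is an assumption.

```c
#include <stdbool.h>

/* Hypothetical per-block facts; illustrative only.  */
struct bb_info
{
  bool top_of_loop;            /* Block is the top of a loop.  */
  bool reached_by_conditional; /* Block is reached by a conditional.  */
  bool reached_after_call;     /* Block is reached after a procedure call.  */
};

/* Return true if BB may be combined with the preceding blocks for the
   uninitialized variable analysis.  INCLUDE_CONDITIONAL models the
   switch modifier that lets conditional branches take part.  */
bool
can_combine (const struct bb_info *bb, bool include_conditional)
{
  if (bb->top_of_loop || bb->reached_after_call)
    return false;
  if (bb->reached_by_conditional && !include_conditional)
    return false;
  return true;
}
```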
gcc/ChangeLog:
* doc/gm2.texi (-Wuninit-variable-checking=): New item.
gcc/m2/ChangeLog:
* gm2-compiler/M2BasicBlock.def (InitBasicBlocksFromRange): New
parameter ScopeSym.
* gm2-compiler/M2BasicBlock.mod (ConvertQuads2BasicBlock): New
parameter ScopeSym.
(InitBasicBlocksFromRange): New parameter ScopeSym. Call
ConvertQuads2BasicBlock with ScopeSym.
(DisplayBasicBlocks): Uncomment.
* gm2-compiler/M2Code.mod: Replace VariableAnalysis with
ScopeBlockVariableAnalysis.
(InitialDeclareAndOptimize): Add parameter scope.
(SecondDeclareAndOptimize): Add parameter scope.
* gm2-compiler/M2GCCDeclare.mod (DeclareConstructor): Add scope
parameter to DeclareTypesConstantsProceduresInRange.
(DeclareTypesConstantsProceduresInRange): New parameter scope.
Pass scope to DisplayQuadRange. Reformatted.
* gm2-compiler/M2GenGCC.def (ConvertQuadsToTree): New parameter
scope.
* gm2-compiler/M2GenGCC.mod (ConvertQuadsToTree): New parameter
scope.
* gm2-compiler/M2Optimize.mod (KnownReachable): New parameter
scope.
* gm2-compiler/M2Options.def (SetUninitVariableChecking): Add
arg parameter.
* gm2-compiler/M2Options.mod (SetUninitVariableChecking): Add
arg parameter and set boolean UninitVariableChecking and
UninitVariableConditionalChecking.
(UninitVariableConditionalChecking): New boolean set to FALSE.
* gm2-compiler/M2Quads.def (IsGoto): New procedure function.
(DisplayQuadRange): Add scope parameter.
(LoopAnalysis): Add scope parameter.
* gm2-compiler/M2Quads.mod: Import PutVarArrayRef.
(IsGoto): New procedure function.
(LoopAnalysis): Add scope parameter and use MetaErrorT1 instead
of WarnStringAt.
(BuildStaticArray): Call PutVarArrayRef.
(BuildDynamicArray): Call PutVarArrayRef.
(DisplayQuadRange): Add scope parameter.
(GetM2OperatorDesc): Add relational condition cases.
* gm2-compiler/M2Scope.def (ScopeProcedure): Add parameter.
* gm2-compiler/M2Scope.mod (DisplayScope): Pass scopeSym to
DisplayQuadRange.
(ForeachScopeBlockDo): Pass scopeSym to p.
* gm2-compiler/M2SymInit.def (VariableAnalysis): Rename to ...
(ScopeBlockVariableAnalysis): ... this.
* gm2-compiler/M2SymInit.mod (ScopeBlockVariableAnalysis): Add
scope parameter.
(bbEntry): New pointer to record.
(bbArray): New array.
(bbFreeList): New variable.
(errorList): New list.
(IssueConditional): New procedure.
(GenerateNoteFlow): New procedure.
(IssueWarning): New procedure.
(IsUniqueWarning): New procedure.
(CheckDeferredRecordAccess): Re-implement.
(CheckBinary): Add warning and lst parameters.
(CheckUnary): Add warning and lst parameters.
(CheckXIndr): Add warning and lst parameters.
(CheckIndrX): Add warning and lst parameters.
(CheckBecomes): Add warning and lst parameters.
(CheckComparison): Add warning and lst parameters.
(CheckReadBeforeInitQuad): Add warning and lst parameters to all
Check procedures. Add all case quadruple clauses.
(FilterCheckReadBeforeInitQuad): Add warning and lst parameters.
(CheckReadBeforeInitFirstBasicBlock): Add warning and lst parameters.
(bbArrayKill): New procedure.
(DumpBBEntry): New procedure.
(DumpBBArray): New procedure.
(DumpBBSequence): New procedure.
(TestBBSequence): New procedure.
(CreateBBPermultations): New procedure.
(ScopeBlockVariableAnalysis): New procedure.
(GetOp3): New procedure.
(GenerateCFG): New procedure.
(NewEntry): New procedure.
(AppendEntry): New procedure.
(init): Initialize bbFreeList and errorList.
* gm2-compiler/SymbolTable.def (PutVarArrayRef): New procedure.
(IsVarArrayRef): New procedure function.
* gm2-compiler/SymbolTable.mod (SymVar): ArrayRef new field.
(MakeVar): Set ArrayRef to FALSE.
(PutVarArrayRef): New procedure.
(IsVarArrayRef): New procedure function.
* gm2-gcc/init.cc (_M2_M2SymInit_init): New prototype.
(init_PerCompilationInit): Add call to _M2_M2SymInit_init.
* gm2-gcc/m2options.h (M2Options_SetUninitVariableChecking):
New definition.
* gm2-lang.cc (gm2_langhook_handle_option): Add new case
OPT_Wuninit_variable_checking_.
* lang.opt: Wuninit-variable-checking= new entry.
gcc/testsuite/ChangeLog:
* gm2/switches/uninit-variable-checking/cascade/fail/cascadedif.mod: New test.
* gm2/switches/uninit-variable-checking/cascade/fail/switches-uninit-variable-checking-cascade-fail.exp:
New test.
Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>
|
|
Hi, Richard and Richi.
This patch adds cond_len_* operation patterns for targets that support loop control with length.
These patterns will be used in the following cases:
1. Integer division:
void
f (int32_t *restrict a, int32_t *restrict b, int32_t *restrict c, int n)
{
for (int i = 0; i < n; ++i)
{
a[i] = b[i] / c[i];
}
}
ARM SVE IR:
...
max_mask_36 = .WHILE_ULT (0, bnd.5_32, { 0, ... });
Loop:
...
# loop_mask_29 = PHI <next_mask_37(4), max_mask_36(3)>
...
vect__4.8_28 = .MASK_LOAD (_33, 32B, loop_mask_29);
...
vect__6.11_25 = .MASK_LOAD (_20, 32B, loop_mask_29);
vect__8.12_24 = .COND_DIV (loop_mask_29, vect__4.8_28, vect__6.11_25, vect__4.8_28);
...
.MASK_STORE (_1, 32B, loop_mask_29, vect__8.12_24);
...
next_mask_37 = .WHILE_ULT (_2, bnd.5_32, { 0, ... });
...
For targets like RVV that support loop control with length, we want to see IR as follows:
Loop:
...
# loop_len_29 = SELECT_VL
...
vect__4.8_28 = .LEN_MASK_LOAD (_33, 32B, loop_len_29);
...
vect__6.11_25 = .LEN_MASK_LOAD (_20, 32B, loop_len_29);
vect__8.12_24 = .COND_LEN_DIV (dummy_mask, vect__4.8_28, vect__6.11_25, vect__4.8_28, loop_len_29, bias);
...
.LEN_MASK_STORE (_1, 32B, loop_len_29, vect__8.12_24);
...
next_mask_37 = .WHILE_ULT (_2, bnd.5_32, { 0, ... });
...
Notice that here we use dummy_mask = { -1, -1, ... , -1 }, i.e. an all-true mask.
2. Integer conditional division:
Similar to case (1), but with a condition:
void
f (int32_t *restrict a, int32_t *restrict b, int32_t *restrict c, int32_t * cond, int n)
{
for (int i = 0; i < n; ++i)
{
if (cond[i])
a[i] = b[i] / c[i];
}
}
ARM SVE:
...
max_mask_76 = .WHILE_ULT (0, bnd.6_52, { 0, ... });
Loop:
...
# loop_mask_55 = PHI <next_mask_77(5), max_mask_76(4)>
...
vect__4.9_56 = .MASK_LOAD (_51, 32B, loop_mask_55);
mask__29.10_58 = vect__4.9_56 != { 0, ... };
vec_mask_and_61 = loop_mask_55 & mask__29.10_58;
...
vect__6.13_62 = .MASK_LOAD (_24, 32B, vec_mask_and_61);
...
vect__8.16_66 = .MASK_LOAD (_1, 32B, vec_mask_and_61);
vect__10.17_68 = .COND_DIV (vec_mask_and_61, vect__6.13_62, vect__8.16_66, vect__6.13_62);
...
.MASK_STORE (_2, 32B, vec_mask_and_61, vect__10.17_68);
...
next_mask_77 = .WHILE_ULT (_3, bnd.6_52, { 0, ... });
Here, ARM SVE uses vec_mask_and_61 = loop_mask_55 & mask__29.10_58; to guarantee the correct result.
However, targets with length control cannot perform this elegant flow; for RVV, we would expect:
Loop:
...
loop_len_55 = SELECT_VL
...
mask__29.10_58 = vect__4.9_56 != { 0, ... };
...
vect__10.17_68 = .COND_LEN_DIV (mask__29.10_58, vect__6.13_62, vect__8.16_66, vect__6.13_62, loop_len_55, bias);
...
Here we expect COND_LEN_DIV to be predicated by a real mask, which is the outcome of the comparison mask__29.10_58 = vect__4.9_56 != { 0, ... };
and by a real length, which is produced by the loop control: loop_len_55 = SELECT_VL
3. Conditional floating-point operations (no -ffast-math):
void
f (float *restrict a, float *restrict b, int32_t *restrict cond, int n)
{
for (int i = 0; i < n; ++i)
{
if (cond[i])
a[i] = b[i] + a[i];
}
}
ARM SVE IR:
max_mask_70 = .WHILE_ULT (0, bnd.6_46, { 0, ... });
...
# loop_mask_49 = PHI <next_mask_71(4), max_mask_70(3)>
...
mask__27.10_52 = vect__4.9_50 != { 0, ... };
vec_mask_and_55 = loop_mask_49 & mask__27.10_52;
...
vect__9.17_62 = .COND_ADD (vec_mask_and_55, vect__6.13_56, vect__8.16_60, vect__6.13_56);
...
next_mask_71 = .WHILE_ULT (_22, bnd.6_46, { 0, ... });
...
For RVV, we would expect IR:
...
loop_len_49 = SELECT_VL
...
mask__27.10_52 = vect__4.9_50 != { 0, ... };
...
vect__9.17_62 = .COND_LEN_ADD (mask__27.10_52, vect__6.13_56, vect__8.16_60, vect__6.13_56, loop_len_49, bias);
...
4. Conditional un-ordered reduction:
int32_t
f (int32_t *restrict a,
int32_t *restrict cond, int n)
{
int32_t result = 0;
for (int i = 0; i < n; ++i)
{
if (cond[i])
result += a[i];
}
return result;
}
ARM SVE IR:
Loop:
# vect_result_18.7_37 = PHI <vect__33.16_51(4), { 0, ... }(3)>
...
# loop_mask_40 = PHI <next_mask_58(4), max_mask_57(3)>
...
mask__17.11_43 = vect__4.10_41 != { 0, ... };
vec_mask_and_46 = loop_mask_40 & mask__17.11_43;
...
vect__33.16_51 = .COND_ADD (vec_mask_and_46, vect_result_18.7_37, vect__7.14_47, vect_result_18.7_37);
...
next_mask_58 = .WHILE_ULT (_15, bnd.6_36, { 0, ... });
...
Epilogue:
_53 = .REDUC_PLUS (vect__33.16_51); [tail call]
For RVV, we expect:
Loop:
# vect_result_18.7_37 = PHI <vect__33.16_51(4), { 0, ... }(3)>
...
loop_len_40 = SELECT_VL
...
mask__17.11_43 = vect__4.10_41 != { 0, ... };
...
vect__33.16_51 = .COND_LEN_ADD (mask__17.11_43, vect_result_18.7_37, vect__7.14_47, vect_result_18.7_37, loop_len_40, bias);
...
next_mask_58 = .WHILE_ULT (_15, bnd.6_36, { 0, ... });
...
Epilogue:
_53 = .REDUC_PLUS (vect__33.16_51); [tail call]
I named these patterns "cond_len_*" because I want the length operand to come after the mask operand, with all operands other than the length
in the same order as in the "cond_*" patterns. This order will make the follow-up loop vectorizer support easier.
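As a rough scalar reference model of what .COND_LEN_DIV is expected to compute in the examples above: lanes below len + bias whose mask bit is set get the quotient, and every other lane takes the "else" operand (the last vector operand before the length). The function name and the explicit nunits parameter are illustrative assumptions and not part of the patch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Scalar model of a COND_LEN_DIV-style operation on NUNITS lanes.
   Operand order follows the IR above: mask, a, b, else value, then
   len and bias.  Inactive lanes never perform the division.  */
void
cond_len_div_model (int32_t *dst, const bool *mask, const int32_t *a,
                    const int32_t *b, const int32_t *else_val,
                    size_t nunits, size_t len, ptrdiff_t bias)
{
  /* Number of lanes enabled by the length control.  */
  ptrdiff_t active = (ptrdiff_t) len + bias;
  if (active < 0)
    active = 0;

  for (size_t i = 0; i < nunits; ++i)
    dst[i] = ((ptrdiff_t) i < active && mask[i]) ? a[i] / b[i]
                                                 : else_val[i];
}
```

Case (1) corresponds to passing an all-true mask with a real length; case (2) passes the comparison result as the mask together with the same length.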
gcc/ChangeLog:
* doc/md.texi: Add COND_LEN_* operations for loop control with length.
* internal-fn.cc (cond_len_unary_direct): Ditto.
(cond_len_binary_direct): Ditto.
(cond_len_ternary_direct): Ditto.
(expand_cond_len_unary_optab_fn): Ditto.
(expand_cond_len_binary_optab_fn): Ditto.
(expand_cond_len_ternary_optab_fn): Ditto.
(direct_cond_len_unary_optab_supported_p): Ditto.
(direct_cond_len_binary_optab_supported_p): Ditto.
(direct_cond_len_ternary_optab_supported_p): Ditto.
* internal-fn.def (COND_LEN_ADD): Ditto.
(COND_LEN_SUB): Ditto.
(COND_LEN_MUL): Ditto.
(COND_LEN_DIV): Ditto.
(COND_LEN_MOD): Ditto.
(COND_LEN_RDIV): Ditto.
(COND_LEN_MIN): Ditto.
(COND_LEN_MAX): Ditto.
(COND_LEN_FMIN): Ditto.
(COND_LEN_FMAX): Ditto.
(COND_LEN_AND): Ditto.
(COND_LEN_IOR): Ditto.
(COND_LEN_XOR): Ditto.
(COND_LEN_SHL): Ditto.
(COND_LEN_SHR): Ditto.
(COND_LEN_FMA): Ditto.
(COND_LEN_FMS): Ditto.
(COND_LEN_FNMA): Ditto.
(COND_LEN_FNMS): Ditto.
(COND_LEN_NEG): Ditto.
* optabs.def (OPTAB_D): Ditto.
|
|
Document the `z` and `i` operand modifiers. We have many more modifiers
than these two, but they are the only two implemented in both
GCC and LLVM. For compatibility, I would like to document these
two first and review the other modifiers later to see whether any of them should
be exposed and implemented in RISC-V LLVM too.
gcc/ChangeLog:
* doc/extend.texi (RISC-V Operand Modifiers): New.
|
|
The arm_v8_1m_main_cde_mve_fp family of effective targets was not
documented when it was introduced.
2023-07-07 Christophe Lyon <christophe.lyon@linaro.org>
gcc/
* doc/sourcebuild.texi (arm_v8_1m_main_cde_mve_fp): Document.
|
|
gcc/ChangeLog:
PR c++/110595
PR c++/110596
* doc/invoke.texi (Warning Options): Fix typos.
|
|
gcc/ChangeLog:
* doc/extend.texi (ARC Built-in Functions): Update documentation
with missing builtins.
|
|
We're (currently) not aware of any actual use of 'ht_identifier's with NUL
characters embedded; its 'len' field appears to exist for optimization
purposes, since "forever". Before 'struct ht_identifier' was added in
commit 2a967f3d3a45294640e155381ef549e0b8090ad4 (Subversion r42334), we had in
'gcc/cpplib.h:struct cpp_hashnode': 'unsigned short len', or earlier 'length',
earlier in 'gcc/cpphash.h:struct hashnode': 'unsigned short length', earlier
'size_t length' with comment: "length of token, for quick comparison", earlier
'int length', ever since the 'gcc/cpp*' files were added in
commit 7f2935c734c36f84ab62b20a04de465e19061333 (Subversion r9191).
This amends commit f3b957ea8b9dadfb1ed30f24f463529684b7a36a
"pch: Fix streaming of strings with embedded null bytes".
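To illustrate the "length of token, for quick comparison" rationale quoted above, here is a small sketch with a simplified identifier record. struct ident and ident_equal are hypothetical names for illustration; this is not the libcpp implementation.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for an identifier record that stores its
   length next to the string.  */
struct ident
{
  const unsigned char *str;
  unsigned int len;
};

/* Compare two identifiers.  Checking the stored lengths first rejects
   most mismatches without reading the string bytes.  */
bool
ident_equal (const struct ident *a, const struct ident *b)
{
  return a->len == b->len && memcmp (a->str, b->str, a->len) == 0;
}
```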
gcc/
* doc/gty.texi (GTY Options) <string_length>: Enhance.
libcpp/
* include/symtab.h (struct ht_identifier): Document different
rationale.
|
|
This is preparation for another thing that I'm working on. No change in
behavior -- other than a more explicit error message.
The 'string_length' option currently is not supported for (fields in) global
variables. For example, if we apply the following (made-up) changes:
--- gcc/c-family/c-cppbuiltin.cc
+++ gcc/c-family/c-cppbuiltin.cc
@@ -1777 +1777 @@ struct GTY(()) lazy_hex_fp_value_struct
- const char *hex_str;
+ const char * GTY((string_length("strlen(%h.hex_str) + 1"))) hex_str;
--- gcc/varasm.cc
+++ gcc/varasm.cc
@@ -66 +66 @@ along with GCC; see the file COPYING3. If not see
-extern GTY(()) const char *first_global_object_name;
+extern GTY((string_length("strlen(%h.first_global_object_name) + 1"))) const char *first_global_object_name;
..., we get:
[...]
build/gengtype \
-S [...]/source-gcc/gcc -I gtyp-input.list -w tmp-gtype.state
/bin/sh [...]/source-gcc/gcc/../move-if-change tmp-gtype.state gtype.state
build/gengtype \
-r gtype.state
[...]/source-gcc/gcc/varasm.cc:66: global `first_global_object_name' has unknown option `string_length'
[...]/source-gcc/gcc/c-family/c-cppbuiltin.cc:1789: field `hex_str' of global `lazy_hex_fp_values[0]' has unknown option `string_length'
make[2]: *** [Makefile:2890: s-gtype] Error 1
[...]
These errors occur when writing "GC roots", where -- per my understanding --
'string_length' isn't relevant for actual GC purposes. However, like
elsewhere, it is for PCH purposes, and simply accepting 'string_length' here
isn't sufficient: we'll still get '(gt_pointer_walker) &gt_pch_n_S' used in the
'struct ggc_root_tab' instances, and there's no easy way to change that to
instead use 'gt_pch_n_S2' with explicit 'size_t string_len' argument. (At
least not sufficiently easy to justify spending any further time on, given that
I don't have an actual use for that feature.)
So, until an actual need arises, and/or to avoid the next person looking into
this having to figure out the same thing again, let's just document this
limitation:
[...]/source-gcc/gcc/varasm.cc:66: option `string_length' not supported for global `first_global_object_name'
[...]/source-gcc/gcc/c-family/c-cppbuiltin.cc:1789: option `string_length' not supported for field `hex_str' of global `lazy_hex_fp_values[0]'
This amends commit f3b957ea8b9dadfb1ed30f24f463529684b7a36a
"pch: Fix streaming of strings with embedded null bytes".
gcc/
* gengtype.cc (write_root, write_roots): Explicitly reject
'string_length' option.
* doc/gty.texi (GTY Options) <string_length>: Document.
|
|
gcc/ChangeLog:
* doc/extend.texi: Move x86 inlining rule to a new subsubsection
and add description for inlining of functions with arch and tune
attributes.
|
|
gcc/ChangeLog:
* doc/contrib.texi (Contributors): Update my entry.
|
|
In gimple-isel we already deduce a vec_set pattern from an
ARRAY_REF(VIEW_CONVERT_EXPR). This patch does the same for a
vec_extract.
The code is largely similar to the vec_set one
including the addition of a can_vec_extract_var_idx_p function
in optabs.cc to check if the backend can handle a register
operand as index. We already have can_vec_extract in
optabs-query but that one checks whether we can extract
specific modes.
With the introduction of an internal function for vec_extract
the expander must not FAIL. For vec_set this has already been
the case so adjust the documentation accordingly.
Additionally, clarify the wording of the vector-vector case for
vec_extract.
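For reference, the kind of source that reaches this path is a variable-index read from a GNU vector type. The sketch below uses the GCC vector extension; the function name extract_elem is illustrative. Whether the read actually becomes a vec_extract internal function depends on the target supporting a register operand as index.

```c
/* Four 32-bit ints in one 16-byte vector (GNU vector extension).  */
typedef int v4si __attribute__ ((vector_size (16)));

/* Read element IDX of V, where IDX is only known at run time.  This
   is the variable-index extract that the patch aims to recognize.  */
int
extract_elem (v4si v, int idx)
{
  return v[idx];
}
```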
gcc/ChangeLog:
* doc/md.texi: Document that vec_set and vec_extract must not
fail.
* gimple-isel.cc (gimple_expand_vec_set_expr): Rename this...
(gimple_expand_vec_set_extract_expr): ...to this.
(gimple_expand_vec_exprs): Call renamed function.
* internal-fn.cc (vec_extract_direct): Add.
(expand_vec_extract_optab_fn): New function to expand
vec_extract optab.
(direct_vec_extract_optab_supported_p): Add.
* internal-fn.def (VEC_EXTRACT): Add.
* optabs.cc (can_vec_extract_var_idx_p): New function.
* optabs.h (can_vec_extract_var_idx_p): Declare.
|
|
Enable ENQCMD and UINTR for march=sierraforest according to Intel ISE
https://cdrdv2.intel.com/v1/dl/getContent/671368
gcc/ChangeLog
* config/i386/i386.h: Add PTA_ENQCMD and PTA_UINTR to PTA_SIERRAFOREST.
* doc/invoke.texi: Update new isa to march=sierraforest and grandridge.
|
|
Hi, Richi and Richard.
Based on the review comments from Richard:
https://gcc.gnu.org/pipermail/gcc-patches/2023-July/623405.html
I changed the len_mask_gather_load/len_mask_scatter_store operand order to:
{len,bias,mask}
We now add len and mask using add_len_and_mask_args,
the same as for partial_load/partial_store.
The code is now more reasonable and easier to maintain.
This patch adds LEN_MASK_{GATHER_LOAD,SCATTER_STORE} to allow targets to
handle flow control by mask and loop control by length on gather/scatter memory
operations. Consider the following case:
void
f (uint8_t *restrict a,
uint8_t *restrict b, int n,
int base, int step,
int *restrict cond)
{
for (int i = 0; i < n; ++i)
{
if (cond[i])
a[i * step + base] = b[i * step + base];
}
}
We hope RVV can vectorize such a case into the following IR:
loop_len = SELECT_VL
control_mask = comparison
v = LEN_MASK_GATHER_LOAD (.., loop_len, bias, control_mask)
LEN_MASK_SCATTER_STORE (... v, ..., loop_len, bias, control_mask)
This patch doesn't use these patterns in the vectorizer; it just adds the patterns
and updates the documentation.
I will send a patch that uses these patterns in the vectorizer soon after this
patch is approved.
Ok for trunk?
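A scalar sketch of the intended per-iteration semantics, using the {len,bias,mask} operand order. The function names, the explicit offset array and the choice to leave inactive lanes of the loaded vector untouched are assumptions made for illustration; they are not taken from the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Lanes enabled by the length control: len + bias, never negative.  */
static size_t
active_lanes (size_t len, ptrdiff_t bias)
{
  ptrdiff_t n = (ptrdiff_t) len + bias;
  return n < 0 ? 0 : (size_t) n;
}

/* Model of a LEN_MASK_GATHER_LOAD-style operation: for each active
   lane with a set mask bit, load BASE[OFFSET[i]] into VEC[i].  */
void
len_mask_gather_load_model (uint8_t *vec, const uint8_t *base,
                            const int *offset, size_t len, ptrdiff_t bias,
                            const int *mask)
{
  size_t n = active_lanes (len, bias);
  for (size_t i = 0; i < n; ++i)
    if (mask[i])
      vec[i] = base[offset[i]];
}

/* Model of a LEN_MASK_SCATTER_STORE-style operation: for each active
   lane with a set mask bit, store VEC[i] to BASE[OFFSET[i]].  */
void
len_mask_scatter_store_model (uint8_t *base, const int *offset,
                              const uint8_t *vec, size_t len,
                              ptrdiff_t bias, const int *mask)
{
  size_t n = active_lanes (len, bias);
  for (size_t i = 0; i < n; ++i)
    if (mask[i])
      base[offset[i]] = vec[i];
}
```

For the loop above, offset[i] would hold i * step + base for the lanes of the current iteration, and mask[i] would be cond[i] != 0.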
gcc/ChangeLog:
* doc/md.texi: Add len_mask_gather_load/len_mask_scatter_store.
* internal-fn.cc (expand_scatter_store_optab_fn): Ditto.
(expand_gather_load_optab_fn): Ditto.
(internal_load_fn_p): Ditto.
(internal_store_fn_p): Ditto.
(internal_gather_scatter_fn_p): Ditto.
(internal_fn_len_index): Ditto.
(internal_fn_mask_index): Ditto.
(internal_fn_stored_value_index): Ditto.
* internal-fn.def (LEN_MASK_GATHER_LOAD): Ditto.
(LEN_MASK_SCATTER_STORE): Ditto.
* optabs.def (OPTAB_CD): Ditto.
|
|
Hi, Richard. I fixed the order as you suggested.
Before this patch, the order is {len,mask,bias}.
After this patch, the order becomes {len,bias,mask}.
Since you said we should not need 'internal_fn_bias_index', the bias index is always the len index + 1.
I noticed that the LEN_STORE order is {len,vector,bias}; to make them consistent, I reordered LEN_STORE to {len,bias,vector},
just like MASK_STORE {mask,vector}.
Ok for trunk?
gcc/ChangeLog:
* config/riscv/autovec.md: Change order of
LEN_MASK_LOAD/LEN_MASK_STORE/LEN_LOAD/LEN_STORE arguments.
* config/riscv/riscv-v.cc (expand_load_store): Ditto.
* doc/md.texi: Ditto.
* gimple-fold.cc (gimple_fold_partial_load_store_mem_ref): Ditto.
* internal-fn.cc (len_maskload_direct): Ditto.
(len_maskstore_direct): Ditto.
(add_len_and_mask_args): New function.
(expand_partial_load_optab_fn): Change order of
LEN_MASK_LOAD/LEN_MASK_STORE/LEN_LOAD/LEN_STORE arguments.
(expand_partial_store_optab_fn): Ditto.
(internal_fn_len_index): New function.
(internal_fn_mask_index): Change order of
LEN_MASK_LOAD/LEN_MASK_STORE/LEN_LOAD/LEN_STORE arguments.
(internal_fn_stored_value_index): Ditto.
(internal_len_load_store_bias): Ditto.
* internal-fn.h (internal_fn_len_index): New function.
* tree-ssa-dse.cc (initialize_ao_ref_for_dse): Change order of
LEN_MASK_LOAD/LEN_MASK_STORE/LEN_LOAD/LEN_STORE arguments.
* tree-vect-stmts.cc (vectorizable_store): Ditto.
(vectorizable_load): Ditto.
|
|
The Modula-2 static analysis incorrectly identifies variables as
uninitialized if they are initialized within a WITH statement. This bug
fix re-implements the variable static analysis and will detect simple
pointer record fields being accessed before being initialized.
The static analysis is limited to the first basic block in a procedure.
It does not check variant records, arrays or sets. A new option
-Wuninit-variable-checking will turn on the new semantic checking
(-Wall also enables the new checking).
gcc/ChangeLog:
PR modula2/110125
* doc/gm2.texi (Semantic checking): Include examples using
-Wuninit-variable-checking.
gcc/m2/ChangeLog:
PR modula2/110125
* Make-lang.in (GM2-COMP-BOOT-DEFS): Add M2SymInit.def.
(GM2-COMP-BOOT-MODS): Add M2SymInit.mod.
* gm2-compiler/M2BasicBlock.mod: Formatting changes.
* gm2-compiler/M2Code.mod: Remove import of VariableAnalysis from
M2Quads. Import VariableAnalysis from M2SymInit.mod.
* gm2-compiler/M2GCCDeclare.mod (PrintVerboseFromList):
Add debugging print for a component.
(TypeConstFullyDeclared): Call RememberType for every type.
* gm2-compiler/M2GenGCC.mod (CodeReturnValue): Add parameter to
GetQuadOtok.
(CodeBecomes): Add parameter to GetQuadOtok.
(CodeXIndr): Add parameter to GetQuadOtok.
* gm2-compiler/M2Optimize.mod (ReduceBranch): Reformat and
preserve operand token positions when reducing the branch
quadruples.
(ReduceGoto): Reformat.
(FoldMultipleGoto): Reformat.
(KnownReachable): Reformat.
* gm2-compiler/M2Options.def (UninitVariableChecking): New
variable declared and exported.
(SetUninitVariableChecking): New procedure.
* gm2-compiler/M2Options.mod (SetWall): Set
UninitVariableChecking.
(SetUninitVariableChecking): New procedure.
* gm2-compiler/M2Quads.def (PutQuadOtok): Exported and declared.
(VariableAnalysis): Removed.
* gm2-compiler/M2Quads.mod (PutQuadOtok): New procedure.
(doVal): Reformatted.
(MarkAsWrite): Reformatted.
(MarkArrayAsWritten): Reformatted.
(doIndrX): Use PutQuadOtok.
(MakeRightValue): Use GenQuadOtok.
(MakeLeftValue): Use GenQuadOtok.
(CheckReadBeforeInitialized): Remove.
(IsNeverAltered): Reformat.
(DebugLocation): New procedure.
(BuildDesignatorPointer): Use GenQuadO to preserve operand token
position.
(BuildRelOp): Use GenQuadOtok ditto.
* gm2-compiler/SymbolTable.def (VarCheckReadInit): New procedure.
(VarInitState): New procedure.
(PutVarInitialized): New procedure.
(PutVarFieldInitialized): New procedure function.
(GetVarFieldInitialized): New procedure function.
(PrintInitialized): New procedure.
* gm2-compiler/SymbolTable.mod (VarCheckReadInit): New procedure.
(VarInitState): New procedure.
(PutVarInitialized): New procedure.
(PutVarFieldInitialized): New procedure function.
(GetVarFieldInitialized): New procedure function.
(PrintInitialized): New procedure.
(LRInitDesc): New type.
(SymVar): InitState new field.
(MakeVar): Initialize InitState.
* gm2-gcc/m2options.h (M2Options_SetUninitVariableChecking):
New function declaration.
* gm2-lang.cc (gm2_langhook_handle_option): Detect
OPT_Wuninit_variable_checking and call SetUninitVariableChecking.
* lang.opt: Add Wuninit-variable-checking.
* gm2-compiler/M2SymInit.def: New file.
* gm2-compiler/M2SymInit.mod: New file.
gcc/testsuite/ChangeLog:
PR modula2/110125
* gm2/switches/uninit-variable-checking/fail/testinit.mod: New test.
* gm2/switches/uninit-variable-checking/fail/testlarge.mod: New test.
* gm2/switches/uninit-variable-checking/fail/testlarge2.mod: New test.
* gm2/switches/uninit-variable-checking/fail/testrecinit.mod: New test.
* gm2/switches/uninit-variable-checking/fail/testrecinit2.mod: New test.
* gm2/switches/uninit-variable-checking/fail/testrecinit5.mod: New test.
* gm2/switches/uninit-variable-checking/fail/testsmallrec.mod: New test.
* gm2/switches/uninit-variable-checking/fail/testsmallrec2.mod: New test.
* gm2/switches/uninit-variable-checking/fail/testsmallvec.mod: New test.
* gm2/switches/uninit-variable-checking/fail/testvarinit.mod: New test.
* gm2/switches/uninit-variable-checking/fail/testwithnoptr.mod: New test.
* gm2/switches/uninit-variable-checking/fail/testwithptr.mod: New test.
* gm2/switches/uninit-variable-checking/fail/testwithptr2.mod: New test.
* gm2/switches/uninit-variable-checking/fail/testwithptr3.mod: New test.
* gm2/switches/uninit-variable-checking/pass/testrecinit3.mod: New test.
* gm2/switches/uninit-variable-checking/pass/testrecinit5.mod: New test.
* gm2/switches/uninit-variable-checking/pass/testsmallrec.mod: New test.
* gm2/switches/uninit-variable-checking/pass/testsmallrec2.mod: New test.
* gm2/switches/uninit-variable-checking/pass/testvarinit.mod: New test.
* gm2/switches/uninit-variable-checking/pass/testwithptr.mod: New test.
* gm2/switches/uninit-variable-checking/pass/testwithptr2.mod: New test.
* gm2/switches/uninit-variable-checking/pass/testwithptr3.mod: New test.
Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>
|
|
The MIPS16e2 ASE is an enhancement to the MIPS16e ASE,
which includes all MIPS16e instructions, with some additions.
It defines new special instructions for increasing
code density (e.g. Extend, PC-relative instructions, etc.).
This patch adds basic support for mips16e2 used by the
following series of patches.
gcc/ChangeLog:
* config/mips/mips.cc (mips_file_start): Add mips16e2 info
for output file.
* config/mips/mips.h (__mips_mips16e2): Define a new
predefined macro.
(ISA_HAS_MIPS16E2): Define a new macro.
(ASM_SPEC): Pass mmips16e2 to the assembler.
* config/mips/mips.opt: Add -m(no-)mips16e2 option.
* config/mips/predicates.md: Add clause for TARGET_MIPS16E2.
* doc/invoke.texi: Add -m(no-)mips16e2 option.
gcc/testsuite/ChangeLog:
* gcc.target/mips/mips.exp (mips_option_groups): Add -mmips16e2
option.
(mips-dg-init): Handle the recognition of mips16e2 targets.
(mips-dg-options): Add dependencies for mips16e2.
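
As a rough illustration of how user code might consume the new predefined
macro, the sketch below tests __mips_mips16e2 at compile time. The macro name
is taken from the ChangeLog entry above; the helper names
built_with_mips16e2 and mips16e2_status are hypothetical, and the assumption
that the macro is defined only when -mmips16e2 is in effect is not confirmed
by this message.

```c
#include <stdio.h>

/* Hypothetical helper: returns 1 if this translation unit was compiled
   with the __mips_mips16e2 predefined macro visible, 0 otherwise.
   Assumption: the macro is defined only when MIPS16e2 is enabled.  */
int built_with_mips16e2(void)
{
#ifdef __mips_mips16e2
    return 1;
#else
    return 0;
#endif
}

/* Hypothetical helper: human-readable form of the same check.  */
const char *mips16e2_status(void)
{
    return built_with_mips16e2() ? "MIPS16e2 enabled" : "MIPS16e2 not enabled";
}

/* Prints the status line; meant to be called from a program's main.  */
void print_mips16e2_status(void)
{
    printf("%s\n", mips16e2_status());
}
```

On a non-MIPS target, or without the option, the macro would presumably be
absent and the helper would report that MIPS16e2 is not enabled.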
|
|
This supports the -fconstant-cfstrings option as used by clang (and
expected by some build scripts) as an alias to the target-specific
-mconstant-cfstrings.
The documentation is also updated to reflect that the 'f' option is
only available on Darwin, and to add the 'm' option to the Darwin
section of the invocation text.
Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
PR target/108743
gcc/ChangeLog:
* config/darwin.opt: Add fconstant-cfstrings alias to
mconstant-cfstrings.
* doc/invoke.texi: Amend invocation descriptions to reflect
that the fconstant-cfstrings is a target-option alias and to
add the missing mconstant-cfstrings option description to the
Darwin section.
|
|
This updates vect_recog_abd_pattern to recognize the widening
variant of absolute difference (ABDL, ABDL2).
gcc/ChangeLog:
* internal-fn.def (VEC_WIDEN_ABD): New internal hilo optab.
* optabs.def (vec_widen_sabd_optab,
vec_widen_sabd_hi_optab, vec_widen_sabd_lo_optab,
vec_widen_sabd_odd_even, vec_widen_sabd_even_optab,
vec_widen_uabd_optab,
vec_widen_uabd_hi_optab, vec_widen_uabd_lo_optab,
vec_widen_uabd_odd_even, vec_widen_uabd_even_optab):
New optabs.
* doc/md.texi: Document them.
* tree-vect-patterns.cc (vect_recog_abd_pattern): Update to
build a VEC_WIDEN_ABD call if the input precision is smaller
than the precision of the output.
(vect_recog_widen_abd_pattern): Should an ABD expression be
found preceding an extension, replace the two with a
VEC_WIDEN_ABD.
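
For orientation, a source loop of roughly the following shape is the kind of
code a widening absolute-difference pattern is aimed at: the inputs are
narrower than the output, so the difference is taken and widened in one step.
This is only a sketch; the function name widen_abd_u8 is hypothetical, and
whether the vectorizer actually selects VEC_WIDEN_ABD for it depends on the
target and the options in use.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical example: absolute difference of 8-bit inputs stored into
   16-bit outputs, i.e. the input precision is smaller than the output
   precision.  */
void widen_abd_u8(uint16_t *restrict out, const uint8_t *restrict a,
                  const uint8_t *restrict b, size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = (uint16_t) abs((int) a[i] - (int) b[i]);
}
```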
|
|
on a structure with a C99 flexible array member being nested in
another structure.
"The GCC extension accepts a structure containing an ISO C99 "flexible array
member", or a union containing such a structure (possibly recursively)
to be a member of a structure.
There are two situations:
* A structure containing a C99 flexible array member, or a union
containing such a structure, is the last field of another structure,
for example:
struct flex { int length; char data[]; };
union union_flex { int others; struct flex f; };
struct out_flex_struct { int m; struct flex flex_data; };
struct out_flex_union { int n; union union_flex flex_data; };
In the above, both 'out_flex_struct.flex_data.data[]' and
'out_flex_union.flex_data.f.data[]' are considered as flexible
arrays too.
* A structure containing a C99 flexible array member, or a union
containing such a structure, is not the last field of another structure,
for example:
struct flex { int length; char data[]; };
struct mid_flex { int m; struct flex flex_data; int n; };
In the above, accessing a member of the array 'mid_flex.flex_data.data[]'
might have undefined behavior. Compilers do not handle such a case
consistently. Any code relying on this case should be modified to ensure
that flexible array members only end up at the ends of structures.
Please use the warning option '-Wflex-array-member-not-at-end' to
identify all such cases in the source code and modify them. This extension
is now deprecated.
"
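
A minimal sketch of the first situation (the flexible array member ends up at
the end of the outer structure) is shown below. It reuses struct flex and
struct out_flex_struct from the text above; the helper make_out_flex is
hypothetical, and the code relies on the GCC extension being described, so
other compilers or pedantic modes may warn about or reject it.

```c
#include <stdlib.h>
#include <string.h>

struct flex { int length; char data[]; };
struct out_flex_struct { int m; struct flex flex_data; };

/* Hypothetical helper: allocates an out_flex_struct with enough trailing
   storage for TEXT (including its terminating NUL) and copies TEXT into
   the nested flexible array.  Returns NULL on allocation failure; the
   caller frees the result.  */
struct out_flex_struct *make_out_flex(int m, const char *text)
{
    size_t len = strlen(text);
    struct out_flex_struct *p = malloc(sizeof *p + len + 1);

    if (p == NULL)
        return NULL;
    p->m = m;
    p->flex_data.length = (int) len;
    memcpy(p->flex_data.data, text, len + 1);
    return p;
}
```

Because flex_data is the last field, the trailing storage belongs to
flex_data.data and nothing else in the object overlaps it. The mid_flex
layout from the second situation has no such guarantee.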
PR c/77650
gcc/c-family/ChangeLog:
* c.opt: New option -Wflex-array-member-not-at-end.
gcc/c/ChangeLog:
* c-decl.cc (finish_struct): Issue warnings for new option.
gcc/ChangeLog:
* doc/extend.texi: Document GCC extension on a structure containing
a flexible array member to be a member of another structure.
gcc/testsuite/ChangeLog:
* gcc.dg/variable-sized-type-flex-array.c: New test.
|
|
Introduce 'leafy' to auto-select between 'used' and 'all' for leaf and
nonleaf functions, respectively.
for gcc/ChangeLog
* doc/extend.texi (zero-call-used-regs): Document leafy and
variants thereof.
* flag-types.h (zero_regs_flags): Add LEAFY_MODE, as well as
LEAFY and variants.
* function.cc (gen_call_used_regs_seq): Set only_used for leaf
functions in leafy mode.
* opts.cc (zero_call_used_regs_opts): Add leafy and variants.
for gcc/testsuite/ChangeLog
* c-c++-common/zero-scratch-regs-leafy-1.c: New.
* c-c++-common/zero-scratch-regs-leafy-2.c: New.
* gcc.target/i386/zero-scratch-regs-leafy-1.c: New.
* gcc.target/i386/zero-scratch-regs-leafy-2.c: New.
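
The selection rule described above can be modelled in a few lines. The sketch
below is not GCC's code: the flag names SKETCH_ONLY_USED and SKETCH_LEAFY and
the function zero_only_used are hypothetical stand-ins that only illustrate
how 'leafy' behaves like 'used' in leaf functions and like 'all' elsewhere.

```c
#include <stdbool.h>

/* Hypothetical flag bits for this sketch; not GCC's actual values.  */
enum {
    SKETCH_ONLY_USED = 1 << 0,  /* zero only registers the function used */
    SKETCH_LEAFY     = 1 << 1   /* choose per function: leaf or nonleaf */
};

/* Returns true if only the used registers should be zeroed.
   An explicit 'used' request always wins; 'leafy' restricts zeroing
   to used registers only when the function is a leaf.  */
bool zero_only_used(unsigned flags, bool is_leaf)
{
    if (flags & SKETCH_ONLY_USED)
        return true;
    return (flags & SKETCH_LEAFY) != 0 && is_leaf;
}
```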
|
|
There is a missing space between the return type and the name
which causes the name not to be output in the HTML docs.
Committed as obvious after building the HTML docs.
gcc/ChangeLog:
* doc/extend.texi (__builtin_alloca_with_align_and_max): Fix
defbuiltin usage.
|