aboutsummaryrefslogtreecommitdiff
path: root/gcc
AgeCommit message (Collapse)AuthorFilesLines
2023-06-22rust: Update usage of TARGET_AIX to TARGET_AIX_OSPaul E. Murphy1-3/+3
This was noticed when fixing the gccgo usage of the macro, the rust usage is very similar. TARGET_AIX is defined as a non-zero value on linux/powerpc64le which may cause unexpected behavior. TARGET_AIX_OS should be used to toggle AIX specific behavior. 2023-06-22 Paul E. Murphy <murphyp@linux.ibm.com> gcc/rust/ * rust-object-export.cc [TARGET_AIX]: Rename and update usage to TARGET_AIX_OS.
2023-06-22go: Update usage of TARGET_AIX to TARGET_AIX_OSPaul E. Murphy2-7/+7
TARGET_AIX is defined to a non-zero value on linux and maybe other powerpc64le targets. This leads to unexpected behavior such as dropping the .go_export section when linking a shared library on linux/powerpc64le. Instead, use TARGET_AIX_OS to toggle AIX specific behavior. Fixes golang/go#60798. 2023-06-22 Paul E. Murphy <murphyp@linux.ibm.com> gcc/go/ * go-backend.cc [TARGET_AIX]: Rename and update usage to TARGET_AIX_OS. * go-lang.cc: Likewise.
2023-06-22configure: Implement --enable-host-bind-nowMarek Polacek3-3/+36
As promised in the --enable-host-pie patch, this patch adds another configure option, --enable-host-bind-now, which adds -z now when linking the compiler executables in order to extend hardening. BIND_NOW with RELRO allows the GOT to be marked RO; this prevents GOT modification attacks. This option does not affect linking of target libraries; you can use LDFLAGS_FOR_TARGET=-Wl,-z,relro,-z,now to enable RELRO/BIND_NOW. With this patch: $ readelf -Wd cc1{,plus,obj,gm2} f951 lto1 cpp rust1 gnat1 | grep FLAGS 0x000000000000001e (FLAGS) BIND_NOW 0x000000006ffffffb (FLAGS_1) Flags: NOW PIE 0x000000000000001e (FLAGS) BIND_NOW 0x000000006ffffffb (FLAGS_1) Flags: NOW PIE 0x000000000000001e (FLAGS) BIND_NOW 0x000000006ffffffb (FLAGS_1) Flags: NOW PIE 0x000000000000001e (FLAGS) BIND_NOW 0x000000006ffffffb (FLAGS_1) Flags: NOW PIE 0x000000000000001e (FLAGS) BIND_NOW 0x000000006ffffffb (FLAGS_1) Flags: NOW PIE 0x000000000000001e (FLAGS) BIND_NOW 0x000000006ffffffb (FLAGS_1) Flags: NOW PIE 0x000000000000001e (FLAGS) BIND_NOW 0x000000006ffffffb (FLAGS_1) Flags: NOW PIE 0x000000000000001e (FLAGS) BIND_NOW 0x000000006ffffffb (FLAGS_1) Flags: NOW PIE 0x000000000000001e (FLAGS) BIND_NOW 0x000000006ffffffb (FLAGS_1) Flags: NOW PIE c++tools/ChangeLog: * configure.ac (--enable-host-bind-now): New check. * configure: Regenerate. gcc/ChangeLog: * configure.ac (--enable-host-bind-now): New check. Add -Wl,-z,now to LD_PICFLAG if --enable-host-bind-now. * configure: Regenerate. * doc/install.texi: Document --enable-host-bind-now. lto-plugin/ChangeLog: * configure.ac (--enable-host-bind-now): New check. Link with -z,now. * configure: Regenerate.
2023-06-22Change fma_reassoc_width tuning for ampere1Di Zhao OS1-1/+1
This patch enables reassociation of floating-point additions on ampere1. This brings about 1% overall benefit on spec2017 fprate cases. (There are minor regressions in 510.parest_r and 508.namd_r, analyzed here: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110279 .) gcc/ChangeLog: * config/aarch64/aarch64.cc: Change fma_reassoc_width for ampere1.
2023-06-22tree-optimization/110332 - fix ICE with phipropRichard Biener4-3/+46
The following fixes an ICE that occurs when we visit an edge inserted load from the code validating correctness for inserting an aggregate copy there. We can simply skip those loads here. PR tree-optimization/110332 * tree-ssa-phiprop.cc (propagate_with_phi): Always check aliasing with edge inserted loads. * g++.dg/torture/pr110332.C: New testcase. * gcc.dg/torture/pr110332-1.c: Likewise. * gcc.dg/torture/pr110332-2.c: Likewise.
2023-06-22i386: Convert ptestz of pandn into ptestc.Roger Sayle11-10/+259
This patch is the next installment in a set of backend patches around improvements to ptest/vptest. A previous patch optimized the sequence t=pand(x,y); ptestz(t,t) into the equivalent ptestz(x,y), using the property that ZF is set to (X&Y) == 0. This patch performs a similar transformation, converting t=pandn(x,y); ptestz(t,t) into the (almost) equivalent ptestc(y,x), using the property that the CF flags is set to (~X&Y) == 0. The tricky bit is that this sets the CF flag instead of the ZF flag, so we can only perform this transformation when we can also convert the flags consumer, as well as the producer. For the test case: int foo (__m128i x, __m128i y) { __m128i a = x & ~y; return __builtin_ia32_ptestz128 (a, a); } With -O2 -msse4.1 we previously generated: foo: pandn %xmm0, %xmm1 xorl %eax, %eax ptest %xmm1, %xmm1 sete %al ret with this patch we now generate: foo: xorl %eax, %eax ptest %xmm0, %xmm1 setc %al ret At the same time, this patch also provides alternative fixes for PR target/109973 and PR target/110118, by recognizing that ptestc(x,x) always sets the carry flag (X&~X is always zero). This is achieved both by recognizing the special case in ix86_expand_sse_ptest and with a splitter to convert an eligible ptest into an stc. 2023-06-22 Roger Sayle <roger@nextmovesoftware.com> Uros Bizjak <ubizjak@gmail.com> gcc/ChangeLog * config/i386/i386-expand.cc (ix86_expand_sse_ptest): Recognize expansion of ptestc with equal operands as producing const1_rtx. * config/i386/i386.cc (ix86_rtx_costs): Provide accurate cost estimates of UNSPEC_PTEST, where the ptest performs the PAND or PAND of its operands. * config/i386/sse.md (define_split): Transform CCCmode UNSPEC_PTEST of reg_equal_p operands into an x86_stc instruction. (define_split): Split pandn/ptestz/set{n?}e into ptestc/set{n?}c. (define_split): Similar to above for strict_low_part destinations. (define_split): Split pandn/ptestz/j{n?}e into ptestc/j{n?}c. gcc/testsuite/ChangeLog * gcc.target/i386/avx-vptest-4.c: New test case. * gcc.target/i386/avx-vptest-5.c: Likewise. * gcc.target/i386/avx-vptest-6.c: Likewise. * gcc.target/i386/pr109973-1.c: Update test case. * gcc.target/i386/pr109973-2.c: Likewise. * gcc.target/i386/sse4_1-ptest-4.c: New test case. * gcc.target/i386/sse4_1-ptest-5.c: Likewise. * gcc.target/i386/sse4_1-ptest-6.c: Likewise.
2023-06-21analyzer: add text-art visualizations of out-of-bounds accesses [PR106626]David Malcolm54-148/+4383
This patch extends -Wanalyzer-out-of-bounds so that, where possible, it will emit a text art diagram visualizing the spatial relationship between (a) the memory region that the analyzer predicts would be accessed, versus (b) the range of memory that is valid to access - whether they overlap, are touching, are close or far apart; which one is before or after in memory, the relative sizes involved, the direction of the access (read vs write), and, in some cases, the values of data involved. This diagram can be suppressed using -fdiagnostics-text-art-charset=none. For example, given: int32_t arr[10]; int32_t int_arr_read_element_before_start_far(void) { return arr[-100]; } it emits: demo-1.c: In function ‘int_arr_read_element_before_start_far’: demo-1.c:7:13: warning: buffer under-read [CWE-127] [-Wanalyzer-out-of-bounds] 7 | return arr[-100]; | ~~~^~~~~~ ‘int_arr_read_element_before_start_far’: event 1 | | 7 | return arr[-100]; | | ~~~^~~~~~ | | | | | (1) out-of-bounds read from byte -400 till byte -397 but ‘arr’ starts at byte 0 | demo-1.c:7:13: note: valid subscripts for ‘arr’ are ‘[0]’ to ‘[9]’ ┌───────────────────────────┐ │read of ‘int32_t’ (4 bytes)│ └───────────────────────────┘ ^ │ │ ┌───────────────────────────┐ ┌────────┬────────┬─────────┐ │ │ │ [0] │ ... │ [9] │ │ before valid range │ ├────────┴────────┴─────────┤ │ │ │‘arr’ (type: ‘int32_t[10]’)│ └───────────────────────────┘ └───────────────────────────┘ ├─────────────┬─────────────┤├─────┬──────┤├─────────────┬─────────────┤ │ │ │ ╭────────────┴───────────╮ ╭────┴────╮ ╭───────┴──────╮ │⚠️ under-read of 4 bytes│ │396 bytes│ │size: 40 bytes│ ╰────────────────────────╯ ╰─────────╯ ╰──────────────╯ and given: #include <string.h> void test_non_ascii () { char buf[5]; strcpy (buf, "文字化け"); } it emits: demo-2.c: In function ‘test_non_ascii’: demo-2.c:7:3: warning: stack-based buffer overflow [CWE-121] [-Wanalyzer-out-of-bounds] 7 | strcpy (buf, "文字化け"); | ^~~~~~~~~~~~~~~~~~~~~~~~ ‘test_non_ascii’: events 1-2 | | 6 | char buf[5]; | | ^~~ | | | | | (1) capacity: 5 bytes | 7 | strcpy (buf, "文字化け"); | | ~~~~~~~~~~~~~~~~~~~~~~~~ | | | | | (2) out-of-bounds write from byte 5 till byte 12 but ‘buf’ ends at byte 5 | demo-2.c:7:3: note: write of 8 bytes to beyond the end of ‘buf’ 7 | strcpy (buf, "文字化け"); | ^~~~~~~~~~~~~~~~~~~~~~~~ demo-2.c:7:3: note: valid subscripts for ‘buf’ are ‘[0]’ to ‘[4]’ ┌─────┬─────┬─────┬────┬────┐┌────┬────┬────┬────┬────┬────┬────┬──────┐ │ [0] │ [1] │ [2] │[3] │[4] ││[5] │[6] │[7] │[8] │[9] │[10]│[11]│ [12] │ ├─────┼─────┼─────┼────┼────┤├────┼────┼────┼────┼────┼────┼────┼──────┤ │0xe6 │0x96 │0x87 │0xe5│0xad││0x97│0xe5│0x8c│0x96│0xe3│0x81│0x91│ 0x00 │ ├─────┴─────┴─────┼────┴────┴┴────┼────┴────┴────┼────┴────┴────┼──────┤ │ U+6587 │ U+5b57 │ U+5316 │ U+3051 │U+0000│ ├─────────────────┼───────────────┼──────────────┼──────────────┼──────┤ │ 文 │ 字 │ 化 │ け │ NUL │ ├─────────────────┴───────────────┴──────────────┴──────────────┴──────┤ │ string literal (type: ‘char[13]’) │ └──────────────────────────────────────────────────────────────────────┘ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ v v v v v v v v v v v v v ┌─────┬────────────────┬────┐┌─────────────────────────────────────────┐ │ [0] │ ... │[4] ││ │ ├─────┴────────────────┴────┤│ after valid range │ │ ‘buf’ (type: ‘char[5]’) ││ │ └───────────────────────────┘└─────────────────────────────────────────┘ ├─────────────┬─────────────┤├────────────────────┬────────────────────┤ │ │ ╭────────┴────────╮ ╭───────────┴──────────╮ │capacity: 5 bytes│ │⚠️ overflow of 8 bytes│ ╰─────────────────╯ ╰──────────────────────╯ showing that the overflow occurs partway through the UTF-8 encoding of the U+5b57 code point. There are lots more examples in the test suite. It doesn't show up in this email, but the above diagrams are colorized to constrast the valid and invalid access ranges. gcc/ChangeLog: PR analyzer/106626 * Makefile.in (ANALYZER_OBJS): Add analyzer/access-diagram.o. * doc/invoke.texi (Wanalyzer-out-of-bounds): Add description of text art. (fanalyzer-debug-text-art): New. gcc/analyzer/ChangeLog: PR analyzer/106626 * access-diagram.cc: New file. * access-diagram.h: New file. * analyzer.h (class region_offset): Add default ctor. (region_offset::make_byte_offset): New decl. (region_offset::concrete_p): New. (region_offset::get_concrete_byte_offset): New. (region_offset::calc_symbolic_bit_offset): New decl. (region_offset::calc_symbolic_byte_offset): New decl. (region_offset::dump_to_pp): New decl. (region_offset::dump): New decl. (operator<, operator<=, operator>, operator>=): New decls for region_offset. * analyzer.opt (-param=analyzer-text-art-string-ellipsis-threshold=): New. (-param=analyzer-text-art-string-ellipsis-head-len=): New. (-param=analyzer-text-art-string-ellipsis-tail-len=): New. (-param=analyzer-text-art-ideal-canvas-width=): New. (fanalyzer-debug-text-art): New. * bounds-checking.cc: Include "intl.h", "diagnostic-diagram.h", and "analyzer/access-diagram.h". (class out_of_bounds::oob_region_creation_event_capacity): New. (out_of_bounds::out_of_bounds): Add "model" and "sval_hint" params. (out_of_bounds::mark_interesting_stuff): Use the base region. (out_of_bounds::add_region_creation_events): Use oob_region_creation_event_capacity. (out_of_bounds::get_dir): New pure vfunc. (out_of_bounds::maybe_show_notes): New. (out_of_bounds::maybe_show_diagram): New. (out_of_bounds::make_access_diagram): New. (out_of_bounds::m_model): New field. (out_of_bounds::m_sval_hint): New field. (out_of_bounds::m_region_creation_event_id): New field. (concrete_out_of_bounds::concrete_out_of_bounds): Update for new fields. (concrete_past_the_end::concrete_past_the_end): Likewise. (concrete_past_the_end::add_region_creation_events): Use oob_region_creation_event_capacity. (concrete_buffer_overflow::concrete_buffer_overflow): Update for new fields. (concrete_buffer_overflow::emit): Replace call to maybe_describe_array_bounds with maybe_show_notes. (concrete_buffer_overflow::get_dir): New. (concrete_buffer_over_read::concrete_buffer_over_read): Update for new fields. (concrete_buffer_over_read::emit): Replace call to maybe_describe_array_bounds with maybe_show_notes. (concrete_buffer_overflow::get_dir): New. (concrete_buffer_underwrite::concrete_buffer_underwrite): Update for new fields. (concrete_buffer_underwrite::emit): Replace call to maybe_describe_array_bounds with maybe_show_notes. (concrete_buffer_underwrite::get_dir): New. (concrete_buffer_under_read::concrete_buffer_under_read): Update for new fields. (concrete_buffer_under_read::emit): Replace call to maybe_describe_array_bounds with maybe_show_notes. (concrete_buffer_under_read::get_dir): New. (symbolic_past_the_end::symbolic_past_the_end): Update for new fields. (symbolic_buffer_overflow::symbolic_buffer_overflow): Likewise. (symbolic_buffer_overflow::emit): Call maybe_show_notes. (symbolic_buffer_overflow::get_dir): New. (symbolic_buffer_over_read::symbolic_buffer_over_read): Update for new fields. (symbolic_buffer_over_read::emit): Call maybe_show_notes. (symbolic_buffer_over_read::get_dir): New. (region_model::check_symbolic_bounds): Add "sval_hint" param. Pass it and sized_offset_reg to diagnostics. (region_model::check_region_bounds): Add "sval_hint" param, passing it to diagnostics. * diagnostic-manager.cc (diagnostic_manager::emit_saved_diagnostic): Pass logger to pending_diagnostic::emit. * engine.cc: Add logger param to pending_diagnostic::emit implementations. * infinite-recursion.cc: Likewise. * kf-analyzer.cc: Likewise. * kf.cc: Likewise. Add nullptr for new param of check_region_for_write. * pending-diagnostic.h: Likewise in decl. * region-model-manager.cc (region_model_manager::get_or_create_int_cst): Convert param from poly_int64 to const poly_wide_int_ref &. (region_model_manager::maybe_fold_binop): Support type being NULL when checking for floating-point types. Check for (X + Y) - X => Y. Be less strict about types when folding associative ops. Check for (X + Y) * CST => (X * CST) + (Y * CST). * region-model-manager.h (region_model_manager::get_or_create_int_cst): Convert param from poly_int64 to const poly_wide_int_ref &. * region-model.cc: Add logger param to pending_diagnostic::emit implementations. (region_model::check_external_function_for_access_attr): Update for new param of check_region_for_write. (region_model::deref_rvalue): Use nullptr rather than NULL. (region_model::get_capacity): Handle RK_STRING. (region_model::check_region_access): Add "sval_hint" param; pass it to check_region_bounds. (region_model::check_region_for_write): Add "sval_hint" param; pass it to check_region_access. (region_model::check_region_for_read): Add NULL for new param to check_region_access. (region_model::set_value): Pass rhs_sval to check_region_for_write. (region_model::get_representative_path_var_1): Handle SK_CONSTANT in the check for infinite recursion. * region-model.h (region_model::check_region_for_write): Add "sval_hint" param. (region_model::check_region_access): Likewise. (region_model::check_symbolic_bounds): Likewise. (region_model::check_region_bounds): Likewise. * region.cc (region_offset::make_byte_offset): New. (region_offset::calc_symbolic_bit_offset): New. (region_offset::calc_symbolic_byte_offset): New. (region_offset::dump_to_pp): New. (region_offset::dump): New. (struct linear_op): New. (operator<, operator<=, operator>, operator>=): New, for region_offset. (region::get_next_offset): New. (region::get_relative_symbolic_offset): Use ptrdiff_type_node. (field_region::get_relative_symbolic_offset): Likewise. (element_region::get_relative_symbolic_offset): Likewise. (bit_range_region::get_relative_symbolic_offset): Likewise. * region.h (region::get_next_offset): New decl. * sm-fd.cc: Add logger param to pending_diagnostic::emit implementations. * sm-file.cc: Likewise. * sm-malloc.cc: Likewise. * sm-pattern-test.cc: Likewise. * sm-sensitive.cc: Likewise. * sm-signal.cc: Likewise. * sm-taint.cc: Likewise. * store.cc (bit_range::contains_p): Allow "out" to be null. * store.h (byte_range::get_start_bit_offset): New. (byte_range::get_next_bit_offset): New. * varargs.cc: Add logger param to pending_diagnostic::emit implementations. gcc/testsuite/ChangeLog: PR analyzer/106626 * gcc.dg/analyzer/data-model-1.c (test_16): Update for out-of-bounds working. * gcc.dg/analyzer/out-of-bounds-diagram-1-ascii.c: New test. * gcc.dg/analyzer/out-of-bounds-diagram-1-debug.c: New test. * gcc.dg/analyzer/out-of-bounds-diagram-1-emoji.c: New test. * gcc.dg/analyzer/out-of-bounds-diagram-1-json.c: New test. * gcc.dg/analyzer/out-of-bounds-diagram-1-sarif.c: New test. * gcc.dg/analyzer/out-of-bounds-diagram-1-unicode.c: New test. * gcc.dg/analyzer/out-of-bounds-diagram-10.c: New test. * gcc.dg/analyzer/out-of-bounds-diagram-11.c: New test. * gcc.dg/analyzer/out-of-bounds-diagram-12.c: New test. * gcc.dg/analyzer/out-of-bounds-diagram-13.c: New test. * gcc.dg/analyzer/out-of-bounds-diagram-14.c: New test. * gcc.dg/analyzer/out-of-bounds-diagram-15.c: New test. * gcc.dg/analyzer/out-of-bounds-diagram-2.c: New test. * gcc.dg/analyzer/out-of-bounds-diagram-3.c: New test. * gcc.dg/analyzer/out-of-bounds-diagram-4.c: New test. * gcc.dg/analyzer/out-of-bounds-diagram-5-ascii.c: New test. * gcc.dg/analyzer/out-of-bounds-diagram-5-unicode.c: New test. * gcc.dg/analyzer/out-of-bounds-diagram-6.c: New test. * gcc.dg/analyzer/out-of-bounds-diagram-7.c: New test. * gcc.dg/analyzer/out-of-bounds-diagram-8.c: New test. * gcc.dg/analyzer/out-of-bounds-diagram-9.c: New test. * gcc.dg/analyzer/pattern-test-2.c: Update expected results. * gcc.dg/analyzer/pr101962.c: Update expected results. * gcc.dg/plugin/analyzer_gil_plugin.c: Add logger param to pending_diagnostic::emit implementations. Signed-off-by: David Malcolm <dmalcolm@redhat.com>
2023-06-21diagnostics: add support for "text art" diagramsDavid Malcolm43-12/+7144
Existing text output in GCC has to be implemented by writing sequentially to a pretty_printer instance. This makes it hard to implement some kinds of diagnostic output (see e.g. diagnostic-show-locus.cc). This patch adds more flexible ways of creating text output: - a canvas class, which can be "painted" to via random-access (rather that sequentially) - a table class for 2D grid layout, supporting items that span multiple rows/columns - a widget class for organizing diagrams hierarchically. The patch also expands GCC's diagnostics subsystem so that diagnostics can have "text art" diagrams - think ASCII art, but potentially including some Unicode characters, such as box-drawing chars. The new code is in a new "gcc/text-art" subdirectory and "text_art" namespace. The patch adds a new "-fdiagnostics-text-art-charset=VAL" option, with values: - "none": don't emit diagrams (added to -fdiagnostics-plain-output) - "ascii": use pure ASCII in diagrams - "unicode": allow for conservative use of unicode drawing characters (such as box-drawing characters). - "emoji" (the default): as "unicode", but potentially allow for conservative use of emoji in the output (such as U+26A0 WARNING SIGN). I made it possible to disable emoji separately from unicode as I believe there's a generation gap in acceptance of these characters (some older programmers have a visceral reaction against them, whereas younger programmers may have no problem with them). Diagrams are emitted to stderr by default. With SARIF output they are captured as a location in "relatedLocations", with the diagram as a code block in Markdown within a "markdown" property of a message. This patch doesn't add any such diagram usage to GCC, saving that for followups, apart from adding a plugin to the test suite to exercise the functionality. contrib/ChangeLog: * unicode/gen-box-drawing-chars.py: New file. * unicode/gen-combining-chars.py: New file. * unicode/gen-printable-chars.py: New file. gcc/ChangeLog: * Makefile.in (OBJS-libcommon): Add text-art/box-drawing.o, text-art/canvas.o, text-art/ruler.o, text-art/selftests.o, text-art/style.o, text-art/styled-string.o, text-art/table.o, text-art/theme.o, and text-art/widget.o. * color-macros.h (COLOR_FG_BRIGHT_BLACK): New. (COLOR_FG_BRIGHT_RED): New. (COLOR_FG_BRIGHT_GREEN): New. (COLOR_FG_BRIGHT_YELLOW): New. (COLOR_FG_BRIGHT_BLUE): New. (COLOR_FG_BRIGHT_MAGENTA): New. (COLOR_FG_BRIGHT_CYAN): New. (COLOR_FG_BRIGHT_WHITE): New. (COLOR_BG_BRIGHT_BLACK): New. (COLOR_BG_BRIGHT_RED): New. (COLOR_BG_BRIGHT_GREEN): New. (COLOR_BG_BRIGHT_YELLOW): New. (COLOR_BG_BRIGHT_BLUE): New. (COLOR_BG_BRIGHT_MAGENTA): New. (COLOR_BG_BRIGHT_CYAN): New. (COLOR_BG_BRIGHT_WHITE): New. * common.opt (fdiagnostics-text-art-charset=): New option. (diagnostic-text-art.h): New SourceInclude. (diagnostic_text_art_charset) New Enum and EnumValues. * configure: Regenerate. * configure.ac (gccdepdir): Add text-art to loop. * diagnostic-diagram.h: New file. * diagnostic-format-json.cc (json_emit_diagram): New. (diagnostic_output_format_init_json): Wire it up to context->m_diagrams.m_emission_cb. * diagnostic-format-sarif.cc: Include "diagnostic-diagram.h" and "text-art/canvas.h". (sarif_result::on_nested_diagnostic): Move code to... (sarif_result::add_related_location): ...this new function. (sarif_result::on_diagram): New. (sarif_builder::emit_diagram): New. (sarif_builder::make_message_object_for_diagram): New. (sarif_emit_diagram): New. (diagnostic_output_format_init_sarif): Set context->m_diagrams.m_emission_cb to sarif_emit_diagram. * diagnostic-text-art.h: New file. * diagnostic.cc: Include "diagnostic-text-art.h", "diagnostic-diagram.h", and "text-art/theme.h". (diagnostic_initialize): Initialize context->m_diagrams and call diagnostics_text_art_charset_init. (diagnostic_finish): Clean up context->m_diagrams.m_theme. (diagnostic_emit_diagram): New. (diagnostics_text_art_charset_init): New. * diagnostic.h (text_art::theme): New forward decl. (class diagnostic_diagram): Likewise. (diagnostic_context::m_diagrams): New field. (diagnostic_emit_diagram): New decl. * doc/invoke.texi (Diagnostic Message Formatting Options): Add -fdiagnostics-text-art-charset=. (-fdiagnostics-plain-output): Add -fdiagnostics-text-art-charset=none. * gcc.cc: Include "diagnostic-text-art.h". (driver_handle_option): Handle OPT_fdiagnostics_text_art_charset_. * opts-common.cc (decode_cmdline_options_to_array): Add "-fdiagnostics-text-art-charset=none" to expanded_args for -fdiagnostics-plain-output. * opts.cc: Include "diagnostic-text-art.h". (common_handle_option): Handle OPT_fdiagnostics_text_art_charset_. * pretty-print.cc (pp_unicode_character): New. * pretty-print.h (pp_unicode_character): New decl. * selftest-run-tests.cc: Include "text-art/selftests.h". (selftest::run_tests): Call text_art_tests. * text-art/box-drawing-chars.inc: New file, generated by contrib/unicode/gen-box-drawing-chars.py. * text-art/box-drawing.cc: New file. * text-art/box-drawing.h: New file. * text-art/canvas.cc: New file. * text-art/canvas.h: New file. * text-art/ruler.cc: New file. * text-art/ruler.h: New file. * text-art/selftests.cc: New file. * text-art/selftests.h: New file. * text-art/style.cc: New file. * text-art/styled-string.cc: New file. * text-art/table.cc: New file. * text-art/table.h: New file. * text-art/theme.cc: New file. * text-art/theme.h: New file. * text-art/types.h: New file. * text-art/widget.cc: New file. * text-art/widget.h: New file. gcc/testsuite/ChangeLog: * gcc.dg/plugin/diagnostic-test-text-art-ascii-bw.c: New test. * gcc.dg/plugin/diagnostic-test-text-art-ascii-color.c: New test. * gcc.dg/plugin/diagnostic-test-text-art-none.c: New test. * gcc.dg/plugin/diagnostic-test-text-art-unicode-bw.c: New test. * gcc.dg/plugin/diagnostic-test-text-art-unicode-color.c: New test. * gcc.dg/plugin/diagnostic_plugin_test_text_art.c: New test plugin. * gcc.dg/plugin/plugin.exp (plugin_test_list): Add them. libcpp/ChangeLog: * charset.cc (get_cppchar_property): New function template, based on... (cpp_wcwidth): ...this function. Rework to use the above. Include "combining-chars.inc". (cpp_is_combining_char): New function Include "printable-chars.inc". (cpp_is_printable_char): New function * combining-chars.inc: New file, generated by contrib/unicode/gen-combining-chars.py. * include/cpplib.h (cpp_is_combining_char): New function decl. (cpp_is_printable_char): New function decl. * printable-chars.inc: New file, generated by contrib/unicode/gen-printable-chars.py. Signed-off-by: David Malcolm <dmalcolm@redhat.com>
2023-06-21testsuite: move handle-multiline-outputs to before check for blank linesDavid Malcolm6-15/+21
I have followup patches that require checking for multiline patterns that have blank lines within them, so this moves the handling of multiline patterns before the check for blank lines, allowing for such multiline patterns. Doing so uncovers some issues with existing multiline directives, which the patch fixes. gcc/testsuite/ChangeLog: * c-c++-common/Wlogical-not-parentheses-2.c: Split up the multiline directive. * gcc.dg/analyzer/malloc-macro-inline-events.c: Remove redundant dg-regexp directives. * gcc.dg/missing-header-fixit-5.c: Split up the multiline directives. * lib/gcc-dg.exp (gcc-dg-prune): Move call to handle-multiline-outputs from prune_gcc_output to here. * lib/multiline.exp (dg-end-multiline-output): Move call to maybe-handle-nn-line-numbers from prune_gcc_output to here. * lib/prune.exp (prune_gcc_output): Move calls to maybe-handle-nn-line-numbers and handle-multiline-outputs from here to the above. Signed-off-by: David Malcolm <dmalcolm@redhat.com>
2023-06-21compiler: determine types of Slice_{value,info} expressionsIan Lance Taylor3-4/+13
This fixes an accidental omission in the determine types pass. Test case is https://go.dev/cl/505015. Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/504797
2023-06-22Daily bump.GCC Administrator5-1/+317
2023-06-21function: Change return type of predicate function from int to boolUros Bizjak2-45/+42
Also change some internal variables to bool and some functions to void. gcc/ChangeLog: * function.h (emit_initial_value_sets): Change return type from int to void. (aggregate_value_p): Change return type from int to bool. (prologue_contains): Ditto. (epilogue_contains): Ditto. (prologue_epilogue_contains): Ditto. * function.cc (temp_slot): Make "in_use" variable bool. (make_slot_available): Update for changed "in_use" variable. (assign_stack_temp_for_type): Ditto. (emit_initial_value_sets): Change return type from int to void and update function body accordingly. (instantiate_virtual_regs): Ditto. (rest_of_handle_thread_prologue_and_epilogue): Ditto. (safe_insn_predicate): Change return type from int to bool. (aggregate_value_p): Change return type from int to bool and update function body accordingly. (prologue_contains): Change return type from int to bool. (prologue_epilogue_contains): Ditto.
2023-06-21c-family: implement -ffp-contract=onAlexander Monakov5-6/+89
Implement -ffp-contract=on for C and C++ without changing default behavior (=off for -std=cNN, =fast for C++ and -std=gnuNN). gcc/c-family/ChangeLog: * c-gimplify.cc (fma_supported_p): New helper. (c_gimplify_expr) [PLUS_EXPR, MINUS_EXPR]: Implement FMA contraction. gcc/ChangeLog: * common.opt (fp_contract_mode) [on]: Remove fallback. * config/sh/sh.md (*fmasf4): Correct flag_fp_contract_mode test. * doc/invoke.texi (-ffp-contract): Update. * trans-mem.cc (diagnose_tm_1): Skip internal function calls.
2023-06-21Fortran: Fix some bugs in associate [PR87477]Paul Thomas12-29/+286
2023-06-21 Paul Thomas <pault@gcc.gnu.org> gcc/fortran PR fortran/87477 PR fortran/88688 PR fortran/94380 PR fortran/107900 PR fortran/110224 * decl.cc (char_len_param_value): Fix memory leak. (resolve_block_construct): Remove unnecessary static decls. * expr.cc (gfc_is_ptr_fcn): New function. (gfc_check_vardef_context): Use it to permit pointer function result selectors to be used for associate names in variable definition context. * gfortran.h: Prototype for gfc_is_ptr_fcn. * match.cc (build_associate_name): New function. (gfc_match_select_type): Use the new function to replace inline version and to build a new associate name for the case where the supplied associate name is already used for that purpose. * resolve.cc (resolve_assoc_var): Call gfc_is_ptr_fcn to allow associate names with pointer function targets to be used in variable definition context. * trans-decl.cc (gfc_get_symbol_decl): Unlimited polymorphic variables need deferred initialisation of the vptr. (gfc_trans_deferred_vars): Do the vptr initialisation. * trans-stmt.cc (trans_associate_var): Ensure that a pointer associate name points to the target of the selector and not the selector itself. gcc/testsuite/ PR fortran/87477 PR fortran/107900 * gfortran.dg/pr107900.f90 : New test PR fortran/110224 * gfortran.dg/pr110224.f90 : New test PR fortran/88688 * gfortran.dg/pr88688.f90 : New test PR fortran/94380 * gfortran.dg/pr94380.f90 : New test PR fortran/95398 * gfortran.dg/pr95398.f90 : Set -std=f2008, bump the line numbers in the error tests by two and change the text in two.
2023-06-21Fortran: Seg fault passing string to type cptr dummy [PR108961].Paul Thomas2-1/+30
2023-06-21 Paul Thomas <pault@gcc.gnu.org> gcc/fortran PR fortran/108961 * trans-expr.cc (gfc_conv_procedure_call): The hidden string length must not be passed to a formal arg of type(cptr). gcc/testsuite/ PR fortran/108961 * gfortran.dg/pr108961.f90: New test.
2023-06-21vect: Add testcases for unsigned conversions [PR110018]Uros Bizjak2-10/+104
Also test convresions with unsigned types. PR target/110018 gcc/testsuite/ChangeLog: * gcc.target/i386/pr110018-1.c: Use explicit signed types. * gcc.target/i386/pr110018-2.c: New test.
2023-06-21aarch64: Avoid same input and output Z register for gather loadsKyrylo Tkachov4-71/+263
The architecture recommends that load-gather instructions avoid using the same Z register for the load address and the destination, and the Software Optimization Guides for Arm cores recommend that as well. This means that for code like: svuint64_t food (svbool_t p, uint64_t *in, svint64_t offsets, svuint64_t a) { return svadd_u64_x (p, a, svld1_gather_offset(p, in, offsets)); } we'll want to avoid generating the current: food: ld1d z0.d, p0/z, [x0, z0.d] // Z0 reused as input and output. add z0.d, z1.d, z0.d ret However, we still want to avoid generating extra moves where there were none before, so the tight aarch64-sve-acle.exp tests for load gathers should still pass as they are. This patch implements that recommendation for the load gather patterns by: * duplicating the alternatives * marking the output operand as early clobber * Tying the input Z register operand in the original alternatives to 0 * Penalising the original alternatives with '?' This results in a large-ish patch in terms of diff lines but the new compact syntax (thanks Tamar) makes it quite a readable an regular change. The benchmark numbers on a Neoverse V1 on fprate look okay: diff 503.bwaves_r 0.00% 507.cactuBSSN_r 0.00% 508.namd_r 0.00% 510.parest_r 0.55% 511.povray_r 0.22% 519.lbm_r 0.00% 521.wrf_r 0.00% 526.blender_r 0.00% 527.cam4_r 0.56% 538.imagick_r 0.00% 544.nab_r 0.00% 549.fotonik3d_r 0.00% 554.roms_r 0.00% fprate 0.10% Bootstrapped and tested on aarch64-none-linux-gnu. gcc/ChangeLog: * config/aarch64/aarch64-sve.md (mask_gather_load<mode><v_int_container>): Add alternatives to prefer to avoid same input and output Z register. (mask_gather_load<mode><v_int_container>): Likewise. (*mask_gather_load<mode><v_int_container>_<su>xtw_unpacked): Likewise. (*mask_gather_load<mode><v_int_container>_sxtw): Likewise. (*mask_gather_load<mode><v_int_container>_uxtw): Likewise. (@aarch64_gather_load_<ANY_EXTEND:optab><SVE_4HSI:mode><SVE_4BHI:mode>): Likewise. (@aarch64_gather_load_<ANY_EXTEND:optab><SVE_2HSDI:mode><SVE_2BHSI:mode>): Likewise. (*aarch64_gather_load_<ANY_EXTEND:optab><SVE_2HSDI:mode> <SVE_2BHSI:mode>_<ANY_EXTEND2:su>xtw_unpacked): Likewise. (*aarch64_gather_load_<ANY_EXTEND:optab><SVE_2HSDI:mode> <SVE_2BHSI:mode>_sxtw): Likewise. (*aarch64_gather_load_<ANY_EXTEND:optab><SVE_2HSDI:mode> <SVE_2BHSI:mode>_uxtw): Likewise. (@aarch64_ldff1_gather<mode>): Likewise. (@aarch64_ldff1_gather<mode>): Likewise. (*aarch64_ldff1_gather<mode>_sxtw): Likewise. (*aarch64_ldff1_gather<mode>_uxtw): Likewise. (@aarch64_ldff1_gather_<ANY_EXTEND:optab><VNx4_WIDE:mode> <VNx4_NARROW:mode>): Likewise. (@aarch64_ldff1_gather_<ANY_EXTEND:optab><VNx2_WIDE:mode> <VNx2_NARROW:mode>): Likewise. (*aarch64_ldff1_gather_<ANY_EXTEND:optab><VNx2_WIDE:mode> <VNx2_NARROW:mode>_sxtw): Likewise. (*aarch64_ldff1_gather_<ANY_EXTEND:optab><VNx2_WIDE:mode> <VNx2_NARROW:mode>_uxtw): Likewise. * config/aarch64/aarch64-sve2.md (@aarch64_gather_ldnt<mode>): Likewise. (@aarch64_gather_ldnt_<ANY_EXTEND:optab><SVE_FULL_SDI:mode> <SVE_PARTIAL_I:mode>): Likewise. gcc/testsuite/ChangeLog: * gcc.target/aarch64/sve/gather_earlyclobber.c: New test. * gcc.target/aarch64/sve2/gather_earlyclobber.c: New test.
2023-06-21aarch64: Convert SVE gather patterns to compact syntaxKyrylo Tkachov2-191/+211
This patch converts the SVE load gather patterns to the new compact syntax that Tamar introduced. This allows for a future patch I want to contribute to add more alternatives that are better viewed in the more compact form. The lines in some patterns are >80 long now, but I think that's unavoidable and those patterns already had overly long constraint strings. No functional change intended. Bootstrapped and tested on aarch64-none-linux-gnu. gcc/ChangeLog: * config/aarch64/aarch64-sve.md (mask_gather_load<mode><v_int_container>): Convert to compact alternatives syntax. (mask_gather_load<mode><v_int_container>): Likewise. (*mask_gather_load<mode><v_int_container>_<su>xtw_unpacked): Likewise. (*mask_gather_load<mode><v_int_container>_sxtw): Likewise. (*mask_gather_load<mode><v_int_container>_uxtw): Likewise. (@aarch64_gather_load_<ANY_EXTEND:optab><SVE_4HSI:mode><SVE_4BHI:mode>): Likewise. (@aarch64_gather_load_<ANY_EXTEND:optab><SVE_2HSDI:mode><SVE_2BHSI:mode>): Likewise. (*aarch64_gather_load_<ANY_EXTEND:optab><SVE_2HSDI:mode> <SVE_2BHSI:mode>_<ANY_EXTEND2:su>xtw_unpacked): Likewise. (*aarch64_gather_load_<ANY_EXTEND:optab><SVE_2HSDI:mode> <SVE_2BHSI:mode>_sxtw): Likewise. (*aarch64_gather_load_<ANY_EXTEND:optab><SVE_2HSDI:mode> <SVE_2BHSI:mode>_uxtw): Likewise. (@aarch64_ldff1_gather<mode>): Likewise. (@aarch64_ldff1_gather<mode>): Likewise. (*aarch64_ldff1_gather<mode>_sxtw): Likewise. (*aarch64_ldff1_gather<mode>_uxtw): Likewise. (@aarch64_ldff1_gather_<ANY_EXTEND:optab><VNx4_WIDE:mode> <VNx4_NARROW:mode>): Likewise. (@aarch64_ldff1_gather_<ANY_EXTEND:optab><VNx2_WIDE:mode> <VNx2_NARROW:mode>): Likewise. (*aarch64_ldff1_gather_<ANY_EXTEND:optab><VNx2_WIDE:mode> <VNx2_NARROW:mode>_sxtw): Likewise. (*aarch64_ldff1_gather_<ANY_EXTEND:optab><VNx2_WIDE:mode> <VNx2_NARROW:mode>_uxtw): Likewise. * config/aarch64/aarch64-sve2.md (@aarch64_gather_ldnt<mode>): Likewise. (@aarch64_gather_ldnt_<ANY_EXTEND:optab><SVE_FULL_SDI:mode> <SVE_PARTIAL_I:mode>): Likewise.
2023-06-21Revert "aarch64: Convert SVE gather patterns to compact syntax"Kyrylo Tkachov2-275/+191
This reverts commit bb3c69058a5fb874ea3c5c26bfb331d33d0497c3.
2023-06-21Move can_vec_mask_load_store_p and get_len_load_store_mode from ↵Ju-Zhe Zhong5-69/+68
"optabs-query" into "optabs-tree" Since we want both can_vec_mask_load_store_p and get_len_load_store_mode can see "internal_fn", move these 2 functions into optabs-tree. gcc/ChangeLog: * optabs-query.cc (can_vec_mask_load_store_p): Move to optabs-tree.cc. (get_len_load_store_mode): Ditto. * optabs-query.h (can_vec_mask_load_store_p): Move to optabs-tree.h. (get_len_load_store_mode): Ditto. * optabs-tree.cc (can_vec_mask_load_store_p): New function. (get_len_load_store_mode): Ditto. * optabs-tree.h (can_vec_mask_load_store_p): Ditto. (get_len_load_store_mode): Ditto. * tree-if-conv.cc: include optabs-tree instead of optabs-query
2023-06-21Less strip_offset in IVOPTsRichard Biener1-3/+4
This avoids one strip_offset use in add_iv_candidate_for_use where we know it operates on a sizetype quantity. * tree-ssa-loop-ivopts.cc (add_iv_candidate_for_use): Use split_constant_offset for the POINTER_PLUS_EXPR case.
2023-06-21Less strip_offset in IVOPTsRichard Biener1-4/+4
This avoids a strip_offset use in record_group_use where we know it operates on addresses. * tree-ssa-loop-ivopts.cc (record_group_use): Use split_constant_offset.
2023-06-21Hide IVOPTs strip_offsetRichard Biener3-6/+9
PR110243 shows strip_offset has some correctness issues, the following avoids using it from loop distribution which can use the more correct split_constant_offset from data-ref analysis instead. The patch then un-exports the function from IVOPTs. * tree-loop-distribution.cc (classify_builtin_st): Use split_constant_offset. * tree-ssa-loop-ivopts.h (strip_offset): Remove. * tree-ssa-loop-ivopts.cc (strip_offset): Make static.
2023-06-21aarch64: Convert SVE gather patterns to compact syntaxKyrylo Tkachov2-191/+275
This patch converts the SVE load gather patterns to the new compact syntax that Tamar introduced. This allows for a future patch I want to contribute to add more alternatives that are better viewed in the more compact form. The lines in some patterns are >80 long now, but I think that's unavoidable and those patterns already had overly long constraint strings. No functional change intended. Bootstrapped and tested on aarch64-none-linux-gnu. gcc/ChangeLog: * config/aarch64/aarch64-sve.md (mask_gather_load<mode><v_int_container>): Convert to compact alternatives syntax. (mask_gather_load<mode><v_int_container>): Likewise. (*mask_gather_load<mode><v_int_container>_<su>xtw_unpacked): Likewise. (*mask_gather_load<mode><v_int_container>_sxtw): Likewise. (*mask_gather_load<mode><v_int_container>_uxtw): Likewise. (@aarch64_gather_load_<ANY_EXTEND:optab><SVE_4HSI:mode><SVE_4BHI:mode>): Likewise. (@aarch64_gather_load_<ANY_EXTEND:optab><SVE_2HSDI:mode><SVE_2BHSI:mode>): Likewise. (*aarch64_gather_load_<ANY_EXTEND:optab><SVE_2HSDI:mode> <SVE_2BHSI:mode>_<ANY_EXTEND2:su>xtw_unpacked): Likewise. (*aarch64_gather_load_<ANY_EXTEND:optab><SVE_2HSDI:mode> <SVE_2BHSI:mode>_sxtw): Likewise. (*aarch64_gather_load_<ANY_EXTEND:optab><SVE_2HSDI:mode> <SVE_2BHSI:mode>_uxtw): Likewise. (@aarch64_ldff1_gather<mode>): Likewise. (@aarch64_ldff1_gather<mode>): Likewise. (*aarch64_ldff1_gather<mode>_sxtw): Likewise. (*aarch64_ldff1_gather<mode>_uxtw): Likewise. (@aarch64_ldff1_gather_<ANY_EXTEND:optab><VNx4_WIDE:mode> <VNx4_NARROW:mode>): Likewise. (@aarch64_ldff1_gather_<ANY_EXTEND:optab><VNx2_WIDE:mode> <VNx2_NARROW:mode>): Likewise. (*aarch64_ldff1_gather_<ANY_EXTEND:optab><VNx2_WIDE:mode> <VNx2_NARROW:mode>_sxtw): Likewise. (*aarch64_ldff1_gather_<ANY_EXTEND:optab><VNx2_WIDE:mode> <VNx2_NARROW:mode>_uxtw): Likewise. * config/aarch64/aarch64-sve2.md (@aarch64_gather_ldnt<mode>): Likewise. (@aarch64_gather_ldnt_<ANY_EXTEND:optab><SVE_FULL_SDI:mode> <SVE_PARTIAL_I:mode>): Likewise.
2023-06-21docs: replace backslashchar [PR 110329].Tamar Christina1-1/+1
It seems like @blackslashchar{} is a relatively new addition to texinfo. Other parts of the docs use @samp{\} so use it here too so older distros work. gcc/ChangeLog: PR other/110329 * doc/md.texi: Replace backslashchar.
2023-06-21[i386] Reject too large vectors for partial vector vectorizationRichard Biener3-0/+51
The following works around the lack of the x86 backend making the vectorizer compare the costs of the different possible vector sizes the backed advertises through the vector_modes hook. When enabling masked epilogues or main loops then this means we will select the prefered vector mode which is usually the largest even for loops that do not iterate close to the times the vector has lanes. When not using masking the vectorizer would reject any mode resulting in a VF bigger than the number of iterations but with masking they are simply masked out. So this overloads the finish_cost function and matches for the problematic case, forcing a high cost to make us try a smaller vector size. * config/i386/i386.cc (ix86_vector_costs::finish_cost): Overload. For masked main loops make sure the vectorization factor isn't more than double the number of iterations. * gcc.target/i386/vect-partial-vectors-1.c: New testcase. * gcc.target/i386/vect-partial-vectors-2.c: Likewise.
2023-06-21x86: make VPTERNLOG* usable on less than 512-bit operands with just AVX512FJan Beulich3-11/+59
There's no reason to constrain this to AVX512VL, unless instructed so by -mprefer-vector-width=, as the wider operation is unusable for more narrow operands only when the possible memory source is a non-broadcast one. This way even the scalar copysign<mode>3 can benefit from the operation being a single-insn one (leaving aside moves which the compiler decides to insert for unclear reasons, and leaving aside the fact that bcst_mem_operand() is too restrictive for broadcast to be embedded right into VPTERNLOG*). While there also bring *<avx512>_vternlog<mode>_all's in sync with that of the three splitters. Along with this also request value duplication in ix86_expand_copysign()'s call to ix86_build_signbit_mask(), eliminating excess space allocation in .rodata.*, filled with zeros which are never read. gcc/ * config/i386/i386-expand.cc (ix86_expand_copysign): Request value duplication by ix86_build_signbit_mask() when AVX512F and not HFmode. * config/i386/sse.md (*<avx512>_vternlog<mode>_all): Convert to 2-alternative form. Adjust "mode" attribute. Add "enabled" attribute. (*<avx512>_vpternlog<mode>_1): Also permit when TARGET_AVX512F && !TARGET_PREFER_AVX256. (*<avx512>_vpternlog<mode>_2): Likewise. (*<avx512>_vpternlog<mode>_3): Likewise. gcc/testsuite/ * gcc.target/i386/avx512f-copysign.c: New test.
2023-06-21x86: add -mprefer-vector-width=512 to new avx512f-dupv2di.c testcaseJan Beulich1-1/+1
This is to cover testing also being done with -march=cascadelake. gcc/testsuite/ * gcc.target/i386/avx512f-dupv2di.c: Add -mprefer-vector-width=512.
2023-06-21Use intermiediate integer type for float_expr/fix_trunc_expr when direct ↵liuhongt2-2/+158
optab is not existed. We have already use intermidate type in case WIDEN, but not for NONE, this patch extended that. gcc/ChangeLog: PR target/110018 * tree-vect-stmts.cc (vectorizable_conversion): Use intermiediate integer type for float_expr/fix_trunc_expr when direct optab is not existed. gcc/testsuite/ChangeLog: * gcc.target/i386/pr110018-1.c: New test.
2023-06-21Daily bump.GCC Administrator5-1/+623
2023-06-20gensupport: drop suppport for define_cond_exec from compact syntacTamar Christina1-2/+2
define_cond_exec does not support the special @@ syntax and so can't support {@. As such just remove support for it. gcc/ChangeLog: PR bootstrap/110324 * gensupport.cc (convert_syntax): Explicitly check for RTX code.
2023-06-20libcpp: Improve location for macro names [PR66290]Lewis Hyatt25-333/+385
When libcpp reports diagnostics whose locus is a macro name (such as for -Wunused-macros), it uses the location in the cpp_macro object that was stored by _cpp_new_macro. This is currently set to pfile->directive_line, which contains the line number only and no column information. This patch changes the stored location to the src_loc for the token defining the macro name, which includes the location and range information. libcpp/ChangeLog: PR c++/66290 * macro.cc (_cpp_create_definition): Add location argument. * internal.h (_cpp_create_definition): Adjust prototype. * directives.cc (do_define): Pass new location argument to _cpp_create_definition. (do_undef): Stop passing inferior location to cpp_warning_with_line; the default from cpp_warning is better. (cpp_pop_definition): Pass new location argument to _cpp_create_definition. * pch.cc (cpp_read_state): Likewise. gcc/testsuite/ChangeLog: PR c++/66290 * c-c++-common/cpp/macro-ranges.c: New test. * c-c++-common/cpp/line-2.c: Adapt to check for column information on macro-related libcpp warnings. * c-c++-common/cpp/line-3.c: Likewise. * c-c++-common/cpp/macro-arg-count-1.c: Likewise. * c-c++-common/cpp/pr58844-1.c: Likewise. * c-c++-common/cpp/pr58844-2.c: Likewise. * c-c++-common/cpp/warning-zero-location.c: Likewise. * c-c++-common/pragma-diag-14.c: Likewise. * c-c++-common/pragma-diag-15.c: Likewise. * g++.dg/modules/macro-2_d.C: Likewise. * g++.dg/modules/macro-4_d.C: Likewise. * g++.dg/modules/macro-4_e.C: Likewise. * g++.dg/spellcheck-macro-ordering.C: Likewise. * gcc.dg/builtin-redefine.c: Likewise. * gcc.dg/cpp/Wunused.c: Likewise. * gcc.dg/cpp/redef2.c: Likewise. * gcc.dg/cpp/redef3.c: Likewise. * gcc.dg/cpp/redef4.c: Likewise. * gcc.dg/cpp/ucnid-11-utf8.c: Likewise. * gcc.dg/cpp/ucnid-11.c: Likewise. * gcc.dg/cpp/undef2.c: Likewise. * gcc.dg/cpp/warn-redefined-2.c: Likewise. * gcc.dg/cpp/warn-redefined.c: Likewise. * gcc.dg/cpp/warn-unused-macros-2.c: Likewise. * gcc.dg/cpp/warn-unused-macros.c: Likewise.
2023-06-20aarch64: Fix gcc.target/aarch64/sve/pcs failuresRichard Sandiford67-92/+95
Several gcc.target/aarch64/sve/pcs tests started failing after 6a2e8dcbbd4, because the tests weren't robust against whether an indirect argument register or the stack pointer was used as the base for stores. The patch allows either base register when there is only one indirect argument. It disables -fcprop-registers in cases where there are sometimes multiple indirect arguments, since the name of the argument register is then an important part of the test. Disabling -fcprop-registers gives poor final register allocation, since: * combine's make_more_copies hack adds extra redundant moves * code with those moves is not allocated as well as moves without them * we often rely on -fcprop-registers to clean up the allocation later The patch therefore disables combine in the same tests as cprop-registers. gcc/testsuite/ * gcc.target/aarch64/sve/pcs/args_1.c: Match moves from the stack pointer to indirect argument registers and allow either to be used as the base register in subsequent stores. * gcc.target/aarch64/sve/pcs/args_8.c: Likewise. * gcc.target/aarch64/sve/pcs/args_2.c: Allow the store of the indirect argument to happen via the argument register or the stack pointer. * gcc.target/aarch64/sve/pcs/args_3.c: Likewise. * gcc.target/aarch64/sve/pcs/args_4.c: Likewise. * gcc.target/aarch64/sve/pcs/args_5_be_bf16.c: Likewise. * gcc.target/aarch64/sve/pcs/args_5_be_f16.c: Likewise. * gcc.target/aarch64/sve/pcs/args_5_be_f32.c: Likewise. * gcc.target/aarch64/sve/pcs/args_5_be_f64.c: Likewise. * gcc.target/aarch64/sve/pcs/args_5_be_s16.c: Likewise. * gcc.target/aarch64/sve/pcs/args_5_be_s32.c: Likewise. * gcc.target/aarch64/sve/pcs/args_5_be_s64.c: Likewise. * gcc.target/aarch64/sve/pcs/args_5_be_s8.c: Likewise. * gcc.target/aarch64/sve/pcs/args_5_be_u16.c: Likewise. * gcc.target/aarch64/sve/pcs/args_5_be_u32.c: Likewise. * gcc.target/aarch64/sve/pcs/args_5_be_u64.c: Likewise. * gcc.target/aarch64/sve/pcs/args_5_be_u8.c: Likewise. * gcc.target/aarch64/sve/pcs/args_5_le_bf16.c: Likewise. * gcc.target/aarch64/sve/pcs/args_5_le_f16.c: Likewise. * gcc.target/aarch64/sve/pcs/args_5_le_f32.c: Likewise. * gcc.target/aarch64/sve/pcs/args_5_le_f64.c: Likewise. * gcc.target/aarch64/sve/pcs/args_5_le_s16.c: Likewise. * gcc.target/aarch64/sve/pcs/args_5_le_s32.c: Likewise. * gcc.target/aarch64/sve/pcs/args_5_le_s64.c: Likewise. * gcc.target/aarch64/sve/pcs/args_5_le_s8.c: Likewise. * gcc.target/aarch64/sve/pcs/args_5_le_u16.c: Likewise. * gcc.target/aarch64/sve/pcs/args_5_le_u32.c: Likewise. * gcc.target/aarch64/sve/pcs/args_5_le_u64.c: Likewise. * gcc.target/aarch64/sve/pcs/args_5_le_u8.c: Likewise. * gcc.target/aarch64/sve/pcs/args_6_be_bf16.c: Disable -fcprop-registers and combine. * gcc.target/aarch64/sve/pcs/args_6_be_f16.c: Likewise. * gcc.target/aarch64/sve/pcs/args_6_be_f32.c: Likewise. * gcc.target/aarch64/sve/pcs/args_6_be_f64.c: Likewise. * gcc.target/aarch64/sve/pcs/args_6_be_s16.c: Likewise. * gcc.target/aarch64/sve/pcs/args_6_be_s32.c: Likewise. * gcc.target/aarch64/sve/pcs/args_6_be_s64.c: Likewise. * gcc.target/aarch64/sve/pcs/args_6_be_s8.c: Likewise. * gcc.target/aarch64/sve/pcs/args_6_be_u16.c: Likewise. * gcc.target/aarch64/sve/pcs/args_6_be_u32.c: Likewise. * gcc.target/aarch64/sve/pcs/args_6_be_u64.c: Likewise. * gcc.target/aarch64/sve/pcs/args_6_be_u8.c: Likewise. * gcc.target/aarch64/sve/pcs/args_6_le_bf16.c: Likewise. * gcc.target/aarch64/sve/pcs/args_6_le_f16.c: Likewise. * gcc.target/aarch64/sve/pcs/args_6_le_f32.c: Likewise. * gcc.target/aarch64/sve/pcs/args_6_le_f64.c: Likewise. * gcc.target/aarch64/sve/pcs/args_6_le_s16.c: Likewise. * gcc.target/aarch64/sve/pcs/args_6_le_s32.c: Likewise. * gcc.target/aarch64/sve/pcs/args_6_le_s64.c: Likewise. * gcc.target/aarch64/sve/pcs/args_6_le_s8.c: Likewise. * gcc.target/aarch64/sve/pcs/args_6_le_u16.c: Likewise. * gcc.target/aarch64/sve/pcs/args_6_le_u32.c: Likewise. * gcc.target/aarch64/sve/pcs/args_6_le_u64.c: Likewise. * gcc.target/aarch64/sve/pcs/args_6_le_u8.c: Likewise. * gcc.target/aarch64/sve/pcs/varargs_1.c: Likewise. * gcc.target/aarch64/sve/pcs/varargs_2_f16.c: Likewise. * gcc.target/aarch64/sve/pcs/varargs_2_f32.c: Likewise. * gcc.target/aarch64/sve/pcs/varargs_2_f64.c: Likewise. * gcc.target/aarch64/sve/pcs/varargs_2_s16.c: Likewise. * gcc.target/aarch64/sve/pcs/varargs_2_s32.c: Likewise. * gcc.target/aarch64/sve/pcs/varargs_2_s64.c: Likewise. * gcc.target/aarch64/sve/pcs/varargs_2_s8.c: Likewise. * gcc.target/aarch64/sve/pcs/varargs_2_u16.c: Likewise. * gcc.target/aarch64/sve/pcs/varargs_2_u32.c: Likewise. * gcc.target/aarch64/sve/pcs/varargs_2_u64.c: Likewise. * gcc.target/aarch64/sve/pcs/varargs_2_u8.c: Likewise. * gcc.target/aarch64/sve/pcs/varargs_3_nosc.c: Likewise. * gcc.target/aarch64/sve/pcs/varargs_3_sc.c: Likewise.
2023-06-20aarch64: Robustify stack tie handlingRichard Sandiford2-7/+18
The SVE handling of stack clash protection copied the stack pointer to X11 before the probe and set up X11 as the CFA for unwind purposes: /* This is done to provide unwinding information for the stack adjustments we're about to do, however to prevent the optimizers from removing the R11 move and leaving the CFA note (which would be very wrong) we tie the old and new stack pointer together. The tie will expand to nothing but the optimizers will not touch the instruction. */ rtx stack_ptr_copy = gen_rtx_REG (Pmode, STACK_CLASH_SVE_CFA_REGNUM); emit_move_insn (stack_ptr_copy, stack_pointer_rtx); emit_insn (gen_stack_tie (stack_ptr_copy, stack_pointer_rtx)); /* We want the CFA independent of the stack pointer for the duration of the loop. */ add_reg_note (insn, REG_CFA_DEF_CFA, stack_ptr_copy); RTX_FRAME_RELATED_P (insn) = 1; -fcprop-registers is now smart enough to realise that X11 = SP, replace X11 with SP in the stack tie, and delete the instruction created above. This patch tries to prevent that by making stack_tie fussy about the register numbers. It fixes failures in gcc.target/aarch64/sve/pcs/stack_clash*.c. gcc/ * config/aarch64/aarch64.md (stack_tie): Hard-code the first register operand to the stack pointer. Require the second register operand to have the number specified in a separate const_int operand. * config/aarch64/aarch64.cc (aarch64_emit_stack_tie): New function. (aarch64_allocate_and_probe_stack_space): Use it. (aarch64_expand_prologue, aarch64_expand_epilogue): Likewise. (aarch64_expand_epilogue): Likewise.
2023-06-20tree-ssa-math-opts: Small uaddc/usubc pattern matching improvement [PR79173]Jakub Jelinek2-0/+38
In the following testcase we fail to pattern recognize the least significant .UADDC call. The reason is that arg3 in that case is _3 = .ADD_OVERFLOW (...); _2 = __imag__ _3; _1 = _2 != 0; arg3 = (unsigned long) _1; and while before the changes arg3 has a single use in some .ADD_OVERFLOW later on, we add a .UADDC call next to it (and gsi_remove/gsi_replace only what is strictly necessary and leave quite a few dead stmts around which next DCE cleans up) and so it all of sudden isn't used just once, but twice (.ADD_OVERFLOW and .UADDC) and so uaddc_cast fails. While we could tweak uaddc_cast and not require has_single_use in these uses, there is also no vrp that would figure out that because __imag__ _3 is in [0, 1] range, it can just use arg3 = __imag__ _3; and drop the comparison and cast. We already search if either arg2 or arg3 is ultimately set from __imag__ of .{{ADD,SUB}_OVERFLOW,U{ADD,SUB}C} call, so the following patch just remembers the lhs of __imag__ from that case and uses it later. 2023-06-20 Jakub Jelinek <jakub@redhat.com> PR middle-end/79173 * tree-ssa-math-opts.cc (match_uaddc_usubc): Remember lhs of IMAGPART_EXPR of arg2/arg3 and use that as arg3 if it has the right type. * g++.target/i386/pr79173-1.C: New test.
2023-06-20calls: Change return type of predicate function from int to boolUros Bizjak2-82/+86
Also change some internal variables and some function arguments to bool. gcc/ChangeLog: * calls.h (setjmp_call_p): Change return type from int to bool. * calls.cc (struct arg_data): Change "pass_on_stack" to bool. (store_one_arg): Change return type from int to bool and adjust function body accordingly. Change "sibcall_failure" variable to bool. (finalize_must_preallocate): Ditto. Change *must_preallocate pointer argument to bool. Change "partial_seen" variable to bool. (load_register_parameters): Change *sibcall_failure pointer argument to bool. (check_sibcall_argument_overlap_1): Change return type from int to bool and adjust function body accordingly. (check_sibcall_argument_overlap): Ditto. Change "mark_stored_args_map" argument to bool. (emit_call_1): Change "already_popped" variable to bool. (setjmp_call_p): Change return type from int to bool and adjust function body accordingly. (initialize_argument_information): Change *must_preallocate pointer argument to bool. (expand_call): Change "pcc_struct_value", "must_preallocate" and "sibcall_failure" variables to bool. (emit_library_call_value_1): Change "pcc_struct_value" variable to bool.
2023-06-20runtime: use a C function to call mmapIan Lance Taylor1-1/+1
The final argument to mmap, of type off_t, varies. In CL 445375 we changed it to always use the C off_t type, but that broke 32-bit big-endian Linux systems. On those systems, using the C off_t type requires calling the mmap64 function. In C this is automatically handled by the <sys/mman.h> file. In Go, we would have to change the magic //extern comment to call mmap64 when appropriate. Rather than try to get that right, we instead go through a C function that uses C implicit type conversions to pick the right type. Fixes PR go/110297 Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/504415
2023-06-20ipa-sra: Disable candidates with no known callers (PR 110276)Martin Jambor2-0/+26
In IPA-SRA we use can_be_local_p () predicate rather than just plain local call graph flag in order to figure out whether the node is a part of an external API that we cannot change. Although there are cases where this can allow more transformations, it also means we can analyze functions which have no callers at all, which is pointless. Moreover, it makes an assert of hint propagation trigger, which checks that we have looked at callers before processing hints that come from them. This has been reported as PR 110276. This patch simply adds a check that a node has at least one caller into the early checks and makes the node a non-candidate for any transformation if it does not. gcc/ChangeLog: 2023-06-16 Martin Jambor <mjambor@suse.cz> PR ipa/110276 * ipa-sra.cc (struct caller_issues): New field there_is_one. (check_for_caller_issues): Set it. (check_all_callers_for_issues): Check it. gcc/testsuite/ChangeLog: 2023-06-16 Martin Jambor <mjambor@suse.cz> PR ipa/110276 * gcc.dg/ipa/pr110276.c: New test.
2023-06-20ipa-cp: Avoid long linear searches through DECL_ARGUMENTSMartin Jambor3-33/+131
There have been concerns that linear searches through DECL_ARGUMENTS that are often necessary to compute the index of a particular PARM_DECL which is the key to results of IPA-CP can happen often enough to be a compile time issue, especially if we plug the results into value numbering, as I intend to do with a follow-up patch. This patch creates a vector sorted according to PARM_DECLs to do the look-up for all functions which have some information discovered by IPA-CP and which have 32 parameters or more. 32 is a hard-wired magical constant here to capture the trade-off between the memory allocation overhead and length of the linear search. I do not think it is worth making it a --param but if people think it appropriate, I can turn it into one. gcc/ChangeLog: 2023-05-31 Martin Jambor <mjambor@suse.cz> * ipa-prop.h (ipa_uid_to_idx_map_elt): New type. (struct ipcp_transformation): Rearrange members according to C++ class coding convention, add m_uid_to_idx, get_param_index and maybe_create_parm_idx_map. * ipa-cp.cc (ipcp_transformation::get_param_index): New function. (compare_uids): Likewise. (ipcp_transformation::maype_create_parm_idx_map): Likewise. * ipa-prop.cc (ipcp_get_parm_bits): Use get_param_index. (ipcp_update_bits): Accept TS as a parameter, assume it is not NULL. (ipcp_update_vr): Likewise. (ipcp_transform_function): Call, maybe_create_parm_idx_map of TS, bail out quickly if empty, pass it to ipcp_update_bits and ipcp_update_vr.
2023-06-20rs6000: Add builtins for IEEE 128-bit floating point valuesCarl Love9-26/+307
Add support for the following builtins: __vector unsigned long long int scalar_extract_exp_to_vec (__ieee128); __vector unsigned __int128 scalar_extract_sig_to_vec (__ieee128); __ieee128 scalar_insert_exp (__vector unsigned __int128, __vector unsigned long long); The instructions used in the builtins operate on vector registers. Thus the result must be moved to a scalar type. There is no clean, performant way to do this. The user code typically needs the result as a vector anyway. gcc/ * config/rs6000/rs6000-builtin.cc (rs6000_expand_builtin): Rename CODE_FOR_xsxsigqp_tf to CODE_FOR_xsxsigqp_tf_ti. Rename CODE_FOR_xsxsigqp_kf to CODE_FOR_xsxsigqp_kf_ti. Rename CCDE_FOR_xsxexpqp_tf to CODE_FOR_xsxexpqp_tf_di. Rename CODE_FOR_xsxexpqp_kf to CODE_FOR_xsxexpqp_kf_di. (CODE_FOR_xsxexpqp_kf_v2di, CODE_FOR_xsxsigqp_kf_v1ti, CODE_FOR_xsiexpqp_kf_v2di): Add case statements. * config/rs6000/rs6000-builtins.def (__builtin_vsx_scalar_extract_exp_to_vec, __builtin_vsx_scalar_extract_sig_to_vec, __builtin_vsx_scalar_insert_exp_vqp): Add new builtin definitions. Rename xsxexpqp_kf, xsxsigqp_kf, xsiexpqp_kf to xsexpqp_kf_di, xsxsigqp_kf_ti, xsiexpqp_kf_di respectively. * config/rs6000/rs6000-c.cc (altivec_resolve_overloaded_builtin): Update case RS6000_OVLD_VEC_VSIE to handle MODE_VECTOR_INT for new overloaded instance. Update comments. * config/rs6000/rs6000-overload.def (__builtin_vec_scalar_insert_exp): Add new overload definition with vector arguments. (scalar_extract_exp_to_vec, scalar_extract_sig_to_vec): New overloaded definitions. * config/rs6000/vsx.md (V2DI_DI): New mode iterator. (DI_to_TI): New mode attribute. Rename xsxexpqp_<mode> to sxexpqp_<IEEE128:mode>_<V2DI_DI:mode>. Rename xsxsigqp_<mode> to xsxsigqp_<IEEE128:mode>_<VEC_TI:mode>. Rename xsiexpqp_<mode> to xsiexpqp_<IEEE128:mode>_<V2DI_DI:mode>. * doc/extend.texi (scalar_extract_exp_to_vec, scalar_extract_sig_to_vec): Add documentation for new builtins. (scalar_insert_exp): Add new overloaded builtin definition. gcc/testsuite/ * gcc.target/powerpc/bfp/scalar-extract-exp-8.c: New test case. * gcc.target/powerpc/bfp/scalar-extract-sig-8.c: New test case. * gcc.target/powerpc/bfp/scalar-insert-exp-16.c: New test case.
2023-06-20RISC-V: testsuite: Add missing -mabi=lp64d.Robin Dapp9-9/+9
This fixes more cases of missing -mabi=lp64d. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vls-vlmax/full-vec-move1.c: Add -mabi=lp64d. * gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-1.c: Dito. * gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-2.c: Dito. * gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-3.c: Dito. * gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-4.c: Dito. * gcc.target/riscv/rvv/autovec/vls-vlmax/vec_set-1.c: Dito. * gcc.target/riscv/rvv/autovec/vls-vlmax/vec_set-2.c: Dito. * gcc.target/riscv/rvv/autovec/vls-vlmax/vec_set-3.c: Dito. * gcc.target/riscv/rvv/autovec/vls-vlmax/vec_set-4.c: Dito.
2023-06-20RISC-V: Set the natural size of constant vector mask modes to one RVV data ↵Li Xu2-0/+16
vector. If reinterpret vnx2bi as vnx16qi, vnx16qi must occupy no more of the underlying registers than vnx2bi. Consider this following case: void test_vreinterpret_v_b64_i8m1 (uint8_t *in, int8_t *out) { vbool64_t vmask = __riscv_vlm_v_b64 (in, 2); vint8m1_t vout = __riscv_vreinterpret_v_b64_i8m1 (vmask); __riscv_vse8_v_i8m1(out, vout, 16); } compiler parameters: -march=rv64gcv -mabi=lp64d --param=riscv-autovec-preference=fixed-vlmax -O3 Compilation fails with: test_vreinterpret_v_b64_i8m1during RTL pass: expand test.c: In function 'test_vreinterpret_v_b64_i8m1': test.c:11:22: internal compiler error: in gen_lowpart_general, at rtlhooks.cc:57 11 | vint8m1_t vout = __riscv_vreinterpret_v_b64_i8m1(src); | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 0xf11876 gen_lowpart_general(machine_mode, rtx_def*) ../.././riscv-gcc/gcc/rtlhooks.cc:57 0x191435e gen_vreinterpretvnx16qi(rtx_def*, rtx_def*) ../.././riscv-gcc/gcc/config/riscv/vector.md:486 0xe08858 maybe_expand_insn(insn_code, unsigned int, expand_operand*) ../.././riscv-gcc/gcc/optabs.cc:8213 0x1471209 riscv_vector::function_expander::generate_insn(insn_code) ../.././riscv-gcc/gcc/config/riscv/riscv-vector-builtins.cc:3813 0x147629c riscv_vector::function_expander::expand() ../.././riscv-gcc/gcc/config/riscv/riscv-vector-builtins.h:520 0x147629c riscv_vector::expand_builtin(unsigned int, tree_node*, rtx_def*) ../.././riscv-gcc/gcc/config/riscv/riscv-vector-builtins.cc:4103 0x9868f9 expand_builtin(tree_node*, rtx_def*, rtx_def*, machine_mode, int) ../.././riscv-gcc/gcc/builtins.cc:7342 gcc/ChangeLog: * config/riscv/riscv.cc (riscv_regmode_natural_size): set the natural size of vector mask mode to one rvv register. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vreinterpet-fixed.c: New test.
2023-06-20RISC-V: Optimize codegen of VLA SLPJuzhe-Zhong4-45/+128
Add comments for Robin: We want to create a pattern where value[ix] = floor (ix / NPATTERNS). As NPATTERNS is always a power of two we can rewrite this as = ix & -NPATTERNS. ` Recently, I figure out a better approach in case of codegen for VLA stepped vector. Here is the detail descriptions: Case 1: void f (uint8_t *restrict a, uint8_t *restrict b) { for (int i = 0; i < 100; ++i) { a[i * 8] = b[i * 8 + 37] + 1; a[i * 8 + 1] = b[i * 8 + 37] + 2; a[i * 8 + 2] = b[i * 8 + 37] + 3; a[i * 8 + 3] = b[i * 8 + 37] + 4; a[i * 8 + 4] = b[i * 8 + 37] + 5; a[i * 8 + 5] = b[i * 8 + 37] + 6; a[i * 8 + 6] = b[i * 8 + 37] + 7; a[i * 8 + 7] = b[i * 8 + 37] + 8; } } We need to generate the stepped vector: NPATTERNS = 8. { 0, 0, 0, 0, 0, 0, 0, 0, 8, 8, 8, 8, 8, 8, 8, 8 } Before this patch: vid.v v4 ;; {0,1,2,3,4,5,6,7,...} vsrl.vi v4,v4,3 ;; {0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,...} li a3,8 ;; {8} vmul.vx v4,v4,a3 ;; {0,0,0,0,0,0,0,8,8,8,8,8,8,8,8,...} After this patch: vid.v v4 ;; {0,1,2,3,4,5,6,7,...} vand.vi v4,v4,-8(-NPATTERNS) ;; {0,0,0,0,0,0,0,8,8,8,8,8,8,8,8,...} Case 2: void f (uint8_t *restrict a, uint8_t *restrict b) { for (int i = 0; i < 100; ++i) { a[i * 8] = b[i * 8 + 3] + 1; a[i * 8 + 1] = b[i * 8 + 2] + 2; a[i * 8 + 2] = b[i * 8 + 1] + 3; a[i * 8 + 3] = b[i * 8 + 0] + 4; a[i * 8 + 4] = b[i * 8 + 7] + 5; a[i * 8 + 5] = b[i * 8 + 6] + 6; a[i * 8 + 6] = b[i * 8 + 5] + 7; a[i * 8 + 7] = b[i * 8 + 4] + 8; } } We need to generate the stepped vector: NPATTERNS = 4. { 3, 2, 1, 0, 7, 6, 5, 4, 11, 10, 9, 8, 15, 14, 13, 12, ... } Before this patch: li a6,134221824 slli a6,a6,5 addi a6,a6,3 ;; 64-bit: 0x0003000200010000 vmv.v.x v6,a6 ;; {3, 2, 1, 0, ... } vid.v v4 ;; {0, 1, 2, 3, 4, 5, 6, 7, ... } vsrl.vi v4,v4,2 ;; {0, 0, 0, 0, 1, 1, 1, 1, ... } li a3,4 ;; {4} vmul.vx v4,v4,a3 ;; {0, 0, 0, 0, 4, 4, 4, 4, ... } vadd.vv v4,v4,v6 ;; {3, 2, 1, 0, 7, 6, 5, 4, 11, 10, 9, 8, 15, 14, 13, 12, ... } After this patch: li a3,-536875008 slli a3,a3,4 addi a3,a3,1 slli a3,a3,16 vmv.v.x v2,a3 ;; {3, 1, -1, -3, ... } vid.v v4 ;; {0, 1, 2, 3, 4, 5, 6, 7, ... } vadd.vv v4,v4,v2 ;; {3, 2, 1, 0, 7, 6, 5, 4, 11, 10, 9, 8, 15, 14, 13, 12, ... } gcc/ChangeLog: * config/riscv/riscv-v.cc (expand_const_vector): Optimize codegen. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/partial/slp-1.c: Adapt testcase. * gcc.target/riscv/rvv/autovec/partial/slp-16.c: New test. * gcc.target/riscv/rvv/autovec/partial/slp_run-16.c: New test.
2023-06-20RISC-V: testsuite: Add -Wno-psabi to vec_set/vec_extract testcases.Robin Dapp10-10/+10
This fixes some fallout from the recent psabi changes. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-1.c: Add -Wno-psabi. * gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-2.c: Dito. * gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-3.c: Dito. * gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-4.c: Dito. * gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-run.c: Dito. * gcc.target/riscv/rvv/autovec/vls-vlmax/vec_set-1.c: Dito. * gcc.target/riscv/rvv/autovec/vls-vlmax/vec_set-2.c: Dito. * gcc.target/riscv/rvv/autovec/vls-vlmax/vec_set-3.c: Dito. * gcc.target/riscv/rvv/autovec/vls-vlmax/vec_set-4.c: Dito. * gcc.target/riscv/rvv/autovec/vls-vlmax/vec_set-run.c: Dito.
2023-06-20RISC-V: Fix compiler warning of riscv_arg_has_vectorLehua Ding1-2/+4
Hi, This little patch fixes a compile warning issue that my previous patch introduced, sorry for introducing this issue. Best, Lehua gcc/ChangeLog: * config/riscv/riscv.cc (riscv_arg_has_vector): Add default switch handler.
2023-06-20RISC-V: testsuite: Fix vmul test expectation and fix -ffast-math.Robin Dapp5-3/+5
I forgot to check for vfmul in the multiplication tests as well as some -ffast-math arguments. Fix this. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/binop/vadd-run.c: Add -ffast-math. * gcc.target/riscv/rvv/autovec/binop/vadd-zvfh-run.c: Dito. * gcc.target/riscv/rvv/autovec/binop/vrem-rv32gcv.c: Remove -ffast-math * gcc.target/riscv/rvv/autovec/binop/vmul-rv32gcv.c: Check for vfmul. * gcc.target/riscv/rvv/autovec/binop/vmul-rv64gcv.c: Dito.
2023-06-20Fortran: Fix parse-dump-tree for OpenMP ALLOCATE clauseTobias Burnus1-3/+6
Commit r14-1301-gd64e8e1224708e added u2.allocator to gfc_omp_namelist for better readability and to permit to use namelist->expr for code like the following: !$omp allocators allocate(align(32) : dt%alloc_comp) allocate (dt%alloc_comp(5)) !$omp allocate(dt%alloc_comp2) align(64) allocate (dt%alloc_comp2(10)) However, for the parse-tree dump the change was incomplete. gcc/fortran/ChangeLog: * dump-parse-tree.cc (show_omp_namelist): Fix dump of the allocator modifier of OMP_LIST_ALLOCATE.
2023-06-20ada: Minor tweaksEric Botcazou1-8/+6
gcc/ada/ * gcc-interface/decl.cc (gnat_to_gnu_entity) <E_Variable>: Pass the NULL_TREE explicitly and test imported_p in lieu of Is_Imported. <E_Function>: Remove public_flag local variable and make extern_flag local variable a constant.
2023-06-20ada: Fix crash on inlining in GNATproveYannick Moy1-0/+9
After the recent change on detection of non-inlining, calls inside the iterator part of a quantified expression were not considered as preventing inlining anymore, leading to a crash later on inside GNATprove. Now fixed. gcc/ada/ * sem_res.adb (Resolve_Call): Fix change that replaced test for quantified expressions by the test for potentially unevaluated contexts. Both should be performed.
2023-06-20ada: Further fixes to handling of private views in instancesEric Botcazou9-237/+219
This removes more bypasses for private views in instances that are present in type predicates (Conforming_Types, Covers, Specific_Type and Wrong_Type), which in exchange requires additional work in Sem_Ch12 to restore the proper view of types during the instantiation of generic bodies. The main mechanism for this is the Has_Private_View flag, but it comes with the limitations that 1) there must be a direct reference to the global type in the generic construct (either a reference to a global object of this type or the explicit declaration of a local object of this type), which is not always the case e.g. for loop parameters and 2) it can deal with a single type at a time, e.g. it cannot deal with an array type and its component type if their respective views are not the same in the instance. To overcome the second limitation, a new Has_Secondary_Private_View flag is introduced to deal with a secondary type, which as of this writing is either the component type of an array type or the designated type of an access type (together they make up the vast majority of the problematic cases for the Has_Private_View flag alone). This new mechanism subsumes a specific treatment for them that was added in Copy_Generic_Node a few years ago, although a specific treatment still needs to be preserved for comparison and equality operators in a narrower case. Additional handling is also introduced to overcome the first limitation for loop parameters in Copy_Generic_Node, and a relaxed condition is used in Exp_Ch7.Convert_View to generate an unchecked conversion between views. gcc/ada/ * exp_ch7.adb (Convert_View): Detect more cases of mismatches for private types and use Implementation_Base_Type as main criterion. * gen_il-fields.ads (Opt_Field_Enum): Add Has_Secondary_Private_View * gen_il-gen-gen_nodes.adb (N_Expanded_Name): Likewise. (N_Direct_Name): Likewise. (N_Op): Likewise. * sem_ch12.ads (Check_Private_View): Document the usage of second flag Has_Secondary_Private_View. * sem_ch12.adb (Get_Associated_Entity): New function to retrieve the ultimate associated entity, if any. (Check_Private_View): Implement Has_Secondary_Private_View support. (Copy_Generic_Node): Remove specific treatment for Component_Type of an array type and Designated_Type of an access type. Add specific treatment for comparison and equality operators, as well as iterator and loop parameter specifications. (Instantiate_Type): Implement Has_Secondary_Private_View support. (Requires_Delayed_Save): Call Get_Associated_Entity. (Set_Global_Type): Implement Has_Secondary_Private_View support. * sem_ch6.adb (Conforming_Types): Remove bypass for private views in instances. * sem_type.adb (Covers): Return true if Is_Subtype_Of does so. Remove bypass for private views in instances. (Specific_Type): Likewise. * sem_util.adb (Wrong_Type): Likewise. * sinfo.ads (Has_Secondary_Private_View): Document new flag.