aboutsummaryrefslogtreecommitdiff
path: root/gcc
AgeCommit message (Collapse)AuthorFilesLines
2023-06-23Tiny phiprop compile time optimizationJan Hubicka1-3/+5
gcc/ChangeLog: * tree-ssa-phiprop.cc (propagate_with_phi): Compute post dominators on demand. (pass_phiprop::execute): Do not compute it here; return update_ssa_only_virtuals if something changed. (pass_data_phiprop): Remove TODO_update_ssa from todos.
2023-06-23Fortran: ABI for scalar CHARACTER(LEN=1),VALUE dummy argument [PR110360]Harald Anlauf2-0/+97
gcc/fortran/ChangeLog: PR fortran/110360 * trans-expr.cc (gfc_conv_procedure_call): Pass actual argument to scalar CHARACTER(1),VALUE dummy argument by value. gcc/testsuite/ChangeLog: PR fortran/110360 * gfortran.dg/value_9.f90: New test.
2023-06-23Fix power10 fusion bug with prefixed loads, PR target/105325Michael Meissner6-42/+84
This changes fixes PR target/105325. PR target/105325 is a bug where an invalid lwa instruction is generated due to power10 fusion of a load instruction to a GPR and an compare immediate instruction with the immediate being -1, 0, or 1. In some cases, when the load instruction is done, the GCC compiler would generate a load instruction with an offset that was too large to fit into the normal load instruction. In particular, loads from the stack might originally have a small offset, so that the load is not a prefixed load. However, after the stack is set up, and register allocation has been done, the offset now is large enough that we would have to use a prefixed load instruction. The support for prefixed loads did not consider that patterns with a fused load and compare might have a prefixed address. Without this support, the proper prefixed load won't be generated. In the original code, when the split2 pass is run after reload has finished the ds_form_mem_operand predicate that was used for lwa and ld no longer returns true. When the pattern was created, ds_form_mem_operand recognized the insn as being valid since the offset was small. But after register allocation, ds_form_mem_operand did not return true. Because it didn't return true, the insn could not be split. Since the insn was not split and the prefix support did not indicate a prefixed instruction was used, the wrong load is generated. The solution involves: 1) Don't use ds_form_mem_operand for ld and lwa, always use non_update_memory_operand. 2) Delete ds_form_mem_operand since it is no longer used. 3) Use the "YZ" constraints for ld/lwa instead of "m". 4) If we don't need to sign extend the lwa, convert it to lwz, and use cmpwi instead of cmpdi. Adjust the insn name to reflect the code generate. 5) Insure that the insn using lwa will be recognized as having a prefixed operand (and hence the insn length will be 16 bytes instead of 8 bytes). 5a) Set the prefixed and maybe_prefix attributes to know that fused_load_cmpi are also load insns; 5b) In the case where we are just setting CC and not using the memory afterward, set the clobber to use a DI register, and put an explicit sign_extend operation in the split; 5c) Set the sign_extend attribute to "yes" for lwa. 5d) 5a-5c are the things that prefixed_load_p in rs6000.cc checks to ensure that lwa is treated as a ds-form instruction and not as a d-form instruction (i.e. lwz). 6) Add a new test case for this case. 7) Adjust the insn counts in fusion-p10-ldcmpi.c. Because we are no longer using ds_form_mem_operand, the ld and lwa instructions will fuse x-form (reg+reg) addresses in addition ds-form (reg+offset or reg). 2023-06-23 Michael Meissner <meissner@linux.ibm.com> gcc/ PR target/105325 * config/rs6000/genfusion.pl (gen_ld_cmpi_p10_one): Fix problems that allowed prefixed lwa to be generated. * config/rs6000/fusion.md: Regenerate. * config/rs6000/predicates.md (ds_form_mem_operand): Delete. * config/rs6000/rs6000.md (prefixed attribute): Add support for load plus compare immediate fused insns. (maybe_prefixed): Likewise. gcc/testsuite/ PR target/105325 * g++.target/powerpc/pr105325.C: New test. * gcc.target/powerpc/fusion-p10-ldcmpi.c: Update insn counts. Co-Authored-By: Aaron Sawdey <acsawdey@linux.ibm.com>
2023-06-23testsuite,objective-c++: Fix imported NSObjCRuntime.h.Iain Sandoe1-0/+3
We have imported some headers from the GNUStep project to allow us to maintain the testsuite independent to changing versions of system headers. One of these headers has a macro that (now we have support for __has_feature) expands to a declaration that triggers a warning. These headers are considered part of the implementation so that, in this case, we can suppress the warning with the system_header pragma. Signed-off-by: Iain Sandoe <iain@sandoe.co.uk> gcc/testsuite/ChangeLog: * objc-obj-c++-shared/GNUStep/Foundation/NSObjCRuntime.h: Make this header use pragma system_header.
2023-06-23Improved SUBREG simplifications in simplify-rtx.cc's simplify_subreg.Roger Sayle1-0/+32
An x86 backend improvement that I'm working results in combine attempting to recognize: (set (reg:DI 87 [ xD.2846 ]) (ior:DI (subreg:DI (ashift:TI (zero_extend:TI (reg:DI 92)) (const_int 64 [0x40])) 0) (reg:DI 91))) where the lowpart SUBREG has difficulty seeing through the (hi<<64) that the lowpart must be zero. Rather than workaround this in the backend, the better fix is to teach simplify-rtx that lowpart((hi<<64)|lo) -> lo and highpart((hi<<64)|lo) -> hi, so that all backends benefit. Reducing the number of places where the middle-end generates a SUBREG of something other than REG is a good thing. On x86_64-pc-linux-gnu, the testcase pr78904-1b.c FAILs with this patch, due to changes in expected/canonical RTL, for which a backend patch to i386.md has already been provisionally approved. 2023-06-23 Roger Sayle <roger@nextmovesoftware.com> gcc/ChangeLog * simplify-rtx.cc (simplify_subreg): Optimize lowpart SUBREGs of ASHIFT to const0_rtx with sufficiently large shift count. Optimize highpart SUBREGs of ASHIFT as the shift operand when the shift count is the correct offset. Optimize SUBREGs of multi-word logic operations if the SUBREGs of both operands can be simplified.
2023-06-23Fix initializer_constant_valid_p_1 TYPE_PRECISION useRichard Biener1-1/+2
initializer_constant_valid_p_1 is letting through all conversions of float vector types that have the same number of elements but that's of course not valid. The following restricts the code to scalar floating point types as was probably intended (only scalar integer types are handled as well). * varasm.cc (initializer_constant_valid_p_1): Only allow conversions between scalar floating point types.
2023-06-23Deal with vector typed operands in conversionsRichard Biener1-5/+8
The following avoids using TYPE_PRECISION on VECTOR_TYPE when looking for bit-precision changes in vectorizable_assignment. We didn't anticipate a stmt like _21 = VIEW_CONVERT_EXPR<unsigned int>(vect__1.7_28); and the following makes sure to handle that. * tree-vect-stmts.cc (vectorizable_assignment): Properly handle non-integral operands when analyzing conversions.
2023-06-23[aarch64/match.pd] Fix ICE observed in PR110280.Prathamesh Kulkarni2-1/+20
gcc/ChangeLog: PR tree-optimization/110280 * match.pd (vec_perm_expr(v, v, mask) -> v): Explicitly build vector using build_vector_from_val with the element of input operand, and mask's type if operand and mask's types don't match. gcc/testsuite/ChangeLog: PR tree-optimization/110280 * gcc.target/aarch64/sve/pr110280.c: New test.
2023-06-23Fix tree_simple_nonnegative_warnv_p for VECTOR_TYPEsRichard Biener1-1/+2
tree_simple_nonnegative_warnv_p ends up being called on VECTOR_TYPEs which I think even gets the wrong answer here for tcc_comparison since vector bools are signed. The following properly guards that with !VECTOR_TYPE_P. * fold-const.cc (tree_simple_nonnegative_warnv_p): Guard the truth_value_p case with !VECTOR_TYPE_P.
2023-06-23Properly guard vect_look_through_possible_promotionRichard Biener1-1/+5
The function ends up getting called on VECTOR_TYPEs which it really isn't prepared for and with the TYPE_PRECISION checking changes will ICE. The following exits early when the type to work on isn't scalar integral. * tree-vect-patterns.cc (vect_look_through_possible_promotion): Exit early when the type isn't scalar integral.
2023-06-23Use element_precision for match.pd arith conversion optimizationRichard Biener1-4/+4
The simplification (outertype)((innertype0)a+(innertype1)b) to ((newtype)a+(newtype)b) ends up using TYPE_PRECISION to check whether it can elide a conversion but in some paths there can be VECTOR_TYPEs where this instead compares the number of lanes. The following fixes the missed optimizations and uses element_precision in those places. * match.pd ((outertype)((innertype0)a+(innertype1)b) -> ((newtype)a+(newtype)b)): Use element_precision where appropriate.
2023-06-23Bogus and missed folding on vector comparesRichard Biener2-4/+4
fold_binary tries to transform (double)float1 CMP (double)float2 into float1 CMP float2 but ends up using TYPE_PRECISION on the argument types. For vector types that compares the number of lanes which should be always equal (so it's harmless as to not generating wrong code). The following instead properly uses element_precision. The same happens in the corresponding match.pd pattern. * fold-const.cc (fold_binary_loc): Use element_precision when trying (double)float1 CMP (double)float2 to float1 CMP float2 simplification. * match.pd: Likewise.
2023-06-23Optimize vector codegen for invariant loads, fix SLP supportRichard Biener1-20/+19
The following avoids creating duplicate stmts for invariant loads which was necessary when the vector stmts were in a linked list. It also fixes SLP support which didn't correctly create the appropriate number of copies. * tree-vect-stmts.cc (vectorizable_load): Avoid useless copies of VMAT_INVARIANT vectorized stmts, fix SLP support.
2023-06-23Improve vector_vector_composition_typeRichard Biener1-0/+8
We sometimes get to ask to decompose, say V2DFmode into two halves. Currently this results in composing it from two DImode pieces instead of the obvious two DFmode pieces. The following adjusts vector_vector_composition_type for this trivial case and avoids a VIEW_CONVERT_EXPR in the initial code generation. * tree-vect-stmts.cc (vector_vector_composition_type): Handle composition of a vector from a number of elements that happens to match its number of lanes.
2023-06-23Daily bump.GCC Administrator6-1/+357
2023-06-22rust: Update usage of TARGET_AIX to TARGET_AIX_OSPaul E. Murphy1-3/+3
This was noticed when fixing the gccgo usage of the macro, the rust usage is very similar. TARGET_AIX is defined as a non-zero value on linux/powerpc64le which may cause unexpected behavior. TARGET_AIX_OS should be used to toggle AIX specific behavior. 2023-06-22 Paul E. Murphy <murphyp@linux.ibm.com> gcc/rust/ * rust-object-export.cc [TARGET_AIX]: Rename and update usage to TARGET_AIX_OS.
2023-06-22go: Update usage of TARGET_AIX to TARGET_AIX_OSPaul E. Murphy2-7/+7
TARGET_AIX is defined to a non-zero value on linux and maybe other powerpc64le targets. This leads to unexpected behavior such as dropping the .go_export section when linking a shared library on linux/powerpc64le. Instead, use TARGET_AIX_OS to toggle AIX specific behavior. Fixes golang/go#60798. 2023-06-22 Paul E. Murphy <murphyp@linux.ibm.com> gcc/go/ * go-backend.cc [TARGET_AIX]: Rename and update usage to TARGET_AIX_OS. * go-lang.cc: Likewise.
2023-06-22configure: Implement --enable-host-bind-nowMarek Polacek3-3/+36
As promised in the --enable-host-pie patch, this patch adds another configure option, --enable-host-bind-now, which adds -z now when linking the compiler executables in order to extend hardening. BIND_NOW with RELRO allows the GOT to be marked RO; this prevents GOT modification attacks. This option does not affect linking of target libraries; you can use LDFLAGS_FOR_TARGET=-Wl,-z,relro,-z,now to enable RELRO/BIND_NOW. With this patch: $ readelf -Wd cc1{,plus,obj,gm2} f951 lto1 cpp rust1 gnat1 | grep FLAGS 0x000000000000001e (FLAGS) BIND_NOW 0x000000006ffffffb (FLAGS_1) Flags: NOW PIE 0x000000000000001e (FLAGS) BIND_NOW 0x000000006ffffffb (FLAGS_1) Flags: NOW PIE 0x000000000000001e (FLAGS) BIND_NOW 0x000000006ffffffb (FLAGS_1) Flags: NOW PIE 0x000000000000001e (FLAGS) BIND_NOW 0x000000006ffffffb (FLAGS_1) Flags: NOW PIE 0x000000000000001e (FLAGS) BIND_NOW 0x000000006ffffffb (FLAGS_1) Flags: NOW PIE 0x000000000000001e (FLAGS) BIND_NOW 0x000000006ffffffb (FLAGS_1) Flags: NOW PIE 0x000000000000001e (FLAGS) BIND_NOW 0x000000006ffffffb (FLAGS_1) Flags: NOW PIE 0x000000000000001e (FLAGS) BIND_NOW 0x000000006ffffffb (FLAGS_1) Flags: NOW PIE 0x000000000000001e (FLAGS) BIND_NOW 0x000000006ffffffb (FLAGS_1) Flags: NOW PIE c++tools/ChangeLog: * configure.ac (--enable-host-bind-now): New check. * configure: Regenerate. gcc/ChangeLog: * configure.ac (--enable-host-bind-now): New check. Add -Wl,-z,now to LD_PICFLAG if --enable-host-bind-now. * configure: Regenerate. * doc/install.texi: Document --enable-host-bind-now. lto-plugin/ChangeLog: * configure.ac (--enable-host-bind-now): New check. Link with -z,now. * configure: Regenerate.
2023-06-22Change fma_reassoc_width tuning for ampere1Di Zhao OS1-1/+1
This patch enables reassociation of floating-point additions on ampere1. This brings about 1% overall benefit on spec2017 fprate cases. (There are minor regressions in 510.parest_r and 508.namd_r, analyzed here: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110279 .) gcc/ChangeLog: * config/aarch64/aarch64.cc: Change fma_reassoc_width for ampere1.
2023-06-22tree-optimization/110332 - fix ICE with phipropRichard Biener4-3/+46
The following fixes an ICE that occurs when we visit an edge inserted load from the code validating correctness for inserting an aggregate copy there. We can simply skip those loads here. PR tree-optimization/110332 * tree-ssa-phiprop.cc (propagate_with_phi): Always check aliasing with edge inserted loads. * g++.dg/torture/pr110332.C: New testcase. * gcc.dg/torture/pr110332-1.c: Likewise. * gcc.dg/torture/pr110332-2.c: Likewise.
2023-06-22i386: Convert ptestz of pandn into ptestc.Roger Sayle11-10/+259
This patch is the next installment in a set of backend patches around improvements to ptest/vptest. A previous patch optimized the sequence t=pand(x,y); ptestz(t,t) into the equivalent ptestz(x,y), using the property that ZF is set to (X&Y) == 0. This patch performs a similar transformation, converting t=pandn(x,y); ptestz(t,t) into the (almost) equivalent ptestc(y,x), using the property that the CF flags is set to (~X&Y) == 0. The tricky bit is that this sets the CF flag instead of the ZF flag, so we can only perform this transformation when we can also convert the flags consumer, as well as the producer. For the test case: int foo (__m128i x, __m128i y) { __m128i a = x & ~y; return __builtin_ia32_ptestz128 (a, a); } With -O2 -msse4.1 we previously generated: foo: pandn %xmm0, %xmm1 xorl %eax, %eax ptest %xmm1, %xmm1 sete %al ret with this patch we now generate: foo: xorl %eax, %eax ptest %xmm0, %xmm1 setc %al ret At the same time, this patch also provides alternative fixes for PR target/109973 and PR target/110118, by recognizing that ptestc(x,x) always sets the carry flag (X&~X is always zero). This is achieved both by recognizing the special case in ix86_expand_sse_ptest and with a splitter to convert an eligible ptest into an stc. 2023-06-22 Roger Sayle <roger@nextmovesoftware.com> Uros Bizjak <ubizjak@gmail.com> gcc/ChangeLog * config/i386/i386-expand.cc (ix86_expand_sse_ptest): Recognize expansion of ptestc with equal operands as producing const1_rtx. * config/i386/i386.cc (ix86_rtx_costs): Provide accurate cost estimates of UNSPEC_PTEST, where the ptest performs the PAND or PAND of its operands. * config/i386/sse.md (define_split): Transform CCCmode UNSPEC_PTEST of reg_equal_p operands into an x86_stc instruction. (define_split): Split pandn/ptestz/set{n?}e into ptestc/set{n?}c. (define_split): Similar to above for strict_low_part destinations. (define_split): Split pandn/ptestz/j{n?}e into ptestc/j{n?}c. gcc/testsuite/ChangeLog * gcc.target/i386/avx-vptest-4.c: New test case. * gcc.target/i386/avx-vptest-5.c: Likewise. * gcc.target/i386/avx-vptest-6.c: Likewise. * gcc.target/i386/pr109973-1.c: Update test case. * gcc.target/i386/pr109973-2.c: Likewise. * gcc.target/i386/sse4_1-ptest-4.c: New test case. * gcc.target/i386/sse4_1-ptest-5.c: Likewise. * gcc.target/i386/sse4_1-ptest-6.c: Likewise.
2023-06-21analyzer: add text-art visualizations of out-of-bounds accesses [PR106626]David Malcolm54-148/+4383
This patch extends -Wanalyzer-out-of-bounds so that, where possible, it will emit a text art diagram visualizing the spatial relationship between (a) the memory region that the analyzer predicts would be accessed, versus (b) the range of memory that is valid to access - whether they overlap, are touching, are close or far apart; which one is before or after in memory, the relative sizes involved, the direction of the access (read vs write), and, in some cases, the values of data involved. This diagram can be suppressed using -fdiagnostics-text-art-charset=none. For example, given: int32_t arr[10]; int32_t int_arr_read_element_before_start_far(void) { return arr[-100]; } it emits: demo-1.c: In function ‘int_arr_read_element_before_start_far’: demo-1.c:7:13: warning: buffer under-read [CWE-127] [-Wanalyzer-out-of-bounds] 7 | return arr[-100]; | ~~~^~~~~~ ‘int_arr_read_element_before_start_far’: event 1 | | 7 | return arr[-100]; | | ~~~^~~~~~ | | | | | (1) out-of-bounds read from byte -400 till byte -397 but ‘arr’ starts at byte 0 | demo-1.c:7:13: note: valid subscripts for ‘arr’ are ‘[0]’ to ‘[9]’ ┌───────────────────────────┐ │read of ‘int32_t’ (4 bytes)│ └───────────────────────────┘ ^ │ │ ┌───────────────────────────┐ ┌────────┬────────┬─────────┐ │ │ │ [0] │ ... │ [9] │ │ before valid range │ ├────────┴────────┴─────────┤ │ │ │‘arr’ (type: ‘int32_t[10]’)│ └───────────────────────────┘ └───────────────────────────┘ ├─────────────┬─────────────┤├─────┬──────┤├─────────────┬─────────────┤ │ │ │ ╭────────────┴───────────╮ ╭────┴────╮ ╭───────┴──────╮ │⚠️ under-read of 4 bytes│ │396 bytes│ │size: 40 bytes│ ╰────────────────────────╯ ╰─────────╯ ╰──────────────╯ and given: #include <string.h> void test_non_ascii () { char buf[5]; strcpy (buf, "文字化け"); } it emits: demo-2.c: In function ‘test_non_ascii’: demo-2.c:7:3: warning: stack-based buffer overflow [CWE-121] [-Wanalyzer-out-of-bounds] 7 | strcpy (buf, "文字化け"); | ^~~~~~~~~~~~~~~~~~~~~~~~ ‘test_non_ascii’: events 1-2 | | 6 | char buf[5]; | | ^~~ | | | | | (1) capacity: 5 bytes | 7 | strcpy (buf, "文字化け"); | | ~~~~~~~~~~~~~~~~~~~~~~~~ | | | | | (2) out-of-bounds write from byte 5 till byte 12 but ‘buf’ ends at byte 5 | demo-2.c:7:3: note: write of 8 bytes to beyond the end of ‘buf’ 7 | strcpy (buf, "文字化け"); | ^~~~~~~~~~~~~~~~~~~~~~~~ demo-2.c:7:3: note: valid subscripts for ‘buf’ are ‘[0]’ to ‘[4]’ ┌─────┬─────┬─────┬────┬────┐┌────┬────┬────┬────┬────┬────┬────┬──────┐ │ [0] │ [1] │ [2] │[3] │[4] ││[5] │[6] │[7] │[8] │[9] │[10]│[11]│ [12] │ ├─────┼─────┼─────┼────┼────┤├────┼────┼────┼────┼────┼────┼────┼──────┤ │0xe6 │0x96 │0x87 │0xe5│0xad││0x97│0xe5│0x8c│0x96│0xe3│0x81│0x91│ 0x00 │ ├─────┴─────┴─────┼────┴────┴┴────┼────┴────┴────┼────┴────┴────┼──────┤ │ U+6587 │ U+5b57 │ U+5316 │ U+3051 │U+0000│ ├─────────────────┼───────────────┼──────────────┼──────────────┼──────┤ │ 文 │ 字 │ 化 │ け │ NUL │ ├─────────────────┴───────────────┴──────────────┴──────────────┴──────┤ │ string literal (type: ‘char[13]’) │ └──────────────────────────────────────────────────────────────────────┘ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ v v v v v v v v v v v v v ┌─────┬────────────────┬────┐┌─────────────────────────────────────────┐ │ [0] │ ... │[4] ││ │ ├─────┴────────────────┴────┤│ after valid range │ │ ‘buf’ (type: ‘char[5]’) ││ │ └───────────────────────────┘└─────────────────────────────────────────┘ ├─────────────┬─────────────┤├────────────────────┬────────────────────┤ │ │ ╭────────┴────────╮ ╭───────────┴──────────╮ │capacity: 5 bytes│ │⚠️ overflow of 8 bytes│ ╰─────────────────╯ ╰──────────────────────╯ showing that the overflow occurs partway through the UTF-8 encoding of the U+5b57 code point. There are lots more examples in the test suite. It doesn't show up in this email, but the above diagrams are colorized to constrast the valid and invalid access ranges. gcc/ChangeLog: PR analyzer/106626 * Makefile.in (ANALYZER_OBJS): Add analyzer/access-diagram.o. * doc/invoke.texi (Wanalyzer-out-of-bounds): Add description of text art. (fanalyzer-debug-text-art): New. gcc/analyzer/ChangeLog: PR analyzer/106626 * access-diagram.cc: New file. * access-diagram.h: New file. * analyzer.h (class region_offset): Add default ctor. (region_offset::make_byte_offset): New decl. (region_offset::concrete_p): New. (region_offset::get_concrete_byte_offset): New. (region_offset::calc_symbolic_bit_offset): New decl. (region_offset::calc_symbolic_byte_offset): New decl. (region_offset::dump_to_pp): New decl. (region_offset::dump): New decl. (operator<, operator<=, operator>, operator>=): New decls for region_offset. * analyzer.opt (-param=analyzer-text-art-string-ellipsis-threshold=): New. (-param=analyzer-text-art-string-ellipsis-head-len=): New. (-param=analyzer-text-art-string-ellipsis-tail-len=): New. (-param=analyzer-text-art-ideal-canvas-width=): New. (fanalyzer-debug-text-art): New. * bounds-checking.cc: Include "intl.h", "diagnostic-diagram.h", and "analyzer/access-diagram.h". (class out_of_bounds::oob_region_creation_event_capacity): New. (out_of_bounds::out_of_bounds): Add "model" and "sval_hint" params. (out_of_bounds::mark_interesting_stuff): Use the base region. (out_of_bounds::add_region_creation_events): Use oob_region_creation_event_capacity. (out_of_bounds::get_dir): New pure vfunc. (out_of_bounds::maybe_show_notes): New. (out_of_bounds::maybe_show_diagram): New. (out_of_bounds::make_access_diagram): New. (out_of_bounds::m_model): New field. (out_of_bounds::m_sval_hint): New field. (out_of_bounds::m_region_creation_event_id): New field. (concrete_out_of_bounds::concrete_out_of_bounds): Update for new fields. (concrete_past_the_end::concrete_past_the_end): Likewise. (concrete_past_the_end::add_region_creation_events): Use oob_region_creation_event_capacity. (concrete_buffer_overflow::concrete_buffer_overflow): Update for new fields. (concrete_buffer_overflow::emit): Replace call to maybe_describe_array_bounds with maybe_show_notes. (concrete_buffer_overflow::get_dir): New. (concrete_buffer_over_read::concrete_buffer_over_read): Update for new fields. (concrete_buffer_over_read::emit): Replace call to maybe_describe_array_bounds with maybe_show_notes. (concrete_buffer_overflow::get_dir): New. (concrete_buffer_underwrite::concrete_buffer_underwrite): Update for new fields. (concrete_buffer_underwrite::emit): Replace call to maybe_describe_array_bounds with maybe_show_notes. (concrete_buffer_underwrite::get_dir): New. (concrete_buffer_under_read::concrete_buffer_under_read): Update for new fields. (concrete_buffer_under_read::emit): Replace call to maybe_describe_array_bounds with maybe_show_notes. (concrete_buffer_under_read::get_dir): New. (symbolic_past_the_end::symbolic_past_the_end): Update for new fields. (symbolic_buffer_overflow::symbolic_buffer_overflow): Likewise. (symbolic_buffer_overflow::emit): Call maybe_show_notes. (symbolic_buffer_overflow::get_dir): New. (symbolic_buffer_over_read::symbolic_buffer_over_read): Update for new fields. (symbolic_buffer_over_read::emit): Call maybe_show_notes. (symbolic_buffer_over_read::get_dir): New. (region_model::check_symbolic_bounds): Add "sval_hint" param. Pass it and sized_offset_reg to diagnostics. (region_model::check_region_bounds): Add "sval_hint" param, passing it to diagnostics. * diagnostic-manager.cc (diagnostic_manager::emit_saved_diagnostic): Pass logger to pending_diagnostic::emit. * engine.cc: Add logger param to pending_diagnostic::emit implementations. * infinite-recursion.cc: Likewise. * kf-analyzer.cc: Likewise. * kf.cc: Likewise. Add nullptr for new param of check_region_for_write. * pending-diagnostic.h: Likewise in decl. * region-model-manager.cc (region_model_manager::get_or_create_int_cst): Convert param from poly_int64 to const poly_wide_int_ref &. (region_model_manager::maybe_fold_binop): Support type being NULL when checking for floating-point types. Check for (X + Y) - X => Y. Be less strict about types when folding associative ops. Check for (X + Y) * CST => (X * CST) + (Y * CST). * region-model-manager.h (region_model_manager::get_or_create_int_cst): Convert param from poly_int64 to const poly_wide_int_ref &. * region-model.cc: Add logger param to pending_diagnostic::emit implementations. (region_model::check_external_function_for_access_attr): Update for new param of check_region_for_write. (region_model::deref_rvalue): Use nullptr rather than NULL. (region_model::get_capacity): Handle RK_STRING. (region_model::check_region_access): Add "sval_hint" param; pass it to check_region_bounds. (region_model::check_region_for_write): Add "sval_hint" param; pass it to check_region_access. (region_model::check_region_for_read): Add NULL for new param to check_region_access. (region_model::set_value): Pass rhs_sval to check_region_for_write. (region_model::get_representative_path_var_1): Handle SK_CONSTANT in the check for infinite recursion. * region-model.h (region_model::check_region_for_write): Add "sval_hint" param. (region_model::check_region_access): Likewise. (region_model::check_symbolic_bounds): Likewise. (region_model::check_region_bounds): Likewise. * region.cc (region_offset::make_byte_offset): New. (region_offset::calc_symbolic_bit_offset): New. (region_offset::calc_symbolic_byte_offset): New. (region_offset::dump_to_pp): New. (region_offset::dump): New. (struct linear_op): New. (operator<, operator<=, operator>, operator>=): New, for region_offset. (region::get_next_offset): New. (region::get_relative_symbolic_offset): Use ptrdiff_type_node. (field_region::get_relative_symbolic_offset): Likewise. (element_region::get_relative_symbolic_offset): Likewise. (bit_range_region::get_relative_symbolic_offset): Likewise. * region.h (region::get_next_offset): New decl. * sm-fd.cc: Add logger param to pending_diagnostic::emit implementations. * sm-file.cc: Likewise. * sm-malloc.cc: Likewise. * sm-pattern-test.cc: Likewise. * sm-sensitive.cc: Likewise. * sm-signal.cc: Likewise. * sm-taint.cc: Likewise. * store.cc (bit_range::contains_p): Allow "out" to be null. * store.h (byte_range::get_start_bit_offset): New. (byte_range::get_next_bit_offset): New. * varargs.cc: Add logger param to pending_diagnostic::emit implementations. gcc/testsuite/ChangeLog: PR analyzer/106626 * gcc.dg/analyzer/data-model-1.c (test_16): Update for out-of-bounds working. * gcc.dg/analyzer/out-of-bounds-diagram-1-ascii.c: New test. * gcc.dg/analyzer/out-of-bounds-diagram-1-debug.c: New test. * gcc.dg/analyzer/out-of-bounds-diagram-1-emoji.c: New test. * gcc.dg/analyzer/out-of-bounds-diagram-1-json.c: New test. * gcc.dg/analyzer/out-of-bounds-diagram-1-sarif.c: New test. * gcc.dg/analyzer/out-of-bounds-diagram-1-unicode.c: New test. * gcc.dg/analyzer/out-of-bounds-diagram-10.c: New test. * gcc.dg/analyzer/out-of-bounds-diagram-11.c: New test. * gcc.dg/analyzer/out-of-bounds-diagram-12.c: New test. * gcc.dg/analyzer/out-of-bounds-diagram-13.c: New test. * gcc.dg/analyzer/out-of-bounds-diagram-14.c: New test. * gcc.dg/analyzer/out-of-bounds-diagram-15.c: New test. * gcc.dg/analyzer/out-of-bounds-diagram-2.c: New test. * gcc.dg/analyzer/out-of-bounds-diagram-3.c: New test. * gcc.dg/analyzer/out-of-bounds-diagram-4.c: New test. * gcc.dg/analyzer/out-of-bounds-diagram-5-ascii.c: New test. * gcc.dg/analyzer/out-of-bounds-diagram-5-unicode.c: New test. * gcc.dg/analyzer/out-of-bounds-diagram-6.c: New test. * gcc.dg/analyzer/out-of-bounds-diagram-7.c: New test. * gcc.dg/analyzer/out-of-bounds-diagram-8.c: New test. * gcc.dg/analyzer/out-of-bounds-diagram-9.c: New test. * gcc.dg/analyzer/pattern-test-2.c: Update expected results. * gcc.dg/analyzer/pr101962.c: Update expected results. * gcc.dg/plugin/analyzer_gil_plugin.c: Add logger param to pending_diagnostic::emit implementations. Signed-off-by: David Malcolm <dmalcolm@redhat.com>
2023-06-21diagnostics: add support for "text art" diagramsDavid Malcolm43-12/+7144
Existing text output in GCC has to be implemented by writing sequentially to a pretty_printer instance. This makes it hard to implement some kinds of diagnostic output (see e.g. diagnostic-show-locus.cc). This patch adds more flexible ways of creating text output: - a canvas class, which can be "painted" to via random-access (rather that sequentially) - a table class for 2D grid layout, supporting items that span multiple rows/columns - a widget class for organizing diagrams hierarchically. The patch also expands GCC's diagnostics subsystem so that diagnostics can have "text art" diagrams - think ASCII art, but potentially including some Unicode characters, such as box-drawing chars. The new code is in a new "gcc/text-art" subdirectory and "text_art" namespace. The patch adds a new "-fdiagnostics-text-art-charset=VAL" option, with values: - "none": don't emit diagrams (added to -fdiagnostics-plain-output) - "ascii": use pure ASCII in diagrams - "unicode": allow for conservative use of unicode drawing characters (such as box-drawing characters). - "emoji" (the default): as "unicode", but potentially allow for conservative use of emoji in the output (such as U+26A0 WARNING SIGN). I made it possible to disable emoji separately from unicode as I believe there's a generation gap in acceptance of these characters (some older programmers have a visceral reaction against them, whereas younger programmers may have no problem with them). Diagrams are emitted to stderr by default. With SARIF output they are captured as a location in "relatedLocations", with the diagram as a code block in Markdown within a "markdown" property of a message. This patch doesn't add any such diagram usage to GCC, saving that for followups, apart from adding a plugin to the test suite to exercise the functionality. contrib/ChangeLog: * unicode/gen-box-drawing-chars.py: New file. * unicode/gen-combining-chars.py: New file. * unicode/gen-printable-chars.py: New file. gcc/ChangeLog: * Makefile.in (OBJS-libcommon): Add text-art/box-drawing.o, text-art/canvas.o, text-art/ruler.o, text-art/selftests.o, text-art/style.o, text-art/styled-string.o, text-art/table.o, text-art/theme.o, and text-art/widget.o. * color-macros.h (COLOR_FG_BRIGHT_BLACK): New. (COLOR_FG_BRIGHT_RED): New. (COLOR_FG_BRIGHT_GREEN): New. (COLOR_FG_BRIGHT_YELLOW): New. (COLOR_FG_BRIGHT_BLUE): New. (COLOR_FG_BRIGHT_MAGENTA): New. (COLOR_FG_BRIGHT_CYAN): New. (COLOR_FG_BRIGHT_WHITE): New. (COLOR_BG_BRIGHT_BLACK): New. (COLOR_BG_BRIGHT_RED): New. (COLOR_BG_BRIGHT_GREEN): New. (COLOR_BG_BRIGHT_YELLOW): New. (COLOR_BG_BRIGHT_BLUE): New. (COLOR_BG_BRIGHT_MAGENTA): New. (COLOR_BG_BRIGHT_CYAN): New. (COLOR_BG_BRIGHT_WHITE): New. * common.opt (fdiagnostics-text-art-charset=): New option. (diagnostic-text-art.h): New SourceInclude. (diagnostic_text_art_charset) New Enum and EnumValues. * configure: Regenerate. * configure.ac (gccdepdir): Add text-art to loop. * diagnostic-diagram.h: New file. * diagnostic-format-json.cc (json_emit_diagram): New. (diagnostic_output_format_init_json): Wire it up to context->m_diagrams.m_emission_cb. * diagnostic-format-sarif.cc: Include "diagnostic-diagram.h" and "text-art/canvas.h". (sarif_result::on_nested_diagnostic): Move code to... (sarif_result::add_related_location): ...this new function. (sarif_result::on_diagram): New. (sarif_builder::emit_diagram): New. (sarif_builder::make_message_object_for_diagram): New. (sarif_emit_diagram): New. (diagnostic_output_format_init_sarif): Set context->m_diagrams.m_emission_cb to sarif_emit_diagram. * diagnostic-text-art.h: New file. * diagnostic.cc: Include "diagnostic-text-art.h", "diagnostic-diagram.h", and "text-art/theme.h". (diagnostic_initialize): Initialize context->m_diagrams and call diagnostics_text_art_charset_init. (diagnostic_finish): Clean up context->m_diagrams.m_theme. (diagnostic_emit_diagram): New. (diagnostics_text_art_charset_init): New. * diagnostic.h (text_art::theme): New forward decl. (class diagnostic_diagram): Likewise. (diagnostic_context::m_diagrams): New field. (diagnostic_emit_diagram): New decl. * doc/invoke.texi (Diagnostic Message Formatting Options): Add -fdiagnostics-text-art-charset=. (-fdiagnostics-plain-output): Add -fdiagnostics-text-art-charset=none. * gcc.cc: Include "diagnostic-text-art.h". (driver_handle_option): Handle OPT_fdiagnostics_text_art_charset_. * opts-common.cc (decode_cmdline_options_to_array): Add "-fdiagnostics-text-art-charset=none" to expanded_args for -fdiagnostics-plain-output. * opts.cc: Include "diagnostic-text-art.h". (common_handle_option): Handle OPT_fdiagnostics_text_art_charset_. * pretty-print.cc (pp_unicode_character): New. * pretty-print.h (pp_unicode_character): New decl. * selftest-run-tests.cc: Include "text-art/selftests.h". (selftest::run_tests): Call text_art_tests. * text-art/box-drawing-chars.inc: New file, generated by contrib/unicode/gen-box-drawing-chars.py. * text-art/box-drawing.cc: New file. * text-art/box-drawing.h: New file. * text-art/canvas.cc: New file. * text-art/canvas.h: New file. * text-art/ruler.cc: New file. * text-art/ruler.h: New file. * text-art/selftests.cc: New file. * text-art/selftests.h: New file. * text-art/style.cc: New file. * text-art/styled-string.cc: New file. * text-art/table.cc: New file. * text-art/table.h: New file. * text-art/theme.cc: New file. * text-art/theme.h: New file. * text-art/types.h: New file. * text-art/widget.cc: New file. * text-art/widget.h: New file. gcc/testsuite/ChangeLog: * gcc.dg/plugin/diagnostic-test-text-art-ascii-bw.c: New test. * gcc.dg/plugin/diagnostic-test-text-art-ascii-color.c: New test. * gcc.dg/plugin/diagnostic-test-text-art-none.c: New test. * gcc.dg/plugin/diagnostic-test-text-art-unicode-bw.c: New test. * gcc.dg/plugin/diagnostic-test-text-art-unicode-color.c: New test. * gcc.dg/plugin/diagnostic_plugin_test_text_art.c: New test plugin. * gcc.dg/plugin/plugin.exp (plugin_test_list): Add them. libcpp/ChangeLog: * charset.cc (get_cppchar_property): New function template, based on... (cpp_wcwidth): ...this function. Rework to use the above. Include "combining-chars.inc". (cpp_is_combining_char): New function Include "printable-chars.inc". (cpp_is_printable_char): New function * combining-chars.inc: New file, generated by contrib/unicode/gen-combining-chars.py. * include/cpplib.h (cpp_is_combining_char): New function decl. (cpp_is_printable_char): New function decl. * printable-chars.inc: New file, generated by contrib/unicode/gen-printable-chars.py. Signed-off-by: David Malcolm <dmalcolm@redhat.com>
2023-06-21testsuite: move handle-multiline-outputs to before check for blank linesDavid Malcolm6-15/+21
I have followup patches that require checking for multiline patterns that have blank lines within them, so this moves the handling of multiline patterns before the check for blank lines, allowing for such multiline patterns. Doing so uncovers some issues with existing multiline directives, which the patch fixes. gcc/testsuite/ChangeLog: * c-c++-common/Wlogical-not-parentheses-2.c: Split up the multiline directive. * gcc.dg/analyzer/malloc-macro-inline-events.c: Remove redundant dg-regexp directives. * gcc.dg/missing-header-fixit-5.c: Split up the multiline directives. * lib/gcc-dg.exp (gcc-dg-prune): Move call to handle-multiline-outputs from prune_gcc_output to here. * lib/multiline.exp (dg-end-multiline-output): Move call to maybe-handle-nn-line-numbers from prune_gcc_output to here. * lib/prune.exp (prune_gcc_output): Move calls to maybe-handle-nn-line-numbers and handle-multiline-outputs from here to the above. Signed-off-by: David Malcolm <dmalcolm@redhat.com>
2023-06-21compiler: determine types of Slice_{value,info} expressionsIan Lance Taylor3-4/+13
This fixes an accidental omission in the determine types pass. Test case is https://go.dev/cl/505015. Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/504797
2023-06-22Daily bump.GCC Administrator5-1/+317
2023-06-21function: Change return type of predicate function from int to boolUros Bizjak2-45/+42
Also change some internal variables to bool and some functions to void. gcc/ChangeLog: * function.h (emit_initial_value_sets): Change return type from int to void. (aggregate_value_p): Change return type from int to bool. (prologue_contains): Ditto. (epilogue_contains): Ditto. (prologue_epilogue_contains): Ditto. * function.cc (temp_slot): Make "in_use" variable bool. (make_slot_available): Update for changed "in_use" variable. (assign_stack_temp_for_type): Ditto. (emit_initial_value_sets): Change return type from int to void and update function body accordingly. (instantiate_virtual_regs): Ditto. (rest_of_handle_thread_prologue_and_epilogue): Ditto. (safe_insn_predicate): Change return type from int to bool. (aggregate_value_p): Change return type from int to bool and update function body accordingly. (prologue_contains): Change return type from int to bool. (prologue_epilogue_contains): Ditto.
2023-06-21c-family: implement -ffp-contract=onAlexander Monakov5-6/+89
Implement -ffp-contract=on for C and C++ without changing default behavior (=off for -std=cNN, =fast for C++ and -std=gnuNN). gcc/c-family/ChangeLog: * c-gimplify.cc (fma_supported_p): New helper. (c_gimplify_expr) [PLUS_EXPR, MINUS_EXPR]: Implement FMA contraction. gcc/ChangeLog: * common.opt (fp_contract_mode) [on]: Remove fallback. * config/sh/sh.md (*fmasf4): Correct flag_fp_contract_mode test. * doc/invoke.texi (-ffp-contract): Update. * trans-mem.cc (diagnose_tm_1): Skip internal function calls.
2023-06-21Fortran: Fix some bugs in associate [PR87477]Paul Thomas12-29/+286
2023-06-21 Paul Thomas <pault@gcc.gnu.org> gcc/fortran PR fortran/87477 PR fortran/88688 PR fortran/94380 PR fortran/107900 PR fortran/110224 * decl.cc (char_len_param_value): Fix memory leak. (resolve_block_construct): Remove unnecessary static decls. * expr.cc (gfc_is_ptr_fcn): New function. (gfc_check_vardef_context): Use it to permit pointer function result selectors to be used for associate names in variable definition context. * gfortran.h: Prototype for gfc_is_ptr_fcn. * match.cc (build_associate_name): New function. (gfc_match_select_type): Use the new function to replace inline version and to build a new associate name for the case where the supplied associate name is already used for that purpose. * resolve.cc (resolve_assoc_var): Call gfc_is_ptr_fcn to allow associate names with pointer function targets to be used in variable definition context. * trans-decl.cc (gfc_get_symbol_decl): Unlimited polymorphic variables need deferred initialisation of the vptr. (gfc_trans_deferred_vars): Do the vptr initialisation. * trans-stmt.cc (trans_associate_var): Ensure that a pointer associate name points to the target of the selector and not the selector itself. gcc/testsuite/ PR fortran/87477 PR fortran/107900 * gfortran.dg/pr107900.f90 : New test PR fortran/110224 * gfortran.dg/pr110224.f90 : New test PR fortran/88688 * gfortran.dg/pr88688.f90 : New test PR fortran/94380 * gfortran.dg/pr94380.f90 : New test PR fortran/95398 * gfortran.dg/pr95398.f90 : Set -std=f2008, bump the line numbers in the error tests by two and change the text in two.
2023-06-21Fortran: Seg fault passing string to type cptr dummy [PR108961].Paul Thomas2-1/+30
2023-06-21 Paul Thomas <pault@gcc.gnu.org> gcc/fortran PR fortran/108961 * trans-expr.cc (gfc_conv_procedure_call): The hidden string length must not be passed to a formal arg of type(cptr). gcc/testsuite/ PR fortran/108961 * gfortran.dg/pr108961.f90: New test.
2023-06-21vect: Add testcases for unsigned conversions [PR110018]Uros Bizjak2-10/+104
Also test convresions with unsigned types. PR target/110018 gcc/testsuite/ChangeLog: * gcc.target/i386/pr110018-1.c: Use explicit signed types. * gcc.target/i386/pr110018-2.c: New test.
2023-06-21aarch64: Avoid same input and output Z register for gather loadsKyrylo Tkachov4-71/+263
The architecture recommends that load-gather instructions avoid using the same Z register for the load address and the destination, and the Software Optimization Guides for Arm cores recommend that as well. This means that for code like: svuint64_t food (svbool_t p, uint64_t *in, svint64_t offsets, svuint64_t a) { return svadd_u64_x (p, a, svld1_gather_offset(p, in, offsets)); } we'll want to avoid generating the current: food: ld1d z0.d, p0/z, [x0, z0.d] // Z0 reused as input and output. add z0.d, z1.d, z0.d ret However, we still want to avoid generating extra moves where there were none before, so the tight aarch64-sve-acle.exp tests for load gathers should still pass as they are. This patch implements that recommendation for the load gather patterns by: * duplicating the alternatives * marking the output operand as early clobber * Tying the input Z register operand in the original alternatives to 0 * Penalising the original alternatives with '?' This results in a large-ish patch in terms of diff lines but the new compact syntax (thanks Tamar) makes it quite a readable an regular change. The benchmark numbers on a Neoverse V1 on fprate look okay: diff 503.bwaves_r 0.00% 507.cactuBSSN_r 0.00% 508.namd_r 0.00% 510.parest_r 0.55% 511.povray_r 0.22% 519.lbm_r 0.00% 521.wrf_r 0.00% 526.blender_r 0.00% 527.cam4_r 0.56% 538.imagick_r 0.00% 544.nab_r 0.00% 549.fotonik3d_r 0.00% 554.roms_r 0.00% fprate 0.10% Bootstrapped and tested on aarch64-none-linux-gnu. gcc/ChangeLog: * config/aarch64/aarch64-sve.md (mask_gather_load<mode><v_int_container>): Add alternatives to prefer to avoid same input and output Z register. (mask_gather_load<mode><v_int_container>): Likewise. (*mask_gather_load<mode><v_int_container>_<su>xtw_unpacked): Likewise. (*mask_gather_load<mode><v_int_container>_sxtw): Likewise. (*mask_gather_load<mode><v_int_container>_uxtw): Likewise. (@aarch64_gather_load_<ANY_EXTEND:optab><SVE_4HSI:mode><SVE_4BHI:mode>): Likewise. (@aarch64_gather_load_<ANY_EXTEND:optab><SVE_2HSDI:mode><SVE_2BHSI:mode>): Likewise. (*aarch64_gather_load_<ANY_EXTEND:optab><SVE_2HSDI:mode> <SVE_2BHSI:mode>_<ANY_EXTEND2:su>xtw_unpacked): Likewise. (*aarch64_gather_load_<ANY_EXTEND:optab><SVE_2HSDI:mode> <SVE_2BHSI:mode>_sxtw): Likewise. (*aarch64_gather_load_<ANY_EXTEND:optab><SVE_2HSDI:mode> <SVE_2BHSI:mode>_uxtw): Likewise. (@aarch64_ldff1_gather<mode>): Likewise. (@aarch64_ldff1_gather<mode>): Likewise. (*aarch64_ldff1_gather<mode>_sxtw): Likewise. (*aarch64_ldff1_gather<mode>_uxtw): Likewise. (@aarch64_ldff1_gather_<ANY_EXTEND:optab><VNx4_WIDE:mode> <VNx4_NARROW:mode>): Likewise. (@aarch64_ldff1_gather_<ANY_EXTEND:optab><VNx2_WIDE:mode> <VNx2_NARROW:mode>): Likewise. (*aarch64_ldff1_gather_<ANY_EXTEND:optab><VNx2_WIDE:mode> <VNx2_NARROW:mode>_sxtw): Likewise. (*aarch64_ldff1_gather_<ANY_EXTEND:optab><VNx2_WIDE:mode> <VNx2_NARROW:mode>_uxtw): Likewise. * config/aarch64/aarch64-sve2.md (@aarch64_gather_ldnt<mode>): Likewise. (@aarch64_gather_ldnt_<ANY_EXTEND:optab><SVE_FULL_SDI:mode> <SVE_PARTIAL_I:mode>): Likewise. gcc/testsuite/ChangeLog: * gcc.target/aarch64/sve/gather_earlyclobber.c: New test. * gcc.target/aarch64/sve2/gather_earlyclobber.c: New test.
2023-06-21aarch64: Convert SVE gather patterns to compact syntaxKyrylo Tkachov2-191/+211
This patch converts the SVE load gather patterns to the new compact syntax that Tamar introduced. This allows for a future patch I want to contribute to add more alternatives that are better viewed in the more compact form. The lines in some patterns are >80 long now, but I think that's unavoidable and those patterns already had overly long constraint strings. No functional change intended. Bootstrapped and tested on aarch64-none-linux-gnu. gcc/ChangeLog: * config/aarch64/aarch64-sve.md (mask_gather_load<mode><v_int_container>): Convert to compact alternatives syntax. (mask_gather_load<mode><v_int_container>): Likewise. (*mask_gather_load<mode><v_int_container>_<su>xtw_unpacked): Likewise. (*mask_gather_load<mode><v_int_container>_sxtw): Likewise. (*mask_gather_load<mode><v_int_container>_uxtw): Likewise. (@aarch64_gather_load_<ANY_EXTEND:optab><SVE_4HSI:mode><SVE_4BHI:mode>): Likewise. (@aarch64_gather_load_<ANY_EXTEND:optab><SVE_2HSDI:mode><SVE_2BHSI:mode>): Likewise. (*aarch64_gather_load_<ANY_EXTEND:optab><SVE_2HSDI:mode> <SVE_2BHSI:mode>_<ANY_EXTEND2:su>xtw_unpacked): Likewise. (*aarch64_gather_load_<ANY_EXTEND:optab><SVE_2HSDI:mode> <SVE_2BHSI:mode>_sxtw): Likewise. (*aarch64_gather_load_<ANY_EXTEND:optab><SVE_2HSDI:mode> <SVE_2BHSI:mode>_uxtw): Likewise. (@aarch64_ldff1_gather<mode>): Likewise. (@aarch64_ldff1_gather<mode>): Likewise. (*aarch64_ldff1_gather<mode>_sxtw): Likewise. (*aarch64_ldff1_gather<mode>_uxtw): Likewise. (@aarch64_ldff1_gather_<ANY_EXTEND:optab><VNx4_WIDE:mode> <VNx4_NARROW:mode>): Likewise. (@aarch64_ldff1_gather_<ANY_EXTEND:optab><VNx2_WIDE:mode> <VNx2_NARROW:mode>): Likewise. (*aarch64_ldff1_gather_<ANY_EXTEND:optab><VNx2_WIDE:mode> <VNx2_NARROW:mode>_sxtw): Likewise. (*aarch64_ldff1_gather_<ANY_EXTEND:optab><VNx2_WIDE:mode> <VNx2_NARROW:mode>_uxtw): Likewise. * config/aarch64/aarch64-sve2.md (@aarch64_gather_ldnt<mode>): Likewise. (@aarch64_gather_ldnt_<ANY_EXTEND:optab><SVE_FULL_SDI:mode> <SVE_PARTIAL_I:mode>): Likewise.
2023-06-21Revert "aarch64: Convert SVE gather patterns to compact syntax"Kyrylo Tkachov2-275/+191
This reverts commit bb3c69058a5fb874ea3c5c26bfb331d33d0497c3.
2023-06-21Move can_vec_mask_load_store_p and get_len_load_store_mode from ↵Ju-Zhe Zhong5-69/+68
"optabs-query" into "optabs-tree" Since we want both can_vec_mask_load_store_p and get_len_load_store_mode can see "internal_fn", move these 2 functions into optabs-tree. gcc/ChangeLog: * optabs-query.cc (can_vec_mask_load_store_p): Move to optabs-tree.cc. (get_len_load_store_mode): Ditto. * optabs-query.h (can_vec_mask_load_store_p): Move to optabs-tree.h. (get_len_load_store_mode): Ditto. * optabs-tree.cc (can_vec_mask_load_store_p): New function. (get_len_load_store_mode): Ditto. * optabs-tree.h (can_vec_mask_load_store_p): Ditto. (get_len_load_store_mode): Ditto. * tree-if-conv.cc: include optabs-tree instead of optabs-query
2023-06-21Less strip_offset in IVOPTsRichard Biener1-3/+4
This avoids one strip_offset use in add_iv_candidate_for_use where we know it operates on a sizetype quantity. * tree-ssa-loop-ivopts.cc (add_iv_candidate_for_use): Use split_constant_offset for the POINTER_PLUS_EXPR case.
2023-06-21Less strip_offset in IVOPTsRichard Biener1-4/+4
This avoids a strip_offset use in record_group_use where we know it operates on addresses. * tree-ssa-loop-ivopts.cc (record_group_use): Use split_constant_offset.
2023-06-21Hide IVOPTs strip_offsetRichard Biener3-6/+9
PR110243 shows strip_offset has some correctness issues, the following avoids using it from loop distribution which can use the more correct split_constant_offset from data-ref analysis instead. The patch then un-exports the function from IVOPTs. * tree-loop-distribution.cc (classify_builtin_st): Use split_constant_offset. * tree-ssa-loop-ivopts.h (strip_offset): Remove. * tree-ssa-loop-ivopts.cc (strip_offset): Make static.
2023-06-21aarch64: Convert SVE gather patterns to compact syntaxKyrylo Tkachov2-191/+275
This patch converts the SVE load gather patterns to the new compact syntax that Tamar introduced. This allows for a future patch I want to contribute to add more alternatives that are better viewed in the more compact form. The lines in some patterns are >80 long now, but I think that's unavoidable and those patterns already had overly long constraint strings. No functional change intended. Bootstrapped and tested on aarch64-none-linux-gnu. gcc/ChangeLog: * config/aarch64/aarch64-sve.md (mask_gather_load<mode><v_int_container>): Convert to compact alternatives syntax. (mask_gather_load<mode><v_int_container>): Likewise. (*mask_gather_load<mode><v_int_container>_<su>xtw_unpacked): Likewise. (*mask_gather_load<mode><v_int_container>_sxtw): Likewise. (*mask_gather_load<mode><v_int_container>_uxtw): Likewise. (@aarch64_gather_load_<ANY_EXTEND:optab><SVE_4HSI:mode><SVE_4BHI:mode>): Likewise. (@aarch64_gather_load_<ANY_EXTEND:optab><SVE_2HSDI:mode><SVE_2BHSI:mode>): Likewise. (*aarch64_gather_load_<ANY_EXTEND:optab><SVE_2HSDI:mode> <SVE_2BHSI:mode>_<ANY_EXTEND2:su>xtw_unpacked): Likewise. (*aarch64_gather_load_<ANY_EXTEND:optab><SVE_2HSDI:mode> <SVE_2BHSI:mode>_sxtw): Likewise. (*aarch64_gather_load_<ANY_EXTEND:optab><SVE_2HSDI:mode> <SVE_2BHSI:mode>_uxtw): Likewise. (@aarch64_ldff1_gather<mode>): Likewise. (@aarch64_ldff1_gather<mode>): Likewise. (*aarch64_ldff1_gather<mode>_sxtw): Likewise. (*aarch64_ldff1_gather<mode>_uxtw): Likewise. (@aarch64_ldff1_gather_<ANY_EXTEND:optab><VNx4_WIDE:mode> <VNx4_NARROW:mode>): Likewise. (@aarch64_ldff1_gather_<ANY_EXTEND:optab><VNx2_WIDE:mode> <VNx2_NARROW:mode>): Likewise. (*aarch64_ldff1_gather_<ANY_EXTEND:optab><VNx2_WIDE:mode> <VNx2_NARROW:mode>_sxtw): Likewise. (*aarch64_ldff1_gather_<ANY_EXTEND:optab><VNx2_WIDE:mode> <VNx2_NARROW:mode>_uxtw): Likewise. * config/aarch64/aarch64-sve2.md (@aarch64_gather_ldnt<mode>): Likewise. (@aarch64_gather_ldnt_<ANY_EXTEND:optab><SVE_FULL_SDI:mode> <SVE_PARTIAL_I:mode>): Likewise.
2023-06-21docs: replace backslashchar [PR 110329].Tamar Christina1-1/+1
It seems like @blackslashchar{} is a relatively new addition to texinfo. Other parts of the docs use @samp{\} so use it here too so older distros work. gcc/ChangeLog: PR other/110329 * doc/md.texi: Replace backslashchar.
2023-06-21[i386] Reject too large vectors for partial vector vectorizationRichard Biener3-0/+51
The following works around the lack of the x86 backend making the vectorizer compare the costs of the different possible vector sizes the backed advertises through the vector_modes hook. When enabling masked epilogues or main loops then this means we will select the prefered vector mode which is usually the largest even for loops that do not iterate close to the times the vector has lanes. When not using masking the vectorizer would reject any mode resulting in a VF bigger than the number of iterations but with masking they are simply masked out. So this overloads the finish_cost function and matches for the problematic case, forcing a high cost to make us try a smaller vector size. * config/i386/i386.cc (ix86_vector_costs::finish_cost): Overload. For masked main loops make sure the vectorization factor isn't more than double the number of iterations. * gcc.target/i386/vect-partial-vectors-1.c: New testcase. * gcc.target/i386/vect-partial-vectors-2.c: Likewise.
2023-06-21x86: make VPTERNLOG* usable on less than 512-bit operands with just AVX512FJan Beulich3-11/+59
There's no reason to constrain this to AVX512VL, unless instructed so by -mprefer-vector-width=, as the wider operation is unusable for more narrow operands only when the possible memory source is a non-broadcast one. This way even the scalar copysign<mode>3 can benefit from the operation being a single-insn one (leaving aside moves which the compiler decides to insert for unclear reasons, and leaving aside the fact that bcst_mem_operand() is too restrictive for broadcast to be embedded right into VPTERNLOG*). While there also bring *<avx512>_vternlog<mode>_all's in sync with that of the three splitters. Along with this also request value duplication in ix86_expand_copysign()'s call to ix86_build_signbit_mask(), eliminating excess space allocation in .rodata.*, filled with zeros which are never read. gcc/ * config/i386/i386-expand.cc (ix86_expand_copysign): Request value duplication by ix86_build_signbit_mask() when AVX512F and not HFmode. * config/i386/sse.md (*<avx512>_vternlog<mode>_all): Convert to 2-alternative form. Adjust "mode" attribute. Add "enabled" attribute. (*<avx512>_vpternlog<mode>_1): Also permit when TARGET_AVX512F && !TARGET_PREFER_AVX256. (*<avx512>_vpternlog<mode>_2): Likewise. (*<avx512>_vpternlog<mode>_3): Likewise. gcc/testsuite/ * gcc.target/i386/avx512f-copysign.c: New test.
2023-06-21x86: add -mprefer-vector-width=512 to new avx512f-dupv2di.c testcaseJan Beulich1-1/+1
This is to cover testing also being done with -march=cascadelake. gcc/testsuite/ * gcc.target/i386/avx512f-dupv2di.c: Add -mprefer-vector-width=512.
2023-06-21Use intermiediate integer type for float_expr/fix_trunc_expr when direct ↵liuhongt2-2/+158
optab is not existed. We have already use intermidate type in case WIDEN, but not for NONE, this patch extended that. gcc/ChangeLog: PR target/110018 * tree-vect-stmts.cc (vectorizable_conversion): Use intermiediate integer type for float_expr/fix_trunc_expr when direct optab is not existed. gcc/testsuite/ChangeLog: * gcc.target/i386/pr110018-1.c: New test.
2023-06-21Daily bump.GCC Administrator5-1/+623
2023-06-20gensupport: drop suppport for define_cond_exec from compact syntacTamar Christina1-2/+2
define_cond_exec does not support the special @@ syntax and so can't support {@. As such just remove support for it. gcc/ChangeLog: PR bootstrap/110324 * gensupport.cc (convert_syntax): Explicitly check for RTX code.
2023-06-20libcpp: Improve location for macro names [PR66290]Lewis Hyatt25-333/+385
When libcpp reports diagnostics whose locus is a macro name (such as for -Wunused-macros), it uses the location in the cpp_macro object that was stored by _cpp_new_macro. This is currently set to pfile->directive_line, which contains the line number only and no column information. This patch changes the stored location to the src_loc for the token defining the macro name, which includes the location and range information. libcpp/ChangeLog: PR c++/66290 * macro.cc (_cpp_create_definition): Add location argument. * internal.h (_cpp_create_definition): Adjust prototype. * directives.cc (do_define): Pass new location argument to _cpp_create_definition. (do_undef): Stop passing inferior location to cpp_warning_with_line; the default from cpp_warning is better. (cpp_pop_definition): Pass new location argument to _cpp_create_definition. * pch.cc (cpp_read_state): Likewise. gcc/testsuite/ChangeLog: PR c++/66290 * c-c++-common/cpp/macro-ranges.c: New test. * c-c++-common/cpp/line-2.c: Adapt to check for column information on macro-related libcpp warnings. * c-c++-common/cpp/line-3.c: Likewise. * c-c++-common/cpp/macro-arg-count-1.c: Likewise. * c-c++-common/cpp/pr58844-1.c: Likewise. * c-c++-common/cpp/pr58844-2.c: Likewise. * c-c++-common/cpp/warning-zero-location.c: Likewise. * c-c++-common/pragma-diag-14.c: Likewise. * c-c++-common/pragma-diag-15.c: Likewise. * g++.dg/modules/macro-2_d.C: Likewise. * g++.dg/modules/macro-4_d.C: Likewise. * g++.dg/modules/macro-4_e.C: Likewise. * g++.dg/spellcheck-macro-ordering.C: Likewise. * gcc.dg/builtin-redefine.c: Likewise. * gcc.dg/cpp/Wunused.c: Likewise. * gcc.dg/cpp/redef2.c: Likewise. * gcc.dg/cpp/redef3.c: Likewise. * gcc.dg/cpp/redef4.c: Likewise. * gcc.dg/cpp/ucnid-11-utf8.c: Likewise. * gcc.dg/cpp/ucnid-11.c: Likewise. * gcc.dg/cpp/undef2.c: Likewise. * gcc.dg/cpp/warn-redefined-2.c: Likewise. * gcc.dg/cpp/warn-redefined.c: Likewise. * gcc.dg/cpp/warn-unused-macros-2.c: Likewise. * gcc.dg/cpp/warn-unused-macros.c: Likewise.
2023-06-20aarch64: Fix gcc.target/aarch64/sve/pcs failuresRichard Sandiford67-92/+95
Several gcc.target/aarch64/sve/pcs tests started failing after 6a2e8dcbbd4, because the tests weren't robust against whether an indirect argument register or the stack pointer was used as the base for stores. The patch allows either base register when there is only one indirect argument. It disables -fcprop-registers in cases where there are sometimes multiple indirect arguments, since the name of the argument register is then an important part of the test. Disabling -fcprop-registers gives poor final register allocation, since: * combine's make_more_copies hack adds extra redundant moves * code with those moves is not allocated as well as moves without them * we often rely on -fcprop-registers to clean up the allocation later The patch therefore disables combine in the same tests as cprop-registers. gcc/testsuite/ * gcc.target/aarch64/sve/pcs/args_1.c: Match moves from the stack pointer to indirect argument registers and allow either to be used as the base register in subsequent stores. * gcc.target/aarch64/sve/pcs/args_8.c: Likewise. * gcc.target/aarch64/sve/pcs/args_2.c: Allow the store of the indirect argument to happen via the argument register or the stack pointer. * gcc.target/aarch64/sve/pcs/args_3.c: Likewise. * gcc.target/aarch64/sve/pcs/args_4.c: Likewise. * gcc.target/aarch64/sve/pcs/args_5_be_bf16.c: Likewise. * gcc.target/aarch64/sve/pcs/args_5_be_f16.c: Likewise. * gcc.target/aarch64/sve/pcs/args_5_be_f32.c: Likewise. * gcc.target/aarch64/sve/pcs/args_5_be_f64.c: Likewise. * gcc.target/aarch64/sve/pcs/args_5_be_s16.c: Likewise. * gcc.target/aarch64/sve/pcs/args_5_be_s32.c: Likewise. * gcc.target/aarch64/sve/pcs/args_5_be_s64.c: Likewise. * gcc.target/aarch64/sve/pcs/args_5_be_s8.c: Likewise. * gcc.target/aarch64/sve/pcs/args_5_be_u16.c: Likewise. * gcc.target/aarch64/sve/pcs/args_5_be_u32.c: Likewise. * gcc.target/aarch64/sve/pcs/args_5_be_u64.c: Likewise. * gcc.target/aarch64/sve/pcs/args_5_be_u8.c: Likewise. * gcc.target/aarch64/sve/pcs/args_5_le_bf16.c: Likewise. * gcc.target/aarch64/sve/pcs/args_5_le_f16.c: Likewise. * gcc.target/aarch64/sve/pcs/args_5_le_f32.c: Likewise. * gcc.target/aarch64/sve/pcs/args_5_le_f64.c: Likewise. * gcc.target/aarch64/sve/pcs/args_5_le_s16.c: Likewise. * gcc.target/aarch64/sve/pcs/args_5_le_s32.c: Likewise. * gcc.target/aarch64/sve/pcs/args_5_le_s64.c: Likewise. * gcc.target/aarch64/sve/pcs/args_5_le_s8.c: Likewise. * gcc.target/aarch64/sve/pcs/args_5_le_u16.c: Likewise. * gcc.target/aarch64/sve/pcs/args_5_le_u32.c: Likewise. * gcc.target/aarch64/sve/pcs/args_5_le_u64.c: Likewise. * gcc.target/aarch64/sve/pcs/args_5_le_u8.c: Likewise. * gcc.target/aarch64/sve/pcs/args_6_be_bf16.c: Disable -fcprop-registers and combine. * gcc.target/aarch64/sve/pcs/args_6_be_f16.c: Likewise. * gcc.target/aarch64/sve/pcs/args_6_be_f32.c: Likewise. * gcc.target/aarch64/sve/pcs/args_6_be_f64.c: Likewise. * gcc.target/aarch64/sve/pcs/args_6_be_s16.c: Likewise. * gcc.target/aarch64/sve/pcs/args_6_be_s32.c: Likewise. * gcc.target/aarch64/sve/pcs/args_6_be_s64.c: Likewise. * gcc.target/aarch64/sve/pcs/args_6_be_s8.c: Likewise. * gcc.target/aarch64/sve/pcs/args_6_be_u16.c: Likewise. * gcc.target/aarch64/sve/pcs/args_6_be_u32.c: Likewise. * gcc.target/aarch64/sve/pcs/args_6_be_u64.c: Likewise. * gcc.target/aarch64/sve/pcs/args_6_be_u8.c: Likewise. * gcc.target/aarch64/sve/pcs/args_6_le_bf16.c: Likewise. * gcc.target/aarch64/sve/pcs/args_6_le_f16.c: Likewise. * gcc.target/aarch64/sve/pcs/args_6_le_f32.c: Likewise. * gcc.target/aarch64/sve/pcs/args_6_le_f64.c: Likewise. * gcc.target/aarch64/sve/pcs/args_6_le_s16.c: Likewise. * gcc.target/aarch64/sve/pcs/args_6_le_s32.c: Likewise. * gcc.target/aarch64/sve/pcs/args_6_le_s64.c: Likewise. * gcc.target/aarch64/sve/pcs/args_6_le_s8.c: Likewise. * gcc.target/aarch64/sve/pcs/args_6_le_u16.c: Likewise. * gcc.target/aarch64/sve/pcs/args_6_le_u32.c: Likewise. * gcc.target/aarch64/sve/pcs/args_6_le_u64.c: Likewise. * gcc.target/aarch64/sve/pcs/args_6_le_u8.c: Likewise. * gcc.target/aarch64/sve/pcs/varargs_1.c: Likewise. * gcc.target/aarch64/sve/pcs/varargs_2_f16.c: Likewise. * gcc.target/aarch64/sve/pcs/varargs_2_f32.c: Likewise. * gcc.target/aarch64/sve/pcs/varargs_2_f64.c: Likewise. * gcc.target/aarch64/sve/pcs/varargs_2_s16.c: Likewise. * gcc.target/aarch64/sve/pcs/varargs_2_s32.c: Likewise. * gcc.target/aarch64/sve/pcs/varargs_2_s64.c: Likewise. * gcc.target/aarch64/sve/pcs/varargs_2_s8.c: Likewise. * gcc.target/aarch64/sve/pcs/varargs_2_u16.c: Likewise. * gcc.target/aarch64/sve/pcs/varargs_2_u32.c: Likewise. * gcc.target/aarch64/sve/pcs/varargs_2_u64.c: Likewise. * gcc.target/aarch64/sve/pcs/varargs_2_u8.c: Likewise. * gcc.target/aarch64/sve/pcs/varargs_3_nosc.c: Likewise. * gcc.target/aarch64/sve/pcs/varargs_3_sc.c: Likewise.
2023-06-20aarch64: Robustify stack tie handlingRichard Sandiford2-7/+18
The SVE handling of stack clash protection copied the stack pointer to X11 before the probe and set up X11 as the CFA for unwind purposes: /* This is done to provide unwinding information for the stack adjustments we're about to do, however to prevent the optimizers from removing the R11 move and leaving the CFA note (which would be very wrong) we tie the old and new stack pointer together. The tie will expand to nothing but the optimizers will not touch the instruction. */ rtx stack_ptr_copy = gen_rtx_REG (Pmode, STACK_CLASH_SVE_CFA_REGNUM); emit_move_insn (stack_ptr_copy, stack_pointer_rtx); emit_insn (gen_stack_tie (stack_ptr_copy, stack_pointer_rtx)); /* We want the CFA independent of the stack pointer for the duration of the loop. */ add_reg_note (insn, REG_CFA_DEF_CFA, stack_ptr_copy); RTX_FRAME_RELATED_P (insn) = 1; -fcprop-registers is now smart enough to realise that X11 = SP, replace X11 with SP in the stack tie, and delete the instruction created above. This patch tries to prevent that by making stack_tie fussy about the register numbers. It fixes failures in gcc.target/aarch64/sve/pcs/stack_clash*.c. gcc/ * config/aarch64/aarch64.md (stack_tie): Hard-code the first register operand to the stack pointer. Require the second register operand to have the number specified in a separate const_int operand. * config/aarch64/aarch64.cc (aarch64_emit_stack_tie): New function. (aarch64_allocate_and_probe_stack_space): Use it. (aarch64_expand_prologue, aarch64_expand_epilogue): Likewise. (aarch64_expand_epilogue): Likewise.
2023-06-20tree-ssa-math-opts: Small uaddc/usubc pattern matching improvement [PR79173]Jakub Jelinek2-0/+38
In the following testcase we fail to pattern recognize the least significant .UADDC call. The reason is that arg3 in that case is _3 = .ADD_OVERFLOW (...); _2 = __imag__ _3; _1 = _2 != 0; arg3 = (unsigned long) _1; and while before the changes arg3 has a single use in some .ADD_OVERFLOW later on, we add a .UADDC call next to it (and gsi_remove/gsi_replace only what is strictly necessary and leave quite a few dead stmts around which next DCE cleans up) and so it all of sudden isn't used just once, but twice (.ADD_OVERFLOW and .UADDC) and so uaddc_cast fails. While we could tweak uaddc_cast and not require has_single_use in these uses, there is also no vrp that would figure out that because __imag__ _3 is in [0, 1] range, it can just use arg3 = __imag__ _3; and drop the comparison and cast. We already search if either arg2 or arg3 is ultimately set from __imag__ of .{{ADD,SUB}_OVERFLOW,U{ADD,SUB}C} call, so the following patch just remembers the lhs of __imag__ from that case and uses it later. 2023-06-20 Jakub Jelinek <jakub@redhat.com> PR middle-end/79173 * tree-ssa-math-opts.cc (match_uaddc_usubc): Remember lhs of IMAGPART_EXPR of arg2/arg3 and use that as arg3 if it has the right type. * g++.target/i386/pr79173-1.C: New test.