Age | Commit message | Author | Files | Lines
2024-05-31 | Provide counted_by attribute to flexible array member field | Qing Zhao | 10 | -21/+444
'counted_by (COUNT)'

The 'counted_by' attribute may be attached to the C99 flexible array member of a structure. It indicates that the number of elements of the array is given by the field "COUNT" in the same structure as the flexible array member. GCC may use this information to improve detection of object size information for such structures and provide better results in compile-time diagnostics and runtime features like the array bound sanitizer and '__builtin_dynamic_object_size'.

For instance, the following code:

    struct P {
      size_t count;
      char other;
      char array[] __attribute__ ((counted_by (count)));
    } *p;

specifies that 'array' is a flexible array member whose number of elements is given by the field 'count' in the same structure.

The field that represents the number of elements should have an integer type. Otherwise, the compiler reports an error and ignores the attribute.

When the field that represents the number of elements is assigned a negative integer value, the compiler treats the value as zero.

An explicit 'counted_by' annotation defines a relationship between two objects, 'p->array' and 'p->count', and there are the following requirements on the relationship between this pair:

* 'p->count' must be initialized before the first reference to 'p->array';
* 'p->array' has _at least_ 'p->count' number of elements available all the time. This relationship must hold even after any of these related objects are updated during the program.

It is the user's responsibility to make sure that the above requirements hold at all times. Otherwise the compiler reports warnings, and the results of the array bound sanitizer and '__builtin_dynamic_object_size' are undefined.

One important feature of the attribute is that a reference to the flexible array member field uses the latest value assigned to the field that represents the number of elements before that reference.
For example,

    p->count = val1;
    p->array[20] = 0;  // ref1 to p->array
    p->count = val2;
    p->array[30] = 0;  // ref2 to p->array

In the above, 'ref1' uses 'val1' as the number of elements in 'p->array', and 'ref2' uses 'val2' as the number of elements in 'p->array'.

gcc/c-family/ChangeLog:

* c-attribs.cc (handle_counted_by_attribute): New function.
(attribute_takes_identifier_p): Add counted_by attribute to the list.
* c-common.cc (c_flexible_array_member_type_p): ...To this.
* c-common.h (c_flexible_array_member_type_p): New prototype.

gcc/c/ChangeLog:

* c-decl.cc (flexible_array_member_type_p): Renamed and moved to...
(add_flexible_array_elts_to_size): Use renamed function.
(is_flexible_array_member_p): Use renamed function.
(verify_counted_by_attribute): New function.
(finish_struct): Use renamed function and verify counted_by attribute.
* c-tree.h (lookup_field): New prototype.
* c-typeck.cc (lookup_field): Expose as extern function.
(tagged_types_tu_compatible_p): Check counted_by attribute for structure type.

gcc/ChangeLog:

* doc/extend.texi: Document attribute counted_by.

gcc/testsuite/ChangeLog:

* gcc.dg/flex-array-counted-by.c: New test.
* gcc.dg/flex-array-counted-by-7.c: New test.
* gcc.dg/flex-array-counted-by-8.c: New test.
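The two requirements this commit places on a 'counted_by' pair can be sketched in user code as follows. This is only an illustration: `struct packet`, `packet_alloc`, `packet_resize` and the `COUNTED_BY` wrapper macro are hypothetical names, not part of the patch, and the macro expands to nothing on compilers that do not know the attribute.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical wrapper: use counted_by only where the compiler supports it. */
#if defined(__has_attribute)
# if __has_attribute(counted_by)
#  define COUNTED_BY(field) __attribute__((counted_by(field)))
# endif
#endif
#ifndef COUNTED_BY
# define COUNTED_BY(field)
#endif

struct packet {
    size_t count;                  /* number of elements in data[] */
    char data[] COUNTED_BY(count); /* flexible array member */
};

/* Allocate room for n elements and set count before data is ever referenced. */
struct packet *packet_alloc(size_t n)
{
    struct packet *p = malloc(sizeof *p + n);
    if (p == NULL)
        return NULL;
    p->count = n;
    return p;
}

/* Change the capacity; count is updated right after the storage changes and
   before any further reference to data, so data keeps at least count elements
   at every point where it is accessed. */
struct packet *packet_resize(struct packet *p, size_t n)
{
    assert(p != NULL);
    struct packet *q = realloc(p, sizeof *q + n);
    if (q == NULL)
        return NULL;
    q->count = n;
    return q;
}
```

The point of routing every size change through one helper is that the count field and the allocation can never drift apart between two references to the array.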
2024-05-31 | alpha: Fix invalid RTX in divmodsi insn patterns [PR115297] | Uros Bizjak | 3 | -10/+26
any_divmod instructions are modelled with invalid RTX: [(set (match_operand:DI 0 "register_operand" "=c") (sign_extend:DI (match_operator:SI 3 "divmod_operator" [(match_operand:DI 1 "register_operand" "a") (match_operand:DI 2 "register_operand" "b")]))) (clobber (reg:DI 23)) (clobber (reg:DI 28))] where SImode divmod_operator (div,mod,udiv,umod) has DImode operands. Wrap input operand with truncate:SI to make machine modes consistent. PR target/115297 gcc/ChangeLog: * config/alpha/alpha.md (<any_divmod:code>si3): Wrap DImode operands 3 and 4 with truncate:SI RTX. (*divmodsi_internal_er): Ditto for operands 1 and 2. (*divmodsi_internal_er_1): Ditto. (*divmodsi_internal): Ditto. * config/alpha/constraints.md ("b"): Correct register number in the description. gcc/testsuite/ChangeLog: * gcc.target/alpha/pr115297.c: New test.
2024-05-31 | nvptx target: Global constructor, destructor support, via nvptx-tools 'ld' | Thomas Schwinge | 6 | -5/+109
The function attributes 'constructor', 'destructor', and 'init_priority' now work, as do the C++ features making use of this. Test cases with effective target 'global_constructor' and 'init_priority' now generally work, and 'check-gcc-c++' test results greatly improve; no more "sorry, unimplemented: global constructors not supported on this target".

For proper execution test results, this depends on <https://github.com/SourceryTools/nvptx-tools/commit/96f8fc59a757767b9e98157d95c21e9fef22a93b> "ld: Global constructor/destructor support".

gcc/
* config/nvptx/nvptx.h: Configure global constructor, destructor support.

gcc/testsuite/
* gcc.dg/no_profile_instrument_function-attr-1.c: GCC/nvptx is 'NO_DOT_IN_LABEL' but not 'NO_DOLLAR_IN_LABEL', so '$' may appear in identifiers.
* lib/target-supports.exp (check_effective_target_global_constructor): Enable for nvptx.

libgcc/
* config/nvptx/crt0.c (__gbl_ctors): New weak function.
(__main): Invoke it.
* config/nvptx/gbl-ctors.c: New.
* config/nvptx/t-nvptx: Configure global constructor, destructor support.
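For readers unfamiliar with the 'constructor' attribute this commit enables on nvptx, here is a small generic sketch of what it does on any target that supports it. The function and variable names are made up for illustration; the priorities 101 and 102 are arbitrary values above the range GCC reserves (0 to 100).

```c
#include <assert.h>

static int order[2];  /* records which initializer ran when */
static int n_events;  /* zero-initialized before any constructor runs */

/* Lower priority numbers run earlier, so this runs first. */
__attribute__((constructor(101))) static void init_first(void)
{
    order[n_events++] = 1;
}

/* Runs after init_first, still before main. */
__attribute__((constructor(102))) static void init_second(void)
{
    order[n_events++] = 2;
}

/* Accessors so other code can inspect what happened before main. */
int events_recorded(void)
{
    return n_events;
}

int event_at(int i)
{
    assert(i >= 0 && i < n_events);
    return order[i];
}
```

By the time `main` starts, both initializers have already run in priority order, which is the behaviour the nvptx port previously rejected with the "sorry, unimplemented" diagnostic.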
2024-05-31 | tree-optimization/115278 - fix DSE in if-conversion wrt volatiles | Richard Biener | 2 | -1/+41
The following adds the missing guard for volatile stores to the embedded DSE in the loop if-conversion pass. PR tree-optimization/115278 * tree-if-conv.cc (ifcvt_local_dce): Do not DSE volatile stores. * g++.dg/vect/pr115278.cc: New testcase.
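As a rough illustration of why volatile stores need this guard, consider the loop below. It is not the testcase from the PR, only a hypothetical function of the general shape that loop if-conversion looks at: a conditional store inside a loop. Because `reg` points to a volatile object, every executed store is an observable side effect, so none of them may be treated as dead even though a later iteration overwrites the same location.

```c
#include <assert.h>

/* Write each nonzero value to a (possibly memory-mapped) register.
   All stores through reg must be kept, in order. */
void poke_nonzero(volatile int *reg, const int *values, int n)
{
    assert(reg != 0);
    for (int i = 0; i < n; i++)
        if (values[i] != 0)
            *reg = values[i];
}
```

On ordinary memory only the last store is visible afterwards, which is exactly why a dead-store pass is tempted to drop the earlier ones when the volatile qualifier is not checked.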
2024-05-31 | fix: valid compiler optimization may fail the test | Marc Poulhiès | 1 | -0/+12
cxa4001 may fail with "Exception not raised" when the compiler omits the calls to To_Mapping, in accordance with 10.2.1(18/3): "If a library unit is declared pure, then the implementation is permitted to omit a call on a library-level subprogram of the library unit if the results are not needed after the call" Using the result of both To_Mapping calls prevents the compiler from omitting them. "The corrected test will be available on the ACAA web site (http://www.ada-auth.org/), and will be issued with the Modified Tests List version 2.6K, 3.1DD, and 4.1GG." gcc/testsuite/ChangeLog: * ada/acats/tests/cxa/cxa4001.a: Use function result.
2024-05-31 | build: Include minor version in config.gcc unsupported message | Rainer Orth | 1 | -2/+2
It has been pointed out to me that when moving Solaris 11.3 from config.gcc's obsolete to unsupported list, I'd forgotten to also move the minor version info, leading to confusing *** Configuration i386-pc-solaris2.11 not supported instead of the correct *** Configuration i386-pc-solaris2.11.3 not supported This patch fixes this oversight. Tested on i386-pc-solaris2.11 (11.3 and 11.4). 2024-05-30 Rainer Orth <ro@CeBiTec.Uni-Bielefeld.DE> gcc: * config.gcc: Move ${target_min} from obsolete to unsupported message.
2024-05-31 | Fix some opindex for some options [PR115022] | Andrew Pinski | 13 | -16/+33
While looking at the index I noticed that some options had `-` in front in the index, which is wrong. I then noticed that targets either had no index entry for `mcmodel=` or had used `-mcmodel` incorrectly. This fixes both of those and regenerates the urls files so that the `-mcmodel=` option now has a URL associated with it.

gcc/ChangeLog:

PR target/115022
* doc/invoke.texi (fstrub=disable): Fix opindex.
(minline-memops-threshold): Fix opindex.
(mcmodel=): Add opindex and fix them.
* common.opt.urls: Regenerate.
* config/aarch64/aarch64.opt.urls: Regenerate.
* config/bpf/bpf.opt.urls: Regenerate.
* config/i386/i386.opt.urls: Regenerate.
* config/loongarch/loongarch.opt.urls: Regenerate.
* config/nds32/nds32-elf.opt.urls: Regenerate.
* config/nds32/nds32-linux.opt.urls: Regenerate.
* config/or1k/or1k.opt.urls: Regenerate.
* config/riscv/riscv.opt.urls: Regenerate.
* config/rs6000/aix64.opt.urls: Regenerate.
* config/rs6000/linux64.opt.urls: Regenerate.
* config/sparc/sparc.opt.urls: Regenerate.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
2024-05-31 | testsuite: Adjust several dg-additional-files-options calls [PR115294] | Rainer Orth | 5 | -5/+5
A recent patch commit bdc264a16e327c63d133131a695a202fbbc0a6a0 Author: Alexandre Oliva <oliva@adacore.com> Date: Thu May 30 02:06:48 2024 -0300 [testsuite] conditionalize dg-additional-sources on target and type added two additional args to dg-additional-files-options. Unfortunately, this completely broke several testsuites like ERROR: tcl error sourcing /vol/gcc/src/hg/master/local/libatomic/testsuite/../../gcc/testsuite/lib/gcc-dg.exp. wrong # args: should be "dg-additional-files-options options source dest type" since the patch forgot to adjust some of the callers. This patch fixes that. Tested on i386-pc-solaris2.11, sparc-sun-solaris2.11, and x86_64-pc-linux-gnu. 2024-05-31 Rainer Orth <ro@CeBiTec.Uni-Bielefeld.DE> libatomic: PR testsuite/115294 * testsuite/lib/libatomic.exp (libatomic_target_compile): Pass new dg-additional-files-options args. libgomp: PR testsuite/115294 * testsuite/lib/libgomp.exp (libgomp_target_compile): Pass new dg-additional-files-options args. libitm: PR testsuite/115294 * testsuite/lib/libitm.exp (libitm_target_compile): Pass new dg-additional-files-options args. libphobos: PR testsuite/115294 * testsuite/lib/libphobos.exp (libphobos_target_compile): Pass new dg-additional-files-options args. libvtv: PR testsuite/115294 * testsuite/lib/libvtv.exp (libvtv_target_compile): Pass new dg-additional-files-options args.
2024-05-30 | xtensa: Use epilogue_completed rather than cfun->machine->epilogue_done | Takayuki 'January June' Suwa | 3 | -16/+4
In commit ad89d820bf, an "epilogue_done" member was added to the machine_function structure, but it is sufficient to use the existing "epilogue_completed" global variable. gcc/ChangeLog: * config/xtensa/xtensa-protos.h (xtensa_use_return_instruction_p): Remove. * config/xtensa/xtensa.cc (machine_function): Remove "epilogue_done" field. (xtensa_expand_epilogue): Remove "cfun->machine->epilogue_done" usage. (xtensa_use_return_instruction_p): Remove. * config/xtensa/xtensa.md ("return"): Replace calling "xtensa_use_return_instruction_p()" with inline code.
2024-05-30 | xtensa: Use REG_P(), MEM_P(), etc. instead of comparing GET_CODE() | Takayuki 'January June' Suwa | 3 | -53/+51
Instead of comparing directly, this patch replaces as much as possible with macros that determine RTX code such as REG_P(), SUBREG_P() or MEM_P(), etc. gcc/ChangeLog: * config/xtensa/xtensa.cc (xtensa_valid_move, constantpool_address_p, xtensa_tls_symbol_p, gen_int_relational, xtensa_emit_move_sequence, xtensa_copy_incoming_a7, xtensa_expand_block_move, xtensa_expand_nonlocal_goto, xtensa_emit_call, xtensa_legitimate_address_p, xtensa_legitimize_address, xtensa_tls_referenced_p, print_operand, print_operand_address, xtensa_output_literal): Replace RTX code comparisons with their predicate macros such as REG_P(). * config/xtensa/xtensa.h (CONSTANT_ADDRESS_P, LEGITIMATE_PIC_OPERAND_P): Ditto. * config/xtensa/xtensa.md (reload<mode>_literal, indirect_jump): Ditto.
2024-05-31 | C23: allow aliasing for types derived from structs with variable size | Martin Uecker | 4 | -9/+59
Previously, we set the aliasing set of structures with variable size

    struct foo { int x[n]; char b; };

to zero. The reason is that such types can be compatible with different structure types which are themselves incompatible with each other.

    struct foo { int x[2]; char b; };
    struct foo { int x[3]; char b; };

But it is not enough to set the aliasing set to zero, because derived types would then still end up in different equivalence classes even though they might be compatible. Instead those types should be set to structural equivalency. We also add checking assertions that ensure that TYPE_CANONICAL is set correctly for all tagged types.

gcc/c/
* c-decl.cc (finish_struct): Do not set TYPE_CANONICAL for structure or unions with variable size.
* c-objc-common.cc (c_get_alias_set): Do not set alias set to zero.
* c-typeck.cc (comptypes_verify): New function.
(comptypes,comptypes_same_p,comptypes_check_enum_int): Add assertion.
(comptypes_equiv_p): Add assertion that ensures that compatible types have the same equivalence class.
(tagged_types_tu_compatible_p): Remove now unneeded special case.

gcc/testsuite/
* gcc.dg/gnu23-tag-alias-8.c: New test.
2024-05-31 | C: allow aliasing of compatible types derived from enumeral types [PR115157] | Martin Uecker | 8 | -14/+112
Aliasing of enumeral types with the underlying integer is now allowed by setting the aliasing set to zero. But this does not allow aliasing of derived types which are compatible as required by ISO C. Instead, initially set structural equality. Then set TYPE_CANONICAL and update pointers and main variants when the type is completed (as done for structures and unions in C23).

PR tree-optimization/115157
PR tree-optimization/115177

gcc/c/
* c-decl.cc (shadow_tag_warned,parse_xref_tag,start_enum, finish_enum): Set SET_TYPE_STRUCTURAL_EQUALITY / TYPE_CANONICAL.
* c-objc-common.cc (get_alias_set): Remove special case.
(get_aka_type): Add special case.

gcc/c-family/
* c-attribs.cc (handle_hardbool_attribute): Set TYPE_CANONICAL for hardbools.

gcc/
* godump.cc (go_output_typedef): Use TYPE_MAIN_VARIANT instead of TYPE_CANONICAL.

gcc/testsuite/
* gcc.dg/enum-alias-1.c: New test.
* gcc.dg/enum-alias-2.c: New test.
* gcc.dg/enum-alias-3.c: New test.
* gcc.dg/enum-alias-4.c: New test.
2024-05-31 | Rename double_u to __double_u to avoid polluting the namespace. | liuhongt | 2 | -8/+8
gcc/ChangeLog: * config/i386/emmintrin.h (__double_u): Rename from double_u. (_mm_load_sd): Replace double_u with __double_u. (_mm_store_sd): Ditto. (_mm_loadh_pd): Ditto. (_mm_loadl_pd): Ditto. * config/i386/xmmintrin.h (__float_u): Rename from float_u. (_mm_load_ss): Ditto. (_mm_store_ss): Ditto.
2024-05-31 | Daily bump. | GCC Administrator | 6 | -1/+423
2024-05-31 | i386: Rewrite bswaphi2 handling [PR115102] | Uros Bizjak | 2 | -27/+60
Introduce *bswaphi2 instruction pattern and enable bswaphi2 expander also for non-movbe targets. The testcase:

    unsigned short bswap8 (unsigned short val)
    {
      return ((val & 0xff00) >> 8) | ((val & 0xff) << 8);
    }

now expands through bswaphi2 named expander.

Rewrite bswaphi_lowpart insn pattern as bswaphisi2_lowpart in the RTX form that combine pass can use to simplify:

    Trying 6, 9, 8 -> 10:
        6: r99:SI=bswap(r103:SI)
        9: {r107:SI=r103:SI&0xffffffffffff0000;clobber flags:CC;}
          REG_DEAD r103:SI
          REG_UNUSED flags:CC
        8: {r106:SI=r99:SI 0>>0x10;clobber flags:CC;}
          REG_DEAD r99:SI
          REG_UNUSED flags:CC
       10: {r104:SI=r106:SI|r107:SI;clobber flags:CC;}
          REG_DEAD r107:SI
          REG_DEAD r106:SI
          REG_UNUSED flags:CC

    Successfully matched this instruction:
    (set (reg:SI 104 [ _8 ])
         (ior:SI (and:SI (reg/v:SI 103 [ val ])
                         (const_int -65536 [0xffffffffffff0000]))
                 (lshiftrt:SI (bswap:SI (reg/v:SI 103 [ val ]))
                              (const_int 16 [0x10]))))

allowing combination of insns 6, 8, 9 and 10 when compiling the following testcase:

    unsigned int bswap8 (unsigned int val)
    {
      return (val & 0xffff0000) | ((val & 0xff00) >> 8) | ((val & 0xff) << 8);
    }

to produce:

    movl    %edi, %eax
    xchgb   %ah, %al
    ret

The expansion now always goes through a clobberless form of the bswaphi instruction. The instruction is conditionally converted to a rotate at peephole2 pass. This significantly simplifies bswaphisi2_lowpart insn pattern attributes.

PR target/115102

gcc/ChangeLog:

* config/i386/i386.md (bswaphi2): Also enable for !TARGET_MOVBE.
(*bswaphi2): New insn pattern.
(bswaphisi2_lowpart): Rename from bswaphi_lowpart. Rewrite insn RTX to match the expected form of the combine pass. Remove rol{w} alternative and corresponding attributes.
(bswaphisi2_lowpart peephole2): New peephole2 pattern to conditionally convert bswaphisi2_lowpart to rotlhi3_1_slp.
(bswapsi2): Update expander for rename.
(rotlhi3_1_slp splitter): Conditionally split to bswaphi2.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr115102.c: New test.
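The two source idioms from this commit message can be checked on concrete values with a small sketch. The function names `swap16` and `swap_low16` are chosen here for illustration (the commit's testcases both use the name bswap8), and fixed-width types are used so the behaviour does not depend on the width of `unsigned int`.

```c
#include <stdint.h>

/* Swap the two bytes of a 16-bit value: the shape that now expands
   through the bswaphi2 expander. */
uint16_t swap16(uint16_t val)
{
    return (uint16_t)(((val & 0xff00) >> 8) | ((val & 0xff) << 8));
}

/* Swap only the low two bytes of a 32-bit value and keep the upper half:
   the shape that combine can now match as bswaphisi2_lowpart. */
uint32_t swap_low16(uint32_t val)
{
    return (val & 0xffff0000u) | ((val & 0xff00u) >> 8) | ((val & 0xffu) << 8);
}
```

For example, `swap16(0x1234)` yields `0x3412`, and `swap_low16(0xAABB1234)` yields `0xAABB3412`: the upper 16 bits pass through untouched, which is why a single `xchgb %ah, %al` on the low word is enough.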
2024-05-30 | ira: Fix go_through_subreg offset calculation [PR115281] | Richard Sandiford | 2 | -1/+41
go_through_subreg used: else if (!can_div_trunc_p (SUBREG_BYTE (x), REGMODE_NATURAL_SIZE (GET_MODE (x)), offset)) to calculate the register offset for a pseudo subreg x. In the blessed days before poly-int, this was: *offset = (SUBREG_BYTE (x) / REGMODE_NATURAL_SIZE (GET_MODE (x))); But I think this is testing the wrong natural size. If we exclude paradoxical subregs (which will get an offset of zero regardless), it's the inner register that is being split, so it should be the inner register's natural size that we use. This matters in the testcase because we have an SFmode lowpart subreg into the last of three variable-sized vectors. The SUBREG_BYTE is therefore equal to the size of two variable-sized vectors. Dividing by the vector size gives a register offset of 2, as expected, but dividing by the size of a scalar FPR would give a variable offset. I think something similar could happen for fixed-size targets if REGMODE_NATURAL_SIZE is different for vectors and integers (say), although that case would trade an ICE for an incorrect offset. gcc/ PR rtl-optimization/115281 * ira-conflicts.cc (go_through_subreg): Use the natural size of the inner mode rather than the outer mode. gcc/testsuite/ PR rtl-optimization/115281 * gfortran.dg/pr115281.f90: New test.
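The fixed-size variant of the problem mentioned at the end of this message can be made concrete with a toy calculation. Everything here is an assumption for illustration: `reg_offset` is a hypothetical helper (not GCC's go_through_subreg), vector registers are taken to have a natural size of 16 bytes, and scalar registers a natural size of 8 bytes.

```c
#include <assert.h>

/* Toy model of the pre-poly-int formula:
   register offset = SUBREG_BYTE / natural register size. */
unsigned reg_offset(unsigned subreg_byte, unsigned natural_size)
{
    assert(natural_size != 0);
    return subreg_byte / natural_size;
}
```

With three 16-byte vectors, a lowpart subreg into the last vector starts at byte 32. Dividing by the inner (vector) natural size gives `reg_offset(32, 16) == 2`, the expected register. Dividing by the assumed outer (scalar) natural size gives `reg_offset(32, 8) == 4`, which is the kind of incorrect offset the message warns about.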
2024-05-30 | aarch64, middle-end: Move pair_fusion pass from aarch64 to middle-end | Ajit Kumar Agarwal | 5 | -3149/+3216
Move pair fusion pass from aarch64-ldp-fusion.cc to middle-end to support multiple targets.

Common infrastructure of load store pair fusion is divided into target-independent and target-dependent code. The target-independent code is structured in the following files:

gcc/pair-fusion.h
gcc/pair-fusion.cc

The target-independent code is generic code that uses pure virtual functions to interface between the target-independent and target-dependent code.

2024-05-30 Ajit Kumar Agarwal <aagarwa1@linux.ibm.com>

gcc/ChangeLog:

* pair-fusion.h: Generic header code for load store pair fusion that can be shared across different architectures.
* pair-fusion.cc: Generic source code implementation for load store pair fusion that can be shared across different architectures.
* Makefile.in: Add new object file pair-fusion.o.
* config/aarch64/aarch64-ldp-fusion.cc: Delete generic code and move it to pair-fusion.cc in the middle-end.
* config/aarch64/t-aarch64: Add header file dependency on pair-fusion.h. Remove unnecessary header file dependency.
2024-05-30 | ggc: Reduce GGC_QUIRE_SIZE on Solaris/SPARC [PR115031] | Rainer Orth | 1 | -0/+3
g++.dg/modules/pr99023_b.X currently FAILs on 32-bit Solaris/SPARC:

    FAIL: g++.dg/modules/pr99023_b.X -std=c++2a 1 blank line(s) in output
    FAIL: g++.dg/modules/pr99023_b.X -std=c++2a (test for excess errors)
    Excess errors:
    cc1plus: out of memory allocating 1048344 bytes after a total of 7913472 bytes

It turns out that this exhaustion of the 32-bit address space happens due to a combination of three issues:

* the SPARC pagesize of 8 kB,
* ggc-page.cc's chunk size of 512 * pagesize, i.e. 4 MB, and
* mmap adding two 8 kB unmapped red-zone pages to each mapping

which result in the 4 MB mappings actually consuming 4.5 MB of address space.

To avoid this, this patch reduces the chunk size so it remains at 4 MB even when combined with the red-zone pages, as recommended by mmap(2).

Tested on sparc-sun-solaris2.11 and sparcv9-sun-solaris2.11.

2024-05-29 Rainer Orth <ro@CeBiTec.Uni-Bielefeld.DE>

gcc:
PR c++/115031
* config/sparc/sol2.h (GGC_QUIRE_SIZE): Define as 510.
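The choice of 510 follows from simple page arithmetic, sketched below. `mapping_footprint` is a hypothetical helper written for this note; it only adds up the raw page counts and does not model how the kernel rounds or aligns a mapping that exceeds 4 MB.

```c
#include <stddef.h>

/* Bytes of address space covered by one quire plus its red-zone pages. */
size_t mapping_footprint(size_t quire_pages, size_t page_size,
                         size_t red_zone_pages)
{
    return (quire_pages + red_zone_pages) * page_size;
}
```

With 8 kB pages and two red-zone pages, a quire of 510 pages covers (510 + 2) * 8 kB = 4 MB exactly, whereas the default of 512 pages covers 4 MB plus 16 kB and so no longer fits in a 4 MB slot.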
2024-05-30 | analyzer: fix a -Wunused-parameter | David Malcolm | 1 | -1/+1
gcc/analyzer/ChangeLog: * infinite-loop.cc (looping_back_event::get_desc): Fix unused parameter warning introduced by me in r15-636-g770657d02c986c. Signed-off-by: David Malcolm <dmalcolm@redhat.com>
2024-05-30 | Add new text_art::tree_widget and use it in analyzer | David Malcolm | 57 | -33/+2071
This patch adds a new text_art::tree_widget, which makes it easy to generate hierarchical visualizations using either ASCII:

    +- Child 0
    |  +- Grandchild 0 0
    |  +- Grandchild 0 1
    |  `- Grandchild 0 2
    +- Child 1
    |  +- Grandchild 1 0
    |  +- Grandchild 1 1
    |  `- Grandchild 1 2
    `- Child 2
       +- Grandchild 2 0
       +- Grandchild 2 1
       `- Grandchild 2 2

or Unicode:

    Root
    ├─ Child 0
    │  ├─ Grandchild 0 0
    │  ├─ Grandchild 0 1
    │  ╰─ Grandchild 0 2
    ├─ Child 1
    │  ├─ Grandchild 1 0
    │  ├─ Grandchild 1 1
    │  ╰─ Grandchild 1 2
    ╰─ Child 2
       ├─ Grandchild 2 0
       ├─ Grandchild 2 1
       ╰─ Grandchild 2 2

potentially with colorization of the connecting lines.

It adds a new template for typename T:

    void text_art::dump<T> (const T&);

for using this to dump any object to stderr that supports a make_dump_widget method, with similar templates for dumping to a pretty_printer * and a FILE *.

It uses this within the analyzer to add two new families of dumping methods: one for program states, e.g.:

    (gdb) call state->dump()
    State
    ├─ Region Model
    │  ├─ Current Frame: frame: ‘calls_malloc’@2
    │  ├─ Store
    │  │  ├─ m_called_unknown_fn: false
    │  │  ├─ frame: ‘test’@1
    │  │  │  ╰─ _1: (INIT_VAL(n_2(D))*(size_t)4)
    │  │  ╰─ frame: ‘calls_malloc’@2
    │  │     ├─ result_4: &HEAP_ALLOCATED_REGION(27)
    │  │     ╰─ _5: &HEAP_ALLOCATED_REGION(27)
    │  ╰─ Dynamic Extents
    │     ╰─ HEAP_ALLOCATED_REGION(27): (INIT_VAL(n_2(D))*(size_t)4)
    ╰─ ‘malloc’ state machine
       ╰─ 0x468cb40: &HEAP_ALLOCATED_REGION(27): unchecked ({free}) (‘result_4’)

and the other for showing the detail of the recursive makeup of svalues and regions, e.g. the (INIT_VAL(n_2(D))*(size_t)4) from above:

    (gdb) call size_in_bytes->dump()
    (17): ‘long unsigned int’: binop_svalue(mult_expr: ‘*’)
    ├─ (15): ‘size_t’: initial_svalue
    │  ╰─ m_reg: (12): ‘size_t’: decl_region(‘n_2(D)’)
    │     ╰─ parent: (9): frame_region(‘test’, index: 0, depth: 1)
    │        ╰─ parent: (1): stack region
    │           ╰─ parent: (0): root region
    ╰─ (16): ‘size_t’: constant_svalue (‘4’)

I've already found both of these useful when debugging analyzer issues.
The patch uses the former to update the output of -fdump-analyzer-exploded-nodes-2 and -fdump-analyzer-exploded-nodes-3. The older dumping functions within the analyzer are retained in case they turn out to still be useful for debugging. gcc/ChangeLog: * Makefile.in (OBJS-libcommon): Add text-art/tree-widget.o. * doc/analyzer.texi: Rewrite discussion of dumping state to cover the text_art::tree_widget-based dumps, with a more interesting example. * text-art/dump-widget-info.h: New file. * text-art/dump.h: New file. * text-art/selftests.cc (selftest::text_art_tests): Call text_art_tree_widget_cc_tests. * text-art/selftests.h (selftest::text_art_tree_widget_cc_tests): New decl. * text-art/theme.cc (ascii_theme::get_cppchar): Handle the various cell_kind::TREE_*. (unicode_theme::get_cppchar): Likewise. * text-art/theme.h (enum class theme::cell_kind): Add TREE_CHILD_NON_FINAL, TREE_CHILD_FINAL, TREE_X_CONNECTOR, and TREE_Y_CONNECTOR. * text-art/tree-widget.cc: New file. gcc/analyzer/ChangeLog: * call-details.cc: Define INCLUDE_VECTOR. * call-info.cc: Likewise. * call-summary.cc: Likewise. * checker-event.cc: Likewise. * checker-path.cc: Likewise. * complexity.cc: Likewise. * constraint-manager.cc: Likewise. (bounded_range::make_dump_widget): New. (bounded_ranges::add_to_dump_widget): New. (equiv_class::make_dump_widget): New. (constraint::make_dump_widget): New. (bounded_ranges_constraint::make_dump_widget): New. (constraint_manager::make_dump_widget): New. * constraint-manager.h (bounded_range::make_dump_widget): New decl. (bounded_ranges::add_to_dump_widget): New decl. (equiv_class::make_dump_widget): New decl. (constraint::make_dump_widget): New decl. (bounded_ranges_constraint::make_dump_widget): New decl. (constraint_manager::make_dump_widget): New decl. * diagnostic-manager.cc: Define INCLUDE_VECTOR. * engine.cc: Likewise. Include "text-art/dump.h". (setjmp_svalue::print_dump_widget_label): New. (setjmp_svalue::add_dump_widget_children): New. 
(exploded_graph::dump_exploded_nodes): Use text_art::dump_to_file for -fdump-analyzer-exploded-nodes-2 and -fdump-analyzer-exploded-nodes-3. Fix overlong line. * feasible-graph.cc: Define INCLUDE_VECTOR. * infinite-recursion.cc: Likewise. * kf-analyzer.cc: Likewise. * kf-lang-cp.cc: Likewise. * kf.cc: Likewise. * known-function-manager.cc: Likewise. * pending-diagnostic.cc: Likewise. * program-point.cc: Likewise. * program-state.cc: Likewise. Include "text-art/tree-widget" and "text-art/dump.h". (sm_state_map::make_dump_widget): New. (program_state::dump): New. (program_state::make_dump_widget): New. * program-state.h: Include "text-art/widget.h". (sm_state_map::make_dump_widget): New decl. (program_state::dump): New decl. (program_state::make_dump_widget): New decl. * ranges.cc: Define INCLUDE_VECTOR. * record-layout.cc: Likewise. * region-model-asm.cc: Likewise. * region-model-manager.cc: Likewise. * region-model-reachability.cc: Likewise. * region-model.cc: Likewise. Include "text-art/tree-widget.h". (region_to_value_map::make_dump_widget): New. (region_model::dump): New. (region_model::make_dump_widget): New. (selftest::test_dump): Add test of dump_to_pp<region_model>. * region-model.h: Include "text-art/widget.h" and "text-art/dump.h". (region_to_value_map::make_dump_widget): New decl. (region_model::dump): New decl. (region_model::make_dump_widget): New decl. * region.cc: Define INCLUDE_VECTOR and include "text-art/dump.h". (region::dump): New. (region::make_dump_widget): New. (region::add_dump_widget_children): New. (frame_region::print_dump_widget_label): New. (globals_region::print_dump_widget_label): New. (code_region::print_dump_widget_label): New. (function_region::print_dump_widget_label): New. (label_region::print_dump_widget_label): New. (stack_region::print_dump_widget_label): New. (heap_region::print_dump_widget_label): New. (root_region::print_dump_widget_label): New. (thread_local_region::print_dump_widget_label): New. 
(symbolic_region::print_dump_widget_label): New. (symbolic_region::add_dump_widget_children): New. (decl_region::print_dump_widget_label): New. (field_region::print_dump_widget_label): New. (element_region::print_dump_widget_label): New. (element_region::add_dump_widget_children): New. (offset_region::print_dump_widget_label): New. (offset_region::add_dump_widget_children): New. (sized_region::print_dump_widget_label): New. (sized_region::add_dump_widget_children): New. (cast_region::print_dump_widget_label): New. (cast_region::add_dump_widget_children): New. (heap_allocated_region::print_dump_widget_label): New. (alloca_region::print_dump_widget_label): New. (string_region::print_dump_widget_label): New. (bit_range_region::print_dump_widget_label): New. (var_arg_region::print_dump_widget_label): New. (errno_region::print_dump_widget_label): New. (private_region::print_dump_widget_label): New. (unknown_region::print_dump_widget_label): New. * region.h: Include "text-art/widget.h". (region::dump): New decl. (region::make_dump_widget): New decl. (region::add_dump_widget_children): New decl. (frame_region::print_dump_widget_label): New decl. (globals_region::print_dump_widget_label): New decl. (code_region::print_dump_widget_label): New decl. (function_region::print_dump_widget_label): New decl. (label_region::print_dump_widget_label): New decl. (stack_region::print_dump_widget_label): New decl. (heap_region::print_dump_widget_label): New decl. (root_region::print_dump_widget_label): New decl. (thread_local_region::print_dump_widget_label): New decl. (symbolic_region::print_dump_widget_label): New decl. (symbolic_region::add_dump_widget_children): New decl. (decl_region::print_dump_widget_label): New decl. (field_region::print_dump_widget_label): New decl. (element_region::print_dump_widget_label): New decl. (element_region::add_dump_widget_children): New decl. (offset_region::print_dump_widget_label): New decl. (offset_region::add_dump_widget_children): New decl. 
(sized_region::print_dump_widget_label): New decl. (sized_region::add_dump_widget_children): New decl. (cast_region::print_dump_widget_label): New decl. (cast_region::add_dump_widget_children): New decl. (heap_allocated_region::print_dump_widget_label): New decl. (alloca_region::print_dump_widget_label): New decl. (string_region::print_dump_widget_label): New decl. (bit_range_region::print_dump_widget_label): New decl. (var_arg_region::print_dump_widget_label): New decl. (errno_region::print_dump_widget_label): New decl. (private_region::print_dump_widget_label): New decl. (unknown_region::print_dump_widget_label): New decl. * sm-fd.cc: Define INCLUDE_VECTOR. * sm-file.cc: Likewise. * sm-malloc.cc: Likewise. * sm-pattern-test.cc: Likewise. * sm-signal.cc: Likewise. * sm-taint.cc: Likewise. * sm.cc: Likewise. * state-purge.cc: Likewise. * store.cc: Likewise. Include "text-art/tree-widget.h". (add_binding_to_tree_widget): New. (binding_map::add_to_tree_widget): New. (binding_cluster::make_dump_widget): New. (store::make_dump_widget): New. * store.h: Include "text-art/tree-widget.h". (binding_map::add_to_tree_widget): New decl. (binding_cluster::make_dump_widget): New decl. (store::make_dump_widget): New decl. * svalue.cc: Define INCLUDE_VECTOR. Include "make-unique.h" and "text-art/dump.h". (svalue::dump): New. (svalue::make_dump_widget): New. (region_svalue::print_dump_widget_label): New. (region_svalue::add_dump_widget_children): New. (constant_svalue::print_dump_widget_label): New. (constant_svalue::add_dump_widget_children): New. (unknown_svalue::print_dump_widget_label): New. (unknown_svalue::add_dump_widget_children): New. (poisoned_svalue::print_dump_widget_label): New. (poisoned_svalue::add_dump_widget_children): New. (initial_svalue::print_dump_widget_label): New. (initial_svalue::add_dump_widget_children): New. (unaryop_svalue::print_dump_widget_label): New. (unaryop_svalue::add_dump_widget_children): New. (binop_svalue::print_dump_widget_label): New. 
(binop_svalue::add_dump_widget_children): New. (sub_svalue::print_dump_widget_label): New. (sub_svalue::add_dump_widget_children): New. (repeated_svalue::print_dump_widget_label): New. (repeated_svalue::add_dump_widget_children): New. (bits_within_svalue::print_dump_widget_label): New. (bits_within_svalue::add_dump_widget_children): New. (widening_svalue::print_dump_widget_label): New. (widening_svalue::add_dump_widget_children): New. (placeholder_svalue::print_dump_widget_label): New. (placeholder_svalue::add_dump_widget_children): New. (unmergeable_svalue::print_dump_widget_label): New. (unmergeable_svalue::add_dump_widget_children): New. (compound_svalue::print_dump_widget_label): New. (compound_svalue::add_dump_widget_children): New. (conjured_svalue::print_dump_widget_label): New. (conjured_svalue::add_dump_widget_children): New. (asm_output_svalue::print_dump_widget_label): New. (asm_output_svalue::add_dump_widget_children): New. (const_fn_result_svalue::print_dump_widget_label): New. (const_fn_result_svalue::add_dump_widget_children): New. * svalue.h: Include "text-art/widget.h". Add "using text_art::dump_widget_info". (svalue::dump): New decl. (svalue::make_dump_widget): New decl. (svalue::print_dump_widget_label): New decl. (svalue::print_dump_widget_label): New decl. (svalue::add_dump_widget_children): New decl. (region_svalue::print_dump_widget_label): New decl. (region_svalue::add_dump_widget_children): New decl. (constant_svalue::print_dump_widget_label): New decl. (constant_svalue::add_dump_widget_children): New decl. (unknown_svalue::print_dump_widget_label): New decl. (unknown_svalue::add_dump_widget_children): New decl. (poisoned_svalue::print_dump_widget_label): New decl. (poisoned_svalue::add_dump_widget_children): New decl. (initial_svalue::print_dump_widget_label): New decl. (initial_svalue::add_dump_widget_children): New decl. (unaryop_svalue::print_dump_widget_label): New decl. (unaryop_svalue::add_dump_widget_children): New decl. 
(binop_svalue::print_dump_widget_label): New decl. (binop_svalue::add_dump_widget_children): New decl. (sub_svalue::print_dump_widget_label): New decl. (sub_svalue::add_dump_widget_children): New decl. (repeated_svalue::print_dump_widget_label): New decl. (repeated_svalue::add_dump_widget_children): New decl. (bits_within_svalue::print_dump_widget_label): New decl. (bits_within_svalue::add_dump_widget_children): New decl. (widening_svalue::print_dump_widget_label): New decl. (widening_svalue::add_dump_widget_children): New decl. (placeholder_svalue::print_dump_widget_label): New decl. (placeholder_svalue::add_dump_widget_children): New decl. (unmergeable_svalue::print_dump_widget_label): New decl. (unmergeable_svalue::add_dump_widget_children): New decl. (compound_svalue::print_dump_widget_label): New decl. (compound_svalue::add_dump_widget_children): New decl. (conjured_svalue::print_dump_widget_label): New decl. (conjured_svalue::add_dump_widget_children): New decl. (asm_output_svalue::print_dump_widget_label): New decl. (asm_output_svalue::add_dump_widget_children): New decl. (const_fn_result_svalue::print_dump_widget_label): New decl. (const_fn_result_svalue::add_dump_widget_children): New decl. * trimmed-graph.cc: Define INCLUDE_VECTOR. * varargs.cc: Likewise. gcc/testsuite/ChangeLog: * gcc.dg/plugin/analyzer_cpython_plugin.c: Define INCLUDE_VECTOR. * gcc.dg/plugin/analyzer_gil_plugin.c: Likewise. * gcc.dg/plugin/analyzer_kernel_plugin.c: Likewise. * gcc.dg/plugin/analyzer_known_fns_plugin.c: Likewise. Signed-off-by: David Malcolm <dmalcolm@redhat.com>
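To give a feel for how the ASCII form of such a tree dump is produced, here is a minimal recursive sketch in plain C. It is not the text_art::tree_widget implementation (which is C++ and theme-based); `struct node`, `render_tree` and `render_children` are hypothetical names, and the exact spacing after the connectors is an assumption.

```c
#include <stdio.h>
#include <string.h>

struct node {
    const char *label;
    const struct node *children;
    size_t n_children;
};

/* Append one line per child to out, then recurse with an extended prefix:
   "|  " below a non-final child, "   " below the final child. */
static void render_children(const struct node *n, const char *prefix,
                            char *out, size_t cap)
{
    for (size_t i = 0; i < n->n_children; i++) {
        int last = (i + 1 == n->n_children);
        size_t len = strlen(out);
        snprintf(out + len, cap - len, "%s%s %s\n",
                 prefix, last ? "`-" : "+-", n->children[i].label);

        char child_prefix[128];
        snprintf(child_prefix, sizeof child_prefix, "%s%s",
                 prefix, last ? "   " : "|  ");
        render_children(&n->children[i], child_prefix, out, cap);
    }
}

/* Render the whole tree into out (capacity cap, always NUL-terminated). */
void render_tree(const struct node *root, char *out, size_t cap)
{
    if (cap == 0)
        return;
    snprintf(out, cap, "%s\n", root->label);
    render_children(root, "", out, cap);
}
```

The only state carried down the recursion is the prefix string, which is what makes the vertical connector lines continue past non-final children and stop below the final one.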
2024-05-30libgomp.texi: Impl. update for USM and missing 5.2 itemTobias Burnus1-2/+4
libgomp/ChangeLog: * libgomp.texi (OpenMP 5.0 status): Mark 'requires' as done and link to 'Offload-Target Specifics'. (OpenMP 5.2 status): Add item about additional map-type modifiers in 'declare mapper'.
2024-05-30[testsuite] [powerpc] adjust -m32 counts for fold-vec-extract*Alexandre Oliva7-16/+9
Codegen changes caused add instruction count mismatches on ppc-*-linux-gnu and other 32-bit ppc targets. At some point the expected counts were adjusted for lp64, but ilp32 differences remained, and published test results confirm it. for gcc/testsuite/ChangeLog PR testsuite/101169 * gcc.target/powerpc/fold-vec-extract-double.p7.c: Adjust addi counts for ilp32. * gcc.target/powerpc/fold-vec-extract-float.p7.c: Likewise. * gcc.target/powerpc/fold-vec-extract-float.p8.c: Likewise. * gcc.target/powerpc/fold-vec-extract-int.p7.c: Likewise. * gcc.target/powerpc/fold-vec-extract-int.p8.c: Likewise. * gcc.target/powerpc/fold-vec-extract-short.p7.c: Likewise. * gcc.target/powerpc/fold-vec-extract-short.p8.c: Likewise.
2024-05-30[testsuite] conditionalize dg-additional-sources on target and typeAlexandre Oliva0-0/+0
g++.dg/vect/pr95401.cc has dg-additional-sources, and that fails when check_vect_support_and_set_flags finds vector support lacking for execution tests: tests decay to compile tests, and additional sources are rejected by the compiler when compiling to a named output file. At first I considered using some effective target to conditionalize the additional sources. There was no support for target-specific additional sources, so I added that. But then, I found that adding an effective target to check whether the test involves linking would just make for busy work in this case, and so I went ahead and adjusted the handling of additional sources to refrain from adding them on compile tests, reporting them as unsupported. That solves the problem without using the newly-added machinery for per-target additional sources, but I figured since I'd implemented it I might as well contribute it, since there might be other uses for it. for gcc/ChangeLog * doc/sourcebuild.texi (dg-additional-sources): Document newly-added support for target selectors, and implicit discard on non-linking tests that name the compiler output explicitly. for gcc/testsuite/ChangeLog * lib/gcc-defs.exp (dg-additional-sources): Support target selectors. Make it cumulative. (dg-additional-files-options): Take dest and type. Note unsupported additional sources when not linking and naming the compiler output. Adjust source dirname prepending to cope with leading blanks. * lib/g++.exp (g++_target_compile): Pass dest and type on to dg-additional-files-options. * lib/gcc.exp (gcc_target_compile): Likewise. * lib/gdc.exp (gdb_target_compile): Likewise. * lib/gfortran.exp (gfortran_target_compile): Likewise. * lib/go.exp (go_target_compile): Likewise. * lib/obj-c++.exp (obj-c++_target_compile): Likewise. * lib/objc.exp (objc_target_compile): Likewise. * lib/rust.exp (rust_target_compile): Likewise. * lib/profopt.exp (profopt-execute): Likewise-ish.
2024-05-30[libstdc++-v3] [rtems] enable filesystem supportAlexandre Oliva2-0/+14
mkdir, chdir and chmod functions are defined in librtemscpu, that doesn't get linked in during libstdc++-v3 configure, but applications use -qrtems for linking, which brings those symbols in, so it makes sense to mark them as available so that the C++ filesystem APIs are enabled. for libstdc++-v3/ChangeLog * configure.ac [*-*-rtems*]: Set chdir, chmod and mkdir as available. * configure: Rebuilt.
2024-05-30Support vcond_mask_qiqi and friends.liuhongt2-0/+30
gcc/ChangeLog: * config/i386/sse.md (vcond_mask_<mode><mode>): New expander. gcc/testsuite/ChangeLog: * gcc.target/i386/pr114125.c: New test.
2024-05-30Don't reduce estimated unrolled size for innermost loop.liuhongt2-22/+86
For the innermost loop, complete unrolling will most likely not be able to reduce the body size to 2/3. The current 2/3 reduction causes some of the larger loops to be completely unrolled during cunrolli, which then prevents them from being vectorized. It also increases the register pressure. The patch moves the 2/3 reduction from estimated_unrolled_size to tree_unroll_loops_completely. gcc/ChangeLog: PR tree-optimization/112325 * tree-ssa-loop-ivcanon.cc (estimated_unrolled_size): Move the 2 / 3 loop body size reduction to .. (try_unroll_loop_completely): .. here, add it for the check of body size shrink, and the check of comparison against param_max_completely_peeled_insns when (!cunrolli ||loop->inner). (canonicalize_loop_induction_variables): Add new parameter cunrolli and pass down. (tree_unroll_loops_completely_1): Ditto. (canonicalize_induction_variables): Pass cunrolli as false to canonicalize_loop_induction_variables. (tree_unroll_loops_completely): Set cunrolli to true at beginning and set it to false after CHANGED is true. gcc/testsuite/ChangeLog: * gcc.dg/vect/pr112325.c: New test.
2024-05-30[testsuite] conditionalize dg-additional-sources on target and typeAlexandre Oliva11-15/+46
g++.dg/vect/pr95401.cc has dg-additional-sources, and that fails when check_vect_support_and_set_flags finds vector support lacking for execution tests: tests decay to compile tests, and additional sources are rejected by the compiler when compiling to a named output file. At first I considered using some effective target to conditionalize the additional sources. There was no support for target-specific additional sources, so I added that. But then, I found that adding an effective target to check whether the test involves linking would just make for busy work in this case, and so I went ahead and adjusted the handling of additional sources to refrain from adding them on compile tests, reporting them as unsupported. That solves the problem without using the newly-added machinery for per-target additional sources, but I figured since I'd implemented it I might as well contribute it, since there might be other uses for it. for gcc/ChangeLog * doc/sourcebuild.texi (dg-additional-sources): Document newly-added support for target selectors, and implicit discard on non-linking tests that name the compiler output explicitly. for gcc/testsuite/ChangeLog * lib/gcc-defs.exp (dg-additional-sources): Support target selectors. Make it cumulative. (dg-additional-files-options): Take dest and type. Note unsupported additional sources when not linking and naming the compiler output. Adjust source dirname prepending to cope with leading blanks. * lib/g++.exp (g++_target_compile): Pass dest and type on to dg-additional-files-options. * lib/gcc.exp (gcc_target_compile): Likewise. * lib/gdc.exp (gdb_target_compile): Likewise. * lib/gfortran.exp (gfortran_target_compile): Likewise. * lib/go.exp (go_target_compile): Likewise. * lib/obj-c++.exp (obj-c++_target_compile): Likewise. * lib/objc.exp (objc_target_compile): Likewise. * lib/rust.exp (rust_target_compile): Likewise. * lib/profopt.exp (profopt-execute): Likewise-ish.
2024-05-30tree-ssa-pre.c/115214(ICE in find_or_generate_expression, at ↵Jiawei2-3/+59
tree-ssa-pre.c:2780): Return NULL_TREE when deal special cases. Return NULL_TREE when genop3 equal EXACT_DIV_EXPR. https://gcc.gnu.org/pipermail/gcc-patches/2024-May/652641.html version log v3: remove additional POLY_INT_CST check. https://gcc.gnu.org/pipermail/gcc-patches/2024-May/652795.html gcc/ChangeLog: * tree-ssa-pre.cc (create_component_ref_by_pieces_1): New conditions. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/vsetvl/pr115214.c: New test.
2024-05-30Revert "resource.cc: Replace calls to find_basic_block with cfgrtl ↵Hans-Peter Nilsson1-10/+56
BLOCK_FOR_INSN" This reverts commit 933ab59c59bdc1ac9e3ca3a56527836564e1821b.
2024-05-30Revert "resource.cc (mark_target_live_regs): Remove check for bb not found"Hans-Peter Nilsson1-132/+138
This reverts commit e1abce5b6ad8f5aee86ec7729b516d81014db09e.
2024-05-30Revert "resource.cc: Remove redundant conditionals"Hans-Peter Nilsson1-52/+71
This reverts commit 802a98d128f9b0eea2432f6511328d14e0bd721b.
2024-05-30Daily bump.GCC Administrator13-1/+1216
2024-05-29C23: fix aliasing for structures/unions with incomplete typesMartin Uecker2-3/+76
When incomplete structure/union types are completed later, compatibility of struct types that contain pointers to such types changes. When forming equivalence classes for TYPE_CANONICAL, we therefore need to be conservative and treat all structs with the same tag which are pointer targets as equivalent for purposes of determining equivalency of structure/union types which contain such types as members. This avoids having to update TYPE_CANONICAL of such structure/unions recursively. The pointer types themselves are updated in c_update_type_canonical. gcc/c/ * c-typeck.cc (comptypes_internal): Add flag to track whether a struct is the target of a pointer. (tagged_types_tu_compatible): When forming equivalence classes, treat nested pointed-to structs as equivalent. gcc/testsuite/ * gcc.dg/c23-tag-incomplete-alias-1.c: New test.
2024-05-30MIPS16: Mark $2/$3 as clobbered if GP is usedYunQiang Su1-1/+10
PR Target/84790. The gp init sequence li $2,%hi(_gp_disp) addiu $3,$pc,%lo(_gp_disp) sll $2,16 addu $2,$3 is generated directly in `mips_output_function_prologue`, and does not appear in the RTL. So the IRA/IPA passes are not aware that $2/$3 have been clobbered, and they may be used across (local) function calls. Let's mark $2/$3 as clobbered both: - Just after the UNSPEC_GP RTL of a function; - Just after a function call. Reported-by: Matthias Schiffer <mschiffer@universe-factory.net> Origin-Patch-by: Felix Fietkau <nbd@nbd.name>. gcc * config/mips/mips.cc (mips16_gp_pseudo_reg): Mark MIPS16_PIC_TEMP and MIPS_PROLOGUE_TEMP clobbered. (mips_emit_call_insn): Mark MIPS16_PIC_TEMP and MIPS_PROLOGUE_TEMP clobbered if MIPS16 and CALL_CLOBBERED_GP.
2024-05-30MIPS/testsuite: Fix bseli.b fail in msa-builtins.cYunQiang Su1-1/+1
commit 05daf617ea22e1d818295ed2d037456937e23530 Author: Jeff Law <jlaw@ventanamicro.com> Date: Sat May 25 12:39:05 2024 -0600 [committed] [v2] More logical op simplifications in simplify-rtx.cc does some simplifications, after which `bseli.b $w1,$w0,255` is found to be the same as `or.v $w1,$w0,$w1`. So no bseli.b instruction is generated. Let's use 254 instead of 255 to test the generation of `bseli.b`. gcc/testsuite * gcc.target/mips/msa-builtins.c: Use 254 instead of 255 for bseli.b, as `bseli.b $w0,$w1,255` is the same as `or.v $w0,$w0,$w1`.
2024-05-29PR modula2/115276 bugfix libgm2 wraptime.InitTM returns NILGaius Mulley5-3/+59
This patch fixes libgm2/libm2iso/wraptime.cc:InitTM so that it does not always return NULL. The incorrect autoconf macro was used (inside InitTM) and the function short circuited to return NULL. The fix is to use HAVE_SYS_TIME_H and use AC_HEADER_TIME in libgm2/configure.ac. libgm2/ChangeLog: PR modula2/115276 * config.h.in: Regenerate. * configure: Regenerate. * configure.ac: Use AC_HEADER_TIME. * libm2iso/wraptime.cc (InitTM): Check HAVE_SYS_TIME_H before using struct tm to obtain the size. gcc/testsuite/ChangeLog: PR modula2/115276 * gm2/isolib/run/pass/testinittm.mod: New test. Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>
2024-05-29match: Add support for `a ^ CST` to bitwise_inverted_equal_p [PR115224]Andrew Pinski4-0/+42
While looking into something else, I noticed that `a ^ CST` needed special casing in bitwise_inverted_equal_p, as its bitwise not simplifies to `a ^ ~CST`. Bootstrapped and tested on x86_64-linux-gnu with no regressions. PR tree-optimization/115224 gcc/ChangeLog: * generic-match-head.cc (bitwise_inverted_equal_p): Add `a ^ CST` case. * gimple-match-head.cc (gimple_bit_xor_cst): New declaration. (gimple_bitwise_inverted_equal_p): Add `a ^ CST` case. * match.pd (bit_xor_cst): New match. (maybe_bit_not): Add bit_xor_cst case. gcc/testsuite/ChangeLog: * gcc.dg/tree-ssa/bitops-8.c: New test. Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
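The identity behind this change can be checked with a small C sketch. The constant `CST` and the function names below are illustrative assumptions, not code from the patch; the sketch only shows that the bitwise not of `a ^ CST` equals `a ^ ~CST`.

```c
#include <assert.h>
#include <stdint.h>

/* Arbitrary constant chosen for illustration; not taken from the patch. */
#define CST UINT32_C(0x0f0f1234)

/* One operand of the comparison: a ^ CST. */
uint32_t xor_cst(uint32_t a) { return a ^ CST; }

/* The form its bitwise not folds to: a ^ ~CST. */
uint32_t xor_not_cst(uint32_t a) { return a ^ (uint32_t)~CST; }

/* Returns 1 when ~(a ^ CST) and a ^ ~CST agree for this value of a. */
int inverted_equal(uint32_t a)
{
    return (uint32_t)~xor_cst(a) == xor_not_cst(a);
}

/* Hypothetical self-check over a few sample inputs. */
void check_samples(void)
{
    assert(inverted_equal(0));
    assert(inverted_equal(UINT32_C(0xdeadbeef)));
}
```

Because the two forms are exact bitwise inverses, expressions such as `(a ^ CST) & (a ^ ~CST)` always evaluate to zero, which is the kind of pattern the matcher can now recognize.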
2024-05-29Match: Add maybe_bit_not instead of plain matchingAndrew Pinski1-4/+10
While working on adding matching of negative expressions of `a - b`, I noticed that we started to have "duplicated" patterns due to not having a way to match maybe negative expressions. So I went back to what I did for bit_not and decided to improve the situation there so for some patterns where we had 2 operands of an expression where one could have been a bit_not, add back maybe_bit_not. This does not add maybe_bit_not in every place where bitwise_inverted_equal_p is used, just the ones where 2 operands of an expression could be swapped. Bootstrapped and tested on x86_64-linux-gnu with no regressions. gcc/ChangeLog: * match.pd (bit_not_with_nop): Unconditionalize. (maybe_cmp): Likewise. (maybe_bit_not): New match pattern. (`~X & X`): Use maybe_bit_not and add `:c` back. (`~x ^ x`/`~x | x`): Likewise. Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
2024-05-29aarch64: Split aarch64_combinev16qi before RA [PR115258]Richard Sandiford3-16/+34
Two-vector TBL instructions are fed by an aarch64_combinev16qi, whose purpose is to put the two input data vectors into consecutive registers. This aarch64_combinev16qi was then split after reload into individual moves (from the first input to the first half of the output, and from the second input to the second half of the output). In the worst case, the RA might allocate things so that the destination of the aarch64_combinev16qi is the second input followed by the first input. In that case, the split form of aarch64_combinev16qi uses three eors to swap the registers around. This PR is about a test where this worst case occurred. And given the insn description, that allocation doesn't seem unreasonable. early-ra should (hopefully) mean that we're now better at allocating subregs of vector registers. The upcoming RA subreg patches should improve things further. The best fix for the PR therefore seems to be to split the combination before RA, so that the RA can see the underlying moves. Perhaps it even makes sense to do this at expand time, avoiding the need for aarch64_combinev16qi entirely. That deserves more experimentation though. gcc/ PR target/115258 * config/aarch64/aarch64-simd.md (aarch64_combinev16qi): Allow the split before reload. * config/aarch64/aarch64.cc (aarch64_split_combinev16qi): Generalize into a form that handles pseudo registers. gcc/testsuite/ PR target/115258 * gcc.target/aarch64/pr115258.c: New test.
2024-05-29libstdc++: Use RAII to replace try/catch blocksFrançois Dumont2-100/+55
Move _Guard into the std::vector declaration and use it to guard all calls to vector _M_allocate. That way the compiler has more visibility into what is done with the pointers and no longer raises the -Wfree-nonheap-object warning. libstdc++-v3/ChangeLog: * include/bits/vector.tcc (_Guard): Move all the nested duplicated class... * include/bits/stl_vector.h (_Guard_alloc): ...here and rename. (_M_allocate_and_copy): Use latter. (_M_initialize_dispatch): Small code simplification. (_M_range_initialize): Likewise and set _M_finish first from the result of __uninitialize_fill_n_a that can throw.
2024-05-29Delete a file due to push errorFeng Xue1-0/+0
gcc/ * tree-vect-loop.c : Removed.
2024-05-29vect: Unify bbs in loop_vec_info and bb_vec_infoFeng Xue6-128/+74
Both derived classes have their own "bbs" field, which serve exactly the same purpose of recording all basic blocks inside the corresponding vect region, but the fields have different data types: one is a plain array, the other is an auto_vec. This difference causes some duplicated code even when handling the same things, mostly in tree-vect-patterns. One refinement is to lift this field into the base class "vec_info" and set its value to the contiguous memory area pointed to by the two old "bbs" fields in each constructor of the derived classes. 2024-05-16 Feng Xue <fxue@os.amperecomputing.com> gcc/ * tree-vect-loop.cc (_loop_vec_info::_loop_vec_info): Move initialization of bbs to explicit construction code. Adjust the definition of nbbs. (update_epilogue_loop_vinfo): Update nbbs for epilog vinfo. * tree-vect-patterns.cc (vect_determine_precisions): Make loop_vec_info and bb_vec_info share same code. (vect_pattern_recog): Remove duplicated vect_pattern_recog_1 loop. * tree-vect-slp.cc (vect_get_and_check_slp_defs): Access to bbs[0] via base vec_info class. (_bb_vec_info::_bb_vec_info): Initialize bbs and nbbs using data fields of input auto_vec<> bbs. (vect_slp_region): Use access to nbbs to replace original bbs.length(). (vect_schedule_slp_node): Access to bbs[0] via base vec_info class. * tree-vectorizer.cc (vec_info::vec_info): Add initialization of bbs and nbbs. (vec_info::insert_seq_on_entry): Access to bbs[0] via base vec_info class. * tree-vectorizer.h (vec_info): Add new fields bbs and nbbs. (LOOP_VINFO_NBBS): New macro. (BB_VINFO_BBS): Rename BB_VINFO_BB to BB_VINFO_BBS. (BB_VINFO_NBBS): New macro. (_loop_vec_info): Remove field bbs. (_bb_vec_info): Rename field bbs.
2024-05-29c++: pragma target and static init [PR109753]Jason Merrill3-0/+15
#pragma target and optimize should also apply to implicitly-generated functions like static initialization functions and defaulted special member functions. The handle_optimize_attribute change is necessary to avoid regressing g++.dg/opt/pr105306.C; maybe_clone_body creates a cgraph_node for the ~B alias before handle_optimize_attribute, and the alias never goes through finalize_function, so we need to adjust semantic_interposition somewhere else. PR c++/109753 gcc/c-family/ChangeLog: * c-attribs.cc (handle_optimize_attribute): Set cgraph_node::semantic_interposition. gcc/cp/ChangeLog: * decl.cc (start_preparsed_function): Call decl_attributes. gcc/testsuite/ChangeLog: * g++.dg/opt/always_inline1.C: New test.
2024-05-29[to-be-committed] [RISC-V] Use pack to handle repeating constantsJeff Law3-1/+52
This patch utilizes zbkb to improve the code we generate for 64bit constants when the high half is a duplicate of the low half. Basically we generate the low half and use a pack instruction with that same register repeated. ie pack dest,src,src That gives us a maximum sequence of 3 instructions and sometimes it will be just 2 instructions (say if the low 32bits can be constructed with a single addi or lui). As with shadd, I'm abusing an RTL opcode. This time it's CONCAT. It's reasonably close to what we're doing. Obviously it's just how we identify the desire to generate a pack in the array of opcodes. We don't actually emit a CONCAT. Note that we don't care about the potential sign extension from bit 31. pack will only look at bits 0..31 of each input (for rv64). So we go ahead and sign extend before synthesizing the low part as that allows us to handle more cases trivially. I had my testsuite generator chew on random cases of a repeating constant without any surprises. I don't see much point in including all those in the testcase (after all there's 2**32 of them). I've got a set of 10 I'm including. Nothing particularly interesting in them. An enterprising developer that needs this improved without zbkb could probably do so with a bit of work. First increase the cost by 1 unit. Second avoid cases where bit 31 is set and restrict it to cases when we can still create pseudos. On the codegen side, when encountering the CONCAT, generate the appropriate shift of "X" into a temporary register, then IOR the temporary with "X" into the new destination. Anyway, I've tested this in my tester (though it doesn't turn on zbkb, yet). I'll let the CI system chew on it overnight, but like mine, I don't think it lights up zbkb. So it's unlikely to spit out anything interesting. gcc/ * config/riscv/crypto.md (riscv_xpack_<X:mode>_<HX:mode>_2): Remove '*' allow it to be used via the gen_* interface. 
* config/riscv/riscv.cc (riscv_build_integer): Identify when Zbkb can be used to profitably synthesize repeating constants. (riscv_move_integer): Codegen changes to generate those Zbkb sequences. gcc/testsuite/ * gcc.target/riscv/synthesis-9.c: New test.
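The repeating-constant idea can be modelled in plain C. This is a sketch under stated assumptions: `pack_model` imitates the rv64 `pack` semantics described above (bits 0..31 of each input, second operand in the upper half), and all function names are hypothetical rather than taken from riscv.cc.

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 if the high 32 bits of value duplicate the low 32 bits. */
int is_repeating_constant(uint64_t value)
{
    return (uint32_t)(value >> 32) == (uint32_t)value;
}

/* Software model of "pack rd,rs1,rs2" on rv64: only bits 0..31 of
   each input are used; rs1 fills the low half, rs2 the high half. */
uint64_t pack_model(uint64_t rs1, uint64_t rs2)
{
    return (rs1 & UINT64_C(0xffffffff)) | (rs2 << 32);
}

/* Build the low half once (sign-extended, as the commit message
   describes), then pack the same register with itself. */
uint64_t synthesize_repeating(uint64_t value)
{
    assert(is_repeating_constant(value));
    uint64_t low = (uint64_t)(int64_t)(int32_t)(uint32_t)value;
    return pack_model(low, low);
}
```

The sign extension of the low half is harmless here because the model, like the instruction, ignores bits 32..63 of both inputs.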
2024-05-29c++: add module extensionsJason Merrill6-25/+30
There is a trend in the broader C++ community to use a different extension for module interface units, even though (in GCC) they are compiled in the same way as other source files. Let's recognize these extensions as C++. .ixx is the MSVC standard, while the .c*m are supported by Clang. libc++ standard headers use .cppm, as their other source files use .cpp. Perhaps libstdc++ might use .ccm for parallel consistency? One issue with .c++m is that libcpp/mkdeps.cc has been using it for the phony dependencies to express module dependencies, so I'm changing mkdeps to something less likely to be an actual file, ".c++-module". gcc/cp/ChangeLog: * lang-specs.h: Add module interface extensions. gcc/ChangeLog: * doc/invoke.texi: Update module extension docs. libcpp/ChangeLog: * mkdeps.cc (make_write): Change .c++m to .c++-module. gcc/testsuite/ChangeLog: * g++.dg/modules/dep-1_a.C * g++.dg/modules/dep-1_b.C * g++.dg/modules/dep-2.C: Change .c++m to .c++-module.
2024-05-29libgomp: Enable USM for AMD APUs and MI200 devicesTobias Burnus4-4/+28
If HSA_AMD_SYSTEM_INFO_SVM_ACCESSIBLE_BY_DEFAULT is true, all GPUs on the system support unified shared memory. That's the case for APUs and MI200 devices when XNACK is enabled. XNACK can be enabled by setting HSA_XNACK=1 as env var for supported devices; otherwise, if disabled, USM code will use host fallback. gcc/ChangeLog: * config/gcn/gcn-hsa.h (gcn_local_sym_hash): Fix typo. include/ChangeLog: * hsa.h (HSA_AMD_SYSTEM_INFO_SVM_ACCESSIBLE_BY_DEFAULT): Add enum value. libgomp/ChangeLog: * libgomp.texi (gcn): Update USM handling * plugin/plugin-gcn.c (GOMP_OFFLOAD_get_num_devices): Handle USM if HSA_AMD_SYSTEM_INFO_SVM_ACCESSIBLE_BY_DEFAULT is true.
2024-05-29libgomp: Enable USM for some nvptx devicesTobias Burnus4-4/+45
A few high-end nvptx devices support the attribute CU_DEVICE_ATTRIBUTE_PAGEABLE_MEMORY_ACCESS; for those, unified shared memory is supported in hardware. This patch enables support for those - if all installed nvptx devices have this feature (as the capabilities are per device type). This exposes a bug in gomp_copy_back_icvs, as it previously used omp_get_mapped_ptr to find mapped variables, but that returns the unchanged pointer in case of shared memory. But in this case, we have a few actually mapped pointers - like the ICV variables. Additionally, there was a mismatch with regards to '-1' for the device number as gomp_copy_back_icvs and omp_get_mapped_ptr count differently. Hence, do the lookup manually. include/ChangeLog: * cuda/cuda.h (CU_DEVICE_ATTRIBUTE_PAGEABLE_MEMORY_ACCESS): Add. libgomp/ChangeLog: * libgomp.texi (nvptx): Update USM description. * plugin/plugin-nvptx.c (GOMP_OFFLOAD_get_num_devices): Claim support when requesting USM and all devices support CU_DEVICE_ATTRIBUTE_PAGEABLE_MEMORY_ACCESS. * target.c (gomp_copy_back_icvs): Fix device ptr lookup. (gomp_target_init): Set GOMP_OFFLOAD_CAP_SHARED_MEM if the device supports USM.
2024-05-29c-family: add hints for strerrorOskari Pirhonen3-0/+11
Add proper hints for implicit declaration of strerror. The results could be confusing depending on the other included headers. These example messages are from compiling a trivial program to print the string for an errno value. It only includes stdio.h (cstdio for C++). Before: $ /tmp/gcc-master/bin/gcc test.c -o test_c test.c: In function ‘main’: test.c:4:20: warning: implicit declaration of function ‘strerror’; did you mean ‘perror’? [-Wimplicit-function-declaration] 4 | printf("%s\n", strerror(0)); | ^~~~~~~~ | perror $ /tmp/gcc-master/bin/g++ test.cpp -o test_cpp test.cpp: In function ‘int main()’: test.cpp:4:20: error: ‘strerror’ was not declared in this scope; did you mean ‘stderr’? 4 | printf("%s\n", strerror(0)); | ^~~~~~~~ | stderr After: $ /tmp/gcc-known-headers/bin/gcc test.c -o test_c test.c: In function ‘main’: test.c:4:20: warning: implicit declaration of function ‘strerror’ [-Wimplicit-function-declaration] 4 | printf("%s\n", strerror(0)); | ^~~~~~~~ test.c:2:1: note: ‘strerror’ is defined in header ‘<string.h>’; this is probably fixable by adding ‘#include <string.h>’ 1 | #include <stdio.h> +++ |+#include <string.h> 2 | $ /tmp/gcc-known-headers/bin/g++ test.cpp -o test_cpp test.cpp: In function ‘int main()’: test.cpp:4:20: error: ‘strerror’ was not declared in this scope 4 | printf("%s\n", strerror(0)); | ^~~~~~~~ test.cpp:2:1: note: ‘strerror’ is defined in header ‘<cstring>’; this is probably fixable by adding ‘#include <cstring>’ 1 | #include <cstdio> +++ |+#include <cstring> 2 | gcc/c-family/ChangeLog: * known-headers.cc (get_stdlib_header_for_name): Add strerror. gcc/testsuite/ChangeLog: * g++.dg/spellcheck-stdlib.C: Add check for strerror. * gcc.dg/spellcheck-stdlib-2.c: New test. Signed-off-by: Oskari Pirhonen <xxc3ncoredxx@gmail.com>
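For reference, a version of such a program with the header that the new note recommends might look like the sketch below. The helper `format_errno` is a hypothetical name introduced for illustration; only the `#include <string.h>` line reflects the fix the diagnostic suggests.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>  /* declares strerror, as the new note suggests */

/* Hypothetical helper: writes "<errnum>: <message>" into buf and
   returns the number of characters snprintf reports. */
int format_errno(char *buf, size_t size, int errnum)
{
    assert(buf != NULL && size > 0);
    return snprintf(buf, size, "%d: %s", errnum, strerror(errnum));
}
```

With `<string.h>` included, `strerror` has a visible declaration, so neither the implicit-declaration warning nor the misleading `perror` suggestion applies.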
2024-05-29tree-optimization/115252 - enhance peeling for gaps avoidanceRichard Biener2-30/+46
Code generation for contiguous load vectorization can already deal with generalized avoidance of loading from a gap. The following extends detection of peeling for gaps requirement with that, gets rid of the old special casing of a half load and makes sure when we do access the gap we have peeling for gaps enabled. PR tree-optimization/115252 * tree-vect-stmts.cc (get_group_load_store_type): Enhance detecting the number of cases where we can avoid accessing a gap during code generation. (vectorizable_load): Remove old half-vector peeling for gap avoidance which is now redundant. Add gap-aligned case where it's OK to access the gap. Add assert that we have peeling for gaps enabled when we access a gap. * gcc.dg/vect/slp-gap-1.c: New testcase.
2024-05-29tree-optimization/114435 - pcom left around copies confusing SLPRichard Biener2-0/+40
The following arranges for the pre-SLP vectorization scalar cleanup to be run when predictive commoning was applied to a loop in the function. This is similar to the complete unroll situation and facilitating SLP vectorization. Avoiding the SSA copies in predictive commoning itself isn't easy (and predcom also sometimes unrolls, asking for scalar cleanup). PR tree-optimization/114435 * tree-predcom.cc (tree_predictive_commoning): Queue the next scalar cleanup sub-pipeline to be run when we did something. * gcc.dg/vect/bb-slp-pr114435.c: New testcase.