Age | Commit message | Author | Files | Lines
2022-06-29 | libgompd: Fix Access Bugs [devel/omp/ompd] | Mohame Atef | 1 | -3/+3
libgomp/ChangeLog 2022-06-29 Mohamed Atef <mohamedatef1698@gmail.com> * ompd-helper.c (gompd_is_final, gompd_is_implicit, gompd_get_team_size): Change is_ptr from 1 to 0 in ACCESS_VALUE. Signed-off-by: Mohamed Atef <mohamedatef1698@gmail.com>
2022-06-20 | Fix sizes in OMPD support and add local ICVs functions. | Mohame Atef | 6 | -51/+462
This patch fixes some undesirable implementation details in OMPD support, and adds local ICVs functions, e.g. per-thread, per-task, and per-parallel region ICVs. libgomp/ChangeLog 2022-06-17 Mohamed Atef <mohamedatef1698@gmail.com> * ompd-helper.h (DEREFERENCE, ACCESS_VALUE): New macros. (gompd_get_proc_bind): Change the returned value from ompd_word_t to const char *. (gompd_get_max_task_priority): Fix format. (gompd_stringize_gompd_enabled): Removed. (gompd_get_gompd_enabled): New function prototype. * ompd-helper.c (gompd_get_affinity_format): Call CHECK_RET. Fix format in gompd_enabled GET_VALUE. (gompd_stringize_gompd_enabled): Removed. (gompd_get_nthread, gompd_get_thread_limit, gompd_get_run_sched, gompd_get_run_sched_chunk_size, gompd_get_default_device, gompd_get_dynamic, gompd_get_max_active_levels, gompd_get_proc_bind, gompd_is_final, gompd_is_implicit, gompd_get_team_size): New functions. (gompd_get_gompd_enabled): Change the returned value from ompd_word_t to const char *. * ompd-init.c (ompd_process_initialize): Use sizeof_short instead of sizeof_long_long in GET_VALUE argument. * ompd-support.h: Change type from __UINT64_TYPE__ to unsigned short. (GOMPD_FOREACH_ACCESS): Add entries for gomp_task kind and final_task and gomp_team nthreads. * ompd-support.c (gompd_get_offset, gompd_get_sizeof_member, gompd_get_size, OMPD_SECTION): Define. (gompd_access_gomp_thread_handle, gompd_sizeof_gomp_thread_handle): New variables. (gompd_state): Change type from __UINT64_TYPE__ to unsigned short. (gompd_load): Remove gompd_init_access, gompd_init_sizeof_members, gompd_init_sizes, gompd_access_gomp_thread_handle, gompd_sizeof_gomp_thread_handle. * ompd-icv.c (ompd_get_icv_from_scope): Add thread_handle, task_handle and parallel_handle. Fix format in ashandle definition.
Call gompd_get_nthread, gompd_get_thread_limit, gompd_get_run_sched, gompd_get_run_sched_chunk_size, gompd_get_default_device, gompd_get_dynamic, gompd_get_max_active_levels, gompd_get_proc_bind, gompd_is_final, gompd_is_implicit, and gompd_get_team_size. (ompd_get_icv_string_from_scope): Fix format in ashandle definition. Add task_handle. Call gompd_get_gompd_enabled, and gompd_get_proc_bind. Remove the call to gompd_stringize_gompd_enabled.
2022-05-23 | Add OMPD support, initialization and global ICVs function. | Mohame Atef | 19 | -15/+1514
This commit adds OMPD support so that the debugger can successfully load libgompd (the libgomp OMPD implementation). It also initializes OMPD, so the debugger can now load an OpenMP program or a core file. Finally, it adds global ICV functions, so the debugger can now query information about global ICVs (number of threads, stack size, etc.). libgomp/ChangeLog 2022-05-23 Mohamed Atef <mohamedatef1698@gmail.com> * config/darwin/plugin-suffix.h (SONAME_SUFFIX): Remove ()s. * config/hpux/plugin-suffix.h (SONAME_SUFFIX): Remove ()s. * config/posix/plugin-suffix.h (SONAME_SUFFIX): Remove ()s. * configure: Regenerate. * Makefile.am (toolexeclib_LTLIBRARIES): Add libgompd.la. (libgompd_la_LDFLAGS, libgompd_la_DEPENDENCIES, libgompd_la_LINK, libgompd_la_SOURCES, libgompd_version_dep, libgompd_version_script, libgompd.ver-sun, libgompd.ver, libgompd_version_info): New. * Makefile.in: Regenerate. * env.c: Include ompd-support.h. (parse_debug): New function. (gompd_enabled): New Variable. (initialize_env): Call gompd_load. (initialize_env): Call parse_debug. * team.c: Include ompd-support.h. (gomp_team_start): Call ompd_bp_parallel_begin. (gomp_team_end): Call ompd_bp_parallel_end. (gomp_thread_start): Call ompd_bp_thread_start. * libgomp.map: Add OMP_5.0.3 symbol versions. * libgompd.map: New. * omp-tools.h.in: New. * ompd-types.h.in: New. * ompd-support.h: New. * ompd-support.c: New. * ompd-helper.h: New. * ompd-helper.c: New. * ompd-init.c: New. * ompd-icv.c: New. * configure.ac (AC_CONFIG_FILES): Add omp-tools.h and ompd-types.h. Signed-off-by: Mohamed Atef <mohamedatef1698@gmail.com>
2022-05-23 | jit: use 'final' and 'override' where appropriate | David Malcolm | 1 | -5/+5
gcc/jit/ChangeLog: * jit-recording.h: Add "final" and "override" to all vfunc implementations that were missing them, as appropriate. Signed-off-by: David Malcolm <dmalcolm@redhat.com>
2022-05-23 | analyzer: use 'final' and 'override' where appropriate | David Malcolm | 7 | -13/+16
gcc/analyzer/ChangeLog: * call-info.cc: Add "final" and "override" to all vfunc implementations that were missing them, as appropriate. * engine.cc: Likewise. * region-model.cc: Likewise. * sm-malloc.cc: Likewise. * supergraph.h: Likewise. * svalue.cc: Likewise. * varargs.cc: Likewise. Signed-off-by: David Malcolm <dmalcolm@redhat.com>
2022-05-23 | [x86_64]: Zhaoxin lujiazui enablement | Mayshao | 18 | -47/+1159
This patch fixes the Zhaoxin CPU vendor ID detection problem and adds support for the Zhaoxin "lujiazui" processor. Currently GCC cannot recognize a Zhaoxin CPU (vendor ID "CentaurHauls" or "Shanghai") when the user passes -march=native, which is confusing for users. This patch enables -march=native on Zhaoxin family 7 processors and adds -march/-mtune=lujiazui; costs and tuning are set according to the characteristics of the processor. A new md file describes the lujiazui pipeline. Testing: Bootstrap is OK, with no regressions in the i386/x86-64 testsuite. Background: Related Zhaoxin linux kernel patch can be found at: https://lore.kernel.org/lkml/01042674b2f741b2aed1f797359bdffb@zhaoxin.com/ Related Zhaoxin glibc patch can be found at: https://sourceware.org/git/?p=glibc.git;a=commit;h=32ac0b988466785d6e3cc1dffc364bb26fc63193 gcc/ChangeLog: * common/config/i386/cpuinfo.h (get_zhaoxin_cpu): Detect the specific type of Zhaoxin CPU, and return Zhaoxin CPU name. (cpu_indicator_init): Handle Zhaoxin processors. * common/config/i386/i386-common.cc: Add lujiazui. * common/config/i386/i386-cpuinfo.h (enum processor_vendor): Add VENDOR_ZHAOXIN. (enum processor_types): Add ZHAOXIN_FAM7H. (enum processor_subtypes): Add ZHAOXIN_FAM7H_LUJIAZUI. * config.gcc: Add lujiazui. * config/i386/cpuid.h (signature_SHANGHAI_ebx): Add signatures for Zhaoxin. (signature_SHANGHAI_ecx): Ditto. (signature_SHANGHAI_edx): Ditto. * config/i386/driver-i386.cc (host_detect_local_cpu): Let -march=native recognize lujiazui processors. * config/i386/i386-c.cc (ix86_target_macros_internal): Add lujiazui. * config/i386/i386-options.cc (m_LUJIAZUI): New definition. * config/i386/i386.h (enum processor_type): Ditto. * config/i386/i386.md: Add lujiazui. * config/i386/x86-tune-costs.h (struct processor_costs): Add lujiazui costs. * config/i386/x86-tune-sched.cc (ix86_issue_rate): Add lujiazui. (ix86_adjust_cost): Ditto. * config/i386/x86-tune.def (X86_TUNE_SCHEDULE): Add lujiazui tunings.
(X86_TUNE_PARTIAL_REG_DEPENDENCY): Ditto. (X86_TUNE_SSE_PARTIAL_REG_DEPENDENCY): Ditto. (X86_TUNE_SSE_PARTIAL_REG_FP_CONVERTS_DEPENDENCY): Ditto. (X86_TUNE_SSE_PARTIAL_REG_CONVERTS_DEPENDENCY): Ditto. (X86_TUNE_MOVX): Ditto. (X86_TUNE_MEMORY_MISMATCH_STALL): Ditto. (X86_TUNE_FUSE_CMP_AND_BRANCH_32): Ditto. (X86_TUNE_FUSE_CMP_AND_BRANCH_64): Ditto. (X86_TUNE_FUSE_CMP_AND_BRANCH_SOFLAGS): Ditto. (X86_TUNE_FUSE_ALU_AND_BRANCH): Ditto. (X86_TUNE_ACCUMULATE_OUTGOING_ARGS): Ditto. (X86_TUNE_USE_LEAVE): Ditto. (X86_TUNE_PUSH_MEMORY): Ditto. (X86_TUNE_LCP_STALL): Ditto. (X86_TUNE_USE_INCDEC): Ditto. (X86_TUNE_INTEGER_DFMODE_MOVES): Ditto. (X86_TUNE_OPT_AGU): Ditto. (X86_TUNE_PREFER_KNOWN_REP_MOVSB_STOSB): Ditto. (X86_TUNE_MISALIGNED_MOVE_STRING_PRO_EPILOGUES): Ditto. (X86_TUNE_USE_SAHF): Ditto. (X86_TUNE_USE_BT): Ditto. (X86_TUNE_AVOID_FALSE_DEP_FOR_BMI): Ditto. (X86_TUNE_ONE_IF_CONV_INSN): Ditto. (X86_TUNE_AVOID_MFENCE): Ditto. (X86_TUNE_EXPAND_ABS): Ditto. (X86_TUNE_USE_SIMODE_FIOP): Ditto. (X86_TUNE_USE_FFREEP): Ditto. (X86_TUNE_EXT_80387_CONSTANTS): Ditto. (X86_TUNE_SSE_UNALIGNED_LOAD_OPTIMAL): Ditto. (X86_TUNE_SSE_UNALIGNED_STORE_OPTIMAL): Ditto. (X86_TUNE_SSE_TYPELESS_STORES): Ditto. (X86_TUNE_SSE_LOAD0_BY_PXOR): Ditto. * doc/extend.texi: Add details about lujiazui. * doc/invoke.texi: Add details about lujiazui. * config/i386/lujiazui.md: Introduce lujiazui cpu and include new md file. gcc/testsuite/ChangeLog: * gcc.target/i386/funcspec-56.inc: Test -arch=lujiazui and -tune=lujiazui. * g++.target/i386/mv32.C: Ditto. Signed-off-by: mayshao <mayshao-oc@zhaoxin.com>
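The vendor ID check mentioned in the message above can be illustrated with a small sketch. It is not the code from cpuinfo.h or driver-i386.cc: the function names are hypothetical, and it assumes the usual CPUID leaf 0 layout, where the 12-byte vendor string is spread over EBX, EDX and ECX in that order, and that the "Shanghai" ID is padded with two spaces on each side to fill 12 bytes. Since "CentaurHauls" is also used by non-Zhaoxin parts, a real detector would additionally look at the family, which this sketch leaves out.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Reassemble the 12-byte CPUID leaf-0 vendor string from its three
// registers.  Each register holds four characters, lowest byte first,
// and the registers are read in the order EBX, EDX, ECX.
std::string vendor_from_cpuid(std::uint32_t ebx, std::uint32_t edx,
                              std::uint32_t ecx)
{
  const std::uint32_t regs[3] = {ebx, edx, ecx};
  char buf[12];
  for (int r = 0; r < 3; ++r)
    for (int b = 0; b < 4; ++b)
      buf[4 * r + b] = static_cast<char>((regs[r] >> (8 * b)) & 0xff);
  return std::string(buf, sizeof buf);
}

// Hypothetical helper: true if the vendor string is one of the two IDs
// named in the commit message.  Vendor strings are always 12 bytes.
bool is_zhaoxin_vendor_id(const std::string &vendor)
{
  assert(vendor.size() == 12);
  return vendor == "CentaurHauls" || vendor == "  Shanghai  ";
}
```

Passing the three register values that spell "CentaurHauls" yields a string the helper accepts, while "GenuineIntel" is rejected.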
2022-05-23 | testsuite: mallign: Handle word size of 1 byte | Dimitar Dimitrov | 1 | -1/+1
This patch fixes a spurious warning for the pru-unknown-elf target: gcc/testsuite/gcc.dg/mallign.c:12:27: warning: ignoring return value of 'malloc' declared with attribute 'warn_unused_result' [-Wunused-result] For 8-bit targets the resulting mask ignores all bits in the value returned by malloc. Fix by first checking the target word size. gcc/testsuite/ChangeLog: * gcc.dg/mallign.c: Skip check if sizeof(word)==1. Signed-off-by: Dimitar Dimitrov <dimitar@dinux.eu>
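Why the mask degenerates on such targets can be shown with a short sketch. This is not the code of gcc.dg/mallign.c; the function name is hypothetical and the word size is passed in explicitly so the effect is visible on any host.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Alignment check using the mask (word_size - 1).  word_size must be a
// power of two.  With word_size == 1 the mask is 0, so every address
// passes and the address itself no longer influences the result.
bool is_word_aligned(std::uintptr_t addr, std::size_t word_size)
{
  assert(word_size != 0 && (word_size & (word_size - 1)) == 0);
  return (addr & (word_size - 1)) == 0;
}
```

With a one-byte word the check is trivially true for any address, which matches the message's remark that the mask ignores all bits of the value returned by malloc.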
2022-05-23 | demangler: C++ modules support | Nathan Sidwell | 3 | -28/+188
This adds demangling support for C++ modules: a new 'W' component, along with augmented behaviour of 'S' components. include/ * demangle.h (enum demangle_component_type): Add module components. libiberty/ * cp-demangle.c (d_make_comp): Adjust. (d_name, d_prefix): Adjust subst handling. Add module handling. (d_maybe_module_name): New. (d_unqualified_name): Add incoming module parm. Handle it. Adjust all callers. (d_special_name): Add 'GI' support. (d_count_template_scopes): Adjust. (d_print_comp_inner): Print module. * testsuite/demangle-expected: New test cases.
2022-05-23 | tilepro: fix missing ARRAY_SIZE macro | Martin Liska | 1 | -0/+2
gcc/ChangeLog: * config/tilepro/gen-mul-tables.cc (ARRAY_SIZE): Add new macro.
2022-05-23 | Remove forward_propagate_into_cond | Richard Biener | 1 | -77/+2
This is a first cleanup opportunity from the COND_EXPR gimplification which allows us to remove now redundant forward_propagate_into_cond. 2022-05-23 Richard Biener <rguenther@suse.de> * tree-ssa-forwprop.cc (forward_propagate_into_cond): Remove. (pass_forwprop::execute): Do not propagate into COND_EXPR conditions.
2022-05-23 | Remove is_gimple_condexpr | Richard Biener | 4 | -32/+14
This removes is_gimple_condexpr. Note that the vectorizer, via patterns, still creates COND_EXPRs with embedded GENERIC conditions and has a reference to the function in comments. Otherwise is_gimple_condexpr is now equal to is_gimple_val. 2022-05-16 Richard Biener <rguenther@suse.de> * gimple-expr.cc (is_gimple_condexpr): Remove. * gimple-expr.h (is_gimple_condexpr): Likewise. * gimplify.cc (gimplify_expr): Remove is_gimple_condexpr usage. * tree-if-conv.cc (set_bb_predicate): Likewise. (add_to_predicate_list): Likewise. (gen_phi_arg_condition): Likewise. (predicate_scalar_phi): Likewise. (predicate_statements): Likewise.
2022-05-23 | Force the selection operand of a GIMPLE COND_EXPR to be a register | Richard Biener | 21 | -62/+104
This does away with allowing the selection operand to be a GENERIC tcc_comparison tree. It keeps those for vectorizer pattern recog; they are short-lived, and removing this instance is a bigger task. The patch doesn't yet remove dead code and functionality; that's left for a followup. Instead the patch makes sure to produce valid GIMPLE IL and to continue optimizing COND_EXPRs where the previous IL allowed it and the new IL showed regressions in the testsuite. 2022-05-16 Richard Biener <rguenther@suse.de> * gimple-expr.cc (is_gimple_condexpr): Equate to is_gimple_val. * gimplify.cc (gimplify_pure_cond_expr): Gimplify the condition as is_gimple_val. * gimple-fold.cc (valid_gimple_rhs_p): Simplify. * tree-cfg.cc (verify_gimple_assign_ternary): Likewise. * gimple-loop-interchange.cc (loop_cand::undo_simple_reduction): Build the condition of the COND_EXPR separately. * tree-ssa-loop-im.cc (move_computations_worker): Likewise. * tree-vect-generic.cc (expand_vector_condition): Likewise. * tree-vect-loop.cc (vect_create_epilog_for_reduction): Likewise. * vr-values.cc (simplify_using_ranges::simplify): Likewise. * tree-vect-patterns.cc: Add comment indicating we are building invalid COND_EXPRs and why. * omp-expand.cc (expand_omp_simd): Gimplify the condition to the COND_EXPR separately. (expand_omp_atomic_cas): Note part that should be unreachable now. * tree-ssa-forwprop.cc (forward_propagate_into_cond): Adjust condition for valid replacements. * tree-if-conv.cc (predicate_bbs): Simulate previous re-folding of the condition in folded COND_EXPRs which is necessary because of unfolded GIMPLE_CONDs in the IL as in for example gcc.dg/fold-bopcond-1.c. * gimple-range-gori.cc (gori_compute::condexpr_adjust): Handle that the comparison is now in the def stmt of the select operand. Required by gcc.dg/pr104526.c. * gcc.dg/gimplefe-27.c: Adjust. * gcc.dg/gimplefe-45.c: Likewise. * gcc.dg/pr101145-2.c: Likewise. * gcc.dg/pr98211.c: Likewise. * gcc.dg/torture/pr89595.c: Likewise.
* gcc.dg/tree-ssa/divide-7.c: Likewise. * gcc.dg/tree-ssa/ssa-lim-12.c: Likewise.
2022-05-23 | OpenMP: Handle descriptors in target's firstprivate [PR104949] | Tobias Burnus | 11 | -11/+355
For allocatable/pointer arrays, a firstprivate to a device not only needs to privatize the descriptor but also the actual data. This is implemented as: firstprivate(x) firstprivate(x.data) attach(x [bias: &x.data-&x]) where the address of x in device memory is saved in hostaddrs[i] by libgomp and the middle end actually passes hostaddrs[i]' to attach. As a side effect, has_device_addr(array_desc) had to be changed: before, it was converted to firstprivate in the front end; now it is handled in omp-low.cc, as has_device_addr requires a shallow firstprivate (not touching the data pointer) while the normal firstprivate now requires a deep firstprivate. gcc/fortran/ChangeLog: PR fortran/104949 * f95-lang.cc (LANG_HOOKS_OMP_ARRAY_SIZE): Redefine. * trans-openmp.cc (gfc_omp_array_size): New. (gfc_trans_omp_variable_list): Never turn has_device_addr to firstprivate. * trans.h (gfc_omp_array_size): New. gcc/ChangeLog: PR fortran/104949 * langhooks-def.h (lhd_omp_array_size): New. (LANG_HOOKS_OMP_ARRAY_SIZE): Define. (LANG_HOOKS_DECLS): Add it. * langhooks.cc (lhd_omp_array_size): New. * langhooks.h (struct lang_hooks_for_decls): Add hook. * omp-low.cc (scan_sharing_clauses, lower_omp_target): Handle GOMP_MAP_FIRSTPRIVATE for array descriptors. libgomp/ChangeLog: PR fortran/104949 * target.c (gomp_map_vars_internal, copy_firstprivate_data): Support attach for GOMP_MAP_FIRSTPRIVATE. * testsuite/libgomp.fortran/target-firstprivate-1.f90: New test. * testsuite/libgomp.fortran/target-firstprivate-2.f90: New test. * testsuite/libgomp.fortran/target-firstprivate-3.f90: New test.
2022-05-23 | Some additional ix86_rtx_costs clean-ups: NEG, AND, andn and pandn. | Roger Sayle | 1 | -39/+94
Double-word NOT requires two operations, but double-word NEG requires three operations. Using SSE, vector NOT requires a pxor with -1, but AND of NOT is cheap thanks to the existence of pandn. There's also some legacy (aka incorrect) logic explicitly testing for DImode [independently of TARGET_64BIT] in determining the cost of logic operations that's not required. 2022-05-23 Roger Sayle <roger@nextmovesoftware.com> gcc/ChangeLog * config/i386/i386.cc (ix86_rtx_costs) <case AND>: Split from XOR/IOR case. Account for two instructions for double-word operations. In case of vector pandn, account for single instruction. Likewise for integer andn with TARGET_BMI. <case NOT>: Vector NOT requires more than 1 instruction (pxor). <case NEG>: Double-word negation requires 3 instructions.
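The two-versus-three operation count for double-word NOT and NEG can be seen in a small sketch that models a 64-bit value as two 32-bit halves. This is an illustration only, not code from i386.cc; the type and function names are hypothetical, and the three NEG steps mirror the common negate-low, add-carry, negate-high sequence.

```cpp
#include <cstdint>

// A double-word value modelled as two 32-bit halves.
struct DoubleWord { std::uint32_t lo, hi; };

DoubleWord from_u64(std::uint64_t v)
{
  return { static_cast<std::uint32_t>(v), static_cast<std::uint32_t>(v >> 32) };
}

std::uint64_t to_u64(DoubleWord d)
{
  return (static_cast<std::uint64_t>(d.hi) << 32) | d.lo;
}

// Double-word NOT: one NOT per half, two operations in total.
DoubleWord dw_not(DoubleWord x)
{
  return { ~x.lo, ~x.hi };
}

// Double-word NEG: three operations in total.
DoubleWord dw_neg(DoubleWord x)
{
  std::uint32_t lo = 0u - x.lo;            // 1: negate the low word
  std::uint32_t hi = x.hi + (x.lo != 0);   // 2: add the carry out of step 1
  hi = 0u - hi;                            // 3: negate the high word
  return { lo, hi };
}
```

The results agree with ordinary 64-bit `~v` and `0 - v`, which is a quick way to check that the carry step in `dw_neg` is needed.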
2022-05-23 | RISC-V: Fix canonical extension order (K and J) | Tsukasa OI | 2 | -2/+2
This commit fixes canonical extension order to follow the RISC-V ISA Manual draft-20210402-1271737 or later. gcc/ChangeLog: * common/config/riscv/riscv-common.cc (riscv_supported_std_ext): Fix "K" extension prefix to be placed before "J". * config/riscv/arch-canonicalize: Likewise. Signed-off-by: Tsukasa OI <research_trasio@irq.a4lg.com>
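A minimal sketch of what "canonical order" means for single-letter extensions follows. The ordering string is an assumption made for illustration (it only encodes that "k" comes before "j", as the commit states); it is not copied from riscv-common.cc, and the function name is hypothetical.

```cpp
#include <algorithm>
#include <string>

// Assumed canonical order of single-letter standard extensions, with
// "k" placed before "j".  Illustrative only.
const std::string kStdExtOrder = "mafdqlcbkjtpvn";

// Sort the given extension letters into the assumed canonical order.
// Letters not present in kStdExtOrder are moved to the end.
std::string canonical_ext_order(std::string exts)
{
  std::stable_sort(exts.begin(), exts.end(), [](char a, char b) {
    return kStdExtOrder.find(a) < kStdExtOrder.find(b);
  });
  return exts;
}
```

Under this assumed ordering, the letters "jk" are reordered to "kj".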
2022-05-23 | Increase move cost between mask and gpr. | liuhongt | 2 | -3/+3
kmovd only uses port5, which is often the performance bottleneck. Also, from a latency perspective, spill and reload could mostly be STLF or even MRN, which only take 1 cycle. So the patch increases the move cost between gpr and mask to be the same as gpr <-> sse register. gcc/ChangeLog: * config/i386/x86-tune-costs.h (skylake_cost): Increase gpr <-> mask cost from 5 to 6. (icelake_cost): Ditto. gcc/testsuite/ChangeLog: * gcc.target/i386/spill_to_mask-1.c: New test.
2022-05-23 | Daily bump. | GCC Administrator | 1 | -1/+1
2022-05-22 | Daily bump. | GCC Administrator | 2 | -1/+32
2022-05-21 | testsuite: Skip vectorize tests for PRU | Dimitar Dimitrov | 7 | -14/+14
PRU has single-cycle constant cost for any jump, and it cannot vectorise. gcc/testsuite/ChangeLog: * gcc.dg/tree-ssa/gen-vect-11.c: For PRU target, skip the vectorizing checks in tree dumps. * gcc.dg/tree-ssa/gen-vect-11a.c: Ditto. * gcc.dg/tree-ssa/gen-vect-2.c: Ditto. * gcc.dg/tree-ssa/gen-vect-25.c: Ditto. * gcc.dg/tree-ssa/gen-vect-26.c: Ditto. * gcc.dg/tree-ssa/gen-vect-28.c: Ditto. * gcc.dg/tree-ssa/gen-vect-32.c: Ditto. Signed-off-by: Dimitar Dimitrov <dimitar@dinux.eu>
2022-05-21 | testsuite: Adjust pr91088.c for default_packed targets | Dimitar Dimitrov | 1 | -1/+2
PR ipa/91088 gcc/testsuite/ChangeLog: * gcc.dg/ipa/pr91088.c: Adjust member offset checks to accommodate targets which pack structures by default. Signed-off-by: Dimitar Dimitrov <dimitar@dinux.eu>
2022-05-21 | testsuite: Skip gcc.dg/pr46647.c for PRU | Dimitar Dimitrov | 1 | -2/+2
Like AVR and Cris, PRU has no alignment requirements. Thus it is also affected by PR53535. PR middle-end/53535 gcc/testsuite/ChangeLog: * gcc.dg/pr46647.c: Skip for pru target. Signed-off-by: Dimitar Dimitrov <dimitar@dinux.eu>
2022-05-21 | testsuite: Skip ifcvt-4.c for PRU | Dimitar Dimitrov | 1 | -1/+1
PRU has neither condition codes nor conditional moves. gcc/testsuite/ChangeLog: * gcc.dg/ifcvt-4.c: Skip for PRU. Signed-off-by: Dimitar Dimitrov <dimitar@dinux.eu>
2022-05-21 | testsuite: Mark extra warnings for default_packed | Dimitar Dimitrov | 1 | -2/+4
If the target uses packed structs by default, there are no trailing padding bytes allocated. Hence extra warnings are emitted. gcc/testsuite/ChangeLog: * gcc.dg/Warray-bounds-48-novec.c: Add expected warnings if target packs the structs by default. Signed-off-by: Dimitar Dimitrov <dimitar@dinux.eu>
2022-05-21 | Daily bump. | GCC Administrator | 15 | -1/+389
2022-05-20 | testsuite: add missing dg-require-effective-target fpic | Marc Poulhiès | 1 | -0/+1
Require effective target fpic for newly added test. gcc/testsuite/ * g++.dg/ext/visibility/visibility-local-extern1.C: Add missing dg-require-effective-target fpic.
2022-05-20 | libstdc++: Reduce <random> test iterations for simulators | Jonathan Wakely | 7 | -21/+76
Some of these tests take several minutes on a simulator like cris-elf, so we can conditionally run fewer iterations. The testDiscreteDist helper already supports custom sizes so we just need to make use of that when { target simulator } matches. The relevant code is sufficiently tested on other targets, so we're not losing anything by only running a small number of iterations for sims. libstdc++-v3/ChangeLog: * testsuite/26_numerics/random/bernoulli_distribution/operators/values.cc: Run fewer iterations for simulator targets. * testsuite/26_numerics/random/binomial_distribution/operators/values.cc: Likewise. * testsuite/26_numerics/random/discrete_distribution/operators/values.cc: Likewise. * testsuite/26_numerics/random/geometric_distribution/operators/values.cc: Likewise. * testsuite/26_numerics/random/negative_binomial_distribution/operators/values.cc: Likewise. * testsuite/26_numerics/random/poisson_distribution/operators/values.cc: Likewise. * testsuite/26_numerics/random/uniform_int_distribution/operators/values.cc: Likewise.
2022-05-20 | AArch64: Improve rotate patterns | Wilco Dijkstra | 4 | -65/+461
Improve and generalize rotate patterns. Rotates by more than half the bitwidth of a register are canonicalized to rotate left. Many existing shift patterns don't handle this case correctly, so add rotate left to the shift iterator and convert rotate left into ror during assembly output. Add missing zero_extend patterns for shifted BIC, ORN and EON. gcc/ * config/aarch64/aarch64.md (and_<SHIFT:optab><mode>3_compare0): Support rotate left. (and_<SHIFT:optab>si3_compare0_uxtw): Likewise. (<LOGICAL:optab>_<SHIFT:optab><mode>3): Likewise. (<LOGICAL:optab>_<SHIFT:optab>si3_uxtw): Likewise. (one_cmpl_<optab><mode>2): Likewise. (<LOGICAL:optab>_one_cmpl_<SHIFT:optab><mode>3): Likewise. (<LOGICAL:optab>_one_cmpl_<SHIFT:optab>sidi_uxtw): New pattern. (eor_one_cmpl_<SHIFT:optab><mode>3_alt): Support rotate left. (eor_one_cmpl_<SHIFT:optab>sidi3_alt_ze): Likewise. (and_one_cmpl_<SHIFT:optab><mode>3_compare0): Likewise. (and_one_cmpl_<SHIFT:optab>si3_compare0_uxtw): Likewise. (and_one_cmpl_<SHIFT:optab><mode>3_compare0_no_reuse): Likewise. (and_<SHIFT:optab><mode>3nr_compare0): Likewise. (*<optab>si3_insn_uxtw): Use SHIFT_no_rotate. (rolsi3_insn_uxtw): New pattern. * config/aarch64/iterators.md (SHIFT): Add rotate left. (SHIFT_no_rotate): Add new iterator. (SHIFT:shift): Print rotate left as ror. (is_rotl): Add test for left rotate. gcc/testsuite/ * gcc.target/aarch64/ror_2.c: New test. * gcc.target/aarch64/ror_3.c: New test.
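The conversion of rotate left into ror relies on a simple identity: rotating a 32-bit value left by n is the same as rotating it right by (32 - n) mod 32. The sketch below checks that identity in plain C++; the function names are hypothetical and nothing here is taken from aarch64.md.

```cpp
#include <cstdint>

// Rotate right by n (taken modulo 32).  The n == 0 case is handled
// separately to avoid an undefined shift by 32.
std::uint32_t ror32(std::uint32_t x, unsigned n)
{
  n &= 31;
  return n ? (x >> n) | (x << (32 - n)) : x;
}

// Rotate left by n (taken modulo 32).
std::uint32_t rol32(std::uint32_t x, unsigned n)
{
  n &= 31;
  return n ? (x << n) | (x >> (32 - n)) : x;
}

// Rotate left expressed through rotate right, the way a rotate left can
// be emitted on a target that only has a rotate-right instruction.
std::uint32_t rol32_via_ror(std::uint32_t x, unsigned n)
{
  return ror32(x, (32 - (n & 31)) & 31);
}
```

For every shift count from 0 to 31, `rol32` and `rol32_via_ror` return the same value.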
2022-05-20 | AArch64: Cleanup CPU option processing code | Wilco Dijkstra | 3 | -135/+32
The --with-cpu/--with-arch configure option processing not only checks valid arguments but also sets TARGET_CPU_DEFAULT with a CPU and extension bitmask. This isn't used however since a --with-cpu is translated into a -mcpu option which is processed as if written on the command-line (so TARGET_CPU_DEFAULT is never accessed). So remove all the complex processing and bitmask, and just validate the option. Fix a bug that always reports valid architecture extensions as invalid. As a result the CPU processing in aarch64.c can be simplified. gcc/ * config.gcc (aarch64*-*-*): Simplify --with-cpu and --with-arch processing. Add support for architectural extensions. * config/aarch64/aarch64.h (TARGET_CPU_DEFAULT): Remove AARCH64_CPU_DEFAULT_FLAGS. (TARGET_CPU_NBITS): Remove. (TARGET_CPU_MASK): Remove. * config/aarch64/aarch64.cc (AARCH64_CPU_DEFAULT_FLAGS): Remove define. (get_tune_cpu): Assert CPU is always valid. (get_arch): Assert architecture is always valid. (aarch64_override_options): Cleanup CPU selection code and simplify logic. (aarch64_option_restore): Remove unnecessary checks on tune.
2022-05-20 | Use "final" and "override" directly, rather than via macros | David Malcolm | 60 | -1345/+1345
As of GCC 11 onwards we have required a C++11 compiler, such as GCC 4.8 or later. On the assumption that any such compiler correctly implements "final" and "override", this patch updates the source tree to stop using the FINAL and OVERRIDE macros from ansidecl.h, in favor of simply using "final" and "override" directly. libcpp/ChangeLog: * lex.cc: Replace uses of "FINAL" and "OVERRIDE" with "final" and "override". gcc/analyzer/ChangeLog: * analyzer-pass.cc: Replace uses of "FINAL" and "OVERRIDE" with "final" and "override". * call-info.h: Likewise. * checker-path.h: Likewise. * constraint-manager.cc: Likewise. * diagnostic-manager.cc: Likewise. * engine.cc: Likewise. * exploded-graph.h: Likewise. * feasible-graph.h: Likewise. * pending-diagnostic.h: Likewise. * region-model-impl-calls.cc: Likewise. * region-model.cc: Likewise. * region-model.h: Likewise. * region.h: Likewise. * sm-file.cc: Likewise. * sm-malloc.cc: Likewise. * sm-pattern-test.cc: Likewise. * sm-sensitive.cc: Likewise. * sm-signal.cc: Likewise. * sm-taint.cc: Likewise. * state-purge.h: Likewise. * store.cc: Likewise. * store.h: Likewise. * supergraph.h: Likewise. * svalue.h: Likewise. * trimmed-graph.h: Likewise. * varargs.cc: Likewise. gcc/c-family/ChangeLog: * c-format.cc: Replace uses of "FINAL" and "OVERRIDE" with "final" and "override". * c-pretty-print.h: Likewise. gcc/cp/ChangeLog: * cxx-pretty-print.h: Replace uses of "FINAL" and "OVERRIDE" with "final" and "override". * error.cc: Likewise. gcc/jit/ChangeLog: * jit-playback.h: Replace uses of "FINAL" and "OVERRIDE" with "final" and "override". * jit-recording.cc: Likewise. * jit-recording.h: Likewise. gcc/ChangeLog: * config/aarch64/aarch64-sve-builtins-base.cc: Replace uses of "FINAL" and "OVERRIDE" with "final" and "override". * config/aarch64/aarch64-sve-builtins-functions.h: Likewise. * config/aarch64/aarch64-sve-builtins-shapes.cc: Likewise. * config/aarch64/aarch64-sve-builtins-sve2.cc: Likewise. * diagnostic-path.h: Likewise. 
* digraph.cc: Likewise. * gcc-rich-location.h: Likewise. * gimple-array-bounds.cc: Likewise. * gimple-loop-versioning.cc: Likewise. * gimple-range-cache.cc: Likewise. * gimple-range-cache.h: Likewise. * gimple-range-fold.cc: Likewise. * gimple-range-fold.h: Likewise. * gimple-range-tests.cc: Likewise. * gimple-range.h: Likewise. * gimple-ssa-evrp.cc: Likewise. * input.cc: Likewise. * json.h: Likewise. * read-rtl-function.cc: Likewise. * tree-complex.cc: Likewise. * tree-diagnostic-path.cc: Likewise. * tree-ssa-ccp.cc: Likewise. * tree-ssa-copy.cc: Likewise. * tree-vrp.cc: Likewise. * value-query.h: Likewise. * vr-values.h: Likewise. Signed-off-by: David Malcolm <dmalcolm@redhat.com>
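For readers unfamiliar with the two C++11 specifiers this commit now spells directly, here is a minimal sketch. The class names are hypothetical and unrelated to the GCC sources; the point is only what the keywords enforce.

```cpp
#include <string>

// Hypothetical base class with one virtual function.
struct Node {
  virtual ~Node() = default;
  virtual std::string kind() const { return "node"; }
};

// 'override' asks the compiler to verify that this really overrides a
// base-class virtual; a signature mismatch becomes a compile error.
struct Leaf : Node {
  std::string kind() const override { return "leaf"; }
};

// 'final' on the class forbids further derivation; 'final' on the
// function forbids overriding it in any derived class.
struct Root final : Node {
  std::string kind() const final override { return "root"; }
};

// Calls through the base interface dispatch to the overriders.
std::string describe(const Node &n) { return n.kind(); }
```

Dropping `const` from `Leaf::kind` would silently declare a new function without `override`; with it, the mistake is diagnosed at compile time.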
2022-05-20 | libgomp: Add new runtime routines omp_target_memcpy_async and omp_target_memcpy_rect_async | Marcel Vollweiler | 16 | -55/+950
This patch adds two new OpenMP runtime routines: omp_target_memcpy_async and omp_target_memcpy_rect_async. Both functions are introduced in OpenMP 5.1 as asynchronous variants of omp_target_memcpy and omp_target_memcpy_rect. In contrast to the synchronous variants, the asynchronous functions have two additional function parameters to allow the specification of task dependences. In C/C++: int depobj_count, omp_depend_t *depobj_list. In Fortran: integer(c_int), value :: depobj_count and integer(omp_depend_kind), optional :: depobj_list(*). The implementation splits the synchronous functions into two parts: (a) check and (b) copy. Then (a) is used in the asynchronous functions for the sequential part, and the actual copy process (b) is executed in a newly created task. The sequential part (a) takes into account the requirements for the return values: "The routine returns zero if successful. Otherwise, it returns a non-zero value." (omp_target_memcpy_async, OpenMP 5.1 spec, section 3.8.7) "An application can determine the number of inclusive dimensions supported by an implementation by passing NULL pointers (or C_NULL_PTR, for Fortran) for both dst and src. The routine returns the number of dimensions supported by the implementation for the specified device numbers. No copy operation is performed." (omp_target_memcpy_rect_async, OpenMP 5.1 spec, section 3.8.8) Due to asynchronicity an error is thrown if the asynchronous memcpy is not successful (in contrast to the synchronous functions which use a return value unequal to zero). gcc/ChangeLog: * omp-low.cc (omp_runtime_api_call): Added target_memcpy_async and target_memcpy_rect_async to omp_runtime_apis array. libgomp/ChangeLog: * libgomp.map: Added omp_target_memcpy_async and omp_target_memcpy_rect_async. * libgomp.texi: Both functions are now supported. * omp.h.in: Added omp_target_memcpy_async and omp_target_memcpy_rect_async. * omp_lib.f90.in: Added interfaces for both new functions. * omp_lib.h.in: Likewise.
* target.c (ialias_redirect): Added for GOMP_task. (omp_target_memcpy): Restructured into check and copy part. (omp_target_memcpy_check): New helper function for omp_target_memcpy and omp_target_memcpy_async that checks requirements. (omp_target_memcpy_copy): New helper function for omp_target_memcpy and omp_target_memcpy_async that performs the memcpy. (omp_target_memcpy_async_helper): New helper function that is used in omp_target_memcpy_async for the asynchronous task. (omp_target_memcpy_async): Added. (omp_target_memcpy_rect): Restructured into check and copy part. (omp_target_memcpy_rect_check): New helper function for omp_target_memcpy_rect and omp_target_memcpy_rect_async that checks requirements. (omp_target_memcpy_rect_copy): New helper function for omp_target_memcpy_rect and omp_target_memcpy_rect_async that performs the memcpy. (omp_target_memcpy_rect_async_helper): New helper function that is used in omp_target_memcpy_rect_async for the asynchronous task. (omp_target_memcpy_rect_async): Added. * task.c (ialias): Added for GOMP_task. * testsuite/libgomp.c-c++-common/target-memcpy-async-1.c: New test. * testsuite/libgomp.c-c++-common/target-memcpy-async-2.c: New test. * testsuite/libgomp.c-c++-common/target-memcpy-rect-async-1.c: New test. * testsuite/libgomp.c-c++-common/target-memcpy-rect-async-2.c: New test. * testsuite/libgomp.fortran/target-memcpy-async-1.f90: New test. * testsuite/libgomp.fortran/target-memcpy-async-2.f90: New test. * testsuite/libgomp.fortran/target-memcpy-rect-async-1.f90: New test. * testsuite/libgomp.fortran/target-memcpy-rect-async-2.f90: New test.
2022-05-20 | libgcc: use __builtin_clz and __builtin_ctz in libbid | Christophe Lyon | 1 | -47/+4
This patch replaces libbid's implementations of clz and ctz for 32- and 64-bit inputs, which used several masks, with the corresponding builtins. This will provide a better implementation, especially on targets with clz/ctz instructions. 2022-05-06 Christophe Lyon <christophe.lyon@arm.com> libgcc/config/libbid/ChangeLog: * bid_binarydecimal.c (CLZ32_MASK16): Delete. (CLZ32_MASK8): Delete. (CLZ32_MASK4): Delete. (CLZ32_MASK2): Delete. (CLZ32_MASK1): Delete. (clz32_nz): Use __builtin_clz. (ctz32_1bit): Delete. (ctz32): Use __builtin_ctz. (CLZ64_MASK32): Delete. (CLZ64_MASK16): Delete. (CLZ64_MASK8): Delete. (CLZ64_MASK4): Delete. (CLZ64_MASK2): Delete. (CLZ64_MASK1): Delete. (clz64_nz): Use __builtin_clzl. (ctz64_1bit): Delete. (ctz64): Use __builtin_ctzl.
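To make the trade-off concrete, the sketch below places a mask-based count-leading-zeros next to the builtin. The mask version is a generic binary search written for illustration; it is not the code deleted from bid_binarydecimal.c, and the function names are hypothetical. The builtins require GCC or a compatible compiler, are undefined for a zero argument, and the sketch assumes a 32-bit unsigned int.

```cpp
#include <cassert>
#include <cstdint>

static_assert(sizeof(unsigned int) == 4, "sketch assumes 32-bit unsigned int");

// Mask-based count of leading zeros: a binary search over the bit
// positions using five successive masks.  x must be non-zero.
int clz32_masks(std::uint32_t x)
{
  assert(x != 0);
  int n = 0;
  if ((x & 0xFFFF0000u) == 0) { n += 16; x <<= 16; }
  if ((x & 0xFF000000u) == 0) { n += 8;  x <<= 8;  }
  if ((x & 0xF0000000u) == 0) { n += 4;  x <<= 4;  }
  if ((x & 0xC0000000u) == 0) { n += 2;  x <<= 2;  }
  if ((x & 0x80000000u) == 0) { n += 1; }
  return n;
}

// Builtin versions.  The compiler can lower these to a single
// instruction on targets that have one.  x must be non-zero.
int clz32_builtin(std::uint32_t x)
{
  assert(x != 0);
  return __builtin_clz(x);
}

int ctz32_builtin(std::uint32_t x)
{
  assert(x != 0);
  return __builtin_ctz(x);
}
```

Both clz variants return the same count for any non-zero input; the builtin simply leaves the choice of instruction sequence to the compiler.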
2022-05-20 | libgcc: Add support for HF mode (aka _Float16) in libbid | Christophe Lyon | 10 | -5/+364
This patch adds support for trunc and extend operations between HF mode (_Float16) and Decimal Floating Point formats (_Decimal32, _Decimal64 and _Decimal128). For simplicity we rely on the implicit conversions inserted by the compiler between HF and SD/DF/TF modes. The existing bid*_to_binary* and binary*_to_bid* functions are non-trivial and at this stage it is not clear if there is a performance-critical use case involving _Float16 and _Decimal* formats. The patch also adds two executable tests, to make sure the right functions are called, available (link phase) and functional. Tested on aarch64 and x86_64. The number of symbol matches in the testcases includes the .global XXX to avoid having to match different call instructions for different targets. 2022-05-04 Christophe Lyon <christophe.lyon@arm.com> libgcc/ChangeLog: * Makefile.in (D32PBIT_FUNCS): Add _hf_to_sd and _sd_to_hf. (D64PBIT_FUNCS): Add _hf_to_dd and _dd_to_hf. (D128PBIT_FUNCS): Add _hf_to_td _td_to_hf. libgcc/config/libbid/ChangeLog: * bid_gcc_intrinsics.h (LIBGCC2_HAS_HF_MODE): Define according to __LIBGCC_HAS_HF_MODE__. (BID_HAS_HF_MODE): Define. (HFtype): Define. (__bid_extendhfsd): New prototype. (__bid_extendhfdd): Likewise. (__bid_extendhftd): Likewise. (__bid_truncsdhf): Likewise. (__bid_truncddhf): Likewise. (__bid_trunctdhf): Likewise. * _dd_to_hf.c: New file. * _hf_to_dd.c: New file. * _hf_to_sd.c: New file. * _hf_to_td.c: New file. * _sd_to_hf.c: New file. * _td_to_hf.c: New file. gcc/testsuite/ChangeLog: * gcc.dg/torture/convert-dfp-2.c: New test. * gcc.dg/torture/convert-dfp.c: New test.
2022-05-20 | testsuite: Add C++ unwinding tests with Decimal Floating-Point | Christophe Lyon | 3 | -0/+157
These tests exercise exception handling with Decimal Floating-Point types. dfp-1.C and dfp-2.C check that thrown objects of such types are properly caught, whether using C++ classes (decimalXX) or GCC mode attributes. dfp-saves-aarch64.C checks that such objects are properly restored, and has to use the mode attribute trick because objects of decimalXX class type cannot be assigned to a register variable. 2022-05-03 Christophe Lyon <christophe.lyon@arm.com> gcc/testsuite/ * g++.dg/eh/dfp-1.C: New test. * g++.dg/eh/dfp-2.C: New test. * g++.dg/eh/dfp-saves-aarch64.C: New test.
2022-05-20testsuite: enable more BID DFP tests for AArch64Christophe Lyon10-8/+30
Some tests for the BID format are currently restricted to i?86 and x86_64, but they also pass on AArch64, so this patch enables them. Since all these tests are related to the BID format, it seems useful to introduce a new effective-target (dfp_bid) instead of adding aarch64 to the current target list. 2022-04-28 Christophe Lyon <christophe.lyon@arm.com> gcc/ * doc/sourcebuild.texi (Decimal floating point attributes): Document dfp_bid effective-target. gcc/testsuite/ * lib/target-supports.exp (check_effective_target_dfp_bid): New. * gcc.dg/dfp/bid-non-canonical-d128-1.c: Use dfp_bid effective-target. * gcc.dg/dfp/bid-non-canonical-d128-2.c: Likewise. * gcc.dg/dfp/bid-non-canonical-d128-3.c: Likewise. * gcc.dg/dfp/bid-non-canonical-d128-4.c: Likewise. * gcc.dg/dfp/bid-non-canonical-d32-1.c: Likewise. * gcc.dg/dfp/bid-non-canonical-d32-2.c: Likewise. * gcc.dg/dfp/bid-non-canonical-d64-1.c: Likewise. * gcc.dg/dfp/bid-non-canonical-d64-2.c: Likewise.
2022-05-20testsuite: Add new tests for DFP under aarch64/aapcs64Christophe Lyon49-0/+1982
This patch copies all existing tests involving float/double/long double types and replaces them with _Decimal32/_Decimal64/_Decimal128. I thought it would be clearer/easier to maintain to do it this way rather than adding tests for DFP types in the existing testcases, except for func-ret-1.c and func-ret-3.c. This makes sure all cases tested for traditional floating-point are equally tested for decimal floating-point. The patch also adds a test involving loading DFP values from memory. 2022-03-31 Christophe Lyon <christophe.lyon@arm.com> gcc/testsuite/ * gcc.target/aarch64/aapcs64/aapcs64.exp: Support new dfp*.c tests. * gcc.target/aarch64/aapcs64/func-ret-1.c: Add DFP tests. * gcc.target/aarch64/aapcs64/func-ret-3.c: Add DFP tests. * gcc.target/aarch64/aapcs64/type-def.h: Add DFP types. * gcc.target/aarch64/aapcs64/dfp-1.c: New test. * gcc.target/aarch64/aapcs64/ice_dfp_5.c: New test. * gcc.target/aarch64/aapcs64/test_align_dfp-1.c: New test. * gcc.target/aarch64/aapcs64/test_align_dfp-4.c: New test. * gcc.target/aarch64/aapcs64/test_dfp_1.c: New test. * gcc.target/aarch64/aapcs64/test_dfp_10.c: New test. * gcc.target/aarch64/aapcs64/test_dfp_11.c: New test. * gcc.target/aarch64/aapcs64/test_dfp_12.c: New test. * gcc.target/aarch64/aapcs64/test_dfp_13.c: New test. * gcc.target/aarch64/aapcs64/test_dfp_14.c: New test. * gcc.target/aarch64/aapcs64/test_dfp_15.c: New test. * gcc.target/aarch64/aapcs64/test_dfp_16.c: New test. * gcc.target/aarch64/aapcs64/test_dfp_17.c: New test. * gcc.target/aarch64/aapcs64/test_dfp_18.c: New test. * gcc.target/aarch64/aapcs64/test_dfp_19.c: New test. * gcc.target/aarch64/aapcs64/test_dfp_2.c: New test. * gcc.target/aarch64/aapcs64/test_dfp_20.c: New test. * gcc.target/aarch64/aapcs64/test_dfp_21.c: New test. * gcc.target/aarch64/aapcs64/test_dfp_22.c: New test. * gcc.target/aarch64/aapcs64/test_dfp_23.c: New test. * gcc.target/aarch64/aapcs64/test_dfp_24.c: New test. * gcc.target/aarch64/aapcs64/test_dfp_25.c: New test. 
* gcc.target/aarch64/aapcs64/test_dfp_26.c: New test. * gcc.target/aarch64/aapcs64/test_dfp_27.c: New test. * gcc.target/aarch64/aapcs64/test_dfp_3.c: New test. * gcc.target/aarch64/aapcs64/test_dfp_5.c: New test. * gcc.target/aarch64/aapcs64/test_dfp_6.c: New test. * gcc.target/aarch64/aapcs64/test_dfp_7.c: New test. * gcc.target/aarch64/aapcs64/test_dfp_8.c: New test. * gcc.target/aarch64/aapcs64/test_dfp_9.c: New test. * gcc.target/aarch64/aapcs64/test_quad_double_dfp.c: New test. * gcc.target/aarch64/aapcs64/va_arg_dfp-1.c: New test. * gcc.target/aarch64/aapcs64/va_arg_dfp-10.c: New test. * gcc.target/aarch64/aapcs64/va_arg_dfp-11.c: New test. * gcc.target/aarch64/aapcs64/va_arg_dfp-12.c: New test. * gcc.target/aarch64/aapcs64/va_arg_dfp-13.c: New test. * gcc.target/aarch64/aapcs64/va_arg_dfp-14.c: New test. * gcc.target/aarch64/aapcs64/va_arg_dfp-16.c: New test. * gcc.target/aarch64/aapcs64/va_arg_dfp-2.c: New test. * gcc.target/aarch64/aapcs64/va_arg_dfp-3.c: New test. * gcc.target/aarch64/aapcs64/va_arg_dfp-4.c: New test. * gcc.target/aarch64/aapcs64/va_arg_dfp-5.c: New test. * gcc.target/aarch64/aapcs64/va_arg_dfp-6.c: New test. * gcc.target/aarch64/aapcs64/va_arg_dfp-8.c: New test. * gcc.target/aarch64/aapcs64/va_arg_dfp-9.c: New test.
2022-05-20testsuite: Fix pr39986.c testcase for AArch64Christophe Lyon1-11/+11
The testcase in c-c++-common/dfp/pr39986.c detects if DFP constants are correctly emitted in the assembly. However, AArch64 uses .word instead of the expected .long directive. With this patch, we now accept both. 2022-03-31 Christophe Lyon <christophe.lyon@arm.com> gcc/testsuite/ * c-c++-common/dfp/pr39986.c: Accept .word directive.
2022-05-20libgcc: enable DFP for AArch64Christophe Lyon1-0/+6
DFP support on AArch64 relies on libgcc, so enable its DFP routines for all AArch64 targets. 2022-03-31 Christophe Lyon <christophe.lyon@arm.com> libgcc/ * config.host: Add t-dfprules to AArch64 targets.
2022-05-20libgcc: Enable XF mode conversions to/from DFP modes only if supportedChristophe Lyon6-0/+12
Some targets do not support XF mode (e.g. AArch64), so don't build the corresponding to/from DFP mode conversion routines if __LIBGCC_HAS_XF_MODE__ is not defined. 2022-03-31 Christophe Lyon <christophe.lyon@arm.com> libgcc/config/libbid/ * _dd_to_xf.c: Check __LIBGCC_HAS_XF_MODE__. * _sd_to_xf.c: Likewise. * _td_to_xf.c: Likewise. * _xf_to_dd.c: Likewise. * _xf_to_sd.c: Likewise. * _xf_to_td.c: Likewise.
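The guard described above can be sketched as follows. This is only an illustration of the preprocessor pattern, not the contents of the libbid files: `dd_to_xf_sketch`, `kHasXfConversions` and `has_xf_conversions` are hypothetical names, and the function body is a placeholder. The sketch assumes that `__LIBGCC_HAS_XF_MODE__` is normally left undefined in an ordinary user program, so the `#else` branch is the one expected to compile outside a libgcc build.

```cpp
// Sketch: build an XF-mode conversion routine only where XF mode exists.
// __LIBGCC_HAS_XF_MODE__ is the macro named in the commit message; every
// other name in this block is a hypothetical stand-in.
#include <cassert>

#ifdef __LIBGCC_HAS_XF_MODE__
// XF mode is available: the routine is compiled (placeholder body).
long double dd_to_xf_sketch(double stand_in) { return stand_in; }
constexpr bool kHasXfConversions = true;
#else
// No XF mode on this target: the routine is left out of the build.
constexpr bool kHasXfConversions = false;
#endif

// Reports which branch of the guard was compiled.
bool has_xf_conversions() { return kHasXfConversions; }
```

Leaving the routine out entirely, rather than providing a stub, means a target without XF mode never exports a symbol it cannot implement.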
2022-05-20aarch64: Add backend support for DFPChristophe Lyon3-51/+89
This patch updates the aarch64 backend as needed to support DFP modes (SD, DD and TD). Changes v1->v2: * Drop support for DFP modes in aarch64_gen_{load||store}[wb]_pair as these are only used in prologue/epilogue where DFP modes are not used. Drop the changes to the corresponding patterns in aarch64.md, and useless GPF_PAIR iterator. * In aarch64_reinterpret_float_as_int, handle DDmode the same way as DFmode (needed in case the representation of the floating-point value can be loaded using mov/movk). * In aarch64_float_const_zero_rtx_p, reject constants with DFP mode: when X is zero, the callers want to emit either '0' or 'zr' depending on the context, which is not the way 0.0 is represented in DFP mode (in particular fmov d0, #0 is not right for DFP). * In aarch64_legitimate_constant_p, accept DFP 2022-03-31 Christophe Lyon <christophe.lyon@arm.com> gcc/ * config/aarch64/aarch64.cc (aarch64_split_128bit_move): Handle DFP modes. (aarch64_mode_valid_for_sched_fusion_p): Likewise. (aarch64_classify_address): Likewise. (aarch64_legitimize_address_displacement): Likewise. (aarch64_reinterpret_float_as_int): Likewise. (aarch64_float_const_zero_rtx_p): Likewise. (aarch64_can_const_movi_rtx_p): Likewise. (aarch64_anchor_offset): Likewise. (aarch64_secondary_reload): Likewise. (aarch64_rtx_costs): Likewise. (aarch64_legitimate_constant_p): Likewise. (aarch64_gimplify_va_arg_expr): Likewise. (aapcs_vfp_sub_candidate): Likewise. (aarch64_vfp_is_call_or_return_candidate): Likewise. (aarch64_output_scalar_simd_mov_immediate): Likewise. (aarch64_gen_adjusted_ldpstp): Likewise. (aarch64_scalar_mode_supported_p): Accept DFP modes if enabled. * config/aarch64/aarch64.md (movsf_aarch64): Use SFD iterator and rename into mov<mode>_aarch64. (movdf_aarch64): Use DFD iterator and rename into mov<mode>_aarch64. (movtf_aarch64): Use TFD iterator and rename into mov<mode>_aarch64. (split pattern for move TF mode): Use TFD iterator. 
* config/aarch64/iterators.md (GPF_TF_F16_MOV): Add DFP modes. (SFD, DFD, TFD): New iterators. (GPF_TF): Add DFP modes. (TX, DX, DX2): Likewise.
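The remark above that 0.0 is not represented the same way in DFP mode can be illustrated with a small sketch of the BID decimal64 encoding. It is not backend code. It assumes the usual BID64 layout for small coefficients (1 sign bit, a 10-bit exponent biased by 398, a 53-bit coefficient) and assumes that the constant 0.0 carries coefficient 0 with exponent -1. The helper names `bid64_encode_small` and `binary64_bits` are hypothetical.

```cpp
// Sketch: bit pattern of decimal 0.0 in BID64 versus binary double 0.0.
// Assumes the BID64 small-coefficient layout; all names are hypothetical.
#include <cassert>
#include <cstdint>
#include <cstring>

constexpr int kBid64Bias = 398;  // exponent bias assumed for decimal64

// Encode sign, exponent and coefficient. Only valid for coefficients
// below 2^53 and for exponents well inside the normal range.
constexpr std::uint64_t bid64_encode_small(bool negative, int exponent,
                                           std::uint64_t coefficient) {
    return (static_cast<std::uint64_t>(negative) << 63)
         | (static_cast<std::uint64_t>(exponent + kBid64Bias) << 53)
         | coefficient;
}

// Raw bits of a binary double, for comparison.
std::uint64_t binary64_bits(double d) {
    std::uint64_t bits;
    std::memcpy(&bits, &d, sizeof bits);
    return bits;
}
```

Under these assumptions `binary64_bits(0.0)` is all zero bits, while `bid64_encode_small(false, -1, 0)` still holds a biased exponent field, which is why an all-zero register image is not the expected encoding of a DFP 0.0 constant.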
2022-05-20aarch64: Enable DFP (Decimal Floating-point) (BID format)Christophe Lyon4-4/+8
This patch enables DFP support on aarch64, by updating config/dfp.m4 and regenerating the involved configure scripts. We enable the BID format. 2022-03-31 Christophe Lyon <christophe.lyon@arm.com> config/ * dfp.m4: Add aarch64 support. gcc/ * configure: Regenerate. libdecnumber/ * configure: Regenerate. libgcc/ * configure: Regenerate.
2022-05-20Disable snapshots from gcc-9Richard Biener1-1/+0
GCC 9 nears its end. 2022-05-20 Richard Biener <rguenther@suse.de> maintainer-scripts/ * crontab: Disable snapshots from the gcc-9 branch.
2022-05-20Daily bump.GCC Administrator5-1/+462
2022-05-19libstdc++: Avoid including <cstdint> for std::char_traitsJonathan Wakely7-14/+25
We should prefer the __UINT_LEAST16_TYPE__ and __UINT_LEAST32_TYPE__ macros, if available, so that we don't need all of <cstdint> in every header that uses std::char_traits. libstdc++-v3/ChangeLog: * include/bits/char_traits.h: Only include <cstdint> when necessary. * include/std/stacktrace: Use __UINTPTR_TYPE__ instead of uintptr_t. * src/c++11/cow-stdexcept.cc: Include <stdint.h>. * src/c++17/floating_to_chars.cc: Likewise. * testsuite/20_util/assume_aligned/1.cc: Include <cstdint>. * testsuite/20_util/assume_aligned/3.cc: Likewise. * testsuite/20_util/shared_ptr/creation/array.cc: Likewise.
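A minimal sketch of the approach, assuming a compiler that may or may not predefine these macros. The alias names `uint_least16_sketch` and `uint_least32_sketch` and the helper `least_types_are_wide_enough` are hypothetical and are not the names used in `<bits/char_traits.h>`.

```cpp
// Sketch: take the least-width types from predefined macros when they
// exist, and pull in <cstdint> only as a fallback.
#if defined(__UINT_LEAST16_TYPE__) && defined(__UINT_LEAST32_TYPE__)
using uint_least16_sketch = __UINT_LEAST16_TYPE__;
using uint_least32_sketch = __UINT_LEAST32_TYPE__;
#else
# include <cstdint>
using uint_least16_sketch = std::uint_least16_t;
using uint_least32_sketch = std::uint_least32_t;
#endif

// Both aliases must be unsigned and hold at least 16 and 32 bits.
constexpr bool least_types_are_wide_enough() {
    return static_cast<uint_least16_sketch>(-1) >= 0xFFFFu
        && static_cast<uint_least32_sketch>(-1) >= 0xFFFFFFFFu;
}
```

On a compiler that predefines the macros, a translation unit using these aliases never includes `<cstdint>` at all.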
2022-05-19libstdc++: Only include <ext/atomicity.h> for COW stringJonathan Wakely2-1/+2
Since the COW std::string was moved to its own header, we don't need the atomic dispatch helpers in the definition of std::__cxx11::string. Move the inclusion of the <ext/atomicity.h> header to <bits/cow_string.h> where it's needed. libstdc++-v3/ChangeLog: * include/bits/basic_string.h: Do not include <ext/atomicity.h> here. * include/bits/cow_string.h: Include it here.
2022-05-19libstdc++: Ensure pmr aliases work without <memory_resource>Jonathan Wakely30-488/+551
Currently the alias templates for std::pmr::vector, std::pmr::string etc. are defined using a forward declaration for polymorphic_allocator. This means you can't actually use the alias templates unless you also include <memory_resource>. The rationale for that is that it's a fairly large header, and most users don't need it. This isn't uncontroversial though, and LWG 3681 questions whether it's even conforming. This change adds a new <bits/memory_resource.h> header with the minimum needed to use polymorphic_allocator and the std::pmr container aliases. Including <memory_resource> is still necessary to use the program-wide resource objects, or the pool resources or monotonic buffer resource. libstdc++-v3/ChangeLog: * include/Makefile.am: Add new header. * include/Makefile.in: Regenerate. * include/bits/memory_resource.h: New file. * include/std/deque: Include <bits/memory_resource.h>. * include/std/forward_list: Likewise. * include/std/list: Likewise. * include/std/map: Likewise. * include/std/memory_resource (pmr::memory_resource): Move to new <bits/memory_resource.h> header. (pmr::polymorphic_allocator): Likewise. * include/std/regex: Likewise. * include/std/set: Likewise. * include/std/stacktrace: Likewise. * include/std/string: Likewise. * include/std/unordered_map: Likewise. * include/std/unordered_set: Likewise. * include/std/vector: Likewise. * testsuite/21_strings/basic_string/types/pmr_typedefs.cc: Remove <memory_resource> header and check construction. * testsuite/23_containers/deque/types/pmr_typedefs.cc: Likewise. * testsuite/23_containers/forward_list/pmr_typedefs.cc: Likewise. * testsuite/23_containers/list/pmr_typedefs.cc: Likewise. * testsuite/23_containers/map/pmr_typedefs.cc: Likewise. * testsuite/23_containers/multimap/pmr_typedefs.cc: Likewise. * testsuite/23_containers/multiset/pmr_typedefs.cc: Likewise. * testsuite/23_containers/set/pmr_typedefs.cc: Likewise. * testsuite/23_containers/unordered_map/pmr_typedefs.cc: Likewise. 
* testsuite/23_containers/unordered_multimap/pmr_typedefs.cc: Likewise. * testsuite/23_containers/unordered_multiset/pmr_typedefs.cc: Likewise. * testsuite/23_containers/unordered_set/pmr_typedefs.cc: Likewise. * testsuite/23_containers/vector/pmr_typedefs.cc: Likewise. * testsuite/28_regex/match_results/pmr_typedefs.cc: Likewise. gcc/testsuite/ChangeLog: * g++.dg/cpp0x/variadic-tuple.C: Qualify function to avoid ADL finding std::make_tuple.
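As an illustration of the split described above, here is a minimal sketch that uses the `std::pmr::vector` alias together with a monotonic buffer resource. It needs C++17. The function name `sum_with_pmr_vector` is hypothetical. `<memory_resource>` is included because the sketch uses `std::pmr::monotonic_buffer_resource`, which the commit message says still requires that header; only the alias and `polymorphic_allocator` are meant to work from `<vector>` alone.

```cpp
// Sketch: a pmr vector backed by a stack buffer (C++17).
#include <cassert>
#include <cstddef>
#include <memory_resource>  // for monotonic_buffer_resource
#include <vector>           // provides the std::pmr::vector alias

int sum_with_pmr_vector() {
    std::byte buffer[256];
    std::pmr::monotonic_buffer_resource pool(buffer, sizeof buffer);

    // The vector's polymorphic_allocator draws memory from `pool`.
    std::pmr::vector<int> values(&pool);
    for (int i = 1; i <= 4; ++i)
        values.push_back(i);
    assert(values.get_allocator().resource() == &pool);

    int sum = 0;
    for (int v : values)
        sum += v;
    return sum;
}
```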
2022-05-19PR middle-end/98865: Expand X*Y as X&-Y when Y is [0,1].Roger Sayle2-0/+86
The patch is a revised solution for PR middle-end/98865 incorporating the feedback/suggestions from Richard Biener's review here: https://gcc.gnu.org/pipermail/gcc-patches/2022-May/593928.html Most significantly, this patch now performs the transformation/optimization during RTL expansion, where the target's rtx_costs can be used to determine whether the original multiplication (that may potentially be implemented by a shift or lea) is cheaper than a negation and a bit-wise and. Previously the expression (x>>63)*y would be compiled with -O2 as shrq $63, %rdi movq %rdi, %rax imulq %rsi, %rax but with this patch now produces: sarq $63, %rdi movq %rdi, %rax andq %rsi, %rax Likewise the expression (x>>63)*135 [that appears in a hot-spot of the Botan AES-128 benchmark] was previously: shrq $63, %rdi leaq (%rdi,%rdi,8), %rdx movq %rdx, %rax salq $4, %rax subq %rdx, %rax now becomes: movq %rdi, %rax sarq $63, %rax andl $135, %eax 2022-05-19 Roger Sayle <roger@nextmovesoftware.com> gcc/ChangeLog PR middle-end/98865 * expr.cc (expand_expr_real_2) [MULT_EXPR]: Expand X*Y as X&Y when both X and Y are [0, 1], X*Y as X&-Y when Y is [0,1] and likewise X*Y as -X&Y when X is [0,1] using tree_nonzero_bits. gcc/testsuite/ChangeLog PR middle-end/98865 * gcc.target/i386/pr98865.c: New test case.
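The identity behind this transformation can be checked with a short sketch. It is a source-level illustration only, not the RTL expansion code, and the function names are hypothetical. It relies on unsigned arithmetic, where `0 - bit` is all zeros for 0 and all ones for 1.

```cpp
// Sketch: X*Y equals X & -Y when Y is known to be 0 or 1.
#include <cassert>
#include <cstdint>

// Original form: multiply by a value known to be 0 or 1.
constexpr std::uint64_t mul_by_bit(std::uint64_t x, std::uint64_t bit) {
    return x * bit;
}

// Rewritten form: negate the bit into a mask, then AND.
constexpr std::uint64_t and_with_negated_bit(std::uint64_t x, std::uint64_t bit) {
    return x & (0 - bit);
}

// The (x>>63)*y case from the commit message, in both forms.
constexpr std::uint64_t sign_bit_times(std::uint64_t x, std::uint64_t y) {
    return (x >> 63) * y;
}
constexpr std::uint64_t sign_bit_masked(std::uint64_t x, std::uint64_t y) {
    return (0 - (x >> 63)) & y;
}
```

Whether the masked form is actually cheaper depends on the target, which is why the patch consults rtx_costs before choosing it.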
2022-05-19[PATCH, rs6000] Remove the (no longer used) BTC defines.Will Schmidt2-51/+4
These defines are no longer used once the rs6000 built-in reworks were completed. Time to remove them. There was a reference to RS6000_BTC_SPECIAL in a TODO comment in rs6000-builtins.def. That comment remains, but I have updated the comment to refer to "SPECIAL" processing, instead of having it refer directly to the _BTC_SPECIAL macro. 2022-05-18 Will Schmidt <will_schmidt@vnet.ibm.com> gcc/ * config/rs6000/rs6000-builtins.def: Rephrase to remove RS6000_BTC_SPECIAL from comment. * config/rs6000/rs6000.h (RS6000_BTC_UNARY, RS6000_BTC_BINARY, RS6000_BTC_TERNARY, RS6000_BTC_QUATERNARY, RS6000_BTC_QUINARY, RS6000_BTC_SENARY, RS6000_BTC_OPND_MASK, RS6000_BTC_SPECIAL, RS6000_BTC_PREDICATE, RS6000_BTC_ABS, RS6000_BTC_DST, RS6000_BTC_TYPE_MASK, RS6000_BTC_MISC, RS6000_BTC_CONST, RS6000_BTC_PURE, RS6000_BTC_FP, RS6000_BTC_QUAD, RS6000_BTC_PAIR, RS6000_BTC_QUADPAIR, RS6000_BTC_ATTR_MASK, RS6000_BTC_SPR, RS6000_BTC_VOID, RS6000_BTC_CR, RS6000_BTC_OVERLOADED, RS6000_BTC_GIMPLE, RS6000_BTC_MISC_MASK, RS6000_BTC_MEM, RS6000_BTC_SAT, RS6000_BTM_ALWAYS): Delete.
2022-05-19libstdc++: Implement LWG 3683 for pmr::polymorphic_allocatorJonathan Wakely2-0/+29
This issue has recently been moved to Tentatively Ready, and seems uncontroversial. This allows equality comparison with types that are convertible to pmr::polymorphic_allocator, which fail deduction for the existing equality operator. libstdc++-v3/ChangeLog: * include/std/memory_resource (polymorphic_allocator): Add non-template equality operator, as proposed for LWG 3683. * testsuite/20_util/polymorphic_allocator/lwg3683.cc: New test.
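The deduction problem and the fix can be sketched with stand-in types. `Resource` and `Alloc` below are hypothetical simplifications of `std::pmr::memory_resource` and `std::pmr::polymorphic_allocator`, not the libstdc++ definitions. The template operator cannot deduce its parameters from a plain `Resource*`, while the non-template friend accepts anything convertible to `Alloc<T>`.

```cpp
// Sketch: a non-template friend operator== allows converting operands.
#include <cassert>

struct Resource {};  // hypothetical stand-in for a memory resource

template <typename T>
class Alloc {        // hypothetical stand-in for polymorphic_allocator<T>
public:
    Alloc(Resource* r) noexcept : res_(r) {}  // implicit conversion
    Resource* resource() const noexcept { return res_; }

    // Non-template friend, found by argument-dependent lookup.
    // Either operand may be implicitly converted to Alloc.
    friend bool operator==(const Alloc& a, const Alloc& b) noexcept {
        return a.res_ == b.res_;
    }

private:
    Resource* res_;
};

// Template form: both operands must already be some Alloc<...>,
// because deduction does not consider implicit conversions.
template <typename T1, typename T2>
bool operator==(const Alloc<T1>& a, const Alloc<T2>& b) noexcept {
    return a.resource() == b.resource();
}
```

With only the template operator, an expression such as `alloc == &resource` would fail to compile; the friend makes it valid while mixed comparisons such as `Alloc<int>` against `Alloc<long>` still go through the template.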
2022-05-19Fix OMP CAS expansion with separate conditionRichard Biener1-5/+6
When forcing the condition to be split out from COND_EXPRs I see a runtime failure of libgomp.fortran/atomic-19.f90 which can be reduced to !$omp atomic update, compare, capture if (x == 69_2 - r) x = 6_8 v = x being miscompiled, the difference being - _13 = .ATOMIC_COMPARE_EXCHANGE (_9, _10, _11, 4, 0, 0); - _14 = IMAGPART_EXPR <_13>; - _15 = REALPART_EXPR <_13>; - _16 = _14 != 0 ? _11 : _15; - _2 = (integer(kind=4)) _16; - v_17 = _2; + _14 = .ATOMIC_COMPARE_EXCHANGE (_10, _11, _12, 4, 0, 0); + _15 = IMAGPART_EXPR <_14>; + _16 = REALPART_EXPR <_14>; + _2 = (logical(kind=1)) _15; + _3 = (integer(kind=4)) _16; + v_17 = _3; where one can see a missing COND_EXPR. It seems to be a latent issue to me given the code can be exercised, it just maybe misses a 'need_new' testcase combined with 'cond_stmt'. Apparently the if (cond_stmt) code is just to avoid creating a temporary (and possibly to preserve the condition compute if used elsewhere since the original stmt is going to be deleted). The following makes the failure go away for me in my patched tree and it also survives libgomp and gomp testing in an unpatched tree. 2022-05-13 Richard Biener <rguenther@suse.de> * omp-expand.cc (expand_omp_atomic_cas): Do not short-cut computation of the new value.
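The role of the missing COND_EXPR can be sketched with `std::atomic`. This is an analogy to the reduced Fortran construct, not the code in omp-expand.cc, and the function name `compare_update_capture` is hypothetical. The captured value has to be the new value when the exchange succeeds and the observed old value otherwise, which is the selection the `_14 != 0 ? _11 : _15` statement performs.

```cpp
// Sketch: "if (x == expected) x = desired; v = x" as one atomic step,
// with v capturing the value of x after the update.
#include <atomic>
#include <cassert>

long compare_update_capture(std::atomic<long>& x, long expected, long desired) {
    long observed = expected;
    // On failure, compare_exchange_strong stores the current value of x
    // into `observed`; on success `observed` is left unchanged.
    bool swapped = x.compare_exchange_strong(observed, desired);
    // Dropping this selection and returning `observed` directly would
    // capture the old value even when the exchange succeeded.
    return swapped ? desired : observed;
}
```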
2022-05-19Remove get_or_alloc_expression_idRichard Biener1-15/+3
This function is no longer needed. 2022-05-19 Richard Biener <rguenther@suse.de> * tree-ssa-pre.cc (get_or_alloc_expression_id): Remove. (add_to_value): Use get_expression_id. (bitmap_insert_into_set): Likewise. (bitmap_value_insert_into_set): Likewise.