Age, Commit message, Author, Files, Lines
2022-11-03i386: Fix uninitialized register after peephole2 conversion [PR107404]Uros Bizjak2-1/+55
The "eliminate reg-reg move by inverting the condition of a cmove #2" peephole2 converts the following sequence:
473: bx:DI=[r14:DI*0x8+r12:DI]
960: r15:DI=r8:DI
485: {flags:CCC=cmp(r15:DI+bx:DI,bx:DI);r15:DI=r15:DI+bx:DI;}
737: r15:DI={(geu(flags:CCC,0))?r15:DI:bx:DI}
to:
1110: {flags:CCC=cmp(r8:DI+bx:DI,bx:DI);r8:DI=r8:DI+bx:DI;}
1111: r15:DI=[r14:DI*0x8+r12:DI]
1112: r15:DI={(geu(flags:CCC,0))?r8:DI:r15:DI}
Please note that (insn 1110) uses register BX, but its initialization was eliminated. Avoid the conversion if the eliminated move initialized a register used in the moved instruction.
2022-11-03 Uroš Bizjak <ubizjak@gmail.com>
gcc/ChangeLog: PR target/107404 * config/i386/i386.md (eliminate reg-reg move by inverting the condition of a cmove #2 peephole2): Check if eliminated move initialized a register, used in the moved instruction.
gcc/testsuite/ChangeLog: PR target/107404 * g++.target/i386/pr107404.C: New test.
2022-11-03libstdc++: Add missing move in ranges::copyJonathan Wakely2-1/+25
This is needed to support a move-only output iterator when the input iterators are specializations of __normal_iterator. libstdc++-v3/ChangeLog: * include/bits/ranges_algobase.h (__detail::__copy_or_move): Move output iterator. * testsuite/25_algorithms/copy/constrained.cc: Check copying to move-only output iterator.
2022-11-03amdgcn: Fix duplicate conditionals [PR107510]Andrew Stubbs1-2/+0
Just a harmless cut-and-paste issue. PR target/107510 gcc/ChangeLog: * config/gcn/gcn.cc (gcn_expand_reduc_scalar): Remove duplicate UNSPEC_SMIN_DPP_SHR conditionals.
2022-11-03testsuite: Fix gen-vect-34.c with vect_masked_load [PR106806]Kewen Lin1-1/+1
This fixes the failure on powerpc reported in PR106806. The test case requires the tree ifcvt pass to run on that loop, which in turn relies on masked_load support. The fix is to guard the expected scan with the vect_masked_load effective target. As tested on powerpc64{,le}-linux-gnu and aarch64-linux-gnu (cfarm machine), the failures are gone. But on x86_64-redhat-linux (cfarm machine) the result changes from PASS to N/A. I think that's expected, since that machine doesn't support AVX by default, so both check_avx_available and vect_masked_load fail; it should work fine on machines with default AVX support, or if we adjust the current check_avx_available with current_compiler_flags. PR testsuite/106806 gcc/testsuite/ChangeLog: * gcc.dg/tree-ssa/gen-vect-34.c: Adjust with vect_masked_load effective target.
2022-11-03c: C2x autoJoseph Myers9-19/+418
Implement C2x auto, a more restricted version of the C++ feature (closer to GNU C __auto_type in terms of what's supported). Since the feature is very close to GNU C __auto_type, much of the implementation can be shared. The main differences are:
* Any prior declaration of the identifier in an outer scope is shadowed during the initializer (whereas __auto_type leaves any such declaration visible until the initializer ends and the scope of the __auto_type declaration itself starts). (A prior declaration in the same scope is undefined behavior.)
* The standard feature supports braced initializers (containing a single expression, optionally followed by a comma).
* The standard feature disallows the declaration from declaring anything that's not an ordinary identifier (thus, the initializer cannot declare a tag or the members of a structure or union), while making it undefined behavior for it to declare more than one ordinary identifier. (For the latter, while I keep the existing error from __auto_type in the case of more than one declarator, I don't restrict other ordinary identifiers from being declared in inner scopes such as GNU statement expressions.) I do however disallow defining the members of an enumeration inside the initializer (if the enum definition has no tag, that doesn't actually violate a constraint), to avoid an enum type becoming accessible beyond where it would have been without auto. (Preventing new types from escaping the initializer - thus, ensuring that anything written with auto corresponds to something that could have been written without auto, modulo multiple evaluation of VLA size expressions when not using auto - is a key motivation for some restrictions on what can be declared in the initializer.) 
The rule on shadowing and restrictions on other declarations in the initializer are actually general rules for what C2x calls underspecified declarations, a description that covers constexpr as well as auto (in particular, this disallows a constexpr initializer from referencing the variable being initialized). Thus, some of the code added for those restrictions will also be of use in implementing C2x constexpr. auto with a type specifier remains a storage class specifier with the same meaning as before (i.e. a redundant storage class specifier for use at block scope). Note that the feature is only enabled in C2x mode (-std=c2x or -std=gnu2x); in older modes, a declaration with auto and no type is treated as a case of implicit int (only accepted at block scope). Since many of the restrictions on C2x auto are specified as undefined behavior rather than constraint violations, it would be possible to support more features from C++ auto without requiring diagnostics (but maybe not a good idea, if it isn't clear exactly what semantics might be given to such a feature in a future revision of C; and -Wc23-c2y-compat should arguably warn for any such future feature anyway). For now the features are limited to something close to what's supported with __auto_type, with the differences as discussed above between the two features. Bootstrapped with no regressions for x86_64-pc-linux-gnu. gcc/c/ * c-decl.cc (in_underspecified_init, start_underspecified_init) (finish_underspecified_init): New. (shadow_tag_warned, parser_xref_tag, start_struct, start_enum): Give errors inside initializers of underspecified declarations. (grokdeclarator): Handle (erroneous) case of C2X auto on a parameter. (declspecs_add_type): Handle c2x_auto_p case. (declspecs_add_scspec): Handle auto possibly setting c2x_auto_p in C2X mode. (finish_declspecs): Handle c2x_auto_p. * c-parser.cc (c_parser_declaration_or_fndef): Handle C2X auto. * c-tree.h (C_DECL_UNDERSPECIFIED): New macro. 
(struct c_declspecs): Add c2x_auto_p. (start_underspecified_init, finish_underspecified_init): New prototypes. * c-typeck.cc (build_external_ref): Give error for underspecified declaration referenced in its initializer. gcc/testsuite/ * gcc.dg/c2x-auto-1.c, gcc.dg/c2x-auto-2.c, gcc.dg/c2x-auto-3.c, gcc.dg/c2x-auto-4.c, gcc.dg/gnu2x-auto-1.c: New tests.
2022-11-03Daily bump.GCC Administrator5-1/+114
2022-11-02Rebuilt configure and gcc/configure.Gaius Mulley1-3/+87
ChangeLog: * configure: Rebuilt. gcc/ChangeLog: * configure: Rebuilt. Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>
2022-11-02Merge branch 'master' into devel/modula-2.Gaius Mulley883-6496/+23550
Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>
2022-11-02Python3 scripts in gcc/m2/tools-src flake8 compliant and use argparse.Gaius Mulley2-446/+386
tidydates.py and boilerplate.py have been changed to use argparse. All python3 scripts are flake8 compliant. gcc/m2/ChangeLog: * tools-src/boilerplate.py: Rewritten. * tools-src/tidydates.py: Rewritten. Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>
2022-11-02libstdc++: Remove more redundant union membersJonathan Wakely3-3/+0
We don't need these 'unused' members because they're never used, and a union with a single variant member is fine. libstdc++-v3/ChangeLog: * libsupc++/eh_globals.cc (constant_init::unused): Remove. * src/c++11/system_error.cc (constant_init::unused): Remove. * src/c++17/memory_resource.cc (constant_init::unused): Remove.
2022-11-02Support OpenACC 'declare create' with Fortran allocatable arrays, part II [PR106643, PR96668]Thomas Schwinge3-28/+160
PR libgomp/106643 PR fortran/96668 libgomp/ * oacc-mem.c (goacc_enter_data_internal): Support OpenACC 'declare create' with Fortran allocatable arrays, part II. * testsuite/libgomp.oacc-fortran/declare-allocatable-array_descriptor-1-directive.f90: Adjust. * testsuite/libgomp.oacc-fortran/pr106643-1.f90: New.
2022-11-02Support OpenACC 'declare create' with Fortran allocatable arrays, part I [PR106643]Thomas Schwinge3-2/+706
PR libgomp/106643 libgomp/ * oacc-mem.c (goacc_enter_data_internal): Support OpenACC 'declare create' with Fortran allocatable arrays, part I. * testsuite/libgomp.oacc-fortran/declare-allocatable-1-directive.f90: New. * testsuite/libgomp.oacc-fortran/declare-allocatable-array_descriptor-1-directive.f90: New.
2022-11-02Add 'libgomp.oacc-fortran/declare-allocatable-array_descriptor-1-runtime.f90'Thomas Schwinge1-0/+402
libgomp/ * testsuite/libgomp.oacc-fortran/declare-allocatable-array_descriptor-1-runtime.f90: New.
2022-11-02Add 'libgomp.oacc-fortran/declare-allocatable-1-runtime.f90'Thomas Schwinge1-0/+278
... which is 'libgomp.oacc-fortran/declare-allocatable-1.f90' adjusted for missing support for OpenACC "Changes from Version 2.0 to 2.5": "The 'declare create' directive with a Fortran 'allocatable' has new behavior". Thus, after 'allocate'/before 'deallocate', call 'acc_create'/'acc_delete' manually. libgomp/ * testsuite/libgomp.oacc-fortran/declare-allocatable-1-runtime.f90: New.
2022-11-02Add 'libgomp.oacc-fortran/declare-allocatable-1.f90'Cesar Philippidis1-0/+268
libgomp/ * testsuite/libgomp.oacc-fortran/declare-allocatable-1.f90: New. Co-authored-by: Thomas Schwinge <thomas@codesourcery.com>
2022-11-02RISC-V: Add Zawrs ISA extension supportChristoph Müllner4-0/+23
This patch adds support for the Zawrs ISA extension. Zawrs has been ratified by the RISC-V BoD on Oct 20th, 2022. Binutils support has been merged as: https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=eb668e50036e979fb0a74821df4eee0307b44e66 gcc/ChangeLog: * common/config/riscv/riscv-common.cc: Add zawrs extension. * config/riscv/riscv-opts.h (MASK_ZAWRS): New. (TARGET_ZAWRS): New. * config/riscv/riscv.opt: New. gcc/testsuite/ChangeLog: * gcc.target/riscv/zawrs.c: New test. Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
2022-11-02gcc: honour -ffile-prefix-map in ASM_MAP [PR93371]Rasmus Villemoes1-1/+1
-ffile-prefix-map is supposed to be a superset of -fmacro-prefix-map and -fdebug-prefix-map. However, when building .S or .s files, gas is not called with the appropriate --debug-prefix-map option when -ffile-prefix-map is used. While the user can specify -fdebug-prefix-map when building assembly files via gcc, it's more ergonomic to also support -ffile-prefix-map; especially since for .S files that could contain the __FILE__ macro, one would then also have to specify -fmacro-prefix-map. gcc: PR driver/93371 * gcc.cc (ASM_MAP): Honour -ffile-prefix-map.
2022-11-02Fix bug in frange::contains_p() for signed zeros.Aldy Hernandez1-1/+9
The contains_p() code wasn't returning true for non-singleton ranges containing signed zeros. With this patch we now handle: -0.0 exists in [-3, +5.0] +0.0 exists in [-3, +5.0] gcc/ChangeLog: * value-range.cc (frange::contains_p): Fix signed zero handling. (range_tests_signed_zeros): New test.
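As a rough illustration of the signed-zero case described above, here is a toy closed interval. It is not the frange class from value-range.cc; `toy_range` and its members are hypothetical names, and the sketch assumes that the sign of a zero only matters when an endpoint is itself a zero.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-in for a floating-point range; not GCC's frange.
struct toy_range {
  double lo, hi;  // closed interval [lo, hi], no NaN endpoints

  bool contains(double x) const {
    assert(!std::isnan(lo) && !std::isnan(hi) && lo <= hi);
    if (std::isnan(x) || x < lo || x > hi)
      return false;
    // -0.0 and +0.0 compare equal, so the comparisons above cannot tell
    // them apart. The sign only matters when an endpoint is a zero.
    if (x == 0.0) {
      if (lo == 0.0 && std::signbit(x) && !std::signbit(lo))
        return false;  // -0.0 lies just below a +0.0 lower bound
      if (hi == 0.0 && !std::signbit(x) && std::signbit(hi))
        return false;  // +0.0 lies just above a -0.0 upper bound
    }
    return true;
  }
};
```

With this model, `toy_range{-3.0, 5.0}` contains both zeros, while the singleton `toy_range{0.0, 0.0}` contains only `+0.0`.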
2022-11-02libstdc++: Improve ERANGE behavior for fallback FP std::from_charsPatrick Palka1-1/+6
The fallback implementation of floating-point std::from_chars (used for formats other than binary32/64) just calls the C library's strtod family of functions. In case of overflow, the behavior of these functions is rigidly specified: "If the correct value overflows and default rounding is in effect, plus or minus HUGE_VAL, HUGE_VALF, or HUGE_VALL is returned (according to the return type and sign of the value), and the value of the macro ERANGE is stored in errno." But in case of underflow, implementations are given more leeway: "If the result underflows the functions return a value whose magnitude is no greater than the smallest normalized positive number in the return type; whether errno acquires the value ERANGE is implementation-defined." Thus the fallback implementation can (and does) portably detect overflow, but it can't portably detect underflow. However, glibc (and presumably other high-quality C library implementations) will reliably set errno to ERANGE in case of underflow as well, and it'll also return the nearest denormal number to the correct value (zero in case of true underflow), which allows callers to successfully parse denormal numbers. So since we can't be perfect here, this patch takes the best effort approach of assuming a high quality C library implementation with respect to this underflow behavior, and refines our implementation to try to distinguish between a denormal result and true underflow by inspecting strtod's return value. libstdc++-v3/ChangeLog: * src/c++17/floating_from_chars.cc (from_chars_impl): In the ERANGE case, distinguish between a denormal result and true underflow by checking if the return value is 0.
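The ERANGE handling described above can be sketched roughly as follows. This is a minimal standalone illustration built on std::strtod, not the code in floating_from_chars.cc; `parse_status` and `parse_double` are hypothetical names, and the sketch assumes a glibc-like C library that sets errno to ERANGE on underflow.

```cpp
#include <cassert>
#include <cerrno>
#include <cmath>
#include <cstdlib>

// Hypothetical result type for this sketch.
enum class parse_status { ok, overflow, underflow, invalid };

parse_status parse_double(const char* text, double& out) {
  assert(text != nullptr);
  char* end = nullptr;
  errno = 0;
  const double value = std::strtod(text, &end);
  if (end == text)
    return parse_status::invalid;  // nothing was consumed
  if (errno == ERANGE) {
    // Overflow is specified portably: the result is +/-HUGE_VAL.
    if (std::fabs(value) == HUGE_VAL)
      return parse_status::overflow;
    // Assumption: a zero result with ERANGE means true underflow.
    if (value == 0.0)
      return parse_status::underflow;
    // Otherwise the result is a nonzero denormal; accept it.
  }
  out = value;
  return parse_status::ok;
}
```

On such a library, "1e400" is reported as overflow and "1e-400" as underflow, while "1e-320" is accepted as a denormal value. A production version would also save and restore errno and pin the C locale, which this sketch omits.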
2022-11-02libstdc++: Remove unnecessary variant member in std::expectedJonathan Wakely1-5/+4
Hui Xie pointed out that we don't need a dummy member in the union, because all constructors always initialize either _M_val or _M_unex. We still need the _M_void member of the expected<void, E> specialization, because the constructor has to initialize something when not using the _M_unex member. libstdc++-v3/ChangeLog: * include/std/expected (expected::_M_invalid): Remove.
2022-11-02libstdc++: Ignore -Wignored-qualifiers warning in <variant>Jonathan Wakely1-0/+3
The warning is wrong here: the qualifier serves a purpose and is not ignored (cf. PR c++/107492). libstdc++-v3/ChangeLog: * include/std/variant (__variant::_Multi_array::__untag_result): Use pragma to suppress warning.
2022-11-02libstdc++: _Bfloat16 for <compare>Jakub Jelinek1-1/+6
Jon pointed out that we have TODO: _Bfloat16 in <compare>. Right now _S_fp_fmt() returns _Binary16 for _Float16, __fp16 as well as __bf16 and it actually works because we don't have a special handling of _Binary16. So we could just document that, but I'm a little bit afraid that HPPA or MIPS might start supporting _Float16 and/or __bf16. If they do, we have the
#if defined __hppa__ || (defined __mips__ && !defined __mips_nan2008)
  // IEEE 754-1985 allowed the meaning of the quiet/signaling
  // bit to be reversed. Flip that to give desired ordering.
  if (__builtin_isnan(__x) && __builtin_isnan(__y))
    {
      using _Int = decltype(__ix);
      constexpr int __nantype = __fmt == _Binary32 ? 22
                              : __fmt == _Binary64 ? 51
                              : __fmt == _Binary128 ? 111
                              : -1;
      constexpr _Int __bit = _Int(1) << __nantype;
      __ix ^= __bit;
      __iy ^= __bit;
    }
#endif
code, the only one where we actually care whether something is _Binary{32,64,128} (elsewhere we just care about the x86 and m68k 80bits or double double or just floating point type's sizeof) and we'd need to handle there _Binary16 and/or _Bfloat16. So this patch uses different enum for it even when it isn't needed right now, after all _Binary16 isn't needed either and we could just use _Binary32... 2022-11-02 Jakub Jelinek <jakub@redhat.com> * libsupc++/compare (_Strong_order::_Fp_fmt): Add _Bfloat16. (_Strong_order::_Bfloat16): New static data member. (_Strong_order::_S_fp_fmt): Return _Bfloat16 for std::bfloat16_t.
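To make the bit positions in the quoted snippet concrete, the following sketch derives the quiet/signaling bit index from the number of explicitly stored mantissa bits. It is an illustration only, not code from libsupc++/compare; `fp_fmt`, `explicit_mantissa_bits` and `quiet_nan_bit` are hypothetical names. The entries for binary16 (bit 9) and bfloat16 (bit 6) follow from the format layouts and show why the two 16-bit formats would need separate enumerators if that code path were ever extended to them.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <limits>

// Hypothetical format tags for this sketch.
enum class fp_fmt { binary16, bfloat16, binary32, binary64, binary128 };

// Number of explicitly stored mantissa (fraction) bits per format.
constexpr int explicit_mantissa_bits(fp_fmt fmt) {
  return fmt == fp_fmt::binary16 ? 10
       : fmt == fp_fmt::bfloat16 ? 7
       : fmt == fp_fmt::binary32 ? 23
       : fmt == fp_fmt::binary64 ? 52
       : 112;
}

// The quiet/signaling bit is the most significant stored mantissa bit.
constexpr int quiet_nan_bit(fp_fmt fmt) {
  return explicit_mantissa_bits(fmt) - 1;
}

// These three values agree with the table in the quoted code.
static_assert(quiet_nan_bit(fp_fmt::binary32) == 22, "binary32");
static_assert(quiet_nan_bit(fp_fmt::binary64) == 51, "binary64");
static_assert(quiet_nan_bit(fp_fmt::binary128) == 111, "binary128");

// Runtime check: on targets following IEEE 754-2008, a quiet NaN has
// this bit set. HPPA and legacy MIPS use the reversed meaning.
bool float_qnan_sets_quiet_bit() {
  assert(std::numeric_limits<float>::is_iec559);
  const float qnan = std::numeric_limits<float>::quiet_NaN();
  std::uint32_t bits;
  std::memcpy(&bits, &qnan, sizeof bits);
  return ((bits >> quiet_nan_bit(fp_fmt::binary32)) & 1u) != 0;
}
```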
2022-11-02builtins: Guard builtins.cc against HUGE_VAL and NAN definitionsRainer Orth1-0/+5
trunk bootstrap recently broke on Solaris like this:
/vol/gcc/src/hg/master/local/gcc/builtins.cc:2104:8: error: pasting "CFN_BUILT_IN_" and "(" does not give a valid preprocessing token
 2104 | case CFN_BUILT_IN_##MATHFN: \
      | ^~~~~~~~~~~~~
/vol/gcc/src/hg/master/local/gcc/builtins.cc:2112:3: note: in expansion of macro 'CASE_MATHFN'
 2112 | CASE_MATHFN(MATHFN) \
      | ^~~~~~~~~~~
/vol/gcc/src/hg/master/local/gcc/builtins.cc:1967:5: note: in expansion of macro 'CASE_MATHFN_FLOATN'
 1967 | CASE_MATHFN_FLOATN (HUGE_VAL) \
and similarly for NAN. It turns out this happens because <math.h> is included at some point, which (in <iso/math_c99.h>) defines both names as macros. While this only happens on Solaris right now, the same issue would be present on other targets when <math.h> gets included somehow. To avoid this, this patch #undef's both macros. Bootstrapped without regressions on i386-pc-solaris2.11 and sparc-sun-solaris2.11. 2022-11-01 Rainer Orth <ro@CeBiTec.Uni-Bielefeld.DE> gcc: * builtins.cc (mathfn_built_in_2): #undef HUGE_VAL, NAN.
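The failure mode can be reproduced in miniature. In the sketch below, `BIG_VAL` is a hypothetical stand-in for a name such as HUGE_VAL that a system header defines as a macro, and `CASE_ONE`/`CASE_ALL` loosely mimic the two-level macro structure in the error trace; none of these names come from builtins.cc. The argument of the outer macro is fully expanded before it reaches the `##` in the inner macro, so the name must not be a macro at that point.

```cpp
#include <cassert>
#include <string>

// Stand-in for a system header that defines the name as a macro whose
// expansion starts with "(".
#define BIG_VAL (1e300)

// Without this #undef, CASE_ALL (BIG_VAL) below would first expand
// BIG_VAL to "(1e300)" and then try to paste "FN_" with "(".
#undef BIG_VAL

enum toy_fn { FN_BIG_VAL, FN_OTHER };

// The inner macro pastes and stringizes its argument, neither of which
// expands it. The outer macro just forwards the argument, and that
// forwarding step is where macro expansion would happen.
#define CASE_ONE(NAME) case FN_##NAME: return #NAME;
#define CASE_ALL(NAME) CASE_ONE (NAME)

std::string toy_fn_name(toy_fn fn) {
  assert(fn == FN_BIG_VAL || fn == FN_OTHER);
  switch (fn) {
    CASE_ALL (BIG_VAL)
    CASE_ALL (OTHER)
  }
  return "";
}
```

A private stand-in macro is used here on purpose: undefining names that belong to the standard library is something GCC can do in its own sources, but it is not a pattern to copy into portable user code.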
2022-11-02libstdc++: Shortest denormal hex std::to_charsJakub Jelinek3-8/+21
On Fri, Oct 28, 2022 at 12:52:44PM -0400, Patrick Palka wrote: > > The following patch on top of > > https://gcc.gnu.org/pipermail/libstdc++/2022-October/054849.html > > adds std::{,b}float16_t support for std::to_chars. > > When precision is specified (or for std::bfloat16_t for hex mode even if not), > > I believe we can just use the std::to_chars float (when float is mode > > compatible with std::float32_t) overloads, both formats are proper subsets > > of std::float32_t. > > Unfortunately when precision is not specified and we are supposed to emit > > shortest string, the std::{,b}float16_t strings are usually much shorter. > > E.g. 1.e7p-14f16 shortest fixed representation is > > 0.0001161 and shortest scientific representation is > > 1.161e-04 while 1.e7p-14f32 (same number promoted to std::float32_t) > > 0.00011610985 and > > 1.1610985e-04. > > Similarly for 1.38p-112bf16, > > 0.000000000000000000000000000000000235 > > 2.35e-34 vs. 1.38p-112f32 > > 0.00000000000000000000000000000000023472271 > > 2.3472271e-34 > > For std::float16_t there are differences even in the shortest hex, say: > > 0.01p-14 vs. 1p-22 > > but only for denormal std::float16_t values (where all std::float16_t > > denormals converted to std::float32_t are normal), __FLT16_MIN__ and > > everything larger in absolute value than that is the same. Unless > > that is a bug and we should try to discover shorter representations > > even for denormals... > > IIRC for hex formatting of denormals I opted to be consistent with how > glibc printf formats them, instead of outputting the truly shortest > form. > > I wouldn't be against using the float32 overloads even for shortest hex > formatting of float16. The output is shorter but equivalent so it > shouldn't cause any problems. 
The following patch changes the behavior of the shortest hex denormals, such that they are printed like normals (so for has_implicit_leading_bit with 1p-149 instead of 0.000002p-126 etc., otherwise (Intel extended) with the leading digit before dot being [89abcdef]). I think for all the supported formats it is never longer; it can be of equal length, e.g. for 0.fffffep-126 vs. 1.fffffcp-127, but fortunately no largest subnormal in any format has an unbiased exponent like -9, -99, -999, -9999, because then it would be longer; often it is shorter, sometimes much shorter. For the cases with precision it keeps the handling as is. While for !has_implicit_leading_bit we for normals or with this patch even denormals have really shortest representation, for other formats we sometimes do not, but this patch doesn't deal with that (we always use 1.NNN while we could use 1.NNN up to f.NNN and by that shortening by the last hexit if the last hexit doesn't have least significant bit set and unbiased exponent is not -9, -99, -999 or -9999). 2022-11-02 Jakub Jelinek <jakub@redhat.com> * src/c++17/floating_to_chars.cc (__floating_to_chars_hex): Drop const from unbiased_exponent. Canonicalize denormals such that they have the leading bit set by shifting effective mantissa up and decreasing unbiased_exponent. (__floating_to_chars_shortest): Don't instantiate __floating_to_chars_hex for float16_t either and use float instead. * testsuite/20_util/to_chars/float.cc (float_to_chars_test_cases): Adjust testcases for shortest hex denormals. * testsuite/20_util/to_chars/double.cc (double_to_chars_test_cases): Likewise.
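The canonicalization step (shift the mantissa up until the leading bit is set, decreasing the exponent each time) can be sketched for binary32 as below. This is a simplified illustration, not the code in floating_to_chars.cc; `float_from_bits` and `hex_positive_denormal` are hypothetical helpers, and the sketch handles only positive denormal float values.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <cstring>
#include <string>

// Hypothetical helper: build a float from its raw bit pattern.
float float_from_bits(std::uint32_t bits) {
  float value;
  std::memcpy(&value, &bits, sizeof value);
  return value;
}

// Format a positive binary32 denormal as 1.NNNp-EEE.
std::string hex_positive_denormal(float value) {
  std::uint32_t bits;
  std::memcpy(&bits, &value, sizeof bits);
  std::uint32_t mantissa = bits & 0x7fffffu;
  // Precondition: sign bit and biased exponent are zero, mantissa is not.
  assert((bits >> 23) == 0 && mantissa != 0);

  // A denormal is mantissa * 2^(-126 - 23). Shift the mantissa up until
  // bit 23 (the implicit leading bit of a normal number) is set.
  int exponent = -126;
  while ((mantissa & 0x800000u) == 0) {
    mantissa <<= 1;
    --exponent;
  }

  // 23 fraction bits padded to 24 bits give exactly 6 hex digits.
  const std::uint32_t fraction = (mantissa & 0x7fffffu) << 1;
  std::string digits;
  if (fraction != 0) {
    char buf[8];
    std::snprintf(buf, sizeof buf, "%06x", static_cast<unsigned>(fraction));
    digits = buf;
    while (digits.back() == '0')
      digits.pop_back();  // drop trailing zero hexits
    digits.insert(0, ".");
  }
  return "1" + digits + "p" + std::to_string(exponent);
}
```

For the smallest denormal (bit pattern 1) this yields `1p-149`, and for the largest (bit pattern 0x7fffff) it yields `1.fffffcp-127`, the two forms mentioned in the commit message.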
2022-11-02rs6000: Byte reverse V8HI on Power8 by vector rotation.Xionghu Luo3-7/+29
gcc/ PR target/100866 * config/rs6000/altivec.md: (*altivec_vrl<VI_char>): Named to... (altivec_vrl<VI_char>): ...this. * config/rs6000/vsx.md (revb_<mode>): Call vspltish and vrlh when target is Power8 and mode is V8HI. gcc/testsuite/ PR target/100866 * gcc.target/powerpc/pr100866-2.c: New.
2022-11-02Daily bump.GCC Administrator6-1/+287
2022-11-01c++: per-scope, per-signature lambda discriminatorsNathan Sidwell10-3/+265
This implements ABI-compliant lambda discriminators. Not only do we have per-scope counters, but we also distinguish by lambda signature. Only lambdas with the same signature will need non-zero discriminators. As the discriminator is signature-dependent, we have to process the lambda function's declaration before we can determine it. For templated and generic lambdas the signature is that of the uninstantiated lambda -- not separate for each instantiation. With this change, gcc and clang now produce the same lambda manglings for all these testcases. gcc/cp/ * cp-tree.h (LAMBDA_EXPR_SCOPE_SIG_DISCRIMINATOR): New. (struct tree_lambda_expr): Add discriminator_sig bitfield. (record_lambda_scope_sig_discriminator): Declare. * lambda.cc (struct lambda_sig_count): New. (lambda_discriminator): Add signature vector. (start_lambda_scope): Adjust. (compare_lambda_template_head, compare_lambda_sig): New. (record_lambda_scope_sig_discriminator): New. * mangle.cc (write_closure_type): Use the scope-sig discriminator for ABI >= 18. Emit abi mangling warning if needed. * module.cc (trees_out::core_vals): Stream the new discriminator. (trees_in::core_vals): Likewise. * parser.cc (cp_parser_lambda_declarator_opt): Call record_lambda_scope_sig_discriminator. * pt.cc (tsubst_lambda_expr): Likewise. libcc1/ * libcp1plugin.cc (plugin_start_lambda_closure_class_type): Initialize the per-scope, per-signature discriminator. gcc/testsuite/ * g++.dg/abi/lambda-sig1-18.C: New. * g++.dg/abi/lambda-sig1-18vs17.C: New. * g++.dg/cpp1y/lambda-mangle-1-18.C: New.
2022-11-01configure: cache result of "sys/sdt.h" header checkDavid Seifert2-12/+24
Use AC_CACHE_CHECK to store the result of the header check for systemtap's "sys/sdt.h", which is similar in spirit to libstdc++'s AC_CACHE_CHECK(..., glibcxx_cv_sys_sdt_h). gcc/ * configure.ac: Add AC_CACHE_CHECK(..., gcc_cv_sys_sdt_h). * configure: Regenerate.
2022-11-01gcc/file-prefix-map: Allow remapping of relative pathsRichard Purdie1-3/+13
Relative paths currently aren't remapped by -ffile-prefix-map and friends. When cross compiling with separate 'source' and 'build' directories, the same relative paths between directories may not be available on target as compared to build time. In order to be able to remap these relative build paths to paths that would work on target, resolve paths within the file-prefix-map function using realpath(). This does cause a change of behaviour if users were previously relying upon symlinks or absolute paths not being resolved. Use basename to ensure plain filenames don't have paths added. gcc/ChangeLog: * file-prefix-map.cc (remap_filename): Allow remapping of relative paths.
2022-11-01[PR tree-optimization/107490] Handle NANs in op[12]_range.Aldy Hernandez2-8/+60
None of the build_<OP> functions in range-op handle NANs. This is by design in order to force us to handle NANs specially, because "x relop NAN" makes no sense. This patch fixes a handful of op[12]_range entries that weren't handling NANs. PR tree-optimization/107490 gcc/ChangeLog: * range-op-float.cc (foperator_unordered_lt::op1_range): Handle NANs. (foperator_unordered_lt::op2_range): Same. (foperator_unordered_le::op1_range): Same. (foperator_unordered_le::op2_range): Same. (foperator_unordered_gt::op1_range): Same. (foperator_unordered_gt::op2_range): Same. (foperator_unordered_ge::op1_range): Same. (foperator_unordered_ge::op2_range): Same. gcc/testsuite/ChangeLog: * gcc.dg/tree-ssa/pr107490.c: New test.
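As a loose illustration of why the unordered operators need explicit NaN handling, the sketch below models "x UNLT y" and a conservative op1 range for the case where the comparison is known to be true. It is a toy model under stated assumptions, not the logic in range-op-float.cc; `toy_frange`, `unordered_lt` and `unlt_op1_when_true` are hypothetical names, and the closed upper bound is a deliberate over-approximation.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical stand-in for a floating-point range with a NaN flag.
struct toy_frange {
  double lo, hi;   // closed interval of the non-NaN values
  bool maybe_nan;  // whether the value may also be a NaN
};

// "x UNLT y": true if either operand is a NaN or if x < y.
bool unordered_lt(double x, double y) {
  return std::isunordered(x, y) || std::isless(x, y);
}

// Conservative range for x, given that "x UNLT y" is true and y lies in
// the given range.
toy_frange unlt_op1_when_true(const toy_frange& y) {
  assert(y.lo <= y.hi);
  const double inf = std::numeric_limits<double>::infinity();
  if (y.maybe_nan)
    return {-inf, inf, true};  // y may be NaN, so any x satisfies UNLT
  // Otherwise x is below some y in the range, or x itself is a NaN.
  return {-inf, y.hi, true};
}
```

The point of the second function is that the result must keep the NaN flag set even when y cannot be a NaN, which an ordinary "less than" range would drop.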
2022-11-01Make sure ssa-name is valid.Andrew MacLeod1-1/+1
PR tree-optimization/107497 * tree-vrp.cc (remove_unreachable::remove_and_update_globals): Check that ssa-name still exists before accessing it.
2022-11-01Make ranger vrp1 default.Andrew MacLeod3-2/+44
Turn on ranger as the default vrp1 pass and adjust testcases. gcc/ * params.opt (param_vrp1_mode): Make ranger default. gcc/testsuite/ * gcc.dg/pr68217.c: Test [-INF, -INF][0, 0] instead of [-INF, 0]. * gcc.dg/tree-ssa/vrp-unreachable.c: New. Test unreachable removal.
2022-11-01Remove builtin_unreachable in VRPAndrew MacLeod1-3/+187
Removal of __builtin_unreachable calls was handled in an inconsistent way. This removes them in the VRP pass, and sets the global range appropriately. * tree-vrp.cc (class remove_unreachable): New. (remove_unreachable::maybe_register_block): New. (remove_unreachable::remove_and_update_globals): New. (rvrp_folder::rvrp_folder): Initialize m_unreachable. (rvrp_folder::post_fold_bb): Maybe register unreachable block. (rvrp_folder::m_unreachable): New member. (execute_ranger_vrp): Add final_pass flag, remove unreachables.
2022-11-01Allow queries on exit block.Andrew MacLeod2-7/+10
Ranger was not allowing the exit block to be queried for range_on_entry or exit. This removes that restriction. * gimple-range-cache.cc (ranger_cache::fill_block_cache): Allow exit block to be specified. (ranger_cache::range_from_dom): If exit block is specified, use the immediate predecessor instead of the dominator to start. * gimple-range.cc (gimple_ranger::range_on_exit): Allow query for exit block.
2022-11-01Intersect with nonzero bits can indicate change incorrectly.Andrew MacLeod1-0/+4
* value-range.cc (irange::intersect_nonzero_bits): If new non-zero mask is the same as original, flag no change.
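The idea behind this fix can be shown with a toy mask. The sketch below is not irange; `toy_irange` and its single member are hypothetical, and the function only models the "report a change only if the mask actually changed" behaviour described in the ChangeLog entry.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in holding only a mask of possibly-nonzero bits.
struct toy_irange {
  std::uint64_t nonzero_mask;
};

// Intersect the masks and return true only if r was actually narrowed.
bool intersect_nonzero_bits(toy_irange& r, const toy_irange& other) {
  const std::uint64_t merged = r.nonzero_mask & other.nonzero_mask;
  assert((merged & ~r.nonzero_mask) == 0);  // intersection never adds bits
  if (merged == r.nonzero_mask)
    return false;  // same mask as before: flag no change
  r.nonzero_mask = merged;
  return true;
}
```

Callers that iterate until nothing changes depend on this return value being accurate, so reporting a change for an identical mask can cause needless extra work.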
2022-11-01libstdc++: std::from_chars std::{,b}float16_t supportJakub Jelinek4-29/+491
The following patch adds std::from_chars support, similarly to the previous std::to_chars patch through APIs that use float instead of the 16-bit floating point formats as container. The patch uses the fast_float library and doesn't need any changes to it, like the previous patch it introduces wrapper classes around float that represent the float holding float16_t or bfloat16_t value, and specializes binary_format etc. from fast_float for these classes. The new test verifies exhaustively to_chars and from_chars afterward results in the original value (except for nans) in all the fmt cases. 2022-11-01 Jakub Jelinek <jakub@redhat.com> * include/std/charconv (__from_chars_float16_t, __from_chars_bfloat16_t): Declare. (from_chars): Add _Float16 and __gnu_cxx::__bfloat16_t overloads. * config/abi/pre/gnu.ver (GLIBCXX_3.4.31): Export _ZSt22__from_chars_float16_tPKcS0_RfSt12chars_format and _ZSt23__from_chars_bfloat16_tPKcS0_RfSt12chars_format. * src/c++17/floating_from_chars.cc (fast_float::floating_type_float16_t, fast_float::floating_type_bfloat16_t): New classes. (fast_float::binary_format<floating_type_float16_t>, fast_float::binary_format<floating_type_bfloat16_t>): New specializations. (fast_float::to_float<floating_type_float16_t>, fast_float::to_float<floating_type_bfloat16_t>, fast_float::to_extended<floating_type_float16_t>, fast_float::to_extended<floating_type_bfloat16_t>): Likewise. (fast_float::from_chars_16): New template function. (__floating_from_chars_hex): Allow instantiation with fast_float::floating_type_{,b}float16_t. (from_chars): Formatting fixes for float/double/long double overloads. (__from_chars_float16_t, __from_chars_bfloat16_t): New functions. * testsuite/20_util/to_chars/float16_c++23.cc: New test.
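The round-trip property that the new test checks can be sketched with plain float. This is not the float16_c++23.cc test and does not use the 16-bit types; `round_trips` is a hypothetical helper, and the sketch assumes a standard library that provides the floating-point overloads of std::to_chars and std::from_chars.

```cpp
#include <cassert>
#include <charconv>
#include <cmath>
#include <cstring>
#include <system_error>

// Print a value with the shortest representation, parse it back, and
// compare bit patterns. NaNs are excluded, as in the commit message.
bool round_trips(float value) {
  assert(!std::isnan(value));
  char buf[64];
  const std::to_chars_result out = std::to_chars(buf, buf + sizeof buf, value);
  if (out.ec != std::errc{})
    return false;
  float back = 0.0f;
  const std::from_chars_result in = std::from_chars(buf, out.ptr, back);
  if (in.ec != std::errc{} || in.ptr != out.ptr)
    return false;
  return std::memcmp(&back, &value, sizeof value) == 0;
}
```

An exhaustive variant would loop over all bit patterns of the type, which is cheap for the 65536 values of a 16-bit format and much more expensive for float.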
2022-11-01libstdc++: std::to_chars std::{,b}float16_t supportJakub Jelinek3-2/+208
The following patch on top of https://gcc.gnu.org/pipermail/libstdc++/2022-October/054849.html adds std::{,b}float16_t support for std::to_chars. When precision is specified (or for std::bfloat16_t for hex mode even if not), I believe we can just use the std::to_chars float (when float is mode compatible with std::float32_t) overloads, both formats are proper subsets of std::float32_t. Unfortunately when precision is not specified and we are supposed to emit shortest string, the std::{,b}float16_t strings are usually much shorter. E.g. 1.e7p-14f16 shortest fixed representation is 0.0001161 and shortest scientific representation is 1.161e-04 while 1.e7p-14f32 (same number promoted to std::float32_t) 0.00011610985 and 1.1610985e-04. Similarly for 1.38p-112bf16, 0.000000000000000000000000000000000235 2.35e-34 vs. 1.38p-112f32 0.00000000000000000000000000000000023472271 2.3472271e-34 For std::float16_t there are differences even in the shortest hex, say: 0.01p-14 vs. 1p-22 but only for denormal std::float16_t values (where all std::float16_t denormals converted to std::float32_t are normal), __FLT16_MIN__ and everything larger in absolute value than that is the same. Unless that is a bug and we should try to discover shorter representations even for denormals... std::bfloat16_t has the same exponent range as std::float32_t, so all std::bfloat16_t denormals are also std::float32_t denormals and thus the shortest hex representations are the same. As documented, ryu can handle arbitrary IEEE like floating point formats (probably not wider than IEEE quad) using the generic_128 handling, but ryu is hidden in libstdc++.so. As only few architectures support std::float16_t right now and some of them have special ISA requirements for those (e.g. 
on i?86 one needs -msse2) and std::bfloat16_t is right now supported only on x86 (again with -msse2), perhaps with aarch64/arm coming next if ARM is interested, but I think it is possible that more will be added later, instead of exporting APIs from the library to handle directly the std::{,b}float16_t overloads this patch instead exports functions which take a float which is a superset of those and expects the inline overloads to promote the 16-bit formats to 32-bit, then inside of the library it ensures they are printed right. With the added [[gnu::cold]] attribute because I think most users will primarily use these formats as storage formats and perform arithmetics in the excess precision for them and print also as std::float32_t the added support doesn't seem to be too large, on x86_64: readelf -Ws libstdc++.so.6.0.31 | grep float16_t 912: 00000000000ae824 950 FUNC GLOBAL DEFAULT 13 _ZSt21__to_chars_bfloat16_tPcS_fSt12chars_format@@GLIBCXX_3.4.31 5767: 00000000000ae4a1 899 FUNC GLOBAL DEFAULT 13 _ZSt20__to_chars_float16_tPcS_fSt12chars_format@@GLIBCXX_3.4.31 842: 000000000016d430 106 FUNC LOCAL DEFAULT 13 _ZN12_GLOBAL__N_113get_ieee_reprINS_23floating_type_float16_tEEENS_6ieee_tIT_EES3_ 865: 0000000000170980 1613 FUNC LOCAL DEFAULT 13 +_ZSt23__floating_to_chars_hexIN12_GLOBAL__N_123floating_type_float16_tEESt15to_chars_resultPcS3_T_St8optionalIiE.constprop.0.isra.0 7205: 00000000000ae824 950 FUNC GLOBAL DEFAULT 13 _ZSt21__to_chars_bfloat16_tPcS_fSt12chars_format 7985: 00000000000ae4a1 899 FUNC GLOBAL DEFAULT 13 _ZSt20__to_chars_float16_tPcS_fSt12chars_format so 3568 code bytes together or so. 
Tested with the attached test (which doesn't prove the shortest representation, just prints std::{,b}float16_t and std::float32_t shortest strings side by side, then tries to verify it can be emitted even into the exact sized range and can't be into range one smaller than that and tries to read what is printed back using from_chars float32_t overload (so there could be double rounding, but apparently there is none for the shortest strings). The only differences printed are for NaNs, where sNaNs are canonicalized to canonical qNaNs and as to_chars doesn't print NaN mantissa, even qNaNs other than the canonical one are read back just as the canonical NaN. Also attaching what Patrick wrote to generate the pow10_adjustment_tab, for std::float16_t only 1.0, 10.0, 100.0, 1000.0 and 10000.0 are powers of 10 in the range because __FLT16_MAX__ is 65504.0, and all of the above are exactly representable in std::float16_t, so we want to use 0 in pow10_adjustment_tab. 2022-11-01 Jakub Jelinek <jakub@redhat.com> * include/std/charconv (__to_chars_float16_t, __to_chars_bfloat16_t): Declare. (to_chars): Add _Float16 and __gnu_cxx::__bfloat16_t overloads. * config/abi/pre/gnu.ver (GLIBCXX_3.4.31): Export _ZSt20__to_chars_float16_tPcS_fSt12chars_format and _ZSt21__to_chars_bfloat16_tPcS_fSt12chars_format. * src/c++17/floating_to_chars.cc (floating_type_float16_t, floating_type_bfloat16_t): New types. (floating_type_traits<floating_type_float16_t>, floating_type_traits<floating_type_bfloat16_t>, get_ieee_repr<floating_type_float16_t>, get_ieee_repr<floating_type_bfloat16_t>, __handle_special_value<floating_type_float16_t>, __handle_special_value<floating_type_bfloat16_t>): New specializations. (floating_to_shortest_scientific): Handle floating_type_float16_t and floating_type_bfloat16_t like IEEE quad. 
(__floating_to_chars_shortest): For floating_type_bfloat16_t call __floating_to_chars_hex<float> rather than __floating_to_chars_hex<floating_type_bfloat16_t> to avoid instantiating the latter. (__to_chars_float16_t, __to_chars_bfloat16_t): New functions.
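The verification strategy described in this commit message (emit into an exactly sized range, fail on a range one character smaller, then read the text back) can be sketched with the C++17 `float` overload of `std::to_chars`. This is only an illustration under stated assumptions: the helper names `shortest`, `fits_exactly` and `round_trips` are hypothetical, and the sketch uses `float` and `std::strtof` rather than the std::{,b}float16_t overloads and the attached test that the patch refers to.

```cpp
#include <cassert>
#include <charconv>
#include <cstdlib>
#include <string>
#include <system_error>

// Hypothetical helper: shortest round-trip text for a float, as produced by
// std::to_chars when no format or precision is given.
std::string shortest(float v) {
    char buf[64];  // ample room for any shortest float representation
    std::to_chars_result r = std::to_chars(buf, buf + sizeof buf, v);
    assert(r.ec == std::errc{});
    return std::string(buf, r.ptr);
}

// Hypothetical helper: true if the value can be written into a buffer of
// exactly the shortest length, and is rejected by a buffer one char smaller.
bool fits_exactly(float v) {
    const std::string s = shortest(v);
    std::string exact(s.size(), '\0');
    auto ok = std::to_chars(exact.data(), exact.data() + exact.size(), v);
    if (ok.ec != std::errc{} || exact != s)
        return false;
    std::string small(s.size() - 1, '\0');
    auto bad = std::to_chars(small.data(), small.data() + small.size(), v);
    return bad.ec == std::errc::value_too_large;
}

// Hypothetical helper: parse the shortest text back and compare bit-for-bit
// equal values (NaNs are deliberately not handled here).
bool round_trips(float v) {
    const std::string s = shortest(v);
    return std::strtof(s.c_str(), nullptr) == v;
}
```

Calling `fits_exactly(65504.0f)` exercises the largest finite std::float16_t value mentioned above, although here it is held in a plain `float`.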
2022-11-01libstdc++-v3: Some std::*float*_t charconv and i/ostream overloadsJakub Jelinek7-122/+710
The following patch adds the easy part of <charconv>, <istream> and <ostream> changes for extended floats. In particular, for the first one only overloads where the _Float* has the same format as float/double/long double and for the latter two everything but the _GLIBCXX_HAVE_FLOAT128_MATH case. For charconv, I'm not really familiar with it, I'm pretty sure we need new libstdc++.so.6 side implementation of from_chars for {,b}float16_t and for to_chars not really sure but for unspecified precision if it should emit minimum characters that to_chars then can unambiguously parse, I think it is less than in the float case. For float128_t {to,from}_chars I think we even have it on the library side already, just ifdefed for powerpc64le only. For i/o stream operator<</>>, not sure what is better, if not providing anything at all, or doing what we in the end do if user doesn't override the virtual functions, or use {to,from}_chars under the hood, something else? Besides this, the patch adds some further missed // { dg-options "-std=gnu++2b" } spots, I've also noticed I got the formatting wrong in some testcases by not using spaces around VERIFY conditions and elsewhere by having space before ( for calls. The testsuite coverage is limited, I've added test for from_chars because it was easy to port, but not really sure what to do about to_chars, it has for float/double huge testcases which would be excessive to repeat. And for i/ostream not really sure what exactly is worth testing. 2022-11-01 Jakub Jelinek <jakub@redhat.com> * include/std/charconv (from_chars, to_chars): Add _Float{32,64,128} overloads for cases where those types match {float,double,long double}. * include/std/istream (basic_istream::operator>>): Add _Float{16,32,64,128} and __gnu_cxx::__bfloat16_t overloads. * include/std/ostream (basic_ostream::operator<<): Add _Float{16,32,64,128} and __gnu_cxx::__bfloat16_t overloads. * testsuite/20_util/from_chars/8.cc: New test. 
* testsuite/26_numerics/headers/cmath/nextafter_c++23.cc (test): Formatting fixes. * testsuite/26_numerics/headers/cmath/functions_std_c++23.cc: Add dg-options "-std=gnu++2b". (test_functions, main): Formatting fixes. * testsuite/26_numerics/headers/cmath/c99_classification_macros_c++23.cc: Add dg-options "-std=gnu++2b".
2022-11-01i386: correct integer division modeling in znver.mdAlexander Monakov1-18/+21
In znver.md, division instructions have descriptions like (define_insn_reservation "znver1_idiv_DI" 41 (and (eq_attr "cpu" "znver1,znver2") (and (eq_attr "type" "idiv") (and (eq_attr "mode" "DI") (eq_attr "memory" "none")))) "znver1-double,znver1-ieu2*41") which says that DImode idiv has latency 41 (which is correct) and that it occupies 2nd integer execution unit for 41 consecutive cycles, but that is not correct: 1) the division instruction is partially pipelined, and has throughput 1/14, not 1/41; 2) for the most part it occupies a separate division unit, not the general arithmetic unit. Evidently, interaction of such 41-cycle paths with the rest of reservations causes a combinatorial explosion in the automaton. Fix this by modeling the integer division unit properly, and correcting reservations to use the measured reciprocal throughput of those instructions (available from uops.info). A similar correction for floating-point divisions is left for a followup patch. Top 5 znver table sizes, before: 68692 r znver1_ieu_check 68692 r znver1_ieu_transitions 99792 r znver1_ieu_min_issue_delay 428108 r znver1_fp_min_issue_delay 856216 r znver1_fp_transitions After: 1454 r znver1_ieu_translate 1454 r znver1_translate 2304 r znver1_ieu_transitions 428108 r znver1_fp_min_issue_delay 856216 r znver1_fp_transitions gcc/ChangeLog: PR target/87832 * config/i386/znver.md (znver1_idiv): New automaton. (znver1-idiv): New unit. (znver1_idiv_DI): Correct unit and cycles in the reservation. (znver1_idiv_SI): Ditto. (znver1_idiv_HI): Ditto. (znver1_idiv_QI): Ditto. (znver1_idiv_mem_DI): Ditto. (znver1_idiv_mem_SI): Ditto. (znver1_idiv_mem_HI): Ditto. (znver1_idiv_mem_QI): Ditto. (znver3_idiv_DI): Ditto. (znver3_idiv_SI): Ditto. (znver3_idiv_HI): Ditto. (znver3_idiv_QI): Ditto. (znver3_idiv_mem_DI): Ditto. (znver3_idiv_mem_SI): Ditto. (znver3_idiv_mem_HI): Ditto. (znver3_idiv_mem_QI): Ditto.
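The difference between the two reservations can be illustrated with a small back-of-the-envelope model. The sketch below is not how the scheduler automaton works; it only assumes a single unit that accepts a new independent division every `recip_throughput` cycles and delivers each result `latency` cycles after issue. The function names are hypothetical; the numbers 41 and 14 are the latency and reciprocal throughput quoted in the message.

```cpp
#include <cstdint>

// Cycle in which the last of n independent divisions completes, assuming one
// division can start every `recip_throughput` cycles and each takes
// `latency` cycles from issue to result.
constexpr std::uint64_t finish_cycle(std::uint64_t n, std::uint64_t latency,
                                     std::uint64_t recip_throughput) {
    return n == 0 ? 0 : (n - 1) * recip_throughput + latency;
}

// Old description: the unit is reserved for the full 41 cycles.
constexpr std::uint64_t old_model(std::uint64_t n) {
    return finish_cycle(n, 41, 41);
}

// Corrected description: partially pipelined, one division per 14 cycles.
constexpr std::uint64_t new_model(std::uint64_t n) {
    return finish_cycle(n, 41, 14);
}
```

Under these assumptions, four independent DImode divisions finish after 164 cycles in the old model and after 83 cycles in the corrected one.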
2022-11-01Fix urls in gm2.texi, remove older news, improvements to def2doc.py.Gaius Mulley3-67/+9
Remove older papers and talks as the front end has changed significantly. Remove older text description for coroutines since the underlying threading model has also changed. Correct other URLs, since floppsie.comp.glam.ac.uk has disappeared. gcc/ChangeLog: * doc/gm2.texi (Papers and talks): Removed. Urls to floppsie.comp.glam.ac.uk replaced. gcc/m2/ChangeLog: * gm2-libs-coroutines/README.texi: Remove all GNU Pthread text as the processes now use __gthread. * tools-src/def2doc.py (emitSubSection): Use '-' character. (parseDefinition): Improved.
2022-11-01c++: Reorganize per-scope lambda discriminatorsNathan Sidwell12-98/+199
We currently use a per-extra-scope counter to discriminate multiple lambdas in a particular such scope. This is not ABI compliant. This patch merely refactors the existing code to make it easier to drop in a conformant mangling -- there's no functional change here. I rename the LAMBDA_EXPR_DISCRIMINATOR to LAMBDA_EXPR_SCOPE_ONLY_DISCRIMINATOR, foreshadowing that there'll be a new discriminator. To provide ABI warnings we'll need to calculate both, and that requires some repacking of the lambda_expr's fields. Finally, although we end up calling the discriminator setter and the scope recorder (nearly) always consecutively, it's clearer to handle it as two separate operations. That also allows us to remove the instantiation special-case for a null extra-scope. gcc/cp/ * cp-tree.h (LAMBDA_EXPR_DISCRIMINATOR): Rename to ... (LAMBDA_EXPR_SCOPE_ONLY_DISCRIMINATOR): ... here. (struct tree_lambda_expr): Make default_capture_mode & discriminator_scope bitfields. (record_null_lambda_scope): Delete. (record_lambda_scope_discriminator): Declare. * lambda.cc (struct lambda_discriminator): New struct. (lambda_scope, lambda_scope_stack): Adjust types. (lambda_count): Delete. (struct tree_int): Delete. (start_lambda_scope, finish_lambda_scope): Adjust. (record_lambda_scope): Only record the scope. (record_lambda_scope_discriminator): New. * mangle.cc (write_closure_type_name): Adjust. * module.cc (trees_out::core_vals): Likewise. (trees_in::core_vals): Likewise. * parser.cc (cp_parser_lambda_expression): Call record_lambda_scope_discriminator. * pt.cc (tsubst_lambda_expr): Adjust record_lambda_scope calling. Call record_lambda_scope_discriminator. Commonize control flow on tsubsting the operator function. libcc1/ * libcp1plugin.cc (plugin_start_closure): Adjust. gcc/testsuite/ * g++.dg/abi/lambda-sig1-17.C: New. * g++.dg/abi/lambda-sig1.h: New. * g++.dg/cpp1y/lambda-mangle-1.C: Extracted to ... * g++.dg/cpp1y/lambda-mangle-1.h: ... here. 
* g++.dg/cpp1y/lambda-mangle-1-11.C: New * g++.dg/cpp1y/lambda-mangle-1-17.C
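To see why a discriminator is needed at all, consider two lambdas with the same signature in the same scope: they have distinct closure types, so their mangled names must differ in something other than the enclosing scope and the signature. The sketch below is an illustration only; the function names are hypothetical, and it does not show which numbering scheme the ABI requires or which one this commit's code produces.

```cpp
#include <string>
#include <typeinfo>
#include <utility>

// Two lambdas with identical signatures declared in the same function scope.
inline auto two_lambdas() {
    auto first = [] { return 1; };
    auto second = [] { return 2; };
    return std::make_pair(first, second);
}

// The language guarantees that the two closure types are distinct.
inline bool closure_types_differ() {
    auto p = two_lambdas();
    return typeid(p.first) != typeid(p.second);
}

// With GCC (and other Itanium-ABI compilers) the implementation-defined
// type_info names also differ; the differing part is the discriminator.
inline bool type_names_differ() {
    auto p = two_lambdas();
    return std::string(typeid(p.first).name()) != typeid(p.second).name();
}
```

Printing `typeid(p.first).name()` and `typeid(p.second).name()` side by side is a quick way to inspect the discriminators a given compiler version emits.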
2022-11-01Fix incorrect digit constraintliuhongt3-84/+77
Matching constraints are used in these circumstances. More precisely, the two operands that match must include one input-only operand and one output-only operand. Moreover, the digit must be a smaller number than the number of the operand that uses it in the constraint. In pr107057, the 2 operands in the pattern are both input operands. gcc/ChangeLog: PR target/107057 * config/i386/sse.md (*vec_interleave_highv2df): Remove constraint 1. (*vec_interleave_lowv2df): Ditto. (vec_concatv2df): Ditto. (*avx512f_unpcklpd512<mask_name>): Ditto and renamed to .. (avx512f_unpcklpd512<mask_name>): .. this. (avx512f_movddup512<mask_name>): Change to define_insn. (avx_movddup256<mask_name>): Ditto. (*avx_unpcklpd256<mask_name>): Remove constraint 1 and renamed to .. (avx_unpcklpd256<mask_name>): .. this. * config/i386/i386.cc (ix86_vec_interleave_v2df_operator_ok): Disallow MEM_P (op1) && MEM_P (op2). gcc/testsuite/ChangeLog: * gcc.target/i386/pr107057.c: New test.
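The same rule about matching (digit) constraints applies to GNU extended asm, which makes for a compact illustration. The sketch below is not taken from sse.md or from pr107057.c; it is a hypothetical helper showing a well-formed pairing: operand 0 is output-only, operand 1 is input-only, and the digit "0" on operand 1 refers to the lower-numbered operand. It assumes a compiler that supports GNU extended asm (GCC or Clang).

```cpp
// Hypothetical helper: returns its argument through a matching constraint.
inline int pass_through(int in) {
    int out;
    // The template is empty: no instruction is emitted. Because the input is
    // tied to the same location as output operand 0, `out` ends up holding
    // the value of `in`.
    __asm__("" : "=r"(out) : "0"(in));
    return out;
}
```

Using a digit constraint to tie together two operands that are both inputs, as in the patterns fixed here, falls outside what the rule allows.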
2022-11-01Enable more optimization for 32-bit/64-bit shrd/shld with imm shift count.liuhongt2-4/+173
This patch doesn't handle variable count since it requires 5 insns to be combined to get wanted pattern, but current pass_combine only supports at most 4. This patch doesn't handle 16-bit shrd/shld either. gcc/ChangeLog: PR target/55583 * config/i386/i386.md (*x86_64_shld_1): Rename to .. (x86_64_shld_1): .. this. (*x86_shld_1): Rename to .. (x86_shld_1): .. this. (*x86_64_shrd_1): Rename to .. (x86_64_shrd_1): .. this. (*x86_shrd_1): Rename to .. (x86_shrd_1): .. this. (*x86_64_shld_shrd_1_nozext): New pre_reload splitter. (*x86_shld_shrd_1_nozext): Ditto. (*x86_64_shrd_shld_1_nozext): Ditto. (*x86_shrd_shld_1_nozext): Ditto. gcc/testsuite/ChangeLog: * gcc.target/i386/pr55583.c: New test.
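For readers unfamiliar with the double-shift idiom, the following sketch shows the kind of source expression that corresponds to shrd/shld with a constant count. It is an assumption about the general shape only; the exact expressions in pr55583.c are not shown in this log, and whether a particular compiler version emits shrd/shld for them is not guaranteed. The names `shrd32` and `shld32` are hypothetical.

```cpp
#include <cstdint>

// Shift `lo` right by N bits, filling the vacated high bits from the low
// bits of `hi` (what a 32-bit shrd with immediate count computes).
template <unsigned N>
constexpr std::uint32_t shrd32(std::uint32_t hi, std::uint32_t lo) {
    static_assert(N > 0 && N < 32, "count must be a constant in 1..31");
    return (lo >> N) | (hi << (32 - N));
}

// Shift `hi` left by N bits, filling the vacated low bits from the high
// bits of `lo` (what a 32-bit shld with immediate count computes).
template <unsigned N>
constexpr std::uint32_t shld32(std::uint32_t hi, std::uint32_t lo) {
    static_assert(N > 0 && N < 32, "count must be a constant in 1..31");
    return (hi << N) | (lo >> (32 - N));
}
```

Making the count a template parameter mirrors the restriction stated above: only immediate shift counts are covered, not variable ones.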
2022-10-31c++: pass std attributes to make_call_declaratorJason Merrill1-4/+8
It seems preferable to pass these to the function rather than set them separately after the call. gcc/cp/ChangeLog: * parser.cc (make_call_declarator): Add std_attrs parm. (cp_parser_lambda_declarator_opt): Pass it. (cp_parser_direct_declarator): Pass it.
2022-10-31c++: set TREE_NOTHROW after genericizeJason Merrill1-8/+8
genericize might introduce function calls (and does on the contracts branch), so it's safer to set this flag later. gcc/cp/ChangeLog: * decl.cc (finish_function): Set TREE_NOTHROW later in the function.
2022-10-31c++: formatting tweaksJason Merrill3-7/+4
gcc/cp/ChangeLog: * decl.cc (duplicate_decls): Reformat loop. * parser.cc (cp_parser_member_declaration): Add newline. * semantics.cc: Remove newline.
2022-11-01Add attribute hot judgement for INLINE_HINT_known_hot hint.Cui,Lili2-4/+56
We set up INLINE_HINT_known_hot hint only when we have profile feedback, now add function attribute judgement for it, when both caller and callee have __attribute__((hot)), we will also set up INLINE_HINT_known_hot hint for it. With this patch applied, ADL Multi-copy: 538.imagic_r 16.7% ICX Multi-copy: 538.imagic_r 15.2% CLX Multi-copy: 538.imagic_r 12.7% Znver3 Multi-copy: 538.imagic_r 10.6% Arm Multi-copy: 538.imagic_r 13.4% gcc/ChangeLog: * ipa-inline-analysis.cc (do_estimate_edge_time): Add function attribute judgement for INLINE_HINT_known_hot hint. gcc/testsuite/ChangeLog: * gcc.dg/ipa/inlinehint-6.c: New test.
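A minimal sketch of the situation the new check looks for, assuming a compiler that accepts the GNU `hot` attribute: both the caller and the callee are marked hot, with no profile feedback involved. The function names are hypothetical and this is not the content of inlinehint-6.c; whether the call is actually inlined still depends on the inliner's other heuristics.

```cpp
// Callee marked hot.
__attribute__((hot)) static int scale(int x) {
    return x * 3 + 1;
}

// Caller marked hot as well; the edge sum_scaled -> scale is the kind of
// caller/callee pair described in the commit message.
__attribute__((hot)) int sum_scaled(const int* data, int n) {
    int total = 0;
    for (int i = 0; i < n; ++i)
        total += scale(data[i]);
    return total;
}

// Small driver so the behaviour can be checked independently of inlining.
int demo_sum() {
    const int values[] = {1, 2, 3};
    return sum_scaled(values, 3);
}
```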
2022-11-01Daily bump.GCC Administrator8-1/+409
2022-10-31RISC-V: Libitm add RISC-V support.Xiongchuan Tan4-0/+273
Reviewed-by: Palmer Dabbelt <palmer@rivosinc.com> Acked-by: Palmer Dabbelt <palmer@rivosinc.com> libitm/ChangeLog: * configure.tgt: Add riscv support. * config/riscv/asm.h: New file. * config/riscv/sjlj.S: New file. * config/riscv/target.h: New file.
2022-10-31libstdc++-v3: <complex> support for extended floating point typesJakub Jelinek4-2/+955
The following patch adds <complex> support for extended floating point types. C++23 removes the float/double/long double specializations from the spec and instead adds explicit(bool) specifier on the converting constructor. The patch uses that for converting constructor of the base template as well as the float/double/long double specializations' converting constructors (e.g. so that it handles conversion construction also from complex of extended floating point types). Copy ctor was already defaulted as the spec now requires. The patch also adds partial specialization for the _Float{16,32,64,128} and __gnu_cxx::__bfloat16_t types because the base template doesn't use __complex__ but a pair of floating point values. The g++.dg/cpp23/ testcase verifies explicit(bool) works correctly. 2022-10-31 Jakub Jelinek <jakub@redhat.com> gcc/testsuite/ * g++.dg/cpp23/ext-floating12.C: New test. libstdc++-v3/ * include/std/complex (complex::complex converting ctor): For C++23 use explicit specifier with constant expression. Explicitly cast both parts to _Tp. (__complex_abs, __complex_arg, __complex_cos, __complex_cosh, __complex_exp, __complex_log, __complex_sin, __complex_sinh, __complex_sqrt, __complex_tan, __complex_tanh, __complex_pow): Add __complex__ _Float{16,32,64,128} and __complex__ decltype(0.0bf16) overloads. (complex<float>::complex converting ctor, complex<double>::complex converting ctor, complex<long double>::complex converting ctor): For C++23 implement as template with explicit specifier with constant expression and explicit casts. (__complex_type): New template. (complex): New partial specialization for types with extended floating point types. (__complex_acos, __complex_asin, __complex_atan, __complex_acosh, __complex_asinh, __complex_atanh): Add __complex__ _Float{16,32,64,128} and __complex__ decltype(0.0bf16) overloads. (__complex_proj): Likewise. Add template for complex of extended floating point types. 
* include/bits/cpp_type_traits.h (__is_floating): Specialize for _Float{16,32,64,128} and __gnu_cxx::__bfloat16_t. * testsuite/26_numerics/complex/ext_c++23.cc: New test.
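The explicit(bool) technique mentioned in this commit can be sketched on a toy type. This is not libstdc++'s std::complex: `Pair` is a hypothetical stand-in, and the condition `sizeof(U) > sizeof(T)` is a simplifying assumption used here in place of the conversion-rank rule the standard specifies. The sketch needs a compiler with C++20 explicit(bool) support.

```cpp
#include <type_traits>

// Toy stand-in for a complex-like type storing a pair of values.
template <typename T>
struct Pair {
    T re{};
    T im{};

    constexpr Pair() = default;
    constexpr Pair(T r, T i) : re(r), im(i) {}

    // Converting constructor: implicit when the source is not wider than the
    // destination, explicit when the conversion could lose information.
    // Both parts are cast explicitly to the destination type.
    template <typename U>
    constexpr explicit(sizeof(U) > sizeof(T)) Pair(const Pair<U>& other)
        : re(static_cast<T>(other.re)), im(static_cast<T>(other.im)) {}
};

// Widening works with copy-initialization because the constructor is
// implicit in this direction.
inline double widen_demo() {
    Pair<double> wide = Pair<float>(1.5f, 2.0f);
    return wide.re;
}

// Narrowing requires an explicit conversion.
inline float narrow_demo() {
    Pair<float> narrow{Pair<double>(0.25, 4.0)};
    return narrow.im;
}
```

With a single constructor template, one declaration covers both directions, which is the reason a constant-expression explicit specifier is convenient here.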