Age | Commit message | Author | Files | Lines
2020-01-18 | [C++ coroutines] Initial implementation. | Iain Sandoe | 154 | -9/+11052
This is the squashed version of the first 6 patches that were split to facilitate review.

The changes to libiberty (7th patch) to support demangling the co_await operator stand alone and are applied separately.

The patch series is an initial implementation of a coroutine feature, expected to be standardised in C++20.

Standardisation status (and potential impact on this implementation)
--------------------------------------------------------------------

The facility was accepted into the working draft for C++20 by WG21 in February 2019. During the following WG21 meetings, design and national body comments have been reviewed, with no significant change resulting.

The current GCC implementation is against n4835 [1].

At this stage, the remaining potential for change comes from:

* Areas of national body comments that were not resolved in the version we have worked to:
  (a) handling of the situation where aligned allocation is available.
  (b) handling of the situation where a user wants coroutines, but does not want exceptions (e.g. a GPU).

* Agreed changes that have not yet been worded in a draft standard that we have worked to.

It is not expected that the resolution to these can produce any major change at this phase of the standardisation process. Such changes should be limited to the coroutine-specific code.

ABI
---

The various compiler developers ('vendors') have discussed a minimal ABI to allow one implementation to call coroutines compiled by another.

This amounts to:

1. The layout of a public portion of the coroutine frame.

   Coroutines need to preserve state across suspension points; the storage for this is called a "coroutine frame".

   The ABI mandates that pointers into the coroutine frame point to an area beginning with two function pointers (to the resume and destroy functions described below); these are immediately followed by the "promise object" described in the standard.
   This is sufficient for the builtins to take a coroutine frame pointer and determine the address of the promise (or call the resume/destroy functions).

2. A number of compiler builtins that the standard library might use.

   These are implemented by this patch series.

3. This introduces a new operator 'co_await', the mangling for which is also agreed between vendors (and has an issue filed for that against the upstream c++abi). Demangling for this is added to libiberty in a separate patch.

The ABI currently has no target-specific content (a given psABI might elect to mandate alignment, but the common ABI does not do this).

Standard Library impact
-----------------------

The current implementations require addition of only a single header to the standard library (no change to the runtime). This header is part of the patch.

GCC Implementation outline
--------------------------

The standard's design for coroutines does not decorate the definition of a coroutine in any way, so that a function is only known to be a coroutine when one of the keywords (co_await, co_yield, co_return) is encountered.

This means that we cannot special-case such functions from the outset, but must process them differently when they are finalised - which we do from "finish_function ()".

At a high level, this design of coroutine produces four pieces from the original user's function:

1. A coroutine state frame (taking the logical place of the activation record for a regular function). One item stored in that state is the index of the current suspend point.

2. A "ramp" function.
   This is what the user calls to construct the coroutine frame and start the coroutine execution. This will return some object representing the coroutine's eventual return value (or means to continue it when it is suspended).

3. A "resume" function.
   This is what gets called when a suspended coroutine is resumed.

4. A "destroy" function.
   This is what gets called when the coroutine state should be destroyed and its memory released.

The standard's coroutines involve cooperation of the user's authored function with a provided "promise" class, which includes mandatory methods for handling the state transitions and providing output values. Most realistic coroutines will also have one or more 'awaiter' classes that implement the user's actions for each suspend point. As we parse (or during template expansion) the types of the promise and awaiter classes become known, and can then be verified against the signatures expected by the standard.

Once the function is parsed (and templates expanded) we are able to make the transformation into the four pieces noted above.

The implementation here takes the approach of a series of AST transforms. The state machine suspend points are encoded in three internal functions (one of which represents an exit from scope without cleanups). These three IFNs are lowered early in the middle end, such that the majority of GCC's optimisers can be run on the resulting output.

As a design choice, we have carried out the outlining of the user's function in the front end, and taken advantage of the existing middle end's abilities to inline and DCE where that is profitable.

Since the state machine is actually common to both resumer and destroyer functions, we make only a single function "actor" that contains both the resume and destroy paths. The destroy function is represented by a small stub that sets a value to signal the use of the destroy path and calls the actor. The idea is that optimisation of the state machine need only be done once - and then the resume and destroy paths can be identified, allowing the middle end's inline and DCE machinery to optimise as profitable, as noted above.

The middle end components for this implementation are:

A pass that:

1. Lowers the coroutine builtins that allow the standard library header to interact with the coroutine frame (these are fairly simple logical or numerical substitutions of values, given a coroutine frame pointer).

2. Lowers the IFN that represents the exit from scope without cleanup. Essentially, this becomes a gimple goto.

3. Sets the final size of the coroutine frame at this stage.

A second pass (that requires the revised CFG that results from the lowering of the scope exit IFNs in the first), which:

1. Lowers the IFNs that represent the state machine paths for the resume and destroy cases.

Patches squashed into this commit:

[C++ coroutines 1] Common code and base definitions.

This part of the patch series provides the gating flag, the keywords, cpp defines etc.

[C++ coroutines 2] Define builtins and internal functions.

This part of the patch series provides the builtin functions used by the standard library code and the internal functions used to implement lowering of the coroutine state machine.

[C++ coroutines 3] Front end parsing and transforms.

There are two parts to this.

1. Parsing, template instantiation and diagnostics for the standard-mandated class entries.

The user authors a function that becomes a coroutine (lazily) by making use of any of the co_await, co_yield or co_return keywords.

Unlike a regular function, where the activation record is placed on the stack, and is destroyed on function exit, a coroutine has some state that persists between calls - the 'coroutine frame' (thus analogous to a stack frame).

We transform the user's function into three pieces:

1. A so-called ramp function, that establishes the coroutine frame and begins execution of the coroutine.
2. An actor function that contains the state machine corresponding to the user's suspend/resume structure.
3. A stub function that calls the actor function in 'destroy' mode.

The actor function is executed:

* from "resume point 0" by the ramp.
* from resume point N ( > 0 ) for handle.resume() calls.
* from the destroy stub for destroy point N for handle.destroy() calls.

The C++ coroutine design described in the standard makes use of some helper methods that are authored in a so-called "promise" class provided by the user.

At parse time (or post substitution) the type of the coroutine promise will be determined. At that point, we can look up the required promise class methods and issue diagnostics if they are missing or incorrect. To avoid repeating these actions at code-gen time, we make use of temporary 'proxy' variables for the coroutine handle and the promise - which will eventually be instantiated in the coroutine frame.

Each of the keywords will expand to a code sequence (although co_yield is just syntactic sugar for a co_await).

We defer the analysis and transformation until template expansion is complete so that we have complete types at that time.

2. AST analysis and transformation which performs the code-gen for the outlined state machine.

The entry point here is morph_fn_to_coro () which is called from finish_function () when we have completed any template expansion.

This is preceded by helper functions that implement the phases below.

The process proceeds in four phases.

A Initial framing.
  The user's function body is wrapped in the initial and final suspend points and we begin building the coroutine frame. We build empty decls for the actor and destroyer functions at this time too. When exceptions are enabled, the user's function body will also be wrapped in a try-catch block with the catch invoking the promise class 'unhandled_exception' method.

B Analysis.
  The user's function body is analysed to determine the suspend points, if any, and to capture local variables that might persist across such suspensions. In most cases, it is not necessary to capture compiler temporaries, since the tree-lowering nests the suspensions correctly.
  However, in the case of a captured reference, there is a lifetime extension to the end of the full expression - which can mean across a suspend point, in which case it must be promoted to a frame variable.

  At the conclusion of analysis, we have a conservative frame layout and maps of the local variables to their frame entry points.

C Build the ramp function.
  Carry out the allocation for the coroutine frame (NOTE: the actual size computation is deferred until late in the middle end to allow for future optimisations that will be allowed to elide unused frame entries). We build the return object.

D Build and expand the actor and destroyer function bodies.
  The destroyer is a trivial shim that sets a bit to indicate that the destroy dispatcher should be used and then calls into the actor.

  The actor function is the implementation of the user's state machine. The current suspend point is noted in an index. Each suspend point is encoded as a pair of internal functions, one in the relevant dispatcher, and one representing the suspend point.

  During this process, the user's local variables and the proxies for the self-handle and the promise class instance are re-written to their coroutine frame equivalents.

  The complete bodies for the ramp, actor and destroy function are passed back to finish_function for folding and gimplification.

[C++ coroutines 4] Middle end expanders and transforms.

The first part of this is a pass that provides:

* expansion of the library support builtins; these are simple boolean or numerical substitutions.

* The functionality of implementing an exit from scope without cleanup is performed here by lowering an IFN to a gimple goto.

This pass has to run for non-coroutine functions, since functions calling the builtins are not necessarily coroutines (i.e. they are implementing the library interfaces which may be called from anywhere).

The second part is the expansion of the coroutine IFNs that describe the state machine connections to the dispatchers.
This only has to be run for functions that are coroutine components. The work done by this pass is:

In the front end we construct a single actor function that contains the coroutine state machine.

The actor function has three entry conditions:

1. from the ramp, resume point 0 - to initial-suspend.
2. when resume () is executed (resume point N).
3. from the destroy () shim when that is executed.

The actor function begins with two dispatchers; one for resume and one for destroy (where the initial entry from the ramp is a special case of resume point 0).

Each suspend point and each dispatch entry is marked with an IFN such that we can connect the relevant dispatchers to their target labels.

So, if we have:

  CO_YIELD (NUM, FINAL, RES_LAB, DEST_LAB, FRAME_PTR)

This is await point NUM, and is the final await if FINAL is non-zero. The resume point is RES_LAB, and the destroy point is DEST_LAB.

We expect to find a CO_ACTOR (NUM) in the resume dispatcher and a CO_ACTOR (NUM+1) in the destroy dispatcher.

Initially, the intent of keeping the resume and destroy paths together is that the conditionals controlling them are identical, and thus there would be duplication of any optimisation of those paths if the split were earlier.

Subsequent inlining of the actor (and DCE) is then able to extract the resume and destroy paths as separate functions if that is found profitable by the optimisers.

Once we have remade the connections to their correct positions, we elide the labels that the front end inserted.

[C++ coroutines 5] Standard library header.

This provides the interfaces mandated by the standard and implements the interaction with the coroutine frame by means of inline use of builtins expanded at compile-time. There should be a 1:1 correspondence with the standard sections which are cross-referenced.

There is no runtime content.

At this stage, we have the content in an inline namespace "__n4835" for the CD we worked to.

[C++ coroutines 6] Testsuite.
There are two categories of test: 1. Checks for correctly formed source code and the error reporting. 2. Checks for transformation and code-gen. The second set are run as 'torture' tests for the standard options set, including LTO. These are also intentionally run with no options provided (from the coroutines.exp script). gcc/ChangeLog: 2020-01-18 Iain Sandoe <iain@sandoe.co.uk> * Makefile.in: Add coroutine-passes.o. * builtin-types.def (BT_CONST_SIZE): New. (BT_FN_BOOL_PTR): New. (BT_FN_PTR_PTR_CONST_SIZE_BOOL): New. * builtins.def (DEF_COROUTINE_BUILTIN): New. * coroutine-builtins.def: New file. * coroutine-passes.cc: New file. * function.h (struct GTY function): Add a bit to indicate that the function is a coroutine component. * internal-fn.c (expand_CO_FRAME): New. (expand_CO_YIELD): New. (expand_CO_SUSPN): New. (expand_CO_ACTOR): New. * internal-fn.def (CO_ACTOR): New. (CO_YIELD): New. (CO_SUSPN): New. (CO_FRAME): New. * passes.def: Add pass_coroutine_lower_builtins, pass_coroutine_early_expand_ifns. * tree-pass.h (make_pass_coroutine_lower_builtins): New. (make_pass_coroutine_early_expand_ifns): New. * doc/invoke.texi: Document the fcoroutines command line switch. gcc/c-family/ChangeLog: 2020-01-18 Iain Sandoe <iain@sandoe.co.uk> * c-common.c (co_await, co_yield, co_return): New. * c-common.h (RID_CO_AWAIT, RID_CO_YIELD, RID_CO_RETURN): New enumeration values. (D_CXX_COROUTINES): Bit to identify coroutines are active. (D_CXX_COROUTINES_FLAGS): Guard for coroutine keywords. * c-cppbuiltin.c (__cpp_coroutines): New cpp define. * c.opt (fcoroutines): New command-line switch. gcc/cp/ChangeLog: 2020-01-18 Iain Sandoe <iain@sandoe.co.uk> * Make-lang.in: Add coroutines.o. * cp-tree.h (lang_decl-fn): coroutine_p, new bit. (DECL_COROUTINE_P): New. * lex.c (init_reswords): Enable keywords when the coroutine flag is set, * operators.def (co_await): New operator. * call.c (add_builtin_candidates): Handle CO_AWAIT_EXPR. (op_error): Likewise. (build_new_op_1): Likewise. 
(build_new_function_call): Validate coroutine builtin arguments. * constexpr.c (potential_constant_expression_1): Handle CO_AWAIT_EXPR, CO_YIELD_EXPR, CO_RETURN_EXPR. * coroutines.cc: New file. * cp-objcp-common.c (cp_common_init_ts): Add CO_AWAIT_EXPR, CO_YIELD_EXPR, CO_RETURN_EXPR as TS expressions. * cp-tree.def (CO_AWAIT_EXPR, CO_YIELD_EXPR, CO_RETURN_EXPR): New. * cp-tree.h (coro_validate_builtin_call): New. * decl.c (emit_coro_helper): New. (finish_function): Handle the case when a function is found to be a coroutine, perform the outlining and emit the outlined functions. Set a bit to signal that this is a coroutine component. * parser.c (enum required_token): New enumeration RT_CO_YIELD. (cp_parser_unary_expression): Handle co_await. (cp_parser_assignment_expression): Handle co_yield. (cp_parser_statement): Handle RID_CO_RETURN. (cp_parser_jump_statement): Handle co_return. (cp_parser_operator): Handle co_await operator. (cp_parser_yield_expression): New. (cp_parser_required_error): Handle RT_CO_YIELD. * pt.c (tsubst_copy): Handle CO_AWAIT_EXPR. (tsubst_expr): Handle CO_AWAIT_EXPR, CO_YIELD_EXPR and CO_RETURN_EXPRs. * tree.c (cp_walk_subtrees): Likewise. libstdc++-v3/ChangeLog: 2020-01-18 Iain Sandoe <iain@sandoe.co.uk> * include/Makefile.am: Add coroutine to the std set. * include/Makefile.in: Regenerated. * include/std/coroutine: New file. gcc/testsuite/ChangeLog: 2020-01-18 Iain Sandoe <iain@sandoe.co.uk> * g++.dg/coroutines/co-await-syntax-00-needs-expr.C: New test. * g++.dg/coroutines/co-await-syntax-01-outside-fn.C: New test. * g++.dg/coroutines/co-await-syntax-02-outside-fn.C: New test. * g++.dg/coroutines/co-await-syntax-03-auto.C: New test. * g++.dg/coroutines/co-await-syntax-04-ctor-dtor.C: New test. * g++.dg/coroutines/co-await-syntax-05-constexpr.C: New test. * g++.dg/coroutines/co-await-syntax-06-main.C: New test. * g++.dg/coroutines/co-await-syntax-07-varargs.C: New test. * g++.dg/coroutines/co-await-syntax-08-lambda-auto.C: New test.
* g++.dg/coroutines/co-return-syntax-01-outside-fn.C: New test. * g++.dg/coroutines/co-return-syntax-02-outside-fn.C: New test. * g++.dg/coroutines/co-return-syntax-03-auto.C: New test. * g++.dg/coroutines/co-return-syntax-04-ctor-dtor.C: New test. * g++.dg/coroutines/co-return-syntax-05-constexpr-fn.C: New test. * g++.dg/coroutines/co-return-syntax-06-main.C: New test. * g++.dg/coroutines/co-return-syntax-07-vararg.C: New test. * g++.dg/coroutines/co-return-syntax-08-bad-return.C: New test. * g++.dg/coroutines/co-return-syntax-09-lambda-auto.C: New test. * g++.dg/coroutines/co-yield-syntax-00-needs-expr.C: New test. * g++.dg/coroutines/co-yield-syntax-01-outside-fn.C: New test. * g++.dg/coroutines/co-yield-syntax-02-outside-fn.C: New test. * g++.dg/coroutines/co-yield-syntax-03-auto.C: New test. * g++.dg/coroutines/co-yield-syntax-04-ctor-dtor.C: New test. * g++.dg/coroutines/co-yield-syntax-05-constexpr.C: New test. * g++.dg/coroutines/co-yield-syntax-06-main.C: New test. * g++.dg/coroutines/co-yield-syntax-07-varargs.C: New test. * g++.dg/coroutines/co-yield-syntax-08-needs-expr.C: New test. * g++.dg/coroutines/co-yield-syntax-09-lambda-auto.C: New test. * g++.dg/coroutines/coro-builtins.C: New test. * g++.dg/coroutines/coro-missing-gro.C: New test. * g++.dg/coroutines/coro-missing-promise-yield.C: New test. * g++.dg/coroutines/coro-missing-ret-value.C: New test. * g++.dg/coroutines/coro-missing-ret-void.C: New test. * g++.dg/coroutines/coro-missing-ueh-1.C: New test. * g++.dg/coroutines/coro-missing-ueh-2.C: New test. * g++.dg/coroutines/coro-missing-ueh-3.C: New test. * g++.dg/coroutines/coro-missing-ueh.h: New test. * g++.dg/coroutines/coro-pre-proc.C: New test. * g++.dg/coroutines/coro.h: New file. * g++.dg/coroutines/coro1-ret-int-yield-int.h: New file. * g++.dg/coroutines/coroutines.exp: New file. * g++.dg/coroutines/torture/alloc-00-gro-on-alloc-fail.C: New test. * g++.dg/coroutines/torture/alloc-01-overload-newdel.C: New test. 
* g++.dg/coroutines/torture/call-00-co-aw-arg.C: New test. * g++.dg/coroutines/torture/call-01-multiple-co-aw.C: New test. * g++.dg/coroutines/torture/call-02-temp-co-aw.C: New test. * g++.dg/coroutines/torture/call-03-temp-ref-co-aw.C: New test. * g++.dg/coroutines/torture/class-00-co-ret.C: New test. * g++.dg/coroutines/torture/class-01-co-ret-parm.C: New test. * g++.dg/coroutines/torture/class-02-templ-parm.C: New test. * g++.dg/coroutines/torture/class-03-operator-templ-parm.C: New test. * g++.dg/coroutines/torture/class-04-lambda-1.C: New test. * g++.dg/coroutines/torture/class-05-lambda-capture-copy-local.C: New test. * g++.dg/coroutines/torture/class-06-lambda-capture-ref.C: New test. * g++.dg/coroutines/torture/co-await-00-trivial.C: New test. * g++.dg/coroutines/torture/co-await-01-with-value.C: New test. * g++.dg/coroutines/torture/co-await-02-xform.C: New test. * g++.dg/coroutines/torture/co-await-03-rhs-op.C: New test. * g++.dg/coroutines/torture/co-await-04-control-flow.C: New test. * g++.dg/coroutines/torture/co-await-05-loop.C: New test. * g++.dg/coroutines/torture/co-await-06-ovl.C: New test. * g++.dg/coroutines/torture/co-await-07-tmpl.C: New test. * g++.dg/coroutines/torture/co-await-08-cascade.C: New test. * g++.dg/coroutines/torture/co-await-09-pair.C: New test. * g++.dg/coroutines/torture/co-await-10-template-fn-arg.C: New test. * g++.dg/coroutines/torture/co-await-11-forwarding.C: New test. * g++.dg/coroutines/torture/co-await-12-operator-2.C: New test. * g++.dg/coroutines/torture/co-await-13-return-ref.C: New test. * g++.dg/coroutines/torture/co-ret-00-void-return-is-ready.C: New test. * g++.dg/coroutines/torture/co-ret-01-void-return-is-suspend.C: New test. * g++.dg/coroutines/torture/co-ret-03-different-GRO-type.C: New test. * g++.dg/coroutines/torture/co-ret-04-GRO-nontriv.C: New test. * g++.dg/coroutines/torture/co-ret-05-return-value.C: New test. * g++.dg/coroutines/torture/co-ret-06-template-promise-val-1.C: New test. 
* g++.dg/coroutines/torture/co-ret-07-void-cast-expr.C: New test. * g++.dg/coroutines/torture/co-ret-08-template-cast-ret.C: New test. * g++.dg/coroutines/torture/co-ret-09-bool-await-susp.C: New test. * g++.dg/coroutines/torture/co-ret-10-expression-evaluates-once.C: New test. * g++.dg/coroutines/torture/co-ret-11-co-ret-co-await.C: New test. * g++.dg/coroutines/torture/co-ret-12-co-ret-fun-co-await.C: New test. * g++.dg/coroutines/torture/co-ret-13-template-2.C: New test. * g++.dg/coroutines/torture/co-ret-14-template-3.C: New test. * g++.dg/coroutines/torture/co-yield-00-triv.C: New test. * g++.dg/coroutines/torture/co-yield-01-multi.C: New test. * g++.dg/coroutines/torture/co-yield-02-loop.C: New test. * g++.dg/coroutines/torture/co-yield-03-tmpl.C: New test. * g++.dg/coroutines/torture/co-yield-04-complex-local-state.C: New test. * g++.dg/coroutines/torture/co-yield-05-co-aw.C: New test. * g++.dg/coroutines/torture/co-yield-06-fun-parm.C: New test. * g++.dg/coroutines/torture/co-yield-07-template-fn-param.C: New test. * g++.dg/coroutines/torture/co-yield-08-more-refs.C: New test. * g++.dg/coroutines/torture/co-yield-09-more-templ-refs.C: New test. * g++.dg/coroutines/torture/coro-torture.exp: New file. * g++.dg/coroutines/torture/exceptions-test-0.C: New test. * g++.dg/coroutines/torture/func-params-00.C: New test. * g++.dg/coroutines/torture/func-params-01.C: New test. * g++.dg/coroutines/torture/func-params-02.C: New test. * g++.dg/coroutines/torture/func-params-03.C: New test. * g++.dg/coroutines/torture/func-params-04.C: New test. * g++.dg/coroutines/torture/func-params-05.C: New test. * g++.dg/coroutines/torture/func-params-06.C: New test. * g++.dg/coroutines/torture/lambda-00-co-ret.C: New test. * g++.dg/coroutines/torture/lambda-01-co-ret-parm.C: New test. * g++.dg/coroutines/torture/lambda-02-co-yield-values.C: New test. * g++.dg/coroutines/torture/lambda-03-auto-parm-1.C: New test. * g++.dg/coroutines/torture/lambda-04-templ-parm.C: New test. 
* g++.dg/coroutines/torture/lambda-05-capture-copy-local.C: New test. * g++.dg/coroutines/torture/lambda-06-multi-capture.C: New test. * g++.dg/coroutines/torture/lambda-07-multi-yield.C: New test. * g++.dg/coroutines/torture/lambda-08-co-ret-parm-ref.C: New test. * g++.dg/coroutines/torture/local-var-0.C: New test. * g++.dg/coroutines/torture/local-var-1.C: New test. * g++.dg/coroutines/torture/local-var-2.C: New test. * g++.dg/coroutines/torture/local-var-3.C: New test. * g++.dg/coroutines/torture/local-var-4.C: New test. * g++.dg/coroutines/torture/mid-suspend-destruction-0.C: New test. * g++.dg/coroutines/torture/pr92933.C: New test.
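The split into frame, ramp, actor and destroy stub described in the commit message above can be modelled by hand in ordinary C++. The following is only a sketch under stated assumptions: every name (Frame, Promise, actor, destroy_stub, ramp, sum_all) is hypothetical, the "coroutine" is a toy generator that yields 0 .. limit-1, and none of this is the code GCC actually generates. It only mirrors the described layout (two function pointers, then the promise, then saved state including the suspend index) and the single actor shared by the resume and destroy paths.

```cpp
#include <cassert>

// Hypothetical promise: holds the most recently yielded value.
struct Promise { int current = 0; };

// Hypothetical frame, following the layout in the ABI notes:
// resume pointer, destroy pointer, then the promise, then private state.
struct Frame {
  void (*resume)(Frame *);
  void (*destroy)(Frame *);
  Promise promise;
  int suspend_index;  // index of the current suspend point
  bool destroying;    // set by the destroy stub before calling the actor
  int i;              // a local variable that lives across suspensions
  int limit;          // copy of the function parameter
};

// One actor holds both paths, selected by the 'destroying' flag.
void actor(Frame *f) {
  assert(f != nullptr);
  if (f->destroying) {           // destroy dispatcher
    delete f;                    // release the frame
    return;
  }
  switch (f->suspend_index) {    // resume dispatcher
  case 0: f->i = 0; break;       // resume point 0: entry from the ramp
  case 1: ++f->i; break;         // resume after a yield
  default: return;               // already at the final suspend point
  }
  if (f->i < f->limit) {
    f->promise.current = f->i;   // models "co_yield i"
    f->suspend_index = 1;
  } else {
    f->suspend_index = 2;        // final suspend
  }
}

// The destroy stub only flags the destroy path and calls the actor.
void destroy_stub(Frame *f) {
  f->destroying = true;
  actor(f);
}

// The ramp allocates the frame and runs to the first suspend point.
Frame *ramp(int limit) {
  Frame *f = new Frame();
  f->resume = actor;
  f->destroy = destroy_stub;
  f->suspend_index = 0;
  f->destroying = false;
  f->limit = limit;
  actor(f);
  return f;
}

// Caller-side loop: consume every yielded value, then destroy the frame.
int sum_all(int limit) {
  Frame *f = ramp(limit);
  int total = 0;
  while (f->suspend_index != 2) {
    total += f->promise.current;
    f->resume(f);
  }
  f->destroy(f);
  return total;
}
```

Calling sum_all(4) drives the model through the ramp, three resumes that yield, a final resume that reaches the final suspend point, and one destroy. A caller holding only the frame pointer needs nothing beyond the two leading function pointers and the promise, which is the point of the minimal ABI.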
2020-01-18 | arm: Remove yet another unused variable. | Jakub Jelinek | 2 | -1/+2
Bootstrap found yet another unused variable: ../../gcc/config/arm/vfp.md:1651:17: warning: unused variable 'regname' [-Wunused-variable] 2020-01-18 Jakub Jelinek <jakub@redhat.com> * config/arm/vfp.md (*clear_vfp_multiple): Remove unused variable.
2020-01-18 | arm: fix rtl checking bootstrap (PR target/93312) | Jakub Jelinek | 2 | -7/+15
As reported in PR93312, the:
* config/arm/arm.c (clear_operation_p): New function.
change broke RTL checking bootstrap.

On the testcase from the PR (which is distilled from libgcc2.c, so I think we don't need to add it into the testsuite) we ICE because SET_DEST (elt) is not a REG, but a SUBREG. The code uses REGNO on it, which is invalid, but only stores it into a variable, then performs a REG_P (reg) check, determines it is not a REG and bails early.

The following patch just moves the regno variable initialization after that check; it isn't used in between. And, as a small optimization, because reg doesn't change, it doesn't use REGNO (reg) a second time to set last_regno.

2020-01-18 Jakub Jelinek <jakub@redhat.com>

PR target/93312
* config/arm/arm.c (clear_operation_p): Don't use REGNO until after checking the argument is a REG. Don't use REGNO (reg) again to set last_regno, reuse regno variable instead.
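The ordering problem described above can be illustrated with a small stand-alone model. This is a sketch only: Rtx, Code, is_reg and regno_checked are hypothetical stand-ins for GCC's rtx, REG_P and REGNO, and the assert merely imitates what an RTL-checking build would complain about.

```cpp
#include <cassert>

// Hypothetical, simplified stand-ins; not GCC's real definitions.
enum class Code { Reg, Subreg };
struct Rtx {
  Code code;
  unsigned regno;  // only meaningful when code == Code::Reg
};

bool is_reg(const Rtx &x) { return x.code == Code::Reg; }

// Models a checked accessor: reading the number of a non-REG is an error.
unsigned regno_checked(const Rtx &x) {
  assert(is_reg(x) && "REGNO used on a non-REG");
  return x.regno;
}

// Fixed ordering: test the operand kind first, read the number afterwards.
bool dest_regno(const Rtx &dest, unsigned &out) {
  if (!is_reg(dest))
    return false;             // bail out before touching the register number
  out = regno_checked(dest);  // read once and reuse the stored value
  return true;
}
```

With the accessor call placed before the is_reg test, the SUBREG case would trip the assertion even though the value is never used, which is the same shape as the bootstrap failure.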
2020-01-17 | PR93234 INQUIRE on pre-assigned files of ROUND and SIGN properties | Jerry DeLisle | 4 | -8/+87
PR libfortran/93234 * io/unit.c (set_internal_unit): Set round and sign flags correctly. * gfortran.dg/inquire_pre.f90: New test.
2020-01-18 | Daily bump. | GCC Administrator | 1 | -1/+1
2020-01-17 | analyzer: prevent ICE on isnan (PR 93290) | David Malcolm | 6 | -4/+34
PR analyzer/93290 reports an ICE on calls to isnan(). The root cause is that an UNORDERED_EXPR is passed to region_model::eval_condition_without_cm, and there's a stray gcc_unreachable () in the case where we're comparing an svalue against itself. I attempted a more involved patch that properly handled NaN in general but it seems I've baked the assumption of reflexivity too deeply into the constraint_manager code. For now, this patch avoids the ICE and documents the limitation. gcc/analyzer/ChangeLog: PR analyzer/93290 * region-model.cc (region_model::eval_condition_without_cm): Avoid gcc_unreachable for unexpected operations for the case where we're comparing an svalue against itself. gcc/ChangeLog * doc/analyzer.texi (Limitations): Add note about NaN. gcc/testsuite/ChangeLog: PR analyzer/93290 * gcc.dg/analyzer/pr93290.c: New test.
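The reflexivity assumption mentioned above is exactly what NaN breaks. The short sketch below is illustrative only and unrelated to the analyzer's own code; the function names are hypothetical. It shows that a value compared with itself is not always equal, and that an unordered comparison of a value with itself is true precisely for NaN.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Reflexive for ordinary values, but false when x is NaN.
bool equals_itself(double x) { return x == x; }

// True exactly when x is NaN: a NaN is unordered with everything,
// including itself.
bool unordered_with_itself(double x) { return std::isunordered(x, x); }

// Helper producing a quiet NaN for experiments.
double make_nan() {
  double n = std::numeric_limits<double>::quiet_NaN();
  assert(std::isnan(n));
  return n;
}
```

A constraint solver that records "x == x" as always true therefore needs a separate story for floating-point values that may be NaN, which is the limitation the patch documents.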
2020-01-17 | PR90374 Zero width format specifiers. | Jerry DeLisle | 3 | -2/+14
PR libfortran/90374 * io/format.c (parse_format_list): Zero width not allowed with FMT_D. * io/write_float.def (build_float_string): Include range of higher exponent values that require wider width.
2020-01-17 | Add testcase of PR c++/92542, already fixed. | Paolo Carlini | 1 | -0/+5
PR c++/92542 * g++.dg/pr92542.C: New.
2020-01-17 | Add testcase of PR c++/92542, already fixed. | Paolo Carlini | 1 | -0/+15
PR c++/92542 * g++.dg/pr92542.C: New.
2020-01-17 | [GCC/ARM, 2/2] Add support for ASRL(imm), LSLL(imm) and LSRL(imm) instructions for Armv8.1-M Mainline | Mihail Ionescu | 7 | -6/+86
This patch adds the following instructions:

ASRL (imm)
LSLL (imm)
LSRL (imm)

*** gcc/ChangeLog ***

2020-01-17 Mihail-Calin Ionescu <mihail.ionescu@arm.com>
Sudakshina Das <sudi.das@arm.com>

* config/arm/arm.md (ashldi3): Generate thumb2_lsll for both reg and valid immediate.
(ashrdi3): Generate thumb2_asrl for both reg and valid immediate.
(lshrdi3): Generate thumb2_lsrl for valid immediates.
* config/arm/constraints.md (Pg): New.
* config/arm/predicates.md (long_shift_imm): New.
(arm_reg_or_long_shift_imm): Likewise.
* config/arm/thumb2.md (thumb2_asrl): New immediate alternative.
(thumb2_lsll): Likewise.
(thumb2_lsrl): New.

*** gcc/testsuite/ChangeLog ***

2020-01-17 Mihail-Calin Ionescu <mihail.ionescu@arm.com>
Sudakshina Das <sudi.das@arm.com>

* gcc.target/arm/armv8_1m-shift-imm_1.c: New test.
2020-01-17 | [GCC/ARM, 1/2] Add support for ASRL(reg) and LSLL(reg) instructions for Armv8.1-M Mainline | Mihail Ionescu | 6 | -3/+82
This patch adds the following instructions:

ASRL (reg)
LSLL (reg)

*** gcc/ChangeLog ***

2020-01-17 Mihail-Calin Ionescu <mihail.ionescu@arm.com>
Sudakshina Das <sudi.das@arm.com>

* config/arm/arm.md (ashldi3): Generate thumb2_lsll for TARGET_HAVE_MVE.
(ashrdi3): Generate thumb2_asrl for TARGET_HAVE_MVE.
* config/arm/arm.c (arm_hard_regno_mode_ok): Allocate even odd register pairs for doubleword quantities for ARMv8.1M-Mainline.
* config/arm/thumb2.md (thumb2_asrl): New.
(thumb2_lsll): Likewise.

2020-01-17 Mihail-Calin Ionescu <mihail.ionescu@arm.com>
Sudakshina Das <sudi.das@arm.com>

* gcc.target/arm/armv8_1m-shift-reg_1.c: New test.
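The ashldi3 and ashrdi3 patterns named in the ChangeLog are the expanders for 64-bit left shifts and arithmetic right shifts, so the source-level operations affected look like the sketch below. The function names are hypothetical, and whether the compiler actually emits LSLL or ASRL for them depends on the target options in use (the ChangeLog ties it to TARGET_HAVE_MVE); on other targets the same code simply compiles to whatever 64-bit shift sequence is available.

```cpp
#include <cassert>
#include <cstdint>

// 64-bit arithmetic shift right by a run-time amount.
// Right-shifting a negative value is implementation-defined before C++20;
// GCC performs an arithmetic (sign-propagating) shift.
std::int64_t asr64(std::int64_t value, unsigned amount) {
  assert(amount < 64);
  return value >> amount;
}

// 64-bit logical shift left by a run-time amount.
std::uint64_t lsl64(std::uint64_t value, unsigned amount) {
  assert(amount < 64);
  return value << amount;
}
```

Because the operands are 64 bits wide on a 32-bit core, each value occupies a register pair, which is why the same patch also touches the register allocation rule for doubleword quantities.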
2020-01-17 | Fix up ChangeLog. | Jakub Jelinek | 1 | -1/+1
2020-01-17 | arm: Unbreak bootstrap | Jakub Jelinek | 2 | -1/+6
2020-01-17 Jakub Jelinek <jakub@redhat.com> * config/arm/arm.c (cmse_nonsecure_call_inline_register_clear): Remove unused variable.
2020-01-17 | Rename acc_device_gcn to acc_device_radeon | Andrew Stubbs | 10 | -17/+42
2020-01-17 Andrew Stubbs <ams@codesourcery.com> libgomp/ * config/accel/openacc.f90 (openacc_kinds): Rename acc_device_gcn to acc_device_radeon. (openacc): Likewise. * openacc.f90 (openacc_kinds): Likewise. (openacc): Likewise. * openacc.h (acc_device_t): Likewise. * openacc_lib.h: Likewise. * testsuite/lib/libgomp.exp (check_effective_target_openacc_amdgcn_accel_present): Likewise. * testsuite/libgomp.oacc-c-c++-common/acc_prof-init-1.c (cb_compute_construct_end): Likewise. * testsuite/libgomp.oacc-c-c++-common/acc_prof-kernels-1.c (cb_enqueue_launch_start): Likewise. * testsuite/libgomp.oacc-c-c++-common/acc_prof-parallel-1.c (cb_enter_data_end): Likewise. (cb_exit_data_start): Likewise. (cb_exit_data_end): Likewise. (cb_compute_construct_end): Likewise. (cb_enqueue_launch_start): Likewise. (cb_enqueue_launch_end): Likewise. * testsuite/libgomp.oacc-c-c++-common/asyncwait-nop-1.c (main): Likewise.
2020-01-17 | libstdc++: Fix freestanding build (PR 92376) | Jonathan Wakely | 3 | -1/+27
In a freestanding library we don't install the <pstl/pstl_config.h> header, so don't try to include it unless it exists. Explicitly declare aligned alloc functions for freestanding, because <cstdlib> doesn't declare them. PR libstdc++/92376 * include/bits/c++config: Only do PSTL config when the header is present, to fix freestanding. * libsupc++/new_opa.cc [!_GLIBCXX_HOSTED]: Declare allocation functions if they were detected by configure.
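Including a header only when it exists is typically done with the __has_include preprocessor operator. The sketch below shows that general technique; it is not the text of the actual change to c++config, and the macro SKETCH_HAVE_PSTL_CONFIG and the function pstl_config_available are hypothetical names introduced for illustration.

```cpp
#include <cassert>

// Probe for the header without including it unconditionally.
#ifdef __has_include
# if __has_include(<pstl/pstl_config.h>)
#  define SKETCH_HAVE_PSTL_CONFIG 1
# endif
#endif
#ifndef SKETCH_HAVE_PSTL_CONFIG
# define SKETCH_HAVE_PSTL_CONFIG 0
#endif

// Reports at run time what the preprocessor found at compile time.
bool pstl_config_available() { return SKETCH_HAVE_PSTL_CONFIG != 0; }
```

The result depends on the installation: a hosted libstdc++ that ships the PSTL headers would report true, while a freestanding install without them would report false and skip the dependent configuration.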
2020-01-17 | gdbinit.in: make shorthands accept an explicit argument | Alexander Monakov | 2 | -51/+129
Make gdb shorthands such as 'pr' accept an argument, in addition to implicitly taking register '$' as the thing to examine.

The 'eval ...' one-liners are used to work around GDB bug #22466.

* gdbinit.in (help-gcc-hooks): New command.
(pp, pr, prl, pt, pct, pgg, pgq, pgs, pge, pmz, ptc, pdn, ptn, pdd, prc, pi, pbm, pel, trt): Take $arg0 instead of $ if supplied. Update documentation.
2020-01-17 | [AArch64] [Obvious] Correct pattern target requirement | Matthew Malcomson | 2 | -1/+6
The pattern mistakenly used a target macro that is not defined, and that is not the relevant one in any case. TARGET_ARMV8_6 is not defined, and it is also not the macro we want to check. Check TARGET_F64MM instead.

gcc/ChangeLog:

2020-01-17 Matthew Malcomson <matthew.malcomson@arm.com>

* config/aarch64/aarch64-sve.md (@aarch64_sve_ld1ro<mode>): Use the correct target macro.
2020-01-17 | Fix g++ testsuite failure caused by std::is_pod deprecation | Jonathan Wakely | 2 | -0/+7
PR testsuite/93227 * g++.dg/cpp0x/std-layout1.C: Use -Wno-deprecated-declarations for C++20, due to std::is_pod being deprecated.
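The patch itself only silences the warning in an existing test. For new code, the usual replacement for the deprecated std::is_pod is the conjunction of two narrower traits, as in the sketch below; is_pod_like, Plain and WithCtor are hypothetical names used only for this illustration.

```cpp
#include <cassert>
#include <type_traits>

// Equivalent of the deprecated std::is_pod<T>, spelled with the two
// properties it combined.
template <typename T>
constexpr bool is_pod_like() {
  return std::is_standard_layout<T>::value && std::is_trivial<T>::value;
}

// Standard layout and trivial: qualifies.
struct Plain {
  int a;
  double b;
};

// Standard layout, but the user-provided constructor makes it non-trivial.
struct WithCtor {
  WithCtor() : a(0) {}
  int a;
};
```

Checking the two properties separately also makes it clearer which guarantee a piece of code relies on: layout compatibility, or trivial construction and copying.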
2020-01-17 | [AArch64] [SVE] Implement svld1ro intrinsic. | Matthew Malcomson | 23 | -6/+1462
We take no action to ensure the SVE vector size is large enough. It is left to the user to check that before compiling this intrinsic or before running such a program on a machine.

The main difference between ld1ro and ld1rq is in the allowed offsets. The implementation difference is that ld1ro is implemented using integer modes since there are no pre-existing vector modes of the relevant size. Adding new vector modes simply for this intrinsic seems to make the code less tidy.

Specifications can be found under the "Arm C Language Extensions for Scalable Vector Extension" title at https://developer.arm.com/architectures/system-architectures/software-standards/acle

gcc/ChangeLog:

2020-01-17 Matthew Malcomson <matthew.malcomson@arm.com>

* config/aarch64/aarch64-protos.h (aarch64_sve_ld1ro_operand_p): New.
* config/aarch64/aarch64-sve-builtins-base.cc (class load_replicate): New.
(class svld1ro_impl): New.
(class svld1rq_impl): Change to inherit from load_replicate.
(svld1ro): New sve intrinsic function base.
* config/aarch64/aarch64-sve-builtins-base.def (svld1ro): New DEF_SVE_FUNCTION.
* config/aarch64/aarch64-sve-builtins-base.h (svld1ro): New decl.
* config/aarch64/aarch64-sve-builtins.cc (function_expander::add_mem_operand): Modify assert to allow OImode.
* config/aarch64/aarch64-sve.md (@aarch64_sve_ld1ro<mode>): New pattern.
* config/aarch64/aarch64.c (aarch64_sve_ld1rq_operand_p): Implement in terms of ...
(aarch64_sve_ld1rq_ld1ro_operand_p): This.
(aarch64_sve_ld1ro_operand_p): New.
* config/aarch64/aarch64.md (UNSPEC_LD1RO): New unspec.
* config/aarch64/constraints.md (UOb,UOh,UOw,UOd): New.
* config/aarch64/predicates.md (aarch64_sve_ld1ro_operand_{b,h,w,d}): New.

gcc/testsuite/ChangeLog:

2020-01-17 Matthew Malcomson <matthew.malcomson@arm.com>

* gcc.target/aarch64/sve/acle/asm/ld1ro_f16.c: New test.
* gcc.target/aarch64/sve/acle/asm/ld1ro_f32.c: New test.
* gcc.target/aarch64/sve/acle/asm/ld1ro_f64.c: New test.
* gcc.target/aarch64/sve/acle/asm/ld1ro_s16.c: New test. * gcc.target/aarch64/sve/acle/asm/ld1ro_s32.c: New test. * gcc.target/aarch64/sve/acle/asm/ld1ro_s64.c: New test. * gcc.target/aarch64/sve/acle/asm/ld1ro_s8.c: New test. * gcc.target/aarch64/sve/acle/asm/ld1ro_u16.c: New test. * gcc.target/aarch64/sve/acle/asm/ld1ro_u32.c: New test. * gcc.target/aarch64/sve/acle/asm/ld1ro_u64.c: New test. * gcc.target/aarch64/sve/acle/asm/ld1ro_u8.c: New test.
2020-01-17[AArch64] Enable CLI for Armv8.6-A f64mmMatthew Malcomson7-14/+67
This patch is necessary for sve-ld1ro intrinsic I posted in https://gcc.gnu.org/ml/gcc-patches/2020-01/msg00466.html . I had mistakenly thought this option was already enabled upstream. This provides the option +f64mm, that turns on the 64 bit floating point matrix multiply extension. This extension is only available for AArch64. Turning on this extension also turns on the SVE extension. This extension is optional and only available at Armv8.2-A and onward. We also add the ACLE defined macro for this extension. gcc/ChangeLog: 2020-01-17 Matthew Malcomson <matthew.malcomson@arm.com> * config/aarch64/aarch64-c.c (_ARM_FEATURE_MATMUL_FLOAT64): Introduce this ACLE specified predefined macro. * config/aarch64/aarch64-option-extensions.def (f64mm): New. (fp): Disabling this disables f64mm. (simd): Disabling this disables f64mm. (fp16): Disabling this disables f64mm. (sve): Disabling this disables f64mm. * config/aarch64/aarch64.h (AARCH64_FL_F64MM): New. (AARCH64_ISA_F64MM): New. (TARGET_F64MM): New. * doc/invoke.texi (f64mm): Document new option. gcc/testsuite/ChangeLog: 2020-01-17 Matthew Malcomson <matthew.malcomson@arm.com> * gcc.target/aarch64/pragma_cpp_predefs_2.c: Check for f64mm predef.
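The option relationships stated in this message can be sketched as a toy model. This is a minimal illustration, not GCC's option handling: `apply_modifiers` and both tables are hypothetical names, and only the relationships named in the message and ChangeLog are modelled (real GCC tracks further dependencies between extensions).

```python
# Only the relationships named in the commit message are modelled here.
IMPLIES = {"f64mm": {"sve"}}                               # +f64mm also turns on SVE
DISABLED_WITH = {"f64mm": {"fp", "simd", "fp16", "sve"}}   # +no<x> turns off f64mm


def apply_modifiers(modifiers):
    """Apply 'ext' / 'noext' modifiers left to right; return the enabled set."""
    enabled = set()
    for mod in modifiers:
        if mod.startswith("no"):
            ext = mod[2:]
            enabled.discard(ext)
            # Drop every extension that the commit says is disabled with `ext`.
            for dependent, bases in DISABLED_WITH.items():
                if ext in bases:
                    enabled.discard(dependent)
        else:
            enabled.add(mod)
            enabled |= IMPLIES.get(mod, set())
    return enabled
```

For example, `apply_modifiers(["f64mm"])` enables both `f64mm` and `sve`, while appending `"nosve"` leaves neither enabled.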
2020-01-17[AArch64] Enable compare branch fusionWilco Dijkstra2-2/+7
Enable the most basic form of compare-branch fusion since various CPUs support it. This has no measurable effect on cores which don't support branch fusion, but increases fusion opportunities on cores which do. gcc/ * config/aarch64/aarch64.c (generic_tunings): Add branch fusion. (neoversen1_tunings): Likewise.
2020-01-17PR c++/92531 - ICE with noexcept(lambda).Jason Merrill3-14/+16
This was failing because uses_template_parms didn't recognize LAMBDA_EXPR as a kind of expression. Instead of trying to enumerate all the different varieties of expression and then aborting if what's left isn't error_mark_node, let's handle error_mark_node and then assume anything else is an expression. * pt.c (uses_template_parms): Don't try to enumerate all the expression cases.
2020-01-17c++: Fix deprecated attribute handling on templates (PR c++/93228)Jakub Jelinek4-1/+35
As the following testcase shows, when deprecated attribute is on a template, we'd never print the message if any, because the attribute is not present on the TEMPLATE_DECL with which warn_deprecated_use is called, but on its DECL_TEMPLATE_RESULT or its type. 2020-01-17 Jakub Jelinek <jakub@redhat.com> PR c++/93228 * parser.c (cp_parser_template_name): Look up deprecated attribute in DECL_TEMPLATE_RESULT or its type's attributes. * g++.dg/cpp1y/attr-deprecated-3.C: New test.
2020-01-17[PR93306] Short-circuit has_includeNathan Sidwell2-22/+18
The preprocessor evaluator has a skip_eval counter, but we weren't checking it after parsing has_include(foo) and before looking for foo. This resulted in unnecessary I/O for 'FALSE_COND && has_include <foo>'. PR preprocessor/93306 * expr.c (parse_has_include): Refactor. Check skip_eval before looking.
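The short-circuit described above can be sketched with a toy evaluator. This is only an illustration of the idea: the class, its methods and the `lookups` counter are hypothetical and do not mirror libcpp's actual code.

```python
import os


class Evaluator:
    """Toy model of a preprocessor expression evaluator (hypothetical names)."""

    def __init__(self, include_dirs):
        self.include_dirs = include_dirs
        self.skip_eval = 0   # > 0 while parsing an operand whose value is unused
        self.lookups = 0     # counts file-system probes, for illustration only

    def has_include(self, name):
        # The operand has already been parsed at this point; the search itself
        # is skipped when the result cannot affect the whole expression.
        if self.skip_eval:
            return False
        self.lookups += 1
        return any(os.path.exists(os.path.join(d, name))
                   for d in self.include_dirs)

    def logical_and(self, lhs, rhs_name):
        """Evaluate 'lhs && has_include(<rhs_name>)'."""
        if not lhs:
            self.skip_eval += 1
        try:
            rhs = self.has_include(rhs_name)
        finally:
            if not lhs:
                self.skip_eval -= 1
        return bool(lhs) and rhs
```

With a false left-hand side, `logical_and` returns without touching the file system, so `lookups` stays at zero.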
2020-01-17analyzer: fix handling of negative byte offsets (v2) (PR 93281)David Malcolm2-4/+15
Various 32-bit targets show failures in gcc.dg/analyzer/data-model-1.c with tests of the form: __analyzer_eval (q[-2].x == 107024); /* { dg-warning "TRUE" } */ __analyzer_eval (q[-2].y == 107025); /* { dg-warning "TRUE" } */ where they emit UNKNOWN instead. The root cause is that gimple has a byte-based twos-complement offset of -16 expressed like this: _55 = q_92 + 4294967280; (32-bit) or: _55 = q_92 + 18446744073709551600; (64-bit) Within region_model::convert_byte_offset_to_array_index that unsigned offset was being divided by the element size to get an offset within an array. This happened to work on 64-bit target and host, but not elsewhere; the offset needs to be converted to a signed type before the division is meaningful. This patch does so, fixing the failures. gcc/analyzer/ChangeLog: PR analyzer/93281 * region-model.cc (region_model::convert_byte_offset_to_array_index): Convert to ssizetype before dividing by byte_size. Use fold_binary rather than fold_build2 to avoid needlessly constructing a tree for the non-const case.
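The arithmetic behind this fix can be checked with a short sketch. The helper names are hypothetical; the element size of 8 bytes is inferred from the message (a byte offset of -16 for index -2), and the code only models the signed conversion, not GCC's tree folding.

```python
def to_signed(value, bits):
    """Reinterpret an unsigned `bits`-wide integer as two's complement."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def byte_offset_to_index(raw_offset, elem_size, bits):
    """Convert a raw (unsigned) byte offset to a signed array index."""
    signed_offset = to_signed(raw_offset, bits)
    # The offset is assumed to be an exact multiple of the element size.
    return signed_offset // elem_size
```

Dividing the raw 32-bit value directly gives `4294967280 // 8 == 536870910`, which is not a meaningful index; converting to a signed value first gives -2 for both the 32-bit and the 64-bit encodings quoted above.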
2020-01-17[AArch64] Fix shrinkwrapping interactions with atomics (PR92692)Wilco Dijkstra3-10/+32
The separate shrinkwrapping pass may insert stores in the middle of atomics loops, which can cause issues on some implementations. Avoid this by delaying splitting atomics patterns until after prolog/epilog generation. gcc/ PR target/92692 * config/aarch64/aarch64.c (aarch64_split_compare_and_swap): Add assert to ensure prolog has been emitted. (aarch64_split_atomic_op): Likewise. * config/aarch64/atomics.md (aarch64_compare_and_swap<mode>): Use epilogue_completed rather than reload_completed. (aarch64_atomic_exchange<mode>): Likewise. (aarch64_atomic_<atomic_optab><mode>): Likewise. (atomic_nand<mode>): Likewise. (aarch64_atomic_fetch_<atomic_optab><mode>): Likewise. (atomic_fetch_nand<mode>): Likewise. (aarch64_atomic_<atomic_optab>_fetch<mode>): Likewise. (atomic_nand_fetch<mode>): Likewise.
2020-01-17Add PR number to change logRichard Sandiford1-0/+1
2020-01-17aarch64: Don't raise FE_INVALID for -__builtin_isgreater [PR93133]Richard Sandiford6-35/+136
AIUI, the main purpose of REVERSE_CONDITION is to take advantage of any integer vs. FP information encoded in the CC mode, particularly when handling LT, LE, GE and GT. For integer comparisons we can safely map LT->GE, LE->GT, GE->LT and GT->LE, but for float comparisons this would usually be invalid without -ffinite-math-only. The aarch64 definition of REVERSE_CONDITION used reverse_condition_maybe_unordered for FP comparisons, which had the effect of converting an unordered-signalling LT, LE, GE or GT into a quiet UNGE, UNGT, UNLT or UNLE. And it would do the same in reverse: convert a quiet UN* into an unordered-signalling comparison. This would be safe in practice (although a little misleading) if we always used a compare:CCFP or compare:CCFPE to do the comparison and then used (gt (reg:CCFP/CCFPE CC_REGNUM) (const_int 0)) etc. to test the result. In that case any signal is raised by the compare and the choice of quiet vs. signalling relations doesn't matter when testing the result. The problem is that we also want to use GT directly on float registers, where any signal is raised by the comparison operation itself and so must follow the normal rtl rules (GT signalling, UNLE quiet). I think the safest fix is to make REVERSIBLE_CC_MODE return false for FP comparisons. We can then use the default REVERSE_CONDITION for integer comparisons and the usual conservatively-correct reversed_comparison_code_parts behaviour for FP comparisons. Unfortunately reversed_comparison_code_parts doesn't yet handle -ffinite-math-only, but that's probably GCC 11 material. A downside is that: int f (float x, float y) { return !(x < y); } now generates: fcmpe s0, s1 cset w0, mi eor w0, w0, 1 ret without -ffinite-math-only. Maybe for GCC 11 we should define rtx codes for all IEEE comparisons, so that we don't have this kind of representational gap. Changing REVERSE_CONDITION itself is pretty easy. 
However, the macro was also used in the ccmp handling, which relied on being able to reverse all comparisons. The patch adds new reversed patterns for cases in which the original condition needs to be kept. The test is based on gcc.dg/torture/pr91323.c. It might well fail on other targets that have similar bugs; please XFAIL as appropriate if you don't want to fix the target for GCC 10. 2020-01-17 Richard Sandiford <richard.sandiford@arm.com> gcc/ * config/aarch64/aarch64.h (REVERSIBLE_CC_MODE): Return false for FP modes. (REVERSE_CONDITION): Delete. * config/aarch64/iterators.md (CC_ONLY): New mode iterator. (CCFP_CCFPE): Likewise. (e): New mode attribute. * config/aarch64/aarch64.md (ccmp<GPI:mode>): Rename to... (@ccmp<CC_ONLY:mode><GPI:mode>): ...this, using CC_ONLY instead of CC. (fccmp<GPF:mode>, fccmpe<GPF:mode>): Merge into... (@ccmp<CCFP_CCFPE:mode><GPF:mode>): ...this combined pattern. (@ccmp<CC_ONLY:mode><GPI:mode>_rev): New pattern. (@ccmp<CCFP_CCFPE:mode><GPF:mode>_rev): Likewise. * config/aarch64/aarch64.c (aarch64_gen_compare_reg): Update name of generator from gen_ccmpdi to gen_ccmpccdi. (aarch64_gen_ccmp_next): Use code_for_ccmp. If we want to reverse the previous comparison but aren't able to, use the new ccmp_rev patterns instead.
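Why LT cannot simply be reversed to GE for floats can be seen from the truth values alone. The sketch below uses Python floats, which follow IEEE 754 ordering for NaN; it models only the result of each comparison, not whether FE_INVALID is raised, and the function names are illustrative.

```python
import math


def unordered(x, y):
    """True when either operand is a NaN."""
    return math.isnan(x) or math.isnan(y)


def not_lt(x, y):
    """What the source asks for: !(x < y)."""
    return not (x < y)


def ge(x, y):
    """Integer-style reversal of LT; gives the wrong answer for NaN operands."""
    return x >= y


def unge(x, y):
    """Reversal of LT that is correct for floats: unordered or greater-or-equal."""
    return unordered(x, y) or x >= y
```

For ordered operands all three agree, but with a NaN operand `not_lt` and `unge` are true while `ge` is false, which is the gap between the integer and FP reversal rules described above.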
2020-01-17gimplifier: handle POLY_INT_CST-sized TARGET_EXPRsRichard Sandiford4-3/+17
If a TARGET_EXPR has poly-int size, the gimplifier would treat it like a VLA and use gimplify_vla_decl. gimplify_vla_decl in turn would use an alloca and expect all references to be gimplified via the DECL_VALUE_EXPR. This caused confusion later in gimplify_var_or_parm_decl_1 when we (correctly) had direct rather than indirect references. For completeness, the patch also fixes similar tests in the RETURN_EXPR handling and OpenMP depend clauses. 2020-01-17 Richard Sandiford <richard.sandiford@arm.com> gcc/ * gimplify.c (gimplify_return_expr): Use poly_int_tree_p rather than testing directly for INTEGER_CST. (gimplify_target_expr, gimplify_omp_depend): Likewise. gcc/testsuite/ * g++.target/aarch64/sve/acle/general-c++/gimplify_1.C: New test.
2020-01-17[PATCH] Fortran: PR93263 -fno-automatic and RECURSIVEMark Eggleston5-1/+70
The use of -fno-automatic should not affect the save attribute of a recursive procedure. The first test case checks unsaved variables and the second checks saved variables.
2020-01-17vect: Fix ICE in vectorizable_comparison PR93292Jakub Jelinek4-1/+28
The following testcase ICEs on powerpc64le-linux. The problem is that get_vectype_for_scalar_type returns NULL, and while most places in tree-vect-stmts.c handle that case, this spot doesn't and punts only if it is non-NULL, but with different number of elts than expected. 2020-01-17 Jakub Jelinek <jakub@redhat.com> PR tree-optimization/93292 * tree-vect-stmts.c (vectorizable_comparison): Punt also if get_vectype_for_scalar_type returns NULL. * g++.dg/opt/pr93292.C: New test.
2020-01-17testsuite: Unbreak compat.exp testing with alt compiler PR93294Jakub Jelinek2-0/+17
2020-01-17 Jakub Jelinek <jakub@redhat.com> PR testsuite/93294 * lib/c-compat.exp (compat-use-alt-compiler): Handle -fdiagnostics-urls=never similarly to -fdiagnostics-color=never. (compat_setup_dfp): Likewise.
2020-01-17ChangeLog fixes.Jakub Jelinek2-19/+19
2020-01-17contrib/gcc_update: Insert "tformat:" for git log --pretty=tformat:%p:%t:%HHans-Peter Nilsson2-1/+6
Really old git versions (like 1.6.0) require "git log --pretty=tformat:%p:%t:%H" or else we see: Updating GIT tree Current branch master is up to date. fatal: invalid --pretty format: %p:%t:%H Adjusting file timestamps Touching gcc/config.in... Touching gcc/config/arm/arm-tune.md... ...and an empty revision in LAST_UPDATED and gcc/REVISION. In its absence, for newer git versions, "tformat" is the default qualifier, documented as such default for at least git-2.11.0.
2020-01-16PR c++/93286 - ICE with __is_constructible and variadic template.Jason Merrill3-7/+89
Here we had been recursing in tsubst_copy_and_build if type2 was a TREE_LIST because that function knew how to deal with pack expansions, and tsubst didn't. But tsubst_copy_and_build expects to be dealing with expressions, so we crash when trying to convert_from_reference a type. * pt.c (tsubst) [TREE_LIST]: Handle pack expansion. (tsubst_copy_and_build) [TRAIT_EXPR]: Always use tsubst for type2.
2020-01-17Daily bump.GCC Administrator1-1/+1
2020-01-17Extend -param=max-predicted-iterations range.Jan Hubicka3-2/+8
* params.opt (-param=max-predicted-iterations): Increase range from 0. * predict.c (estimate_loops): Add 1 to param_max_predicted_iterations.
2020-01-16Fix ICE caused by swallowing a token in c_parser_consume_tokenKerem Kat7-1/+36
This patch fixes ICE on invalid code, specifically files that have conflict-marker-like signs before EOF. PR c/92833 gcc/c/ * c-parser.c (c_parser_consume_token): Fix peeked token stack pop to support 4 available tokens. gcc/testsuite/ * c-c++-common/pr92833-1.c, c-c++-common/pr92833-2.c, c-c++-common/pr92833-3.c, c-c++-common/pr92833-4.c: New tests.
2020-01-16Make profile estimation more preciseJan Hubicka6-60/+76
While analyzing a code size regression in the SPEC2k GCC binary I noticed that we make some inline decisions because we think that the number of executions is very high. In particular there was an inline decision inlining gen_rtx_fmt_ee into find_reloads, believing that it is called 4 billion times. This turned out to be an accumulation of roundoff errors in propagate_freq, which was somewhat mechanically updated from the original sreals to C++ sreals and later to the new probabilities. This led us to estimate that a loopback edge is reached with probability 2.3, which was capped to 1-1/10000, and since this happened in a nested loop it quickly escalated to large values. Originally capping to REG_BR_PROB_BASE avoided such problems but now we have a much higher range. This patch avoids going from probabilities to REG_BR_PROB_BASE so precision is kept. In addition it makes the propagation not estimate more than param-max-predicted-loop-iterations. The first change keeps the cap from being triggered on the gcc build, but it is still better to be safe than sorry. * ipa-fnsummary.c (estimate_calls_size_and_time): Fix formatting of dump. * params.opt: (max-predicted-iterations): Set bounds. * predict.c (real_almost_one, real_br_prob_base, real_inv_br_prob_base, real_one_half, real_bb_freq_max): Remove. (propagate_freq): Add max_cyclic_prob parameter; cap cyclic probabilities; do not truncate to reg_br_prob_bases. (estimate_loops_at_level): Pass max_cyclic_prob. (estimate_loops): Compute max_cyclic_prob. (estimate_bb_frequencies): Do not initialize real_*; update calculation of back edge prob. * profile-count.c (profile_probability::to_sreal): New. * profile-count.h (class sreal): Move up in file. (profile_probability::to_sreal): Declare.
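How a capped back-edge probability still escalates in nested loops can be sketched numerically. This assumes the usual geometric-series estimate (a loop whose back edge is taken with probability p runs its header 1/(1-p) times per entry); the function name and the iteration limit of 100 are illustrative assumptions, not values taken from the patch.

```python
def loop_multiplier(back_edge_prob, max_iterations):
    """Expected header executions per loop entry, with the probability capped.

    The cap 1 - 1/max_iterations bounds the result by max_iterations.
    """
    cap = 1.0 - 1.0 / max_iterations
    p = min(back_edge_prob, cap)
    return 1.0 / (1.0 - p)


# A bogus probability of 2.3 capped at 1 - 1/10000 yields about 10000 per loop
# level, so two nested loops already reach about 10**8.
old_two_levels = loop_multiplier(2.3, 10000) ** 2

# With a tighter, parameter-driven cap (100 here, purely illustrative) the
# same bogus input is bounded by about 100 per level.
new_two_levels = loop_multiplier(2.3, 100) ** 2
```

The sketch shows why a cap alone is not enough when the probabilities feeding it are already inflated by roundoff, and why bounding the estimate by an iteration limit keeps nested loops in check.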
2020-01-16PR c++/93280 - ICE with aggregate assignment and DMI.Jason Merrill3-3/+10
I recently added an assert to cp-gimplify to catch any TARGET_EXPR_DIRECT_INIT_P being expanded without a target object, and this testcase found one. We started out with a TARGET_EXPR around the CONSTRUCTOR, which would normally mean that the member initializer would be used to directly initialize the appropriate member of whatever object the TARGET_EXPR ends up initializing. But then gimplify_modify_expr_rhs stripped the TARGET_EXPR in order to assign directly from the elements of the CONSTRUCTOR, leaving no object for the TARGET_EXPR_DIRECT_INIT_P to initialize. I considered setting CONSTRUCTOR_PLACEHOLDER_BOUNDARY in that case, which implies TARGET_EXPR_NO_ELIDE, but decided that there's no particular reason the A initializer needs to initialize a member of a B rather than a distinct A object, so let's only set TARGET_EXPR_DIRECT_INIT_P when we're using the DMI in a constructor. * init.c (get_nsdmi): Set TARGET_EXPR_DIRECT_INIT_P here. * typeck2.c (digest_nsdmi_init): Not here.
2020-01-16Fix noreorder symbol partitioning reversion.Martin Liska2-0/+9
* lto-partition.c (lto_balanced_map): Remember best_noreorder_pos and then restore to it when we revert.
2020-01-16libstdc++: std::ctype fixes for recent versions of NetBSDJonathan Wakely4-35/+37
This removes support for EOL versions of NetBSD and syncs the definitions with patches from NetBSD upstream. The only change here that isn't from upstream is to use _CTYPE_BL for the isblank class, which is correct but wasn't previously done either in FSF GCC or the NetBSD packages. 2020-01-16 Kai-Uwe Eckhardt <kuehro@gmx.de> Matthew Bauer <mjbauer95@gmail.com> Jonathan Wakely <jwakely@redhat.com> PR bootstrap/64271 (partial) * config/os/bsd/netbsd/ctype_base.h (ctype_base::mask): Change type to unsigned short. (ctype_base::alpha, ctype_base::digit, ctype_base::xdigit) (ctype_base::print, ctype_base::graph, ctype_base::alnum): Sync definitions with NetBSD upstream. (ctype_base::blank): Use _CTYPE_BL. * config/os/bsd/netbsd/ctype_configure_char.cc (_C_ctype_): Remove Declaration. (ctype<char>::classic_table): Use _C_ctype_tab_ instead of _C_ctype_. (ctype<char>::do_toupper, ctype<char>::do_tolower): Cast char parameters to unsigned char. * config/os/bsd/netbsd/ctype_inline.h (ctype<char>::is): Likewise.
2020-01-16[GCC][PATCH][ARM] Add Bfloat16_t scalar type, vector types and machine modes to ARM back-end [2/2]Stam Markianos-Wright7-0/+820
gcc/ChangeLog: 2020-01-16 Stam Markianos-Wright <stam.markianos-wright@arm.com> * config/arm/arm.c (arm_invalid_conversion): New function for target hook. (arm_invalid_unary_op): New function for target hook. (arm_invalid_binary_op): New function for target hook. gcc/testsuite/ChangeLog: 2020-01-16 Stam Markianos-Wright <stam.markianos-wright@arm.com> * g++.target/arm/bfloat_cpp_typecheck.C: New test. * gcc.target/arm/bfloat16_scalar_typecheck.c: New test. * gcc.target/arm/bfloat16_vector_typecheck_1.c: New test. * gcc.target/arm/bfloat16_vector_typecheck_2.c: New test.
2020-01-16[GCC][PATCH][ARM] Add Bfloat16_t scalar type, vector types and machine modes to ARM back-end [1/2]Stam Markianos-Wright29-74/+1546
gcc/ChangeLog: 2020-01-16 Stam Markianos-Wright <stam.markianos-wright@arm.com> * config.gcc: Add arm_bf16.h. * config/arm/arm-builtins.c (arm_mangle_builtin_type): Fix comment. (arm_simd_builtin_std_type): Add BFmode. (arm_init_simd_builtin_types): Define element types for vector types. (arm_init_bf16_types): New function. (arm_init_builtins): Add arm_init_bf16_types function call. * config/arm/arm-modes.def: Add BFmode and V4BF, V8BF vector modes. * config/arm/arm-simd-builtin-types.def: Add V4BF, V8BF. * config/arm/arm.c (aapcs_vfp_sub_candidate): Add BFmode. (arm_hard_regno_mode_ok): Add BFmode and tidy up statements. (arm_vector_mode_supported_p): Add V4BF, V8BF. (arm_mangle_type): Add __bf16. * config/arm/arm.h: Add V4BF, V8BF to VALID_NEON_DREG_MODE, VALID_NEON_QREG_MODE respectively. Add export arm_bf16_type_node, arm_bf16_ptr_type_node. * config/arm/arm.md: Add BFmode to movhf expand, mov pattern and define_split between ARM registers. * config/arm/arm_bf16.h: New file. * config/arm/arm_neon.h: Add arm_bf16.h and Bfloat vector types. * config/arm/iterators.md: (ANY64_BF, VDXMOV, VHFBF, HFBF, fporbf): New. (VQXMOV): Add V8BF. * config/arm/neon.md: Add BF vector types to movhf NEON move patterns. * config/arm/vfp.md: Add BFmode to movhf patterns. gcc/testsuite/ChangeLog: 2020-01-16 Stam Markianos-Wright <stam.markianos-wright@arm.com> * g++.dg/abi/mangle-neon.C: Add BF16 SIMD types. * g++.dg/ext/arm-bf16/bf16-mangle-1.C: New test. * gcc.target/arm/bfloat16_scalar_1_1.c: New test. * gcc.target/arm/bfloat16_scalar_1_2.c: New test. * gcc.target/arm/bfloat16_scalar_2_1.c: New test. * gcc.target/arm/bfloat16_scalar_2_2.c: New test. * gcc.target/arm/bfloat16_scalar_3_1.c: New test. * gcc.target/arm/bfloat16_scalar_3_2.c: New test. * gcc.target/arm/bfloat16_scalar_4.c: New test. * gcc.target/arm/bfloat16_simd_1_1.c: New test. * gcc.target/arm/bfloat16_simd_1_2.c: New test. * gcc.target/arm/bfloat16_simd_2_1.c: New test. 
* gcc.target/arm/bfloat16_simd_2_2.c: New test. * gcc.target/arm/bfloat16_simd_3_1.c: New test. * gcc.target/arm/bfloat16_simd_3_2.c: New test.
2020-01-16Add CLI and multilib support for Armv8.1-M Mainline MVE extensionsMihail Ionescu7-1/+75
gcc/ChangeLog: 2020-01-16 Mihail Ionescu <mihail.ionescu@arm.com> 2020-01-16 Andre Vieira <andre.simoesdiasvieira@arm.com> * config/arm/arm-cpus.in (mve, mve_float): New features. (dsp, mve, mve.fp): New options. * config/arm/arm.h (TARGET_HAVE_MVE, TARGET_HAVE_MVE_FLOAT): Define. * config/arm/t-rmprofile: Map v8.1-M multilibs to v8-M. * doc/invoke.texi: Document the armv8.1-m mve and dsp options. gcc/testsuite/ChangeLog: 2020-01-16 Mihail Ionescu <mihail.ionescu@arm.com> 2020-01-16 Andre Vieira <andre.simoesdiasvieira@arm.com> * testsuite/gcc.target/arm/multilib.exp: Add v8.1-M entries.
2020-01-16[PATCH, GCC/ARM, 10/10] Enable -mcmseMihail Ionescu3-7/+9
The patch is straightforward: it redefines ARMv8_1m_main as having the same features as ARMv8m_main (and thus as having the cmse feature) with the extra features represented by armv8_1m_main. It also removes the error for using -mcmse on Armv8.1-M Mainline. *** gcc/ChangeLog *** 2020-01-16 Mihail-Calin Ionescu <mihail.ionescu@arm.com> 2020-01-16 Thomas Preud'homme <thomas.preudhomme@arm.com> * config/arm/arm-cpus.in (ARMv8_1m_main): Redefine as an extension to Armv8-M Mainline. * config/arm/arm.c (arm_options_perform_arch_sanity_checks): Remove error for using -mcmse when targeting Armv8.1-M Mainline.
2020-01-16[PATCH, GCC/ARM, 9/10] Call nscall function with blxnsMihail Ionescu30-112/+357
This change to use BLXNS to call a nonsecure function from secure directly (not using a libcall) is made in 2 steps: - change nonsecure_call patterns to use blxns instead of calling __gnu_cmse_nonsecure_call - loosen requirement for function address to allow any register when doing BLXNS. The former is a straightforward check over whether instructions added in Armv8.1-M Mainline are available, while the latter consists of making the nonsecure call pattern accept any register by using match_operand and changing the nonsecure_call_internal expander to not force r4 when targeting Armv8.1-M Mainline. The tricky bit is actually in the test update, specifically how to check that register lists for CLRM have all registers except for the one holding parameters (already done) and the one holding the address used by BLXNS. This is achieved with 3 scan-assembler directives. 1) The first one lists all registers that can appear in CLRM but makes each of them optional. Property guaranteed: no wrong register is cleared and none appears twice in the register list. 2) The second directive checks that the CLRM is made of a fixed number of the right registers to be cleared. The number used is the number of registers that could contain a secret minus one (used to hold the address of the function to call). Property guaranteed: register list has the right number of registers. Cumulated property guaranteed: only registers with a potential secret are cleared and they are all listed but one. 3) The last directive checks that we cannot find a CLRM with a register in it that also appears in BLXNS. This is checked via the use of a back-reference on any of the allowed registers in CLRM, the back-reference enforcing that whatever register matches in CLRM must be the same in the BLXNS. Property guaranteed: register used for BLXNS is different from registers cleared in CLRM. Some more care needs to happen for the gcc.target/arm/cmse/cmse-1.c testcase due to there being two CLRM generated. 
To ensure the third directive matches the right CLRM to the BLXNS, a negative lookahead is used between the CLRM register list and the BLXNS. A negative lookahead works by matching the *position* where a given regular expression does not match. In this case, since it comes after the CLRM register list, it requires that what comes after the register list does not have a CLRM again followed by BLXNS. This guarantees that the .*blxns after only matches a blxns without another CLRM before. *** gcc/ChangeLog *** 2020-01-16 Mihail-Calin Ionescu <mihail.ionescu@arm.com> 2020-01-16 Thomas Preud'homme <thomas.preudhomme@arm.com> * config/arm/arm.md (nonsecure_call_internal): Do not force memory address in r4 when targeting Armv8.1-M Mainline. (nonsecure_call_value_internal): Likewise. * config/arm/thumb2.md (nonsecure_call_reg_thumb2): Make memory address a register match_operand again. Emit BLXNS when targeting Armv8.1-M Mainline. (nonsecure_call_value_reg_thumb2): Likewise. *** gcc/testsuite/ChangeLog *** 2020-01-16 Mihail-Calin Ionescu <mihail.ionescu@arm.com> 2020-01-16 Thomas Preud'homme <thomas.preudhomme@arm.com> * gcc.target/arm/cmse/cmse-1.c: Add check for BLXNS when instructions introduced in Armv8.1-M Mainline Security Extensions are available and restrict checks for libcall to __gnu_cmse_nonsecure_call to Armv8-M targets only. Adapt CLRM check to verify register used for BLXNS is not in the CLRM register list. * gcc.target/arm/cmse/cmse-14.c: Likewise. * gcc.target/arm/cmse/mainline/8_1m/bitfield-4.c: Likewise and adapt check for LSB clearing bit to be using the same register as BLXNS when targeting Armv8.1-M Mainline. * gcc.target/arm/cmse/mainline/8_1m/bitfield-5.c: Likewise. * gcc.target/arm/cmse/mainline/8_1m/bitfield-6.c: Likewise. * gcc.target/arm/cmse/mainline/8_1m/bitfield-7.c: Likewise. * gcc.target/arm/cmse/mainline/8_1m/bitfield-8.c: Likewise. * gcc.target/arm/cmse/mainline/8_1m/bitfield-9.c: Likewise. 
* gcc.target/arm/cmse/mainline/8_1m/bitfield-and-union.c: Likewise. * gcc.target/arm/cmse/mainline/8_1m/hard-sp/cmse-13.c: Likewise. * gcc.target/arm/cmse/mainline/8_1m/hard-sp/cmse-7.c: Likewise. * gcc.target/arm/cmse/mainline/8_1m/hard-sp/cmse-8.c: Likewise. * gcc.target/arm/cmse/mainline/8_1m/hard/cmse-13.c: Likewise. * gcc.target/arm/cmse/mainline/8_1m/hard/cmse-7.c: Likewise. * gcc.target/arm/cmse/mainline/8_1m/hard/cmse-8.c: Likewise. * gcc.target/arm/cmse/mainline/8_1m/soft/cmse-13.c: Likewise. * gcc.target/arm/cmse/mainline/8_1m/soft/cmse-7.c: Likewise. * gcc.target/arm/cmse/mainline/8_1m/soft/cmse-8.c: Likewise. * gcc.target/arm/cmse/mainline/8_1m/softfp-sp/cmse-7.c: Likewise. * gcc.target/arm/cmse/mainline/8_1m/softfp-sp/cmse-8.c: Likewise. * gcc.target/arm/cmse/mainline/8_1m/softfp/cmse-13.c: Likewise. * gcc.target/arm/cmse/mainline/8_1m/softfp/cmse-7.c: Likewise. * gcc.target/arm/cmse/mainline/8_1m/softfp/cmse-8.c: Likewise. * gcc.target/arm/cmse/mainline/8_1m/union-1.c: Likewise. * gcc.target/arm/cmse/mainline/8_1m/union-2.c: Likewise. * gcc.target/arm/cmse/cmse-15.c: Count BLXNS when targeting Armv8.1-M Mainline and restrict libcall count to Armv8-M.
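The back-reference and negative-lookahead technique described for the third directive can be sketched as follows. This uses Python's `re` module rather than the Tcl regular expressions that DejaGnu scan-assembler directives use, and the patterns and assembly strings are made up for illustration; they are not the ones in the testsuite.

```python
import re

# Matches a CLRM whose register list contains the register later used by BLXNS.
# (r\d+) captures one register from the list and \1 forces BLXNS to use the
# same one.  The negative lookahead (?!.*clrm.*blxns) rejects a CLRM that is
# followed by another CLRM before the BLXNS, so only the nearest CLRM is
# paired with the call.
WITH_LOOKAHEAD = re.compile(
    r"clrm\s+\{[^}]*\b(r\d+)\b[^}]*\}(?!.*clrm.*blxns).*blxns\s+\1\b",
    re.DOTALL,
)

# Same pattern without the lookahead, to show the false positive it avoids.
WITHOUT_LOOKAHEAD = re.compile(
    r"clrm\s+\{[^}]*\b(r\d+)\b[^}]*\}.*blxns\s+\1\b",
    re.DOTALL,
)


def clears_call_register(asm, pattern=WITH_LOOKAHEAD):
    """True if some CLRM clears the register that the paired BLXNS branches to."""
    return pattern.search(asm) is not None


GOOD = "\tclrm\t{r0, r1, r2, r3, APSR}\n\tblxns\tr4\n"
BAD = "\tclrm\t{r0, r1, r4, APSR}\n\tblxns\tr4\n"
# Two CLRMs: the first one is unrelated to the call and happens to list r5.
TWO_CLRM = ("\tclrm\t{r0, r1, r2, r3, r5, APSR}\n\tbx\tlr\n"
            "\tclrm\t{r0, r1, r2, r3, APSR}\n\tblxns\tr5\n")
```

A test would then require that the pattern does not match. On `TWO_CLRM` the version without the lookahead wrongly pairs the first CLRM with the BLXNS, while the version with it reports no conflict.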
2020-01-16[PATCH, GCC/ARM, 8/10] Do lazy store & load inline when calling nscall functionMihail Ionescu13-4/+107
This patch adds two new patterns for the VLSTM and VLLDM instructions. cmse_nonsecure_call_inline_register_clear is then modified to generate VLSTM and VLLDM respectively before and after calls to functions with the cmse_nonsecure_call attribute in order to have lazy saving, clearing and restoring of VFP registers. Since these instructions do not do writeback of the base register, the stack is adjusted prior to the lazy store and after the lazy load with appropriate frame debug notes to describe the effect on the CFA register. As with CLRM, VSCCLRM and VSTR/VLDR, the instruction is modeled as an unspecified operation to the memory pointed to by the base register. *** gcc/ChangeLog *** 2020-01-16 Mihail-Calin Ionescu <mihail.ionescu@arm.com> 2020-01-16 Thomas Preud'homme <thomas.preudhomme@arm.com> * config/arm/arm.c (arm_add_cfa_adjust_cfa_note): Declare early. (cmse_nonsecure_call_inline_register_clear): Define new lazy_fpclear variable as true when floating-point ABI is not hard. Replace check against TARGET_HARD_FLOAT_ABI by checks against lazy_fpclear. Generate VLSTM and VLLDM instructions respectively before and after a function call to cmse_nonsecure_call function. * config/arm/unspecs.md (VUNSPEC_VLSTM): Define unspec. (VUNSPEC_VLLDM): Likewise. * config/arm/vfp.md (lazy_store_multiple_insn): New define_insn. (lazy_load_multiple_insn): Likewise. *** gcc/testsuite/ChangeLog *** 2020-01-16 Mihail-Calin Ionescu <mihail.ionescu@arm.com> 2020-01-16 Thomas Preud'homme <thomas.preudhomme@arm.com> * gcc.target/arm/cmse/mainline/8_1m/soft/cmse-13.c: Add check for VLSTM and VLLDM. * gcc.target/arm/cmse/mainline/8_1m/soft/cmse-7.c: Likewise. * gcc.target/arm/cmse/mainline/8_1m/soft/cmse-8.c: Likewise. * gcc.target/arm/cmse/mainline/8_1m/softfp/cmse-13.c: Likewise. * gcc.target/arm/cmse/mainline/8_1m/softfp/cmse-7.c: Likewise. * gcc.target/arm/cmse/mainline/8_1m/softfp/cmse-8.c: Likewise. * gcc.target/arm/cmse/mainline/8_1m/softfp-sp/cmse-7.c: Likewise. 
* gcc.target/arm/cmse/mainline/8_1m/softfp-sp/cmse-8.c: Likewise.
2020-01-16[PATCH, GCC/ARM, 7/10] Clear all VFP regs inline in hardfloat nscall functionsMihail Ionescu9-11/+75
The patch is fairly straightforward in its approach and consists of the following 3 logical changes: - abstract the number of floating-point registers to clear in max_fp_regno - use max_fp_regno to decide how many registers to clear so that the same code works for Armv8-M and Armv8.1-M Mainline - emit vpush and vpop instructions respectively before and after a nonsecure call Note that as in the patch to clear GPRs inline, debug information has to be disabled for VPUSH and VPOP due to VPOP adding a CFA adjustment note for SP when R7 is sometimes used as CFA. ChangeLog entries are as follows: *** gcc/ChangeLog *** 2020-01-16 Mihail-Calin Ionescu <mihail.ionescu@arm.com> 2020-01-16 Thomas Preud'homme <thomas.preudhomme@arm.com> * config/arm/arm.c (vfp_emit_fstmd): Declare early. (arm_emit_vfp_multi_reg_pop): Likewise. (cmse_nonsecure_call_inline_register_clear): Abstract number of VFP registers to clear in max_fp_regno. Emit VPUSH and VPOP to save and restore callee-saved VFP registers. *** gcc/testsuite/ChangeLog *** 2020-01-16 Mihail-Calin Ionescu <mihail.ionescu@arm.com> 2020-01-16 Thomas Preud'homme <thomas.preudhomme@arm.com> * gcc.target/arm/cmse/mainline/8_1m/hard-sp/cmse-13.c: Add check for VPUSH and VPOP and update expectation for VSCCLRM. * gcc.target/arm/cmse/mainline/8_1m/hard-sp/cmse-7.c: Likewise. * gcc.target/arm/cmse/mainline/8_1m/hard-sp/cmse-8.c: Likewise. * gcc.target/arm/cmse/mainline/8_1m/hard/cmse-13.c: Likewise. * gcc.target/arm/cmse/mainline/8_1m/hard/cmse-7.c: Likewise. * gcc.target/arm/cmse/mainline/8_1m/hard/cmse-8.c: Likewise.
2020-01-16[PATCH, GCC/ARM, 6/10] Clear GPRs inline when calling nscall functionMihail Ionescu28-35/+176
Besides changing the set of registers that needs to be cleared inline, this patch also generates the push and pop to save and restore callee-saved registers without trusting the callee inline. To make the code more future-proof, this (currently) Armv8.1-M specific behavior is expressed in terms of clearing of callee-saved registers rather than directly based on the targets. The patch contains one subtlety: debug information is disabled for push and pop because the REG_CFA_RESTORE notes used to describe popping of registers do not stack. Instead, they just reset the debug state for the register to the one at the beginning of the function, which is incorrect for a register that is pushed twice (in prologue and before nonsecure call) and then popped for the first time. In particular, this occasionally trips CFI note creation code when there are two codepaths to the epilogue, one of which does not go through the nonsecure call. Obviously this means that debugging between the push and pop is not reliable. *** gcc/ChangeLog *** 2020-01-16 Mihail-Calin Ionescu <mihail.ionescu@arm.com> 2020-01-16 Thomas Preud'homme <thomas.preudhomme@arm.com> * config/arm/arm.c (arm_emit_multi_reg_pop): Declare early. (cmse_nonsecure_call_clear_caller_saved): Rename into ... (cmse_nonsecure_call_inline_register_clear): This. Save and clear callee-saved GPRs as well as clear ip register before doing a nonsecure call then restore callee-saved GPRs after it when targeting Armv8.1-M Mainline. (arm_reorg): Adapt to function rename. *** gcc/testsuite/ChangeLog *** 2020-01-16 Mihail-Calin Ionescu <mihail.ionescu@arm.com> 2020-01-16 Thomas Preud'homme <thomas.preudhomme@arm.com> * gcc.target/arm/cmse/cmse-1.c: Add check for PUSH and POP and update CLRM check. * gcc.target/arm/cmse/cmse-14.c: Likewise. * gcc.target/arm/cmse/mainline/8_1m/bitfield-4.c: Likewise. * gcc.target/arm/cmse/mainline/8_1m/bitfield-5.c: Likewise. * gcc.target/arm/cmse/mainline/8_1m/bitfield-6.c: Likewise. 
* gcc.target/arm/cmse/mainline/8_1m/bitfield-7.c: Likewise. * gcc.target/arm/cmse/mainline/8_1m/bitfield-8.c: Likewise. * gcc.target/arm/cmse/mainline/8_1m/bitfield-9.c: Likewise. * gcc.target/arm/cmse/mainline/8_1m/bitfield-and-union.c: Likewise. * gcc.target/arm/cmse/mainline/8_1m/hard-sp/cmse-13.c: Likewise. * gcc.target/arm/cmse/mainline/8_1m/hard-sp/cmse-7.c: Likewise. * gcc.target/arm/cmse/mainline/8_1m/hard-sp/cmse-8.c: Likewise. * gcc.target/arm/cmse/mainline/8_1m/hard/cmse-13.c: Likewise. * gcc.target/arm/cmse/mainline/8_1m/hard/cmse-7.c: Likewise. * gcc.target/arm/cmse/mainline/8_1m/hard/cmse-8.c: Likewise. * gcc.target/arm/cmse/mainline/8_1m/soft/cmse-13.c: Likewise. * gcc.target/arm/cmse/mainline/8_1m/soft/cmse-7.c: Likewise. * gcc.target/arm/cmse/mainline/8_1m/soft/cmse-8.c: Likewise. * gcc.target/arm/cmse/mainline/8_1m/soft-sp/cmse-7.c: Likewise. * gcc.target/arm/cmse/mainline/8_1m/soft-sp/cmse-8.c: Likewise. * gcc.target/arm/cmse/mainline/8_1m/softfp/cmse-13.c: Likewise. * gcc.target/arm/cmse/mainline/8_1m/softfp/cmse-7.c: Likewise. * gcc.target/arm/cmse/mainline/8_1m/softfp/cmse-8.c: Likewise. * gcc.target/arm/cmse/mainline/8_1m/softfp/union-1.c: Likewise. * gcc.target/arm/cmse/mainline/8_1m/softfp/union-2.c: Likewise.