Age | Commit message (Collapse) | Author | Files | Lines |
|
This patch adds support in gcc+gcov for modified condition/decision
coverage (MC/DC) with the -fcondition-coverage flag. MC/DC is a type of
test/code coverage and it is particularly important for safety-critical
applicaitons in industries like aviation and automotive. Notably, MC/DC
is required or recommended by:
* DO-178C for the most critical software (Level A) in avionics.
* IEC 61508 for SIL 4.
* ISO 26262-6 for ASIL D.
From the SQLite webpage:
Two methods of measuring test coverage were described above:
"statement" and "branch" coverage. There are many other test
coverage metrics besides these two. Another popular metric is
"Modified Condition/Decision Coverage" or MC/DC. Wikipedia defines
MC/DC as follows:
* Each decision tries every possible outcome.
* Each condition in a decision takes on every possible outcome.
* Each entry and exit point is invoked.
* Each condition in a decision is shown to independently affect
the outcome of the decision.
In the C programming language where && and || are "short-circuit"
operators, MC/DC and branch coverage are very nearly the same thing.
The primary difference is in boolean vector tests. One can test for
any of several bits in bit-vector and still obtain 100% branch test
coverage even though the second element of MC/DC - the requirement
that each condition in a decision take on every possible outcome -
might not be satisfied.
https://sqlite.org/testing.html#mcdc
MC/DC comes in different flavors, the most important being unique cause
MC/DC and masking MC/DC. This patch implements masking MC/DC, which is
works well with short circuiting semantics, and according to John
Chilenski's "An Investigation of Three Forms of the Modified Condition
Decision Coverage (MCDC) Criterion" (2001) is as good as unique cause at
catching bugs.
Whalen, Heimdahl, and De Silva "Efficient Test Coverage Measurement for
MC/DC" describes an algorithm for finding the masking table from an AST
walk, but my algorithm figures this out by analyzing the control flow
graph. The CFG is considered a reduced ordered binary decision diagram
and an input vector a path through the BDD, which is recorded. Specific
edges will mask ("null out") the contribution from earlier path
segments, which can be determined by finding short circuit endpoints.
Masking is most easily understood as circuiting of terms in the
reverse-ordered Boolean function, and the masked conditions do not
affect the decision like short-circuited conditions do not affect the
decision.
A tag/discriminator mapping from gcond->uid is created during
gimplification and made available through the function struct. The
values are unimportant as long as basic conditions constructed from a
single Boolean expression are given the same identifier. This happens in
the breaking down of ANDIF/ORIF trees, so the coverage generally works
well for frontends that create such trees.
Like Whalen et al this implementation records coverage in fixed-size
bitsets which gcov knows how to interpret. Recording conditions only
requires a few bitwise operations per condition and is very fast, but
comes with a limit on the number of terms in a single boolean
expression; the number of bits in a gcov_unsigned_type (which is usually
typedef'd to uint64_t). For most practical purposes this is acceptable,
and by default a warning will be issued if gcc cannot instrument the
expression. This is a practical limitation in the implementation, and
not a limitation of the algorithm, so support for more conditions can be
supported by introducing arbitrary-sized bitsets.
In action it looks pretty similar to the branch coverage. The -g short
opt carries no significance, but was chosen because it was an available
option with the upper-case free too.
gcov --conditions:
3: 17:void fn (int a, int b, int c, int d) {
3: 18: if ((a && (b || c)) && d)
conditions covered 3/8
condition 0 not covered (true false)
condition 1 not covered (true)
condition 2 not covered (true)
condition 3 not covered (true)
1: 19: x = 1;
-: 20: else
2: 21: x = 2;
3: 22:}
gcov --conditions --json-format:
"conditions": [
{
"not_covered_false": [
0
],
"count": 8,
"covered": 3,
"not_covered_true": [
0,
1,
2,
3
]
}
],
Expressions with constants may be heavily rewritten before it reaches
the gimplification, so constructs like int x = a ? 0 : 1 becomes
_x = (_a == 0). From source you would expect coverage, but it gets
neither branch nor condition coverage. The same applies to expressions
like int x = 1 || a which are simply replaced by a constant.
The test suite contains a lot of small programs and functions. Some of
these were designed by hand to test for specific behaviours and graph
shapes, and some are previously-failed test cases in other programs
adapted into the test suite.
gcc/ChangeLog:
* builtins.cc (expand_builtin_fork_or_exec): Check
condition_coverage_flag.
* collect2.cc (main): Add -fno-condition-coverage to OBSTACK.
* common.opt: Add new options -fcondition-coverage and
-Wcoverage-too-many-conditions.
* doc/gcov.texi: Add --conditions documentation.
* doc/invoke.texi: Add -fcondition-coverage documentation.
* function.cc (free_after_compilation): Free cond_uids.
* function.h (struct function): Add cond_uids.
* gcc.cc: Link gcov on -fcondition-coverage.
* gcov-counter.def (GCOV_COUNTER_CONDS): New.
* gcov-dump.cc (tag_conditions): New.
* gcov-io.h (GCOV_TAG_CONDS): New.
(GCOV_TAG_CONDS_LENGTH): New.
(GCOV_TAG_CONDS_NUM): New.
* gcov.cc (class condition_info): New.
(condition_info::condition_info): New.
(condition_info::popcount): New.
(struct coverage_info): New.
(add_condition_counts): New.
(output_conditions): New.
(print_usage): Add -g, --conditions.
(process_args): Likewise.
(output_intermediate_json_line): Output conditions.
(read_graph_file): Read condition counters.
(read_count_file): Likewise.
(file_summary): Print conditions.
(accumulate_line_info): Accumulate conditions.
(output_line_details): Print conditions.
* gimplify.cc (next_cond_uid): New.
(reset_cond_uid): New.
(shortcut_cond_r): Set condition discriminator.
(tag_shortcut_cond): New.
(gimple_associate_condition_with_expr): New.
(shortcut_cond_expr): Set condition discriminator.
(gimplify_cond_expr): Likewise.
(gimplify_function_tree): Call reset_cond_uid.
* ipa-inline.cc (can_early_inline_edge_p): Check
condition_coverage_flag.
* ipa-split.cc (pass_split_functions::gate): Likewise.
* passes.cc (finish_optimization_passes): Likewise.
* profile.cc (struct condcov): New declaration.
(cov_length): Likewise.
(cov_blocks): Likewise.
(cov_masks): Likewise.
(cov_maps): Likewise.
(cov_free): Likewise.
(instrument_decisions): New.
(read_thunk_profile): Control output to file.
(branch_prob): Call find_conditions, instrument_decisions.
(init_branch_prob): Add total_num_conds.
(end_branch_prob): Likewise.
* tree-core.h (struct tree_exp): Add condition_uid.
* tree-profile.cc (struct conds_ctx): New.
(CONDITIONS_MAX_TERMS): New.
(EDGE_CONDITION): New.
(topological_cmp): New.
(index_of): New.
(single_p): New.
(single_edge): New.
(contract_edge_up): New.
(struct outcomes): New.
(conditional_succs): New.
(condition_index): New.
(condition_uid): New.
(masking_vectors): New.
(emit_assign): New.
(emit_bitwise_op): New.
(make_top_index_visit): New.
(make_top_index): New.
(paths_between): New.
(struct condcov): New.
(cov_length): New.
(cov_blocks): New.
(cov_masks): New.
(cov_maps): New.
(cov_free): New.
(find_conditions): New.
(struct counters): New.
(find_counters): New.
(resolve_counter): New.
(resolve_counters): New.
(instrument_decisions): New.
(tree_profiling): Check condition_coverage_flag.
(pass_ipa_tree_profile::gate): Likewise.
* tree.h (SET_EXPR_UID): New.
(EXPR_COND_UID): New.
libgcc/ChangeLog:
* libgcov-merge.c (__gcov_merge_ior): New.
gcc/testsuite/ChangeLog:
* lib/gcov.exp: Add condition coverage test function.
* g++.dg/gcov/gcov-18.C: New test.
* gcc.misc-tests/gcov-19.c: New test.
* gcc.misc-tests/gcov-20.c: New test.
* gcc.misc-tests/gcov-21.c: New test.
* gcc.misc-tests/gcov-22.c: New test.
* gcc.misc-tests/gcov-23.c: New test.
|
|
This another one of these ICE after error issues with the
gimplifier and a fallout from r12-3278-g823685221de986af.
This case happens when we are trying to fold memcpy/memmove.
There is already code to try to catch ERROR_MARKs as arguments
to the builtins so just need to change them to use error_operand_p
which checks the type of the expression to see if it was an error mark
also.
Bootstrapped and tested on x86_64-linux-gnu with no regressions.
gcc/ChangeLog:
PR c/109619
* builtins.cc (fold_builtin_1): Use error_operand_p
instead of checking against ERROR_MARK.
(fold_builtin_2): Likewise.
(fold_builtin_3): Likewise.
gcc/testsuite/ChangeLog:
PR c/109619
* gcc.dg/redecl-26.c: New test.
Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
|
|
I've searched for some uses of (HOST_WIDE_INT) constant or (unsigned
HOST_WIDE_INT) constant and turned them into uses of the appropriate
macros.
THere are quite a few cases in non-i386 backends but I've left that out
for now.
The only behavior change is in build_replicated_int_cst where the
left shift was done in HOST_WIDE_INT type but assigned to unsigned
HOST_WIDE_INT, which I've changed into unsigned HOST_WIDE_INT shift.
2024-02-24 Jakub Jelinek <jakub@redhat.com>
gcc/
* builtins.cc (fold_builtin_isascii): Use HOST_WIDE_INT_UC macro.
* combine.cc (make_field_assignment): Use HOST_WIDE_INT_1U macro.
* double-int.cc (double_int::mask): Use HOST_WIDE_INT_UC macros.
* genattrtab.cc (attr_alt_complement): Use HOST_WIDE_INT_1 macro.
(mk_attr_alt): Use HOST_WIDE_INT_0 macro.
* genautomata.cc (bitmap_set_bit, CLEAR_BIT): Use HOST_WIDE_INT_1
macros.
* ipa-strub.cc (can_strub_internally_p): Use HOST_WIDE_INT_1 macro.
* loop-iv.cc (implies_p): Use HOST_WIDE_INT_1U macro.
* pretty-print.cc (test_pp_format): Use HOST_WIDE_INT_C and
HOST_WIDE_INT_UC macros.
* rtlanal.cc (nonzero_bits1): Use HOST_WIDE_INT_UC macro.
* tree.cc (build_replicated_int_cst): Use HOST_WIDE_INT_1U macro.
* tree.h (DECL_OFFSET_ALIGN): Use HOST_WIDE_INT_1U macro.
* tree-ssa-structalias.cc (dump_varinfo): Use ~HOST_WIDE_INT_0U
macros.
* wide-int.cc (divmod_internal_2): Use HOST_WIDE_INT_1U macro.
* config/i386/constraints.md (define_constraint "L"): Use
HOST_WIDE_INT_C macro.
* config/i386/i386.md (movabsq split peephole2): Use HOST_WIDE_INT_C
macro.
(movl + movb peephole2): Likewise.
* config/i386/predicates.md (x86_64_zext_immediate_operand): Likewise.
(const_32bit_mask): Likewise.
gcc/objc/
* objc-encoding.cc (encode_array): Use HOST_WIDE_INT_0 macros.
|
|
Recent libhwasan updates[1] intercept various string and memory functions.
These functions have checking in them, which means there's no need to
inline the checking.
This patch marks said functions as intercepted, and adjusts a testcase
to handle the difference. It also looks for HWASAN in a check in
expand_builtin. This check originally is there to avoid using expand to
inline the behaviour of builtins like memset which are intercepted by
ASAN and hence which we rely on the function call staying as a function
call. With the new reliance on function calls in HWASAN we need to do
the same thing for HWASAN too.
HWASAN and ASAN don't seem to however instrument the same functions.
Looking into libsanitizer/sanitizer_common/sanitizer_common_interceptors_memintrinsics.inc
it looks like the common ones are memset, memmove and memcpy.
The rest of the routines for asan seem to be defined in
compiler-rt/lib/asan/asan_interceptors.h however compiler-rt/lib/hwasan/
does not have such a file but it does have
compiler-rt/lib/hwasan/hwasan_platform_interceptors.h which it looks like is
forcing off everything but memset, memmove, memcpy, memcmp and bcmp.
As such I've taken those as the final list that hwasan currently supports.
This also means that on future updates this list should be cross checked.
[1] https://discourse.llvm.org/t/hwasan-question-about-the-recent-interceptors-being-added/75351
gcc/ChangeLog:
PR sanitizer/112644
* asan.h (asan_intercepted_p): Incercept memset, memmove, memcpy and
memcmp.
* builtins.cc (expand_builtin): Include HWASAN when checking for
builtin inlining.
gcc/testsuite/ChangeLog:
PR sanitizer/112644
* c-c++-common/hwasan/builtin-special-handling.c: Update testcase.
Co-Authored-By: Matthew Malcomson <matthew.malcomson@arm.com>
|
|
strub: introduce STACK_ADDRESS_OFFSET
Since STACK_POINTER_OFFSET is not necessarily at the boundary between
caller- and callee-owned stack, as desired by
__builtin_stack_address(), and using it as if it were or not causes
problems, introduce a new macro so that ports can define it suitably,
without modifying STACK_POINTER_OFFSET.
for gcc/ChangeLog
PR middle-end/112917
PR middle-end/113100
* builtins.cc (expand_builtin_stack_address): Use
STACK_ADDRESS_OFFSET.
* doc/extend.texi (__builtin_stack_address): Adjust.
* config/sparc/sparc.h (STACK_ADDRESS_OFFSET): Define.
* doc/tm.texi.in (STACK_ADDRESS_OFFSET): Document.
* doc/tm.texi: Rebuilt.
|
|
The symbols for the functions supporting heap-based trampolines were
exported at an incorrect symbol version, the following patch fixes that.
As requested in the PR, this also renames __builtin_nested_func_ptr* to
__gcc_nested_func_ptr*. In carrying our the rename, we move the builtins
to use DEF_EXT_LIB_BUILTIN.
PR libgcc/113402
gcc/ChangeLog:
* builtins.cc (expand_builtin): Handle BUILT_IN_GCC_NESTED_PTR_CREATED
and BUILT_IN_GCC_NESTED_PTR_DELETED.
* builtins.def (BUILT_IN_GCC_NESTED_PTR_CREATED,
BUILT_IN_GCC_NESTED_PTR_DELETED): Make these builtins LIB-EXT and
rename the library fallbacks to __gcc_nested_func_ptr_created and
__gcc_nested_func_ptr_deleted.
* doc/invoke.texi: Rename these to __gcc_nested_func_ptr_created
and __gcc_nested_func_ptr_deleted.
* tree-nested.cc (finalize_nesting_tree_1): Use builtin_explicit for
BUILT_IN_GCC_NESTED_PTR_CREATED and BUILT_IN_GCC_NESTED_PTR_DELETED.
* tree.cc (build_common_builtin_nodes): Build the
BUILT_IN_GCC_NESTED_PTR_CREATED and BUILT_IN_GCC_NESTED_PTR_DELETED local
builtins only for non-explicit.
libgcc/ChangeLog:
* config/aarch64/heap-trampoline.c: Rename
__builtin_nested_func_ptr_created to __gcc_nested_func_ptr_created and
__builtin_nested_func_ptr_deleted to __gcc_nested_func_ptr_deleted.
* config/i386/heap-trampoline.c: Likewise.
* libgcc2.h: Likewise.
* libgcc-std.ver.in (GCC_7.0.0): Likewise and then move
__gcc_nested_func_ptr_created and
__gcc_nested_func_ptr_deleted from this symbol version to ...
(GCC_14.0.0): ... this one.
Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
Co-authored-by: Jakub Jelinek <jakub@redhat.com>
|
|
As PR113100 shows, the unbiasing introduced by r14-6737 can
cause the scrubbing to overrun and screw some critical data
on stack like saved toc base consequently cause segfault.
By checking PR112917, IMHO we should keep this unbiasing
guarded under SPARC_STACK_BOUNDARY_HACK (TARGET_ARCH64 &&
TARGET_STACK_BIAS), similar to some existing code special
treating SPARC stack bias.
PR middle-end/113100
gcc/ChangeLog:
* builtins.cc (expand_builtin_stack_address): Guard stack point
adjustment with SPARC_STACK_BOUNDARY_HACK.
|
|
|
|
When fixing the PR, I failed to remove the comment that raised the
very concern that the PR confirmed, and that the earlier patch for the
PR fixed.
for gcc/ChangeLog
PR target/112778
* builtins.cc (try_store_by_multiple_pieces): Drop obsolete
comment.
|
|
The stack pointer is biased by 2047 bytes on sparc64, so the range it
delimits is way off. Unbias the addresses returned by
__builtin_stack_address (), so that the strub builtins, inlined or
not, can function correctly. I've considered introducing a new target
macro, but using STACK_POINTER_OFFSET seems safe, and it enables the
register save areas to be scrubbed as well.
Because of the large fixed-size outgoing args area next to the
register save area on sparc, we still need __strub_leave to not
allocate its own frame, otherwise it won't be able to clear part of
the frame it should.
for gcc/ChangeLog
PR middle-end/112917
* builtins.cc (expand_bultin_stack_address): Add
STACK_POINTER_OFFSET.
* doc/extend.texi (__builtin_stack_address): Adjust.
|
|
Instead of get and set macros to apply a delta, use a single macro
that resorts to a temporary wrapper class to apply it.
for gcc/ChangeLog
* builtins.cc (delta_type): New template class.
(set_apply_args_size, get_apply_args_size): Replace with...
(saved_apply_args_size): ... this.
(set_apply_result_size, get_apply_result_size): Replace with...
(saved_apply_result_size): ... this.
(apply_args_size, apply_result_size): Adjust.
|
|
The computation of apply_args_size and apply_result_size is saved in a
static variable, so that the corresponding _mode arrays are
initialized only once. That is not compatible with switchable
targets, and ARM's arm_set_current_function, by saving and restoring
target globals, exercises this problem with a testcase such as that in
the PR, in which more than one function in the translation unit calls
__builtin_apply or __builtin_return, respectively.
This patch moves the _size statics into the target_builtins array,
with a bit of ugliness over _plus_one so that zero initialization of
the struct does the right thing.
for gcc/ChangeLog
PR target/112334
* builtins.h (target_builtins): Add fields for apply_args_size
and apply_result_size.
* builtins.cc (apply_args_size, apply_result_size): Cache
results in fields rather than in static variables.
(get_apply_args_size, set_apply_args_size): New.
(get_apply_result_size, set_apply_result_size): New.
|
|
The recently-added logic for -finline-stringops=memset introduced an
assumption that doesn't necessarily hold, namely, that
can_store_by_pieces of a larger size implies can_store_by_pieces by
smaller sizes. Checks for all sizes the by-multiple-pieces machinery
might use before committing to an expansion pattern.
for gcc/ChangeLog
PR target/112778
* builtins.cc (can_store_by_multiple_pieces): New.
(try_store_by_multiple_pieces): Call it.
for gcc/testsuite/ChangeLog
PR target/112778
* gcc.dg/inline-mem-cmp-pr112778.c: New.
|
|
On aarch64 -milp32, and presumably on other such targets, ptr can be
in a different mode than ptr_mode in the testcase. Cope with it.
for gcc/ChangeLog
PR target/112804
* builtins.cc (try_store_by_multiple_pieces): Use ptr's mode
for the increment.
for gcc/testsuite/ChangeLog
PR target/112804
* gcc.target/aarch64/inline-mem-set-pr112804.c: New.
|
|
This commit adds -fopenmp-allocators which enables support for
'omp allocators' and 'omp allocate' that are associated with a Fortran
allocate-stmt. If such a construct is encountered, an error is shown,
unless the -fopenmp-allocators flag is present.
With -fopenmp -fopenmp-allocators, those constructs get turned into
GOMP_alloc allocations, while -fopenmp-allocators (also without -fopenmp)
ensures deallocation and reallocation (via intrinsic assignments) are
properly directed to GOMP_free/omp_realloc - while normal Fortran
allocations are processed by free/realloc.
In order to distinguish a 'malloc'ed from a 'GOMP_alloc'ed memory, the
version field of the Fortran array discriptor is (mis)used: 0 indicates
the normal Fortran allocation while 1 denotes GOMP_alloc. For scalars,
there is record keeping in libgomp: GOMP_add_alloc(ptr) will add the
pointer address to a splay_tree while GOMP_is_alloc(ptr) will return
true it was previously added but also removes it from the list.
Besides Fortran FE work, BUILT_IN_GOMP_REALLOC is no part of
omp-builtins.def and libgomp gains the mentioned two new function.
gcc/ChangeLog:
* builtin-types.def (BT_FN_PTR_PTR_SIZE_PTRMODE_PTRMODE): New.
* omp-builtins.def (BUILT_IN_GOMP_REALLOC): New.
* builtins.cc (builtin_fnspec): Handle it.
* gimple-ssa-warn-access.cc (fndecl_alloc_p,
matching_alloc_calls_p): Likewise.
* gimple.cc (nonfreeing_call_p): Likewise.
* predict.cc (expr_expected_value_1): Likewise.
* tree-ssa-ccp.cc (evaluate_stmt): Likewise.
* tree.cc (fndecl_dealloc_argno): Likewise.
gcc/fortran/ChangeLog:
* dump-parse-tree.cc (show_omp_node): Handle EXEC_OMP_ALLOCATE
and EXEC_OMP_ALLOCATORS.
* f95-lang.cc (ATTR_ALLOC_WARN_UNUSED_RESULT_SIZE_2_NOTHROW_LIST):
Add 'ECF_LEAF | ECF_MALLOC' to existing 'ECF_NOTHROW'.
(ATTR_ALLOC_WARN_UNUSED_RESULT_SIZE_2_NOTHROW_LEAF_LIST): Define.
* gfortran.h (gfc_omp_clauses): Add contained_in_target_construct.
* invoke.texi (-fopenacc, -fopenmp): Update based on C version.
(-fopenmp-simd): New, based on C version.
(-fopenmp-allocators): New.
* lang.opt (fopenmp-allocators): Add.
* openmp.cc (resolve_omp_clauses): For allocators/allocate directive,
add target and no dynamic_allocators diagnostic and more invalid
diagnostic.
* parse.cc (decode_omp_directive): Set contains_teams_construct.
* trans-array.h (gfc_array_allocate): Update prototype.
(gfc_conv_descriptor_version): New prototype.
* trans-decl.cc (gfc_init_default_dt): Fix comment.
* trans-array.cc (gfc_conv_descriptor_version): New.
(gfc_array_allocate): Support GOMP_alloc allocation.
(gfc_alloc_allocatable_for_assignment, structure_alloc_comps):
Handle GOMP_free/omp_realloc as needed.
* trans-expr.cc (gfc_conv_procedure_call): Likewise.
(alloc_scalar_allocatable_for_assignment): Likewise.
* trans-intrinsic.cc (conv_intrinsic_move_alloc): Likewise.
* trans-openmp.cc (gfc_trans_omp_allocators,
gfc_trans_omp_directive): Handle allocators/allocate directive.
(gfc_omp_call_add_alloc, gfc_omp_call_is_alloc): New.
* trans-stmt.h (gfc_trans_allocate): Update prototype.
* trans-stmt.cc (gfc_trans_allocate): Support GOMP_alloc.
* trans-types.cc (gfc_get_dtype_rank_type): Set version field.
* trans.cc (gfc_allocate_using_malloc, gfc_allocate_allocatable):
Update to handle GOMP_alloc.
(gfc_deallocate_with_status, gfc_deallocate_scalar_with_status):
Handle GOMP_free.
(trans_code): Update call.
* trans.h (gfc_allocate_allocatable, gfc_allocate_using_malloc):
Update prototype.
(gfc_omp_call_add_alloc, gfc_omp_call_is_alloc): New prototype.
* types.def (BT_FN_PTR_PTR_SIZE_PTRMODE_PTRMODE): New.
libgomp/ChangeLog:
* allocator.c (struct fort_alloc_splay_tree_key_s,
fort_alloc_splay_compare, GOMP_add_alloc, GOMP_is_alloc): New.
* libgomp.h: Define splay_tree_static for 'reverse' splay tree.
* libgomp.map (GOMP_5.1.2): New; add GOMP_add_alloc and
GOMP_is_alloc; move GOMP_target_map_indirect_ptr from ...
(GOMP_5.1.1): ... here.
* libgomp.texi (Impl. Status, Memory management): Update for
allocators/allocate directives.
* splay-tree.c: Handle splay_tree_static define to declare all
functions as static.
(splay_tree_lookup_node): New.
* splay-tree.h: Handle splay_tree_decl_only define.
(splay_tree_lookup_node): New prototype.
* target.c: Define splay_tree_static for 'reverse'.
* testsuite/libgomp.fortran/allocators-1.f90: New test.
* testsuite/libgomp.fortran/allocators-2.f90: New test.
* testsuite/libgomp.fortran/allocators-3.f90: New test.
* testsuite/libgomp.fortran/allocators-4.f90: New test.
* testsuite/libgomp.fortran/allocators-5.f90: New test.
gcc/testsuite/ChangeLog:
* gfortran.dg/gomp/allocate-14.f90: Add coarray and
not-listed tests.
* gfortran.dg/gomp/allocate-5.f90: Remove sorry dg-message.
* gfortran.dg/bind_c_array_params_2.f90: Update expected
dump for dtype '.version=0'.
* gfortran.dg/gomp/allocate-16.f90: New test.
* gfortran.dg/gomp/allocators-3.f90: New test.
* gfortran.dg/gomp/allocators-4.f90: New test.
|
|
This patch adds the strub attribute for function and variable types,
command-line options, passes and adjustments to implement it,
documentation, and tests.
Stack scrubbing is implemented in a machine-independent way: functions
with strub enabled are modified so that they take an extra stack
watermark argument, that they update with their stack use, and the
caller can then zero it out once it regains control, whether by return
or exception. There are two ways to go about it: at-calls, that
modifies the visible interface (signature) of the function, and
internal, in which the body is moved to a clone, the clone undergoes
the interface change, and the function becomes a wrapper, preserving
its original interface, that calls the clone and then clears the stack
used by it.
Variables can also be annotated with the strub attribute, so that
functions that read from them get stack scrubbing enabled implicitly,
whether at-calls, for functions only usable within a translation unit,
or internal, for functions whose interfaces must not be modified.
There is a strict mode, in which functions that have their stack
scrubbed can only call other functions with stack-scrubbing
interfaces, or those explicitly marked as callable from strub
contexts, so that an entire call chain gets scrubbing, at once or
piecemeal depending on optimization levels. In the default mode,
relaxed, this requirement is not enforced by the compiler.
The implementation adds two IPA passes, one that assigns strub modes
early on, another that modifies interfaces and adds calls to the
builtins that jointly implement stack scrubbing. Another builtin,
that obtains the stack pointer, is added for use in the implementation
of the builtins, whether expanded inline or called in libgcc.
There are new command-line options to change operation modes and to
force the feature disabled; it is enabled by default, but it has no
effect and is implicitly disabled if the strub attribute is never
used. There are also options meant to use for testing the feature,
enabling different strubbing modes for all (viable) functions.
for gcc/ChangeLog
* Makefile.in (OBJS): Add ipa-strub.o.
(GTFILES): Add ipa-strub.cc.
* builtins.def (BUILT_IN_STACK_ADDRESS): New.
(BUILT_IN___STRUB_ENTER): New.
(BUILT_IN___STRUB_UPDATE): New.
(BUILT_IN___STRUB_LEAVE): New.
* builtins.cc: Include ipa-strub.h.
(STACK_STOPS, STACK_UNSIGNED): Define.
(expand_builtin_stack_address): New.
(expand_builtin_strub_enter): New.
(expand_builtin_strub_update): New.
(expand_builtin_strub_leave): New.
(expand_builtin): Call them.
* common.opt (fstrub=*): New options.
* doc/extend.texi (strub): New type attribute.
(__builtin_stack_address): New function.
(Stack Scrubbing): New section.
* doc/invoke.texi (-fstrub=*): New options.
(-fdump-ipa-*): New passes.
* gengtype-lex.l: Ignore multi-line pp-directives.
* ipa-inline.cc: Include ipa-strub.h.
(can_inline_edge_p): Test strub_inlinable_to_p.
* ipa-split.cc: Include ipa-strub.h.
(execute_split_functions): Test strub_splittable_p.
* ipa-strub.cc, ipa-strub.h: New.
* passes.def: Add strub_mode and strub passes.
* tree-cfg.cc (gimple_verify_flow_info): Note on debug stmts.
* tree-pass.h (make_pass_ipa_strub_mode): Declare.
(make_pass_ipa_strub): Declare.
(make_pass_ipa_function_and_variable_visibility): Fix
formatting.
* tree-ssa-ccp.cc (optimize_stack_restore): Keep restores
before strub leave.
* attribs.cc: Include ipa-strub.h.
(decl_attributes): Support applying attributes to function
type, rather than pointer type, at handler's request.
(comp_type_attributes): Combine strub_comptypes and target
comp_type results.
* doc/tm.texi.in (TARGET_STRUB_USE_DYNAMIC_ARRAY): New.
(TARGET_STRUB_MAY_USE_MEMSET): New.
* doc/tm.texi: Rebuilt.
* cgraph.h (symtab_node::reset): Add preserve_comdat_group
param, with a default.
* cgraphunit.cc (symtab_node::reset): Use it.
for gcc/c-family/ChangeLog
* c-attribs.cc: Include ipa-strub.h.
(handle_strub_attribute): New.
(c_common_attribute_table): Add strub.
for gcc/ada/ChangeLog
* gcc-interface/trans.cc: Include ipa-strub.h.
(gigi): Make internal decls for targets of compiler-generated
calls strub-callable too.
(build_raise_check): Likewise.
* gcc-interface/utils.cc: Include ipa-strub.h.
(handle_strub_attribute): New.
(gnat_internal_attribute_table): Add strub.
for gcc/testsuite/ChangeLog
* c-c++-common/strub-O0.c: New.
* c-c++-common/strub-O1.c: New.
* c-c++-common/strub-O2.c: New.
* c-c++-common/strub-O2fni.c: New.
* c-c++-common/strub-O3.c: New.
* c-c++-common/strub-O3fni.c: New.
* c-c++-common/strub-Og.c: New.
* c-c++-common/strub-Os.c: New.
* c-c++-common/strub-all1.c: New.
* c-c++-common/strub-all2.c: New.
* c-c++-common/strub-apply1.c: New.
* c-c++-common/strub-apply2.c: New.
* c-c++-common/strub-apply3.c: New.
* c-c++-common/strub-apply4.c: New.
* c-c++-common/strub-at-calls1.c: New.
* c-c++-common/strub-at-calls2.c: New.
* c-c++-common/strub-defer-O1.c: New.
* c-c++-common/strub-defer-O2.c: New.
* c-c++-common/strub-defer-O3.c: New.
* c-c++-common/strub-defer-Os.c: New.
* c-c++-common/strub-internal1.c: New.
* c-c++-common/strub-internal2.c: New.
* c-c++-common/strub-parms1.c: New.
* c-c++-common/strub-parms2.c: New.
* c-c++-common/strub-parms3.c: New.
* c-c++-common/strub-relaxed1.c: New.
* c-c++-common/strub-relaxed2.c: New.
* c-c++-common/strub-short-O0-exc.c: New.
* c-c++-common/strub-short-O0.c: New.
* c-c++-common/strub-short-O1.c: New.
* c-c++-common/strub-short-O2.c: New.
* c-c++-common/strub-short-O3.c: New.
* c-c++-common/strub-short-Os.c: New.
* c-c++-common/strub-strict1.c: New.
* c-c++-common/strub-strict2.c: New.
* c-c++-common/strub-tail-O1.c: New.
* c-c++-common/strub-tail-O2.c: New.
* c-c++-common/torture/strub-callable1.c: New.
* c-c++-common/torture/strub-callable2.c: New.
* c-c++-common/torture/strub-const1.c: New.
* c-c++-common/torture/strub-const2.c: New.
* c-c++-common/torture/strub-const3.c: New.
* c-c++-common/torture/strub-const4.c: New.
* c-c++-common/torture/strub-data1.c: New.
* c-c++-common/torture/strub-data2.c: New.
* c-c++-common/torture/strub-data3.c: New.
* c-c++-common/torture/strub-data4.c: New.
* c-c++-common/torture/strub-data5.c: New.
* c-c++-common/torture/strub-indcall1.c: New.
* c-c++-common/torture/strub-indcall2.c: New.
* c-c++-common/torture/strub-indcall3.c: New.
* c-c++-common/torture/strub-inlinable1.c: New.
* c-c++-common/torture/strub-inlinable2.c: New.
* c-c++-common/torture/strub-ptrfn1.c: New.
* c-c++-common/torture/strub-ptrfn2.c: New.
* c-c++-common/torture/strub-ptrfn3.c: New.
* c-c++-common/torture/strub-ptrfn4.c: New.
* c-c++-common/torture/strub-pure1.c: New.
* c-c++-common/torture/strub-pure2.c: New.
* c-c++-common/torture/strub-pure3.c: New.
* c-c++-common/torture/strub-pure4.c: New.
* c-c++-common/torture/strub-run1.c: New.
* c-c++-common/torture/strub-run2.c: New.
* c-c++-common/torture/strub-run3.c: New.
* c-c++-common/torture/strub-run4.c: New.
* c-c++-common/torture/strub-run4c.c: New.
* c-c++-common/torture/strub-run4d.c: New.
* c-c++-common/torture/strub-run4i.c: New.
* g++.dg/strub-run1.C: New.
* g++.dg/torture/strub-init1.C: New.
* g++.dg/torture/strub-init2.C: New.
* g++.dg/torture/strub-init3.C: New.
* gnat.dg/strub_attr.adb, gnat.dg/strub_attr.ads: New.
* gnat.dg/strub_ind.adb, gnat.dg/strub_ind.ads: New.
for libgcc/ChangeLog
* Makefile.in (LIB2ADD): Add strub.c.
* libgcc2.h (__strub_enter, __strub_update, __strub_leave):
Declare.
* strub.c: New.
* libgcc-std.ver.in (__strub_enter): Add to GCC_14.0.0.
(__strub_update, __strub_leave): Likewise.
|
|
The following avoids turning aggregate copy involving non-default
address-spaces to memcpy since that is not prepared for that.
GIMPLE verification no longer accepts WITH_SIZE_EXPR in aggregate
copies, the following re-allows that for the RHS. I also needed
to adjust one assert in DCE.
get_memory_address is used for string builtin expansion, so instead
of fixing that up for non-generic address-spaces I've put an assert
there.
I'll note that the same issue exists for initialization from an
empty CTOR which we gimplify to a memset call but since we are
not prepared to handle RTL expansion of the original VLA init and
I failed to provide test coverage (without extending the GNU C
extension for VLA structs) and the Ada frontend (or other frontends)
to not have address-space support the patch instead asserts we only
see generic address-spaces there.
PR middle-end/112830
* gimplify.cc (gimplify_modify_expr): Avoid turning aggregate
copy of non-generic address-spaces to memcpy.
(gimplify_modify_expr_to_memcpy): Assert we are dealing with
a copy inside the generic address-space.
(gimplify_modify_expr_to_memset): Likewise.
* tree-cfg.cc (verify_gimple_assign_single): Allow
WITH_SIZE_EXPR as part of the RHS of an assignment.
* builtins.cc (get_memory_address): Assert we are dealing
with the generic address-space.
* tree-ssa-dce.cc (ref_may_be_aliased): Handle WITH_SIZE_EXPR.
* gcc.target/avr/pr112830.c: New testcase.
* gcc.target/i386/pr112830.c: Likewise.
|
|
try_store_by_multiple_pieces was added not long ago, enabling
variable-sized memset to be expanded inline when the worst-case
in-range constant length would, using conditional blocks with powers
of two to cover all possibilities of length and alignment.
This patch introduces -finline-stringops[=fn] to request expansions to
start with a loop, so as to still take advantage of known alignment
even with long lengths, but without necessarily adding store blocks
for every power of two.
This makes it possible for the supported stringops (memset, memcpy,
memmove, memset) to be expanded, even if storing a single byte per
iteration. Surely efficient implementations can run faster, with a
pre-loop to increase alignment, but that would likely be excessive for
inline expansions.
Still, in some cases, such as in freestanding environments, users
prefer to inline such stringops, especially those that the compiler
may introduce itself, even if the expansion is not as performant as a
highly optimized C library implementation could be, to avoid
depending on a C runtime library.
for gcc/ChangeLog
* expr.cc (emit_block_move_hints): Take ctz of len. Obey
-finline-stringops. Use oriented or sized loop.
(emit_block_move): Take ctz of len, and pass it on.
(emit_block_move_via_sized_loop): New.
(emit_block_move_via_oriented_loop): New.
(emit_block_move_via_loop): Take incr. Move an incr-sized
block per iteration.
(emit_block_cmp_via_cmpmem): Take ctz of len. Obey
-finline-stringops.
(emit_block_cmp_via_loop): New.
* expr.h (emit_block_move): Add ctz of len defaulting to zero.
(emit_block_move_hints): Likewise.
(emit_block_cmp_hints): Likewise.
* builtins.cc (expand_builtin_memory_copy_args): Pass ctz of
len to emit_block_move_hints.
(try_store_by_multiple_pieces): Support starting with a loop.
(expand_builtin_memcmp): Pass ctz of len to
emit_block_cmp_hints.
(expand_builtin): Allow inline expansion of memset, memcpy,
memmove and memcmp if requested.
* common.opt (finline-stringops): New.
(ilsop_fn): New enum.
* flag-types.h (enum ilsop_fn): New.
* doc/invoke.texi (-finline-stringops): Add.
for gcc/testsuite/ChangeLog
* gcc.dg/torture/inline-mem-cmp-1.c: New.
* gcc.dg/torture/inline-mem-cpy-1.c: New.
* gcc.dg/torture/inline-mem-cpy-cmp-1.c: New.
* gcc.dg/torture/inline-mem-move-1.c: New.
* gcc.dg/torture/inline-mem-set-1.c: New.
|
|
As the testcase shows, I've missed one spot where initially the code thinks
it could use 2 argument IFN_CLZ/IFN_CTZ form, but then verifies it can't
because it doesn't have the right target value and turns it into the
arg0 ? arg1 : .C[LT]Z (arg0)
form. That form evaluates the argument twice though and so needs save_expr,
which I've missed to call in that case. In other cases where it is known
from the beginning that it will be needed (e.g. the __builtin_clzg case
on types smaller than unsigned int where we'll need to add an addend
to the clz value) or the unsigned __int128 expansion called save_expr
before.
2023-11-21 Jakub Jelinek <jakub@redhat.com>
PR middle-end/112639
* builtins.cc (fold_builtin_bit_query): If arg0 has side-effects, arg1
is specified but cleared, call save_expr on arg0.
* gcc.dg/torture/pr112639.c: New test.
|
|
While filing a clang request to return 18 on _BitInts for
__builtin_classify_type instead of -1 they return currently, I've
noticed that we return -1 for vector types. Initially I wanted to change
behavior just for __builtin_classify_type (type) form, as that is new in
GCC 14 and we've returned for 20+ years -1 for __builtin_classify_type
on vector expressions, but I was convinved otherwise, so this changes
the behavior even for that and now returns 19.
2023-11-20 Jakub Jelinek <jakub@redhat.com>
gcc/
* typeclass.h (enum type_class): Add vector_type_class.
* builtins.cc (type_to_class): Return vector_type_class for
VECTOR_TYPE.
* doc/extend.texi (__builtin_classify_type): Mention bit-precise
integer types and vector types.
gcc/testsuite/
* c-c++-common/builtin-classify-type-1.c (main): Add tests for vector
types.
|
|
The following patch adds 6 new type-generic builtins,
__builtin_clzg
__builtin_ctzg
__builtin_clrsbg
__builtin_ffsg
__builtin_parityg
__builtin_popcountg
The g at the end stands for generic because the unsuffixed variant
of the builtins already have unsigned int or int arguments.
The main reason to add these is to support arbitrary unsigned (for
clrsb/ffs signed) bit-precise integer types and also __int128 which
wasn't supported by the existing builtins, so that e.g. <stdbit.h>
type-generic functions could then support not just bit-precise unsigned
integer type whose width matches a standard or extended integer type,
but others too.
None of these new builtins promote their first argument, so the argument
can be e.g. unsigned char or unsigned short or unsigned __int20 etc.
The first 2 support either 1 or 2 arguments, if only 1 argument is supplied,
the behavior is undefined for argument 0 like for other __builtin_c[lt]z*
builtins, if 2 arguments are supplied, the second argument should be int
that will be returned if the argument is 0. All other builtins have
just one argument. For __builtin_clrsbg and __builtin_ffsg the argument
shall be any signed standard/extended or bit-precise integer, for the others
any unsigned standard/extended or bit-precise integer (bool not allowed).
One possibility would be to also allow signed integer types for
the clz/ctz/parity/popcount ones (and just cast the argument to
unsigned_type_for during folding) and similarly unsigned integer types
for the clrsb/ffs ones, dunno what is better; for stdbit.h the current
version is sufficient and diagnoses use of the inappropriate sign,
though on the other side I wonder if users won't be confused by
__builtin_clzg (1) being an error and having to write __builtin_clzg (1U).
The new builtins are lowered to corresponding builtins with other suffixes
or internal calls (plus casts and adjustments where needed) during FE
folding or during gimplification at latest, the non-suffixed builtins
handling precisions up to precision of int, l up to precision of long,
ll up to precision of long long, up to __int128 precision lowered to
double-word expansion early and the rest (which must be _BitInt) lowered
to internal fn calls - those are then lowered during bitint lowering pass.
The patch also changes representation of IFN_CLZ and IFN_CTZ calls,
previously they were in the IL only if they are directly supported optab
and depending on C[LT]Z_DEFINED_VALUE_AT_ZERO (...) == 2 they had or didn't
have defined behavior at 0, now they are in the IL either if directly
supported optab, or for the large/huge BITINT_TYPEs and they have either
1 or 2 arguments. If one, the behavior is undefined at zero, if 2, the
second argument is an int constant that should be returned for 0.
As there is no extra support during expansion, for directly supported optab
the second argument if present should still match the
C[LT]Z_DEFINED_VALUE_AT_ZERO (...) == 2 value, but for BITINT_TYPE arguments
it can be arbitrary int INTEGER_CST.
The indended uses in stdbit.h are e.g.
#ifdef __has_builtin
#if __has_builtin(__builtin_clzg) && __has_builtin(__builtin_ctzg) && __has_builtin(__builtin_popcountg)
#define stdc_leading_zeros(value) \
((unsigned int) __builtin_clzg (value, __builtin_popcountg ((__typeof (value)) ~(__typeof (value)) 0)))
#define stdc_leading_ones(value) \
((unsigned int) __builtin_clzg ((__typeof (value)) ~(value), __builtin_popcountg ((__typeof (value)) ~(__typeof (value)) 0)))
#define stdc_first_trailing_one(value) \
((unsigned int) (__builtin_ctzg (value, -1) + 1))
#define stdc_trailing_zeros(value) \
((unsigned int) __builtin_ctzg (value, __builtin_popcountg ((__typeof (value)) ~(__typeof (value)) 0)))
#endif
#endif
where __builtin_popcountg ((__typeof (x)) -1) computes the bit precision
of x's type (kind of _Bitwidthof (x) alternative).
They also allow casting of arbitrary unsigned _BitInt other than
unsigned _BitInt(1) to corresponding signed _BitInt by using
signed _BitInt(__builtin_popcountg ((__typeof (a)) -1))
and of arbitrary signed _BitInt to corresponding unsigned _BitInt
using unsigned _BitInt(__builtin_clrsbg ((__typeof (a)) -1) + 1).
2023-11-14 Jakub Jelinek <jakub@redhat.com>
PR c/111309
gcc/
* builtins.def (BUILT_IN_CLZG, BUILT_IN_CTZG, BUILT_IN_CLRSBG,
BUILT_IN_FFSG, BUILT_IN_PARITYG, BUILT_IN_POPCOUNTG): New
builtins.
* builtins.cc (fold_builtin_bit_query): New function.
(fold_builtin_1): Use it for
BUILT_IN_{CLZ,CTZ,CLRSB,FFS,PARITY,POPCOUNT}G.
(fold_builtin_2): Use it for BUILT_IN_{CLZ,CTZ}G.
* fold-const-call.cc: Fix comment typo on tm.h inclusion.
(fold_const_call_ss): Handle
CFN_BUILT_IN_{CLZ,CTZ,CLRSB,FFS,PARITY,POPCOUNT}G.
(fold_const_call_sss): New function.
(fold_const_call_1): Call it for 2 argument functions returning
scalar when passed 2 INTEGER_CSTs.
* genmatch.cc (cmp_operand): For function calls also compare
number of arguments.
(fns_cmp): New function.
(dt_node::gen_kids): Sort fns and generic_fns.
(dt_node::gen_kids_1): Handle fns with the same id but different
number of arguments.
* match.pd (CLZ simplifications): Drop checks for defined behavior
at zero. Add variant of simplifications for IFN_CLZ with 2 arguments.
(CTZ simplifications): Drop checks for defined behavior at zero,
don't optimize precisions above MAX_FIXED_MODE_SIZE. Add variant of
simplifications for IFN_CTZ with 2 arguments.
(a != 0 ? CLZ(a) : CST -> .CLZ(a)): Use TREE_TYPE (@3) instead of
type, add BITINT_TYPE handling, create 2 argument IFN_CLZ rather than
one argument. Add variant for matching CLZ with 2 arguments.
(a != 0 ? CTZ(a) : CST -> .CTZ(a)): Similarly.
* gimple-lower-bitint.cc (bitint_large_huge::lower_bit_query): New
method.
(bitint_large_huge::lower_call): Use it for IFN_{CLZ,CTZ,CLRSB,FFS}
and IFN_{PARITY,POPCOUNT} calls.
* gimple-range-op.cc (cfn_clz::fold_range): Don't check
CLZ_DEFINED_VALUE_AT_ZERO for m_gimple_call_internal_p, instead
assume defined value at zero if the call has 2 arguments and use
second argument value for that case.
(cfn_ctz::fold_range): Similarly.
(gimple_range_op_handler::maybe_builtin_call): Use op_cfn_clz_internal
or op_cfn_ctz_internal only if internal fn call has 2 arguments and
set m_op2 in that case.
* tree-vect-patterns.cc (vect_recog_ctz_ffs_pattern,
vect_recog_popcount_clz_ctz_ffs_pattern): For value defined at zero
use second argument of calls if present, otherwise assume UB at zero,
create 2 argument .CLZ/.CTZ calls if needed.
* tree-vect-stmts.cc (vectorizable_call): Handle 2 argument .CLZ/.CTZ
calls.
* tree-ssa-loop-niter.cc (build_cltz_expr): Create 2 argument
.CLZ/.CTZ calls if needed.
* tree-ssa-forwprop.cc (simplify_count_trailing_zeroes): Create 2
argument .CTZ calls if needed.
* tree-ssa-phiopt.cc (cond_removal_in_builtin_zero_pattern): Handle
2 argument .CLZ/.CTZ calls, handle BITINT_TYPE, create 2 argument
.CLZ/.CTZ calls.
* doc/extend.texi (__builtin_clzg, __builtin_ctzg, __builtin_clrsbg,
__builtin_ffsg, __builtin_parityg, __builtin_popcountg): Document.
gcc/c-family/
* c-common.cc (check_builtin_function_arguments): Handle
BUILT_IN_{CLZ,CTZ,CLRSB,FFS,PARITY,POPCOUNT}G.
* c-gimplify.cc (c_gimplify_expr): If __builtin_c[lt]zg second
argument hasn't been folded into constant yet, transform it to one
argument call inside of a COND_EXPR which for first argument 0
returns the second argument.
gcc/c/
* c-typeck.cc (convert_arguments): Don't promote first argument
of BUILT_IN_{CLZ,CTZ,CLRSB,FFS,PARITY,POPCOUNT}G.
gcc/cp/
* call.cc (magic_varargs_p): Return 4 for
BUILT_IN_{CLZ,CTZ,CLRSB,FFS,PARITY,POPCOUNT}G.
(build_over_call): Don't promote first argument of
BUILT_IN_{CLZ,CTZ,CLRSB,FFS,PARITY,POPCOUNT}G.
* cp-gimplify.cc (cp_gimplify_expr): For BUILT_IN_C{L,T}ZG use
c_gimplify_expr.
gcc/testsuite/
* c-c++-common/pr111309-1.c: New test.
* c-c++-common/pr111309-2.c: New test.
* gcc.dg/torture/bitint-43.c: New test.
* gcc.dg/torture/bitint-44.c: New test.
|
|
c_readstr only operated on integer modes. It worked by reading
the source string into an array of HOST_WIDE_INTs, converting
that array into a wide_int, and from there to an rtx.
It's simpler to do this by building a target memory image and
using native_decode_rtx to convert that memory image into an rtx.
It avoids all the endianness shenanigans because both the string and
native_decode_rtx follow target memory order. It also means that the
function can handle all fixed-size modes, which simplifies callers
and allows vector modes to be used more widely.
gcc/
* builtins.h (c_readstr): Take a fixed_size_mode rather than a
scalar_int_mode.
* builtins.cc (c_readstr): Likewise. Build a local array of
bytes and use native_decode_rtx to get the rtx image.
(builtin_memcpy_read_str): Simplify accordingly.
(builtin_strncpy_read_str): Likewise.
(builtin_memset_read_str): Likewise.
(builtin_memset_gen_str): Likewise.
* expr.cc (string_cst_read_str): Likewise.
|
|
Make __atomic_test_and_set consistent with other __atomic_ and __sync_
builtins: call a matching library function instead of emitting
non-atomic code when the target has no direct insn support.
There's special-case code handling targetm.atomic_test_and_set_trueval
!= 1 trying a modified maybe_emit_sync_lock_test_and_set. Previously,
if that worked but its matching emit_store_flag_force returned NULL,
we'd segfault later on. Now that the caller handles NULL, gcc_assert
here instead.
While the referenced PR:s are ARM-specific, the issue is general.
PR target/107567
PR target/109166
* builtins.cc (expand_builtin) <case BUILT_IN_ATOMIC_TEST_AND_SET>:
Handle failure from expand_builtin_atomic_test_and_set.
* optabs.cc (expand_atomic_test_and_set): When all attempts fail to
generate atomic code through target support, return NULL
instead of emitting non-atomic code. Also, for code handling
targetm.atomic_test_and_set_trueval != 1, gcc_assert result
from calling emit_store_flag_force instead of returning NULL.
|
|
As mentioned in my stdckdint.h mail, __builtin_classify_type has
a problem that argument promotion (the argument is passed to ...
prototyped builtin function) means that certain type classes will
simply never appear.
I think it is too late to change how it behaves, lots of code in the
wild might rely on the current behavior.
So, the following patch adds option to use a typename rather than
expression as the operand to the builtin, making it behave similarly
to sizeof, typeof or say the clang _Generic extension where the
first argument can be there not just expression, but also typename.
I think we have other prior art here, e.g. __builtin_va_arg also
expects typename.
I've added this to both C and C++, because it would be weird if it
supported it only in C and not in C++.
2023-09-20 Jakub Jelinek <jakub@redhat.com>
gcc/
* builtins.h (type_to_class): Declare.
* builtins.cc (type_to_class): No longer static. Return
int rather than enum.
* doc/extend.texi (__builtin_classify_type): Document.
gcc/c/
* c-parser.cc (c_parser_postfix_expression_after_primary): Parse
__builtin_classify_type call with typename as argument.
gcc/cp/
* parser.cc (cp_parser_postfix_expression): Parse
__builtin_classify_type call with typename as argument.
* pt.cc (tsubst_copy_and_build): Handle __builtin_classify_type
with dependent typename as argument.
gcc/testsuite/
* c-c++-common/builtin-classify-type-1.c: New test.
* g++.dg/ext/builtin-classify-type-1.C: New test.
* g++.dg/ext/builtin-classify-type-2.C: New test.
* gcc.dg/builtin-classify-type-1.c: New test.
|
|
The following patch introduces the middle-end part of the _BitInt
support, a new BITINT_TYPE, handling it where needed, except the lowering
pass and sanitizer support.
2023-09-06 Jakub Jelinek <jakub@redhat.com>
PR c/102989
* tree.def (BITINT_TYPE): New type.
* tree.h (TREE_CHECK6, TREE_NOT_CHECK6): Define.
(NUMERICAL_TYPE_CHECK, INTEGRAL_TYPE_P): Include
BITINT_TYPE.
(BITINT_TYPE_P): Define.
(CONSTRUCTOR_BITFIELD_P): Return true even for BLKmode bit-fields if
they have BITINT_TYPE type.
(tree_check6, tree_not_check6): New inline functions.
(any_integral_type_check): Include BITINT_TYPE.
(build_bitint_type): Declare.
* tree.cc (tree_code_size, wide_int_to_tree_1, cache_integer_cst,
build_zero_cst, type_hash_canon_hash, type_cache_hasher::equal,
type_hash_canon): Handle BITINT_TYPE.
(bitint_type_cache): New variable.
(build_bitint_type): New function.
(signed_or_unsigned_type_for, verify_type_variant, verify_type):
Handle BITINT_TYPE.
(tree_cc_finalize): Free bitint_type_cache.
* builtins.cc (type_to_class): Handle BITINT_TYPE.
(fold_builtin_unordered_cmp): Handle BITINT_TYPE like INTEGER_TYPE.
* cfgexpand.cc (expand_debug_expr): Punt on BLKmode BITINT_TYPE
INTEGER_CSTs.
* convert.cc (convert_to_pointer_1, convert_to_real_1,
convert_to_complex_1): Handle BITINT_TYPE like INTEGER_TYPE.
(convert_to_integer_1): Likewise. For BITINT_TYPE don't check
GET_MODE_PRECISION (TYPE_MODE (type)).
* doc/generic.texi (BITINT_TYPE): Document.
* doc/tm.texi.in (TARGET_C_BITINT_TYPE_INFO): New.
* doc/tm.texi: Regenerated.
* dwarf2out.cc (base_type_die, is_base_type, modified_type_die,
gen_type_die_with_usage): Handle BITINT_TYPE.
(rtl_for_decl_init): Punt on BLKmode BITINT_TYPE INTEGER_CSTs or
handle those which fit into shwi.
* expr.cc (expand_expr_real_1): Define EXTEND_BITINT macro, reduce
to bitfield precision reads from BITINT_TYPE vars, parameters or
memory locations. Expand large/huge BITINT_TYPE INTEGER_CSTs into
memory.
* fold-const.cc (fold_convert_loc, make_range_step): Handle
BITINT_TYPE.
(extract_muldiv_1): For BITINT_TYPE use TYPE_PRECISION rather than
GET_MODE_SIZE (SCALAR_INT_TYPE_MODE).
(native_encode_int, native_interpret_int, native_interpret_expr):
Handle BITINT_TYPE.
* gimple-expr.cc (useless_type_conversion_p): Make BITINT_TYPE
to some other integral type or vice versa conversions non-useless.
* gimple-fold.cc (gimple_fold_builtin_memset): Punt for BITINT_TYPE.
(clear_padding_unit): Mention in comment that _BitInt types don't need
to fit either.
(clear_padding_bitint_needs_padding_p): New function.
(clear_padding_type_may_have_padding_p): Handle BITINT_TYPE.
(clear_padding_type): Likewise.
* internal-fn.cc (expand_mul_overflow): For unsigned non-mode
precision operands force pos_neg? to 1.
(expand_MULBITINT, expand_DIVMODBITINT, expand_FLOATTOBITINT,
expand_BITINTTOFLOAT): New functions.
* internal-fn.def (MULBITINT, DIVMODBITINT, FLOATTOBITINT,
BITINTTOFLOAT): New internal functions.
* internal-fn.h (expand_MULBITINT, expand_DIVMODBITINT,
expand_FLOATTOBITINT, expand_BITINTTOFLOAT): Declare.
* match.pd (non-equality compare simplifications from fold_binary):
Punt if TYPE_MODE (arg1_type) is BLKmode.
* pretty-print.h (pp_wide_int): Handle printing of large precision
wide_ints which would buffer overflow digit_buffer.
* stor-layout.cc (finish_bitfield_representative): For bit-fields
with BITINT_TYPE, prefer representatives with precisions in
multiple of limb precision.
(layout_type): Handle BITINT_TYPE. Handle COMPLEX_TYPE with BLKmode
element type and assert it is BITINT_TYPE.
* target.def (bitint_type_info): New C target hook.
* target.h (struct bitint_info): New type.
* targhooks.cc (default_bitint_type_info): New function.
* targhooks.h (default_bitint_type_info): Declare.
* tree-pretty-print.cc (dump_generic_node): Handle BITINT_TYPE.
Handle printing large wide_ints which would buffer overflow
digit_buffer.
* tree-ssa-sccvn.cc: Include target.h.
(eliminate_dom_walker::eliminate_stmt): Punt for large/huge
BITINT_TYPE.
* tree-switch-conversion.cc (jump_table_cluster::emit): For more than
64-bit BITINT_TYPE subtract low bound from expression and cast to
64-bit integer type both the controlling expression and case labels.
* typeclass.h (enum type_class): Add bitint_type_class enumerator.
* varasm.cc (output_constant): Handle BITINT_TYPE INTEGER_CSTs.
* vr-values.cc (check_for_binary_op_overflow): Use widest2_int rather
than widest_int.
(simplify_using_ranges::simplify_internal_call_using_ranges): Use
unsigned_type_for rather than build_nonstandard_integer_type.
|
|
iseqsig() is a C2x library function, for signaling floating-point
equality checks. Provide a GCC-builtin for it, which is folded to
a series of comparisons.
2022-09-01 Francois-Xavier Coudert <fxcoudert@gcc.gnu.org>
PR middle-end/77928
gcc/
* doc/extend.texi: Document iseqsig builtin.
* builtins.cc (fold_builtin_iseqsig): New function.
(fold_builtin_2): Handle BUILT_IN_ISEQSIG.
(is_inexpensive_builtin): Handle BUILT_IN_ISEQSIG.
* builtins.def (BUILT_IN_ISEQSIG): New built-in.
gcc/c-family/
* c-common.cc (check_builtin_function_arguments):
Handle BUILT_IN_ISEQSIG.
gcc/testsuite/
* gcc.dg/torture/builtin-iseqsig-1.c: New test.
* gcc.dg/torture/builtin-iseqsig-2.c: New test.
* gcc.dg/torture/builtin-iseqsig-3.c: New test.
|
|
While the design of these builtins in clang is questionable,
rather than being say
unsigned __builtin_addc (unsigned, unsigned, bool, bool *)
so that it is clear they add two [0, 0xffffffff] range numbers
plus one [0, 1] range carry in and give [0, 0xffffffff] range
return plus [0, 1] range carry out, they actually instead
add 3 [0, 0xffffffff] values together but the carry out
isn't then the expected [0, 2] value because
0xffffffffULL + 0xffffffff + 0xffffffff is 0x2fffffffd,
but just [0, 1] whether there was any overflow at all.
It is something used in the wild and shorter to write than the
corresponding
#define __builtin_addc(a,b,carry_in,carry_out) \
({ unsigned _s; \
unsigned _c1 = __builtin_uadd_overflow (a, b, &_s); \
unsigned _c2 = __builtin_uadd_overflow (_s, carry_in, &_s); \
*(carry_out) = (_c1 | _c2); \
_s; })
and so a canned builtin for something people could often use.
It isn't that hard to maintain on the GCC side, as we just lower
it to two .ADD_OVERFLOW calls early, and the already committed
pottern recognization code can then make .UADDC/.USUBC calls out of
that if the carry in is in [0, 1] range and the corresponding
optab is supported by the target.
2023-06-16 Jakub Jelinek <jakub@redhat.com>
PR middle-end/79173
* builtin-types.def (BT_FN_UINT_UINT_UINT_UINT_UINTPTR,
BT_FN_ULONG_ULONG_ULONG_ULONG_ULONGPTR,
BT_FN_ULONGLONG_ULONGLONG_ULONGLONG_ULONGLONG_ULONGLONGPTR): New
types.
* builtins.def (BUILT_IN_ADDC, BUILT_IN_ADDCL, BUILT_IN_ADDCLL,
BUILT_IN_SUBC, BUILT_IN_SUBCL, BUILT_IN_SUBCLL): New builtins.
* builtins.cc (fold_builtin_addc_subc): New function.
(fold_builtin_varargs): Handle BUILT_IN_{ADD,SUB}C{,L,LL}.
* doc/extend.texi (__builtin_addc, __builtin_subc): Document.
* gcc.target/i386/pr79173-11.c: New test.
* gcc.dg/builtin-addc-1.c: New test.
|
|
gcc/ChangeLog:
* alias.cc (ref_all_alias_ptr_type_p): Use _P() defines from tree.h.
* attribs.cc (diag_attr_exclusions): Ditto.
(decl_attributes): Ditto.
(build_type_attribute_qual_variant): Ditto.
* builtins.cc (fold_builtin_carg): Ditto.
(fold_builtin_next_arg): Ditto.
(do_mpc_arg2): Ditto.
* cfgexpand.cc (expand_return): Ditto.
* cgraph.h (decl_in_symtab_p): Ditto.
(symtab_node::get_create): Ditto.
* dwarf2out.cc (base_type_die): Ditto.
(implicit_ptr_descriptor): Ditto.
(gen_array_type_die): Ditto.
(gen_type_die_with_usage): Ditto.
(optimize_location_into_implicit_ptr): Ditto.
* expr.cc (do_store_flag): Ditto.
* fold-const.cc (negate_expr_p): Ditto.
(fold_negate_expr_1): Ditto.
(fold_convert_const): Ditto.
(fold_convert_loc): Ditto.
(constant_boolean_node): Ditto.
(fold_binary_op_with_conditional_arg): Ditto.
(build_fold_addr_expr_with_type_loc): Ditto.
(fold_comparison): Ditto.
(fold_checksum_tree): Ditto.
(tree_unary_nonnegative_warnv_p): Ditto.
(integer_valued_real_unary_p): Ditto.
(fold_read_from_constant_string): Ditto.
* gcc-rich-location.cc (maybe_range_label_for_tree_type_mismatch::get_text): Ditto.
* gimple-expr.cc (useless_type_conversion_p): Ditto.
(is_gimple_reg): Ditto.
(is_gimple_asm_val): Ditto.
(mark_addressable): Ditto.
* gimple-expr.h (is_gimple_variable): Ditto.
(virtual_operand_p): Ditto.
* gimple-ssa-warn-access.cc (pass_waccess::check_dangling_stores): Ditto.
* gimplify.cc (gimplify_bind_expr): Ditto.
(gimplify_return_expr): Ditto.
(gimple_add_padding_init_for_auto_var): Ditto.
(gimplify_addr_expr): Ditto.
(omp_add_variable): Ditto.
(omp_notice_variable): Ditto.
(omp_get_base_pointer): Ditto.
(omp_strip_components_and_deref): Ditto.
(omp_strip_indirections): Ditto.
(omp_accumulate_sibling_list): Ditto.
(omp_build_struct_sibling_lists): Ditto.
(gimplify_adjust_omp_clauses_1): Ditto.
(gimplify_adjust_omp_clauses): Ditto.
(gimplify_omp_for): Ditto.
(goa_lhs_expr_p): Ditto.
(gimplify_one_sizepos): Ditto.
* graphite-scop-detection.cc (scop_detection::graphite_can_represent_scev): Ditto.
* ipa-devirt.cc (odr_types_equivalent_p): Ditto.
* ipa-prop.cc (ipa_set_jf_constant): Ditto.
(propagate_controlled_uses): Ditto.
* ipa-sra.cc (type_prevails_p): Ditto.
(scan_expr_access): Ditto.
* optabs-tree.cc (optab_for_tree_code): Ditto.
* toplev.cc (wrapup_global_declaration_1): Ditto.
* trans-mem.cc (transaction_invariant_address_p): Ditto.
* tree-cfg.cc (verify_types_in_gimple_reference): Ditto.
(verify_gimple_comparison): Ditto.
(verify_gimple_assign_binary): Ditto.
(verify_gimple_assign_single): Ditto.
* tree-complex.cc (get_component_ssa_name): Ditto.
* tree-emutls.cc (lower_emutls_2): Ditto.
* tree-inline.cc (copy_tree_body_r): Ditto.
(estimate_move_cost): Ditto.
(copy_decl_for_dup_finish): Ditto.
* tree-nested.cc (convert_nonlocal_omp_clauses): Ditto.
(note_nonlocal_vla_type): Ditto.
(convert_local_omp_clauses): Ditto.
(remap_vla_decls): Ditto.
(fixup_vla_decls): Ditto.
* tree-parloops.cc (loop_has_vector_phi_nodes): Ditto.
* tree-pretty-print.cc (print_declaration): Ditto.
(print_call_name): Ditto.
* tree-sra.cc (compare_access_positions): Ditto.
* tree-ssa-alias.cc (compare_type_sizes): Ditto.
* tree-ssa-ccp.cc (get_default_value): Ditto.
* tree-ssa-coalesce.cc (populate_coalesce_list_for_outofssa): Ditto.
* tree-ssa-dom.cc (reduce_vector_comparison_to_scalar_comparison): Ditto.
* tree-ssa-forwprop.cc (can_propagate_from): Ditto.
* tree-ssa-propagate.cc (may_propagate_copy): Ditto.
* tree-ssa-sccvn.cc (fully_constant_vn_reference_p): Ditto.
* tree-ssa-sink.cc (statement_sink_location): Ditto.
* tree-ssa-structalias.cc (type_must_have_pointers): Ditto.
* tree-ssa-ter.cc (find_replaceable_in_bb): Ditto.
* tree-ssa-uninit.cc (warn_uninit): Ditto.
* tree-ssa.cc (maybe_rewrite_mem_ref_base): Ditto.
(non_rewritable_mem_ref_base): Ditto.
* tree-streamer-in.cc (lto_input_ts_type_non_common_tree_pointers): Ditto.
* tree-streamer-out.cc (write_ts_type_non_common_tree_pointers): Ditto.
* tree-vect-generic.cc (do_binop): Ditto.
(do_cond): Ditto.
* tree-vect-stmts.cc (vect_init_vector): Ditto.
* tree-vector-builder.h (tree_vector_builder::note_representative): Ditto.
* tree.cc (sign_mask_for): Ditto.
(verify_type_variant): Ditto.
(gimple_canonical_types_compatible_p): Ditto.
(verify_type): Ditto.
* ubsan.cc (get_ubsan_type_info_for_type): Ditto.
* var-tracking.cc (prepare_call_arguments): Ditto.
(vt_add_function_parameters): Ditto.
* varasm.cc (decode_addr_const): Ditto.
|
|
I've noticed 4 typos in comments, fixed thusly.
2023-05-05 Jakub Jelinek <jakub@redhat.com>
* builtins.cc (do_mpfr_ckconv, do_mpc_ckconv): Fix comment typo,
mpft_t -> mpfr_t.
* fold-const-call.cc (do_mpfr_ckconv, do_mpc_ckconv): Likewise.
|
|
The following generalizes the range-op for __builtin_expect
by using the fnspec machinery.
PR tree-optimization/109170
* gimple-range-op.cc (gimple_range_op_handler::maybe_builtin_call):
Handle __builtin_expect and similar via cfn_pass_through_arg1
and inspecting the calls fnspec.
* builtins.cc (builtin_fnspec): Handle BUILT_IN_EXPECT
and BUILT_IN_EXPECT_WITH_PROBABILITY.
|
|
gcc/ChangeLog:
* builtins.cc (expand_builtin_strnlen): Rewrite deprecated irange
API uses to new API.
* gimple-predicate-analysis.cc (find_var_cmp_const): Same.
* internal-fn.cc (get_min_precision): Same.
* match.pd: Same.
* tree-affine.cc (expr_to_aff_combination): Same.
* tree-data-ref.cc (dr_step_indicator): Same.
* tree-dfa.cc (get_ref_base_and_extent): Same.
* tree-scalar-evolution.cc (iv_can_overflow_p): Same.
* tree-ssa-phiopt.cc (two_value_replacement): Same.
* tree-ssa-pre.cc (insert_into_preds_of_block): Same.
* tree-ssa-reassoc.cc (optimize_range_tests_to_bit_test): Same.
* tree-ssa-strlen.cc (compare_nonzero_chars): Same.
* tree-switch-conversion.cc (bit_test_cluster::emit): Same.
* tree-vect-patterns.cc (vect_recog_divmod_pattern): Same.
* tree.cc (get_range_pos_neg): Same.
|
|
This patch converts the users of the legacy API to a function called
get_legacy_range() which will return the pieces of the soon to be
removed API (min, max, and kind). This is a temporary measure while
these users are converted.
In upcoming patches I will convert most users, but most of the
middle-end warning uses will remain. Naive attempts to remove them
showed that a lot of these uses are quite dependant on the anti-range
idiom, and converting them to the new API broke the tests, even when
the conversion was conceptually correct. Perhaps someone who
understands these passes could take a stab at it. In the meantime,
the legacy uses can be trivially found by grepping for
get_legacy_range.
gcc/ChangeLog:
* builtins.cc (determine_block_size): Convert use of legacy API to
get_legacy_range.
* gimple-array-bounds.cc (check_out_of_bounds_and_warn): Same.
(array_bounds_checker::check_array_ref): Same.
* gimple-ssa-warn-restrict.cc
(builtin_memref::extend_offset_range): Same.
* ipa-cp.cc (ipcp_store_vr_results): Same.
* ipa-fnsummary.cc (set_switch_stmt_execution_predicate): Same.
* ipa-prop.cc (struct ipa_vr_ggc_hash_traits): Same.
(ipa_write_jump_function): Same.
* pointer-query.cc (get_size_range): Same.
* tree-data-ref.cc (split_constant_offset): Same.
* tree-ssa-strlen.cc (get_range): Same.
(maybe_diag_stxncpy_trunc): Same.
(strlen_pass::get_len_or_size): Same.
(strlen_pass::count_nonzero_bytes_addr): Same.
* tree-vect-patterns.cc (vect_get_range_info): Same.
* value-range.cc (irange::maybe_anti_range): Remove.
(get_legacy_range): New.
(irange::copy_to_legacy): Use get_legacy_range.
(ranges_from_anti_range): Same.
* value-range.h (class irange): Remove maybe_anti_range.
(get_legacy_range): New.
* vr-values.cc (check_for_binary_op_overflow): Convert use of
legacy API to get_legacy_range.
(compare_ranges): Same.
(compare_range_with_value): Same.
(bounds_of_var_in_loop): Same.
(find_case_label_ranges): Same.
(simplify_using_ranges::simplify_switch_using_ranges): Same.
|
|
On Wed, Feb 22, 2023 at 09:52:06AM +0000, Richard Biener wrote:
> > The following testcase ICEs because we still have some spots that
> > treat BUILT_IN_UNREACHABLE specially but not BUILT_IN_UNREACHABLE_TRAP
> > the same.
This patch uses (fndecl_built_in_p (node, BUILT_IN_UNREACHABLE)
|| fndecl_built_in_p (node, BUILT_IN_UNREACHABLE_TRAP))
a lot and from grepping around, we do something like that in lots of
other places, or in some spots instead as
(fndecl_built_in_p (node, BUILT_IN_NORMAL)
&& (DECL_FUNCTION_CODE (node) == BUILT_IN_WHATEVER1
|| DECL_FUNCTION_CODE (node) == BUILT_IN_WHATEVER2))
The following patch adds an overload for this case, so we can write
it in a shorter way, using C++11 argument packs so that it supports
as many codes as one needs.
2023-04-20 Jakub Jelinek <jakub@redhat.com>
Jonathan Wakely <jwakely@redhat.com>
* tree.h (built_in_function_equal_p): New helper function.
(fndecl_built_in_p): Turn into variadic template to support
1 or more built_in_function arguments.
* builtins.cc (fold_builtin_expect): Use 3 argument fndecl_built_in_p.
* gimplify.cc (goa_stabilize_expr): Likewise.
* cgraphclones.cc (cgraph_node::create_clone): Likewise.
* ipa-fnsummary.cc (compute_fn_summary): Likewise.
* omp-low.cc (setjmp_or_longjmp_p): Likewise.
* cgraph.cc (cgraph_edge::redirect_call_stmt_to_callee,
cgraph_update_edges_for_call_stmt_node,
cgraph_edge::verify_corresponds_to_fndecl,
cgraph_node::verify_node): Likewise.
* tree-stdarg.cc (optimize_va_list_gpr_fpr_size): Likewise.
* gimple-ssa-warn-access.cc (matching_alloc_calls_p): Likewise.
* ipa-prop.cc (try_make_edge_direct_virtual_call): Likewise.
|
|
The following picks up the coccinelle generated patch from Bernhard,
leaving out the fortran frontend parts and fixing up the rest.
In particular both gmp.h and mpfr.h contain macros like
#define mpfr_inf_p(_x) ((_x)->_mpfr_exp == __MPFR_EXP_INF)
for which I add operator-> overloads to the auto_* classes.
* system.h (auto_mpz::operator->()): New.
* realmpfr.h (auto_mpfr::operator->()): New.
* builtins.cc (do_mpfr_lgamma_r): Use auto_mpfr.
* real.cc (real_from_string): Likewise.
(dconst_e_ptr): Likewise.
(dconst_sqrt2_ptr): Likewise.
* tree-ssa-loop-niter.cc (refine_value_range_using_guard):
Use auto_mpz.
(bound_difference_of_offsetted_base): Likewise.
(number_of_iterations_ne): Likewise.
(number_of_iterations_lt_to_ne): Likewise.
* ubsan.cc: Include realmpfr.h.
(ubsan_instrument_float_cast): Use auto_mpfr.
|
|
The following testcase is miscompiled on aarch64-linux in the regname pass,
because while the function takes arguments in the p0 register,
FUNCTION_ARG_REGNO_P doesn't reflect that, so DF doesn't know the register is
used in register passing. It sees 2 chains with p1 register and wants to
replace the second one and as DF doesn't know p0 is live at the start of the
function, it will happily use p0 register even when it is used in subsequent
instructions.
The following patch fixes that. FUNCTION_ARG_REGNO_P returns non-zero
for p0-p3 (unconditionally, seems for the floating/vector registers it
doesn't conditionalize them on TARGET_FLOAT either, but if you want,
I can conditionalize p0-p3 on TARGET_SVE), similarly
targetm.calls.function_value_regno_p returns true for p0-p3 registers
if TARGET_SVE (again for consistency, that function conditionalizes
the float/vector on TARGET_FLOAT).
Now, that change broke bootstrap in libobjc and some
__builtin_apply_args/__builtin_apply/__builtin_return tests. The
aarch64_get_reg_raw_mode hook already documents that SVE scalable arg/return
passing is fundamentally incompatible with those builtins, but unlike
the floating/vector regs where it forces a fixed vector mode, I think
there is no fixed mode which could be used for p0-p3. So, I have tweaked
the generic code so that it uses VOIDmode return from that hook to signal
that a register shouldn't be touched by
__builtin_apply_args/__builtin_apply/__builtin_return
despite being mentioned in FUNCTION_ARG_REGNO_P or
targetm.calls.function_value_regno_p.
gcc/
2023-04-01 Jakub Jelinek <jakub@redhat.com>
PR target/109254
* builtins.cc (apply_args_size): If targetm.calls.get_raw_arg_mode
returns VOIDmode, handle it like if the register isn't used for
passing arguments at all.
(apply_result_size): If targetm.calls.get_raw_result_mode returns
VOIDmode, handle it like if the register isn't used for returning
results at all.
* target.def (get_raw_result_mode, get_raw_arg_mode): Document what it
means to return VOIDmode.
* doc/tm.texi: Regenerated.
* config/aarch64/aarch64.cc (aarch64_function_value_regno_p): Return
TARGET_SVE for P0_REGNUM.
(aarch64_function_arg_regno_p): Also return true for p0-p3.
(aarch64_get_reg_raw_mode): Return VOIDmode for PR_REGNUM_P regs.
gcc/testsuite/
2023-04-01 Jakub Jelinek <jakub@redhat.com>
Richard Sandiford <richard.sandiford@arm.com>
PR target/109254
* gcc.target/aarch64/sve/pr109254.c: New test.
|
|
The PR109086 r13-6690 inline_string_cmp change to
if (diff != result)
emit_move_insn (result, diff);
regressed
FAIL: go.test/test/fixedbugs/bug207.go, -O2 -g (internal compiler error: in emit_move_insn, at expr.cc:4224)
The problem is the Go FE doesn't mark __builtin_memcmp as pure (I'll also
send patch for that) and so result is const0_rtx when the call lost its lhs
and the above move ICEs because moving something into const0_rtx is obviously
invalid.
I think it is better not to rely on all FEs having these *cmp functions
pure anD DCE being performed. The following patch just punts from the
inline expansion in that case, so we just emit normal library call.
2023-03-24 Jakub Jelinek <jakub@redhat.com>
PR middle-end/109258
* builtins.cc (inline_expand_builtin_bytecmp): Return NULL_RTX early
if target == const0_rtx.
|
|
The target hook is only used by i386, and the current definition is
same as default gen_reg_rtx.
gcc/ChangeLog:
* builtins.cc (builtin_memset_read_str): Replace
targetm.gen_memset_scratch_rtx with gen_reg_rtx.
(builtin_memset_gen_str): Ditto.
* config/i386/i386-expand.cc
(ix86_convert_const_wide_int_to_broadcast): Replace
ix86_gen_scratch_sse_rtx with gen_reg_rtx.
(ix86_expand_vector_move): Ditto.
* config/i386/i386-protos.h (ix86_gen_scratch_sse_rtx):
Removed.
* config/i386/i386.cc (ix86_gen_scratch_sse_rtx): Removed.
(TARGET_GEN_MEMSET_SCRATCH_RTX): Removed.
* doc/tm.texi: Remove TARGET_GEN_MEMSET_SCRATCH_RTX.
* doc/tm.texi.in: Ditto.
* target.def: Ditto.
|
|
result [PR109086]
expand_simple_binop() is allowed to allocate a new pseudo-register and
return it, instead of forcing the result into the provided
pseudo-register. This can cause a problem when we expand the unrolled
loop for __builtin_strcmp: the compiler always generates code for all n
iterations of the loop, so "result" will be an alias of the
pseudo-register allocated and used in the last iteration; but at runtime
the loop can break early, causing this pseudo-register uninitialized.
Emit a move instruction in the iteration to force the difference into
one register which has been allocated before the loop, to avoid this
issue.
gcc/ChangeLog:
PR other/109086
* builtins.cc (inline_string_cmp): Force the character
difference into "result" pseudo-register, instead of reassign
the pseudo-register.
|
|
Calls to vectorized versions of routines in the math library will now
be inserted when vectorizing code containing supported math functions.
2023-03-02 Kwok Cheung Yeung <kcy@codesourcery.com>
Paul-Antoine Arras <pa@codesourcery.com>
gcc/
* builtins.cc (mathfn_built_in_explicit): New.
* config/gcn/gcn.cc: Include case-cfn-macros.h.
(mathfn_built_in_explicit): Add prototype.
(gcn_vectorize_builtin_vectorized_function): New.
(gcn_libc_has_function): New.
(TARGET_LIBC_HAS_FUNCTION): Define.
(TARGET_VECTORIZE_BUILTIN_VECTORIZED_FUNCTION): Define.
gcc/testsuite/
* gcc.target/gcn/simd-math-1.c: New testcase.
* gcc.target/gcn/simd-math-2.c: New testcase.
libgomp/
* testsuite/libgomp.c/simd-math-1.c: New testcase.
|
|
While in the -fsanitize=address case libasan overloads memcpy, memset,
memmove and many other builtins, such that they are always instrumented,
Linux kernel for -fsanitize=kernel-address recently changed or is changing,
such that memcpy, memset and memmove actually aren't instrumented because
they are often used also from no_sanitize ("kernel-address") functions
and wants __{,hw,}asaN_{memcpy,memset,memmove} to be used instead
for the instrumented calls. See e.g. the https://lkml.org/lkml/2023/2/9/1182
thread. Without appropriate support on the compiler side, that will mean
any time a kernel-address instrumented function (most of them) calls
memcpy/memset/memmove, they will not be instrumented and thus won't catch
kernel bugs. Apparently clang 15 has a param for this.
The following patch implements the same (except it is a usual GCC --param,
not -mllvm argument) on the GCC side. I know this isn't a regression
bugfix, but given that -fsanitize=kernel-address has a single project that
uses it which badly wants this I think it would be worthwhile to make an
exception and get this into GCC 13 rather than waiting another year, it
won't affect non-kernel code, nor even the kernel unless the new parameter
is used.
2023-02-14 Jakub Jelinek <jakub@redhat.com>
PR sanitizer/108777
* params.opt (-param=asan-kernel-mem-intrinsic-prefix=): New param.
* asan.h (asan_memfn_rtl): Declare.
* asan.cc (asan_memfn_rtls): New variable.
(asan_memfn_rtl): New function.
* builtins.cc (expand_builtin): If
param_asan_kernel_mem_intrinsic_prefix and function is
kernel-{,hw}address sanitized, emit calls to
__{,hw}asan_{memcpy,memmove,memset} rather than
{memcpy,memmove,memset}. Use sanitize_flags_p (SANITIZE_ADDRESS)
instead of flag_sanitize & SANITIZE_ADDRESS to check if
asan_intercepted_p functions shouldn't be expanded inline.
* gcc.dg/asan/pr108777-1.c: New test.
* gcc.dg/asan/pr108777-2.c: New test.
* gcc.dg/asan/pr108777-3.c: New test.
* gcc.dg/asan/pr108777-4.c: New test.
* gcc.dg/asan/pr108777-5.c: New test.
* gcc.dg/asan/pr108777-6.c: New test.
* gcc.dg/completion-3.c: Adjust expected multiline output.
|
|
For PR106099 I've added IFN_TRAP as an alternative to __builtin_trap
meant for __builtin_unreachable purposes (e.g. with -funreachable-traps
or some sanitizers) which doesn't need vops because __builtin_unreachable
doesn't need them either. This works in various cases, but unfortunately
IPA likes to decide on the redirection to unreachable just by tweaking
the cgraph edge to point to a different FUNCTION_DECL. As internal
functions don't have a decl, this causes problems like in the following
testcase.
The following patch fixes it by removing IFN_TRAP again and replacing
it with user inaccessible BUILT_IN_UNREACHABLE_TRAP, so that e.g.
builtin_decl_unreachable can return it directly and we don't need to tweak
it later in wherever we actually replace the call stmt.
2023-02-02 Jakub Jelinek <jakub@redhat.com>
PR ipa/107300
* builtins.def (BUILT_IN_UNREACHABLE_TRAP): New builtin.
* internal-fn.def (TRAP): Remove.
* internal-fn.cc (expand_TRAP): Remove.
* tree.cc (build_common_builtin_nodes): Define
BUILT_IN_UNREACHABLE_TRAP if not yet defined.
(builtin_decl_unreachable): Use BUILT_IN_UNREACHABLE_TRAP
instead of BUILT_IN_TRAP.
* gimple.cc (gimple_build_builtin_unreachable): Remove
emitting internal function for BUILT_IN_TRAP.
* asan.cc (maybe_instrument_call): Handle BUILT_IN_UNREACHABLE_TRAP.
* cgraph.cc (cgraph_edge::verify_corresponds_to_fndecl): Handle
BUILT_IN_UNREACHABLE_TRAP instead of BUILT_IN_TRAP.
* ipa-devirt.cc (possible_polymorphic_call_target_p): Handle
BUILT_IN_UNREACHABLE_TRAP.
* builtins.cc (expand_builtin, is_inexpensive_builtin): Likewise.
* tree-cfg.cc (verify_gimple_call,
pass_warn_function_return::execute): Likewise.
* attribs.cc (decl_attributes): Don't report exclusions on
BUILT_IN_UNREACHABLE_TRAP either.
* gcc.dg/pr107300.c: New test.
|
|
|
|
According to the documentation, the -Werror= option makes the specified
warning into an error and also automatically implies that option. Then
it seems that the behavior of the compiler when specifying
-Werror=array-bounds=X should be the same as specifying
"-Werror=array-bounds -Warray-bounds=X", so we expect to receive
array-bounds pass diagnostics and they must be processed as errors.
In practice, we observe that the array-bounds pass is indeed invoked,
but its diagnostics are processed as warnings, not errors.
This happens because Warray-bounds and Warray-bounds= are
declared as two different options in common.opt, so when
diagnostic_classify_diagnostic is called, DK_ERROR is set for
the Warray-bounds= option, but diagnostic_report_diagnostic called from
warning_at receives opt_index of Warray-bounds, so information about
DK_ERROR is lost. Fix this by using Alias in declaration of
Warray-bounds (similar to Wattribute-alias).
Co-authored-by: Franz Sirl <Franz.Sirl-kernel@lauterbach.com>
gcc/ChangeLog:
PR driver/107787
* common.opt (Warray-bounds): Turn into alias of
-Warray-bounds=1.
* builtins.cc (c_strlen): Use OPT_Warray_bounds_
instead of OPT_Warray_bounds.
* diagnostic-spec.cc (nowarn_spec_t::nowarn_spec_t): Ditto.
* gimple-array-bounds.cc (array_bounds_checker::check_array_ref,
array_bounds_checker::check_mem_ref,
array_bounds_checker::check_addr_expr,
array_bounds_checker::check_array_bounds): Ditto.
* gimple-ssa-warn-restrict.cc (maybe_diag_access_bounds): Ditto.
gcc/c-family/ChangeLog:
PR driver/107787
* c-common.cc (fold_offsetof,
convert_vector_to_array_for_subscript): Use OPT_Warray_bounds_
instead of OPT_Warray_bounds.
gcc/testsuite/ChangeLog:
PR driver/107787
* gcc.dg/Warray-bounds-34.c: Correct the regular expression
for -Warray-bounds=.
* gcc.dg/Warray-bounds-43.c: Likewise.
* gcc.dg/pr107787.c: New test.
|
|
trunk bootstrap recently broke on Solaris like this:
/vol/gcc/src/hg/master/local/gcc/builtins.cc:2104:8: error: pasting
"CFN_BUILT_IN_" and "(" does not give a valid preprocessing token
2104 | case CFN_BUILT_IN_##MATHFN: \
| ^~~~~~~~~~~~~
/vol/gcc/src/hg/master/local/gcc/builtins.cc:2112:3: note: in expansion of
macro 'CASE_MATHFN'
2112 | CASE_MATHFN(MATHFN) \
| ^~~~~~~~~~~
/vol/gcc/src/hg/master/local/gcc/builtins.cc:1967:5: note: in expansion of macro 'CASE_MATHFN_FLOATN'
1967 | CASE_MATHFN_FLOATN (HUGE_VAL) \
and similarly for NAN.
It turns out this happens because <math.h> is included at some point,
which (in <iso/math_c99.h>) defines
While this only happpens on Solaris right now, the same issue would be
present on other targets when <math.h> gets included somehow.
To avoid this, this patch #undef's both macros.
Bootstrapped without regressions on i386-pc-solaris2.11 and
sparc-sun-solaris2.11.
2022-11-01 Rainer Orth <ro@CeBiTec.Uni-Bielefeld.DE>
gcc:
* builtins.cc (mathfn_built_in_2): #undef HUGE_VAL, NAN.
|
|
The following patch adds some complex builtins which have libm
implementation in glibc 2.26 and later on various arches.
It is needed for libstdc++ _Float128 support when long double is not
IEEE quad.
2022-10-31 Jakub Jelinek <jakub@redhat.com>
* builtin-types.def (BT_COMPLEX_FLOAT16, BT_COMPLEX_FLOAT32,
BT_COMPLEX_FLOAT64, BT_COMPLEX_FLOAT128, BT_COMPLEX_FLOAT32X,
BT_COMPLEX_FLOAT64X, BT_COMPLEX_FLOAT128X,
BT_FN_COMPLEX_FLOAT16_COMPLEX_FLOAT16,
BT_FN_COMPLEX_FLOAT32_COMPLEX_FLOAT32,
BT_FN_COMPLEX_FLOAT64_COMPLEX_FLOAT64,
BT_FN_COMPLEX_FLOAT128_COMPLEX_FLOAT128,
BT_FN_COMPLEX_FLOAT32X_COMPLEX_FLOAT32X,
BT_FN_COMPLEX_FLOAT64X_COMPLEX_FLOAT64X,
BT_FN_COMPLEX_FLOAT128X_COMPLEX_FLOAT128X,
BT_FN_FLOAT16_COMPLEX_FLOAT16, BT_FN_FLOAT32_COMPLEX_FLOAT32,
BT_FN_FLOAT64_COMPLEX_FLOAT64, BT_FN_FLOAT128_COMPLEX_FLOAT128,
BT_FN_FLOAT32X_COMPLEX_FLOAT32X, BT_FN_FLOAT64X_COMPLEX_FLOAT64X,
BT_FN_FLOAT128X_COMPLEX_FLOAT128X,
BT_FN_COMPLEX_FLOAT16_COMPLEX_FLOAT16_COMPLEX_FLOAT16,
BT_FN_COMPLEX_FLOAT32_COMPLEX_FLOAT32_COMPLEX_FLOAT32,
BT_FN_COMPLEX_FLOAT64_COMPLEX_FLOAT64_COMPLEX_FLOAT64,
BT_FN_COMPLEX_FLOAT128_COMPLEX_FLOAT128_COMPLEX_FLOAT128,
BT_FN_COMPLEX_FLOAT32X_COMPLEX_FLOAT32X_COMPLEX_FLOAT32X,
BT_FN_COMPLEX_FLOAT64X_COMPLEX_FLOAT64X_COMPLEX_FLOAT64X,
BT_FN_COMPLEX_FLOAT128X_COMPLEX_FLOAT128X_COMPLEX_FLOAT128X): New.
* builtins.def (CABS_TYPE, CACOSH_TYPE, CARG_TYPE, CASINH_TYPE,
CPOW_TYPE, CPROJ_TYPE): Define and undefine later.
(BUILT_IN_CABS, BUILT_IN_CACOSH, BUILT_IN_CACOS, BUILT_IN_CARG,
BUILT_IN_CASINH, BUILT_IN_CASIN, BUILT_IN_CATANH, BUILT_IN_CATAN,
BUILT_IN_CCOSH, BUILT_IN_CCOS, BUILT_IN_CEXP, BUILT_IN_CLOG,
BUILT_IN_CPOW, BUILT_IN_CPROJ, BUILT_IN_CSINH, BUILT_IN_CSIN,
BUILT_IN_CSQRT, BUILT_IN_CTANH, BUILT_IN_CTAN): Add
DEF_EXT_LIB_FLOATN_NX_BUILTINS.
* fold-const-call.cc (fold_const_call_sc, fold_const_call_cc,
fold_const_call_ccc): Add various CASE_CFN_*_FN: cases when
CASE_CFN_* is present.
* gimple-ssa-backprop.cc (backprop::process_builtin_call_use):
Likewise.
* builtins.cc (expand_builtin, fold_builtin_1): Likewise.
* fold-const.cc (negate_mathfn_p, tree_expr_finite_p,
tree_expr_maybe_signaling_nan_p, tree_expr_maybe_nan_p,
tree_expr_maybe_real_minus_zero_p, tree_call_nonnegative_warnv_p):
Likewise.
|
|
When working on libstdc++ extended float support in <cmath>, I found that
we need various builtins for the _Float{16,32,64,128,32x,64x,128x} types.
Glibc 2.26 and later provides the underlying libm routines (except for
_Float16 and _Float128x for the time being) and in libstdc++ I think we
need at least the _Float128 builtins on x86_64, i?86, powerpc64le and ia64
(when long double is IEEE quad, we can handle it by using __builtin_*l
instead), because without the builtins the overloads couldn't be constexpr
(say when it would declare the *f128 extern "C" routines itself and call
them).
The testcase covers just types of those builtins and their constant
folding, so doesn't need actual libm support.
2022-10-31 Jakub Jelinek <jakub@redhat.com>
* builtin-types.def (BT_FLOAT16_PTR, BT_FLOAT32_PTR, BT_FLOAT64_PTR,
BT_FLOAT128_PTR, BT_FLOAT32X_PTR, BT_FLOAT64X_PTR, BT_FLOAT128X_PTR):
New DEF_PRIMITIVE_TYPE.
(BT_FN_INT_FLOAT16, BT_FN_INT_FLOAT32, BT_FN_INT_FLOAT64,
BT_FN_INT_FLOAT128, BT_FN_INT_FLOAT32X, BT_FN_INT_FLOAT64X,
BT_FN_INT_FLOAT128X, BT_FN_LONG_FLOAT16, BT_FN_LONG_FLOAT32,
BT_FN_LONG_FLOAT64, BT_FN_LONG_FLOAT128, BT_FN_LONG_FLOAT32X,
BT_FN_LONG_FLOAT64X, BT_FN_LONG_FLOAT128X, BT_FN_LONGLONG_FLOAT16,
BT_FN_LONGLONG_FLOAT32, BT_FN_LONGLONG_FLOAT64,
BT_FN_LONGLONG_FLOAT128, BT_FN_LONGLONG_FLOAT32X,
BT_FN_LONGLONG_FLOAT64X, BT_FN_LONGLONG_FLOAT128X): New
DEF_FUNCTION_TYPE_1.
(BT_FN_FLOAT16_FLOAT16_FLOAT16PTR, BT_FN_FLOAT32_FLOAT32_FLOAT32PTR,
BT_FN_FLOAT64_FLOAT64_FLOAT64PTR, BT_FN_FLOAT128_FLOAT128_FLOAT128PTR,
BT_FN_FLOAT32X_FLOAT32X_FLOAT32XPTR,
BT_FN_FLOAT64X_FLOAT64X_FLOAT64XPTR,
BT_FN_FLOAT128X_FLOAT128X_FLOAT128XPTR, BT_FN_FLOAT16_FLOAT16_INT,
BT_FN_FLOAT32_FLOAT32_INT, BT_FN_FLOAT64_FLOAT64_INT,
BT_FN_FLOAT128_FLOAT128_INT, BT_FN_FLOAT32X_FLOAT32X_INT,
BT_FN_FLOAT64X_FLOAT64X_INT, BT_FN_FLOAT128X_FLOAT128X_INT,
BT_FN_FLOAT16_FLOAT16_INTPTR, BT_FN_FLOAT32_FLOAT32_INTPTR,
BT_FN_FLOAT64_FLOAT64_INTPTR, BT_FN_FLOAT128_FLOAT128_INTPTR,
BT_FN_FLOAT32X_FLOAT32X_INTPTR, BT_FN_FLOAT64X_FLOAT64X_INTPTR,
BT_FN_FLOAT128X_FLOAT128X_INTPTR, BT_FN_FLOAT16_FLOAT16_LONG,
BT_FN_FLOAT32_FLOAT32_LONG, BT_FN_FLOAT64_FLOAT64_LONG,
BT_FN_FLOAT128_FLOAT128_LONG, BT_FN_FLOAT32X_FLOAT32X_LONG,
BT_FN_FLOAT64X_FLOAT64X_LONG, BT_FN_FLOAT128X_FLOAT128X_LONG): New
DEF_FUNCTION_TYPE_2.
(BT_FN_FLOAT16_FLOAT16_FLOAT16_INTPTR,
BT_FN_FLOAT32_FLOAT32_FLOAT32_INTPTR,
BT_FN_FLOAT64_FLOAT64_FLOAT64_INTPTR,
BT_FN_FLOAT128_FLOAT128_FLOAT128_INTPTR,
BT_FN_FLOAT32X_FLOAT32X_FLOAT32X_INTPTR,
BT_FN_FLOAT64X_FLOAT64X_FLOAT64X_INTPTR,
BT_FN_FLOAT128X_FLOAT128X_FLOAT128X_INTPTR): New DEF_FUNCTION_TYPE_3.
* builtins.def (ACOSH_TYPE, ATAN2_TYPE, ATANH_TYPE, COSH_TYPE,
FDIM_TYPE, HUGE_VAL_TYPE, HYPOT_TYPE, ILOGB_TYPE, LDEXP_TYPE,
LGAMMA_TYPE, LLRINT_TYPE, LOG10_TYPE, LRINT_TYPE, MODF_TYPE,
NEXTAFTER_TYPE, REMQUO_TYPE, SCALBLN_TYPE, SCALBN_TYPE, SINH_TYPE):
Define and undefine later.
(FMIN_TYPE, SQRT_TYPE): Undefine at a later line.
(INF_TYPE): Define at a later line.
(BUILT_IN_ACOSH, BUILT_IN_ACOS, BUILT_IN_ASINH, BUILT_IN_ASIN,
BUILT_IN_ATAN2, BUILT_IN_ATANH, BUILT_IN_ATAN, BUILT_IN_CBRT,
BUILT_IN_COSH, BUILT_IN_COS, BUILT_IN_ERFC, BUILT_IN_ERF,
BUILT_IN_EXP2, BUILT_IN_EXP, BUILT_IN_EXPM1, BUILT_IN_FDIM,
BUILT_IN_FMOD, BUILT_IN_FREXP, BUILT_IN_HYPOT, BUILT_IN_ILOGB,
BUILT_IN_LDEXP, BUILT_IN_LGAMMA, BUILT_IN_LLRINT, BUILT_IN_LLROUND,
BUILT_IN_LOG10, BUILT_IN_LOG1P, BUILT_IN_LOG2, BUILT_IN_LOGB,
BUILT_IN_LOG, BUILT_IN_LRINT, BUILT_IN_LROUND, BUILT_IN_MODF,
BUILT_IN_NEXTAFTER, BUILT_IN_POW, BUILT_IN_REMAINDER, BUILT_IN_REMQUO,
BUILT_IN_SCALBLN, BUILT_IN_SCALBN, BUILT_IN_SINH, BUILT_IN_SIN,
BUILT_IN_TANH, BUILT_IN_TAN, BUILT_IN_TGAMMA): Add
DEF_EXT_LIB_FLOATN_NX_BUILTINS.
(BUILT_IN_HUGE_VAL): Use HUGE_VAL_TYPE instead of INF_TYPE in
DEF_GCC_FLOATN_NX_BUILTINS.
* fold-const-call.cc (fold_const_call_ss): Add various CASE_CFN_*_FN:
cases when CASE_CFN_* is present.
(fold_const_call_sss): Likewise.
* builtins.cc (mathfn_built_in_2): Use CASE_MATHFN_FLOATN instead of
CASE_MATHFN for various builtins in SEQ_OF_CASE_MATHFN macro.
(builtin_with_linkage_p): Add CASE_FLT_FN_FLOATN_NX for various
builtins next to CASE_FLT_FN.
* fold-const.cc (tree_call_nonnegative_warnv_p): Add CASE_CFN_*_FN:
next to CASE_CFN_*: for various builtins.
* tree-call-cdce.cc (can_test_argument_range): Add
CASE_FLT_FN_FLOATN_NX next to CASE_FLT_FN for various builtins.
(edom_only_function): Likewise.
* gcc.dg/torture/floatn-builtin.h: Add tests for newly added builtins.
|
|
Simplify several calls to build_string_literal by not requiring redundant
strlen or IDENTIFIER_* in the caller.
I also corrected a wrong comment on IDENTIFIER_LENGTH.
gcc/ChangeLog:
* tree.h (build_string_literal): New one-argument overloads that
take tree (identifier) and const char *.
* builtins.cc (fold_builtin_FILE)
(fold_builtin_FUNCTION)
* gimplify.cc (gimple_add_init_for_auto_var)
* vtable-verify.cc (verify_bb_vtables): Simplify calls.
gcc/cp/ChangeLog:
* cp-gimplify.cc (fold_builtin_source_location)
* vtable-class-hierarchy.cc (register_all_pairs): Simplify calls to
build_string_literal.
(build_string_from_id): Remove.
|
|
gcc/ChangeLog:
* builtins.cc (fold_builtin_inf): Convert use of real_info to dconstinf.
(fold_builtin_fpclassify): Same.
* fold-const-call.cc (fold_const_call_cc): Same.
* match.pd: Same.
* omp-low.cc (omp_reduction_init_op): Same.
* realmpfr.cc (real_from_mpfr): Same.
* tree.cc (build_complex_inf): Same.
|
|
The following patch implements a new builtin, __builtin_issignaling,
which can be used to implement the ISO/IEC TS 18661-1 issignaling
macro.
It is implemented as type-generic function, so there is just one
builtin, not many with various suffixes.
This patch doesn't address PR56831 nor PR58416, but I think compared to
using glibc issignaling macro could make some cases better (as
the builtin is expanded always inline and for SFmode/DFmode just
reinterprets a memory or pseudo register as SImode/DImode, so could
avoid some raising of exception + turning sNaN into qNaN before the
builtin can analyze it).
For floading point modes that do not have NaNs it will return 0,
otherwise I've tried to implement this for all the other supported
real formats.
It handles both the MIPS/PA floats where a sNaN has the mantissa
MSB set and the rest where a sNaN has it cleared, with the exception
of format which are known never to be in the MIPS/PA form.
The MIPS/PA floats are handled using a test like
(x & mask) == mask,
the other usually as
((x ^ bit) & mask) > val
where bit, mask and val are some constants.
IBM double double is done by doing DFmode test on the most significant
half, and Intel/Motorola extended (12 or 16 bytes) and IEEE quad are
handled by extracting 32-bit/16-bit words or 64-bit parts from the
value and testing those.
On x86, XFmode is handled by a special optab so that even pseudo numbers
are considered signaling, like in glibc and like the i386 specific testcase
tests.
2022-08-26 Jakub Jelinek <jakub@redhat.com>
gcc/
* builtins.def (BUILT_IN_ISSIGNALING): New built-in.
* builtins.cc (expand_builtin_issignaling): New function.
(expand_builtin_signbit): Don't overwrite target.
(expand_builtin): Handle BUILT_IN_ISSIGNALING.
(fold_builtin_classify): Likewise.
(fold_builtin_1): Likewise.
* optabs.def (issignaling_optab): New.
* fold-const-call.cc (fold_const_call_ss): Handle
BUILT_IN_ISSIGNALING.
* config/i386/i386.md (issignalingxf2): New expander.
* doc/extend.texi (__builtin_issignaling): Document.
(__builtin_isinf, __builtin_isnan): Clarify behavior with
-ffinite-math-only.
* doc/md.texi (issignaling<mode>2): Likewise.
gcc/c-family/
* c-common.cc (check_builtin_function_arguments): Handle
BUILT_IN_ISSIGNALING.
gcc/c/
* c-typeck.cc (convert_arguments): Handle BUILT_IN_ISSIGNALING.
gcc/fortran/
* f95-lang.cc (gfc_init_builtin_functions): Initialize
BUILT_IN_ISSIGNALING.
gcc/testsuite/
* gcc.dg/torture/builtin-issignaling-1.c: New test.
* gcc.dg/torture/builtin-issignaling-2.c: New test.
* gcc.dg/torture/float16-builtin-issignaling-1.c: New test.
* gcc.dg/torture/float32-builtin-issignaling-1.c: New test.
* gcc.dg/torture/float32x-builtin-issignaling-1.c: New test.
* gcc.dg/torture/float64-builtin-issignaling-1.c: New test.
* gcc.dg/torture/float64x-builtin-issignaling-1.c: New test.
* gcc.dg/torture/float128-builtin-issignaling-1.c: New test.
* gcc.dg/torture/float128x-builtin-issignaling-1.c: New test.
* gcc.target/i386/builtin-issignaling-1.c: New test.
|
|
The testcase in the PR demonstrates how it is possible for one
__builtin_setjmp_receiver label to appear in
nonlocal_goto_handler_labels list twice (after the block with
__builtin_setjmp_setup referring to it was duplicated).
remove_node_from_insn_list did not account for this possibility and
removed only the first copy from the list. Add an assert verifying that
duplicates are not present.
To avoid adding a label to the list twice, move registration of the
label from __builtin_setjmp_setup handling to __builtin_setjmp_receiver.
gcc/ChangeLog:
PR rtl-optimization/101347
* builtins.cc (expand_builtin) [BUILT_IN_SETJMP_SETUP]: Move
population of nonlocal_goto_handler_labels from here ...
(expand_builtin) [BUILT_IN_SETJMP_RECEIVER]: ... to here.
* rtlanal.cc (remove_node_from_insn_list): Verify that a
duplicate is not present in the remainder of the list.
|