Age | Commit message (Collapse) | Author | Files | Lines |
|
Macros [PR114461]
This is an attempt to implement the https://wg21.link/p3034r1 paper,
but I'm afraid the wording in the paper is bad for multiple reasons.
I think I understand the intent, that the module name and partition
if any shouldn't come from macros so that they can be scanned for
without preprocessing, but on the other side doesn't want to disable
macro expansion in pp-module altogether, because e.g. the optional
attribute in module-declaration would be nice to come from macros
as which exact attribute is needed might need to be decided based on
preprocessor checks.
The paper added https://eel.is/c++draft/cpp.module#2
which uses partly the wording from https://eel.is/c++draft/cpp.module#1
The first issue I see is that using that "defined as an object-like macro"
from there means IMHO something very different in those 2 paragraphs.
As per https://eel.is/c++draft/cpp.pre#7.sentence-1 preprocessing tokens
in preprocessing directives aren't subject to macro expansion unless
otherwise stated, and so the export and module tokens aren't expanded
and so the requirement that they aren't defined as an object-like macro
makes perfect sense. The problem with the new paragraph is that
https://eel.is/c++draft/cpp.module#3.sentence-1 says that the rest of
the tokens are macro expanded and after macro expansion none of the
tokens can be defined as an object-like macro, if they would be, they'd
be expanded to that. So, I think either the wording needs to change
such that not all preprocessing tokens after module are macro expanded,
only those which are after the pp-module-name and if any pp-module-partition
tokens, or all tokens after module are macro expanded but none of the tokens in
pp-module-name and pp-module-partition if any must come from macro
expansion. The patch below implements it as if the former would be
specified (but see later), so essentially scans the preprocessing tokens
after module without expansion, if the first one is an identifier, it
disables expansion for it and then if followed by . or : expects another
such identifier (again with disabled expansion), but stops after second
: is seen.
Second issue is that while the global-module-fragment start is fine, matches
the syntax of the new paragraph where the pp-tokens[opt] aren't present,
there is also private-module-fragment in the syntax where module is
followed by : private ; and in that case the colon doesn't match the
pp-module-name grammar and appears now to be invalid. I think the
https://eel.is/c++draft/cpp.module#2
paragraph needs to change so that it allows also that pp-tokens of
a pp-module may also be : pp-tokens[opt] (and in that case, I think
the colon shouldn't come from a macro and private and/or ; can).
Third issue is that there are too many pp-tokens in
https://eel.is/c++draft/cpp.module , one is all the tokens between
module keyword and the semicolon and one is the optional extra tokens
after pp-module-partition (if any, if missing, after pp-module).
Perhaps introducing some other non-terminal would help talking about it?
So in "where the pp-tokens (if any) shall not begin with a ( preprocessing
token" it isn't obvious which pp-tokens it is talking about (my assumption
is the latter) and also whether ( can't appear there just before macro
expansion or also after expansion. The patch expects only before expansion,
so
#define F ();
export module foo F
would be valid during preprocessing but obviously invalid during
compilation, but
#define foo(n) n;
export module foo (3)
would be invalid already during preprocessing.
The last issue applies only if the first issue is resolved to allow
expansion of tokens after : if first token, or after pp-module-partition
if present or after pp-module-name if present. When non-preprocessing
scanner sees
export module foo.bar:baz.qux;
it knows nothing can come from preprocessing macros and is ok, but if it
sees
export module foo.bar:baz qux
then it can't know whether it will be
export module foo.bar:baz;
or
export module foo.bar:baz [[]];
or
export module foo.bar:baz.freddy.garply;
because qux could be validly a macro, which expands to ; or [[]];
or .freddy.garply; etc. So, either the non-preprocessing scanner would
need to note it as possible export of foo.bar:baz* module partitions
and preprocess if it needs to know the details or just compile, or if that
is not ok, the wording would need to rule out that the expansion of (the
second) pp-tokens if any can't start with . or : (colon would be only
problematic if it isn't present in the tokens before it already).
So, if e.g. defining qux above to . whatever is invalid, then the scanner
can rely it sees the whole module name and partition.
The patch below implements what is above described as the first variant
of the first issue resolution, i.e. disables expansion of as many tokens
as could be in the valid module name and module partition syntax, but
as soon as it e.g. sees two adjacent identifiers, the second one can be
macro expanded. If it is macro expanded though, the expansion can't
start with . or :, and if it expands to nothing, tokens after it (whether
they come from macro expansion or not) can't start with . or :.
So, effectively:
#define SEMI ;
export module SEMI
used to be valid and isn't anymore,
#define FOO bar
export module FOO;
isn't valid,
#define COLON :
export module COLON private;
isn't valid,
#define BAR baz
export module foo.bar:baz.qux.BAR;
isn't valid,
#define BAZ .qux
export module foo BAZ;
isn't valid,
#define FREDDY :garply
export module foo FREDDY;
isn't valid,
while
#define QUX [[]]
export module foo QUX;
or
#define GARPLY private
module : GARPLY;
etc. is.
2024-11-01 Jakub Jelinek <jakub@redhat.com>
PR c++/114461
libcpp/
* include/cpplib.h: Implement C++26 P3034R1
- Module Declarations Shouldn’t be Macros (or more precisely
its expected intent).
(NO_DOT_COLON): Define.
* internal.h (struct cpp_reader): Add diagnose_dot_colon_from_macro_p
member.
* lex.cc (cpp_maybe_module_directive): For pp-module, if
module keyword is followed by CPP_NAME, ensure all CPP_NAME
tokens possibly matching module name and module partition
syntax aren't expanded and aren't defined as object-like macros.
Verify first token after that doesn't start with open paren.
If the next token after module name/partition is CPP_NAME defined
as macro, set NO_DOT_COLON flag on it.
* macro.cc (cpp_get_token_1): Set
pfile->diagnose_dot_colon_from_macro_p if token to be expanded has
NO_DOT_COLON bit set in flags. Before returning, if
pfile->diagnose_dot_colon_from_macro_p is true and not returning
CPP_PADDING or CPP_COMMENT and not during macro expansion preparation,
set pfile->diagnose_dot_colon_from_macro_p to false and diagnose
if returning CPP_DOT or CPP_COLON.
gcc/testsuite/
* g++.dg/modules/cpp-7.C: New test.
* g++.dg/modules/cpp-8.C: New test.
* g++.dg/modules/cpp-9.C: New test.
* g++.dg/modules/cpp-10.C: New test.
* g++.dg/modules/cpp-11.C: New test.
* g++.dg/modules/cpp-12.C: New test.
* g++.dg/modules/cpp-13.C: New test.
* g++.dg/modules/cpp-14.C: New test.
* g++.dg/modules/cpp-15.C: New test.
* g++.dg/modules/cpp-16.C: New test.
* g++.dg/modules/cpp-17.C: New test.
* g++.dg/modules/cpp-18.C: New test.
* g++.dg/modules/cpp-19.C: New test.
* g++.dg/modules/cpp-20.C: New test.
* g++.dg/modules/pmp-4.C: New test.
* g++.dg/modules/pmp-5.C: New test.
* g++.dg/modules/pmp-6.C: New test.
* g++.dg/modules/token-6.C: New test.
* g++.dg/modules/token-7.C: New test.
* g++.dg/modules/token-8.C: New test.
* g++.dg/modules/token-9.C: New test.
* g++.dg/modules/token-10.C: New test.
* g++.dg/modules/token-11.C: New test.
* g++.dg/modules/token-12.C: New test.
* g++.dg/modules/token-13.C: New test.
* g++.dg/modules/token-14.C: New test.
* g++.dg/modules/token-15.C: New test.
* g++.dg/modules/token-16.C: New test.
* g++.dg/modules/dir-only-3.C: Expect an error.
* g++.dg/modules/dir-only-4.C: Expect an error.
* g++.dg/modules/dir-only-5.C: New test.
* g++.dg/modules/atom-preamble-2_a.C: In export module malcolm;
replace malcolm with kevin. Don't define malcolm macro.
* g++.dg/modules/atom-preamble-4.C: Expect an error.
* g++.dg/modules/atom-preamble-5.C: New test.
|
|
I've tried to build stage3 with
-Wleading-whitespace=blanks -Wtrailing-whitespace=blank -Wno-error=leading-whitespace=blanks -Wno-error=trailing-whitespace=blank
added to STRICT_WARN and that expectably resulted in about
2744 unique trailing whitespace warnings and 124837 leading whitespace
warnings when excluding *.md files (which obviously is in big part a
generator issue). Others from that are generator related, I think those
need to be solved later.
The following patch just fixes up the easy case (trailing whitespace),
which could be easily automated:
for i in `find . -name \*.h -o -name \*.cc -o -name \*.c | xargs grep -l '[ ]$' | grep -v testsuite/`; do sed -i -e 's/[ ]*$//' $i; done
I've excluded files which I knew are obviously generated or go FE.
Is there anything else we'd want to avoid the changes?
Due to patch size, I've split it between gcc/ part
and rest (include/, libiberty/, libgcc/, libcpp/, libstdc++-v3/;
this part).
2024-10-24 Jakub Jelinek <jakub@redhat.com>
include/
* dyn-string.h: Remove trailing whitespace.
* libiberty.h: Likewise.
* xregex.h: Likewise.
* splay-tree.h: Likewise.
* partition.h: Likewise.
* plugin-api.h: Likewise.
* demangle.h: Likewise.
* vtv-change-permission.h: Likewise.
* fibheap.h: Likewise.
* hsa_ext_image.h: Likewise.
* hashtab.h: Likewise.
* libcollector.h: Likewise.
* sort.h: Likewise.
* symcat.h: Likewise.
* hsa_ext_amd.h: Likewise.
libcpp/
* directives.cc: Remove trailing whitespace.
* mkdeps.cc: Likewise.
* line-map.cc: Likewise.
* internal.h: Likewise.
* files.cc: Likewise.
* init.cc: Likewise.
* makeucnid.cc: Likewise.
* system.h: Likewise.
* include/line-map.h: Likewise.
* include/symtab.h: Likewise.
* include/cpplib.h: Likewise.
* expr.cc: Likewise.
* charset.cc: Likewise.
* macro.cc: Likewise.
* errors.cc: Likewise.
* lex.cc: Likewise.
* traditional.cc: Likewise.
libgcc/
* crtstuff.c: Remove trailing whitespace.
* libgcov.h: Likewise.
* config/alpha/crtfastmath.c: Likewise.
* config/alpha/vms-gcc_shell_handler.c: Likewise.
* config/alpha/vms-unwind.h: Likewise.
* config/pa/linux-atomic.c: Likewise.
* config/pa/linux-unwind.h: Likewise.
* config/pa/quadlib.c: Likewise.
* config/pa/fptr.c: Likewise.
* config/s390/32/_fixsfdi.c: Likewise.
* config/s390/32/_fixunssfdi.c: Likewise.
* config/s390/32/_fixunsdfdi.c: Likewise.
* config/c6x/pr-support.c: Likewise.
* config/lm32/_udivsi3.c: Likewise.
* config/lm32/libgcc_lm32.h: Likewise.
* config/lm32/_udivmodsi4.c: Likewise.
* config/lm32/_mulsi3.c: Likewise.
* config/lm32/_modsi3.c: Likewise.
* config/lm32/_umodsi3.c: Likewise.
* config/lm32/_divsi3.c: Likewise.
* config/darwin-crt3.c: Likewise.
* config/msp430/mpy.c: Likewise.
* config/ia64/tf-signs.c: Likewise.
* config/ia64/fde-vms.c: Likewise.
* config/ia64/unwind-ia64.c: Likewise.
* config/ia64/vms-unwind.h: Likewise.
* config/ia64/sfp-exceptions.c: Likewise.
* config/ia64/quadlib.c: Likewise.
* config/ia64/unwind-ia64.h: Likewise.
* config/rl78/vregs.h: Likewise.
* config/arm/bpabi.c: Likewise.
* config/arm/unwind-arm.c: Likewise.
* config/arm/pr-support.c: Likewise.
* config/arm/linux-atomic.c: Likewise.
* config/arm/bpabi-lib.h: Likewise.
* config/frv/frvend.c: Likewise.
* config/frv/cmovw.c: Likewise.
* config/frv/frvbegin.c: Likewise.
* config/frv/cmovd.c: Likewise.
* config/frv/cmovh.c: Likewise.
* config/aarch64/cpuinfo.c: Likewise.
* config/i386/crtfastmath.c: Likewise.
* config/i386/cygming-crtend.c: Likewise.
* config/i386/32/tf-signs.c: Likewise.
* config/i386/crtprec.c: Likewise.
* config/i386/sfp-exceptions.c: Likewise.
* config/i386/w32-unwind.h: Likewise.
* config/m32r/initfini.c: Likewise.
* config/sparc/crtfastmath.c: Likewise.
* config/gcn/amdgcn_veclib.h: Likewise.
* config/nios2/linux-atomic.c: Likewise.
* config/nios2/linux-unwind.h: Likewise.
* config/nios2/lib2-mul.c: Likewise.
* config/nios2/lib2-nios2.h: Likewise.
* config/xtensa/unwind-dw2-xtensa.c: Likewise.
* config/rs6000/darwin-fallback.c: Likewise.
* config/rs6000/ibm-ldouble.c: Likewise.
* config/rs6000/sfp-machine.h: Likewise.
* config/rs6000/darwin-asm.h: Likewise.
* config/rs6000/darwin-crt2.c: Likewise.
* config/rs6000/aix-unwind.h: Likewise.
* config/rs6000/sfp-exceptions.c: Likewise.
* config/gthr-vxworks.c: Likewise.
* config/riscv/atomic.c: Likewise.
* config/visium/memcpy.c: Likewise.
* config/darwin-crt-tm.c: Likewise.
* config/stormy16/lib2funcs.c: Likewise.
* config/arc/ieee-754/divtab-arc-sf.c: Likewise.
* config/arc/ieee-754/divtab-arc-df.c: Likewise.
* config/arc/initfini.c: Likewise.
* config/sol2/gmon.c: Likewise.
* config/microblaze/divsi3_table.c: Likewise.
* config/m68k/fpgnulib.c: Likewise.
* libgcov-driver.c: Likewise.
* unwind-dw2.c: Likewise.
* fp-bit.c: Likewise.
* dfp-bit.h: Likewise.
* dfp-bit.c: Likewise.
* libgcov-driver-system.c: Likewise.
libgcc/config/libbid/
* _le_td.c: Remove trailing whitespace.
* bid128_compare.c: Likewise.
* bid_div_macros.h: Likewise.
* bid64_to_bid128.c: Likewise.
* bid64_to_uint32.c: Likewise.
* bid128_to_uint64.c: Likewise.
* bid64_div.c: Likewise.
* bid128_round_integral.c: Likewise.
* bid_binarydecimal.c: Likewise.
* bid128_string.c: Likewise.
* bid_flag_operations.c: Likewise.
* bid128_to_int64.c: Likewise.
* _mul_sd.c: Likewise.
* bid64_mul.c: Likewise.
* bid128_noncomp.c: Likewise.
* _gt_dd.c: Likewise.
* bid64_add.c: Likewise.
* bid64_string.c: Likewise.
* bid_from_int.c: Likewise.
* bid128.c: Likewise.
* _ge_dd.c: Likewise.
* _ne_sd.c: Likewise.
* _dd_to_td.c: Likewise.
* _unord_sd.c: Likewise.
* bid64_to_uint64.c: Likewise.
* _gt_sd.c: Likewise.
* _sd_to_td.c: Likewise.
* _addsub_td.c: Likewise.
* _ne_td.c: Likewise.
* bid_dpd.c: Likewise.
* bid128_add.c: Likewise.
* bid128_next.c: Likewise.
* _lt_sd.c: Likewise.
* bid64_next.c: Likewise.
* bid128_mul.c: Likewise.
* _lt_dd.c: Likewise.
* _ge_td.c: Likewise.
* _unord_dd.c: Likewise.
* bid64_sqrt.c: Likewise.
* bid_sqrt_macros.h: Likewise.
* bid64_fma.c: Likewise.
* _sd_to_dd.c: Likewise.
* bid_conf.h: Likewise.
* bid64_noncomp.c: Likewise.
* bid_gcc_intrinsics.h: Likewise.
* _gt_td.c: Likewise.
* _ge_sd.c: Likewise.
* bid128_minmax.c: Likewise.
* bid128_quantize.c: Likewise.
* bid32_to_bid64.c: Likewise.
* bid_round.c: Likewise.
* _td_to_sd.c: Likewise.
* bid_inline_add.h: Likewise.
* bid128_fma.c: Likewise.
* _eq_td.c: Likewise.
* bid32_to_bid128.c: Likewise.
* bid64_rem.c: Likewise.
* bid128_2_str_tables.c: Likewise.
* _mul_dd.c: Likewise.
* _dd_to_sd.c: Likewise.
* bid128_div.c: Likewise.
* _lt_td.c: Likewise.
* bid64_compare.c: Likewise.
* bid64_to_int32.c: Likewise.
* _unord_td.c: Likewise.
* bid128_rem.c: Likewise.
* bid_internal.h: Likewise.
* bid64_to_int64.c: Likewise.
* _eq_dd.c: Likewise.
* _td_to_dd.c: Likewise.
* bid128_to_int32.c: Likewise.
* bid128_to_uint32.c: Likewise.
* _ne_dd.c: Likewise.
* bid64_quantize.c: Likewise.
* _le_dd.c: Likewise.
* bid64_round_integral.c: Likewise.
* _le_sd.c: Likewise.
* bid64_minmax.c: Likewise.
libgcc/config/avr/libf7/
* f7-renames.h: Remove trailing whitespace.
libstdc++-v3/
* include/debug/debug.h: Remove trailing whitespace.
* include/parallel/base.h: Likewise.
* include/parallel/types.h: Likewise.
* include/parallel/settings.h: Likewise.
* include/parallel/multiseq_selection.h: Likewise.
* include/parallel/partition.h: Likewise.
* include/parallel/random_number.h: Likewise.
* include/parallel/find_selectors.h: Likewise.
* include/parallel/partial_sum.h: Likewise.
* include/parallel/list_partition.h: Likewise.
* include/parallel/search.h: Likewise.
* include/parallel/algorithmfwd.h: Likewise.
* include/parallel/random_shuffle.h: Likewise.
* include/parallel/multiway_mergesort.h: Likewise.
* include/parallel/sort.h: Likewise.
* include/parallel/algobase.h: Likewise.
* include/parallel/numericfwd.h: Likewise.
* include/parallel/multiway_merge.h: Likewise.
* include/parallel/losertree.h: Likewise.
* include/bits/basic_ios.h: Likewise.
* include/bits/stringfwd.h: Likewise.
* include/bits/ostream_insert.h: Likewise.
* include/bits/stl_heap.h: Likewise.
* include/bits/unordered_map.h: Likewise.
* include/bits/hashtable_policy.h: Likewise.
* include/bits/stl_iterator_base_funcs.h: Likewise.
* include/bits/valarray_before.h: Likewise.
* include/bits/regex.h: Likewise.
* include/bits/postypes.h: Likewise.
* include/bits/stl_iterator.h: Likewise.
* include/bits/localefwd.h: Likewise.
* include/bits/stl_algo.h: Likewise.
* include/bits/ios_base.h: Likewise.
* include/bits/stl_function.h: Likewise.
* include/bits/basic_string.h: Likewise.
* include/bits/hashtable.h: Likewise.
* include/bits/valarray_after.h: Likewise.
* include/bits/char_traits.h: Likewise.
* include/bits/gslice.h: Likewise.
* include/bits/locale_facets_nonio.h: Likewise.
* include/bits/mask_array.h: Likewise.
* include/bits/specfun.h: Likewise.
* include/bits/random.h: Likewise.
* include/bits/slice_array.h: Likewise.
* include/bits/valarray_array.h: Likewise.
* include/tr1/float.h: Likewise.
* include/tr1/functional_hash.h: Likewise.
* include/tr1/math.h: Likewise.
* include/tr1/hashtable_policy.h: Likewise.
* include/tr1/stdio.h: Likewise.
* include/tr1/complex.h: Likewise.
* include/tr1/stdbool.h: Likewise.
* include/tr1/stdarg.h: Likewise.
* include/tr1/inttypes.h: Likewise.
* include/tr1/fenv.h: Likewise.
* include/tr1/stdlib.h: Likewise.
* include/tr1/wchar.h: Likewise.
* include/tr1/tgmath.h: Likewise.
* include/tr1/limits.h: Likewise.
* include/tr1/wctype.h: Likewise.
* include/tr1/stdint.h: Likewise.
* include/tr1/ctype.h: Likewise.
* include/tr1/random.h: Likewise.
* include/tr1/shared_ptr.h: Likewise.
* include/ext/mt_allocator.h: Likewise.
* include/ext/sso_string_base.h: Likewise.
* include/ext/debug_allocator.h: Likewise.
* include/ext/vstring_fwd.h: Likewise.
* include/ext/pointer.h: Likewise.
* include/ext/pod_char_traits.h: Likewise.
* include/ext/malloc_allocator.h: Likewise.
* include/ext/vstring.h: Likewise.
* include/ext/bitmap_allocator.h: Likewise.
* include/ext/pool_allocator.h: Likewise.
* include/ext/type_traits.h: Likewise.
* include/ext/ropeimpl.h: Likewise.
* include/ext/codecvt_specializations.h: Likewise.
* include/ext/throw_allocator.h: Likewise.
* include/ext/extptr_allocator.h: Likewise.
* include/ext/atomicity.h: Likewise.
* include/ext/concurrence.h: Likewise.
* include/c_compatibility/wchar.h: Likewise.
* include/c_compatibility/stdint.h: Likewise.
* include/backward/hash_fun.h: Likewise.
* include/backward/binders.h: Likewise.
* include/backward/hashtable.h: Likewise.
* include/backward/auto_ptr.h: Likewise.
* libsupc++/eh_arm.cc: Likewise.
* libsupc++/unwind-cxx.h: Likewise.
* libsupc++/si_class_type_info.cc: Likewise.
* libsupc++/vec.cc: Likewise.
* libsupc++/class_type_info.cc: Likewise.
* libsupc++/vmi_class_type_info.cc: Likewise.
* libsupc++/guard_error.cc: Likewise.
* libsupc++/bad_typeid.cc: Likewise.
* libsupc++/eh_personality.cc: Likewise.
* libsupc++/atexit_arm.cc: Likewise.
* libsupc++/pmem_type_info.cc: Likewise.
* libsupc++/vterminate.cc: Likewise.
* libsupc++/eh_terminate.cc: Likewise.
* libsupc++/bad_cast.cc: Likewise.
* libsupc++/exception_ptr.h: Likewise.
* libsupc++/eh_throw.cc: Likewise.
* libsupc++/bad_alloc.cc: Likewise.
* libsupc++/nested_exception.cc: Likewise.
* libsupc++/pointer_type_info.cc: Likewise.
* libsupc++/pbase_type_info.cc: Likewise.
* libsupc++/bad_array_new.cc: Likewise.
* libsupc++/pure.cc: Likewise.
* libsupc++/eh_exception.cc: Likewise.
* libsupc++/bad_array_length.cc: Likewise.
* libsupc++/cxxabi.h: Likewise.
* libsupc++/guard.cc: Likewise.
* libsupc++/eh_catch.cc: Likewise.
* libsupc++/cxxabi_forced.h: Likewise.
* libsupc++/tinfo.h: Likewise.
|
|
The following patch on top of the r15-4346 patch adds
-Wleading-whitespace= warning option.
This warning doesn't care how much one actually indents which line
in the source (that is something that can't be easily done in the
preprocessor without doing syntactic analysis), but just simple checks
on what kind of whitespace is used in the indentation.
I think it is still useful to get warnings about such issues early,
while git diagnoses some of it in patches (e.g. the tab after space
case), getting the warnings earlier might help avoiding such issues
sooner.
There are projects which ban use of tabs and require just spaces,
others which require indentation just with horizontal tabs, and finally
projects which want indentation with tabs for multiples of tabstop size
followed by spaces (fewer than tabstop size), like GCC.
For all 3 kinds the warning diagnoses indentation with '\v' or '\f'
characters (unless line contains just whitespace), and for the last one
also cases where a space in the indentation is followed by horizontal
tab or where there are N or more consecutive spaces in the indentation
(for -ftabstop=N).
BTW, for additional testing I've enabled the warnings (without -Werror
for them) in stage3. There are many warnings (both trailing and leading
whitespace), some of them something that can be easily fixed in the headers
or source files, but others with whitespace issues in generated sources,
so if we enable the warnings, either we'd need to adjust the generators
or disable the warnings in (some of the) generated files.
2024-10-23 Jakub Jelinek <jakub@redhat.com>
libcpp/
* include/cpplib.h (struct cpp_options): Add
cpp_warn_leading_whitespace and cpp_tabstop members.
(enum cpp_warning_reason): Add CPP_W_LEADING_WHITESPACE.
* internal.h (struct _cpp_line_note): Document new
line note kinds.
* init.cc (cpp_create_reader): Set cpp_tabstop to 8.
* lex.cc (find_leading_whitespace_issues): New function.
(_cpp_clean_line): Use it.
(_cpp_process_line_notes): Handle 'L', 'S' and 'T' line notes.
(lex_raw_string): Clear type on 'L', 'S' and 'T' line notes
inside of raw string literals.
gcc/
* doc/invoke.texi (Wleading-whitespace=): Document.
gcc/c-family/
* c.opt (Wleading-whitespace=): New option.
* c-opts.cc (c_common_post_options): Set cpp_opts->cpp_tabstop
to global_dc->m_tabstop.
gcc/testsuite/
* c-c++-common/cpp/Wleading-whitespace-1.c: New test.
* c-c++-common/cpp/Wleading-whitespace-2.c: New test.
* c-c++-common/cpp/Wleading-whitespace-3.c: New test.
* c-c++-common/cpp/Wleading-whitespace-4.c: New test.
|
|
The following patch partially implements the N3353 paper.
In particular, it adds support for the delimited escape sequences
(\u{123}, \x{123}, \o{123}) which were added already for C++23,
all I had to do is split the delimited escape sequence guarding from
named universal character escape sequence guards
(\N{LATIN CAPITAL LETTER C WITH CARON}), which C++23 has but C2Y doesn't
and emit different diagnostics for C from C++ for the delimited escape
sequences.
And it adds support for the new style of octal literals, 0o137 or 0O1777.
I have so far added that just for C and not C++, because I have no idea
whether C++ will want to handle it similarly.
What the patch doesn't do is any kind of diagnostics for obsoletion of
\137 or 0137, as discussed in the PR, I think it is way too early for that.
Perhaps some non-default warning later on.
2024-10-17 Jakub Jelinek <jakub@redhat.com>
PR c/117028
libcpp/
* include/cpplib.h (struct cpp_options): Add named_uc_escape_seqs,
octal_constants and cpp_warn_c23_c2y_compat members.
(enum cpp_warning_reason): Add CPP_W_C23_C2Y_COMPAT enumerator.
* init.cc (struct lang_flags): Add named_uc_escape_seqs and
octal_constants bit-fields.
(lang_defaults): Add initializers for them into the table.
(cpp_set_lang): Initialize named_uc_escape_seqs and octal_constants.
(cpp_create_reader): Initialize cpp_warn_c23_c2y_compat to -1.
* charset.cc (_cpp_valid_ucn): Test
CPP_OPTION (pfile, named_uc_escape_seqs) rather than
CPP_OPTION (pfile, delimited_escape_seqs) in \N{} related tests.
Change wording of C cpp_pedwarning for \u{} and emit
-Wc23-c2y-compat warning for it too if needed. Formatting fixes.
(convert_hex): Change wording of C cpp_pedwarning for \u{} and emit
-Wc23-c2y-compat warning for it too if needed.
(convert_oct): Likewise.
* expr.cc (cpp_classify_number): Handle C2Y 0o or 0O prefixed
octal constants.
(cpp_interpret_integer): Likewise.
gcc/c-family/
* c.opt (Wc23-c2y-compat): Add CPP and CppReason parameters.
* c-opts.cc (set_std_c2y): Use CLK_STDC2Y or CLK_GNUC2Y rather
than CLK_STDC23 and CLK_GNUC23. Formatting fix.
* c-lex.cc (interpret_integer): Handle C2Y 0o or 0O prefixed
and wb/WB/uwb/UWB suffixed octal constants.
gcc/testsuite/
* gcc.dg/bitint-112.c: New test.
* gcc.dg/c23-digit-separators-1.c: Add _Static_assert for
valid binary constant with digit separator.
* gcc.dg/c23-octal-constants-1.c: New test.
* gcc.dg/c23-octal-constants-2.c: New test.
* gcc.dg/c2y-digit-separators-1.c: New test.
* gcc.dg/c2y-digit-separators-2.c: New test.
* gcc.dg/c2y-octal-constants-1.c: New test.
* gcc.dg/c2y-octal-constants-2.c: New test.
* gcc.dg/c2y-octal-constants-3.c: New test.
* gcc.dg/cpp/c23-delimited-escape-seq-1.c: New test.
* gcc.dg/cpp/c23-delimited-escape-seq-2.c: New test.
* gcc.dg/cpp/c2y-delimited-escape-seq-1.c: New test.
* gcc.dg/cpp/c2y-delimited-escape-seq-2.c: New test.
* gcc.dg/cpp/c2y-delimited-escape-seq-3.c: New test.
* gcc.dg/cpp/c2y-delimited-escape-seq-4.c: New test.
* gcc.dg/octal-constants-1.c: New test.
* gcc.dg/octal-constants-2.c: New test.
* gcc.dg/octal-constants-3.c: New test.
* gcc.dg/octal-constants-4.c: New test.
* gcc.dg/system-octal-constants-1.c: New test.
* gcc.dg/system-octal-constants-1.h: New file.
|
|
This patch actually optimizes #embed, so far in C.
For a simple testcase (for 494447200 bytes long cc1plus):
cat embed-11.c
unsigned char a[] = {
#embed "cc1plus"
};
time ./xgcc -B ./ -S -std=c23 -O2 embed-11.c
real 0m13.647s
user 0m7.157s
sys 0m2.597s
time ./xgcc -B ./ -c -std=c23 -O2 embed-11.c
real 0m28.649s
user 0m26.653s
sys 0m1.958s
and when configured against binutils with .base64 support
time ./xgcc -B ./ -S -std=c23 -O2 embed-11.c
real 0m4.283s
user 0m2.288s
sys 0m0.859s
time ./xgcc -B ./ -c -std=c23 -O2 embed-11.c
real 0m6.888s
user 0m5.876s
sys 0m1.002s
(all times with --enable-checking=yes,rtl,extra compiler).
Even just
./cc1plus -E -o embed-11.i embed-11.c
(which doesn't have this optimization yet and so preprocesses it as
1.3GB preprocessed file) needed almost 25GB of compile time RAM (but
preprocessed fine).
And compiling that embed-11.i with -std=c23 -O0 by unpatched gcc
I gave up after 400 seconds when it already ate 45GB of RAM and didn't
produce a single byte into embed-11.s yet.
The patch introduces a new CPP_EMBED token which contains raw memory image
virtually representing a sequence of int literals.
To simplify the parsing complexities, the preprocessor guarantees CPP_EMBED
is only emitted if there are 4+ (it actually does that for 64+ right now)
literals in the sequence and emits CPP_NUMBER CPP_COMMA CPP_EMBED CPP_COMMA
CPP_NUMBER tokens (with more CPP_EMBED separated by CPP_COMMA if it is
longer than 2GB, as STRING_CSTs in GCC and also the new RAW_DATA_CST etc.
are limited to INT_MAX elements). The main reason is that the preprocessor
doesn't really know in which context #embed directive appears, there could
be e.g.
{ 25 *
#embed "whatever"
* 2 - 15 }
or similar and dealing with this special case deep in the expression parsing
is undesirable.
With the CPP_NUMBERs around it, I believe in the C FE the only places which
need handling of the CPP_EMBED token are initializer parsing (that is the
only one which adds actual optimizations for it), comma expressions (I
believe nothing really cares whether it is 25,13,95 or
25,13,0,1,2,3,4,5,6,7,8,9,10,13,95 etc., so besides the 2 outer CPP_NUMBER
the parsing just adds one INTEGER_CST to the comma expression, I doubt users
want to be spammed with millions of -Wunused warnings per #embed),
whatever uses c_parser_expr_list (function calls, attribute arguments,
OpenMP sizes clause argument, OpenACC tile clause argument and whatever uses
c_parser_get_builtin_args (mainly for __builtin_shufflevector). Please correct
me if I'm wrong.
The patch introduces a RAW_DATA_CST tree code, which can then be used inside
of array CONSTRUCTOR elt values. In some sense RAW_DATA_CST is similar to
STRING_CST, but right now STRING_CST is used only if the whole array
initializer is that constant, while RAW_DATA_CST at index idx (should be
always INTEGER_CST index, another advantage of the CPP_NUMBER around is that
[30 ... 250] =
#embed "whatever"
really does what it would do with a integer sequence there) stands for
[idx] = RAW_DATA_POINTER (val)[0],
[idx+1] = RAW_DATA_POINTER (val)[1],
...
[idx+RAW_DATA_LENGTH (val)-1] = RAW_DATA_POINTER (val)[RAW_DATA_LENGTH (val)-1].
Another important thing is that unlike STRING_CST which has the data
embedded in it RAW_DATA_CST doesn't own the data, it has RAW_DATA_OWNER
which owns the data (that can be a STRING_CST, e.g. used for PCH or LTO
after reading LTO in) or another RAW_DATA_CST (with NULL RAW_DATA_OWNER,
standing for data owned by libcpp buffers). The advantage is that it can be
cheaply peeled off, or split into multiple smaller pieces, e.g. if one uses
designated initializer to store something into the middle of a 10GB #embed
array, in no case we need to actually copy data around for that.
Right now RAW_DATA_CST is only used in initializers of integral arrays where
the integer type has (host) CHAR_BIT precision, so usually char/signed
char/unsigned char (for C++ later maybe std::byte); in theory we could say
allocate 4 times as big buffer for conversions to int array and depending
on endianity and storage order reversal etc., but I'm not sure if that is
something that will be actually needed in the wild.
And an optimization inside of c-common.cc attempts to undo that CPP_NUMBER
CPP_EMBED CPP_NUMBER division in case one uses #embed the usual way and
doesn't use the boundary literals in weird ways and the values there match
the surrounding bytes in the owner buffer.
For LTO, in order to avoid copying perhaps gigabytes long data around,
the hacks in the streamer out/in cause the data owned by libcpp to be
streamed right into the stream and streamed back as a STRING_CST which
owns the data.
2024-10-16 Jakub Jelinek <jakub@redhat.com>
libcpp/
* include/cpplib.h (TTYPE_TABLE): Add CPP_EMBED token type.
* files.cc (finish_embed): For limit >= 64 and C preprocessing
instead of emitting CPP_NUMBER CPP_COMMA separated sequence for the
whole embed emit it just for the first and last byte and in between
emit a CPP_EMBED token or tokens if too large.
gcc/
* treestruct.def (TS_RAW_DATA_CST): New.
* tree.def (RAW_DATA_CST): New tree code.
* tree-core.h (struct tree_raw_data): New type.
(union tree_node): Add raw_data_cst member.
* tree.h (RAW_DATA_LENGTH, RAW_DATA_POINTER, RAW_DATA_OWNER): Define.
(gt_ggc_mx, gt_pch_nx): Declare overloads for tree_raw_data *.
* tree.cc (tree_node_structure_for_code): Handle RAW_DATA_CST.
(initialize_tree_contains_struct): Handle TS_RAW_DATA_CST.
(tree_code_size): Handle RAW_DATA_CST.
(initializer_zerop): Likewise.
(gt_ggc_mx, gt_pch_nx): Define overloads for tree_raw_data *.
* gimplify.cc (gimplify_init_ctor_eval): Handle RAW_DATA_CST.
* fold-const.cc (operand_compare::operand_equal_p): Handle
RAW_DATA_CST. Formatting fix.
(operand_compare::hash_operand): Handle RAW_DATA_CST.
(native_encode_initializer): Likewise.
(get_array_ctor_element_at_index): Likewise.
(fold): Likewise.
* gimple-fold.cc (fold_array_ctor_reference): Likewise. Formatting
fix.
* varasm.cc (const_hash_1): Handle RAW_DATA_CST.
(initializer_constant_valid_p_1): Likewise.
(array_size_for_constructor): Likewise.
(output_constructor_regular_field): Likewise.
* expr.cc (categorize_ctor_elements_1): Likewise.
(expand_expr_real_1) <case ARRAY_REF>: Punt for RAW_DATA_CST.
* tree-streamer.cc (streamer_check_handled_ts_structures): Mark
TS_RAW_DATA_CST as handled.
* tree-streamer-in.cc (streamer_alloc_tree): Handle RAW_DATA_CST.
(lto_input_ts_raw_data_cst_tree_pointers): New function.
(streamer_read_tree_body): Call it for RAW_DATA_CST.
* tree-streamer-out.cc (write_ts_raw_data_cst_tree_pointers): New
function.
(streamer_write_tree_body): Call it for RAW_DATA_CST.
(streamer_write_tree_header): Handle RAW_DATA_CST.
* lto-streamer-out.cc (DFS::DFS_write_tree_body): Handle RAW_DATA_CST.
* tree-pretty-print.cc (dump_generic_node): Likewise.
gcc/c-family/
* c-ppoutput.cc (token_streamer::stream): Add special code to spell
CPP_EMBED token.
* c-lex.cc (c_lex_with_flags): Handle CPP_EMBED. Formatting fix.
* c-common.cc (c_parse_error): Handle CPP_EMBED.
(braced_list_to_string): Optimize RAW_DATA_CST surrounded by
INTEGER_CSTs which match some bytes before or after RAW_DATA_CST in
its owner.
gcc/c/
* c-parser.cc (c_parser_braced_init): Handle CPP_EMBED.
(c_parser_get_builtin_args): Likewise.
(c_parser_expression): Likewise.
(c_parser_expr_list): Likewise.
* c-typeck.cc (digest_init): Handle RAW_DATA_CST. Formatting fix.
(init_node_successor): New function.
(add_pending_init): Handle RAW_DATA_CST.
(set_nonincremental_init): Formatting fix.
(output_init_element): Handle RAW_DATA_CST. Formatting fixes.
(maybe_split_raw_data): New function.
(process_init_element): Use maybe_split_raw_data. Handle
RAW_DATA_CST.
gcc/testsuite/
* c-c++-common/cpp/embed-20.c: New test.
* c-c++-common/cpp/embed-21.c: New test.
* c-c++-common/cpp/embed-28.c: New test.
* gcc.dg/cpp/embed-8.c: New test.
* gcc.dg/cpp/embed-9.c: New test.
* gcc.dg/cpp/embed-10.c: New test.
* gcc.dg/cpp/embed-11.c: New test.
* gcc.dg/cpp/embed-12.c: New test.
* gcc.dg/cpp/embed-13.c: New test.
* gcc.dg/cpp/embed-14.c: New test.
* gcc.dg/cpp/embed-15.c: New test.
* gcc.dg/cpp/embed-16.c: New test.
* gcc.dg/pch/embed-1.c: New test.
* gcc.dg/pch/embed-1.hs: New test.
* gcc.dg/lto/embed-1_0.c: New test.
* gcc.dg/lto/embed-1_1.c: New test.
|
|
Trailing blanks is something even git diff diagnoses; while it is a coding
style issue, if it is so common that git diff diagnoses it, I think it could
be useful to various projects to check that at compile time.
Dunno if it should be included in -Wextra, currently it isn't, and due to
tons of trailing whitespace in our sources, haven't enabled it for when
building gcc itself either.
Note, git diff also diagnoses indentation with tab following space, wonder
if we couldn't have trivial warning options where one would simply ask for
checking of indentation with no tabs, just spaces vs. indentation with
tabs followed by spaces (but never tab width or more spaces in the
indentation). I think that would be easy to do also on the libcpp side.
Checking how much something should be exactly indented requires syntax
analysis (at least some limited one) and can consider columns of first token
on line, but what the exact indentation blanks were is something only libcpp
knows.
On Thu, Sep 19, 2024 at 08:17:24AM +0200, Richard Biener wrote:
> Generally I like diagnosing this early. For the above I'd say -Wtrailing-whitespace=
> with a set of things to diagnose (and a sane default - just spaces and tabs - for
> -Wtrailiing-whitespace) would be nice. As for naming possibly follow the
> is{space,blank,cntrl} character classifications? If those are a good
> fit, that is.
The patch currently allows blank (' ' '\t') and space (' ' '\t' '\f' '\v'),
cntrl not yet added, not anything non-ASCII, but in theory could
be added later (though, non-ASCII would be just for inside of comments,
say non-breaking space etc. in the source is otherwise an error).
2024-10-15 Jakub Jelinek <jakub@redhat.com>
libcpp/
* include/cpplib.h (struct cpp_options): Add
cpp_warn_trailing_whitespace member.
(enum cpp_warning_reason): Add CPP_W_TRAILING_WHITESPACE.
* internal.h (struct _cpp_line_note): Document 'W' line note.
* lex.cc (_cpp_clean_line): Add 'W' line note for trailing whitespace
except for trailing whitespace after backslash. Formatting fix.
(_cpp_process_line_notes): Emit -Wtrailing-whitespace diagnostics.
Formatting fixes.
(lex_raw_string): Clear type on 'W' notes.
gcc/
* doc/invoke.texi (Wtrailing-whitespace): Document.
gcc/c-family/
* c.opt (Wtrailing-whitespace=): New option.
(Wtrailing-whitespace): New alias.
* c.opt.urls: Regenerate.
gcc/testsuite/
* c-c++-common/cpp/Wtrailing-whitespace-1.c: New test.
* c-c++-common/cpp/Wtrailing-whitespace-2.c: New test.
* c-c++-common/cpp/Wtrailing-whitespace-3.c: New test.
* c-c++-common/cpp/Wtrailing-whitespace-4.c: New test.
* c-c++-common/cpp/Wtrailing-whitespace-5.c: New test.
* c-c++-common/cpp/Wtrailing-whitespace-6.c: New test.
* c-c++-common/cpp/Wtrailing-whitespace-7.c: New test.
* c-c++-common/cpp/Wtrailing-whitespace-8.c: New test.
* c-c++-common/cpp/Wtrailing-whitespace-9.c: New test.
* c-c++-common/cpp/Wtrailing-whitespace-10.c: New test.
|
|
The implementation of #pragma push_macro and #pragma pop_macro has to date
made use of an ad-hoc function, _cpp_lex_identifier(), which lexes an
identifier out of a string. When support was added for extended characters
in identifiers ($, UCNs, or UTF-8), that support was added only for the
"normal" way of lexing identifiers out of a cpp_buffer (_cpp_lex_direct) and
not for the ad-hoc way. Consequently, extended identifiers are not usable
with these pragmas.
The logic for lexing identifiers has become more complicated than it was
when _cpp_lex_identifier() was written -- it now handles things like \N{}
escapes in C++, for instance -- and it no longer seems practical to maintain
a redundant code path for lexing identifiers. Address the issue by changing
the implementation of #pragma {push,pop}_macro to lex identifiers in the
expected way, i.e. by pushing a cpp_buffer and lexing the identifier from
there.
The existing implementation has some quirks because of the ad-hoc parsing
logic. For example:
#pragma push_macro("X ")
...
#pragma pop_macro("X")
will not restore macro X (note the extra space in the first string). However:
#pragma push_macro("X ")
...
#pragma pop_macro("X ")
actually does sucessfully restore "X". This is because the key for looking
up the saved macro on the push stack is the original string passed, so the
string passed to pop_macro needs to match it exactly. It is not that easy to
reproduce this logic in the world of extended characters, given that for
example it should be valid to pass a UCN to push_macro, and the
corresponding UTF-8 to pop_macro. Given that this aspect of the existing
behavior seems unintentional and has no tests (and does not match other
implementations), I opted to make the new logic more straightforward. The
string passed needs to lex to one token, which must be a valid identifier,
or else no action is taken and no error is generated. Any diagnostics
encountered during lexing (e.g., due to a UTF-8 character not permitted to
appear in an identifier) are also suppressed.
It could be nice (for GCC 15) to also add a warning if a pop_macro does not
match a previous push_macro.
libcpp/ChangeLog:
PR preprocessor/109704
* include/cpplib.h (class cpp_auto_suppress_diagnostics): New class.
* errors.cc
(cpp_auto_suppress_diagnostics::cpp_auto_suppress_diagnostics): New
function.
(cpp_auto_suppress_diagnostics::~cpp_auto_suppress_diagnostics): New
function.
* charset.cc (noop_diagnostic_cb): Remove.
(cpp_interpret_string_ranges): Refactor diagnostic suppression logic
into new class cpp_auto_suppress_diagnostics.
(count_source_chars): Likewise.
* directives.cc (cpp_pop_definition): Add cpp_hashnode argument.
(lex_identifier_from_string): New static helper function.
(push_pop_macro_common): Refactor common logic from
do_pragma_push_macro and do_pragma_pop_macro; use
lex_identifier_from_string instead of _cpp_lex_identifier.
(do_pragma_push_macro): Reimplement using push_pop_macro_common.
(do_pragma_pop_macro): Likewise.
* internal.h (_cpp_lex_identifier): Remove.
* lex.cc (lex_identifier_intern): Remove.
(_cpp_lex_identifier): Remove.
gcc/testsuite/ChangeLog:
PR preprocessor/109704
* c-c++-common/cpp/pragma-push-pop-utf8.c: New test.
* g++.dg/pch/pushpop-2.C: New test.
* g++.dg/pch/pushpop-2.Hs: New test.
* gcc.dg/pch/pushpop-2.c: New test.
* gcc.dg/pch/pushpop-2.hs: New test.
|
|
When working on #embed support, or -Wheader-guard or other recent libcpp
changes, I've been annoyed by the libcpp diagnostics being visually
different from normal gcc diagnostics, especially in the area of quoting
stuff in the diagnostic messages.
Normall GCC diagnostics is gcc_diag/gcc_tdiag, one can use
%</%>, %qs etc. in there, while libcpp diagnostics was marked as printf
and in libcpp we've been very creative with quoting stuff, either
no quotes at all, or "something" quoting, or 'something' quoting, or
`something' quoting (but in none of the cases it used colors consistently
with the rest of the compiler).
Now, libcpp diagnostics is always emitted using a callback,
pfile->cb.diagnostic. On the gcc/ side, this callback is initialized with
genmatch.cc: cb->diagnostic = diagnostic_cb;
c-family/c-opts.cc: cb->diagnostic = c_cpp_diagnostic;
fortran/cpp.cc: cb->diagnostic = cb_cpp_diagnostic;
where the latter two just use diagnostic_report_diagnostic, so actually
support all the gcc_diag stuff, only the genmatch.cc case didn't.
So, the following patch changes genmatch.cc to use pp_format* instead
of vfprintf so that it supports the gcc_diag formatting (pretty-print.o
unfortunately has various dependencies, so had to link genmatch with
libcommon.a libbacktrace.a and tweak Makefile.in so that there are no
circular dependencies) and marks the libcpp diagnostic routines as
gcc_diag rather than printf. That change resulted in hundreds of
-Wformat-diag new warnings (most of them useful and resulting IMHO in
better diagnostics), so the rest of the patch is changing the format
strings to make -Wformat-diag happy and adjusting the testsuite for
the differences in how is the diagnostic reformatted.
Dunno if some out of GCC tree projects use libcpp, that case would
make it harder because one couldn't use vfprintf in the diagnostic
callback anymore, but there is always David's libdiagnostic which could
be used for that purpose IMHO.
2024-10-12 Jakub Jelinek <jakub@redhat.com>
libcpp/
* include/cpplib.h (ATTRIBUTE_CPP_PPDIAG): Define.
(struct cpp_callbacks): Use ATTRIBUTE_CPP_PPDIAG instead of
ATTRIBUTE_FPTR_PRINTF on diagnostic callback.
(cpp_error, cpp_warning, cpp_pedwarning, cpp_warning_syshdr): Use
ATTRIBUTE_CPP_PPDIAG (3, 4) instead of ATTRIBUTE_PRINTF_3.
(cpp_warning_at, cpp_pedwarning_at): Use ATTRIBUTE_CPP_PPDIAG (4, 5)
instead of ATTRIBUTE_PRINTF_4.
(cpp_error_with_line, cpp_warning_with_line, cpp_pedwarning_with_line,
cpp_warning_with_line_syshdr): Use ATTRIBUTE_CPP_PPDIAG (5, 6)
instead of ATTRIBUTE_PRINTF_5.
(cpp_error_at): Use ATTRIBUTE_CPP_PPDIAG (4, 5) instead of
ATTRIBUTE_PRINTF_4.
* Makefile.in (po/$(PACKAGE).pot): Use --language=GCC-source rather
than --language=c.
* errors.cc (cpp_diagnostic_at, cpp_diagnostic,
cpp_diagnostic_with_line): Use ATTRIBUTE_CPP_PPDIAG instead of
-ATTRIBUTE_FPTR_PRINTF.
* charset.cc (cpp_host_to_exec_charset, _cpp_valid_ucn, convert_hex,
convert_oct, convert_escape): Fix up -Wformat-diag warnings.
(cpp_interpret_string_ranges, count_source_chars): Use
ATTRIBUTE_CPP_PPDIAG instead of ATTRIBUTE_FPTR_PRINTF.
(narrow_str_to_charconst): Fix up -Wformat-diag warnings.
* directives.cc (check_eol_1, directive_diagnostics, lex_macro_node,
do_undef, glue_header_name, parse_include, do_include_common,
do_include_next, _cpp_parse_embed_params, do_embed, read_flag,
do_line, do_linemarker, register_pragma_1, do_pragma_once,
do_pragma_push_macro, do_pragma_pop_macro, do_pragma_poison,
do_pragma_system_header, do_pragma_warning_or_error, _cpp_do__Pragma,
do_else, do_elif, do_endif, parse_answer, do_assert,
cpp_define_unused): Likewise.
* expr.cc (cpp_classify_number, parse_defined, eval_token,
_cpp_parse_expr, reduce, check_promotion): Likewise.
* files.cc (_cpp_find_file, finish_base64_embed,
_cpp_pop_file_buffer): Likewise.
* init.cc (sanity_checks): Likewise.
* lex.cc (_cpp_process_line_notes, maybe_warn_bidi_on_char,
_cpp_warn_invalid_utf8, _cpp_skip_block_comment,
warn_about_normalization, forms_identifier_p, maybe_va_opt_error,
identifier_diagnostics_on_lex, cpp_maybe_module_directive): Likewise.
* macro.cc (class vaopt_state, builtin_has_include_1,
builtin_has_include, builtin_has_embed, _cpp_warn_if_unused_macro,
_cpp_builtin_macro_text, builtin_macro, stringify_arg,
_cpp_arguments_ok, collect_args, enter_macro_context,
_cpp_save_parameter, parse_params, create_iso_definition,
_cpp_create_definition, check_trad_stringification): Likewise.
* pch.cc (cpp_valid_state): Likewise.
* traditional.cc (_cpp_scan_out_logical_line, recursive_macro):
Likewise.
gcc/
* Makefile.in (generated_files): Remove {gimple,generic}-match*.
(generated_match_files): New variable. Add a dependency of
$(filter-out $(OBJS-libcommon),$(ALL_HOST_OBJS)) files on those.
(build/genmatch$(build_exeext)): Depend on and link against
libcommon.a and $(LIBBACKTRACE).
* genmatch.cc: Include pretty-print.h and input.h.
(ggc_internal_cleared_alloc, ggc_free): Remove.
(fatal): New function.
(line_table): Remove.
(linemap_client_expand_location_to_spelling_point): Remove.
(diagnostic_cb): Use gcc_diag rather than printf format. Use
pp_format_verbatim on a temporary pretty_printer instead of
vfprintf.
(fatal_at, warning_at): Use gcc_diag rather than printf format.
(output_line_directive): Rename location_hash to loc_hash.
(parser::eat_ident, parser::parse_operation, parser::parse_expr,
parser::parse_pattern, parser::finish_match_operand): Fix up
-Wformat-diag warnings.
gcc/c-family/
* c-lex.cc (c_common_has_attribute,
c_common_lex_availability_macro): Fix up -Wformat-diag warnings.
gcc/testsuite/
* c-c++-common/cpp/counter-2.c: Adjust expected diagnostics for
libcpp diagnostic formatting changes.
* c-c++-common/cpp/embed-3.c: Likewise.
* c-c++-common/cpp/embed-4.c: Likewise.
* c-c++-common/cpp/embed-16.c: Likewise.
* c-c++-common/cpp/embed-18.c: Likewise.
* c-c++-common/cpp/eof-2.c: Likewise.
* c-c++-common/cpp/eof-3.c: Likewise.
* c-c++-common/cpp/fmax-include-depth.c: Likewise.
* c-c++-common/cpp/has-builtin.c: Likewise.
* c-c++-common/cpp/line-2.c: Likewise.
* c-c++-common/cpp/line-3.c: Likewise.
* c-c++-common/cpp/macro-arg-count-1.c: Likewise.
* c-c++-common/cpp/macro-arg-count-2.c: Likewise.
* c-c++-common/cpp/macro-ranges.c: Likewise.
* c-c++-common/cpp/named-universal-char-escape-4.c: Likewise.
* c-c++-common/cpp/named-universal-char-escape-5.c: Likewise.
* c-c++-common/cpp/pr88974.c: Likewise.
* c-c++-common/cpp/va-opt-error.c: Likewise.
* c-c++-common/cpp/va-opt-pedantic.c: Likewise.
* c-c++-common/cpp/Wheader-guard-2.c: Likewise.
* c-c++-common/cpp/Wheader-guard-3.c: Likewise.
* c-c++-common/cpp/Winvalid-utf8-1.c: Likewise.
* c-c++-common/cpp/Winvalid-utf8-2.c: Likewise.
* c-c++-common/cpp/Winvalid-utf8-3.c: Likewise.
* c-c++-common/diagnostic-format-sarif-file-bad-utf8-pr109098-1.c:
Likewise.
* c-c++-common/diagnostic-format-sarif-file-bad-utf8-pr109098-3.c:
Likewise.
* c-c++-common/pr68833-3.c: Likewise.
* c-c++-common/raw-string-directive-1.c: Likewise.
* gcc.dg/analyzer/named-constants-Wunused-macros.c: Likewise.
* gcc.dg/binary-constants-4.c: Likewise.
* gcc.dg/builtin-redefine.c: Likewise.
* gcc.dg/cpp/19951025-1.c: Likewise.
* gcc.dg/cpp/c11-warning-1.c: Likewise.
* gcc.dg/cpp/c11-warning-2.c: Likewise.
* gcc.dg/cpp/c11-warning-3.c: Likewise.
* gcc.dg/cpp/c23-elifdef-2.c: Likewise.
* gcc.dg/cpp/c23-warning-2.c: Likewise.
* gcc.dg/cpp/embed-2.c: Likewise.
* gcc.dg/cpp/embed-3.c: Likewise.
* gcc.dg/cpp/embed-4.c: Likewise.
* gcc.dg/cpp/expr.c: Likewise.
* gcc.dg/cpp/gnu11-elifdef-2.c: Likewise.
* gcc.dg/cpp/gnu11-elifdef-3.c: Likewise.
* gcc.dg/cpp/gnu11-elifdef-4.c: Likewise.
* gcc.dg/cpp/gnu11-warning-1.c: Likewise.
* gcc.dg/cpp/gnu11-warning-2.c: Likewise.
* gcc.dg/cpp/gnu11-warning-3.c: Likewise.
* gcc.dg/cpp/gnu23-warning-2.c: Likewise.
* gcc.dg/cpp/include6.c: Likewise.
* gcc.dg/cpp/pr35322.c: Likewise.
* gcc.dg/cpp/tr-warn6.c: Likewise.
* gcc.dg/cpp/undef2.c: Likewise.
* gcc.dg/cpp/warn-comments.c: Likewise.
* gcc.dg/cpp/warn-comments-2.c: Likewise.
* gcc.dg/cpp/warn-comments-3.c: Likewise.
* gcc.dg/cpp/warn-cxx-compat.c: Likewise.
* gcc.dg/cpp/warn-cxx-compat-2.c: Likewise.
* gcc.dg/cpp/warn-deprecated.c: Likewise.
* gcc.dg/cpp/warn-deprecated-2.c: Likewise.
* gcc.dg/cpp/warn-long-long.c: Likewise.
* gcc.dg/cpp/warn-long-long-2.c: Likewise.
* gcc.dg/cpp/warn-normalized-1.c: Likewise.
* gcc.dg/cpp/warn-normalized-2.c: Likewise.
* gcc.dg/cpp/warn-normalized-3.c: Likewise.
* gcc.dg/cpp/warn-normalized-4-bytes.c: Likewise.
* gcc.dg/cpp/warn-normalized-4-unicode.c: Likewise.
* gcc.dg/cpp/warn-redefined.c: Likewise.
* gcc.dg/cpp/warn-redefined-2.c: Likewise.
* gcc.dg/cpp/warn-traditional.c: Likewise.
* gcc.dg/cpp/warn-traditional-2.c: Likewise.
* gcc.dg/cpp/warn-trigraphs-1.c: Likewise.
* gcc.dg/cpp/warn-trigraphs-2.c: Likewise.
* gcc.dg/cpp/warn-trigraphs-3.c: Likewise.
* gcc.dg/cpp/warn-trigraphs-4.c: Likewise.
* gcc.dg/cpp/warn-undef.c: Likewise.
* gcc.dg/cpp/warn-undef-2.c: Likewise.
* gcc.dg/cpp/warn-unused-macros.c: Likewise.
* gcc.dg/cpp/warn-unused-macros-2.c: Likewise.
* gcc.dg/pch/counter-2.c: Likewise.
* g++.dg/cpp0x/udlit-error1.C: Likewise.
* g++.dg/cpp23/named-universal-char-escape1.C: Likewise.
* g++.dg/cpp23/named-universal-char-escape2.C: Likewise.
* g++.dg/cpp23/Winvalid-utf8-1.C: Likewise.
* g++.dg/cpp23/Winvalid-utf8-2.C: Likewise.
* g++.dg/cpp23/Winvalid-utf8-3.C: Likewise.
* g++.dg/cpp23/Winvalid-utf8-4.C: Likewise.
* g++.dg/cpp23/Winvalid-utf8-5.C: Likewise.
* g++.dg/cpp23/Winvalid-utf8-6.C: Likewise.
* g++.dg/cpp23/Winvalid-utf8-7.C: Likewise.
* g++.dg/cpp23/Winvalid-utf8-8.C: Likewise.
* g++.dg/cpp23/Winvalid-utf8-9.C: Likewise.
* g++.dg/cpp23/Winvalid-utf8-10.C: Likewise.
* g++.dg/cpp23/Winvalid-utf8-11.C: Likewise.
* g++.dg/cpp23/Winvalid-utf8-12.C: Likewise.
* g++.dg/cpp/elifdef-3.C: Likewise.
* g++.dg/cpp/elifdef-5.C: Likewise.
* g++.dg/cpp/elifdef-6.C: Likewise.
* g++.dg/cpp/elifdef-7.C: Likewise.
* g++.dg/cpp/embed-1.C: Likewise.
* g++.dg/cpp/embed-2.C: Likewise.
* g++.dg/cpp/pedantic-errors.C: Likewise.
* g++.dg/cpp/warning-1.C: Likewise.
* g++.dg/cpp/warning-2.C: Likewise.
* g++.dg/ext/bitint1.C: Likewise.
* g++.dg/ext/bitint2.C: Likewise.
|
|
This patch adds a warning switch for "#pragma once in main file". The
warning option name is Wpragma-once-outside-header, which is the same
as Clang provides.
PR preprocessor/89808
gcc/c-family/ChangeLog:
* c.opt (Wpragma_once_outside_header): Define new option.
* c.opt.urls: Regenerate.
gcc/ChangeLog:
* doc/invoke.texi (Warning Options): Document
-Wno-pragma-once-outside-header.
libcpp/ChangeLog:
* include/cpplib.h (cpp_warning_reason): Define
CPP_W_PRAGMA_ONCE_OUTSIDE_HEADER.
* directives.cc (do_pragma_once): Use
CPP_W_PRAGMA_ONCE_OUTSIDE_HEADER.
gcc/testsuite/ChangeLog:
* g++.dg/warn/Wno-pragma-once-outside-header.C: New test.
* g++.dg/warn/Wpragma-once-outside-header.C: New test.
Signed-off-by: Ken Matsui <kmatsui@gcc.gnu.org>
Reviewed-by: Marek Polacek <polacek@redhat.com>
|
|
The following patch implements the clang -Wheader-guard warning, which warns
if a valid multiple inclusion header guard's #ifndef/#if !defined directive
is immediately (no other non-line directives nor other (non-comment)
tokens in between) followed by #define directive for some different macro,
which in get_suggestion rules is close enough to the actual header guard
macro (i.e. likely misspelling), the #define is object-like with empty
definition (I've followed what clang implements) and the macro isn't defined
later on (at least not on the final #endif at the end of a header).
In this case it emits a warning, so that
#ifndef STDIO_H
#define STDOI_H
...
#endif
or similar misspellings can be caught.
clang enables this warning by default, but I've put it into -Wall instead
as it still seems to be a style warning, nothing more severe; if a header
doesn't survive multiple inclusion because of the misspelling, users will
get different diagnostics.
2024-10-02 Jakub Jelinek <jakub@redhat.com>
PR preprocessor/96842
libcpp/
* include/cpplib.h (struct cpp_options): Add warn_header_guard member.
(enum cpp_warning_reason): Add CPP_W_HEADER_GUARD enumerator.
* internal.h (struct cpp_reader): Add mi_def_cmacro, mi_loc and
mi_def_loc members.
(_cpp_defined_macro_p): Constify type pointed by argument type.
Formatting fix.
* init.cc (cpp_create_reader): Clear
CPP_OPTION (pfile, warn_header_guard).
* directives.cc (struct if_stack): Add def_loc and mi_def_cmacro
members.
(DIRECTIVE_TABLE): Add IF_COND flag to define.
(do_define): Set ifs->mi_def_cmacro on a define immediately following
#ifndef directive for the guard. Clear pfile->mi_valid. Formatting
fix.
(do_endif): Copy over pfile->mi_def_cmacro and pfile->mi_def_loc
if ifs->mi_def_cmacro is set and pfile->mi_cmacro isn't a defined
macro.
(push_conditional): Clear mi_def_cmacro and mi_def_loc members.
* files.cc (_cpp_pop_file_buffer): Emit -Wheader-guard diagnostics.
gcc/
* doc/invoke.texi (Wheader-guard): Document.
gcc/c-family/
* c.opt (Wheader-guard): New option.
* c.opt.urls: Regenerated.
* c-ppoutput.cc (init_pp_output): Initialize also cb->get_suggestion.
gcc/testsuite/
* c-c++-common/cpp/Wheader-guard-1.c: New test.
* c-c++-common/cpp/Wheader-guard-1-1.h: New test.
* c-c++-common/cpp/Wheader-guard-1-2.h: New test.
* c-c++-common/cpp/Wheader-guard-1-3.h: New test.
* c-c++-common/cpp/Wheader-guard-1-4.h: New test.
* c-c++-common/cpp/Wheader-guard-1-5.h: New test.
* c-c++-common/cpp/Wheader-guard-1-6.h: New test.
* c-c++-common/cpp/Wheader-guard-1-7.h: New test.
* c-c++-common/cpp/Wheader-guard-1-8.h: New test.
* c-c++-common/cpp/Wheader-guard-1-9.h: New test.
* c-c++-common/cpp/Wheader-guard-1-10.h: New test.
* c-c++-common/cpp/Wheader-guard-1-11.h: New test.
* c-c++-common/cpp/Wheader-guard-1-12.h: New test.
* c-c++-common/cpp/Wheader-guard-2.c: New test.
* c-c++-common/cpp/Wheader-guard-2.h: New test.
* c-c++-common/cpp/Wheader-guard-3.c: New test.
* c-c++-common/cpp/Wheader-guard-3.h: New test.
|
|
Using cpp_pedwarning (CPP_W_PEDANTIC instead of if (CPP_PEDANTIC cpp_error
lets users suppress these diagnostics with
#pragma GCC diagnostic ignored "-Wpedantic".
This patch changes all instances of the cpp_error (CPP_DL_PEDWARN to
cpp_pedwarning. In cases where the extension appears in a later C++
revision, we now condition the warning on the relevant -Wc++??-extensions
flag instead of -Wpedantic; in such cases often the if (CPP_PEDANTIC) check
is retained to preserve the default non-warning behavior.
I didn't attempt to adjust the warning flags for the C compiler, since it
seems to follow a different system than C++.
The CPP_PEDANTIC check is also kept in _cpp_lex_direct to avoid an ICE in
the self-tests from cb.diagnostics not being initialized.
While working on testcases for these changes I noticed that the c-c++-common
tests are not run with -pedantic-errors by default like the gcc.dg and
g++.dg directories are. And if I specify -pedantic-errors with dg-options,
the default -std= changes from c++?? to gnu++??, which interferes with some
other pedwarns. So two of the tests are C++-only.
libcpp/ChangeLog:
* include/cpplib.h (enum cpp_warning_reason): Add
CPP_W_CXX{14,17,20,23}_EXTENSIONS.
* charset.cc (_cpp_valid_ucn, convert_hex, convert_oct)
(convert_escape, narrow_str_to_charconst): Use cpp_pedwarning
instead of cpp_error for pedwarns.
* directives.cc (directive_diagnostics, _cpp_handle_directive)
(do_line, do_elif): Likewise.
* expr.cc (cpp_classify_number, eval_token): Likewise.
* lex.cc (skip_whitespace, maybe_va_opt_error)
(_cpp_lex_direct): Likewise.
* macro.cc (_cpp_arguments_ok): Likewise.
(replace_args): Use -Wvariadic-macros for pedwarn about
empty macro arguments.
gcc/c-family/ChangeLog:
* c.opt: Add CppReason for Wc++{14,17,20,23}-extensions.
* c-pragma.cc (handle_pragma_diagnostic_impl): Don't check
OPT_Wc__23_extensions.
gcc/testsuite/ChangeLog:
* c-c++-common/pragma-diag-17.c: New test.
* g++.dg/cpp0x/va-opt1.C: New test.
* g++.dg/cpp23/named-universal-char-escape3.C: New test.
|
|
The following patch implements the C23 N3017 "#embed - a scannable,
tooling-friendly binary resource inclusion mechanism" paper.
The implementation is intentionally dumb, in that it doesn't significantly
speed up compilation of larger initializers and doesn't make it possible
to use huge #embeds (like several gigabytes large, that is compile time
and memory still infeasible).
There are 2 reasons for this. One is that I think like it is implemented
now in the patch is how we should use it for the smaller #embed sizes,
dunno with which boundary, whether 32 bytes or 64 or something like that,
certainly handling the single byte cases which is something that can appear
anywhere in the source where constant integer literal can appear is
desirable and I think for a few bytes it isn't worth it to come up with
something smarter and users would like to e.g. see it in -E readably as
well (perhaps the slow vs. fast boundary should be determined by command
line option). And the other one is to be able to more easily find
regressions in behavior caused by the optimizations, so we have something
to get back in git to compare against.
I'm definitely willing to work on the optimizations (likely introduce a new
CPP_* token type to refer to a range of libcpp owned memory (start + size)
and similarly some tree which can do the same, and can be at any time e.g.
split into 2 subparts + say INTEGER_CST in between if needed say for
const unsigned char d[] = {
#embed "2GB.dat" prefix (0, 0, ) suffix (, [0x40000000] = 42)
}; still without having to copy around huge amounts of data; STRING_CST
owns the memory it points to and can be only 2GB in size), but would
like to do that incrementally.
And would like to first include some extensions also not included in
this patch, like gnu::offset (off) parameter to allow to skip certain
constant amount of bytes at the start of the files, plus
gnu::base64 ("base64_encoded_data") parameter to add something which can
store more efficiently large amounts of the #embed data in preprocessed
source.
I've been cross-checking all the tests also against the LLVM implementation
https://github.com/llvm/llvm-project/pull/68620
which has been for a few hours even committed to LLVM trunk but reverted
afterwards. LLVM now has the support committed and I admit I haven't
rechecked whether the behavior on the below mentioned spots have been fixed
in it already or not yet.
The patch uses --embed-dir= option that clang plans to add above and doesn't
use other variants on the search directories yet, plus there are no
default directories at least for the time being where to search for embed
files. So, #embed "..." works if it is found in the same directory (or
relative to the current file's directory) and #embed "/..." or #embed </...>
work always, but relative #embed <...> doesn't unless at least one
--embed-dir= is specified. There is no reason to differentiate between
system and non-system directories, so we don't need -isystem like
counterpart, perhaps -iquote like counterpart could be useful in the future,
dunno what else. It has --embed-directory=dir and --embed-directory dir
as aliases.
There are some differences beyond clang ICEs, so I'd like to point them out
to make sure there is agreement on the choices in the patch. They are also
mentioned in the comments of the llvm pull request.
The most important is that the GCC patch (as well as the original thephd.dev
LLVM branch on godbolt) expands #embed (or acts as if it is expanded) into
a mere sequence of numbers like 123,2,35,26 rather then what clang
effectively treats as (unsigned char)123,(unsigned char)2,(unsigned
char)35,(unsigned char)26 but only does that when using integrated
preprocessor, not when using -save-temps where it acts as GCC.
JeanHeyd as the original author agrees that is how it is currently worded in
C23.
Another difference (not tested in the testsuite, not sure how to check for
effective target /dev/urandom nor am sure it is desirable to check that
during testsuite) is how to treat character devices, named pipes etc.
(block devices are errored on). The original paper uses /dev/urandom
in various examples and seems to assume that unlike regular files the
devices aren't really cached, so
#embed </dev/urandom> limit(1) prefix(int a = ) suffix(;)
#embed </dev/urandom> limit(1) prefix(int b = ) suffix(;)
usually results in a != b. That is what the godbolt thephd.dev branch
implements too and what this patch does as well, but clang actually seems
to just go from st.st_size == 0, ergo it must be zero-sized resource and
so just copies over if_empty if present. It is really questionable
what to do about the character devices/named pipes with __has_embed, for
regular files the patch doesn't read anything from them, relies on
st.st_size + limit for whether it is empty or non-empty. But I don't know
of a way to check if read on say a character device would read anything
or not (the </dev/null> limit (1) vs. </dev/zero> limit (1) cases), and
if we read something, that would be better cached for later because
#embed later if it reads again could read no further data even when it
first read something. So, the patch currently for __has_embed just
always returns 2 on the non-regular files, like the thephd.dev
branch does as well and like the clang pull request as well.
A question is also what to do for gnu::offset on the non-regular files
even for #embed, those aren't seekable and do we want to just read and throw
away the offset bytes each time we see it used?
clang also chokes on the
#if __has_embed (__FILE__ __limit__ (1) __prefix__ () suffix (1 / 0) \
__if_empty__ ((({{[0[0{0{0(0(0)1)1}1}]]}})))) != __STDC_EMBED_FOUND__
#error "__has_embed fail"
#endif
in embed-1.c, but thephd.dev branch accepts it and I don't see why
it shouldn't, (({{[0[0{0{0(0(0)1)1}1}]]}}))) is a balanced token
sequence and the file isn't empty, so it should just be parsed and
discarded.
clang also IMHO mishandles
const unsigned char w[] = {
#embed __FILE__ prefix([0] = 42, [15] =) limit(32)
};
but again only without -save-temps, seems like it
treats it as
[0] = 42, [15] = (99,111,110,115,116,32,117,110,115,105,103,110,101,100,
32,99,104,97,114,32,119,91,93,32,61,32,123,10,35,101,109,98)
rather than
[0] = 42, [15] = 99,111,110,115,116,32,117,110,115,105,103,110,101,100,
32,99,104,97,114,32,119,91,93,32,61,32,123,10,35,101,109,98
and warns on it for -Wunused-value and just compiles it as
[0] = 42, [15] = 98
And also
void foo (int, int, int, int);
void bar (void) { foo (
#embed __FILE__ limit (4) prefix (172 + ) suffix (+ 2)
); }
is treated as
172 + (118, 111, 105, 100) + 2
rather than
172 + 118, 111, 105, 100 + 2
which clang -save-temps or GCC treats it like, so results
in just one argument passed rather than 4.
if (!strstr ((const char *) magna_carta, "imprisonétur")) abort ();
in the testcase fails as well, but in that case calling it in gdb succeeds:
p ((char *(*)(char *, char *))__strstr_sse2) (magna_carta, "imprisonétur")
$2 = 0x555555558d3c <magna_carta+11564> "imprisonétur aut disseisiátur"...
so I guess they are just trying to constant evaluate strstr and do it
incorrectly.
They started with making the optimizations together in the initial patch
set, so they don't have the luxury to compare if it is just because of
the optimization they are trying to do or because that is how the
feature works for them. At least unless they use -save-temps for now.
There is also different behavior between clang and gcc on -M or other
dependency generating options. Seems clang includes the __has_embed
searched files in dependencies, while my patch doesn't. But so does
clang for __has_include and GCC doesn't. Emitting a hard dependency
on some header just because there was __has_include/__has_embed for it
seems wrong to me, because (at least when properly written) the source
likely doesn't mind if the file is missing, it will do something else,
so a hard error from make because of it doesn't seem right. Does
make have some weaker dependencies, such that if some file can be remade
it is but if it doesn't exist, it isn't fatal?
I wonder whether #embed <non-existent-file> really needs to be fatal
or whether we could simply after diagnosing it pretend the file exists
and is empty. For #include I think fatal errors make tons of sense,
but perhaps for #embed which is more localized we'd get better error
reporting if we didn't bail out immediately. Note, both GCC and clang
currently treat those as fatal errors.
clang also added -dE option which with -E instead of preprocessing
the #embed directives keeps them as is, but the preprocessed source
then isn't self-contained. That option looks more harmful than useful to
me.
Also, it isn't clear to me from C23 whether it is possible to have
__has_include/__has_c_attribute/__has_embed expressions inside of
the limit #embed/__has_embed argument.
6.10.3.2/2 says that defined should not appear there (and the patch
diagnoses it and testsuite tests), but for __has_include/__has_embed
etc. 6.10.1/11 says:
"The identifiers __has_include, __has_embed, and __has_c_attribute
shall not appear in any context not mentioned in this subclause."
If that subclause in that case means 6.10.1, then it presumably shouldn't
appear in #embed in 6.10.3, but __has_embed is in 6.10.1...
But 6.10.3.2/3 says that it should be parsed according to the 6.10.1
rules. Haven't included tests like
#if __has_embed (__FILE__ limit (__has_embed (__FILE__ limit (1))))
or
#embed __FILE__ limit (__has_include (__FILE__))
into the testsuite because of the doubts but I think the patch should
handle those right now.
The reason I've used Magna Carta text in some of the testcases is that
I hope it shouldn't be copyrighted after the centuries and I'd strongly
prefer not to have binary blobs in git after the xz backdoor lesson
and wanted something larger which doesn't change all the time.
Oh, BTW, I see in C23 draft 6.10.3.2 in Example 4
if (f_source == NULL);
return 1;
(note the spurious semicolon after closing paren), has that been fixed
already?
Like the thephd.dev and clang implementations, the patch always macro
expands the whole #embed and __has_embed directives except for the
embed keyword. That is most likely not what C23 says, my limited
understanding right now is that in #embed one needs to parse the whole
directive line with macro expansion disabled and check if it satisfies the
grammar, if not, the whole directive is macro expanded, if yes, only
the limit parameter argument is macro expanded and the prefix/suffix/if_empty
arguments are maybe macro expanded when actually used (and not at all if
unused). And I think __has_embed macro expansion has conflicting rules.
2024-09-12 Jakub Jelinek <jakub@redhat.com>
PR c/105863
libcpp/
* include/cpplib.h: Implement C23 N3017 #embed - a scannable,
tooling-friendly binary resource inclusion mechanism paper.
(struct cpp_options): Add embed member.
(enum cpp_builtin_type): Add BT_HAS_EMBED.
(cpp_set_include_chains): Add another cpp_dir * argument to
the declaration.
* internal.h (enum include_type): Add IT_EMBED.
(struct cpp_reader): Add embed_include member.
(struct cpp_embed_params_tokens): New type.
(struct cpp_embed_params): New type.
(_cpp_get_token_no_padding): Declare.
(enum _cpp_find_file_kind): Add _cpp_FFK_EMBED and _cpp_FFK_HAS_EMBED.
(_cpp_stack_embed): Declare.
(_cpp_parse_expr): Change return type to cpp_num_part instead of
bool, change second argument from bool to const char * and add third
argument.
(_cpp_parse_embed_params): Declare.
* directives.cc (DIRECTIVE_TABLE): Add embed entry.
(end_directive): Don't call skip_rest_of_line for T_EMBED directive.
(_cpp_handle_directive): Return 2 rather than 1 for T_EMBED in
directives-only mode.
(parse_include): Don't Call check_eol for T_EMBED directive.
(skip_balanced_token_seq): New function.
(EMBED_PARAMS): Define.
(enum embed_param_kind): New type.
(embed_params): New variable.
(_cpp_parse_embed_params): New function.
(do_embed): New function.
(do_if): Adjust _cpp_parse_expr caller.
(do_elif): Likewise.
* expr.cc (parse_defined): Diagnose defined in #embed or __has_embed
parameters.
(_cpp_parse_expr): Change return type to cpp_num_part instead of
bool, change second argument from bool to const char * and add third
argument. Adjust function comment. For #embed/__has_embed parameters
add an artificial CPP_OPEN_PAREN. Use the second argument DIR
directly instead of string literals conditional on IS_IF.
For #embed/__has_embed parameter, stop on reaching CPP_CLOSE_PAREN
matching the artificial one. Diagnose negative or too large embed
parameter operands.
(num_binary_op): Use #embed instead of #if for diagnostics if inside
#embed/__has_embed parameter.
(num_div_op): Likewise.
* files.cc (struct _cpp_file): Add limit member and embed bitfield.
(search_cache): Add IS_EMBED argument, formatting fix. Skip over
files with different file->embed from the argument.
(find_file_in_dir): Don't call pch_open_file if file->embed.
(_cpp_find_file): Handle _cpp_FFK_EMBED and _cpp_FFK_HAS_EMBED.
(read_file_guts): Formatting fix.
(has_unique_contents): Ignore file->embed files.
(search_path_head): Handle IT_EMBED type.
(_cpp_stack_embed): New function.
(_cpp_get_file_stat): Formatting fix.
(cpp_set_include_chains): Add embed argument, save it to
pfile->embed_include and compute lens for the chain.
* init.cc (struct lang_flags): Add embed member.
(lang_defaults): Add embed initializers.
(cpp_set_lang): Initialize CPP_OPTION (pfile, embed).
(builtin_array): Add __has_embed entry.
(cpp_init_builtins): Predefine __STDC_EMBED_NOT_FOUND__,
__STDC_EMBED_FOUND__ and __STDC_EMBED_EMPTY__.
* lex.cc (cpp_directive_only_process): Handle #embed.
* macro.cc (cpp_get_token_no_padding): Rename to ...
(_cpp_get_token_no_padding): ... this. No longer static.
(builtin_has_include_1): New function.
(builtin_has_include): Use it. Use _cpp_get_token_no_padding
instead of cpp_get_token_no_padding.
(builtin_has_embed): New function.
(_cpp_builtin_macro_text): Handle BT_HAS_EMBED.
gcc/
* doc/cppdiropts.texi (--embed-dir=): Document.
* doc/cpp.texi (Binary Resource Inclusion): New chapter.
(__has_embed): Document.
* doc/invoke.texi (Directory Options): Mention --embed-dir=.
* gcc.cc (cpp_unique_options): Add %{-embed*}.
* genmatch.cc (main): Adjust cpp_set_include_chains caller.
* incpath.h (enum incpath_kind): Add INC_EMBED.
* incpath.cc (merge_include_chains): Handle INC_EMBED.
(register_include_chains): Adjust cpp_set_include_chains caller.
gcc/c-family/
* c.opt (-embed-dir=): New option.
(-embed-directory): New alias.
(-embed-directory=): New alias.
* c-opts.cc (c_common_handle_option): Handle OPT__embed_dir_.
gcc/testsuite/
* c-c++-common/cpp/embed-1.c: New test.
* c-c++-common/cpp/embed-2.c: New test.
* c-c++-common/cpp/embed-3.c: New test.
* c-c++-common/cpp/embed-4.c: New test.
* c-c++-common/cpp/embed-5.c: New test.
* c-c++-common/cpp/embed-6.c: New test.
* c-c++-common/cpp/embed-7.c: New test.
* c-c++-common/cpp/embed-8.c: New test.
* c-c++-common/cpp/embed-9.c: New test.
* c-c++-common/cpp/embed-10.c: New test.
* c-c++-common/cpp/embed-11.c: New test.
* c-c++-common/cpp/embed-12.c: New test.
* c-c++-common/cpp/embed-13.c: New test.
* c-c++-common/cpp/embed-14.c: New test.
* c-c++-common/cpp/embed-25.c: New test.
* c-c++-common/cpp/embed-26.c: New test.
* c-c++-common/cpp/embed-dir/embed-1.inc: New test.
* c-c++-common/cpp/embed-dir/embed-3.c: New test.
* c-c++-common/cpp/embed-dir/embed-4.c: New test.
* c-c++-common/cpp/embed-dir/magna-carta.txt: New test.
* gcc.dg/cpp/embed-1.c: New test.
* gcc.dg/cpp/embed-2.c: New test.
* gcc.dg/cpp/embed-3.c: New test.
* gcc.dg/cpp/embed-4.c: New test.
* g++.dg/cpp/embed-1.C: New test.
* g++.dg/cpp/embed-2.C: New test.
* g++.dg/cpp/embed-3.C: New test.
|
|
(§3.3.4)
This patch adds support to our SARIF output for cases where
rich_loc.escape_on_output_p () is true, such as for -Wbidi-chars.
In such cases, the pertinent SARIF "location" object gains a property
bag with property "gcc/escapeNonAscii": true, and the "artifactContent"
within the location's physical location's snippet" gains a "rendered"
property (§3.3.4) that escapes non-ASCII text in the snippet, such as:
"rendered": {"text":
where "text" has a string value such as (for a "trojan source" attack):
"9 | /*<U+202E> } <U+2066>if (isAdmin)<U+2069> <U+2066> begin admins only */\n"
" | ~~~~~~~~ ~~~~~~~~ ^\n"
" | | | |\n"
" | | | end of bidirectional context\n"
" | U+202E (RIGHT-TO-LEFT OVERRIDE) U+2066 (LEFT-TO-RIGHT ISOLATE)\n"
where the escaping is affected by -fdiagnostics-escape-format=; with
-fdiagnostics-escape-format=bytes, the rendered text of the above is:
"9 | /*<e2><80><ae> } <e2><81><a6>if (isAdmin)<e2><81><a9> <e2><81><a6> begin admins only */\n"
" | ~~~~~~~~~~~~ ~~~~~~~~~~~~ ^\n"
" | | | |\n"
" | U+202E (RIGHT-TO-LEFT OVERRIDE) U+2066 (LEFT-TO-RIGHT ISOLATE) end of bidirectional context\n"
The patch also refactors/adds enough selftest machinery to be able to
test the snippet generation from within the selftest framework, rather
than just within DejaGnu (where the regex-based testing isn't
sophisticated enough to verify such properties as the above).
gcc/ChangeLog:
* Makefile.in (OBJS-libcommon): Add selftest-json.o.
* diagnostic-format-sarif.cc: Include "selftest.h",
"selftest-diagnostic.h", "selftest-diagnostic-show-locus.h",
"selftest-json.h", and "text-range-label.h".
(class content_renderer): New.
(sarif_builder::m_rules_arr): Convert to std::unique_ptr.
(sarif_builder::make_location_object): Add class
escape_nonascii_renderer. If rich_loc.escape_on_output_p (),
pass a nonnull escape_nonascii_renderer to
maybe_make_physical_location_object as its snippet_renderer, and
add a property bag property "gcc/escapeNonAscii" to the SARIF
location object. For other overloads of make_location_object,
pass nullptr for the snippet_renderer.
(sarif_builder::maybe_make_region_object_for_context): Add
"snippet_renderer" param and pass it to
maybe_make_artifact_content_object.
(sarif_builder::make_tool_object): Drop "const".
(sarif_builder::make_driver_tool_component_object): Likewise.
Use typesafe unique_ptr variant of object::set for setting "rules"
property on driver_obj.
(sarif_builder::maybe_make_artifact_content_object): Add param "r"
and use it to potentially set the "rendered" property (§3.3.4).
(selftest::test_make_location_object): New.
(selftest::diagnostic_format_sarif_cc_tests): New.
* diagnostic-show-locus.cc: Include "text-range-label.h" and
"selftest-diagnostic-show-locus.h".
(selftests::diagnostic_show_locus_fixture::diagnostic_show_locus_fixture):
New.
(selftests::test_layout_x_offset_display_utf8): Use
diagnostic_show_locus_fixture to simplify and consolidate setup
code.
(selftests::test_diagnostic_show_locus_one_liner): Likewise.
(selftests::test_one_liner_colorized_utf8): Likewise.
(selftests::test_diagnostic_show_locus_one_liner_utf8): Likewise.
* gcc-rich-location.h (class text_range_label): Move to new file
text-range-label.h.
* selftest-diagnostic-show-locus.h: New file, based on material in
diagnostic-show-locus.cc.
* selftest-json.cc: New file.
* selftest-json.h: New file.
* selftest-run-tests.cc (selftest::run_tests): Call
selftest::diagnostic_format_sarif_cc_tests.
* selftest.h (selftest::diagnostic_format_sarif_cc_tests): New decl.
gcc/testsuite/ChangeLog:
* c-c++-common/diagnostic-format-sarif-file-Wbidi-chars.c: Verify
that we have a property bag with property "gcc/escapeNonAscii": true.
Verify that we have a "rendered" property for a snippet.
* gcc.dg/plugin/diagnostic_plugin_test_show_locus.c: Include
"text-range-label.h".
gcc/ChangeLog:
* text-range-label.h: New file, taking class text_range_label from
gcc-rich-location.h.
libcpp/ChangeLog:
* include/rich-location.h
(semi_embedded_vec::semi_embedded_vec): Add copy ctor.
(rich_location::rich_location): Remove "= delete" from decl of
copy ctor. Add deleted decl of move ctor.
(rich_location::operator=): Remove "= delete" from decl of
copy assignment. Add deleted decl of move assignment.
(fixit_hint::fixit_hint): Add copy ctor decl. Add deleted decl of
move.
(fixit_hint::operator=): Add copy assignment decl. Add deleted
decl of move assignment.
* line-map.cc (rich_location::rich_location): New copy ctor.
(fixit_hint::fixit_hint): New copy ctor.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
|
|
Since r6-4582-g8a64515099e645 (which added class rich_location), ranges
of quoted source code have been colorized using the following rules:
- the primary range used the same color of the kind of the diagnostic
i.e. "error" vs "warning" etc (defaulting to bold red and bold magenta
respectively)
- secondary ranges alternate between "range1" and "range2" (defaulting
to green and blue respectively)
This works for cases with large numbers of highlighted ranges, but is
suboptimal for common cases.
The following patch adds a pair of color names: "highlight-a" and
"highlight-b", and uses them whenever it makes sense to highlight and
contrast two different things in the source code (e.g. a type mismatch).
These are used by diagnostic-show-locus.cc for highlighting quoted
source. In addition the patch adds colorization to fragments within the
corresponding diagnostic messages themselves, using consistent
colorization between the message and the quoted source code for the two
different things being contrasted.
For example, consider:
demo.c: In function ‘test_bad_format_string_args’:
../../src/demo.c:25:18: warning: format ‘%i’ expects argument of
type ‘int’, but argument 2 has type ‘const char *’ [-Wformat=]
25 | printf("hello %i", msg);
| ~^ ~~~
| | |
| int const char *
| %s
Previously, the types within the message in quotes would be in bold but
not colorized, and the labelled ranges of quoted source code would use
bold magenta for the "int" and non-bold green for the "const char *".
With this patch:
- the "%i" and "int" in the message and the "int" in the quoted source
are all colored bold green
- the "const char *" in the message and in the quoted source are both
colored bold blue
so that the consistent use of contrasting color draws the reader's eyes
to the relationships between the diagnostic message and the source.
I've tried this with gnome-terminal with many themes, including a
variety of light versus dark backgrounds, solarized versus non-solarized
themes, etc, and it was readable in all.
My initial version of the patch used the existing %r and %R facilities
within pretty-print.cc for the messages, but this turned out to be very
uncomfortable, leading to error-prone format strings such as:
error_at (richloc,
"invalid operands to binary %s (have %<%r%T%R%> and %<%r%T%R%>)",
opname,
"highlight-a", type0,
"highlight-b", type1);
To avoid requiring monstrosities such as the above, the patch adds a new
"%e" format code to pretty-print.cc, which expects a pp_element *, where
pp_element is a new abstract base class (actually a pp_markup::element),
along with various useful subclasses. This lets the above be written
as:
pp_markup::element_quoted_type element_0 (type0, highlight_colors::lhs);
pp_markup::element_quoted_type element_1 (type1, highlight_colors::rhs);
error_at (richloc,
"invalid operands to binary %s (have %e and %e)",
opname, &element_0, &element_1);
which I feel is maintainable and clear to translators; the use of %e and
pp_element * captures the type-unsafe part of the variadic call, and the
subclasses allow for type-safety (so e.g. an element_quoted_type expects
a type and a highlighting color). This approach allows for some nice
simplifications within c-format.cc.
The patch also extends -Wformat to "teach" it about the new %e and
pp_element *. Doing so requires c-format.cc to be able to determine
if a T * is a pp_element * (i.e. if T is a subclass). To do so I added
a new comp_types callback for comparing types, where the C++ frontend
supplies a suitable implementation (and %e will always be wrong for C).
I've manually tested this on many diagnostics with both C and C++ and it
seems a subtle but significant improvement in readability.
I've added a new option -fno-diagnostics-show-highlight-colors in case
people prefer the old behavior.
gcc/c-family/ChangeLog:
* c-common.cc: Include "tree-pretty-print-markup.h".
(binary_op_error): Use pp_markup::element_quoted_type and %e.
(check_function_arguments): Add "comp_types" param and pass it to
check_function_format.
* c-common.h (check_function_arguments): Add "comp_types" param.
(check_function_format): Likewise.
* c-format.cc: Include "tree-pretty-print-markup.h".
(local_pp_element_ptr_node): New.
(PP_FORMAT_CHAR_TABLE): Add entry for %e.
(struct format_check_context): Add "m_comp_types" field.
(check_function_format): Add "comp_types" param and pass it to
check_format_info.
(check_format_info): Likewise, passing it to format_ctx's ctor.
(check_format_arg): Extract m_comp_types from format_ctx and
pass it to check_format_info_main.
(check_format_info_main): Add "comp_types" param and pass it to
arg_parser's ctor.
(class argument_parser): Add "m_comp_types" field.
(argument_parser::check_argument_type): Pass m_comp_types to
check_format_types.
(handle_subclass_of_pp_element_p): New.
(check_format_types): Add "comp_types" param, and use it to
call handle_subclass_of_pp_element_p.
(class element_format_substring): New.
(class element_expected_type_with_indirection): New.
(format_type_warning): Use element_expected_type_with_indirection
to unify the if (wanted_type_name) branches, reducing from four
emit_warning calls to two. Simplify these further using %e.
Doing so also gives suitable colorization of the text within the
diagnostics.
(init_dynamic_diag_info): Initialize local_pp_element_ptr_node.
(selftest::test_type_mismatch_range_labels): Add nullptr for new
param of gcc_rich_location label overload.
* c-format.h (T_PP_ELEMENT_PTR): New.
* c-type-mismatch.cc: Include "diagnostic-highlight-colors.h".
(binary_op_rich_location::binary_op_rich_location): Use
highlight_colors::lhs and highlight_colors::rhs for the ranges.
* c-type-mismatch.h (class binary_op_rich_location): Add comment
about highlight_colors.
gcc/c/ChangeLog:
* c-objc-common.cc: Include "tree-pretty-print-markup.h".
(print_type): Add optional "highlight_color" param and use it
to show highlight colors in "aka" text.
(pp_markup::element_quoted_type::print_type): New.
* c-typeck.cc: Include "tree-pretty-print-markup.h".
(comp_parm_types): New.
(build_function_call_vec): Pass it to check_function_arguments.
(inform_for_arg): Use %e and highlight colors to contrast actual
versus expected.
(convert_for_assignment): Use highlight_colors::actual for the
rhs_label.
(build_binary_op): Use highlight_colors::lhs and highlight_colors::rhs
for the ranges.
gcc/ChangeLog:
* common.opt (fdiagnostics-show-highlight-colors): New option.
* common.opt.urls: Regenerate.
* coretypes.h (pp_markup::element): New forward decl.
(pp_element): New typedef.
* diagnostic-color.cc (gcc_color_defaults): Add "highlight-a"
and "highlight-b".
* diagnostic-format-json.cc (diagnostic_output_format_init_json):
Disable highlight colors.
* diagnostic-format-sarif.cc (diagnostic_output_format_init_sarif):
Likewise.
* diagnostic-highlight-colors.h: New file.
* diagnostic-path.cc (struct event_range): Pass nullptr for
highlight color of m_rich_loc.
* diagnostic-show-locus.cc (colorizer::set_range): Handle ranges
with m_highlight_color.
(colorizer::STATE_NAMED_COLOR): New.
(colorizer::m_richloc): New field.
(colorizer::colorizer): Add richloc param for initializing
m_richloc.
(colorizer::set_named_color): New.
(colorizer::begin_state): Add case STATE_NAMED_COLOR.
(layout::layout): Pass richloc to m_colorizer's ctor.
(selftest::test_one_liner_labels): Pass nullptr for new param of
gcc_rich_location ctor for labels.
(selftest::test_one_liner_labels_utf8): Likewise.
* diagnostic.h (diagnostic_context::set_show_highlight_colors):
New.
* doc/invoke.texi: Add option -fdiagnostics-show-highlight-colors
and highlight-a and highlight-b color caps.
* doc/ux.texi
(Use color consistently when highlighting mismatches): New
subsection.
* gcc-rich-location.cc (gcc_rich_location::add_expr): Add
"highlight_color" param.
(gcc_rich_location::maybe_add_expr): Likewise.
* gcc-rich-location.h (gcc_rich_location::gcc_rich_location):
Split out into a pair of ctors, where if a range_label is supplied
the caller must also supply a highlight color.
(gcc_rich_location::add_expr): Add "highlight_color" param.
(gcc_rich_location::maybe_add_expr): Likewise.
* gcc.cc (driver_handle_option): Handle
OPT_fdiagnostics_show_highlight_colors.
* lto-wrapper.cc (merge_and_complain): Likewise.
(append_compiler_options): Likewise.
(append_diag_options): Likewise.
(run_gcc): Likewise.
* opts-common.cc (decode_cmdline_options_to_array): Add comment
about -fno-diagnostics-show-highlight-colors.
* opts-global.cc (init_options_once): Preserve
pp_show_highlight_colors in case the global_dc's printer is
recreated.
* opts.cc (common_handle_option): Handle
OPT_fdiagnostics_show_highlight_colors.
(gen_command_line_string): Likewise.
* pretty-print-markup.h: New file.
* pretty-print.cc: Include "pretty-print-markup.h" and
"diagnostic-highlight-colors.h".
(pretty_printer::format): Handle %e.
(pretty_printer::pretty_printer): Handle new field
m_show_highlight_colors.
(pp_string_n): New.
(pp_markup::context::begin_quote): New.
(pp_markup::context::end_quote): New.
(pp_markup::context::begin_color): New.
(pp_markup::context::end_color): New.
(highlight_colors::expected): New.
(highlight_colors::actual): New.
(highlight_colors::lhs): New.
(highlight_colors::rhs): New.
(class selftest::test_element): New.
(selftest::test_pp_format): Add tests of %e.
(selftest::test_urlification): Likewise.
* pretty-print.h (pp_markup::context): New forward decl.
(class chunk_info): Add friend class pp_markup::context.
(class pretty_printer): Add friend pp_show_highlight_colors.
(pretty_printer::m_show_highlight_colors): New field.
(pp_show_highlight_colors): New inline function.
(pp_string_n): New decl.
* substring-locations.cc: Include "diagnostic-highlight-colors.h".
(format_string_diagnostic_t::highlight_color_format_string): New.
(format_string_diagnostic_t::highlight_color_param): New.
(format_string_diagnostic_t::emit_warning_n_va): Use highlight
colors.
* substring-locations.h
(format_string_diagnostic_t::highlight_color_format_string): New.
(format_string_diagnostic_t::highlight_color_param): New.
* toplev.cc (general_init): Initialize global_dc's
show_highlight_colors.
* tree-pretty-print-markup.h: New file.
gcc/cp/ChangeLog:
* call.cc: Include "tree-pretty-print-markup.h".
(implicit_conversion_error): Use highlight_colors::percent_h for
the labelled range.
(op_error_string): Split out into...
(concat_op_error_string): ...this.
(binop_error_string): New.
(op_error): Use %e, binop_error_string, highlight_colors::lhs,
and highlight_colors::rhs.
(maybe_inform_about_fndecl_for_bogus_argument_init): Add
"highlight_color" param; use it for the richloc.
(convert_like_internal): Use highlight_colors::percent_h for the
labelled_range, and highlight_colors::percent_i for the call to
maybe_inform_about_fndecl_for_bogus_argument_init.
(build_over_call): Pass cp_comp_parm_types for new "comp_types"
param of check_function_arguments.
(complain_about_bad_argument): Use highlight_colors::percent_h for
the labelled_range, and highlight_colors::percent_i for the call
to maybe_inform_about_fndecl_for_bogus_argument_init.
* cp-tree.h (maybe_inform_about_fndecl_for_bogus_argument_init):
Add optional highlight_color param.
(cp_comp_parm_types): New decl.
(highlight_colors::const percent_h): New decl.
(highlight_colors::const percent_i): New decl.
* error.cc: Include "tree-pretty-print-markup.h".
(highlight_colors::const percent_h): New defn.
(highlight_colors::const percent_i): New defn.
(type_to_string): Add param "highlight_color" and use it.
(print_nonequal_arg): Likewise.
(print_template_differences): Add params "highlight_color_a" and
"highlight_color_b".
(type_to_string_with_compare): Add params "this_highlight_color"
and "peer_highlight_color".
(print_template_tree_comparison): Add params "highlight_color_a"
and "highlight_color_b".
(cxx_format_postprocessor::handle):
Use highlight_colors::percent_h and highlight_colors::percent_i.
(pp_markup::element_quoted_type::print_type): New.
(range_label_for_type_mismatch::get_text): Pass nullptr for new
params of type_to_string_with_compare.
* typeck.cc (cp_comp_parm_types): New.
(cp_build_function_call_vec): Pass it to check_function_arguments.
(convert_for_assignment): Use highlight_colors::percent_h for the
labelled_range.
gcc/testsuite/ChangeLog:
* g++.dg/diagnostic/bad-binary-ops-highlight-colors.C: New test.
* g++.dg/diagnostic/bad-binary-ops-no-highlight-colors.C: New test.
* g++.dg/plugin/plugin.exp (plugin_test_list): Add
show-template-tree-color-no-highlight-colors.C to
show_template_tree_color_plugin.c.
* g++.dg/plugin/show-template-tree-color-labels.C: Update expected
output to reflect use of highlight-a and highlight-b to contrast
mismatches.
* g++.dg/plugin/show-template-tree-color-no-elide-type.C:
Likewise.
* g++.dg/plugin/show-template-tree-color-no-highlight-colors.C:
New test.
* g++.dg/plugin/show-template-tree-color.C: Update expected output
to reflect use of highlight-a and highlight-b to contrast
mismatches.
* g++.dg/warn/Wformat-gcc_diag-1.C: New test.
* g++.dg/warn/Wformat-gcc_diag-2.C: New test.
* g++.dg/warn/Wformat-gcc_diag-3.C: New test.
* gcc.dg/bad-binary-ops-highlight-colors.c: New test.
* gcc.dg/format/colors.c: New test.
* gcc.dg/plugin/diagnostic_plugin_show_trees.c (show_tree): Pass
nullptr for new param of gcc_rich_location::add_expr.
libcpp/ChangeLog:
* include/rich-location.h (location_range::m_highlight_color): New
field.
(rich_location::rich_location): Add optional label_highlight_color
param.
(rich_location::set_highlight_color): New decl.
(rich_location::add_range): Add optional label_highlight_color
param.
(rich_location::set_range): Likewise.
* line-map.cc (rich_location::rich_location): Add
"label_highlight_color" param and pass it to add_range.
(rich_location::set_highlight_color): New.
(rich_location::add_range): Add "label_highlight_color" param.
(rich_location::set_range): Add "highlight_color" param.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
|
|
When adding validation of .sarif files against the schema
(PR testsuite/109360) I discovered various issues where we were
generating invalid .sarif files.
Specifically, in
c-c++-common/diagnostic-format-sarif-file-bad-utf8-pr109098-1.c
the relatedLocations for the "note" diagnostics were missing column
numbers, leading to validation failure due to non-unique elements,
such as multiple:
"message": {"text": "invalid UTF-8 character <bf>"}},
on line 25 with no column information.
Root cause is that for some diagnostics in libcpp we have a location_t
representing the line as a whole, setting a column_override on the
rich_location (since the line hasn't been fully read yet). We were
handling this column override for plain text output, but not for .sarif
output.
Similarly, in diagnostic-format-sarif-file-pr111700.c there is a warning
emitted on "line 0" of the file, whereas SARIF requires line numbers to
be positive.
We also use column == 0 internally to mean "the line as a whole",
whereas SARIF required column numbers to be positive.
This patch fixes these various issues.
gcc/ChangeLog:
PR testsuite/109360
* diagnostic-format-sarif.cc
(sarif_builder::make_location_object): Pass any column override
from rich_loc to maybe_make_physical_location_object.
(sarif_builder::maybe_make_physical_location_object): Add
"column_override" param and pass it to maybe_make_region_object.
(sarif_builder::maybe_make_region_object): Add "column_override"
param and use it when the location has 0 for a column. Don't
add "startLine", "startColumn", "endLine", or "endColumn" if
the values aren't positive.
(sarif_builder::maybe_make_region_object_for_context): Don't
add "startLine" or "endLine" if the values aren't positive.
libcpp/ChangeLog:
PR testsuite/109360
* include/rich-location.h (rich_location::get_column_override):
New accessor.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
|
|
The first new C2Y feature, _Generic where the controlling operand is a
type name rather than an expression (as defined in N3260), was voted
into C2Y today. (In particular, this form of _Generic allows
distinguishing qualified and unqualified versions of a type.) This
feature also includes allowing the generic associations to specify
incomplete and function types.
Add this feature to GCC, along with the -std=c2y, -std=gnu2y and
-Wc23-c2y-compat options to control when and how it is diagnosed. As
usual, the feature is allowed by default in older standards modes,
subject to diagnosis with -pedantic, -pedantic-errors or
-Wc23-c2y-compat.
Bootstrapped with no regressions on x86_64-pc-linux-gnu.
gcc/
* doc/cpp.texi (__STDC_VERSION__): Document C2Y handling.
* doc/invoke.texi (-Wc23-c2y-compat, -std=c2y, -std=gnu2y):
Document options.
(-std=gnu23): Update documentation.
* doc/standards.texi (C Language): Document C2Y. Update C23
description.
* config/rl78/rl78.cc (rl78_option_override): Handle "GNU C2Y"
language name.
* dwarf2out.cc (highest_c_language, gen_compile_unit_die):
Likewise.
gcc/c-family/
* c-common.cc (flag_isoc2y): New.
(flag_isoc99, flag_isoc11, flag_isoc23): Update comments.
* c-common.h (flag_isoc2y): New.
(clk_c, flag_isoc23): Update comments.
* c-opts.cc (set_std_c2y): New.
(c_common_handle_option): Handle OPT_std_c2y and OPT_std_gnu2y.
(set_std_c89, set_std_c99, set_std_c11, set_std_c17, set_std_c23):
Set flag_isoc2y.
(set_std_c23): Update comment.
* c.opt (Wc23-c2y-compat, std=c2y, std=gnu2y): New.
* c.opt.urls: Regenerate.
gcc/c/
* c-errors.cc (pedwarn_c23): New.
* c-parser.cc (disable_extension_diagnostics)
(restore_extension_diagnostics): Save and restore
warn_c23_c2y_compat.
(c_parser_generic_selection): Handle type name as controlling
operand. Allow incomplete and function types subject to
pedwarn_c23 calls.
* c-tree.h (pedwarn_c23): New.
gcc/testsuite/
* gcc.dg/c23-generic-1.c, gcc.dg/c23-generic-2.c,
gcc.dg/c23-generic-3.c, gcc.dg/c23-generic-4.c,
gcc.dg/c2y-generic-1.c, gcc.dg/c2y-generic-2.c,
gcc.dg/c2y-generic-3.c, gcc.dg/gnu2y-generic-1.c: New tests.
* gcc.dg/c23-tag-6.c: Use -pedantic-errors.
libcpp/
* include/cpplib.h (CLK_GNUC2Y, CLK_STDC2Y): New.
* init.cc (lang_defaults): Add GNUC2Y and STDC2Y entries.
(cpp_init_builtins): Define __STDC_VERSION__ to 202500L for GNUC2Y
and STDC2Y.
|
|
No functional change intended.
libcpp/ChangeLog:
* Makefile.in (TAGS_SOURCES): Add include/label-text.h.
* include/label-text.h: New file.
* include/rich-location.h: Include "label-text.h".
(class label_text): Move to label-text.h.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
|
|
This patch adds some ability for links between labelled ranges when
quoting the user's source code, and uses this to add links between
events when printing diagnostic_paths, chopping them up further into
event ranges that can be printed together.
It adds links to the various "from..." - "...to" events in the
analyzer.
For example, previously we emitted this for
c-c++-common/analyzer/infinite-loop-linked-list.c's
while_loop_missing_next':
infinite-loop-linked-list.c:30:10: warning: infinite loop [CWE-835] [-Wanalyzer-infinite-loop]
30 | while (n)
| ^
'while_loop_missing_next': events 1-5
30 | while (n)
| ^
| |
| (1) infinite loop here
| (2) when 'n' is non-NULL: always following 'true' branch...
| (5) ...to here
31 | {
32 | sum += n->val;
| ~~~~~~~~~~~~~
| | |
| | (3) ...to here
| (4) looping back...
whereas with the patch we now emit:
infinite-loop-linked-list.c:30:10: warning: infinite loop [CWE-835] [-Wanalyzer-infinite-loop]
30 | while (n)
| ^
'while_loop_missing_next': events 1-3
30 | while (n)
| ^
| |
| (1) infinite loop here
| (2) when 'n' is non-NULL: always following 'true' branch... ->-+
| |
| |
|+------------------------------------------------------------------------+
31 || {
32 || sum += n->val;
|| ~~~~~~
|| |
|+------------->(3) ...to here
'while_loop_missing_next': event 4
32 | sum += n->val;
| ~~~~^~~~~~~~~
| |
| (4) looping back... ->-+
| |
'while_loop_missing_next': event 5
| |
|+---------------------------------+
30 || while (n)
|| ^
|| |
|+-------->(5) ...to here
which I believe is easier to understand.
The patch also implements the use of unicode characters and colorization
for the lines (not shown in the above example).
There is a new option -fno-diagnostics-show-event-links for getting
back the old behavior (added to -fdiagnostics-plain-output).
gcc/analyzer/ChangeLog:
* checker-event.h (checker_event::connect_to_next_event_p):
Implement new diagnostic_event::connect_to_next_event_p vfunc.
(start_cfg_edge_event::connect_to_next_event_p): Likewise.
(start_consolidated_cfg_edges_event::connect_to_next_event_p):
Likewise.
* infinite-loop.cc (class looping_back_event): New subclass.
(infinite_loop_diagnostic::add_final_event): Use it.
gcc/ChangeLog:
* common.opt (fdiagnostics-show-event-links): New option.
* diagnostic-label-effects.h: New file.
* diagnostic-path.h (diagnostic_event::connect_to_next_event_p):
New pure virtual function.
(simple_diagnostic_event::connect_to_next_event_p): Implement it.
(simple_diagnostic_event::connect_to_next_event): New.
(simple_diagnostic_event::m_connected_to_next_event): New field.
(simple_diagnostic_path::connect_to_next_event): New decl.
* diagnostic-show-locus.cc: Include "text-art/theme.h" and
"diagnostic-label-effects.h".
(colorizer::set_cfg_edge): New.
(layout::m_fallback_theme): New field.
(layout::m_theme): New field.
(layout::m_effect_info): New field.
(layout::m_link_lhs_state): New enum and field.
(layout::m_link_rhs_column): New field.
(layout_range::has_in_edge): New.
(layout_range::has_out_edge): New.
(layout::layout): Add "effect_info" optional param. Initialize
m_theme, m_link_lhs_state, and m_link_rhs_column.
(layout::maybe_add_location_range): Remove stray "FIXME" from
leading comment.
(layout::print_source_line): Replace space after margin with a
call to print_leftmost_column.
(layout::print_leftmost_column): New.
(layout::start_annotation_line): Make non-const. Gain
responsibility for printing the leftmost column after the margin.
(layout::print_annotation_line): Drop pp_space, as this is now
added by start_annotation_line.
(line_label::line_label): Add "has_in_edge" and "has_out_edge"
params and initialize...
(line_label::m_has_in_edge): New field.
(line_label::m_has_out_edge): New field.
(layout::print_any_labels): Pass edge information to line_label
ctor. Keep track of in-edges and out-edges, adding visualizations
of these links between labels.
(layout::print_leading_fixits): Drop pp_character, as this is now
added by start_annotation_line.
(layout::print_trailing_fixits): Fix off-by-one errors in column
calculation.
(layout::move_to_column): Add comment about debugging.
(layout::show_ruler): Make non-const. Drop pp_space calls, as
this is now added by start_annotation_line.
(layout::print_line): Call print_any_right_to_left_edge_lines.
(layout::print_any_right_to_left_edge_lines): New.
(layout::update_any_effects): New.
(gcc_rich_location::add_location_if_nearby): Initialize
loc_range.m_label.
(diagnostic_context::maybe_show_locus): Add "effects" param and
pass it to diagnostic_context::show_locus.
(diagnostic_context::show_locus): Add "effects" param, passing it
to layout's ctor. Call update_any_effects on the layout after
printing the lines.
(selftest::test_layout_x_offset_display_utf8): Update expected
result for eliminated trailing newline.
(selftest::test_layout_x_offset_display_utf8): Likewise.
(selftest::test_layout_x_offset_display_tab): Likewise.
* diagnostic.cc (diagnostic_context::initialize): Initialize
m_source_printing.show_event_links_p.
(simple_diagnostic_path::connect_to_next_event): New.
(simple_diagnostic_event::simple_diagnostic_event): Initialize
m_connected_to_next_event.
* diagnostic.h (class diagnostic_source_effect_info): New forward
decl.
(diagnostic_source_printing_options::show_event_links_p): New
field.
(diagnostic_context::maybe_show_locus): Add optional "effect_info"
param.
(diagnostic_context::show_locus): Add "effect_info" param.
(diagnostic_show_locus): Add optional "effect_info" param.
* doc/invoke.texi: Add -fno-diagnostics-show-event-links.
* lto-wrapper.cc (merge_and_complain): Add
OPT_fdiagnostics_show_event_links to switch.
(append_compiler_options): Likewise.
(append_diag_options): Likewise.
* opts-common.cc (decode_cmdline_options_to_array): Add
"-fno-diagnostics-show-event-links" to -fdiagnostics-plain-output.
* opts.cc (common_handle_option): Add case for
OPT_fdiagnostics_show_event_links.
* text-art/theme.cc (ascii_theme::get_cppchar): Handle
cell_kind::CFG_*.
(unicode_theme::get_cppchar): Likewise.
* text-art/theme.h (theme::cell_kind): Add CFG_*.
* toplev.cc (general_init): Initialize
global_dc->m_source_printing.show_event_links_p.
* tree-diagnostic-path.cc: Define INCLUDE_ALGORITHM,
INCLUDE_MEMORY, and INCLUDE_STRING. Include
"diagnostic-label-effects.h".
(path_label::path_label): Initialize m_effects.
(path_label::get_effects): New.
(class path_label::path_label_effects): New.
(path_label::m_effects): New field.
(class per_thread_summary): Add "friend struct event_range;".
(per_thread_summary::per_thread_summary): Initialize m_last_event.
(per_thread_summary::m_last_event): New field.
(struct event_range::per_source_line_info): New.
(event_range::event_range): Make "t" non-const. Add
"show_event_links" param and use it to initialize
m_show_event_links. Add info for initial event.
(event_range::get_per_source_line_info): New.
(event_range::maybe_add_event): Verify compatibility of the new
label and existing labels with respect to the link-printing code.
Update per-source-line info when an event is added.
(event_range::print): Add"effect_info" param and pass to
diagnostic_show_locus.
(event_range::m_per_thread_summary): Make non-const.
(event_range::m_source_line_info_map): New field.
(event_range::m_show_event_links): New field.
(path_summary::path_summary): Add "show_event_links" optional
param, passing it to event_range ctor calls. Update
pts.m_last_event.
(thread_event_printer::print_swimlane_for_event_range): Add
"effect_info" param and pass it to range->print.
(print_path_summary_as_text): Keep track of the column for any
out-edges at the end of printing each event_range and use as
the leading in-edge for the next event_range.
(default_tree_diagnostic_path_printer): Pass in show_event_links_p
to path_summary ctor.
(selftest::path_events_have_column_data_p): New.
(class selftest::control_flow_test): New.
(selftest::test_control_flow_1): New.
(selftest::test_control_flow_2): New.
(selftest::test_control_flow_3): New.
(selftest::assert_cfg_edge_path_streq): New.
(ASSERT_CFG_EDGE_PATH_STREQ): New macro.
(selftest::test_control_flow_4): New.
(selftest::test_control_flow_5): New.
(selftest::test_control_flow_6): New.
(selftest::control_flow_tests): New.
(selftest::tree_diagnostic_path_cc_tests): Disable colorization on
global_dc's printer. Convert event_pp to a std::unique_ptr. Call
control_flow_tests via for_each_line_table_case.
(gen_command_line_string): Likewise.
gcc/testsuite/ChangeLog:
* gcc.dg/analyzer/event-links-ascii.c: New test.
* gcc.dg/analyzer/event-links-color.c: New test.
* gcc.dg/analyzer/event-links-disabled.c: New test.
* gcc.dg/analyzer/event-links-unicode.c: New test.
libcpp/ChangeLog:
* include/rich-location.h (class label_effects): New forward decl.
(range_label::get_effects): New vfunc.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
|
|
changes in -std=c11 etc. modes [PR114007]
We aren't able to parse __has_attribute (vendor::attr) (and __has_c_attribute
and __has_cpp_attribute) in strict C < C23 modes. While in -std=gnu* modes
or in -std=c23 there is CPP_SCOPE token, in -std=c* (except for -std=c23)
there are is just a pair of CPP_COLON tokens.
The c-lex.cc hunk adds support for that.
That leads to a question if we should return 1 or 0 from
__has_attribute (gnu::unused) or not, because while
[[gnu::unused]] is parsed fine in -std=gnu*/-std=c23 modes (sure, with
pedwarn for < C23), we do not parse it at all in -std=c* (except for
-std=c23), we only parse [[__extension__ gnu::unused]] there. While
the __extension__ in there helps to avoid the pedwarn, I think it is
better to be consistent between GNU and strict C < C23 modes and
parse [[gnu::unused]] too; on the other side, I think parsing
[[__extension__ gnu : : unused]] is too weird and undesirable.
So, the following patch adds a flag during preprocessing at the point
where we normally create CPP_SCOPE tokens out of 2 consecutive colons
on the first CPP_COLON to mark the consecutive case (as we are tight
on the bits, I've reused the PURE_ZERO flag, which is used just by the
C++ FE and only ever set (both C and C++) on CPP_NUMBER tokens, this
new flag has the same value and is only ever used on CPP_COLON tokens)
and instead of checking loose_scope_p argument (i.e. whether it is
[[__extension__ ...]] or not), it just parses CPP_SCOPE or CPP_COLON
with CLONE_SCOPE flag followed by another CPP_COLON the same.
The latter will never appear in >= C23 or -std=gnu* modes, though
guarding its use say with flag_iso && !flag_isoc23 && doesn't really
work because the __extension__ case temporarily clears flag_iso flag.
This makes the -std=c11 etc. behavior more similar to -std=gnu11 or
-std=c23, the only difference I'm aware of are the
#define JOIN2(A, B) A##B
[[vendor JOIN2(:,:) attr]]
[[__extension__ vendor JOIN2(:,:) attr]]
cases, which are accepted in the latter modes, but results in error
in -std=c11; but the error is during preprocessing that :: doesn't
form a valid preprocessing token, which is true, so just don't do that if
you try to have __STRICT_ANSI__ && __STDC_VERSION__ <= 201710L
compatibility.
2024-02-22 Jakub Jelinek <jakub@redhat.com>
PR c/114007
gcc/
* doc/extend.texi: (__extension__): Remove comments about scope
tokens vs. two colons.
gcc/c-family/
* c-lex.cc (c_common_has_attribute): Parse 2 CPP_COLONs with
the first one with COLON_SCOPE flag the same as CPP_SCOPE.
gcc/c/
* c-parser.cc (c_parser_std_attribute): Remove loose_scope_p argument.
Instead of checking it, parse 2 CPP_COLONs with the first one with
COLON_SCOPE flag the same as CPP_SCOPE.
(c_parser_std_attribute_list): Remove loose_scope_p argument, don't
pass it to c_parser_std_attribute.
(c_parser_std_attribute_specifier): Adjust c_parser_std_attribute_list
caller.
gcc/testsuite/
* gcc.dg/c23-attr-syntax-6.c: Adjust testcase for :: being valid
even in -std=c11 even without __extension__ and : : etc. not being
valid anymore even with __extension__.
* gcc.dg/c23-attr-syntax-7.c: Likewise.
* gcc.dg/c23-attr-syntax-8.c: New test.
libcpp/
* include/cpplib.h (COLON_SCOPE): Define to PURE_ZERO.
* lex.cc (_cpp_lex_direct): When lexing CPP_COLON with another
colon after it, if !CPP_OPTION (pfile, scope) set COLON_SCOPE
flag on the first CPP_COLON token.
|
|
This commit adds a new function intended for checking the XID properties
of a possibly unicode character, as well as the accompanying enum
describing the possible properties.
libcpp/ChangeLog:
* charset.cc (cpp_check_xid_property): New.
* include/cpplib.h
(cpp_check_xid_property): New.
(enum cpp_xid_property): New.
Signed-off-by: Raiki Tamura <tamaron1203@gmail.com>
|
|
|
|
This patch implements clang's __has_feature and __has_extension in GCC.
Currently the patch aims to implement all documented features (and some
undocumented ones) following the documentation at
https://clang.llvm.org/docs/LanguageExtensions.html with the exception
of the legacy features for C++ type traits. These are omitted, since as
the clang documentation notes, __has_builtin is the correct "modern" way
to query for these (which GCC already implements).
gcc/c-family/ChangeLog:
PR c++/60512
* c-common.cc (struct hf_feature_info): New.
(c_common_register_feature): New.
(init_has_feature): New.
(has_feature_p): New.
* c-common.h (c_common_has_feature): New.
(c_family_register_lang_features): New.
(c_common_register_feature): New.
(has_feature_p): New.
* c-lex.cc (init_c_lex): Plumb through has_feature callback.
(c_common_has_builtin): Generalize and move common part ...
(c_common_lex_availability_macro): ... here.
(c_common_has_feature): New.
* c-ppoutput.cc (init_pp_output): Plumb through has_feature.
gcc/c/ChangeLog:
PR c++/60512
* c-lang.cc (c_family_register_lang_features): New.
* c-objc-common.cc (struct c_feature_info): New.
(c_register_features): New.
* c-objc-common.h (c_register_features): New.
gcc/cp/ChangeLog:
PR c++/60512
* cp-lang.cc (c_family_register_lang_features): New.
* cp-objcp-common.cc (struct cp_feature_selector): New.
(cp_feature_selector::has_feature): New.
(struct cp_feature_info): New.
(cp_register_features): New.
* cp-objcp-common.h (cp_register_features): New.
gcc/ChangeLog:
PR c++/60512
* doc/cpp.texi: Document __has_{feature,extension}.
gcc/objc/ChangeLog:
PR c++/60512
* objc-act.cc (struct objc_feature_info): New.
(objc_nonfragile_abi_p): New.
(objc_common_register_features): New.
* objc-act.h (objc_common_register_features): New.
* objc-lang.cc (c_family_register_lang_features): New.
gcc/objcp/ChangeLog:
PR c++/60512
* objcp-lang.cc (c_family_register_lang_features): New.
libcpp/ChangeLog:
PR c++/60512
* include/cpplib.h (struct cpp_callbacks): Add has_feature.
(enum cpp_builtin_type): Add BT_HAS_{FEATURE,EXTENSION}.
* init.cc: Add __has_{feature,extension}.
* macro.cc (_cpp_builtin_macro_text): Handle
BT_HAS_{FEATURE,EXTENSION}.
gcc/testsuite/ChangeLog:
PR c++/60512
* c-c++-common/has-feature-common.c: New test.
* c-c++-common/has-feature-pedantic.c: New test.
* g++.dg/ext/has-feature.C: New test.
* gcc.dg/asan/has-feature-asan.c: New test.
* gcc.dg/has-feature.c: New test.
* gcc.dg/ubsan/has-feature-ubsan.c: New test.
* obj-c++.dg/has-feature.mm: New test.
* objc.dg/has-feature.m: New test.
Co-Authored-By: Iain Sandoe <iain@sandoe.co.uk>
|
|
The various decls relating to rich_location are in
libcpp/include/line-map.h, but they don't relate to line maps.
Split them out to their own header: libcpp/include/rich-location.h
No functional change intended.
gcc/ChangeLog:
* Makefile.in (CPPLIB_H): Add libcpp/include/rich-location.h.
* coretypes.h (class rich_location): New forward decl.
gcc/analyzer/ChangeLog:
* analyzer.h: Include "rich-location.h".
gcc/c-family/ChangeLog:
* c-lex.cc: Include "rich-location.h".
gcc/cp/ChangeLog:
* mapper-client.cc: Include "rich-location.h".
gcc/ChangeLog:
* diagnostic.h: Include "rich-location.h".
* edit-context.h (class fixit_hint): New forward decl.
* gcc-rich-location.h: Include "rich-location.h".
* genmatch.cc: Likewise.
* pretty-print.h: Likewise.
gcc/rust/ChangeLog:
* rust-location.h: Include "rich-location.h".
libcpp/ChangeLog:
* Makefile.in (TAGS_SOURCES): Add "include/rich-location.h".
* include/cpplib.h (class rich_location): New forward decl.
* include/line-map.h (class range_label)
(enum range_display_kind, struct location_range)
(class semi_embedded_vec, class rich_location, class label_text)
(class range_label, class fixit_hint): Move to...
* include/rich-location.h: ...this new file.
* internal.h: Include "rich-location.h".
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
|
|
Reduce implicit usage of line_table global, and move source printing to
within diagnostic_context.
gcc/ChangeLog:
* diagnostic-show-locus.cc (layout::m_line_table): New field.
(compatible_locations_p): Convert to...
(layout::compatible_locations_p): ...this, replacing uses of
line_table global with m_line_table.
(layout::layout): Convert "richloc" param from a pointer to a
const reference. Initialize m_line_table member.
(layout::maybe_add_location_range): Replace uses of line_table
global with m_line_table. Pass the latter to
linemap_client_expand_location_to_spelling_point.
(layout::print_leading_fixits): Pass m_line_table to
affects_line_p.
(layout::print_trailing_fixits): Likewise.
(gcc_rich_location::add_location_if_nearby): Update for change
to layout ctor params.
(diagnostic_show_locus): Convert to...
(diagnostic_context::maybe_show_locus): ...this, converting
richloc param from a pointer to a const reference. Make "loc"
const. Split out printing part of function to...
(diagnostic_context::show_locus): ...this.
(selftest::test_offset_impl): Update for change to layout ctor
params.
(selftest::test_layout_x_offset_display_utf8): Likewise.
(selftest::test_layout_x_offset_display_tab): Likewise.
(selftest::test_tab_expansion): Likewise.
* diagnostic.h (diagnostic_context::maybe_show_locus): New decl.
(diagnostic_context::show_locus): New decl.
(diagnostic_show_locus): Convert from a decl to an inline function.
* gdbinit.in (break-on-diagnostic): Update from a breakpoint
on diagnostic_show_locus to one on
diagnostic_context::maybe_show_locus.
* genmatch.cc (linemap_client_expand_location_to_spelling_point):
Add "set" param and use it in place of line_table global.
* input.cc (expand_location_1): Likewise.
(expand_location): Update for new param of expand_location_1.
(expand_location_to_spelling_point): Likewise.
(linemap_client_expand_location_to_spelling_point): Add "set"
param and use it in place of line_table global.
* tree-diagnostic-path.cc (event_range::print): Pass line_table
for new param of linemap_client_expand_location_to_spelling_point.
libcpp/ChangeLog:
* include/line-map.h (rich_location::get_expanded_location): Make
const.
(rich_location::get_line_table): New accessor.
(rich_location::m_line_table): Make the pointer be const.
(rich_location::m_have_expanded_location): Make mutable.
(rich_location::m_expanded_location): Likewise.
(fixit_hint::affects_line_p): Add const line_maps * param.
(linemap_client_expand_location_to_spelling_point): Likewise.
* line-map.cc (rich_location::get_expanded_location): Make const.
Pass m_line_table to
linemap_client_expand_location_to_spelling_point.
(rich_location::maybe_add_fixit): Likewise.
(fixit_hint::affects_line_p): Add set param and pass to
linemap_client_expand_location_to_spelling_point.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
|
|
Continuing the move to refer to C23 in place of C2X throughout the
source tree, update documentation, diagnostics, comments, variable and
function names, etc., to use the C23 name.
Testsuite updates are left for a future patch, except for testcases
that test diagnostics that previously mentioned C2X (but in those
testcases, sometimes other comments are updated, not just the
diagnostic expectations).
Bootstrapped with no regressions for x86_64-pc-linux-gnu.
gcc/
* builtins.def (DEF_C2X_BUILTIN): Rename to DEF_C23_BUILTIN and
use flag_isoc23 and function_c23_misc.
* config/rl78/rl78.cc (rl78_option_override): Compare
lang_hooks.name with "GNU C23" not "GNU C2X".
* coretypes.h (function_c2x_misc): Rename to function_c23_misc.
* doc/cpp.texi (@code{__has_attribute}): Refer to C23 instead of
C2x.
* doc/extend.texi: Likewise.
* doc/invoke.texi: Likewise.
* dwarf2out.cc (highest_c_language, gen_compile_unit_die): Compare
against and return "GNU C23" language string instead of "GNU C2X".
* ginclude/float.h: Refer to C23 instead of C2X in comments.
* ginclude/stdint-gcc.h: Likewise.
* glimits.h: Likewise.
* tree.h: Likewise.
gcc/ada/
* gcc-interface/utils.cc (flag_isoc2x): Rename to flag_isoc23.
gcc/c-family/
* c-common.cc (flag_isoc2x): Rename to flag_isoc23.
(c_common_reswords): Use D_C23 instead of D_C2X.
* c-common.h: Refer throughout to C23 instead of C2X in comments.
(D_C2X): Rename to D_C23.
(flag_isoc2x): Rename to flag_isoc23.
* c-cppbuiltin.cc (builtin_define_float_constants): Use
flag_isoc23 instead of flag_isoc2x. Refer to C23 instead of C2x
in comments.
* c-format.cc: Use STD_C23 instead of STD_C2X and flag_isoc23
instead of flag_isoc2x. Refer to C23 instead of C2X in comments.
* c-format.h: Use STD_C23 instead of STD_C2X.
* c-lex.cc: Use warn_c11_c23_compat instead of warn_c11_c2x_compat
and flag_isoc23 instead of flag_isoc2x. Refer to C23 instead of
C2X in diagnostics.
* c-opts.cc: Use flag_isoc23 instead of flag_isoc2x. Refer to C23
instead of C2X in comments.
(set_std_c2x): Rename to set_std_c23.
* c.opt (Wc11-c23-compat): Use CPP(cpp_warn_c11_c23_compat)
CppReason(CPP_W_C11_C23_COMPAT) Var(warn_c11_c23_compat) instead
of CPP(cpp_warn_c11_c2x_compat) CppReason(CPP_W_C11_C2X_COMPAT)
Var(warn_c11_c2x_compat).
gcc/c/
* c-decl.cc: Use flag_isoc23 instead of flag_isoc2x and c23_auto_p
instead of c2x_auto_p. Refer to C23 instead of C2X in diagnostics
and comments.
* c-errors.cc: Use flag_isoc23 instead of flag_isoc2x and
warn_c11_c23_compat instead of warn_c11_c2x_compat. Refer to C23
instead of C2X in comments.
* c-parser.cc: Use flag_isoc23 instead of flag_isoc2x,
warn_c11_c23_compat instead of warn_c11_c2x_compat, c23_auto_p
instead of c2x_auto_p and D_C23 instead of D_C2X. Refer to C23
instead of C2X in diagnostics and comments.
* c-tree.h: Refer to C23 instead of C2X in comments.
(struct c_declspecs): Rename c2x_auto_p to c23_auto_p.
* c-typeck.cc: Use flag_isoc23 instead of flag_isoc2x and
warn_c11_c23_compat instead of warn_c11_c2x_compat. Refer to C23
instead of C2X in diagnostics and comments.
gcc/fortran/
* gfortran.h (gfc_real_info): Refer to C23 instead of C2X in
comment.
gcc/lto/
* lto-lang.cc (flag_isoc2x): Rename to flag_isoc23.
gcc/testsuite/
* gcc.dg/binary-constants-2.c: Refer to C23 instead of C2X.
* gcc.dg/binary-constants-3.c: Likewise.
* gcc.dg/bitint-23.c: Likewise.
* gcc.dg/bitint-26.c: Likewise.
* gcc.dg/bitint-27.c: Likewise.
* gcc.dg/c11-attr-syntax-1.c: Likewise.
* gcc.dg/c11-attr-syntax-2.c: Likewise.
* gcc.dg/c11-floatn-1.c: Likewise.
* gcc.dg/c11-floatn-2.c: Likewise.
* gcc.dg/c11-floatn-3.c: Likewise.
* gcc.dg/c11-floatn-4.c: Likewise.
* gcc.dg/c11-floatn-5.c: Likewise.
* gcc.dg/c11-floatn-6.c: Likewise.
* gcc.dg/c11-floatn-7.c: Likewise.
* gcc.dg/c11-floatn-8.c: Likewise.
* gcc.dg/c2x-attr-syntax-4.c: Likewise.
* gcc.dg/c2x-attr-syntax-6.c: Likewise.
* gcc.dg/c2x-attr-syntax-7.c: Likewise.
* gcc.dg/c2x-binary-constants-2.c: Likewise.
* gcc.dg/c2x-floatn-5.c: Likewise.
* gcc.dg/c2x-floatn-6.c: Likewise.
* gcc.dg/c2x-floatn-7.c: Likewise.
* gcc.dg/c2x-floatn-8.c: Likewise.
* gcc.dg/c2x-nullptr-4.c: Likewise.
* gcc.dg/c2x-qual-2.c: Likewise.
* gcc.dg/c2x-qual-3.c: Likewise.
* gcc.dg/c2x-qual-6.c: Likewise.
* gcc.dg/cpp/c11-warning-1.c: Likewise.
* gcc.dg/cpp/c11-warning-2.c: Likewise.
* gcc.dg/cpp/c11-warning-3.c: Likewise.
* gcc.dg/cpp/c2x-warning-2.c: Likewise.
* gcc.dg/cpp/gnu11-elifdef-3.c: Likewise.
* gcc.dg/cpp/gnu11-elifdef-4.c: Likewise.
* gcc.dg/cpp/gnu11-warning-1.c: Likewise.
* gcc.dg/cpp/gnu11-warning-2.c: Likewise.
* gcc.dg/cpp/gnu11-warning-3.c: Likewise.
* gcc.dg/cpp/gnu2x-warning-2.c: Likewise.
* gcc.dg/dfp/c11-constants-1.c: Likewise.
* gcc.dg/dfp/c11-constants-2.c: Likewise.
* gcc.dg/dfp/c2x-constants-2.c: Likewise.
* gcc.dg/dfp/constants-pedantic.c: Likewise.
* gcc.dg/pr30260.c: Likewise.
* gcc.dg/system-binary-constants-1.c: Likewise.
libcpp/
* directives.cc: Refer to C23 instead of C2X in diagnostics and
comments.
(STDC2X): Rename to STDC23.
* expr.cc: Use cpp_warn_c11_c23_compat instead of
cpp_warn_c11_c2x_compat and CPP_W_C11_C23_COMPAT instead of
CPP_W_C11_C2X_COMPAT. Refer to C23 instead of C2X in diagnostics
and comments.
* include/cpplib.h: Refer to C23 instead of C2X in diagnostics and
comments.
(CLK_GNUC2X): Rename to CLK_GNUC23.
(CLK_STDC2X): Rename to CLK_STDC23.
(CPP_W_C11_C2X_COMPAT): Rename to CPP_W_C11_C23_COMPAT.
* init.cc: Use GNUC23 instead of GNUC2X, STDC23 instead of STDC2X
and cpp_warn_c11_c23_compat instead of cpp_warn_c11_c2x_compat.
* lex.cc (maybe_va_opt_error): Refer to C23 instead of C2X in
diagnostic.
* macro.cc (_cpp_arguments_ok): Refer to C23 instead of C2X in
comment.
|
|
The following patch implements C++26 unevaluated-string.
As it seems to me just extra pedanticity, it is implemented only for
-std=c++26 or -std=gnu++26 and later and only if -pedantic/-pedantic-errors.
Nothing is done for inline asm, while the spec changes those, it changes it
to a balanced token sequence with implementation defined rules on what is
and isn't allowed (so pedantically accepting asm ("" : "+m" (x));
was accepts-invalid before C++26, but we didn't diagnose anything).
For the other spots mentioned in the paper, static_assert message,
linkage specification, deprecated/nodiscard attributes it enforces the
requirements (no prefixes, udlit suffixes, no octal/hexadecimal escapes
(conditional escape sequences were rejected with pedantic already before).
For the deprecated operator "" identifier case I've kept things as is,
because everything seems to have been diagnosed already (a lot being implied
from the string having to be empty).
2023-11-02 Jakub Jelinek <jakub@redhat.com>
PR c++/110342
gcc/cp/
* parser.cc: Implement C++26 P2361R6 - Unevaluated strings.
(uneval_string_attr): New enumerator.
(cp_parser_string_literal_common): Add UNEVAL argument. If true,
pass CPP_UNEVAL_STRING rather than CPP_STRING to
cpp_interpret_string_notranslate.
(cp_parser_string_literal, cp_parser_userdef_string_literal): Adjust
callers of cp_parser_string_literal_common.
(cp_parser_unevaluated_string_literal): New function.
(cp_parser_parenthesized_expression_list): Handle uneval_string_attr.
(cp_parser_linkage_specification): Use
cp_parser_unevaluated_string_literal for C++26.
(cp_parser_static_assert): Likewise.
(cp_parser_std_attribute): Use uneval_string_attr for standard
deprecated and nodiscard attributes.
gcc/testsuite/
* g++.dg/cpp26/unevalstr1.C: New test.
* g++.dg/cpp26/unevalstr2.C: New test.
* g++.dg/cpp0x/udlit-error1.C (lol): Expect an error for C++26
about user-defined literal in deprecated attribute.
libcpp/
* include/cpplib.h (TTYPE_TABLE): Add CPP_UNEVAL_STRING literal
entry. Use C++11 instead of C++-0x in comments.
* charset.cc (convert_escape): Add UNEVAL argument, if true,
pedantically diagnose numeric escape sequences.
(cpp_interpret_string_1): Formatting fix. Adjust convert_escape
caller.
(cpp_interpret_string): Formatting string.
(cpp_interpret_string_notranslate): Pass type through to
cpp_interpret_string if it is CPP_UNEVAL_STRING.
|
|
This patch eliminates the function "MACRO_MAP_EXPANSION_POINT_LOCATION"
(which hasn't been a macro since r6-739-g0501dbd932a7e9) in favor of
a new line_map_macro::get_expansion_point_location accessor.
No functional change intended.
gcc/c-family/ChangeLog:
* c-warn.cc (warn_for_multistatement_macros): Update for removal
of MACRO_MAP_EXPANSION_POINT_LOCATION.
gcc/cp/ChangeLog:
* module.cc (ordinary_loc_of): Update for removal of
MACRO_MAP_EXPANSION_POINT_LOCATION.
(module_state::note_location): Update for renaming of field.
(module_state::write_macro_maps): Likewise.
gcc/ChangeLog:
* input.cc (dump_location_info): Update for removal of
MACRO_MAP_EXPANSION_POINT_LOCATION.
* tree-diagnostic.cc (maybe_unwind_expanded_macro_loc):
Likewise.
libcpp/ChangeLog:
* include/line-map.h
(line_map_macro::get_expansion_point_location): New accessor.
(line_map_macro::expansion): Rename field to...
(line_map_macro::mexpansion): Rename field to...
(MACRO_MAP_EXPANSION_POINT_LOCATION): Delete this function.
* line-map.cc (linemap_enter_macro): Update for renaming of field.
(linemap_macro_map_loc_to_exp_point): Update for removal of
MACRO_MAP_EXPANSION_POINT_LOCATION.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
|
|
The PR requests an enhancement to the diagnostic issued for the use of a
poisoned identifier. Currently, we show the location of the usage, but not
the location which requested the poisoning, which would be helpful for the
user if the decision to poison an identifier was made externally, such as
in a library header.
In order to output this information, we need to remember a location_t for
each identifier that has been poisoned, and that data needs to be preserved
as well in a PCH. One option would be to add a field to struct cpp_hashnode,
but there is no convenient place to add it without increasing the size of
the struct for all identifiers. Given this facility will be needed rarely,
it seemed better to add a second hash map, which is handled PCH-wise the
same as the current one in gcc/stringpool.cc. This hash map associates a new
struct cpp_hashnode_extra with each identifier that needs one. Currently
that struct only contains the new location_t, but it could be extended in
the future if there is other ancillary data that may be convenient to put
there for other purposes.
libcpp/ChangeLog:
PR preprocessor/36887
* directives.cc (do_pragma_poison): Store in the extra hash map the
location from which an identifier has been poisoned.
* lex.cc (identifier_diagnostics_on_lex): When issuing a diagnostic
for the use of a poisoned identifier, also add a note indicating the
location from which it was poisoned.
* identifiers.cc (alloc_node): Convert to template function.
(_cpp_init_hashtable): Handle the new extra hash map.
(_cpp_destroy_hashtable): Likewise.
* include/cpplib.h (struct cpp_hashnode_extra): New struct.
(cpp_create_reader): Update prototype to...
* init.cc (cpp_create_reader): ...accept an argument for the extra
hash table and pass it to _cpp_init_hashtable.
* include/symtab.h (ht_lookup): New overload for convenience.
* internal.h (struct cpp_reader): Add EXTRA_HASH_TABLE member.
(_cpp_init_hashtable): Adjust prototype.
gcc/c-family/ChangeLog:
PR preprocessor/36887
* c-opts.cc (c_common_init_options): Pass new extra hash map
argument to cpp_create_reader().
gcc/ChangeLog:
PR preprocessor/36887
* toplev.h (ident_hash_extra): Declare...
* stringpool.cc (ident_hash_extra): ...this new global variable.
(init_stringpool): Handle ident_hash_extra as well as ident_hash.
(ggc_mark_stringpool): Likewise.
(ggc_purge_stringpool): Likewise.
(struct string_pool_data_extra): New struct.
(spd2): New GC root variable.
(gt_pch_save_stringpool): Use spd2 to handle ident_hash_extra,
analogous to how spd is used to handle ident_hash.
(gt_pch_restore_stringpool): Likewise.
gcc/testsuite/ChangeLog:
PR preprocessor/36887
* c-c++-common/cpp/diagnostic-poison.c: New test.
* g++.dg/pch/pr36887.C: New test.
* g++.dg/pch/pr36887.Hs: New test.
|
|
libcpp/ChangeLog:
* include/line-map.h (LINEMAPS_ORDINARY_MAPS): Delete.
(LINEMAPS_MACRO_MAPS): Delete.
* line-map.cc (linemap_tracks_macro_expansion_locs_p): Update for
deletion of LINEMAPS_MACRO_MAPS.
(linemap_get_statistics): Likewise.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
|
|
It's simpler to use field access than to go through these inline
functions that look as if they are macros.
No functional change intended.
libcpp/ChangeLog:
* include/line-map.h (maps_info_ordinary::cache): Rename to...
(maps_info_ordinary::m_cache): ...this.
(maps_info_macro::cache): Rename to...
(maps_info_macro::m_cache): ...this.
(LINEMAPS_CACHE): Delete.
(LINEMAPS_ORDINARY_CACHE): Delete.
(LINEMAPS_MACRO_CACHE): Delete.
* init.cc (read_original_filename): Update for adding "m_" prefix.
* line-map.cc (linemap_add): Eliminate LINEMAPS_ORDINARY_CACHE in
favor of a simple field access.
(linemap_enter_macro): Likewise for LINEMAPS_MACRO_CACHE.
(linemap_ordinary_map_lookup): Likewise for
LINEMAPS_ORDINARY_CACHE, twice.
(linemap_lookup_macro_index): Likewise for LINEMAPS_MACRO_CACHE.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
|
|
Nothing uses these; delete them.
libcpp/ChangeLog:
* include/line-map.h (LINEMAPS_LAST_ALLOCATED_MAP): Delete.
(LINEMAPS_LAST_ALLOCATED_ORDINARY_MAP): Delete.
(LINEMAPS_LAST_ALLOCATED_MACRO_MAP): Delete.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
|
|
This patch eliminates the function "COMBINE_LOCATION_DATA" (which hasn't
been a macro since r6-739-g0501dbd932a7e9) and the function
"get_combined_adhoc_loc" in favor of a new
line_maps::get_or_create_combined_loc member function.
No functional change intended.
gcc/cp/ChangeLog:
* module.cc (module_state::read_location): Update for renaming of
get_combined_adhoc_loc.
gcc/ChangeLog:
* genmatch.cc (main): Update for "m_" prefix of some fields of
line_maps.
* input.cc (make_location): Update for removal of
COMBINE_LOCATION_DATA.
(dump_line_table_statistics): Update for "m_" prefix of some
fields of line_maps.
(location_with_discriminator): Update for removal of
COMBINE_LOCATION_DATA.
(line_table_test::line_table_test): Update for "m_" prefix of some
fields of line_maps.
* toplev.cc (general_init): Likewise.
* tree.cc (set_block): Update for removal of
COMBINE_LOCATION_DATA.
(set_source_range): Likewise.
libcpp/ChangeLog:
* include/line-map.h (line_maps::reallocator): Rename to...
(line_maps::m_reallocator): ...this.
(line_maps::round_alloc_size): Rename to...
(line_maps::m_round_alloc_size): ...this.
(line_maps::location_adhoc_data_map): Rename to...
(line_maps::m_location_adhoc_data_map): ...this.
(line_maps::num_optimized_ranges): Rename to...
(line_maps::m_num_optimized_ranges): ..this.
(line_maps::num_unoptimized_ranges): Rename to...
(line_maps::m_num_unoptimized_ranges): ...this.
(get_combined_adhoc_loc): Delete decl.
(COMBINE_LOCATION_DATA): Delete.
* lex.cc (get_location_for_byte_range_in_cur_line): Update for
removal of COMBINE_LOCATION_DATA.
(warn_about_normalization): Likewise.
(_cpp_lex_direct): Likewise.
* line-map.cc (line_maps::~line_maps): Update for "m_" prefix of
some fields of line_maps.
(rebuild_location_adhoc_htab): Likewise.
(can_be_stored_compactly_p): Convert to...
(line_maps::can_be_stored_compactly_p): ...this private member
function.
(get_combined_adhoc_loc): Convert to...
(line_maps::get_or_create_combined_loc): ...this public member
function.
(line_maps::make_location): Update for removal of
COMBINE_LOCATION_DATA.
(get_data_from_adhoc_loc): Update for "m_" prefix of some fields
of line_maps.
(get_discriminator_from_adhoc_loc): Likewise.
(get_location_from_adhoc_loc): Likewise.
(get_range_from_adhoc_loc): Convert to...
(line_maps::get_range_from_adhoc_loc): ...this private member
function.
(line_maps::get_range_from_loc): Update for conversion of
get_range_from_adhoc_loc to a member function.
(linemap_init): Update for "m_" prefix of some fields of
line_maps.
(line_map_new_raw): Likewise.
(linemap_enter_macro): Likewise.
(linemap_get_statistics): Likewise.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
|
|
No functional change intended.
gcc/ChangeLog:
* input.cc (make_location): Move implementation to
line_maps::make_location.
libcpp/ChangeLog:
* include/line-map.h (line_maps::pure_location_p): New decl.
(line_maps::get_pure_location): New decl.
(line_maps::get_range_from_loc): New decl.
(line_maps::get_start): New.
(line_maps::get_finish): New.
(line_maps::make_location): New decl.
(get_range_from_loc): Make line_maps param const.
(get_discriminator_from_loc): Likewise.
(pure_location_p): Likewise.
(get_pure_location): Likewise.
(linemap_check_files_exited): Likewise.
(linemap_tracks_macro_expansion_locs_p): Likewise.
(linemap_location_in_system_header_p): Likewise.
(linemap_location_from_macro_definition_p): Likewise.
(linemap_macro_map_loc_unwind_toward_spelling): Likewise.
(linemap_included_from_linemap): Likewise.
(first_map_in_common): Likewise.
(linemap_compare_locations): Likewise.
(linemap_location_before_p): Likewise.
(linemap_resolve_location): Likewise.
(linemap_unwind_toward_expansion): Likewise.
(linemap_unwind_to_first_non_reserved_loc): Likewise.
(linemap_expand_location): Likewise.
(linemap_get_file_highest_location): Likewise.
(linemap_get_statistics): Likewise.
(linemap_dump_location): Likewise.
(linemap_dump): Likewise.
(line_table_dump): Likewise.
* internal.h (linemap_get_expansion_line): Likewise.
(linemap_get_expansion_filename): Likewise.
* line-map.cc (can_be_stored_compactly_p): Likewise.
(get_data_from_adhoc_loc): Drop redundant "class".
(get_discriminator_from_adhoc_loc): Likewise.
(get_location_from_adhoc_loc): Likewise.
(get_range_from_adhoc_loc): Likewise.
(get_range_from_loc): Make const and move implementation to...
(line_maps::get_range_from_loc): ...this new function.
(get_discriminator_from_loc): Make line_maps param const.
(pure_location_p): Make const and move implementation to...
(line_maps::pure_location_p): ...this new function.
(get_pure_location): Make const and move implementation to...
(line_maps::get_pure_location): ...this new function.
(linemap_included_from_linemap): Make line_maps param const.
(linemap_check_files_exited): Likewise.
(linemap_tracks_macro_expansion_locs_p): Likewise.
(linemap_macro_map_loc_unwind_toward_spelling): Likewise.
(linemap_get_expansion_line): Likewise.
(linemap_get_expansion_filename): Likewise.
(linemap_location_in_system_header_p): Likewise.
(first_map_in_common_1): Likewise.
(linemap_compare_locations): Likewise.
(linemap_macro_loc_to_spelling_point): Likewise.
(linemap_macro_loc_to_def_point): Likewise.
(linemap_macro_loc_to_exp_point): Likewise.
(linemap_resolve_location): Likewise.
(linemap_location_from_macro_definition_p): Likewise.
(linemap_unwind_toward_expansion): Likewise.
(linemap_unwind_to_first_non_reserved_loc): Likewise.
(linemap_expand_location): Likewise.
(linemap_dump): Likewise.
(linemap_dump_location): Likewise.
(linemap_get_file_highest_location): Likewise.
(linemap_get_statistics): Likewise.
(line_table_dump): Likewise.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
|
|
This patch implements support for [P1689R5][] to communicate to a build
system the C++20 module dependencies to build systems so that they may
build `.gcm` files in the proper order.
Support is communicated through the following three new flags:
- `-fdeps-format=` specifies the format for the output. Currently named
`p1689r5`.
- `-fdeps-file=` specifies the path to the file to write the format to.
- `-fdeps-target=` specifies the `.o` that will be written for the TU
that is scanned. This is required so that the build system can
correlate the dependency output with the actual compilation that will
occur.
CMake supports this format as of 17 Jun 2022 (to be part of 3.25.0)
using an experimental feature selection (to allow for future usage
evolution without committing to how it works today). While it remains
experimental, docs may be found in CMake's documentation for
experimental features.
Future work may include using this format for Fortran module
dependencies as well, however this is still pending work.
[P1689R5]: https://isocpp.org/files/papers/P1689R5.html
[cmake-experimental]: https://gitlab.kitware.com/cmake/cmake/-/blob/master/Help/dev/experimental.rst
TODO:
- header-unit information fields
Header units (including the standard library headers) are 100%
unsupported right now because the `-E` mechanism wants to import their
BMIs. A new mode (i.e., something more workable than existing `-E`
behavior) that mocks up header units as if they were imported purely
from their path and content would be required.
- non-utf8 paths
The current standard says that paths that are not unambiguously
represented using UTF-8 are not supported (because these cases are rare
and the extra complication is not worth it at this time). Future
versions of the format might have ways of encoding non-UTF-8 paths. For
now, this patch just doesn't support non-UTF-8 paths (ignoring the
"unambiguously representable in UTF-8" case).
- figure out why junk gets placed at the end of the file
Sometimes it seems like the file gets a lot of `NUL` bytes appended to
it. It happens rarely and seems to be the result of some
`ftruncate`-style call which results in extra padding in the contents.
Noting it here as an observation at least.
libcpp/
* include/cpplib.h: Add cpp_fdeps_format enum.
(cpp_options): Add fdeps_format field
(cpp_finish): Add structured dependency fdeps_stream parameter.
* include/mkdeps.h (deps_add_module_target): Add flag for
whether a module is exported or not.
(fdeps_add_target): Add function.
(deps_write_p1689r5): Add function.
* init.cc (cpp_finish): Add new preprocessor parameter used for C++
module tracking.
* mkdeps.cc (mkdeps): Implement P1689R5 output.
gcc/
* doc/invoke.texi: Document -fdeps-format=, -fdeps-file=, and
-fdeps-target= flags.
* gcc.cc: add defaults for -fdeps-target= and -fdeps-file= when
only -fdeps-format= is specified.
* json.h: Add a TODO item to refactor out to share with
`libcpp/mkdeps.cc`.
gcc/c-family/
* c-opts.cc (c_common_handle_option): Add fdeps_file variable and
-fdeps-format=, -fdeps-file=, and -fdeps-target= parsing.
* c.opt: Add -fdeps-format=, -fdeps-file=, and -fdeps-target=
flags.
gcc/cp/
* module.cc (preprocessed_module): Pass whether the module is
exported to dependency tracking.
gcc/testsuite/
* g++.dg/modules/depflags-f-MD.C: New test.
* g++.dg/modules/depflags-f.C: New test.
* g++.dg/modules/depflags-fi.C: New test.
* g++.dg/modules/depflags-fj-MD.C: New test.
* g++.dg/modules/depflags-fj.C: New test.
* g++.dg/modules/depflags-fjo-MD.C: New test.
* g++.dg/modules/depflags-fjo.C: New test.
* g++.dg/modules/depflags-fo-MD.C: New test.
* g++.dg/modules/depflags-fo.C: New test.
* g++.dg/modules/depflags-j-MD.C: New test.
* g++.dg/modules/depflags-j.C: New test.
* g++.dg/modules/depflags-jo-MD.C: New test.
* g++.dg/modules/depflags-jo.C: New test.
* g++.dg/modules/depflags-o-MD.C: New test.
* g++.dg/modules/depflags-o.C: New test.
* g++.dg/modules/p1689-1.C: New test.
* g++.dg/modules/p1689-1.exp.ddi: New test expectation.
* g++.dg/modules/p1689-2.C: New test.
* g++.dg/modules/p1689-2.exp.ddi: New test expectation.
* g++.dg/modules/p1689-3.C: New test.
* g++.dg/modules/p1689-3.exp.ddi: New test expectation.
* g++.dg/modules/p1689-4.C: New test.
* g++.dg/modules/p1689-4.exp.ddi: New test expectation.
* g++.dg/modules/p1689-5.C: New test.
* g++.dg/modules/p1689-5.exp.ddi: New test expectation.
* g++.dg/modules/modules.exp: Load new P1689 library routines.
* g++.dg/modules/test-p1689.py: New tool for validating P1689 output.
* lib/modules.exp: Support for validating P1689 outputs.
Signed-off-by: Ben Boeckel <ben.boeckel@kitware.com>
Reviewed-by: Jason Merrill <jason@redhat.com>
|
|
This patch adds the C FE support, c-family support, small libcpp change
so that 123wb and 42uwb suffixes are handled plus glimits.h change
to define BITINT_MAXWIDTH macro.
The previous patches really do nothing without this, which enables
all the support.
2023-09-06 Jakub Jelinek <jakub@redhat.com>
PR c/102989
gcc/
* glimits.h (BITINT_MAXWIDTH): Define if __BITINT_MAXWIDTH__ is
predefined.
gcc/c-family/
* c-common.cc (c_common_reswords): Add _BitInt as keyword.
(unsafe_conversion_p): Handle BITINT_TYPE like INTEGER_TYPE.
(c_common_signed_or_unsigned_type): Handle BITINT_TYPE.
(c_common_truthvalue_conversion, c_common_get_alias_set,
check_builtin_function_arguments): Handle BITINT_TYPE like
INTEGER_TYPE.
(sync_resolve_size): Add ORIG_FORMAT argument. If
FETCH && !ORIG_FORMAT, type is BITINT_TYPE, return -1 if size isn't
one of 1, 2, 4, 8 or 16 or if it is 16 but TImode is not supported.
(atomic_bitint_fetch_using_cas_loop): New function.
(resolve_overloaded_builtin): Adjust sync_resolve_size caller. If
-1 is returned, use atomic_bitint_fetch_using_cas_loop to lower it.
Formatting fix.
(keyword_begins_type_specifier): Handle RID_BITINT.
* c-common.h (enum rid): Add RID_BITINT enumerator.
* c-cppbuiltin.cc (c_cpp_builtins): For C call
targetm.c.bitint_type_info and predefine __BITINT_MAXWIDTH__
and for -fbuilding-libgcc also __LIBGCC_BITINT_LIMB_WIDTH__ and
__LIBGCC_BITINT_ORDER__ macros if _BitInt is supported.
* c-lex.cc (interpret_integer): Handle CPP_N_BITINT.
* c-pretty-print.cc (c_pretty_printer::simple_type_specifier,
c_pretty_printer::direct_abstract_declarator,
c_pretty_printer::direct_declarator, c_pretty_printer::declarator):
Handle BITINT_TYPE.
(pp_c_integer_constant): Handle printing of large precision wide_ints
which would buffer overflow digit_buffer.
* c-warn.cc (conversion_warning, warnings_for_convert_and_check,
warnings_for_convert_and_check): Handle BITINT_TYPE like
INTEGER_TYPE.
gcc/c/
* c-convert.cc (c_convert): Handle BITINT_TYPE like INTEGER_TYPE.
* c-decl.cc (check_bitfield_type_and_width): Allow BITINT_TYPE
bit-fields.
(finish_struct): Prefer to use BITINT_TYPE for BITINT_TYPE bit-fields
if possible.
(declspecs_add_type): Formatting fixes. Handle cts_bitint. Adjust
for added union in *specs. Handle RID_BITINT.
(finish_declspecs): Handle cts_bitint. Adjust for added union
in *specs.
* c-parser.cc (c_keyword_starts_typename, c_token_starts_declspecs,
c_parser_declspecs, c_parser_gnu_attribute_any_word): Handle
RID_BITINT.
(c_parser_omp_clause_schedule): Handle BITINT_TYPE like INTEGER_TYPE.
* c-tree.h (enum c_typespec_keyword): Mention _BitInt in comment.
Add cts_bitint enumerator.
(struct c_declspecs): Move int_n_idx and floatn_nx_idx into a union
and add bitint_prec there as well.
* c-typeck.cc (c_common_type, comptypes_internal):
Handle BITINT_TYPE.
(perform_integral_promotions): Promote BITINT_TYPE bit-fields to
their declared type.
(build_array_ref, build_unary_op, build_conditional_expr,
build_c_cast, convert_for_assignment, digest_init, build_binary_op):
Handle BITINT_TYPE.
* c-fold.cc (c_fully_fold_internal): Handle BITINT_TYPE like
INTEGER_TYPE.
* c-aux-info.cc (gen_type): Handle BITINT_TYPE.
libcpp/
* expr.cc (interpret_int_suffix): Handle wb and WB suffixes.
* include/cpplib.h (CPP_N_BITINT): Define.
|
|
We're (currently) not aware of any actual use of 'ht_identifier's with NUL
characters embedded; its 'len' field appears to exist for optimization
purposes, since "forever". Before 'struct ht_identifier' was added in
commit 2a967f3d3a45294640e155381ef549e0b8090ad4 (Subversion r42334), we had in
'gcc/cpplib.h:struct cpp_hashnode': 'unsigned short len', or earlier 'length',
earlier in 'gcc/cpphash.h:struct hashnode': 'unsigned short length', earlier
'size_t length' with comment: "length of token, for quick comparison", earlier
'int length', ever since the 'gcc/cpp*' files were added in
commit 7f2935c734c36f84ab62b20a04de465e19061333 (Subversion r9191).
This amends commit f3b957ea8b9dadfb1ed30f24f463529684b7a36a
"pch: Fix streaming of strings with embedded null bytes".
gcc/
* doc/gty.texi (GTY Options) <string_length>: Enhance.
libcpp/
* include/symtab.h (struct ht_identifier): Document different
rationale.
|
|
It seems prudent to add C++26 now that the first C++26 papers have been
approved. I followed commit r11-6920 as well as r8-3237.
Since C++23 is essentially finished and its __cplusplus value has
settled to 202302L, I've updated cpp_init_builtins and marked
-std=c++2b Undocumented and made -std=c++23 no longer Undocumented.
As for __cplusplus, I've chosen 202400L:
$ xg++ -std=c++26 -dM -E -x c++ - < /dev/null | grep cplusplus
#define __cplusplus 202400L
I've verified the patch with a simple test, exercising the new
directives. Don't forget to update your GXX_TESTSUITE_STDS!
This patch does not add -Wc++26-extensions.
gcc/c-family/ChangeLog:
* c-common.h (cxx_dialect): Add cxx26 as a dialect.
* c-opts.cc (set_std_cxx26): New.
(c_common_handle_option): Set options when -std={c,gnu}++2{c,6} is
enabled.
(c_common_post_options): Adjust comments.
* c.opt: Add options for -std=c++26, std=c++2c, -std=gnu++26,
and -std=gnu++2c.
(std=c++2b): Mark as Undocumented.
(std=c++23): No longer Undocumented.
gcc/ChangeLog:
* doc/cpp.texi (__cplusplus): Document value for -std=c++26 and
-std=gnu++26. Document that for C++23, its value is 202302L.
* doc/invoke.texi: Document -std=c++26 and -std=gnu++26.
* dwarf2out.cc (highest_c_language): Handle GNU C++26.
(gen_compile_unit_die): Likewise.
libcpp/ChangeLog:
* include/cpplib.h (c_lang): Add CXX26 and GNUCXX26.
* init.cc (lang_defaults): Add rows for CXX26 and GNUCXX26.
(cpp_init_builtins): Set __cplusplus to 202400L for C++26.
Set __cplusplus to 202302L for C++23.
gcc/testsuite/ChangeLog:
* lib/target-supports.exp (check_effective_target_c++23): Return
1 also if check_effective_target_c++26.
(check_effective_target_c++23_down): New.
(check_effective_target_c++26_only): New.
(check_effective_target_c++26): New.
* g++.dg/cpp23/cplusplus.C: Adjust expected value.
* g++.dg/cpp26/cplusplus.C: New test.
|
|
Existing text output in GCC has to be implemented by writing
sequentially to a pretty_printer instance. This makes it
hard to implement some kinds of diagnostic output (see e.g.
diagnostic-show-locus.cc).
This patch adds more flexible ways of creating text output:
- a canvas class, which can be "painted" to via random-access (rather
that sequentially)
- a table class for 2D grid layout, supporting items that span
multiple rows/columns
- a widget class for organizing diagrams hierarchically.
The patch also expands GCC's diagnostics subsystem so that diagnostics
can have "text art" diagrams - think ASCII art, but potentially
including some Unicode characters, such as box-drawing chars.
The new code is in a new "gcc/text-art" subdirectory and "text_art"
namespace.
The patch adds a new "-fdiagnostics-text-art-charset=VAL" option, with
values:
- "none": don't emit diagrams (added to -fdiagnostics-plain-output)
- "ascii": use pure ASCII in diagrams
- "unicode": allow for conservative use of unicode drawing characters
(such as box-drawing characters).
- "emoji" (the default): as "unicode", but potentially allow for
conservative use of emoji in the output (such as U+26A0 WARNING SIGN).
I made it possible to disable emoji separately from unicode as I believe
there's a generation gap in acceptance of these characters (some older
programmers have a visceral reaction against them, whereas younger
programmers may have no problem with them).
Diagrams are emitted to stderr by default. With SARIF output they are
captured as a location in "relatedLocations", with the diagram as a
code block in Markdown within a "markdown" property of a message.
This patch doesn't add any such diagram usage to GCC, saving that for
followups, apart from adding a plugin to the test suite to exercise the
functionality.
contrib/ChangeLog:
* unicode/gen-box-drawing-chars.py: New file.
* unicode/gen-combining-chars.py: New file.
* unicode/gen-printable-chars.py: New file.
gcc/ChangeLog:
* Makefile.in (OBJS-libcommon): Add text-art/box-drawing.o,
text-art/canvas.o, text-art/ruler.o, text-art/selftests.o,
text-art/style.o, text-art/styled-string.o, text-art/table.o,
text-art/theme.o, and text-art/widget.o.
* color-macros.h (COLOR_FG_BRIGHT_BLACK): New.
(COLOR_FG_BRIGHT_RED): New.
(COLOR_FG_BRIGHT_GREEN): New.
(COLOR_FG_BRIGHT_YELLOW): New.
(COLOR_FG_BRIGHT_BLUE): New.
(COLOR_FG_BRIGHT_MAGENTA): New.
(COLOR_FG_BRIGHT_CYAN): New.
(COLOR_FG_BRIGHT_WHITE): New.
(COLOR_BG_BRIGHT_BLACK): New.
(COLOR_BG_BRIGHT_RED): New.
(COLOR_BG_BRIGHT_GREEN): New.
(COLOR_BG_BRIGHT_YELLOW): New.
(COLOR_BG_BRIGHT_BLUE): New.
(COLOR_BG_BRIGHT_MAGENTA): New.
(COLOR_BG_BRIGHT_CYAN): New.
(COLOR_BG_BRIGHT_WHITE): New.
* common.opt (fdiagnostics-text-art-charset=): New option.
(diagnostic-text-art.h): New SourceInclude.
(diagnostic_text_art_charset) New Enum and EnumValues.
* configure: Regenerate.
* configure.ac (gccdepdir): Add text-art to loop.
* diagnostic-diagram.h: New file.
* diagnostic-format-json.cc (json_emit_diagram): New.
(diagnostic_output_format_init_json): Wire it up to
context->m_diagrams.m_emission_cb.
* diagnostic-format-sarif.cc: Include "diagnostic-diagram.h" and
"text-art/canvas.h".
(sarif_result::on_nested_diagnostic): Move code to...
(sarif_result::add_related_location): ...this new function.
(sarif_result::on_diagram): New.
(sarif_builder::emit_diagram): New.
(sarif_builder::make_message_object_for_diagram): New.
(sarif_emit_diagram): New.
(diagnostic_output_format_init_sarif): Set
context->m_diagrams.m_emission_cb to sarif_emit_diagram.
* diagnostic-text-art.h: New file.
* diagnostic.cc: Include "diagnostic-text-art.h",
"diagnostic-diagram.h", and "text-art/theme.h".
(diagnostic_initialize): Initialize context->m_diagrams and
call diagnostics_text_art_charset_init.
(diagnostic_finish): Clean up context->m_diagrams.m_theme.
(diagnostic_emit_diagram): New.
(diagnostics_text_art_charset_init): New.
* diagnostic.h (text_art::theme): New forward decl.
(class diagnostic_diagram): Likewise.
(diagnostic_context::m_diagrams): New field.
(diagnostic_emit_diagram): New decl.
* doc/invoke.texi (Diagnostic Message Formatting Options): Add
-fdiagnostics-text-art-charset=.
(-fdiagnostics-plain-output): Add
-fdiagnostics-text-art-charset=none.
* gcc.cc: Include "diagnostic-text-art.h".
(driver_handle_option): Handle OPT_fdiagnostics_text_art_charset_.
* opts-common.cc (decode_cmdline_options_to_array): Add
"-fdiagnostics-text-art-charset=none" to expanded_args for
-fdiagnostics-plain-output.
* opts.cc: Include "diagnostic-text-art.h".
(common_handle_option): Handle OPT_fdiagnostics_text_art_charset_.
* pretty-print.cc (pp_unicode_character): New.
* pretty-print.h (pp_unicode_character): New decl.
* selftest-run-tests.cc: Include "text-art/selftests.h".
(selftest::run_tests): Call text_art_tests.
* text-art/box-drawing-chars.inc: New file, generated by
contrib/unicode/gen-box-drawing-chars.py.
* text-art/box-drawing.cc: New file.
* text-art/box-drawing.h: New file.
* text-art/canvas.cc: New file.
* text-art/canvas.h: New file.
* text-art/ruler.cc: New file.
* text-art/ruler.h: New file.
* text-art/selftests.cc: New file.
* text-art/selftests.h: New file.
* text-art/style.cc: New file.
* text-art/styled-string.cc: New file.
* text-art/table.cc: New file.
* text-art/table.h: New file.
* text-art/theme.cc: New file.
* text-art/theme.h: New file.
* text-art/types.h: New file.
* text-art/widget.cc: New file.
* text-art/widget.h: New file.
gcc/testsuite/ChangeLog:
* gcc.dg/plugin/diagnostic-test-text-art-ascii-bw.c: New test.
* gcc.dg/plugin/diagnostic-test-text-art-ascii-color.c: New test.
* gcc.dg/plugin/diagnostic-test-text-art-none.c: New test.
* gcc.dg/plugin/diagnostic-test-text-art-unicode-bw.c: New test.
* gcc.dg/plugin/diagnostic-test-text-art-unicode-color.c: New test.
* gcc.dg/plugin/diagnostic_plugin_test_text_art.c: New test plugin.
* gcc.dg/plugin/plugin.exp (plugin_test_list): Add them.
libcpp/ChangeLog:
* charset.cc (get_cppchar_property): New function template, based
on...
(cpp_wcwidth): ...this function. Rework to use the above.
Include "combining-chars.inc".
(cpp_is_combining_char): New function
Include "printable-chars.inc".
(cpp_is_printable_char): New function
* combining-chars.inc: New file, generated by
contrib/unicode/gen-combining-chars.py.
* include/cpplib.h (cpp_is_combining_char): New function decl.
(cpp_is_printable_char): New function decl.
* printable-chars.inc: New file, generated by
contrib/unicode/gen-printable-chars.py.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
|
|
PR analyzer/109098 notes that the SARIF spec mandates that .sarif
files are UTF-8 encoded, but -fdiagnostics-format=sarif-file naively
assumes that the source files are UTF-8 encoded when quoting source
artefacts in the .sarif output, which can lead to us writing out
.sarif files with non-UTF-8 bytes in them (which break my reporting
scripts).
The root cause is that sarif_builder::maybe_make_artifact_content_object
was using maybe_read_file to load the file content as bytes, and
assuming they were UTF-8 encoded.
This patch reworks both overloads of this function (one used for the
whole file, the other for snippets of quoted lines) so that they go
through input.cc's file cache, which attempts to decode the input files
according to the input charset, and then encode as UTF-8. They also
check that the result actually is UTF-8, for cases where the input
charset is missing, or incorrectly specified, and omit the quoted
source for such awkward cases.
Doing so fixes all of the cases I've encountered.
The patch adds a new:
{ dg-final { verify-sarif-file } }
directive to all SARIF test cases in the test suite, which verifies
that the output is UTF-8 encoded, and is valid JSON. In particular
it verifies that when we complain about encoding problems, the .sarif
report we emit is itself correctly encoded.
gcc/ChangeLog:
PR analyzer/109098
* diagnostic-format-sarif.cc (read_until_eof): Delete.
(maybe_read_file): Delete.
(sarif_builder::maybe_make_artifact_content_object): Use
get_source_file_content rather than maybe_read_file.
Reject it if it's not valid UTF-8.
* input.cc (file_cache_slot::get_full_file_content): New.
(get_source_file_content): New.
(selftest::check_cpp_valid_utf8_p): New.
(selftest::test_cpp_valid_utf8_p): New.
(selftest::input_cc_tests): Call selftest::test_cpp_valid_utf8_p.
* input.h (get_source_file_content): New prototype.
gcc/testsuite/ChangeLog:
PR analyzer/109098
* c-c++-common/diagnostic-format-sarif-file-1.c: Add
verify-sarif-file directive.
* c-c++-common/diagnostic-format-sarif-file-2.c: Likewise.
* c-c++-common/diagnostic-format-sarif-file-3.c: Likewise.
* c-c++-common/diagnostic-format-sarif-file-4.c: Likewise.
* c-c++-common/diagnostic-format-sarif-file-Wbidi-chars.c: New
test case, adapted from Wbidi-chars-1.c.
* c-c++-common/diagnostic-format-sarif-file-bad-utf8-pr109098-1.c:
New test case.
* c-c++-common/diagnostic-format-sarif-file-bad-utf8-pr109098-2.c:
New test case.
* c-c++-common/diagnostic-format-sarif-file-bad-utf8-pr109098-3.c:
New test case, adapted from cpp/Winvalid-utf8-1.c.
* c-c++-common/diagnostic-format-sarif-file-valid-CP850.c: New
test case, adapted from gcc.dg/diagnostic-input-charset-1.c.
* gcc.dg/plugin/crash-test-ice-sarif.c: Add verify-sarif-file
directive.
* gcc.dg/plugin/crash-test-write-though-null-sarif.c: Likewise.
* gcc.dg/plugin/diagnostic-test-paths-5.c: Likewise.
* lib/scansarif.exp (verify-sarif-file): New procedure.
* lib/verify-sarif-file.py: New support script.
libcpp/ChangeLog:
PR analyzer/109098
* charset.cc (cpp_valid_utf8_p): New function.
* include/cpplib.h (cpp_valid_utf8_p): New prototype.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
|
|
|
|
When a GTY'ed struct is streamed to PCH, any plain char* pointers it contains
(whether they live in GC-controlled memory or not) will be marked for PCH
output by the routine gt_pch_note_object in ggc-common.cc. This routine
special-cases plain char* strings, and in particular it uses strlen() to get
their length. Thus it does not handle strings with embedded null bytes, but it
is possible for something PCH cares about (such as a string literal token in a
macro definition) to contain such embedded nulls. To fix that up, add a new
GTY option "string_length" so that gt_pch_note_object can be informed the
actual length it ought to use, and use it in the relevant libcpp structs
(cpp_string and ht_identifier) accordingly.
gcc/ChangeLog:
* gengtype.cc (output_escaped_param): Add missing const.
(get_string_option): Add missing check for option type.
(walk_type): Support new "string_length" GTY option.
(write_types_process_field): Likewise.
* ggc-common.cc (gt_pch_note_object): Add optional length argument.
* ggc.h (gt_pch_note_object): Adjust prototype for new argument.
(gt_pch_n_S2): Declare...
* stringpool.cc (gt_pch_n_S2): ...new function.
* doc/gty.texi: Document new GTY((string_length)) option.
libcpp/ChangeLog:
* include/cpplib.h (struct cpp_string): Use new "string_length" GTY.
* include/symtab.h (struct ht_identifier): Likewise.
gcc/testsuite/ChangeLog:
* g++.dg/pch/pch-string-nulls.C: New test.
* g++.dg/pch/pch-string-nulls.Hs: New test.
|
|
C2x has, like C++, adopted rules for identifiers based directly on an
unversioned normative reference to Unicode. Make libcpp follow those
rules for c2x / gnu2x standards (this involves bringing back a flag
separate from the C++ one for whether to use these identifier rules,
but this time enabled for all C++ language versions since that was the
conclusion adopted for C++ identifier handling).
There is one change here that affects C++. I believe the new
normative requirement for NFC only applies to identifiers, not to the
use of identifier-continue characters in pp-numbers, where there is no
such requirement and so the diagnostic ought to be a warning not a
pedwarn in pp-numbers, and that this is the case for both C and C++.
Bootstrapped with no regressions for x86_64-pc-linux-gnu.
libcpp/
* charset.cc (ucn_valid_in_identifier): Check xid_identifiers not
cplusplus to determine whether to use CXX23 and NXX23 flags.
* include/cpplib.h (struct cpp_options): Add xid_identifiers.
* init.cc (struct lang_flags, lang_defaults): Add xid_identifiers.
(cpp_set_lang): Set xid_identifiers.
* lex.cc (warn_about_normalization): Add parameter identifier.
Only pedwarn about non-NFC for identifiers, not pp-numbers.
(_cpp_lex_direct): Update calls to warn_about_normalization.
gcc/testsuite/
* gcc.dg/cpp/c2x-ucnid-1-utf8.c, gcc.dg/cpp/c2x-ucnid-1.c: New
tests.
|
|
Here is a complete patch to add std::bfloat16_t support on
x86 (AArch64 and ARM left for later). Almost no BFmode optabs
are added by the patch, so for binops/unops it extends to SFmode
first and then truncates back to BFmode.
For {HF,SF,DF,XF,TF}mode -> BFmode conversions libgcc has implementations
of all those conversions so that we avoid double rounding, for
BFmode -> {DF,XF,TF}mode conversions to avoid growing libgcc too much
it emits BFmode -> SFmode conversion first and then converts to the even
wider mode, neither step should be imprecise.
For BFmode -> HFmode, it first emits a precise BFmode -> SFmode conversion
and then SFmode -> HFmode, because neither format is subset or superset
of the other, while SFmode is superset of both.
expr.cc then contains a -ffast-math optimization of the BF -> SF and
SF -> BF conversions if we don't optimize for space (and for the latter
if -frounding-math isn't enabled either).
For x86, perhaps truncsfbf2 optab could be defined for TARGET_AVX512BF16
but IMNSHO should FAIL if !flag_finite_math || flag_rounding_math
|| !flag_unsafe_math_optimizations, because I think the insn doesn't
raise on sNaNs, hardcodes round to nearest and flushes denormals to zero.
By default (unless x86 -fexcess-precision=16) we use float excess
precision for BFmode, so truncate only on explicit casts and assignments.
The patch introduces a single __bf16 builtin - __builtin_nansf16b,
because (__bf16) __builtin_nansf ("") will drop the sNaN into qNaN,
and uses f16b suffix instead of bf16 because there would be ambiguity on
log vs. logb - __builtin_logbf16 could be either log with bf16 suffix
or logb with f16 suffix. In other cases libstdc++ should mostly use
__builtin_*f for std::bfloat16_t overloads (we have a problem with
std::nextafter though but that one we have also for std::float16_t).
2022-10-14 Jakub Jelinek <jakub@redhat.com>
gcc/
* tree-core.h (enum tree_index): Add TI_BFLOAT16_TYPE.
* tree.h (bfloat16_type_node): Define.
* tree.cc (excess_precision_type): Promote bfloat16_type_mode
like float16_type_mode.
(build_common_tree_nodes): Initialize bfloat16_type_node if
BFmode is supported.
* expmed.h (maybe_expand_shift): Declare.
* expmed.cc (maybe_expand_shift): No longer static.
* expr.cc (convert_mode_scalar): Don't ICE on BF -> HF or HF -> BF
conversions. If there is no optab, handle BF -> {DF,XF,TF,HF}
conversions as separate BF -> SF -> {DF,XF,TF,HF} conversions, add
-ffast-math generic implementation for BF -> SF and SF -> BF
conversions.
* builtin-types.def (BT_BFLOAT16, BT_FN_BFLOAT16_CONST_STRING): New.
* builtins.def (BUILT_IN_NANSF16B): New builtin.
* fold-const-call.cc (fold_const_call): Handle CFN_BUILT_IN_NANSF16B.
* config/i386/i386.cc (classify_argument): Handle E_BCmode.
(ix86_libgcc_floating_mode_supported_p): Also return true for BFmode
for -msse2.
(ix86_mangle_type): Mangle BFmode as DF16b.
(ix86_invalid_conversion, ix86_invalid_unary_op,
ix86_invalid_binary_op): Remove.
(TARGET_INVALID_CONVERSION, TARGET_INVALID_UNARY_OP,
TARGET_INVALID_BINARY_OP): Don't redefine.
* config/i386/i386-builtins.cc (ix86_bf16_type_node): Remove.
(ix86_register_bf16_builtin_type): Use bfloat16_type_node rather than
ix86_bf16_type_node, only create it if still NULL.
* config/i386/i386-builtin-types.def (BFLOAT16): Likewise.
* config/i386/i386.md (cbranchbf4, cstorebf4): New expanders.
gcc/c-family/
* c-cppbuiltin.cc (c_cpp_builtins): If bfloat16_type_node,
predefine __BFLT16_*__ macros and for C++23 also
__STDCPP_BFLOAT16_T__. Predefine bfloat16_type_node related
macros for -fbuilding-libgcc.
* c-lex.cc (interpret_float): Handle CPP_N_BFLOAT16.
gcc/c/
* c-typeck.cc (convert_arguments): Don't promote __bf16 to
double.
gcc/cp/
* cp-tree.h (extended_float_type_p): Return true for
bfloat16_type_node.
* typeck.cc (cp_compare_floating_point_conversion_ranks): Set
extended{1,2} if mv{1,2} is bfloat16_type_node. Adjust comment.
gcc/testsuite/
* lib/target-supports.exp (check_effective_target_bfloat16,
check_effective_target_bfloat16_runtime, add_options_for_bfloat16):
New.
* gcc.dg/torture/bfloat16-basic.c: New test.
* gcc.dg/torture/bfloat16-builtin.c: New test.
* gcc.dg/torture/bfloat16-builtin-issignaling-1.c: New test.
* gcc.dg/torture/bfloat16-complex.c: New test.
* gcc.dg/torture/builtin-issignaling-1.c: Allow to be includable
from bfloat16-builtin-issignaling-1.c.
* gcc.dg/torture/floatn-basic.h: Allow to be includable from
bfloat16-basic.c.
* gcc.target/i386/vect-bfloat16-typecheck_2.c: Adjust expected
diagnostics.
* gcc.target/i386/sse2-bfloat16-scalar-typecheck.c: Likewise.
* gcc.target/i386/vect-bfloat16-typecheck_1.c: Likewise.
* g++.target/i386/bfloat_cpp_typecheck.C: Likewise.
libcpp/
* include/cpplib.h (CPP_N_BFLOAT16): Define.
* expr.cc (interpret_float_suffix): Handle bf16 and BF16 suffixes for
C++.
libgcc/
* config/i386/t-softfp (softfp_extensions): Add bfsf.
(softfp_truncations): Add tfbf xfbf dfbf sfbf hfbf.
(CFLAGS-extendbfsf2.c, CFLAGS-truncsfbf2.c, CFLAGS-truncdfbf2.c,
CFLAGS-truncxfbf2.c, CFLAGS-trunctfbf2.c, CFLAGS-trunchfbf2.c): Add
-msse2.
* config/i386/libgcc-glibc.ver (GCC_13.0.0): Export
__extendbfsf2 and __trunc{s,d,x,t,h}fbf2.
* config/i386/sfp-machine.h (_FP_NANSIGN_B): Define.
* config/i386/64/sfp-machine.h (_FP_NANFRAC_B): Define.
* config/i386/32/sfp-machine.h (_FP_NANFRAC_B): Define.
* soft-fp/brain.h: New file.
* soft-fp/truncsfbf2.c: New file.
* soft-fp/truncdfbf2.c: New file.
* soft-fp/truncxfbf2.c: New file.
* soft-fp/trunctfbf2.c: New file.
* soft-fp/trunchfbf2.c: New file.
* soft-fp/truncbfhf2.c: New file.
* soft-fp/extendbfsf2.c: New file.
libiberty/
* cp-demangle.h (D_BUILTIN_TYPE_COUNT): Increment.
* cp-demangle.c (cplus_demangle_builtin_types): Add std::bfloat16_t
entry.
(cplus_demangle_type): Demangle DF16b.
* testsuite/demangle-expected (_Z3xxxDF16b): New test.
|
|
This is the first in a series of patches to enable discriminator support
in AutoFDO.
This patch switches to tracking discriminators per statement/instruction
instead of per basic block. Tracking per basic block was problematic since
not all statements in a basic block needed a discriminator and, also, later
optimizations could move statements between basic blocks making correlation
during AutoFDO compilation unreliable. Tracking per statement also allows
us to assign different discriminators to multiple function calls in the same
basic block. A subsequent patch will add that support.
The idea of this patch is based on commit 4c311d95cf6d9519c3c20f641cc77af7df491fdf
by Dehao Chen in vendors/google/heads/gcc-4_8 but uses a slightly different
approach. In Dehao's work special (normally unused) location ids and side tables
were used to keep track of locations with discriminators. Things have changed
since then and I don't think we have unused location ids anymore. Instead,
I made discriminators a part of ad-hoc locations.
The difference from Dehao's work also includes support for discriminator
reading/writing in lto streaming and in modules.
Tested on x86_64-pc-linux-gnu.
gcc/ChangeLog:
* basic-block.h: Remove discriminator from basic blocks.
* cfghooks.cc (split_block_1): Remove discriminator from basic blocks.
* final.cc (final_start_function_1): Switch from per-bb to per statement
discriminator.
(final_scan_insn_1): Don't keep track of basic block discriminators.
(compute_discriminator): Switch from basic block discriminators to
instruction discriminators.
(insn_discriminator): New function to return instruction discriminator.
(notice_source_line): Use insn_discriminator.
* gimple-pretty-print.cc (dump_gimple_bb_header): Remove dumping of
basic block discriminators.
* gimple-streamer-in.cc (input_bb): Remove reading of basic block
discriminators.
* gimple-streamer-out.cc (output_bb): Remove writing of basic block
discriminators.
* input.cc (make_location): Pass 0 discriminator to COMBINE_LOCATION_DATA.
(location_with_discriminator): New function to combine locus with
a discriminator.
(has_discriminator): New function to check if a location has a discriminator.
(get_discriminator_from_loc): New function to get the discriminator
from a location.
* input.h: Declarations of new functions.
* lto-streamer-in.cc (cmp_loc): Use discriminators in location comparison.
(apply_location_cache): Keep track of current discriminator.
(input_location_and_block): Read discriminator from stream.
* lto-streamer-out.cc (clear_line_info): Set current discriminator to
UINT_MAX.
(lto_output_location_1): Write discriminator to stream.
* lto-streamer.h: Add discriminator to cached_location.
Add current_discr to lto_location_cache.
Add current_discr to output_block.
* print-rtl.cc (print_rtx_operand_code_i): Print discriminator.
* rtl.h: Add extern declaration of insn_discriminator.
* tree-cfg.cc (assign_discriminator): New function to assign a unique
discriminator value to all statements in a basic block that have the given
line number.
(assign_discriminators): Assign discriminators to statement locations.
* tree-pretty-print.cc (dump_location): Dump discriminators.
* tree.cc (set_block): Preserve discriminator when setting block.
(set_source_range): Preserve discriminator when setting source range.
gcc/cp/ChangeLog:
* module.cc (write_location): Write discriminator.
(read_location): Read discriminator.
libcpp/ChangeLog:
* include/line-map.h: Add discriminator to location_adhoc_data.
(get_combined_adhoc_loc): Add discriminator parameter.
(get_discriminator_from_adhoc_loc): Add external declaration.
(get_discriminator_from_loc): Add external declaration.
(COMBINE_LOCATION_DATA): Add discriminator parameter.
* lex.cc (get_location_for_byte_range_in_cur_line) Pass 0 discriminator
in a call to COMBINE_LOCATION_DATA.
(warn_about_normalization): Pass 0 discriminator in a call to
COMBINE_LOCATION_DATA.
(_cpp_lex_direct): Pass 0 discriminator in a call to
COMBINE_LOCATION_DATA.
* line-map.cc (location_adhoc_data_hash): Use discriminator compute
location_adhoc_data hash.
(location_adhoc_data_eq): Use discriminator when comparing
location_adhoc_data.
(can_be_stored_compactly_p): Check discriminator to determine
compact storage.
(get_combined_adhoc_loc): Add discriminator parameter.
(get_discriminator_from_adhoc_loc): New function to get the discriminator
from an ad-hoc location.
(get_discriminator_from_loc): New function to get the discriminator
from a location.
gcc/testsuite/ChangeLog:
* c-c++-common/ubsan/pr85213.c: Pass -gno-statement-frontiers.
|
|
C2x follows C++ in making alignas, alignof, bool, false,
static_assert, thread_local and true keywords; implement this
accordingly. This implementation makes them normal keywords in C2x
mode just like any other keyword (C2x leaves open the possibility of
implementation using predefined macros instead - thus, there aren't
any testcases asserting that they aren't macros). As in C++ and
previous versions of C, true and false are handled like signed 1 and 0
in #if (there was an intermediate state in some C2x drafts where they
had different macro expansions that were unsigned in #if).
Bootstrapped with no regressions for x86_64-pc-linux-gnu.
As with the removal of unprototyped functions, this change has a high
risk of breaking some old code and people doing GNU/Linux distribution
builds may wish to see how much is broken in a build with a -std=gnu2x
default.
gcc/
* ginclude/stdalign.h [defined __STDC_VERSION__ &&
__STDC_VERSION__ > 201710L]: Disable all content.
* ginclude/stdbool.h [defined __STDC_VERSION__ && __STDC_VERSION__
> 201710L] (bool, true, false): Do not define.
gcc/c-family/
* c-common.cc (c_common_reswords): Use D_C2X instead of D_CXXONLY
for alignas, alignof, bool, false, static_assert, thread_local and
true.
gcc/c/
* c-parser.cc (c_parser_static_assert_declaration_no_semi)
(c_parser_alignas_specifier, c_parser_alignof_expression): Allow
for C2x spellings of keywords.
(c_parser_postfix_expression): Handle RID_TRUE and RID_FALSE.
gcc/testsuite/
* gcc.dg/c11-keywords-1.c, gcc.dg/c2x-align-1.c,
gcc.dg/c2x-align-6.c, gcc.dg/c2x-bool-2.c,
gcc.dg/c2x-static-assert-3.c, gcc.dg/c2x-static-assert-4.c,
gcc.dg/c2x-thread-local-1.c: New tests.
* gcc.dg/c2x-bool-1.c: Update expectations.
libcpp/
* include/cpplib.h (struct cpp_options): Add true_false.
* expr.cc (eval_token): Check true_false not cplusplus to
determine whether to handle true and false keywords.
* init.cc (struct lang_flags): Add true_false.
(lang_defaults): Update.
(cpp_set_lang): Set true_false.
|
|
On Tue, Aug 30, 2022 at 09:10:37PM +0000, Joseph Myers wrote:
> I'm seeing build failures of glibc for powerpc64, as illustrated by the
> following C code:
>
> #if 0
> \NARG
> #endif
>
> (the actual sysdeps/powerpc/powerpc64/sysdep.h code is inside #ifdef
> __ASSEMBLER__).
>
> This shows some problems with this feature - and with delimited escape
> sequences - as it affects C. It's fine to accept it as an extension
> inside string and character literals, because \N or \u{...} would be
> invalid in the absence of the feature (i.e. the syntax for such literals
> fails to match, meaning that the rule about undefined behavior for a
> single ' or " as a pp-token applies). But outside string and character
> literals, the usual lexing rules apply, the \ is a pp-token on its own and
> the code is valid at the preprocessing level, and with expansion of macros
> appearing before or after the \ (e.g. u defined as a macro in the \u{...}
> case) it may be valid code at the language level as well. I don't know
> what older C++ versions say about this, but for C this means e.g.
>
> #define z(x) 0
> #define a z(
> int x = a\NARG);
>
> needs to be accepted as expanding to "int x = 0;", not interpreted as
> using the \N feature in an identifier and produce an error.
The following patch changes this, so that:
1) outside of string/character literals, \N without following { is never
treated as an error nor warning, it is silently treated as \ separate
token followed by whatever is after it
2) \u{123} and \N{LATIN SMALL LETTER A WITH ACUTE} are not handled as
extension at all outside of string/character literals in the strict
standard modes (-std=c*) except for -std=c++{23,2b}, only in the
-std=gnu* modes, because it changes behavior on valid sources, e.g.
#define z(x) 0
#define a z(
int x = a\u{123});
int y = a\N{LATIN SMALL LETTER A WITH ACUTE});
3) introduces -Wunicode warning (on by default) and warns for cases
of what looks like invalid delimited escape sequence or named
universal character escape outside of string/character literals
and is treated as separate tokens
2022-09-07 Jakub Jelinek <jakub@redhat.com>
libcpp/
* include/cpplib.h (struct cpp_options): Add cpp_warn_unicode member.
(enum cpp_warning_reason): Add CPP_W_UNICODE.
* init.cc (cpp_create_reader): Initialize cpp_warn_unicode.
* charset.cc (_cpp_valid_ucn): In possible identifier contexts, don't
handle \u{ or \N{ specially in -std=c* modes except -std=c++2{3,b}.
In possible identifier contexts, don't emit an error and punt
if \N isn't followed by {, or if \N{} surrounds some lower case
letters or _. In possible identifier contexts when not C++23, don't
emit an error but warning about unknown character names and treat as
separate tokens. When treating as separate tokens \u{ or \N{, emit
warnings.
gcc/
* doc/invoke.texi (-Wno-unicode): Document.
gcc/c-family/
* c.opt (Winvalid-utf8): Use ObjC instead of objC. Remove
" in comments" from description.
(Wunicode): New option.
gcc/testsuite/
* c-c++-common/cpp/delimited-escape-seq-4.c: New test.
* c-c++-common/cpp/delimited-escape-seq-5.c: New test.
* c-c++-common/cpp/delimited-escape-seq-6.c: New test.
* c-c++-common/cpp/delimited-escape-seq-7.c: New test.
* c-c++-common/cpp/named-universal-char-escape-5.c: New test.
* c-c++-common/cpp/named-universal-char-escape-6.c: New test.
* c-c++-common/cpp/named-universal-char-escape-7.c: New test.
* g++.dg/cpp23/named-universal-char-escape1.C: New test.
* g++.dg/cpp23/named-universal-char-escape2.C: New test.
|
|
PR c/90885 notes various places in real-world code where people have
written C/C++ code that uses ^ (exclusive or) where presumbably they
meant exponentiation.
For example
https://codesearch.isocpp.org/cgi-bin/cgi_ppsearch?q=2%5E32&search=Search
currently finds 11 places using "2^32", and all of them appear to be
places where the user means 2 to the power of 32, rather than 2
exclusive-orred with 32 (which is 34).
This patch adds a new -Wxor-used-as-pow warning to the C and C++
frontends to complain about ^ when the left-hand side is the decimal
constant 2 or the decimal constant 10.
This is the same name as the corresponding clang warning:
https://clang.llvm.org/docs/DiagnosticsReference.html#wxor-used-as-pow
As per the clang warning, the warning suggests converting the left-hand
side to a hexadecimal constant if you really mean xor, which suppresses
the warning (though this patch implements a fix-it hint for that, whereas
the clang implementation only has a fix-it hint for the initial
suggestion of exponentiation).
I initially tried implementing this without checking for decimals, but
this version had lots of false positives. Checking for decimals
requires extending the lexer to capture whether or not a CPP_NUMBER
token was decimal. I added a new DECIMAL_INT flag to cpplib.h for this.
Unfortunately, c_token and cp_tokens both have only an unsigned char for
their flags (as captured by c_lex_with_flags), whereas this would add
the 12th flag to cpp_tokens. Of the first 8 flags, all but BOL are used
in the C or C++ frontends, but BOL is not, so I moved that to a higher
position, using its old value for the new DECIMAL_INT flag, so that it
is representable within an unsigned char.
Example output:
demo.c:5:13: warning: result of '2^8' is 10; did you mean '1 << 8' (256)? [-Wxor-used-as-pow]
5 | int t2_8 = 2^8;
| ^
| --
| 1<<
demo.c:5:12: note: you can silence this warning by using a hexadecimal constant (0x2 rather than 2)
5 | int t2_8 = 2^8;
| ^
| 0x2
demo.c:21:15: warning: result of '10^6' is 12; did you mean '1e6'? [-Wxor-used-as-pow]
21 | int t10_6 = 10^6;
| ^
| ---
| 1e
demo.c:21:13: note: you can silence this warning by using a hexadecimal constant (0xa rather than 10)
21 | int t10_6 = 10^6;
| ^~
| 0xa
gcc/c-family/ChangeLog:
PR c/90885
* c-common.h (check_for_xor_used_as_pow): New decl.
* c-lex.cc (c_lex_with_flags): Add DECIMAL_INT to flags as appropriate.
* c-warn.cc (check_for_xor_used_as_pow): New.
* c.opt (Wxor-used-as-pow): New.
gcc/c/ChangeLog:
PR c/90885
* c-parser.cc (c_parser_string_literal): Clear ret.m_decimal.
(c_parser_expr_no_commas): Likewise.
(c_parser_conditional_expression): Likewise.
(c_parser_binary_expression): Clear m_decimal when popping the
stack.
(c_parser_unary_expression): Clear ret.m_decimal.
(c_parser_has_attribute_expression): Likewise for result.
(c_parser_predefined_identifier): Likewise for expr.
(c_parser_postfix_expression): Likewise for expr.
Set expr.m_decimal when handling a CPP_NUMBER that was a decimal
token.
* c-tree.h (c_expr::m_decimal): New bitfield.
* c-typeck.cc (parser_build_binary_op): Clear result.m_decimal.
(parser_build_binary_op): Call check_for_xor_used_as_pow.
gcc/cp/ChangeLog:
PR c/90885
* cp-tree.h (class cp_expr): Add bitfield m_decimal. Clear it in
existing ctors. Add ctor that allows specifying its value.
(cp_expr::decimal_p): New accessor.
* parser.cc (cp_parser_expression_stack_entry::flags): New field.
(cp_parser_primary_expression): Set m_decimal of cp_expr when
handling numbers.
(cp_parser_binary_expression): Extract flags from token when
populating stack. Call check_for_xor_used_as_pow.
gcc/ChangeLog:
PR c/90885
* doc/invoke.texi (Warning Options): Add -Wxor-used-as-pow.
gcc/testsuite/ChangeLog:
PR c/90885
* c-c++-common/Wxor-used-as-pow-1.c: New test.
* c-c++-common/Wxor-used-as-pow-fixits.c: New test.
* g++.dg/parse/expr3.C: Convert 2 to 0x2 to suppress
-Wxor-used-as-pow.
* g++.dg/warn/Wparentheses-10.C: Likewise.
* g++.dg/warn/Wparentheses-18.C: Likewise.
* g++.dg/warn/Wparentheses-19.C: Likewise.
* g++.dg/warn/Wparentheses-9.C: Likewise.
* g++.dg/warn/Wxor-used-as-pow-named-op.C: New test.
* gcc.dg/Wparentheses-6.c: Convert 2 to 0x2 to suppress
-Wxor-used-as-pow.
* gcc.dg/Wparentheses-7.c: Likewise.
* gcc.dg/precedence-1.c: Likewise.
libcpp/ChangeLog:
PR c/90885
* include/cpplib.h (BOL): Move macro to 1 << 12 since it is
not used by C/C++'s unsigned char token flags.
(DECIMAL_INT): New, using 1 << 6, so that it is visible as
part of C/C++'s 8 bits of token flags.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
|
|
The following patch introduces a new warning - -Winvalid-utf8 similarly
to what clang now has - to diagnose invalid UTF-8 byte sequences in
comments, but not just in those, but also in string/character literals
and outside of them.
The warning is on by default when explicit -finput-charset=UTF-8 is
used and C++23 compilation is requested and if -{,W}pedantic or
-pedantic-errors it is actually a pedwarn.
The reason it is on by default only for -finput-charset=UTF-8 is
that the sources often are UTF-8, but sometimes could be some ASCII
compatible single byte encoding where non-ASCII characters only
appear in comments. So having the warning off by default
is IMO desirable. The C++23 pedantic mode for when the source code
is UTF-8 is -std=c++23 -pedantic-errors -finput-charset=UTF-8.
2022-09-01 Jakub Jelinek <jakub@redhat.com>
PR c++/106655
libcpp/
* include/cpplib.h (struct cpp_options): Implement C++23
P2295R6 - Support for UTF-8 as a portable source file encoding.
Add cpp_warn_invalid_utf8 and cpp_input_charset_explicit fields.
(enum cpp_warning_reason): Add CPP_W_INVALID_UTF8 enumerator.
* init.cc (cpp_create_reader): Initialize cpp_warn_invalid_utf8
and cpp_input_charset_explicit.
* charset.cc (_cpp_valid_utf8): Adjust function comment.
* lex.cc (UCS_LIMIT): Define.
(utf8_continuation): New const variable.
(utf8_signifier): Move earlier in the file.
(_cpp_warn_invalid_utf8, _cpp_handle_multibyte_utf8): New functions.
(_cpp_skip_block_comment): Handle -Winvalid-utf8 warning.
(skip_line_comment): Likewise.
(lex_raw_string, lex_string): Likewise.
(_cpp_lex_direct): Likewise.
gcc/
* doc/invoke.texi (-Winvalid-utf8): Document it.
gcc/c-family/
* c.opt (-Winvalid-utf8): New warning.
* c-opts.cc (c_common_handle_option) <case OPT_finput_charset_>:
Set cpp_opts->cpp_input_charset_explicit.
(c_common_post_options): If -finput-charset=UTF-8 is explicit
in C++23, enable -Winvalid-utf8 by default and if -pedantic
or -pedantic-errors, make it a pedwarn.
gcc/testsuite/
* c-c++-common/cpp/Winvalid-utf8-1.c: New test.
* c-c++-common/cpp/Winvalid-utf8-2.c: New test.
* c-c++-common/cpp/Winvalid-utf8-3.c: New test.
* g++.dg/cpp23/Winvalid-utf8-1.C: New test.
* g++.dg/cpp23/Winvalid-utf8-2.C: New test.
* g++.dg/cpp23/Winvalid-utf8-3.C: New test.
* g++.dg/cpp23/Winvalid-utf8-4.C: New test.
* g++.dg/cpp23/Winvalid-utf8-5.C: New test.
* g++.dg/cpp23/Winvalid-utf8-6.C: New test.
* g++.dg/cpp23/Winvalid-utf8-7.C: New test.
* g++.dg/cpp23/Winvalid-utf8-8.C: New test.
* g++.dg/cpp23/Winvalid-utf8-9.C: New test.
* g++.dg/cpp23/Winvalid-utf8-10.C: New test.
* g++.dg/cpp23/Winvalid-utf8-11.C: New test.
* g++.dg/cpp23/Winvalid-utf8-12.C: New test.
|
|
The following patch implements the C++23 P2290R3 paper.
2022-08-20 Jakub Jelinek <jakub@redhat.com>
PR c++/106645
libcpp/
* include/cpplib.h (struct cpp_options): Implement
P2290R3 - Delimited escape sequences. Add delimite_escape_seqs
member.
* init.cc (struct lang_flags): Likewise.
(lang_defaults): Add delim column.
(cpp_set_lang): Copy over delimite_escape_seqs.
* charset.cc (extend_char_range): New function.
(_cpp_valid_ucn): Use it. Handle delimited escape sequences.
(convert_hex): Likewise.
(convert_oct): Likewise.
(convert_ucn): Use extend_char_range.
(convert_escape): Call convert_oct even for \o.
(_cpp_interpret_identifier): Handle delimited escape sequences.
* lex.cc (get_bidi_ucn_1): Likewise. Add end argument, fill it in.
(get_bidi_ucn): Adjust get_bidi_ucn_1 caller. Use end argument to
compute num_bytes.
gcc/testsuite/
* c-c++-common/cpp/delimited-escape-seq-1.c: New test.
* c-c++-common/cpp/delimited-escape-seq-2.c: New test.
* c-c++-common/cpp/delimited-escape-seq-3.c: New test.
* c-c++-common/Wbidi-chars-24.c: New test.
* gcc.dg/cpp/delimited-escape-seq-1.c: New test.
* gcc.dg/cpp/delimited-escape-seq-2.c: New test.
* g++.dg/cpp/delimited-escape-seq-1.C: New test.
* g++.dg/cpp/delimited-escape-seq-2.C: New test.
|
|
ISO C2x standardizes the existing #warning extension. Arrange
accordingly for it not to be diagnosed with -std=c2x -pedantic, but to
be diagnosed with -Wc11-c2x-compat.
Bootstrapped with no regressions for x86_64-pc-linux-gnu.
gcc/testsuite/
* gcc.dg/cpp/c11-warning-1.c, gcc.dg/cpp/c11-warning-2.c,
gcc.dg/cpp/c11-warning-3.c, gcc.dg/cpp/c11-warning-4.c,
gcc.dg/cpp/c2x-warning-1.c, gcc.dg/cpp/c2x-warning-2.c,
gcc.dg/cpp/gnu11-warning-1.c, gcc.dg/cpp/gnu11-warning-2.c,
gcc.dg/cpp/gnu11-warning-3.c, gcc.dg/cpp/gnu11-warning-4.c,
gcc.dg/cpp/gnu2x-warning-1.c, gcc.dg/cpp/gnu2x-warning-2.c: New
tests.
libcpp/
* include/cpplib.h (struct cpp_options): Add warning_directive.
* init.cc (struct lang_flags, lang_defaults): Add
warning_directive.
* directives.cc (DIRECTIVE_TABLE): Mark #warning as STDC2X not
EXTENSION.
(directive_diagnostics): Diagnose #warning with -Wc11-c2x-compat,
or with -pedantic for a standard not supporting #warning.
|