From bedfca647a9e9c1adadd8924f3ee0ab4189424e0 Mon Sep 17 00:00:00 2001 From: David Malcolm Date: Fri, 2 Sep 2022 18:29:33 -0400 Subject: c/c++: new warning: -Wxor-used-as-pow [PR90885] PR c/90885 notes various places in real-world code where people have written C/C++ code that uses ^ (exclusive or) where presumbably they meant exponentiation. For example https://codesearch.isocpp.org/cgi-bin/cgi_ppsearch?q=2%5E32&search=Search currently finds 11 places using "2^32", and all of them appear to be places where the user means 2 to the power of 32, rather than 2 exclusive-orred with 32 (which is 34). This patch adds a new -Wxor-used-as-pow warning to the C and C++ frontends to complain about ^ when the left-hand side is the decimal constant 2 or the decimal constant 10. This is the same name as the corresponding clang warning: https://clang.llvm.org/docs/DiagnosticsReference.html#wxor-used-as-pow As per the clang warning, the warning suggests converting the left-hand side to a hexadecimal constant if you really mean xor, which suppresses the warning (though this patch implements a fix-it hint for that, whereas the clang implementation only has a fix-it hint for the initial suggestion of exponentiation). I initially tried implementing this without checking for decimals, but this version had lots of false positives. Checking for decimals requires extending the lexer to capture whether or not a CPP_NUMBER token was decimal. I added a new DECIMAL_INT flag to cpplib.h for this. Unfortunately, c_token and cp_tokens both have only an unsigned char for their flags (as captured by c_lex_with_flags), whereas this would add the 12th flag to cpp_tokens. Of the first 8 flags, all but BOL are used in the C or C++ frontends, but BOL is not, so I moved that to a higher position, using its old value for the new DECIMAL_INT flag, so that it is representable within an unsigned char. Example output: demo.c:5:13: warning: result of '2^8' is 10; did you mean '1 << 8' (256)? [-Wxor-used-as-pow] 5 | int t2_8 = 2^8; | ^ | -- | 1<< demo.c:5:12: note: you can silence this warning by using a hexadecimal constant (0x2 rather than 2) 5 | int t2_8 = 2^8; | ^ | 0x2 demo.c:21:15: warning: result of '10^6' is 12; did you mean '1e6'? [-Wxor-used-as-pow] 21 | int t10_6 = 10^6; | ^ | --- | 1e demo.c:21:13: note: you can silence this warning by using a hexadecimal constant (0xa rather than 10) 21 | int t10_6 = 10^6; | ^~ | 0xa gcc/c-family/ChangeLog: PR c/90885 * c-common.h (check_for_xor_used_as_pow): New decl. * c-lex.cc (c_lex_with_flags): Add DECIMAL_INT to flags as appropriate. * c-warn.cc (check_for_xor_used_as_pow): New. * c.opt (Wxor-used-as-pow): New. gcc/c/ChangeLog: PR c/90885 * c-parser.cc (c_parser_string_literal): Clear ret.m_decimal. (c_parser_expr_no_commas): Likewise. (c_parser_conditional_expression): Likewise. (c_parser_binary_expression): Clear m_decimal when popping the stack. (c_parser_unary_expression): Clear ret.m_decimal. (c_parser_has_attribute_expression): Likewise for result. (c_parser_predefined_identifier): Likewise for expr. (c_parser_postfix_expression): Likewise for expr. Set expr.m_decimal when handling a CPP_NUMBER that was a decimal token. * c-tree.h (c_expr::m_decimal): New bitfield. * c-typeck.cc (parser_build_binary_op): Clear result.m_decimal. (parser_build_binary_op): Call check_for_xor_used_as_pow. gcc/cp/ChangeLog: PR c/90885 * cp-tree.h (class cp_expr): Add bitfield m_decimal. Clear it in existing ctors. Add ctor that allows specifying its value. (cp_expr::decimal_p): New accessor. * parser.cc (cp_parser_expression_stack_entry::flags): New field. (cp_parser_primary_expression): Set m_decimal of cp_expr when handling numbers. (cp_parser_binary_expression): Extract flags from token when populating stack. Call check_for_xor_used_as_pow. gcc/ChangeLog: PR c/90885 * doc/invoke.texi (Warning Options): Add -Wxor-used-as-pow. gcc/testsuite/ChangeLog: PR c/90885 * c-c++-common/Wxor-used-as-pow-1.c: New test. * c-c++-common/Wxor-used-as-pow-fixits.c: New test. * g++.dg/parse/expr3.C: Convert 2 to 0x2 to suppress -Wxor-used-as-pow. * g++.dg/warn/Wparentheses-10.C: Likewise. * g++.dg/warn/Wparentheses-18.C: Likewise. * g++.dg/warn/Wparentheses-19.C: Likewise. * g++.dg/warn/Wparentheses-9.C: Likewise. * g++.dg/warn/Wxor-used-as-pow-named-op.C: New test. * gcc.dg/Wparentheses-6.c: Convert 2 to 0x2 to suppress -Wxor-used-as-pow. * gcc.dg/Wparentheses-7.c: Likewise. * gcc.dg/precedence-1.c: Likewise. libcpp/ChangeLog: PR c/90885 * include/cpplib.h (BOL): Move macro to 1 << 12 since it is not used by C/C++'s unsigned char token flags. (DECIMAL_INT): New, using 1 << 6, so that it is visible as part of C/C++'s 8 bits of token flags. Signed-off-by: David Malcolm --- libcpp/include/cpplib.h | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) (limited to 'libcpp') diff --git a/libcpp/include/cpplib.h b/libcpp/include/cpplib.h index a7600de..1a3fb19 100644 --- a/libcpp/include/cpplib.h +++ b/libcpp/include/cpplib.h @@ -190,7 +190,7 @@ struct GTY(()) cpp_string { #define NAMED_OP (1 << 4) /* C++ named operators. */ #define PREV_FALLTHROUGH (1 << 5) /* On a token preceeded by FALLTHROUGH comment. */ -#define BOL (1 << 6) /* Token at beginning of line. */ +#define DECIMAL_INT (1 << 6) /* Decimal integer, set in c-lex.cc. */ #define PURE_ZERO (1 << 7) /* Single 0 digit, used by the C++ frontend, set in c-lex.cc. */ #define SP_DIGRAPH (1 << 8) /* # or ## token was a digraph. */ @@ -199,6 +199,7 @@ struct GTY(()) cpp_string { after a # operator. */ #define NO_EXPAND (1 << 10) /* Do not macro-expand this token. */ #define PRAGMA_OP (1 << 11) /* _Pragma token. */ +#define BOL (1 << 12) /* Token at beginning of line. */ /* Specify which field, if any, of the cpp_token union is used. */ -- cgit v1.1