path: root/clang/lib/Lex/Lexer.cpp
Age  Commit message  Author  Files  Lines
2017-07-05  Fix invalid warnings for header guards in preambles  Erik Verbruggen  1  -1/+1
Fixes https://bugs.llvm.org/show_bug.cgi?id=33574 Differential Revision: https://reviews.llvm.org/D34882 llvm-svn: 307134
2017-06-16  [PR33394] Avoid lexing editor placeholders when Clang is used only for preprocessing  Alex Lorenz  1  -1/+2
r300667 added support for editor placeholders to Clang. That commit didn’t take into account that users who use Clang for preprocessing only (-E) will get the "editor placeholder in source file" error when preprocessing their source (PR33394). This commit ensures that Clang doesn't lex editor placeholders when running a preprocessor-only action. rdar://32718000 Differential Revision: https://reviews.llvm.org/D34256 llvm-svn: 305576
2017-06-03  Added LLVM_FALLTHROUGH to address warning: this statement may fall through. NFC.  Galina Kistanova  1  -0/+2
llvm-svn: 304643
2017-05-30  Allow for unfinished #if blocks in preambles  Erik Verbruggen  1  -28/+11
Previously, a preamble only included #if blocks (and friends like ifdef) if there was a corresponding #endif before any declaration or definition. The problem is that any header file that uses include guards will not have a preamble generated, which can make code-completion very slow. To prevent errors about unbalanced preprocessor conditionals in the preamble, and unbalanced preprocessor conditionals after a preamble containing unfinished conditionals, the conditional stack is stored in the pch file. This fixes PR26045. Differential Revision: http://reviews.llvm.org/D15994 llvm-svn: 304207
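The include-guard case described above can be sketched as follows. This is a minimal illustration, not Clang's implementation: the names `directiveName` and `unclosedConditionals` and the line-based input are assumptions made for the example, whereas the real lexer works on tokens and serializes the conditional stack into the PCH file.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: returns the directive name on a line such as
// "  # ifndef FOO", or "" if the line is not a preprocessor directive.
static std::string directiveName(const std::string &line) {
  std::size_t i = line.find_first_not_of(" \t");
  if (i == std::string::npos || line[i] != '#')
    return "";
  i = line.find_first_not_of(" \t", i + 1);
  if (i == std::string::npos)
    return "";
  const std::size_t end = line.find_first_of(" \t(", i);
  return line.substr(i, end == std::string::npos ? std::string::npos : end - i);
}

// Collects the conditionals that are still open after the given lines.
// A preamble that ends inside an include guard leaves one entry behind.
std::vector<std::string>
unclosedConditionals(const std::vector<std::string> &lines) {
  std::vector<std::string> stack;
  for (const std::string &line : lines) {
    const std::string name = directiveName(line);
    if (name == "if" || name == "ifdef" || name == "ifndef")
      stack.push_back(line);
    else if (name == "endif" && !stack.empty())
      stack.pop_back();
  }
  return stack;
}
```

For a guarded header whose preamble stops before the closing `#endif`, the returned stack holds the `#ifndef` line; that leftover state is what has to be carried over to the code that follows the preamble.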
2017-05-17  [Lexer] Ensure that the token is not an annotation token when retrieving the identifier info for an Objective-C keyword  Alex Lorenz  1  -0/+4
This commit fixes an assertion that's triggered in getIdentifier when the token is an annotation token. rdar://32225463 llvm-svn: 303246
2017-05-05  Add a fix-it for -Wunguarded-availability  Alex Lorenz  1  -17/+49
This patch adds a fix-it for the -Wunguarded-availability warning. This fix-it is similar to the Swift one: it suggests that you wrap the statement in an `if (@available)` check. The produced fixits are indented (just like the Swift ones) to make them look nice in Xcode's fix-it preview. rdar://31680358 Differential Revision: https://reviews.llvm.org/D32424 llvm-svn: 302253
2017-04-19  Add support for editor placeholders to Clang  Alex Lorenz  1  -0/+33
This commit teaches Clang to recognize editor placeholders that are produced when an IDE like Xcode inserts a code-completion result that includes a placeholder. Now when the lexer sees a placeholder token, it emits an 'editor placeholder in source file' error and creates an identifier token that represents the placeholder. The parser/sema can now recognize the placeholders and can suppress the diagnostics related to the placeholders. This ensures that live issues in an IDE like Xcode won't get spurious diagnostics related to placeholders. This commit also adds a new compiler option named '-fallow-editor-placeholders' that silences the 'editor placeholder in source file' error. This is useful for an IDE like Xcode as we don't want to display those errors in live issues. rdar://31581400 Differential Revision: https://reviews.llvm.org/D32081 llvm-svn: 300667
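As a rough sketch of the recognition step, the function below finds a placeholder at a given buffer position. It assumes the `<#...#>` spelling that Xcode uses for placeholders, which the commit message does not spell out, and the name `editorPlaceholderLength` is hypothetical rather than taken from Clang.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Returns the length of an editor placeholder of the assumed form <#...#>
// starting at `pos`, or 0 if no complete placeholder starts there.
std::size_t editorPlaceholderLength(std::string_view buffer, std::size_t pos) {
  if (pos >= buffer.size())
    return 0;
  const std::string_view rest = buffer.substr(pos);
  if (rest.size() < 2 || rest[0] != '<' || rest[1] != '#')
    return 0;
  // Search for the terminator after the opening "<#".
  const std::size_t close = rest.find("#>", 2);
  if (close == std::string_view::npos)
    return 0;
  return close + 2;
}
```

A lexer built on this would turn the matched range into a single identifier-like token and report the error unless the allow option is set; an unterminated `<#` falls through to ordinary lexing.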
2017-04-18  Do not warn about whitespace between ??/ trigraph and newline in line comments if trigraphs are disabled in the current language.  Richard Smith  1  -4/+6
llvm-svn: 300609
2017-04-17  Fix mishandling of escaped newlines followed by newlines or nuls.  Richard Smith  1  -18/+10
Previously, if an escaped newline was followed by a newline or a nul, we'd lex the escaped newline as a bogus space character. This led to a bunch of different broken corner cases: For the pattern "\\\n\0#", we would then have a (horizontal) space whose spelling ends in a newline, and would decide that the '#' is at the start of a line, and incorrectly start preprocessing a directive in the middle of a logical source line. If we were already in the middle of a directive, this would result in our attempting to process multiple directives at the same time! This resulted in crashes, asserts, and hangs on invalid input, as discovered by fuzz-testing. For the pattern "\\\n" at EOF (with an implicit following nul byte), we would produce a bogus trailing space character with spelling "\\\n". This was mostly harmless, but would lead to clang-format getting confused and misformatting in rare cases. We now produce a trailing EOF token with spelling "\\\n", consistent with our handling for other similar cases -- an escaped newline is always part of the token containing the next character, if any. For the pattern "\\\n\n", this was somewhat more benign, but would produce an extraneous whitespace token to clients who care about preserving whitespace. However, it turns out that our lexing for line comments was relying on this bug due to an off-by-one error in its computation of the end of the comment, on the slow path where the comment might contain escaped newlines. llvm-svn: 300515
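The expected outcome can be modelled with an eager line splice, as in translation phase 2. This is only a sketch of the intended behaviour under that simplification, not of the patch itself, and both function names are made up for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Removes every backslash-newline pair, as translation phase 2 does.
std::string spliceLines(std::string_view source) {
  std::string out;
  out.reserve(source.size());
  for (std::size_t i = 0; i < source.size(); ++i) {
    if (source[i] == '\\' && i + 1 < source.size() && source[i + 1] == '\n') {
      ++i; // skip the newline as well as the backslash
      continue;
    }
    out.push_back(source[i]);
  }
  return out;
}

// True if the character at `pos` in already-spliced text is the first
// non-blank character of its line, i.e. a '#' there may start a directive.
bool atLogicalLineStart(std::string_view spliced, std::size_t pos) {
  while (pos > 0 && (spliced[pos - 1] == ' ' || spliced[pos - 1] == '\t'))
    --pos;
  return pos == 0 || spliced[pos - 1] == '\n';
}
```

After splicing, a `#` that directly follows an escaped newline sits in the middle of a logical line and must not start a directive, while `"\\\n\n#"` leaves a real newline in front of the `#`.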
2017-04-07  Skip Unicode character expansion in assembly files  Sanne Wouda  1  -9/+11
Summary: When using the C preprocessor with assembly files, either with a capital `S` file extension, or with `-xassembler-with-cpp`, the Unicode escape sequence `\u` is ignored. The `\u` pattern can be used for expanding a macro argument that starts with `u`. Author: Salman Arif <salman.arif@arm.com> Reviewers: rengolin, olista01 Reviewed By: olista01 Subscribers: cfe-commits Differential Revision: https://reviews.llvm.org/D31765 llvm-svn: 299754
2016-12-30  Allow lexer to handle string_view literals. Patch from Anton Bikineev.  Eric Fiselier  1  -3/+3
This implements the compiler side of p0403r0. This patch was reviewed as https://reviews.llvm.org/D26829. llvm-svn: 290744
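For context, this is the kind of literal the lexer has to accept: the standard library `sv` suffix from C++17. The example is a usage sketch only; the function name is hypothetical and nothing here reflects the lexer change itself.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// The sv suffix passes the literal's full length to the library, so an
// embedded nul does not cut the view short.
std::size_t literalSizeWithEmbeddedNul() {
  using namespace std::literals::string_view_literals;
  constexpr std::string_view text = "ab\0cd"sv;
  return text.size();
}
```

Constructing a `std::string_view` from the same literal without the suffix would stop at the first nul, which is one practical reason to prefer the suffix form.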
2016-09-30  Move UTF functions into namespace llvm.  Justin Lebar  1  -12/+12
Summary: This lets people link against LLVM and their own version of the UTF library.
I determined this only affects llvm, clang, lld, and lldb by running
  $ git grep -wl 'UTF[0-9]\+\|\bConvertUTF\bisLegalUTF\|getNumBytesFor' | cut -f 1 -d '/' | sort | uniq
  clang
  lld
  lldb
  llvm
Tested with
  ninja lldb
  ninja check-clang check-llvm check-lld
(ninja check-lldb doesn't complete for me with or without this patch.)
Reviewers: rnk Subscribers: klimek, beanz, mgorny, llvm-commits Differential Revision: https://reviews.llvm.org/D24996 llvm-svn: 282822
2016-09-07  Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes.  Eugene Zelenko  1  -20/+22
Differential revision: https://reviews.llvm.org/D24115 llvm-svn: 280870
2016-07-27  Implement filtering for code completion of identifiers.  Vassil Vassilev  1  -1/+9
Patch by Cristina Cristescu and Axel Naumann! Agreed on post commit review (D17820). llvm-svn: 276878
2016-04-01  [Lexer] Let the compiler infer string lengths. No functionality change intended.  Benjamin Kramer  1  -2/+2
llvm-svn: 265126
2016-04-01  [Lexer] Don't read out of bounds if a conflict marker is at the end of a file  Benjamin Kramer  1  -1/+1
This can happen as we look for '<<<<' while scanning tokens but then expect '<<<<\n' to tell apart perforce from diff3 conflict markers. Just harden the pointer arithmetic. Found by libfuzzer + asan! llvm-svn: 265125
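One way to harden such a check is to compare against a length-clamped view instead of indexing past the current pointer. The sketch below illustrates that idea with `std::string_view`; the helper name `hasMarkerAt` is hypothetical, and the actual one-line fix in Lexer.cpp may look different.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// True if `marker` occurs at `pos`, without ever reading past the end of
// `buffer`. substr() clamps the length, so a marker that would run off the
// end of the file simply fails to match.
bool hasMarkerAt(std::string_view buffer, std::size_t pos,
                 std::string_view marker) {
  if (pos > buffer.size())
    return false;
  return buffer.substr(pos, marker.size()) == marker;
}
```

With this shape, a file that ends right after `<<<<` matches the short marker but not `<<<<\n`, and no byte beyond the buffer is touched.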
2016-03-04  Update diagnostics now that hexadecimal literals look likely to be part of C++17.  Richard Smith  1  -2/+3
llvm-svn: 262753
2016-02-18  Remove use of builtin comma operator.  Richard Trieu  1  -1/+3
Cleanup for upcoming Clang warning -Wcomma. No functionality change intended. llvm-svn: 261271
2016-02-03  [OpenCL] Adding reserved operator logical xor for OpenCL  Anastasia Stulova  1  -0/+3
This patch adds the reserved operator ^^ when compiling for OpenCL (spec v1.1 s6.3.g), which results in a more meaningful error message. Patch by Neil Hickey! Review: http://reviews.llvm.org/D13280 M test/SemaOpenCL/unsupported.cl M include/clang/Basic/TokenKinds.def M include/clang/Basic/DiagnosticParseKinds.td M lib/Basic/OperatorPrecedence.cpp M lib/Lex/Lexer.cpp M lib/Parse/ParseExpr.cpp llvm-svn: 259651
2016-01-26  Fix -Wnull-conversion for long macros.  Richard Trieu  1  -0/+25
Move the function to get a macro name from DiagnosticRenderer.cpp to Lexer.cpp so that other files can use it. Lexer now has two functions to get the immediate macro name, the newly added one is better for diagnostic purposes. Make -Wnull-conversion use this function for better NULL macro detection. llvm-svn: 258778
2015-12-29  Emit a -Wmicrosoft warning when treating ^Z as EOF in MS mode.  Nico Weber  1  -1/+4
llvm-svn: 256596
2015-11-20  [clang] Disable Unicode in asm files  Vinicius Tinti  1  -2/+6
Clang should not convert tokens to Unicode when preprocessing assembly files. Fixes PR25558. llvm-svn: 253738
2015-11-14  Use %select to merge similar diagnostics. NFC  Craig Topper  1  -5/+5
llvm-svn: 253119
2015-10-22  Disable trigraph and escaped newline expansion on all types of raw string literals not just ASCII type.  Craig Topper  1  -1/+1
llvm-svn: 251025
2015-06-01  Replace a few std::string& with StringRef. NFC.  Rafael Espindola  1  -1/+1
Patch by Косов Евгений! llvm-svn: 238774
2015-05-04  Fix buffer overflow in Lexer  Kostya Serebryany  1  -1/+1
Summary: Fix PR22407, where the Lexer overflows the buffer when parsing #include<\ (end of file after slash) Test Plan: Added a test that will trigger in asan build. This case is also covered by the clang-fuzzer bot. Reviewers: rnk Reviewed By: rnk Subscribers: cfe-commits Differential Revision: http://reviews.llvm.org/D9489 llvm-svn: 236466
2015-03-06  Use delegating ctors to reduce code duplication. NFC.  Benjamin Kramer  1  -8/+2
llvm-svn: 231476
2014-12-14  Lex: Don't crash if both conflict markers are on the same line  David Majnemer  1  -2/+2
We would check if the terminator marker is on a newline. However, the logic would end up out-of-bounds if the terminator marker immediately follows the start marker. This fixes PR21820. llvm-svn: 224210
2014-11-08  [c++1z] Support for u8 character literals.  Richard Smith  1  -6/+14
llvm-svn: 221576
2014-10-29  Fix warning in Altivec code when building with GCC 4.8.2 on Ubuntu 14.04.  Jay Foad  1  -1/+1
llvm-svn: 220855
2014-08-19  C++1y is now C++14!  Aaron Ballman  1  -2/+2
Changes diagnostic options, language standard options, diagnostic identifiers, diagnostic wording to use c++14 instead of c++1y. It also modifies related test cases to use the updated diagnostic wording. llvm-svn: 215982
2014-08-12  Use StringRef instead of MemoryBuffer&.  Rafael Espindola  1  -7/+7
This code doesn't care where the data it is processing comes from, so a StringRef is probably the most natural interface. llvm-svn: 215448
2014-08-11  Change MemoryBuffer* to MemoryBuffer& parameter to Lexer::ComputePreamble  David Blaikie  1  -9/+9
(dropping const from the reference as MemoryBuffer is immutable already, so const is just redundant - and while I'd personally put const everywhere, that's not the LLVM Way (see llvm::Type for another example of an immutable type where "const" is omitted for brevity)) Changing the pointer argument to a reference parameter makes call sites identical between callers with unique_ptrs or raw pointers, minimizing the churn in a pending unique_ptr migration. llvm-svn: 215391
2014-06-15  Hide the concept of diagnostic levels from lex, parse and sema  Alp Toker  1  -6/+3
The compilation pipeline doesn't actually need to know about the high-level concept of diagnostic mappings, and hiding the final computed level presents several simplifications and other potential benefits. The only exceptions are opportunistic checks to see whether expensive code paths can be avoided for diagnostics that are guaranteed to be ignored at a certain SourceLocation. This commit formalizes that invariant by introducing and using DiagnosticsEngine::isIgnored() in place of individual level checks throughout lex, parse and sema. llvm-svn: 211005
2014-05-18  Remove historical Unicode TODOs  Alp Toker  1  -16/+3
There's no immediate demand or plan to work on these. llvm-svn: 209090
2014-05-17  [C++11] Use 'nullptr'. Lex edition.  Craig Topper  1  -9/+12
llvm-svn: 209083
2014-05-17  Provide and use a safe Token::getRawIdentifier() accessor  Alp Toker  1  -3/+2
llvm-svn: 209061
2014-04-03  Revert r205436:  Roman Divacky  1  -28/+5
Extend the SSE2 comment lexing to AVX2. Only 16byte align when not on AVX2. This provides some 3% speedup when preprocessing gcc.c as a single file. The patch is wrong, it always uses SSE2, and when I fix that there's no speedup at all. I am not sure where the 3% came from previously. llvm-svn: 205548
2014-04-02  Extend the SSE2 comment lexing to AVX2. Only 16byte align when not on AVX2.  Roman Divacky  1  -5/+28
This provides some 3% speedup when preprocessing gcc.c as a single file. llvm-svn: 205436
2014-03-02  [C++11] Replace llvm::tie with std::tie.  Benjamin Kramer  1  -1/+1
llvm-svn: 202639
2014-02-28  Fix a minor bug in lexing pp-numbers with digit separators: if a pp-number contains "'e+", the pp-number ends between the 'e' and the '+'.  Richard Smith  1  -0/+1
llvm-svn: 202533
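The rule can be illustrated with a small pp-number scanner. This is a simplified sketch rather than Clang's code: it handles ASCII only, treats `p`/`P` like `e`/`E` as C99 and C++17 do, and the names `ppNumberLength`, `isIdentifierBody` and `isDigit` are invented for the example.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string_view>

static bool isIdentifierBody(char c) {
  return std::isalnum(static_cast<unsigned char>(c)) || c == '_';
}

static bool isDigit(char c) { return c >= '0' && c <= '9'; }

// Returns the length of the pp-number at the start of `s`, or 0 if `s`
// does not begin with one.
std::size_t ppNumberLength(std::string_view s) {
  std::size_t i = 0;
  if (!s.empty() && isDigit(s[0]))
    i = 1;
  else if (s.size() >= 2 && s[0] == '.' && isDigit(s[1]))
    i = 2;
  else
    return 0;

  while (i < s.size()) {
    const char c = s[i];
    const bool hasNext = i + 1 < s.size();
    if (c == '\'') {
      // A separator is consumed together with the digit or nondigit after
      // it, so that character can no longer begin an exponent like "e+".
      if (!hasNext || !isIdentifierBody(s[i + 1]))
        break;
      i += 2;
      continue;
    }
    if ((c == 'e' || c == 'E' || c == 'p' || c == 'P') && hasNext &&
        (s[i + 1] == '+' || s[i + 1] == '-')) {
      i += 2; // exponent followed by a sign
      continue;
    }
    if (isIdentifierBody(c) || c == '.') {
      ++i;
      continue;
    }
    break;
  }
  return i;
}
```

Under these rules `1e+2` is one pp-number of four characters, whereas `1'e+2` stops after `1'e`, leaving `+` and `2` as separate tokens.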
2014-02-17  PR18855: Add support for UCNs and UTF-8 encoding within ud-suffixes.  Richard Smith  1  -60/+90
llvm-svn: 201532
2014-01-14  Rename language option MicrosoftMode to MSVCCompat  Alp Toker  1  -4/+4
There's been long-standing confusion over the role of these two options. This commit makes the necessary changes to differentiate them clearly, following up from r198936. MicrosoftExt (aka. -fms-extensions): Enable largely unobjectionable Microsoft language extensions to ease portability. This mode, also supported by gcc, is used for building software like FreeBSD and Linux kernel extensions that share code with Windows drivers. MSVCCompat (aka. -fms-compatibility, formerly MicrosoftMode): Turn on a special mode supporting 'heinous' extensions for drop-in compatibility with the Microsoft Visual C++ product. Standards-compliant C and C++ code isn't guaranteed to work in this mode. Implies MicrosoftExt. Note that full -fms-compatibility mode is currently enabled by default on the Windows target, which may need tuning to serve as a reasonable default. See cfe-commits for the full discourse, thread 'r198497 - Move MS predefined type_info out of InitializePredefinedMacros' No change in behaviour. llvm-svn: 199209
2014-01-07  Sort all the #include lines with LLVM's utils/sort_includes.py which encodes the canonical rules for LLVM's style.  Chandler Carruth  1  -1/+1
I noticed this had drifted quite a bit when cleaning up LLVM, so wanted to clean up Clang as well. llvm-svn: 198686
2013-12-14  Lexer: Issue -Wbackslash-newline-escape for line comments  Alp Toker  1  -1/+8
The warning for backslash and newline separated by whitespace was missed in this code path. backslash<whitespace><newline> is handled differently from compiler to compiler so it's important to warn consistently where there's ambiguity. Matches similar handling of block comments and non-comment lines. llvm-svn: 197331
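A minimal sketch of the condition being diagnosed: a backslash, then horizontal whitespace, then a newline. The struct and function names are hypothetical, and `\r\n` line endings are ignored to keep the example short.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

struct EscapedNewline {
  std::size_t length;  // characters from the backslash through the newline
  bool hasWhitespace;  // true if blanks sit between backslash and newline
};

// Scans for backslash [space|tab]* newline starting at `pos`.
// Returns {0, false} when no escaped newline starts there.
EscapedNewline scanEscapedNewline(std::string_view s, std::size_t pos) {
  if (pos >= s.size() || s[pos] != '\\')
    return {0, false};
  std::size_t i = pos + 1;
  while (i < s.size() && (s[i] == ' ' || s[i] == '\t'))
    ++i;
  if (i >= s.size() || s[i] != '\n')
    return {0, false};
  return {i + 1 - pos, i > pos + 1};
}
```

A caller that continues a line comment across the escaped newline would emit the warning whenever `hasWhitespace` is set, so that line comments, block comments and ordinary lines all report the ambiguity in the same way.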
2013-12-13  Fix raw lex crash and -frewrite-includes noeol-at-eof failure  Alp Toker  1  -1/+2
Raw lexers don't have a preprocessor so we need to null check. llvm-svn: 197245
2013-10-21  Lex: Don't restrict legal UCNs when preprocessing assembly  Justin Bogner  1  -0/+4
The C and C++ standards disallow using universal character names to refer to some characters, such as basic ASCII and control characters, so we reject these sequences in the lexer. However, when the preprocessor isn't being used on C or C++, it doesn't make sense to apply these restrictions. Notably, accepting these characters avoids issues with Unicode escapes when GHC uses the compiler as a preprocessor on Haskell sources. Fixes rdar://problem/14742289 llvm-svn: 193067
2013-09-26  Per updates to D3781, allow underscore under ' in a pp-number, and allow ' in a #line directive.  Richard Smith  1  -1/+1
llvm-svn: 191443
2013-09-26  Implement C++1y digit separator proposal (' as a digit separator). This is not yet approved by full committee, but was unanimously supported by EWG.  Richard Smith  1  -0/+12
llvm-svn: 191417
2013-09-24  Avoid a signed/unsigned comparison warning with compilers that don't know how to handle constant expressions.  Richard Smith  1  -1/+1
llvm-svn: 191336