path: root/clang/lib/Lex
Age  Commit message  Author  Files  Lines
4 days[clang] Use the VFS to check the system framework marker (#160946)Jan Svoboda1-2/+1
This PR uses the VFS/`FileManager` to check the system framework marker instead of going straight to the real file system. This matches the behavior of other input files of the compiler.
8 days[Clang]: prevent assertion on empty filename arg in __has_embed (#159928)Oleksandr T.1-1/+3
Fixes #159898 --- This PR addresses the issue of Clang asserting when `__has_embed` is used with an empty filename ```c #if __has_embed("") #endif ```
2025-09-11[clang] Fix assertion with invalid embed limit parameter value (#157896)Mariya Podchishchaeva1-0/+2
If a negative value was given we would fail to skip till the end of the directive and trip a failed assertion. Fixes https://github.com/llvm/llvm-project/issues/157842
2025-09-10Reland "[clang] Delay normalization of `-fmodules-cache-path` (#150123)"Jan Svoboda1-0/+7
This reverts commit 613caa909c78f707e88960723c6a98364656a926, essentially reapplying 4a4bddec3571d78c8073fa45b57bbabc8796d13d after moving `normalizeModuleCachePath` from clangFrontend to clangLex. This PR is part of an effort to remove file system usage from the command line parsing code. The reason for that is that it's impossible to do file system access correctly without a configured VFS, and the VFS can only be configured after the command line is parsed. I don't want to intertwine command line parsing and VFS configuration, so I decided to perform the file system access after the command line is parsed and the VFS is configured - ideally right before the file system entity is used for the first time. This patch delays normalization of the module cache path until `CompilerInstance` is asked for the cache path in the current compilation context.
2025-09-06Revert "[clang][Modules] Reporting Errors for Duplicating Link Declar… ↵Qiongsi Wu1-30/+4
(#157154) …ations in `modulemap`s (#148959)" This reverts commit 538e9e8ebd09233b3900ed2dfd23e4e1ca5c9fc0 for two reasons. 1. Link decls in submodules can make sense even if the submodule is not explicit. We need to review the error check. This PR reverts the check so we still allow link decls in submodules. 2. It is not a fatal error to have duplicating link decls. The linker deduplicates them anyways. rdar://159467837
2025-08-22[clang][Modules] Reporting Errors for Duplicating Link Declarations in ↵Qiongsi Wu1-4/+30
`modulemap`s (#148959) This PR teaches the modulemap parsing logic to report warnings that default to errors if the parsing logic sees duplicate link declarations in the same module. Specifically, duplicate link declarations are multiple link declarations with the same string literal in the same module. No errors are reported if the same link declaration exists in both a submodule and its enclosing module. The warning can be disabled with `-Wno-module-link-redeclaration`. rdar://155880064
2025-08-18Reland [clang][modules-driver] Add scanner to detect C++20 module presence ↵Naveen Seth Hanig1-0/+50
(#153497) This patch is part of a series to support driver managed module builds for C++ named modules and Clang modules. This introduces a scanner that detects C++ named module usage early in the driver with only negligible overhead. For now, it is enabled only with the `-fmodules-driver` flag and serves solely diagnostic purposes. In the future, the scanner will be enabled for any (modules-driver compatible) compilation with two or more inputs, and will help the driver determine whether to implicitly enable the modules driver. Since the scanner adds very little overhead, we are also exploring enabling it for compilations with only a single input. This approach could allow us to detect `import std` usage in a single-file compilation, which would then activate the modules driver. For performance measurements on this, see https://github.com/naveen-seth/llvm-dev-cxx-modules-check-benchmark. RFC for driver managed module builds: https://discourse.llvm.org/t/rfc-modules-support-simple-c-20-modules-use-from-the-clang-driver-without-a-build-system This patch relands the reland (2d31fc8) for commit ded1426. The earlier reland failed due to a missing link dependency on `clangLex`. This reland fixes the issue by adding the link dependency after discussing it in the following RFC: https://discourse.llvm.org/t/rfc-driver-link-the-driver-against-clangdependencyscanning-clangast-clangfrontend-clangserialization-and-clanglex
2025-08-18[clang] Allow trivial pp-directives before C++ module directive (#153641)yronglin2-12/+37
Consider the following code: ```cpp # 1 __FILE__ 1 3 export module a; ``` According to the wording in [P1857R3](https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2020/p1857r3.html): ``` A module directive may only appear as the first preprocessing tokens in a file (excluding the global module fragment.) ``` and the wording in [[cpp.pre]](https://eel.is/c++draft/cpp.pre#nt:module-file) ``` module-file: pp-global-module-fragment[opt] pp-module group[opt] pp-private-module-fragment[opt] ``` `#` is the first pp-token in the translation unit, and it was rejected by clang, but such directives really should be exempted from this rule. The goal is to not allow any preprocessor conditionals or most state changes, but these don't fit that. State change would mean most semantically observable preprocessor state, particularly anything that is order dependent. Global flags like being a system header/module shouldn't matter. We should exempt a bunch of directives, even though it violates the current standard wording. In this patch, we introduce a `TrivialDirectiveTracer` to trace the **State change** described above and propose to exempt the following kinds of directives: `#line`, GNU line marker, `#ident`, `#pragma comment`, `#pragma mark`, `#pragma detect_mismatch`, `#pragma clang __debug`, `#pragma message`, `#pragma GCC warning`, `#pragma GCC error`, `#pragma gcc diagnostic`, `#pragma OPENCL EXTENSION`, `#pragma warning`, `#pragma execution_character_set`, `#pragma clang assume_nonnull` and builtin macro expansion. Fixes https://github.com/llvm/llvm-project/issues/145274 --------- Signed-off-by: yronglin <yronglin777@gmail.com>
2025-08-11[Clang] Fixed a crash when parsing #embed parameters with unmatched closing ↵Oleksandr T.1-1/+5
brackets (#152877) Fixes #152829 --- This patch addresses the issue where the preprocessor could crash when parsing `#embed` parameters containing unmatched closing brackets ```cpp #embed "file" prefix(]) #embed "file" prefix(}) ```
2025-07-28[Clang] Reland '__has_builtin should return false for aux triple builtins' ↵Nick Sarnie1-2/+6
(#126324) Reland https://github.com/llvm/llvm-project/pull/121839 based on the results of the Discourse discussion [here](https://discourse.llvm.org/t/rfc-has-builtin-behavior-on-offloading-targets/84964). --------- Signed-off-by: Sarnie, Nick <nick.sarnie@intel.com>
2025-07-22[clang] Check empty macro name in `#pragma push_macro("")` or `#pragma ↵yronglin1-1/+11
pop_macro("")` (#149982) Fixes https://github.com/llvm/llvm-project/issues/149762. --------- Signed-off-by: yronglin <yronglin777@gmail.com>
2025-07-21Revert "Reland [clang][modules-driver] Add scanner to detect C++20 module ↵Naveen Seth Hanig1-49/+0
presence" (#149900) Reverts llvm/llvm-project#147630. This causes a linker error caused by linking the driver against the lexer.
2025-07-21Reland [clang][modules-driver] Add scanner to detect C++20 module presence ↵Naveen Seth Hanig1-0/+49
(#147630) This patch is part of a series to natively support C++20 module usage from the Clang driver (without requiring an external build system). This introduces a new scanner that detects C++20 module usage in source files without using the preprocessor or lexer. For now, it is enabled only with the `-fmodules-driver` flag and serves solely diagnostic purposes. In the future, the scanner will be enabled for any (modules-driver compatible) compilation with two or more inputs, and will help the driver determine whether to implicitly enable the modules driver. Since the scanner adds very little overhead, we are also exploring enabling it for compilations with only a single input. This approach could allow us to detect `import std` usage in a single-file compilation, which would then activate the modules driver. For performance measurements on this, see https://github.com/naveen-seth/llvm-dev-cxx-modules-check-benchmark. RFC: https://discourse.llvm.org/t/rfc-modules-support-simple-c-20-modules-use-from-the-clang-driver-without-a-build-system This patch relands commit ded1426. The CI failure is resolved by removing the compatibility warning for using the `-fmodules-driver` flag with pre-C++20 standards, which also better aligns its behavior with other features/flags supported only in newer standards.
2025-07-21[clang] Don't warn on zero literals with -std=c2y (#149688)Timothy Herchen1-1/+1
Fixes #149669; the old check compared with the end of the literal, but we can just check that after parsing digits, we're pointing to one character past the token start.
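The check described in this message can be sketched roughly as follows. This is only an illustration under stated assumptions: `isSingleZeroAfterDigits` and its parameter are hypothetical names and do not come from Clang's literal parser.

```cpp
#include <string_view>

// Hypothetical helper, not Clang's code: skip the leading run of decimal
// digits and report whether the cursor stopped exactly one character past
// the token start, i.e. the digit sequence is the single character '0'.
constexpr bool isSingleZeroAfterDigits(std::string_view Tok) {
  if (Tok.empty() || Tok.front() != '0')
    return false;
  const char *Start = Tok.data();
  const char *End = Start + Tok.size();
  const char *S = Start;
  while (S != End && *S >= '0' && *S <= '9')
    ++S;
  return S == Start + 1;
}

static_assert(isSingleZeroAfterDigits("0"));    // plain zero
static_assert(isSingleZeroAfterDigits("0u"));   // zero with a suffix
static_assert(!isSingleZeroAfterDigits("00"));  // two digits
static_assert(!isSingleZeroAfterDigits("017")); // unprefixed octal
```

Comparing against the token start rather than the token end means a suffix after the digit does not affect the result.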
2025-07-19 [clang][deps] Properly capture the global module and '\n' for all module ↵Naveen Seth Hanig2-15/+22
directives (#148685) Previously, the newline after a module directive was not properly captured and printed by `clang::printDependencyDirectivesAsSource`. According to P1857R3, each directive must, after skipping horizontal whitespace, appear at the start of a logical line. Because the newline after module directives was missing, this invalidated the following line. This fixes tests that were previously in violation of P1857R3, including for Objective-C directives, which should also comply with P1857R3. This also ensures that the global module fragment `module;` is captured by the dependency directives scanner.
2025-07-15Remove Native Client support (#133661)Brad Smith1-1/+0
Remove the Native Client support now that it has finally reached end of life.
2025-07-15[clang][deps] Fix dependency scanner misidentifying 'import::' as module ↵Naveen Seth Hanig1-0/+7
partition (#148674) The dependency directive scanner was incorrectly classifying namespaces such as `import::inner xi` as directives. According to P1857R3, `import` should not be treated as a directive when followed by `::`. This change fixes that behavior.
2025-07-09Address a handful of C4146 compiler warnings where literals can be replaced ↵Alex Sepkowski1-1/+2
with std::numeric_limits (#147623) This PR addresses instances of compiler warning C4146 that can be replaced with std::numeric_limits. Specifically, these are cases where a literal such as '-1ULL' was used to assign a value to a uint64_t variable. The intent is much clearer if we use the appropriate `std::numeric_limits<Type>::max()` value for these cases. Addresses #147439
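A minimal sketch of the kind of replacement described here. The variable name `Sentinel` is made up for illustration and is not taken from the patch.

```cpp
#include <cstdint>
#include <limits>

// Before (shown as a comment): negating an unsigned literal wraps around to
// the maximum value, and MSVC reports warning C4146 for it.
//   constexpr std::uint64_t Sentinel = -1ULL;

// After: the same value, with the intent spelled out.
constexpr std::uint64_t Sentinel = std::numeric_limits<std::uint64_t>::max();

static_assert(Sentinel == 0xFFFFFFFFFFFFFFFFULL);
static_assert(Sentinel == ~std::uint64_t{0});
```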
2025-07-08[HLSL][RootSignature] Correct `RootSignatureParser` to use correct ↵Finn Plummer1-3/+3
`SourceLocation` in diagnostics (#147084) The `SourceLocation` of a `RootSignatureToken` is incorrectly set to be the "offset" into the concatenated string that denotes the rootsignature. This causes an issue when the `StringLiteral` is a multi-line expansion macro, since the offset will not account for the characters between `StringLiteral` tokens. This PR resolves this by retaining the `SourceLocation` information that is kept in `StringLiteral` and then converting the offset in the concatenated string into the proper `SourceLocation` using the `StringLiteral::getLocationOfByte` interface. To do so, we will need to adjust the `RootSignatureToken` to only hold its offset into the root signature string. Then when the parser will use the token, it will need to compute its actual `SourceLocation`. See linked issue for more context. For example: ``` #define DemoRootSignature \ "CBV(b0)," \ "RootConstants(num32BitConstants = 3, b0, invalid)" expected caret location ---------------^ actual caret location ------------^ ``` The caret points 5 characters early because the current offset did not account for the characters: ``` '"' ' ' '\' ' ' '"' 1 2 3 4 5 ``` - Updates `RootSignatureParser` to retain `SourceLocation` information by retaining the `StringLiteral` and passing the underlying `StringRef` to the `Lexer` - Updates `RootSignatureLexer` so that the constructed tokens only reflect an offset into the `StringRef` - Updates `RootSignatureParser` to directly construct its used `Lexer` so that the `StringLiteral` is directly tied with the string used in the `RootSignatureLexer` - Updates `RootSignatureParser` to use `StringLiteral::getLocationOfByte` to get the actual token location for diagnostics - Updates `ParseHLSLRootSignatureTest` to construct a phony `AST`/`StringLiteral` for the test cases - Adds a test to `RootSignature-err.hlsl` showing that the `SourceLocation` is correctly set for diagnostics in a multi-line macro expansion Resolves: https://github.com/llvm/llvm-project/issues/146967
2025-07-08[win][clang] Do not inject static_assert macro definition (#147030)Mariya Podchishchaeva1-17/+0
In ms-compatibility mode we inject static_assert macro definition if assert macro is defined. This is done by 8da090381d567d0ec555840f6b2a651d2997e4b3 for the sake of better diagnosing, in particular to emit a compatibility warning when static_assert keyword is used without inclusion of <assert.h>. Unfortunately it doesn't do a good job in c99 mode adding that macro unexpectedly for the users, so this patch removes macro injection and the diagnostics. --------- Co-authored-by: Corentin Jabot <corentinjabot@gmail.com>
2025-07-07[clang][deps] Stop lexing if hit a failure while loading a PCH/module in a ↵Volodymyr Sapsai1-0/+3
submodule. (#146976) Otherwise we are continuing in an invalid state and can easily crash. It is a follow-up to cde90e68f8123e7abef3f9e18d79980aa19f460a but an important difference is when a failure happens in a submodule. In this case in `Preprocessor::HandleEndOfFile` `tok::eof` is replaced by `tok::annot_module_end`. And after exiting a file with bad `#include/#import` we work with a new buffer, so `BufferPtr < BufferEnd`. As there are no signs to stop lexing we just keep doing it. The fix is the same as in dc9fdaf2171cc480300d5572606a8ede1678d18b in `Lexer::LexTokenInternal` but this time in `Lexer::LexDependencyDirectiveToken` as well. rdar://152499276
2025-07-07NFC, use structured binding to simplify the code.Haojian Wu1-3/+1
2025-07-04[clang-scan-deps] Fix "unterminated conditional directive" bug (#146645)Ziqing Luo1-1/+3
`clang-scan-deps` threw "unterminated conditional directive" error falsely on the following example: ``` #ifndef __TEST #define __TEST #if defined(__TEST_DUMMY) #if defined(__TEST_DUMMY2) #pragma GCC warning \ "Hello!" #else #pragma GCC error \ "World!" #endif // defined(__TEST_DUMMY2) #endif // defined(__TEST_DUMMY) #endif // #ifndef __TEST ``` The issue comes from PR #143950, where the flag `LastNonWhitespace` does not correctly represent the state for the example above. The PR aimed to support that a line-continuation can be followed by whitespaces. This commit fixes the issue by moving the `LastNonWhitespace` variable to the inner loop so that it will be correctly reset. rdar://153742186
2025-06-28[clang] Remove unused includes (NFC) (#146254)Kazu Hirata2-2/+0
These are identified by misc-include-cleaner. I've filtered out those that break builds. Also, I'm staying away from llvm-config.h, config.h, and Compiler.h, which likely cause platform- or compiler-specific build failures.
2025-06-27[clang][scandeps] Improve handling of rawstrings. (#139504)Tobias Hieta1-7/+41
2025-06-26[clang] NFC: Add alias for std::pair<FileID, unsigned> used in ↵Haojian Wu3-16/+13
SourceLocation (#145711) Introduce a type alias for the commonly used `std::pair<FileID, unsigned>` to improve code readability, and make it easier for future updates (64-bit source locations).
2025-06-26Triple: Forward declare Twine and remove include (#145685)Matt Arsenault1-13/+6
2025-06-26[clang][Preprocessor] Handle the first pp-token in EnterMainSourceFile (#145244)yronglin4-13/+17
Depends on [[clang][Preprocessor] Add peekNextPPToken, makes look ahead next token without side-effects](https://github.com/llvm/llvm-project/pull/143898). This PR fixes the performance regression introduced in https://github.com/llvm/llvm-project/pull/144233. The original PR (https://github.com/llvm/llvm-project/pull/144233) handled the first pp-token in the main source file in macro definition/expansion and in `Lexer::Lex`; because the lexer is almost always on the hot path, that could cause a performance regression. In this PR, we handle the first pp-token in `Preprocessor::EnterMainSourceFile`. --------- Signed-off-by: yronglin <yronglin777@gmail.com>
2025-06-26[NFC][Clang][Preprocessor] Refine the implementation of isNextPPTokenOneOf ↵yronglin2-4/+4
(#145546) This PR follows the suggestion (https://github.com/llvm/llvm-project/pull/143898#discussion_r2164253141) to refine the implementation of `Preprocessor::isNextPPToken`, and also uses a C++ fold expression to refine `Token::isOneOf`. We don't need `bool isOneOf(tok::TokenKind K1, tok::TokenKind K2) const` anymore. To reduce the impact, the specified `TokenKind` values are still passed to `Token::isOneOf` and `Preprocessor::isNextPPTokenOneOf` as function parameters. --------- Signed-off-by: yronglin <yronglin777@gmail.com>
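The fold-expression idea can be illustrated with a stand-alone sketch. The `Token` and `TokenKind` types below are simplified stand-ins written for this example; they are assumptions and not Clang's actual declarations.

```cpp
// Simplified stand-ins for illustration only.
enum class TokenKind { identifier, l_paren, r_paren, semi };

struct Token {
  TokenKind Kind;

  constexpr bool is(TokenKind K) const { return Kind == K; }

  // A C++17 fold over || replaces a two-argument overload plus a recursive
  // variadic overload with a single member template.
  template <typename... Ts> constexpr bool isOneOf(Ts... Ks) const {
    static_assert(sizeof...(Ts) > 0, "requires at least one kind");
    return (is(Ks) || ...);
  }
};

static_assert(Token{TokenKind::semi}.isOneOf(TokenKind::l_paren,
                                             TokenKind::semi));
static_assert(!Token{TokenKind::identifier}.isOneOf(TokenKind::l_paren,
                                                    TokenKind::r_paren,
                                                    TokenKind::semi));
```

Because the kinds are ordinary function arguments, call sites written against the older overloads keep compiling unchanged in this sketch.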
2025-06-25[Clang][Preprocessor] Expand UCNs in macro concatenation (#145351)yronglin1-0/+11
Fixes https://github.com/llvm/llvm-project/issues/145240. A UCN in an identifier produced by preprocessor token pasting was not resolved to Unicode, which may cause the following issue: ```c #define CAT(a,b) a##b char foo\u00b5; char*p = &CAT(foo, \u00b5); // error: use of undeclared identifier 'foo\u00b5' ``` The real identifier after pasting is `fooµ`. This PR fixes the issue in `TokenLexer::pasteTokens`: if any of the pasted tokens contains a UCN, the final pasted token gets the `Token::HasUCN` flag. `Preprocessor::LookUpIdentifierInfo` will then expand the UCNs in this token. Signed-off-by: yronglin <yronglin777@gmail.com>
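To make "expanding a UCN" concrete, here is a small stand-alone sketch that rewrites `\uXXXX` and `\UXXXXXXXX` escapes as UTF-8. It is not Clang's implementation: `expandUCNs`, `appendUTF8`, and `hexValue` are hypothetical helpers, and the sketch does not validate that a code point is legal in an identifier or even a valid Unicode scalar value.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <string_view>

// Hypothetical helper: value of one hex digit, or -1 if C is not a hex digit.
inline int hexValue(char C) {
  if (C >= '0' && C <= '9') return C - '0';
  if (C >= 'a' && C <= 'f') return C - 'a' + 10;
  if (C >= 'A' && C <= 'F') return C - 'A' + 10;
  return -1;
}

// Hypothetical helper: append CP encoded as UTF-8 (range is not validated).
inline void appendUTF8(std::string &Out, std::uint32_t CP) {
  if (CP < 0x80) {
    Out += static_cast<char>(CP);
  } else if (CP < 0x800) {
    Out += static_cast<char>(0xC0 | (CP >> 6));
    Out += static_cast<char>(0x80 | (CP & 0x3F));
  } else if (CP < 0x10000) {
    Out += static_cast<char>(0xE0 | (CP >> 12));
    Out += static_cast<char>(0x80 | ((CP >> 6) & 0x3F));
    Out += static_cast<char>(0x80 | (CP & 0x3F));
  } else {
    Out += static_cast<char>(0xF0 | (CP >> 18));
    Out += static_cast<char>(0x80 | ((CP >> 12) & 0x3F));
    Out += static_cast<char>(0x80 | ((CP >> 6) & 0x3F));
    Out += static_cast<char>(0x80 | (CP & 0x3F));
  }
}

// Replace each well-formed \uXXXX or \UXXXXXXXX with its UTF-8 encoding;
// anything else is copied through unchanged.
inline std::string expandUCNs(std::string_view In) {
  std::string Out;
  for (std::size_t I = 0; I < In.size();) {
    std::size_t Digits = 0;
    if (In[I] == '\\' && I + 1 < In.size())
      Digits = In[I + 1] == 'u' ? 4 : In[I + 1] == 'U' ? 8 : 0;
    bool Ok = Digits != 0 && I + 2 + Digits <= In.size();
    std::uint32_t CP = 0;
    for (std::size_t J = 0; Ok && J < Digits; ++J) {
      int V = hexValue(In[I + 2 + J]);
      if (V < 0)
        Ok = false;
      else
        CP = CP * 16 + static_cast<std::uint32_t>(V);
    }
    if (Ok) {
      appendUTF8(Out, CP);
      I += 2 + Digits;
    } else {
      Out += In[I];
      ++I;
    }
  }
  return Out;
}
```

With this sketch, the spelling `foo\u00b5` and the UTF-8 spelling `fooµ` map to the same byte string, which is the property an identifier lookup needs after pasting.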
2025-06-24[clang][Preprocessor] Add peekNextPPToken, makes look ahead next token ↵yronglin5-57/+20
without side-effects (#143898) This PR introduces a new function, `peekNextPPToken`. It is an extension of `isNextPPTokenLParen` and looks ahead one token in the preprocessor without side-effects. It is also the 1st part of https://github.com/llvm/llvm-project/pull/107168, where it is used to look ahead at the next token and determine whether the pp directive currently being lexed is a pp-import or pp-module directive. At the start of phase 4 an import or module token is treated as starting a directive and is converted to its respective keyword iff it: - after skipping horizontal whitespace, is - at the start of a logical line, or - preceded by an export at the start of the logical line. - is followed by an identifier pp token (before macro expansion), or - <, ", or : (but not ::) pp tokens for import, or - ; for module Otherwise the token is treated as an identifier. --------- Signed-off-by: yronglin <yronglin777@gmail.com>
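The "followed by" half of the rule quoted above can be sketched as a small classifier. This is an illustration only: `Keyword`, `isIdentStart`, and `startsDirective` are hypothetical names, the function works on raw characters rather than pp-tokens, and it covers only the conditions listed in this message (it does not check the start-of-line or `export` conditions).

```cpp
#include <string_view>

enum class Keyword { Import, Module };

constexpr bool isIdentStart(char C) {
  return (C >= 'a' && C <= 'z') || (C >= 'A' && C <= 'Z') || C == '_';
}

// Rest is the text that follows the `import` or `module` token on the same
// logical line. Returns true if the keyword would start a directive under
// the quoted conditions.
constexpr bool startsDirective(Keyword K, std::string_view Rest) {
  // Skip horizontal whitespace before the next token.
  while (!Rest.empty() && (Rest.front() == ' ' || Rest.front() == '\t'))
    Rest.remove_prefix(1);
  if (Rest.empty())
    return false;
  const char C = Rest.front();
  if (isIdentStart(C))
    return true; // identifier pp-token, for both keywords
  if (K == Keyword::Import) {
    if (C == '<' || C == '"')
      return true; // header-name forms
    if (C == ':')
      return Rest.size() < 2 || Rest[1] != ':'; // ':' but not '::'
    return false;
  }
  return C == ';'; // `module;`
}

static_assert(startsDirective(Keyword::Import, " <vector>;"));
static_assert(startsDirective(Keyword::Import, " :part;"));
static_assert(!startsDirective(Keyword::Import, "::inner xi;"));
static_assert(startsDirective(Keyword::Module, ";"));
static_assert(!startsDirective(Keyword::Module, " = 1;"));
```

The `::` case is what keeps an ordinary qualified name such as `import::inner` from being mistaken for an import directive.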
2025-06-21[C++][Modules] A module directive may only appear as the first preprocessing ↵yronglin4-0/+21
tokens in a file (#144233) This PR is the 2nd part of the [P1857R3](https://github.com/llvm/llvm-project/pull/107168) implementation, and mainly implements the restriction `A module directive may only appear as the first preprocessing tokens in a file (excluding the global module fragment.)`: [cpp.pre](https://eel.is/c++draft/cpp.pre): ``` module-file: pp-global-module-fragment[opt] pp-module group[opt] pp-private-module-fragment[opt] ``` We also refine the tests to use `split-file` instead of conditional macros. Signed-off-by: yronglin <yronglin777@gmail.com>
2025-06-20[clang] Add managarm support (#144791)no921-0/+1
This is a repost of the quickly reverted #139271. The failing buildbot tests have been fixed and pass on my machine now.
2025-06-17Revert "[clang] Add managarm support" (#144514)Aaron Ballman1-1/+0
Reverts llvm/llvm-project#139271 There are multiple failing build bots: https://lab.llvm.org/buildbot/#/builders/10/builds/7482 https://lab.llvm.org/buildbot/#/builders/11/builds/17473
2025-06-17[clang] Add managarm support (#139271)no921-0/+1
This PR is part of a series to upstream managarm support, as laid out in the [RFC](https://discourse.llvm.org/t/rfc-new-proposed-managarm-support-for-llvm-and-clang-87845/85884/1). This PR is a follow-up to #87845 and #138854.
2025-06-15[clang] Remove unused includes (NFC) (#144285)Kazu Hirata1-1/+0
These are identified by misc-include-cleaner. I've filtered out those that break builds. Also, I'm staying away from llvm-config.h, config.h, and Compiler.h, which likely cause platform- or compiler-specific build failures.
2025-06-13[clang-scan-deps] Implement P2223R2 for DependencyDirectiveScanner.cpp (#143950)Naveen Seth Hanig1-10/+22
P2223R2 allows the line-continuation slash `\` to be followed by additional whitespace. The Clang lexer already follows this behavior, also for versions prior to C++23. The dependency directive scanner however only implements it for `#define` directives (15d5f5d). This fully implements P2223R2 for the dependency directive scanner (for any C++ standard) and aligns the dependency directive scanner's splicing behavior with that of the Clang lexer. For example, the following code was previously not scanned correctly by `clang-scan-deps` but now works as expected: ```cpp import \<whitespace here> A; ```
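The splicing rule from P2223R2 can be sketched as a check on one physical line. `endsWithSplice` is a hypothetical helper written for this illustration, not code from `DependencyDirectiveScanner.cpp`, and it treats only spaces and tabs as trailing whitespace.

```cpp
#include <string_view>

// Line is one physical source line without its terminating newline.
// Returns true if it ends in a backslash that is followed only by
// horizontal whitespace, i.e. the next physical line continues this one.
constexpr bool endsWithSplice(std::string_view Line) {
  while (!Line.empty() && (Line.back() == ' ' || Line.back() == '\t'))
    Line.remove_suffix(1);
  return !Line.empty() && Line.back() == '\\';
}

static_assert(endsWithSplice("import \\"));      // classic splice
static_assert(endsWithSplice("import \\  \t"));  // whitespace after the backslash
static_assert(!endsWithSplice("import A;"));     // no backslash at all
static_assert(!endsWithSplice("   "));           // whitespace only
```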
2025-06-06[clang][dep-scan] Resolve lexer crash from a permutation of invalid tokens ↵Cyndy Ishida1-0/+9
(#142452) Sometimes, when a user writes invalid code, the minimization used for scanning can create a stream of tokens that is invalid at lex time. This patch protects against the case where there are valid (non-c++20) import directives discovered in the middle of an invalid `import` declaration. Mostly authored by: @akyrtzi resolves: rdar://152335844
2025-06-06[C2y] Handle FP-suffixes on prefixed octals (#141230) (#141695)Naveen Seth Hanig1-4/+16
Fixes https://github.com/llvm/llvm-project/issues/141230. Currently, prefixed octal literals used with floating-point suffixes are not rejected, causing Clang to crash. This adds proper handling to reject invalid literals such as `0o0.1` or `0.0e1`. No release note because this is fixing an issue with a new change.
2025-06-03[Clang] Slightly tweak the code to try to fix a potential codegen issue in ↵Corentin Jabot1-14/+6
#142592
2025-06-03[Clang] Improve infrastructure for libstdc++ workarounds (Reland) (#142592)cor3ntin1-0/+53
Reland with debug traces to try to understand a bug that only happens on one CI configuration === This introduces a way to detect the libstdc++ version and use that to enable workarounds. The version is cached. This should make it easier in the future to find and remove these hacks. I did not find the need for enabling a hack between or after specific versions, so it's left as a future exercise. We can extend this feature to other libraries as the need arises. ===
2025-06-03Revert "[Clang] Improve infrastructure for libstdc++ workarounds" (#142432)cor3ntin1-46/+0
Reverts llvm/llvm-project#141977 This causes CI failure that I am unable to reproduce. https://lab.llvm.org/buildbot/#/builders/168/builds/12688
2025-05-31[Clang] Improve infrastructure for libstdc++ workarounds (#141977)cor3ntin1-0/+46
This introduces a way to detect the libstdc++ version and use that to enable workarounds. The version is cached. This should make it easier in the future to find and remove these hacks. I did not find the need for enabling a hack between or after specific versions, so it's left as a future exercise. We can extend this feature to other libraries as the need arises.
2025-05-29[clang][Lex][NFC] Reorder SrcMgr checks in CheckMacroName (#141483)Timm Baeder1-2/+4
isInPredefinedFile() will look at the presumed loc, which is comparatively slow. Move it after isInSystemFile(). http://llvm-compile-time-tracker.com/compare.php?from=843e362318e884991e517a54446b4faeacdad789&to=de0421a1a38052042721a67a6094f5cb38431f26&stat=instructions:u
2025-05-29[SystemZ][z/OS] Add back include required for strnlen functionAbhina Sreeskantharajan1-0/+1
2025-05-28[C2y] Add stdcountof.h (#140890)Aaron Ballman2-2/+3
WG14 N3469 changed _Lengthof to _Countof but it also introduced the <stdcountof.h> header to expose a macro with a non-ugly identifier. GCC vends this header as part of the compiler implementation, so Clang should do the same. Suggested-by: Alejandro Colomar <alx@kernel.org>
2025-05-26[Lex] Remove unused includes (NFC) (#141523)Kazu Hirata3-5/+0
These are identified by misc-include-cleaner. I've filtered out those that break builds. Also, I'm staying away from llvm-config.h, config.h, and Compiler.h, which likely cause platform- or compiler-specific build failures.
2025-05-22Revert "[Modules] Don't fail when an unused textual header is missing. ↵Volodymyr Sapsai1-4/+2
(#138227)" This reverts commit 64bb60a471a5ddc9c9bec413c65fdab730a1e4b0. Revert to give affected parties more time to adjust to the change.
2025-05-19[clang] Use *Map::try_emplace (NFC) (#140477)Kazu Hirata2-2/+2
We can simplify the code with *Map::try_emplace where we need default-constructed values while avoiding calling constructors when keys are already present.
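The same pattern exists in the standard library, so a sketch with `std::map` (an assumption made for this example; the patch itself touches LLVM map types) shows the idea. `counterFor` is a hypothetical helper.

```cpp
#include <map>
#include <string>

// Returns a reference to the counter for Key. try_emplace with no value
// arguments value-initializes the mapped int to 0 when the key is new and
// constructs nothing when the key is already present.
inline int &counterFor(std::map<std::string, int> &Counts,
                       const std::string &Key) {
  return Counts.try_emplace(Key).first->second;
}
```

A caller can then write `counterFor(Counts, "a") += 1;` without a separate lookup-then-insert step.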
2025-05-17[clang] Use llvm::stable_sort (NFC) (#140413)Kazu Hirata1-1/+1