path: root/llvm/lib/Demangle/MicrosoftDemangle.cpp
AgeCommit message (Collapse)AuthorFilesLines
2019-05-22llvm-undname: Fix an assert-on-invalid, found by oss-fuzzNico Weber1-1/+3
If a template parameter refers to a pointer to member, but the mangling of that was a string literal instead of a real symbol, llvm-undname used to crash instead of rejecting the input. llvm-svn: 361402
2019-04-24llvm-undname: Fix assert-on->4GiB-string-literal, found by oss-fuzzNico Weber1-1/+4
llvm-svn: 359109
2019-04-23llvm-undname: Support demangling the spaceship operatorNico Weber1-3/+2
Also add a test for demangling the co_await operator. llvm-svn: 359007
2019-04-22llvm-undname: Fix an assert-on-invalid, found by oss-fuzzNico Weber1-1/+1
llvm-svn: 358891
2019-04-21llvm-undname: Fix hex escapes in wchar_t, char16_t, char32_t stringsNico Weber1-3/+3
llvm-undname used to put '\x' in front of every pair of nibbles, but u"\xD7\xFF" produces a string with 6 bytes: \xD7 \0 \xFF \0 (and \0\0). Correct for a single character (plus terminating \0) is u"\xD7FF" instead. Now, wchar_t, char16_t, and char32_t strings roundtrip from source to clang-cl (and cl.exe) and then llvm-undname. (...at least as long as it's not a string like L"\xD7FF" L"foo" which gets demangled as L"\xD7FFfoo", where the compiler then considers the "f" as part of the hex escape. That seems ok.) Also add a comment saying that the "almost-valid" char32_t string I added in my last commit is actually produced by compilers. llvm-svn: 358857
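The difference between the two escaping strategies can be sketched as follows. This is an illustration only, not the llvm-undname code: `escapeCodeUnit` and `escapePerByte` are hypothetical names, and the exact digit padding the real tool uses is not stated in the commit message.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper (not the llvm-undname implementation): formats one
// code unit as a single hex escape, e.g. 0xD7FF becomes \xD7FF.
std::string escapeCodeUnit(uint32_t Unit) {
  static const char Digits[] = "0123456789ABCDEF";
  std::string Hex;
  // Prepend one hex digit per nibble, starting with the least significant.
  do {
    Hex.insert(Hex.begin(), Digits[Unit & 0xF]);
    Unit >>= 4;
  } while (Unit != 0);
  return "\\x" + Hex;
}

// Hypothetical model of the old behavior described above: one "\x" in
// front of every byte (pair of nibbles) of a 16-bit code unit.
std::string escapePerByte(uint16_t Unit) {
  static const char Digits[] = "0123456789ABCDEF";
  std::string Out;
  for (int Shift = 8; Shift >= 0; Shift -= 8) {
    unsigned Byte = (Unit >> Shift) & 0xFF;
    Out += "\\x";
    Out += Digits[Byte >> 4];
    Out += Digits[Byte & 0xF];
  }
  return Out;
}
```

For the code unit 0xD7FF, the per-byte form yields two escapes that a compiler reads as two separate characters, while the per-unit form yields one escape for one character.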
2019-04-21llvm-undname: Fix stack overflow on almost-validNico Weber1-3/+3
If an unsigned with all 4 bytes non-0 was passed to outputHex(), there were two off-by-ones in it:
- Both MaxPos and Pos left space for the final \0, which left the buffer one byte too small. Set MaxPos to 16 instead of 15 to fix.
- The `assert(Pos >= 0);` was after a `Pos--`; move it up one line.
Since valid Unicode codepoints are <= 0x10ffff, this could never really happen in practice. Found by oss-fuzz. llvm-svn: 358856
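The general shape of this kind of bug, filling a fixed buffer from right to left, can be sketched as below. This is a simplified stand-in rather than the real outputHex(): the function name, the buffer size, and the output format are assumptions chosen for the illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical sketch, not the real outputHex(): renders "\x" followed by
// the hex digits of C, filling a fixed buffer from the right.
std::string renderHexRightToLeft(uint32_t C) {
  // Worst case: 2 chars for "\x" + 8 hex digits + 1 terminating '\0'.
  char Buffer[11] = {};
  constexpr int MaxPos = sizeof(Buffer) - 1; // Index of the terminating '\0'.
  int Pos = MaxPos - 1;                      // Last writable character.
  do {
    assert(Pos >= 0); // Check the index before writing through it.
    Buffer[Pos--] = "0123456789ABCDEF"[C % 16];
    C /= 16;
  } while (C != 0);
  assert(Pos >= 1); // Two more characters are still to be written.
  Buffer[Pos--] = 'x';
  Buffer[Pos--] = '\\';
  return std::string(&Buffer[Pos + 1]);
}
```

Only MaxPos reserves the slot for the terminator, and each assert sits before the write it guards, which are the two points the commit message calls out.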
2019-04-21llvm-undname: Fix stack overflow on invalid found by oss-fuzzNico Weber1-1/+1
llvm-svn: 358852
2019-04-20llvm-undname: Improve string literal demangling with embedded \0 charsNico Weber1-2/+5
- Don't assert when a string looks like a u32 string to the heuristic but doesn't have a length that's 0 mod 4. Instead, classify those as u16 with embedded \0 chars. Found by oss-fuzz.
- Print embedded nul bytes as \0 instead of \x00.
llvm-svn: 358835
2019-04-19llvm-undname: Attempt to fix leak-on-invalid found by oss-fuzzNico Weber1-3/+6
llvm-svn: 358760
2019-04-18llvm-undname: Fix two more asserts-on-invalid, found by oss-fuzzNico Weber1-3/+4
llvm-svn: 358708
2019-04-18llvm-undname: Fix two asserts-on-invalidNico Weber1-3/+5
llvm-svn: 358707
2019-04-16llvm-undname: Consistently use "return nullptr" in functions returning pointersNico Weber1-4/+4
llvm-svn: 358492
2019-04-16llvm-undname: Fix nullptr deref on invalid structor names in template argsNico Weber1-3/+4
Similar to r358421: A StructorIdentifierNode has a Class field which is read when printing it, but if the StructorIdentifierNode appears in a template argument, then demangleFullyQualifiedSymbolName(), which sets Class, isn't called. Since StructorIdentifierNodes are always leaf names, we can just reject them as well. Found by oss-fuzz. llvm-svn: 358491
2019-04-15llvm-undname: Fix nullptr deref on invalid conversion operator names in template argsNico Weber1-1/+10
A ConversionOperatorIdentifierNode has a TargetType which is read when printing it, but if the ConversionOperatorIdentifierNode appears in a template argument there's nothing that can provide the TargetType. Normally the COIN is a symbol (leaf) name and takes its TargetType from the symbol's type, but in a template argument context the COIN can only be either a non-leaf name piece or a type, and must hence be invalid. Similar to the COIN check in demangleDeclarator(). Found by oss-fuzz. llvm-svn: 358421
2019-04-14llvm-undname: Fix oss-fuzz-found crash-on-invalid with incomplete special table nodesNico Weber1-0/+4
llvm-svn: 358367
2019-04-14llvm-undname: Fix another crash-on-invalid found by oss-fuzzNico Weber1-1/+4
llvm-svn: 358363
2019-04-11llvm-undname: Use UNREACHABLE after exhaustive switch returning everywhereNico Weber1-1/+1
No behavior change. llvm-svn: 358241
2019-04-11llvm-undname: Name a bool param, no behavior changeNico Weber1-5/+6
llvm-svn: 358240
2019-04-11llvm-undname: Fix out-of-bounds read on invalid intrinsic function codeNico Weber1-3/+9
Found by inspection. llvm-svn: 358239
2019-04-11llvm-undname: Don't crash on incomplete enum tag manglingsNico Weber1-1/+1
Found by inspection. llvm-svn: 358238
2019-04-11llvm-undname: Fix crash on incomplete virtual this adjustsNico Weber1-2/+3
Found by oss-fuzz. Also remove an else-after-return, this part has no behavior change. llvm-svn: 358237
2019-04-11llvm-undname: Fix crash on invalid name in a template parameter pointer to member argNico Weber1-0/+2
Found by oss-fuzz. llvm-svn: 358234
2019-04-10llvm-undname: Fix another crash-on-invalidNico Weber1-2/+0
This fixes a regression from https://reviews.llvm.org/D60354. We used to
  SymbolNode *Symbol = demangleEncodedSymbol(MangledName, QN);
  if (Symbol) {
    Symbol->Name = QN;
  }
but changed that to
  SymbolNode *Symbol = demangleEncodedSymbol(MangledName, QN);
  if (Error)
    return nullptr;
  Symbol->Name = QN;
and one branch somewhere returned a nullptr without setting Error. Looking at the code changed in r340083 and r340710 that branch looks like a remnant from an earlier attempt to demangle RTTI descriptors that has since been rewritten -- so just remove this branch. It shouldn't change behavior for correctly mangled symbols. llvm-svn: 358112
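The invariant this fix restores, that a nullptr result always comes with Error set, can be illustrated with a small self-contained model. The `MiniDemangler` type and its `InputIsValid` parameter are hypothetical stand-ins; only the names `SymbolNode`, `Error`, and `demangleEncodedSymbol` echo the commit message.

```cpp
// Hypothetical, heavily simplified model of the pattern described above.
struct SymbolNode {
  const char *Name = nullptr;
};

struct MiniDemangler {
  bool Error = false;
  SymbolNode Storage;

  // Stand-in for demangleEncodedSymbol(). The caller below is only safe
  // if every path that returns nullptr also sets Error.
  SymbolNode *demangleEncodedSymbol(bool InputIsValid) {
    if (!InputIsValid) {
      Error = true;
      return nullptr;
    }
    return &Storage;
  }

  SymbolNode *parse(bool InputIsValid, const char *QN) {
    SymbolNode *Symbol = demangleEncodedSymbol(InputIsValid);
    if (Error)
      return nullptr;
    Symbol->Name = QN; // Safe: Error is clear, so Symbol is non-null.
    return Symbol;
  }
};
```

A branch in the callee that returned nullptr while leaving Error false would make `parse` dereference a null pointer, which is the crash the commit describes.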
2019-04-08llvm-undname: Fix more crashes and asserts on invalid inputsNico Weber1-24/+76
For functions whose callers don't check that enough input is present, add checks at the start of the function that enough input is there and set Error otherwise. For functions that return AST objects, return nullptr instead of incomplete AST objects with nullptr fields if an error occurred during the function. Introduce a new function demangleDeclarator() for the sequence demangleFullyQualifiedSymbolName(); demangleEncodedSymbol() and use it in the two places that had this sequence. Let this new function check that ConversionOperatorIdentifiers have a valid TargetType. Some of the bad inputs were found by oss-fuzz, others by inspection. Differential Revision: https://reviews.llvm.org/D60354 llvm-svn: 357936
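A minimal sketch of the "check for enough input first, set Error otherwise" pattern might look like this. `MiniParser` and `consumeTaggedChar` are hypothetical, and `std::string_view` stands in for the demangler's own string view type.

```cpp
#include <string_view>

// Hypothetical parser fragment illustrating an up-front length check.
struct MiniParser {
  bool Error = false;

  // Consumes a two-character code and returns its second character.
  // On truncated input it sets Error and leaves the input untouched.
  char consumeTaggedChar(std::string_view &MangledName) {
    if (MangledName.size() < 2) {
      Error = true;
      return '\0';
    }
    char Payload = MangledName[1];
    MangledName.remove_prefix(2);
    return Payload;
  }
};
```

Callers then test `Error` after the call rather than inspecting the returned value, which matches the convention the commit message describes.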
2019-04-03llvm-undname: Name a pair. No behavior change.Nico Weber1-3/+5
Differential Revision: https://reviews.llvm.org/D60210 llvm-svn: 357653
2019-04-03llvm-undname: Fix a crash-on-invalidNico Weber1-1/+1
Found by oss-fuzz, fixes issue 13260 on oss-fuzz. Differential Revision: https://reviews.llvm.org/D60207 llvm-svn: 357649
2019-04-03llvm-undname: Fix an assert-on-invalidNico Weber1-1/+4
Found by oss-fuzz, fixes issue 12432 on oss-fuzz. Differential Revision: https://reviews.llvm.org/D60206 llvm-svn: 357648
2019-04-03llvm-undname: Fix an assert-on-invalidNico Weber1-0/+5
Found by oss-fuzz, fixes issues 12428 and 12429 on oss-fuzz. Differential Revision: https://reviews.llvm.org/D60204 llvm-svn: 357647
2019-04-03llvm-undname: Fix a crash-on-invalidNico Weber1-0/+4
Found by oss-fuzz, fixes issues 12435 and 12438 on oss-fuzz. Differential Revision: https://reviews.llvm.org/D60202 llvm-svn: 357646
2019-01-19Update more file headers across all of the LLVM projects in the monorepoChandler Carruth1-4/+3
to reflect the new license. These used slightly different spellings that defeated my regular expressions. We understand that people may be surprised that we're moving the header entirely to discuss the new license. We checked this carefully with the Foundation's lawyer and we believe this is the correct approach. Essentially, all code in the project is now made available by the LLVM project under our new license, so you will see that the license headers include that license only. Some of our contributors have contributed code under our old license, and accordingly, we have retained a copy of our old license notice in the top-level files in each project and repository. llvm-svn: 351648
2019-01-17NFC: Make the copies of the demangler byte-for-byte identicalErik Pilkington1-10/+10
With this patch, the copies of the files ItaniumDemangle.h, StringView.h, and Utility.h are kept byte-for-byte in sync between libcxxabi and llvm. All differences (namespaces, fallthrough, and unreachable macros) are defined in each copy's DemanglerConfig.h. This patch also adds a script to copy changes from libcxxabi (cp-to-llvm.sh), and a README.txt explaining the situation. Differential revision: https://reviews.llvm.org/D53538 llvm-svn: 351474
2019-01-08[llvm-undname] Add support for demangling msvc's noexcept types.Zachary Turner1-3/+6
Starting in C++17, MSVC introduced a new mangling for function parameters that are themselves noexcept functions. This patch makes llvm-undname properly demangle them. Patch by Zachary Henkel Differential Revision: https://reviews.llvm.org/D55769 llvm-svn: 350656
2018-12-14[MS Demangler] Fail gracefully on invalid pointer types.Zachary Turner1-6/+12
Once we detect a 'P', we know a pointer type is upcoming, so we make some assumptions about the output that follows. If those assumptions didn't hold, we would assert. Instead, we should fail gracefully and propagate the error up. llvm-svn: 349169
2018-12-14Fix a crash in llvm-undname with invalid types.Zachary Turner1-2/+2
llvm-svn: 349165
2018-11-11Make initializeOutputStream() return false on error and true on success.Nico Weber1-5/+5
As discussed in https://reviews.llvm.org/D52104 Differential Revision: https://reviews.llvm.org/D52143 llvm-svn: 346606
2018-11-01[MS Demangler] Expose the Demangler AST publicly.Zachary Turner1-155/+12
LLDB would like to use this in order to build a clang AST from a mangled name. This is NFC otherwise. llvm-svn: 345837
2018-10-13Move some helpers from the global namespace into anonymous ones.Benjamin Kramer1-5/+6
llvm-svn: 344468
2018-09-15Update microsoftDemangle() to work more like itaniumDemangle().Nico Weber1-14/+32
* Use same method of initializing the output stream and its buffer
* Allow a nullptr Status pointer
* Don't print the mangled name on demangling error
* Write to N (if it is non-nullptr)
Differential Revision: https://reviews.llvm.org/D52104 llvm-svn: 342330
2018-08-30Remove some debugging code that was accidentally left in.Zachary Turner1-11/+0
llvm-svn: 341122
2018-08-30[MS Demangler] Add support for $$Z parameter pack separator.Zachary Turner1-5/+32
$$Z appears between adjacent expanded parameter packs in the same template instantiation. We don't need to print it; it's only there to distinguish manglings that would otherwise be ambiguous. So we just need to parse it and throw it away. llvm-svn: 341119
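"Parse it and throw it away" can be sketched as a small consume helper. The function name is hypothetical and `std::string_view` stands in for the demangler's own string view type; this is not the code from the patch.

```cpp
#include <string_view>

// Hypothetical helper: if the input starts with the "$$Z" separator,
// drop it and report that it was consumed. Nothing is added to any AST.
bool consumeParameterPackSeparator(std::string_view &MangledName) {
  constexpr std::string_view Separator = "$$Z";
  if (MangledName.substr(0, Separator.size()) != Separator)
    return false;
  MangledName.remove_prefix(Separator.size());
  return true;
}
```

A template-argument loop could call this before parsing each argument and simply continue when it returns true.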
2018-08-29[MS Demangler] Fix several crashes and demangling bugs.Zachary Turner1-23/+52
These bugs were found by writing a Python script which spidered the entire Chromium build directory tree demangling every symbol in every object file. At the start, the tool printed:
  Processed 27443 object files.
  2926377/2936108 symbols successfully demangled (99.6686%)
  9731 symbols could not be demangled (0.3314%)
  14589 files crashed while demangling (53.1611%)
After this patch, it prints:
  Processed 27443 object files.
  41295518/41295617 symbols successfully demangled (99.9998%)
  99 symbols could not be demangled (0.0002%)
  0 files crashed while demangling (0.0000%)
The issues fixed in this patch are:
* Ignore empty parameter packs. Previously we would encounter a mangling for an empty parameter pack and add a null node to the AST. Since we don't print these anyway, we now just don't add anything to the AST and ignore it entirely. This fixes some of the crashes.
* Account for "incorrect" string literal demanglings. Apparently an older version of clang would not truncate mangled string literals to 32 bytes of encoded character data. The demangling code however would allocate a 32 byte buffer thinking that it would not encounter more than this, and overrun the buffer. We now demangle up to 128 bytes of data, since the buggy clang would encode up to 32 *characters* of data.
* Extended support for demangling init-fini stubs. If you had something like struct Foo { static vector<string> S; }; this would generate a dynamic atexit initializer *for the variable*. We didn't handle this, but now we print something nice. This is actually an improvement over undname, which will fail to demangle this at all.
* Fixed one case of static this adjustment. We weren't handling several thunk codes so we didn't recognize the mangling. These are now handled.
* Fixed a back-referencing problem. Member pointer templates should have their components considered for back-referencing.
The remaining 99 symbols which can't be demangled are all symbols which are compiler-generated and undname can't demangle either. llvm-svn: 341000
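The string-literal overrun described above comes down to copying into a fixed buffer without clamping the length. A minimal sketch of the bounded version, assuming a 128-byte limit (32 characters of up to 4 bytes each); the constant and function names are hypothetical:

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>

// Hypothetical limit: 32 characters of up to 4 bytes each.
constexpr std::size_t MaxStringByteLength = 32 * 4;

// Copies at most MaxStringByteLength bytes of decoded string data into
// Out and returns the number of bytes copied. Extra input is dropped
// instead of being written past the end of the buffer.
std::size_t copyDecodedBytes(const unsigned char *In, std::size_t InLen,
                             unsigned char (&Out)[MaxStringByteLength]) {
  std::size_t N = std::min(InLen, MaxStringByteLength);
  std::memcpy(Out, In, N);
  return N;
}
```

Taking the destination as a reference to an array of the right size lets the compiler reject a smaller buffer at the call site.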
2018-08-29Add support for various C++14 demanglings.Zachary Turner1-14/+40
Mostly this includes <auto> and <decltype-auto> return values. Additionally, this fixes a fairly obscure back-referencing bug that was encountered in one of the C++14 tests, which is that if you have something like Foo<&bar, &bar> then the `bar` forms a backreference. llvm-svn: 340896
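Name back-referencing in the Microsoft scheme can be pictured as a small table of previously seen names that later parts of the mangling refer to by a single digit. The sketch below assumes a table of up to ten entries; `NameBackrefs` and its methods are hypothetical and much simpler than the demangler's real bookkeeping.

```cpp
#include <array>
#include <cstddef>
#include <string_view>

// Hypothetical back-reference table: remembers up to ten distinct names,
// each of which can later be referred to by a digit '0'..'9'.
struct NameBackrefs {
  static constexpr std::size_t Max = 10;
  std::array<std::string_view, Max> Names{};
  std::size_t Count = 0;

  // Remembers Name unless the table is full or already contains it.
  void memorize(std::string_view Name) {
    if (Count >= Max)
      return;
    for (std::size_t I = 0; I < Count; ++I)
      if (Names[I] == Name)
        return;
    Names[Count++] = Name;
  }

  // Resolves a digit to a remembered name; returns an empty view when
  // the digit does not refer to a filled slot.
  std::string_view resolve(char Digit) const {
    if (Digit < '0' || Digit > '9')
      return {};
    std::size_t Index = static_cast<std::size_t>(Digit - '0');
    return Index < Count ? Names[Index] : std::string_view{};
  }
};
```

In a case like Foo<&bar, &bar>, the first `bar` would be memorized and the second occurrence could then be encoded as a digit pointing back at it.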
2018-08-29[MS Demangler] Add output flags to all function calls.Zachary Turner1-4/+4
Previously we had a FunctionSigFlags, but it's more flexible to just have one set of output flags that apply to the entire process and just pipe the entire set of flags through the output process. This will be useful when we start allowing the user to customize the outputting behavior. llvm-svn: 340894
2018-08-27[MS Demangler] Re-write the Microsoft demangler.Zachary Turner1-1824/+885
This is a pretty large refactor / re-write of the Microsoft demangler. The previous one was a little hackish because it evolved as I was learning about all the various edge cases, exceptions, etc. It didn't have a proper AST and so there was lots of custom handling of things that should have been much more clean. Taking what was learned from that experience, it's now re-written with a completely redesigned and much more sensible AST. It's probably still not perfect, but at least it's comprehensible now to someone else who wants to come along and make some modifications or read the code. Incidentally, this fixed a couple of bugs, so I've enabled the tests which now pass. llvm-svn: 340710
2018-08-25Fix -Wunused-function warning. NFCI.Simon Pilgrim1-9/+0
llvm-svn: 340687
2018-08-21[MS Demangler] Print template constructor args.Zachary Turner1-0/+13
Previously, if you had something like this:
  template<typename T> struct Foo { template<typename U> Foo(U); };
  Foo<int> F(3.7);
this would mangle as ??$?0N@?$Foo@H@@QEAA@N@Z, and this would be demangled as:
  undname: __cdecl Foo<int>::Foo<int><double>(double)
  llvm-undname: __cdecl Foo<int>::Foo<int>(double)
Note the lack of the constructor template parameter in our demangling. This patch makes it so we print the constructor's template argument list. llvm-svn: 340356
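For reference, a compilable version of the scenario is sketched below. It spells the class as Foo<int> explicitly, which is what the `?$Foo@H` component of the mangled name implies, and gives the constructor a body so the snippet links; `makeFoo` is a hypothetical wrapper added only for the illustration.

```cpp
#include <type_traits>

template <typename T> struct Foo {
  // Constructor template: U is deduced from the argument.
  template <typename U> Foo(U) {}
};

// Instantiates Foo<int>'s constructor template with U = double, the
// combination the mangled name in the commit message encodes.
Foo<int> makeFoo() { return Foo<int>(3.7); }

static_assert(std::is_constructible<Foo<int>, double>::value,
              "Foo<int> should be constructible from a double");
```

The class template argument (int) and the constructor template argument (double) are independent, which is why a demangling that prints only the first loses information.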
2018-08-21[MS Demangler] Fix a few more edge cases.Zachary Turner1-18/+69
I found these by running llvm-undname over a couple hundred megabytes of object files generated as part of building Chromium. The issues fixed in this patch are:
1) decltype-auto return types.
2) Indirect vtables (e.g. const A::`vftable'{for `B'})
3) Pointers, references, and rvalue-references to member pointers.
I have exactly one remaining symbol out of a few hundred MB of object files that produces a name we can't demangle, and it's related to back-referencing. llvm-svn: 340341
2018-08-20[MS Demangler] Demangle special operator 'dynamic initializer'.Zachary Turner1-1/+18
This is encoded as __E and should print something like "dynamic initializer for 'Foo'(void)". This also adds support for dynamic atexit destructor, which is basically identical but encoded as __F with a slightly different description. llvm-svn: 340239
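The mapping from the two codes to their descriptions can be sketched as a simple lookup. `describeSpecialOperator` is a hypothetical function, and the output strings follow the wording of the commit message rather than the tool's exact quoting.

```cpp
#include <string>
#include <string_view>

// Hypothetical lookup: turns a special operator code plus the name of
// the variable it applies to into a human-readable description.
// Returns an empty string for codes this sketch does not know.
std::string describeSpecialOperator(std::string_view Code,
                                    std::string_view Variable) {
  std::string Quoted = "'" + std::string(Variable) + "'";
  if (Code == "__E")
    return "dynamic initializer for " + Quoted;
  if (Code == "__F")
    return "dynamic atexit destructor for " + Quoted;
  return {};
}
```

Both codes share one code path apart from the description text, which matches the commit's remark that the two cases are basically identical.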
2018-08-20[MS Demangler] Anonymous namespace hashes can be backreferenced.Zachary Turner1-0/+2
Previously we were not remembering the key values of anonymous namespaces, but we need to do this. llvm-svn: 340238
2018-08-20[MS Demangler] Properly demangle anonymous namespaces.Zachary Turner1-5/+7
llvm-svn: 340237