Age | Commit message (Collapse) | Author | Files | Lines |
|
If a template parameter refers to a pointer to member, but the mangling
of that was a string literal instead of a real symbol, llvm-undname used
to crash instead of rejecting the input.
llvm-svn: 361402
|
|
llvm-svn: 359109
|
|
Also add a test for demanling the co_await operator.
llvm-svn: 359007
|
|
llvm-svn: 358891
|
|
llvm-undname used to put '\x' in front of every pair of nibbles, but
u"\xD7\xFF" produces a string with 6 bytes: \xD7 \0 \xFF \0 (and \0\0). Correct
for a single character (plus terminating \0) is u\xD7FF instead.
Now, wchar_t, char16_t, and char32_t strings roundtrip from source to
clang-cl (and cl.exe) and then llvm-undname.
(...at least as long as it's not a string like L"\xD7FF" L"foo" which
gets demangled as L"\xD7FFfoo", where the compiler then considers the
"f" as part of the hex escape. That seems ok.)
Also add a comment saying that the "almost-valid" char32_t string I
added in my last commit is actually produced by compilers.
llvm-svn: 358857
|
|
If a unsigned with all 4 bytes non-0 was passed to outputHex(), there
were two off-by-ones in it:
- Both MaxPos and Pos left space for the final \0, which left the buffer
one byte to small. Set MaxPos to 16 instead of 15 to fix.
- The `assert(Pos >= 0);` was after a `Pos--`, move it up one line.
Since valid Unicode codepoints are <= 0x10ffff, this could never really
happen in practice.
Found by oss-fuzz.
llvm-svn: 358856
|
|
llvm-svn: 358852
|
|
- Don't assert when a string looks like a u32 string to the heuristic
but doesn't have a length that's 0 mod 4. Instead, classify those
as u16 with embedded \0 chars. Found by oss-fuzz.
- Print embedded nul bytes as \0 instead of \x00.
llvm-svn: 358835
|
|
llvm-svn: 358760
|
|
llvm-svn: 358708
|
|
llvm-svn: 358707
|
|
llvm-svn: 358492
|
|
Similar to r358421: A StructorIndentifierNode has a Class field which
is read when printing it, but if the StructorIndentifierNode appears in
a template argument then demangleFullyQualifiedSymbolName() which sets
Class isn't called. Since StructorIndentifierNodes are always leaf
names, we can just reject them as well.
Found by oss-fuzz.
llvm-svn: 358491
|
|
template args
A ConversionOperatorIdentifierNode has a TargetType which is read when
printing it, but if the ConversionOperatorIdentifierNode appears in a
template argument there's nothing that can provide the TargetType.
Normally the COIN is a symbol (leaf) name and takes its TargetType from the
symbol's type, but in a template argument context the COIN can only be
either a non-leaf name piece or a type, and must hence be invalid.
Similar to the COIN check in demangleDeclarator().
Found by oss-fuzz.
llvm-svn: 358421
|
|
table nodes
llvm-svn: 358367
|
|
llvm-svn: 358363
|
|
No behavior change.
llvm-svn: 358241
|
|
llvm-svn: 358240
|
|
Found by inspection.
llvm-svn: 358239
|
|
Found by inspection.
llvm-svn: 358238
|
|
Found by oss-fuzz.
Also remove an else-after-return, this part has no behavior change.
llvm-svn: 358237
|
|
member arg
Found by oss-fuzz.
llvm-svn: 358234
|
|
This fixes a regression from https://reviews.llvm.org/D60354. We used to
SymbolNode *Symbol = demangleEncodedSymbol(MangledName, QN);
if (Symbol) {
Symbol->Name = QN;
}
but changed that to
SymbolNode *Symbol = demangleEncodedSymbol(MangledName, QN);
if (Error)
return nullptr;
Symbol->Name = QN;
and one branch somewhere returned a nullptr without setting Error.
Looking at the code changed in r340083 and r340710 that branch looks
like a remnant from an earlier attempt to demangle RTTI descriptors
that has since been rewritten -- so just remove this branch. It
shouldn't change behavior for correctly mangled symbols.
llvm-svn: 358112
|
|
For functions whose callers don't check that enough input is present,
add checks at the start of the function that enough input is there and
set Error otherwise.
For functions that return AST objects, return nullptr instead of
incomplete AST objects with nullptr fields if an error occurred during
the function.
Introduce a new function demangleDeclarator() for the sequence
demangleFullyQualifiedSymbolName(); demangleEncodedSymbol() and
use it in the two places that had this sequence. Let this new function
check that ConversionOperatorIdentifiers have a valid TargetType.
Some of the bad inputs found by oss-fuzz, others by inspection.
Differential Revision: https://reviews.llvm.org/D60354
llvm-svn: 357936
|
|
Differential Revision: https://reviews.llvm.org/D60210
llvm-svn: 357653
|
|
Found by oss-fuzz, fixes issue 13260 on oss-fuzz.
Differential Revision: https://reviews.llvm.org/D60207
llvm-svn: 357649
|
|
Found by oss-fuzz, fixes issue 12432 on os-fuzz.
Differential Revision: https://reviews.llvm.org/D60206
llvm-svn: 357648
|
|
Found by oss-fuzz, fixes issues 12428 and 12429 on oss-fuzz.
Differential Revision: https://reviews.llvm.org/D60204
llvm-svn: 357647
|
|
Found by oss-fuzz, fixes issues 12435 and 12438 on oss-fuzz.
Differential Revision: https://reviews.llvm.org/D60202
llvm-svn: 357646
|
|
to reflect the new license. These used slightly different spellings that
defeated my regular expressions.
We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.
Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.
llvm-svn: 351648
|
|
With this patch, the copies of the files ItaniumDemangle.h,
StringView.h, and Utility.h are kept byte-for-byte in sync between
libcxxabi and llvm. All differences (namespaces, fallthrough, and
unreachable macros) are defined in each copies' DemanglerConfig.h.
This patch also adds a script to copy changes from libcxxabi
(cp-to-llvm.sh), and a README.txt explaining the situation.
Differential revision: https://reviews.llvm.org/D53538
llvm-svn: 351474
|
|
Starting in C++17, MSVC introduced a new mangling for function
parameters that are themselves noexcept functions. This patch
makes llvm-undname properly demangle them.
Patch by Zachary Henkel
Differential Revision: https://reviews.llvm.org/D55769
llvm-svn: 350656
|
|
Once we detect a 'P', we know we a pointer type is upcoming, so
we make some assumptions about the output that follows. If those
assumptions didn't hold, we would assert. Instead, we should
fail gracefully and propagate the error up.
llvm-svn: 349169
|
|
llvm-svn: 349165
|
|
As discussed in https://reviews.llvm.org/D52104
Differential Revision: https://reviews.llvm.org/D52143
llvm-svn: 346606
|
|
LLDB would like to use this in order to build a clang AST from
a mangled name.
This is NFC otherwise.
llvm-svn: 345837
|
|
llvm-svn: 344468
|
|
* Use same method of initializing the output stream and its buffer
* Allow a nullptr Status pointer
* Don't print the mangled name on demangling error
* Write to N (if it is non-nullptr)
Differential Revision: https://reviews.llvm.org/D52104
llvm-svn: 342330
|
|
llvm-svn: 341122
|
|
$$Z appears between adjacent expanded parameter packs in the
same template instantiation. We don't need to print it, it's
only there to disambiguate between manglings that would otherwise
be ambiguous. So we just need to parse it and throw it away.
llvm-svn: 341119
|
|
These bugs were found by writing a Python script which spidered
the entire Chromium build directory tree demangling every symbol
in every object file. At the start, the tool printed:
Processed 27443 object files.
2926377/2936108 symbols successfully demangled (99.6686%)
9731 symbols could not be demangled (0.3314%)
14589 files crashed while demangling (53.1611%)
After this patch, it prints:
Processed 27443 object files.
41295518/41295617 symbols successfully demangled (99.9998%)
99 symbols could not be demangled (0.0002%)
0 files crashed while demangling (0.0000%)
The issues fixed in this patch are:
* Ignore empty parameter packs. Previously we would encounter
a mangling for an empty parameter pack and add a null node
to the AST. Since we don't print these anyway, we now just
don't add anything to the AST and ignore it entirely. This
fixes some of the crashes.
* Account for "incorrect" string literal demanglings. Apparently
an older version of clang would not truncate mangled string
literals to 32 bytes of encoded character data. The demangling
code however would allocate a 32 byte buffer thinking that it
would not encounter more than this, and overrun the buffer.
We now demangle up to 128 bytes of data, since the buggy
clang would encode up to 32 *characters* of data.
* Extended support for demangling init-fini stubs. If you had
something like
struct Foo {
static vector<string> S;
};
this would generate a dynamic atexit initializer *for the
variable*. We didn't handle this, but now we print something
nice. This is actually an improvement over undname, which will
fail to demangle this at all.
* Fixed one case of static this adjustment. We weren't handling
several thunk codes so we didn't recognize the mangling. These
are now handled.
* Fixed a back-referencing problem. Member pointer templates
should have their components considered for back-referencing
The remaining 99 symbols which can't be demangled are all symbols
which are compiler-generated and undname can't demangle either.
llvm-svn: 341000
|
|
Mostly this includes <auto> and <decltype-auto> return values.
Additionally, this fixes a fairly obscure back-referencing bug
that was encountered in one of the C++14 tests, which is that
if you have something like Foo<&bar, &bar> then the `bar`
forms a backreference.
llvm-svn: 340896
|
|
Previously we had a FunctionSigFlags, but it's more flexible
to just have one set of output flags that apply to the entire
process and just pipe the entire set of flags through the
output process.
This will be useful when we start allowing the user to customize
the outputting behavior.
llvm-svn: 340894
|
|
This is a pretty large refactor / re-write of the Microsoft
demangler. The previous one was a little hackish because it
evolved as I was learning about all the various edge cases,
exceptions, etc. It didn't have a proper AST and so there was
lots of custom handling of things that should have been much
more clean.
Taking what was learned from that experience, it's now
re-written with a completely redesigned and much more sensible
AST. It's probably still not perfect, but at least it's
comprehensible now to someone else who wants to come along
and make some modifications or read the code.
Incidentally, this fixed a couple of bugs, so I've enabled
the tests which now pass.
llvm-svn: 340710
|
|
llvm-svn: 340687
|
|
Previously if you had something like this:
template<typename T>
struct Foo {
template<typename U>
Foo(U);
};
Foo F(3.7);
this would mangle as ??$?0N@?$Foo@H@@QEAA@N@Z
and this would be demangled as:
undname: __cdecl Foo<int>::Foo<int><double>(double)
llvm-undname: __cdecl Foo<int>::Foo<int>(double)
Note the lack of the constructor template parameter in our
demangling.
This patch makes it so we print the constructor argument list.
llvm-svn: 340356
|
|
I found these by running llvm-undname over a couple hundred
megabytes of object files generated as part of building chromium.
The issues fixed in this patch are:
1) decltype-auto return types.
2) Indirect vtables (e.g. const A::`vftable'{for `B'})
3) Pointers, references, and rvalue-references to member pointers.
I have exactly one remaining symbol out of a few hundred MB of object
files that produces a name we can't demangle, and it's related to
back-referencing.
llvm-svn: 340341
|
|
This is encoded as __E and should print something like
"dynamic initializer for 'Foo'(void)"
This also adds support for dynamic atexit destructor, which is
basically identical but encoded as __F with slightly different
description.
llvm-svn: 340239
|
|
Previously we were not remembering the key values of anonymous
namespaces, but we need to do this.
llvm-svn: 340238
|
|
llvm-svn: 340237
|