aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2022-10-06RISC-V: Print comma and tabs as the "text" styleTsukasa OI1-11/+20
On the RISC-V disassembler, some separators have non-text style when printed with another word with another style. This commit splits those, making sure that those comma and tabs are printed with the "text" style. opcodes/ChangeLog: * riscv-dis.c (print_insn_args): Split and print the comma as text. (riscv_disassemble_insn): Split and print tabs as text. (riscv_disassemble_data): Likewise.
2022-10-06RISC-V: Optimize riscv_disassemble_data printfTsukasa OI1-6/+4
This commit makes types of printf arguments on riscv_disassemble_data as small as possible (as long as we can preserve the portability) to reduce the cost of printf (especially on 32-bit host). opcodes/ChangeLog: * riscv-dis.c (riscv_disassemble_data): Use smallest possible type to printing data.
2022-10-06RISC-V: Fix printf argument types corresponding %xTsukasa OI1-7/+7
"%x" format specifier requires unsigned type, not int. This commit fixes this issue on the RISC-V disassembler. opcodes/ChangeLog: * riscv-dis.c (print_insn_args): Fix printf argument types where the format specifier is "%x".
2022-10-06RISC-V: Fix immediates to have "immediate" styleTsukasa OI1-5/+5
This commit fixes certain print calls on immediate operands to have dis_style_immediate. opcodes/ChangeLog: * riscv-dis.c (print_insn_args): Fix immediates to have "immediate" style. (riscv_disassemble_data): Likewise.
2022-10-06Automatic date update in version.inGDB Administrator1-1/+1
2022-10-06Re: bfd BLD-POTFILES.in dependenciesAlan Modra2-2/+2
Removing $BLD_POTFILES from BFD-POTFILES.in was correct, but left a hole in dependencies. make[4]: Entering directory '/home/alan/build/gas/all/bfd/po' make[4]: *** No rule to make target '../elf32-aarch64.c', needed by '/home/alan/src/binutils-gdb/bfd/po/bfd.pot'. Stop. * Makefile.am (BUILT_SOURCES): Add BUILD_CFILES. * Makefile.in: Regenerate.
2022-10-05x86/gas: support quoted address scale factor in AT&T syntaxJan Beulich4-12/+35
An earlier attempt (e68c3d59acd0 ["x86: better respect quotes in parse_operands()"]) needed undoing (cc0f96357e0b ["x86: permit parenthesized expressions again as addressing scale factor"]) as far its effect here went. As indicated back then, the issue is the backwards scanning of the operand string to find the matching opening parenthesis. Switch to forward scanning, finding the last outermost unquoted opening parenthesis (which is the one matching the trailing closing one).
2022-10-05Arm64: support CLEARBHB aliasJan Beulich6-1577/+1582
While the Arm v8 ARM (rev I-a) still doesn't mention this alias, it is (typically via a macro) already in use in kernels and alike.
2022-10-05PR29647, objdump -S loopingAlan Modra1-2/+3
Fuzzed input with this in .debug_line [0x0000003b] Special opcode 115: advance Address by 8 to 0x401180 and Line by -2 to -1 PR 29647 * objdump.c (print_line): Don't decrement line number here.. (dump_lines): ..do so here instead, ensuring loop terminates.
2022-10-05Re: stab nearest_line bfd_malloc_and_get_sectionAlan Modra1-0/+4
It didn't take long for the fuzzers to avoid size checks in bfd_malloc_and_get_section. Plug this hole. * syms.c (_bfd_stab_section_find_nearest_line): Ignore fuzzed sections with no contents.
2022-10-04gprofng: fix build with --enable-pgo-build=ltoVladimir Mezentsev5-28/+30
gprofng/ChangeLog 2022-10-04 Vladimir Mezentsev <vladimir.mezentsev@oracle.com> PR gprofng/29579 * libcollector/dispatcher.c: Fix the symbol version in SYMVER_ATTRIBUTE. * libcollector/iotrace.c: Likewise. * libcollector/linetrace.c: Likewise. * libcollector/mmaptrace.c: Likewise. * libcollector/synctrace.c: Likewise.
2022-10-05Automatic date update in version.inGDB Administrator1-1/+1
2022-10-04Remove decode_location_spec_defaultTom Tromey1-30/+15
This removes decode_location_spec_default, inlining it into its sole caller. Regression tested on x86-64 Fedora 34.
2022-10-04gas: NEWS: Mention the T-Head extensions that were recently addedPalmer Dabbelt1-0/+5
2022-10-04[gdb/symtab] Don't complain about function declsTom de Vries3-1/+67
[ Requires "[gdb/symtab] Don't complain about inlined functions" as submitted here ( https://sourceware.org/pipermail/gdb-patches/2022-September/191762.html ). ] With the test-case included in this patch, we get: ... (gdb) ptype main^M During symbol reading: cannot get low and high bounds for subprogram DIE \ at 0xc1^M type = int (void)^M (gdb) FAIL: gdb.dwarf2/anon-ns-fn.exp: ptype main without complaints ... The DIE causing the complaint is a function declaration: ... <2><c1>: Abbrev Number: 3 (DW_TAG_subprogram) <c2> DW_AT_name : foo <c8> DW_AT_declaration : 1 ... which is referred to from the DIE representing the function definition: ... <1><f4>: Abbrev Number: 7 (DW_TAG_subprogram) <f5> DW_AT_specification: <0xc1> <f9> DW_AT_low_pc : 0x4004c7 <101> DW_AT_high_pc : 0x7 ... which does contain the low and high bounds. Fix this by not complaining about function declarations. Tested on x86_64-linux.
2022-10-04[gdb/symtab] Don't complain about inlined functionsTom de Vries3-1/+74
With the test-case included in this patch, we get: ... (gdb) ptype main^M During symbol reading: cannot get low and high bounds for subprogram DIE \ at 0x113^M During symbol reading: cannot get low and high bounds for subprogram DIE \ at 0x11f^M type = int (void)^M (gdb) FAIL: gdb.dwarf2/inline.exp: ptype main ... The complaints are about foo, with DW_AT_inline == DW_INL_inlined: ... <1><11f>: Abbrev Number: 6 (DW_TAG_subprogram) <120> DW_AT_name : foo <126> DW_AT_prototyped : 1 <126> DW_AT_type : <0x10c> <12a> DW_AT_inline : 1 (inlined) ... and foo2, with DW_AT_inline == DW_INL_declared_inlined: ... <1><113>: Abbrev Number: 5 (DW_TAG_subprogram) <114> DW_AT_name : foo2 <11a> DW_AT_prototyped : 1 <11a> DW_AT_type : <0x10c> <11e> DW_AT_inline : 3 (declared as inline and inlined) ... Fix this by not complaining about inlined functions. Tested on x86_64-linux.
2022-10-04gdb/riscv: Partial support for instructions up to 176-bitTsukasa OI1-4/+5
Because riscv_insn_length started to support instructions up to 176-bit, we need to increase buf size to 176-bit in size. Also, that would break an assumption in riscv_insn::decode so this commit fixes it, noting that instructions longer than 64-bit are not fully supported yet.
2022-10-04RISC-V: Fix buffer overflow on print_insn_riscvTsukasa OI2-1/+3
Because riscv_insn_length started to support instructions up to 176-bit, we need to increase packet buffer size to 176-bit in size. include/ChangeLog: * opcode/riscv.h (RISCV_MAX_INSN_LEN): Max instruction length for use in buffer size. opcodes/ChangeLog: * riscv-dis.c (print_insn_riscv): Increase buffer size for max 176-bit length instructions.
2022-10-04RISC-V: Renamed INSN_CLASS for floating point in integer extensions.Nelson Chu3-244/+243
Just added suffix _INX for those INSN_CLASS should be enough to represent their fpr can be replaced by gpr.
2022-10-04Note that at least dejagnu version 1.5.3 is required in order to be ale to ↵Nick Clifton2-4/+18
run the testsuites. * README-maintainer-mode: Add a minimum version of dejagnu requirement.
2022-10-04opcodes/riscv: style csr names as registersAndrew Burgess1-1/+2
While reviewing another patch I noticed that RISC-V CSR names are given the text style, not the register style. This patch fixes this mistake.
2022-10-04[AArch64] Update FPSR/FPCR fields for FPU and SVELuis Machado3-2/+53
I noticed some missing flags/fields from FPSR and FPCR registers in both the FPU and SVE target descriptions. This patch adds those and makes the SVE versions of FPSR and FPCR use the proper flags/bitfields types.
2022-10-04Support objcopy changing compression to or from zstdAlan Modra3-169/+169
Commit 2cac01e3ffff lacked support for objcopy changing compression style. Add that support, which meant a rewrite of bfd_compress_section_contents. In the process I've fixed some memory leaks. * compress.c (bfd_is_section_compressed_info): Rename from bfd_is_section_compressed_with_header and add ch_type param to return compression header ch_type field. Update all callers. (decompress_section_contents): Remove buffer and size params. Rewrite. Update callers. (bfd_init_section_compress_status): Free contents on failure. (bfd_compress_section): Likewise. * elf.c (_bfd_elf_make_section_from_shdr): Support objcopy changing between any of the three compression schemes. Report "unable to compress/decompress" rather than "unable to initialize compress/decompress status" on compress/decompress failures. * bfd-in2.h: Regenerate.
2022-10-04Re: compress .gnu.debuglto_.debug_* sections if requestedAlan Modra2-16/+11
Enable zlib-gnu compression for .gnu.debuglto_.debug_*. This differs from zlib-gnu for .debug_* where the name is changed to .zdebug_*. The name change isn't really needed. bfd/ * elf.c (elf_fake_sections): Replace "." with ".z" in debug section names only when name was ".d*", ie. ".debug_*". (_bfd_elf_assign_file_positions_for_non_load): Likewise. gas/ * write.c (compress_debug): Compress .gnu.debuglto_.debug_* for zlib-gnu too. Compress .gnu.linkonce.wi.*.
2022-10-04compress .gnu.debuglto_.debug_* sections if requestedMartin Liska2-5/+6
Right now, when using LTO, the intermediate object files do contain debug info in sections starting with .gnu.debuglto_ prefix and are not compressed when --compress-debug-sections is used. It's a mistake and we can save quite some disk space. The following example comes from tramp3d when the corresponding LTO sections are compressed with zlib: $ bloaty tramp3d-v4-v2.o -- tramp3d-v4.o FILE SIZE VM SIZE -------------- -------------- +83% +10 [ = ] 0 [Unmapped] -68.0% -441 [ = ] 0 .gnu.debuglto_.debug_line -52.3% -759 [ = ] 0 .gnu.debuglto_.debug_line_str -62.4% -3.24Ki [ = ] 0 .gnu.debuglto_.debug_abbrev -64.8% -1.12Mi [ = ] 0 .gnu.debuglto_.debug_info -88.8% -4.58Mi [ = ] 0 .gnu.debuglto_.debug_str -27.7% -5.70Mi [ = ] 0 TOTAL bfd/ChangeLog: * elf.c (_bfd_elf_make_section_from_shdr): Compress all debug info sections. gas/ChangeLog: * write.c (compress_debug): Compress also ".gnu.debuglto_.debug_" if the compression algorithm is different from zlib-gnu.
2022-10-04RISC-V/gas: allow generating up to 176-bit instructions with .insnJan Beulich8-10/+85
For the time being simply utilize O_big to avoid widening other fields, bypassing append_insn() etc.
2022-10-04RISC-V/gas: don't open-code insn_length()Jan Beulich1-1/+1
Use the helper when it can be used.
2022-10-04RISC-V/gas: drop stray call to install_insn()Jan Beulich1-1/+0
add_fixed_insn(), by calling move_insn(), already invokes install_insn().
2022-10-04RISC-V/gas: drop riscv_subsets static variableJan Beulich1-18/+14
It's fully redundant with the subset_list member of riscv_rps_as.
2022-10-04RISC-V: don't cast expressions' X_add_number to long in diagnosticsJan Beulich1-4/+4
There's no need for such workarounds anymore now that we use C99 uniformly. This addresses several testsuite failures encountered when (cross-)building on a 32-bit host.
2022-10-04ignore DWARF debug information for -gsplit-dwarf with dwarf-5Potharla, Rupesh1-1/+11
Skip dwo_id for split dwarf. * dwarf2.c (parse_comp_unit): Skip DWO_id for DW_UT_skeleton.
2022-10-04Automatic date update in version.inGDB Administrator1-1/+1
2022-10-03Fix self-move warning check for GCC 13+Jan-Benedict Glaw2-0/+8
GCC 13 got the self-move warning (0abb78dda084a14b3d955757c6431fff71c263f3), but that warning is only checked for clang, resulting in: /usr/lib/gcc-snapshot/bin/g++ -x c++ -I. -I. -I./config -DLOCALEDIR="\"/tmp/gdb-m68k-linux/share/locale\"" -DHAVE_CONFIG_H -I./../include/opcode -I./../readline/readline/.. -I./../zlib -I../bfd -I./../bfd -I./../include -I../libdecnumber -I./../libdecnumber -I./../gnulib/import -I../gnulib/import -I./.. -I.. -I./../libbacktrace/ -I../libbacktrace/ -DTUI=1 -I./.. -pthread -Wall -Wpointer-arith -Wno-unused -Wunused-value -Wunused-variable -Wunused-function -Wno-switch -Wno-char-subscripts -Wempty-body -Wunused-but-set-parameter -Wunused-but-set-variable -Wno-sign-compare -Wno-error=maybe-uninitialized -Wno-mismatched-tags -Wsuggest-override -Wimplicit-fallthrough=3 -Wduplicated-cond -Wshadow=local -Wdeprecated-copy -Wdeprecated-copy-dtor -Wredundant-move -Wmissing-declarations -Wstrict-null-sentinel -Wformat -Wformat-nonliteral -Werror -g -O2 -c -o unittests/environ-selftests.o -MT unittests/environ-selftests.o -MMD -MP -MF unittests/.deps/environ-selftests.Tpo unittests/environ-selftests.c unittests/environ-selftests.c: In function 'void selftests::gdb_environ_tests::test_self_move()': unittests/environ-selftests.c:228:7: error: moving 'env' of type 'gdb_environ' to itself [-Werror=self-move] 228 | env = std::move (env); | ~~~~^~~~~~~~~~~~~~~~~ unittests/environ-selftests.c:228:7: note: remove 'std::move' call cc1plus: all warnings being treated as errors make[1]: *** [Makefile:1896: unittests/environ-selftests.o] Error 1 make[1]: Leaving directory '/var/lib/laminar/run/gdb-m68k-linux/3/binutils-gdb/gdb' make: *** [Makefile:13193: all-gdb] Error 2
2022-10-03gdb: constify inferior::target_is_pushedSimon Marchi1-1/+1
Change-Id: Ia4143b9c63cb76e2c824ba773c66f5c5cd94b2aa
2022-10-03[AArch64] Handle W registers as pseudo-registers instead of aliases of X ↵Luis Machado4-34/+212
registers The aarch64 port handles W registers as aliases of X registers. This is incorrect because X registers are 64-bit and W registers are 32-bit. This patch teaches GDB how to handle W registers as pseudo-registers of 32-bit, the bottom half of the X registers. Testcase included.
2022-10-03[AArch64] Fix pseudo-register numbering in the presence of unexpected ↵Luis Machado1-2/+13
additional registers When using AArch64 GDB with the QEMU debugging stub (in user mode), we get additional system registers that GDB doesn't particularly care about, so it doesn't number those explicitly. But given the pseudo-register numbers are above the number of real registers, we need to setup/account for the real registers first before going ahead and numbering the pseudo-registers. This has to happen at the end of aarch64_gdbarch_init, after the call to tdesc_use_registers, as that updates the total number of real registers. This is in preparation to supporting pointer authentication for bare metal aarch64 (QEMU).
2022-10-03readelf: DO not load section headers from file offset zeroNick Clifton2-0/+19
* readelf.c (get_32bit_section_headers): Return false if the e_shoff field is zero. (get_64bit_section_headers): Likewise.
2022-10-03RISC-V: Move supervisor instructions after all unprivileged onesTsukasa OI1-32/+32
This location of supervisor instructions is out of place (because many other privileged instructions are located at the tail but after the supervisor instructions, we have many unprivileged instructions including bit manipulation / crypto / vector instructions). Not only that, this is harmful to implement pseudoinstructions in the latest 'P'-extension proposal (CLROV and RDOV). This commit moves supervisor instructions after all unprivileged instructions. opcodes/ChangeLog: * riscv-opc.c (riscv_opcodes): Adjust indents. Move supervisor instructions after all unprivileged instructions.
2022-10-03Improve GDB's baseclass detection with typedefsBruno Larsen3-17/+37
When a class inherits from a typedef'd baseclass, GDB may be unable to find the baseclass if the user is not using the typedef'd name, as is tested on gdb.cp/virtbase2.exp; the reason that test case is working under gcc is that the dwarf generated by gcc links the class to the original definition of the baseclass, not to the typedef. If the inheritance is linked to the typedef, such as how clang does it, gdb.cp/virtbase2.exp starts failing. This can also be seen in gdb.cp/impl-this.exp, when attempting to print D::Bint::i, and GDB not being able to find the baseclass Bint. This happens because searching for baseclasses only uses the macro TYPE_BASECLASS_NAME, which returns the typedef'd name. However, we can't switch that macro to checking for typedefs, otherwise we wouldn't be able to find the typedef'd name anymore. This is fixed by searching for members or baseclasses by name, we check both the saved name and the name after checking for typedefs. This also fixes said long-standing bug in gdb.cp/impl-this.exp when the compiler adds information about typedefs in the debuginfo.
2022-10-03RISC-V: Assign DWARF numbers to vector registersTsukasa OI4-12/+91
This commit assigns DWARF register numbers to vector registers (v0-v31: 96..127) to implement RISC-V DWARF Specification version 1.0-rc4 (now in the frozen state): https://github.com/riscv-non-isa/riscv-elf-psabi-doc/releases/tag/v1.0-rc4 binutils/ChangeLog: * dwarf.c (dwarf_regnames_riscv): Assign DWARF register numbers 96..127 to vector registers v0-v31. gas/ChangeLog: * config/tc-riscv.c (tc_riscv_regname_to_dw2regnum): Support vector registers. * testsuite/gas/riscv/dw-regnums.s: Add vector registers to the DWARF register number test. * testsuite/gas/riscv/dw-regnums.d: Likewise.
2022-10-03RISC-V: Add testcase for DWARF register numbersTsukasa OI2-0/+296
Although it had csr-dw-regnums.d (for CSRs), it didn't have DWARF register number test for GPRs/FPRs. This commit adds dw-regnums.{s,d} to test such registers. gas/ChangeLog: * testsuite/gas/riscv/dw-regnums.s: New DWARF register number test for GPRs/FPRs. * testsuite/gas/riscv/dw-regnums.d: Likewise.
2022-10-03Automatic date update in version.inGDB Administrator1-1/+1
2022-10-02[gdb/testsuite] Fix waitpid testing in next-fork-other-thread.cTom de Vries1-1/+1
In next-fork-other-thread.c, there's this loop: ... do { ret = waitpid (pid, &stat, 0); } while (ret == EINTR); ... The loop condition tests for "ret == EINTR" but waitpid signals EINTR by returning -1 and setting errno to EINTR. Fix this by changing the loop condition to "ret == -1 && errno == EINTR". Tested on x86_64-linux.
2022-10-02gdb/testsuite: handle invalid .exp names passed in TESTSAndrew Burgess1-10/+30
I ran some tests like: $ make check-gdb TESTS="gdb.base/break.exp" then, then I went to rerun the tests later, I managed to corrupt the command line, like this: $ make check-gdb TESTS="gdb.base/breakff.exp" the make command did exit with an error, but DejaGnu appeared to report that every test passed! The tail end of the output looks like this: Illegal Argument "no-matching-tests-found" try "runtest --help" for option list === gdb Summary === # of expected passes 115 /tmp/build/gdb/gdb version 13.0.50.20220831-git -nw -nx -iex "set height 0" -iex "set width 0" -data-directory /tmp/build/gdb/testsuite/../data-directory make[3]: *** [Makefile:212: check-single] Error 1 make[3]: Leaving directory '/tmp/build/gdb/testsuite' make[2]: *** [Makefile:161: check] Error 2 make[2]: Leaving directory '/tmp/build/gdb/testsuite' make[1]: *** [Makefile:1916: check] Error 2 make[1]: Leaving directory '/tmp/build/gdb' make: *** [Makefile:13565: check-gdb] Error 2 For a while, I didn't spot that DejaGnu had failed at all, I saw the 115 passes, and thought everything had run correctly - though I was puzzled that make was reporting an error. What happens is that in gdb/testsuite/Makefile, in the check-single rule, we first run DejaGnu, then run the dg-add-core-file-count.sh script, and finally, we use sed to extract the results from the gdb.sum file. In my case, with the invalid test name, DejaGnu fails, but the following steps are still run, the final result, the 115 passes, is then extracted from the pre-existing gdb.sum file. If I use 'make -jN' then the 'check-parallel' rule, rather than the 'check-single' rule is used. In this case the behaviour is slightly different, the tail end of the output now looks like this: No matching tests found. make[4]: Leaving directory '/tmp/build/gdb/testsuite' find: ‘outputs’: No such file or directory Usage: ../../../src/gdb/testsuite/../../contrib/dg-extract-results.py [-t tool] [-l variant-list] [-L] log-or-sum-file ... tool The tool (e.g. g++, libffi) for which to create a new test summary file. If not specified then output is created for all tools. variant-list One or more test variant names. If the list is not specified then one is constructed from all variants in the files for <tool>. sum-file A test summary file with the format of those created by runtest from DejaGnu. If -L is used, merge *.log files instead of *.sum. In this mode the exact order of lines may not be preserved, just different Running *.exp chunks should be in correct order. find: ‘outputs’: No such file or directory Usage: ../../../src/gdb/testsuite/../../contrib/dg-extract-results.py [-t tool] [-l variant-list] [-L] log-or-sum-file ... tool The tool (e.g. g++, libffi) for which to create a new test summary file. If not specified then output is created for all tools. variant-list One or more test variant names. If the list is not specified then one is constructed from all variants in the files for <tool>. sum-file A test summary file with the format of those created by runtest from DejaGnu. If -L is used, merge *.log files instead of *.sum. In this mode the exact order of lines may not be preserved, just different Running *.exp chunks should be in correct order. make[3]: Leaving directory '/tmp/build/gdb/testsuite' make[2]: Leaving directory '/tmp/build/gdb/testsuite' make[1]: Leaving directory '/tmp/build/gdb' Rather than DejaGnu failing, we now get a nice 'No matching tests found' message, followed by some other noise. This other noise is first `find` failing, followed by the dg-extract-results.py script failing. What happens here is that, in the check-parallel rule, the outputs directory is deleted before DejaGnu is invoked. Then we try to run all the tests, and finally we use find and dg-extract-results.py to combine all the separate .sun and .log files together. However, if there are no tests run then the outputs/ directory is never created, so the find command and consequently the dg-extract-results.py script, fail. This commit aims to fix the following issues: (1) For check-single and check-parallel rules, don't run any of the post-processing steps if DejaGnu failed to run. This will avoid all the noise after the initial failure of DejaGnu, (2) For check-single ensure that we don't accidentally report previous results, this is related to the above, but is worth calling out as a separate point, and (3) For check-single, print the 'No matching tests found' message just like we do for a parallel test run. This makes the parallel and non-parallel testing behaviour more similar, and I think is clearer than the current 'Illegal Argument' error message. Points (1) and (2) will be handled by moving the post processing steps inside an if block within the recipe. For check-single I propose deleting the gdb.sum and gdb.log files before running DejaGnu, this is similar (I think) to how we delete the outputs/ directory in the check-parallel rule. For point (3) I plan to split the check-single rule in two, the existing check-single will be renamed do-check-single, then a new check-single rule will be added. The new check-single rule can either depend on the new do-check-single, or will ensure the 'No matching tests found' message is printed when appropriate.
2022-10-02gdb/disasm: better intel flavour disassembly styling with PygmentsAndrew Burgess1-4/+55
This commit was inspired by this stackoverflow post: https://stackoverflow.com/questions/73491793/why-is-there-a-%C2%B1-in-lea-rax-rip-%C2%B1-0xeb3 One of the comments helpfully links to this Python test case: from pygments import formatters, lexers, highlight def colorize_disasm(content, gdbarch): try: lexer = lexers.get_lexer_by_name("asm") formatter = formatters.TerminalFormatter() return highlight(content, lexer, formatter).rstrip().encode() except: return None print(colorize_disasm("lea [rip+0x211] # COMMENT", None).decode()) Run the test case and you should see that the '+' character is underlined, and could be confused with a combined +/- symbol. What's happening is that Pygments is failing to parse the input text, and the '+' is actually being marked in the error style. The error style is red and underlined. It is worth noting that the assembly instruction being disassembled here is an x86-64 instruction in the 'intel' disassembly style, rather than the default att style. Clearly the Pygments module expects the att syntax by default. If we change the test case to this: from pygments import formatters, lexers, highlight def colorize_disasm(content, gdbarch): try: lexer = lexers.get_lexer_by_name("asm") lexer.add_filter('raiseonerror') formatter = formatters.TerminalFormatter() return highlight(content, lexer, formatter).rstrip().encode() except: return None res = colorize_disasm("lea rax,[rip+0xeb3] # COMMENT", None) if res: print(res.decode()) else: print("No result!") Here I've added the call: lexer.add_filter('raiseonerror'), and I am now checking to see if the result is None or not. Running this and the test now print 'No result!' - instead of styling the '+' in the error style, we instead give up on the styling attempt. There are two things we need to fix relating to this disassembly text. First, Pygments is expecting att style disassembly, not the intel style that this example uses. Fortunately, Pygments also supports the intel style, all we need to do is use the 'nasm' lexer instead of the 'asm' lexer. However, this leads to the second problem; in our disassembler line we have '# COMMENT'. The "official" Intel disassembler style uses ';' for its comment character, however, gas and libopcodes use '#' as the comment character, as gas uses ';' for an instruction separator. Unfortunately, Pygments expects ';' as the comment character, and treats '#' as an error, which means, with the addition of the 'raiseonerror' filter, that any line containing a '#' comment, will not get styled correctly. However, as the i386 disassembler never produces a '#' character other than for comments, we can easily "fix" Pygments parsing of the disassembly line. This is done by creating a filter. This filter looks for an Error token with the value '#', we then change this into a comment token. Every token after this (until the end of the line) is also converted into a comment. In this commit I do the following: 1. Check the 'disassembly-flavor' setting and select between the 'asm' and 'nasm' lexers based on the setting. If the setting is not available then the 'asm' lexer is used by default, 2. Use "add_filter('raiseonerror')" to ensure that the formatted output will not include any error text, which would be underlined, and might be confusing, 3. If the 'nasm' lexer is selected, then add an additional filter that will format '#' and all other text on the line, as a comment, and 4. If Pygments throws an exception, instead of returning None, return the original, unmodified content. This will mean that this one instruction is printed without styling, but GDB will continue to call into the Python code to style later instructions. I haven't included a test specifically for the above error case, though I have manually check that the above case now styles correctly (with no underline). The existing style tests check that the disassembler styling still works though, so I know I've not generally broken things. One final thought I have after looking at this issue is that I wonder now if using Pygments for styling disassembly from every architecture is actually a good idea? Clearly, the 'asm' lexer is OK with att style x86-64, but not OK with intel style x86-64, so who knows how well it will handle other random architectures? When I first added this feature I tested it against some random RISC-V, ARM, and X86-64 (att style) code, and it seemed fine, but I never tried to make an exhaustive check of all instructions, so its quite possible that there are corner cases where things are styled incorrectly. With the above changes I think that things should be a bit better now. If a particular instruction doesn't parse correctly then our Pygments based styling code will just not style that one instruction. This is combined with the fact that many architectures are now moving to libopcodes based styling, which is much more reliable. So, I think it is fine to keep using Pygments as a fallback mechanism for styling all architectures, even if we know it might not be perfect in all cases.
2022-10-02gdb: improve disassembler styling when Pygments raises an exceptionAndrew Burgess3-24/+130
While working on another issue relating to GDB's use of the Python Pygments package for disassembly styling I noticed an issue in the case where the Pygments package raises an exception. The intention of the current code is that, should the Pygments package raise an exception, GDB will disable future attempts to call into the Pygments code. This was intended to prevent repeated errors during disassembly if, for some reason, the Pygments code isn't working. Since the Pygments based styling was added, GDB now supports disassembly styling using libopcodes, but this is only available for some architectures. For architectures not covered by libopcodes Pygments is still the only option. What I observed is that, if I disable the libopcodes styling, then setup GDB so that the Pygments based styling code will indicate an error (by returning None), GDB does, as expected, stop using the Pygments based styling. However, the libopcodes based styling will instead be used, despite this feature having been disabled. The problem is that the disassembler output is produced into a string_file buffer. When we are using Pygments, this buffer is created without styling support. However, when Pygments fails, we recreate the buffer with styling support. The problem is that we should only recreate the buffer with styling support only if libopcodes styling is enabled. This was an oversight in commit: commit 4cbe4ca5da5cd7e1e6331ce11f024bf3c07b9744 Date: Mon Feb 14 14:40:52 2022 +0000 gdb: add support for disassembler styling using libopcodes This commit: 1. Factors out some of the condition checking logic into two new helper functions use_ext_lang_for_styling and use_libopcodes_for_styling, 2. Reorders gdb_disassembler::m_buffer and gdb_disassembler::m_dest, this allows these fields to be initialised m_dest first, which means that the new condition checking functions can rely on m_dest being set, even when called from the gdb_disassembler constructor, 3. Make use of the new condition checking functions each time m_buffer is initialised, 4. Add a new test that forces the Python disassembler styling function to return None, this will cause GDB to disable use of Pygments for styling, and 5. When we reinitialise m_buffer, and re-disassemble the instruction, call reset the in-comment flag. If the instruction being disassembler ends in a comment then the first disassembly pass will have set the in-comment flag to true. This shouldn't be a problem as we will only be using Pygments, and thus performing a re-disassembly pass, if libopcodes is disabled, so the in-comment flag will never be checked, even if it is set incorrectly. However, I think that having the flag set correctly is a good thing, even if we don't check it (you never know what future uses might come up).
2022-10-02gdb/testsuite: extend styling test for libopcodes stylingAndrew Burgess1-30/+65
This commit extends the gdb.base/style.exp test to cover disassembler styling using libopcodes (where available). The test will try to enable libopcode based styling, if this works (because such styling is available) then some tests are run to check that the output is styled, and that the styling can be disabled using 'set style disassembler enabled off'. If libopcodes styling is not available on the current architecture then these new tests will be skipped. I've moved the Python Pygments module check inside the test_disable_disassembler_styling function now, so that the test will be run even when Python Pygments is not available, this allows the new tests discussed above. If the Pygments module is not available then the Pygments based tests will be skipped just as they were before.
2022-10-02gdb: update now gdbarch_register_name doesn't return nullptrAndrew Burgess16-56/+27
After the previous few commit, gdbarch_register_name no longer returns nullptr. This commit audits all the calls to gdbarch_register_name and removes any code that checks the result against nullptr. There should be no visible change after this commit.
2022-10-02gdb: final cleanup of various gdbarch_register_name methodsAndrew Burgess29-146/+66
Building on the previous commits, this commit goes through the various gdbarch_register_name methods and removes all the remaining 'return NULL' cases, I claim that these either couldn't be hit, or should be returning the empty string. In all cases the return of NULL was used if the register number being passed to gdbarch_register_name was "invalid", i.e. negative, or greater than the total number of declared registers. I don't believe either of these cases can occur, and the previous commit asserts that this is the case. As a result we can simplify the code by removing these checks. In many cases, where the register names are held in an array, I was able to add a static assert that the array contains the correct number of strings, after that, a direct access into the array is fine. I don't have any means of testing these changes.
2022-10-02gdb/csky: remove nullptr return from csky_pseudo_register_nameAndrew Burgess2-20/+11
Building on the previous commits, in this commit I remove two instances of 'return NULL' from csky_pseudo_register_name, and replace them with a return of the empty string. These two are particularly interesting, and worth pulling into their own commit, because these returns of NULL appear to be depended on within other parts of the csky code. In csky-linux-tdep.c in the register collect/supply code, GDB checks for the register name being nullptr in order to decide if a target supports a particular feature or not. I've updated the code to check for the empty string. I have no way of testing this change.