aboutsummaryrefslogtreecommitdiff
path: root/llvm/tools/llvm-mca
AgeCommit message (Collapse)AuthorFilesLines
2025-07-04[llvm] Use llvm::fill instead of std::fill(NFC) (#146911)Austin3-7/+6
Use llvm::fill instead of std::fill
2025-05-25MCParser: Replace deprecated alias MCAsmLexer with AsmLexerFangrui Song1-1/+1
2025-05-25Replace #include MCAsmLexer.h with AsmLexer.hFangrui Song1-1/+1
MCAsmLexer.h has been made a forwarder header since #134207
2025-05-21[llvm] Use *Map::try_emplace (NFC) (#140843)Kazu Hirata1-1/+1
try_emplace can default-construct values, so we do not need to do so on our own. Plus, try_emplace(Key) is much shorter than insert(std::make_pair(Key, Value()).
2025-05-20[llvm-mca] Drop const from a return type (NFC) (#140836)Kazu Hirata2-2/+2
2025-05-18[llvm] Remove unused local variables (NFC) (#140422)Kazu Hirata1-2/+0
2025-03-25[MCA] Extend -instruction-tables option with verbosity levels (#130574)Julien Villette3-66/+209
Option becomes: -instruction-tables=`<level>` The choice of `<level>` controls number of printed information. `<level>` may be `none` (default), `normal`, `full`. Note: If the option is used without `<label>`, default is `normal` (legacy). When `<level>` is `full`, additional information are: - `<Bypass Latency>`: Latency when a bypass is implemented between operands in pipelines (see SchedReadAdvance). - `<LLVM Opcode Name>`: mnemonic plus operands identifier. - `<Resources units>`: Used resources associated with LLVM Opcode. - `<instruction comment>`: reports comment if any from source assembly. Level `full` can be used to better check scheduling info when TableGen is modified. LLVM Opcode name help to find right instruction regexp to fix TableGen Scheduling Info. -instruction-tables=full option is validated on AArch64/Neoverse/V1-sve-instructions.s Follow up of MR #126703 --------- Co-authored-by: Julien Villette <julien.villette@sipearl.com>
2025-03-04[llvm-mca] Avoid repeated hash lookups (NFC) (#129656)Kazu Hirata1-3/+2
2025-03-02[MC] Move MIPS-specific gprel/tprel/dtprel from MCStreamer to MipsTargetStreamerFangrui Song1-1/+0
https://reviews.llvm.org/D23669 inappropriately added MIPS-specific dtprel/tprel directives to MCStreamer. In addition, llvm-mc -filetype=null parsing these directives will crash. This patch moves these functions to MipsTargetStreamer and fixes -filetype=null. gprel32 and gprel64, called by AsmPrinter, are moved to MCTargetStreamer.
2025-01-31[MCA] Do not allocate space for DependenceEdge by default in ↵Anton Sidorenko1-1/+4
DependencyGraphNode (NFC) (#125080) For each instruction from the input assembly sequence, DependencyGraph has a dedicated node (DGNode). Outgoing edges (data, resource and memory dependencies) are tracked as SmallVector<..., 8> for each DGNode in the graph. However, it's unlikely that a usual input instruction will have approximately eight dependent instructions. Below is my statistics for several RISC-V input sequences: ``` Number of | Number of nodes with edges | this # of edges --------------------------------- 0 | 8239447 1 | 464252 2 | 6164 3 | 6783 4 | 939 5 | 500 6 | 545 7 | 116 8 | 2 9 | 1 10 | 1 ``` Approximately the same distribution is produced by llvm-mca lit tests for X86, AArch and RISC-V (even modified ones with extra dependencies added). On a rather big input asm sequences, the use of SmallVector<..., 8> dramatically increases memory consumption without any need for it. In my case, replacing it with SmallVector<...,0> reduces memory usage by ~28% or ~1700% of input file size (2.2GB in absolute values). There is no change in execution time, I verified it on mca lit-tests and on my big test (execution time is ~30s in both cases). This change was made with the same intention as #124904 and optimizes I believe quite an unusual scenario. However, if there is no negative impact on other known scenarios, I'd like to have the change in llvm-project.
2025-01-31[MCA] Optimize memory consumption in resource pressure view (NFC) (#124904)Anton Sidorenko2-20/+62
ResourceUsage is a very sparse table. On large input asm sequences it consumes a lot of memory utilizing only a few percents of it (~4% on my benchmark). Reorganization of ResourceUsage to keep only used fields allows saving up to 18% of total memory use by mca or ~850% of input file size (~1.1GB in absolute values in my case).
2024-08-19[llvm-mca] Add bottle-neck analysis to JSON output. (#90056)Phil Camp2-1/+48
This patch implements the bottle-neck analysis data in the JSON dump mode.
2024-07-21[MC] Remove unnecessary isVerboseAsm from createAsmTargetStreamerFangrui Song1-2/+1
2024-06-19[llvm-mca] Use llvm::erase_if (NFC) (#96029)Kazu Hirata1-5/+3
2024-05-22[llvm-mca] Add command line option -call-latency (#92958)Chinmay Deshpande1-1/+6
Currently we assume a constant latency of 100 cycles for call instructions. This commit allows the user to specify a custom value for the same as a command line argument. Default latency is set to 100.
2024-05-07[llvm-mca] Abort on parse error without -skip-unsupported-instructions (#90474)Peter Waller3-24/+70
[llvm-mca] Abort on parse error without -skip-unsupported-instructions Prior to this patch, llvm-mca would continue executing after parse errors. These errors can lead to some confusion since some analysis results are printed on the standard output, and they're printed after the errors, which could otherwise be easy to miss. However it is still useful to be able to continue analysis after errors; so extend the recently added -skip-unsupported-instructions to support this. Two tests which have parse errors for some of the 'RUN' branches are updated to use -skip-unsupported-instructions so they can remain as-is. Add a description of -skip-unsupported-instructions to the llvm-mca command guide, and add it to the llvm-mca --help output: ``` --skip-unsupported-instructions=<value> - Force analysis to continue in the presence of unsupported instructions =none - Exit with an error when an instruction is unsupported for any reason (default) =lack-sched - Skip instructions on input which lack scheduling information =parse-failure - Skip lines on the input which fail to parse for any reason =any - Skip instructions or lines on input which are unsupported for any reason ``` Tests within this patch are intended to cover each of the cases. Reason | Flag | Comment --------------|------|------- none | none | Usual case, existing test suite lack-sched | none | Advises user to use -skip-unsupported-instructions=lack-sched, tested in llvm/test/tools/llvm-mca/X86/BtVer2/unsupported-instruction.s parse-failure | none | Advises user to use -skip-unsupported-instructions=parse-failure, tested in llvm/test/tools/llvm-mca/bad-input.s any | none | (N/A, covered above) lack-sched | any | Continues, prints warnings, tested in llvm/test/tools/llvm-mca/X86/BtVer2/unsupported-instruction.s parse-failure | any | Continues, prints errors, tested in llvm/test/tools/llvm-mca/bad-input.s lack-sched | parse-failure | Advises user to use -skip-unsupported-instructions=lack-sched, tested in llvm/test/tools/llvm-mca/X86/BtVer2/unsupported-instruction.s parse-failure | lack-sched | Advises user to use -skip-unsupported-instructions=parse-failure, tested in llvm/test/tools/llvm-mca/bad-input.s none | * | This would be any test case with skip-unsupported-instructions, coverage added in llvm/test/tools/llvm-mca/X86/BtVer2/simple-test.s any | * | (Logically covered by the other cases)
2024-04-29[llvm-mca] Add -skip-unsupported-instructions option (#89733)Peter Waller2-5/+49
Prior to this patch, if llvm-mca encountered an instruction which parses but has no scheduler info, the instruction is always reported as unsupported, and llvm-mca halts with an error. However, it would still be useful to allow MCA to continue even in the case of instructions lacking scheduling information. Obviously if scheduling information is lacking, it's not possible to give an accurate analysis for those instructions, and therefore a warning is emitted. A user could previously have worked around such unsupported instructions manually by deleting such instructions from the input, but this provides them a way of doing this for bulk inputs where they may not have a list of such unsupported instructions to drop up front. Note that this behaviour of instructions with no scheduling information under -skip-unsupported-instructions is analagous to current instructions which fail to parse: those are currently dropped from the input with a message printed, after which the analysis continues. ~Testing the feature is a little awkward currently, it relies on an instruction which is currently marked as unsupported, which may not remain so; should the situation change it would be necessary to find an alternative unsupported instruction or drop the test.~ A test is added to check that analysis still reports an error if all instructions are removed from the input, to mirror the current behaviour of giving an error if no instructions are supplied.
2024-04-11[llvm-mca] Remove spurious include_directories() (#88277)Thomas Preud'homme1-2/+0
llvm-mca does not have an include directory so this commit removes the spurious include_directories directive.
2024-03-10Add llvm::min/max_element and use it in llvm/ and mlir/ directories. (#84678)Justin Lebar2-5/+5
For some reason this was missing from STLExtras.
2023-11-11[llvm] Stop including llvm/ADT/DenseMap.h (NFC)Kazu Hirata1-1/+0
Identified with clangd.
2023-08-24[TableGen] Rename ResourceCycles and StartAtCycle to clarify semanticsMichael Maitland3-6/+6
D150312 added a TODO: TODO: consider renaming the field `StartAtCycle` and `Cycles` to `AcquireAtCycle` and `ReleaseAtCycle` respectively, to stress the fact that resource allocation is now represented as an interval, relatively to the issue cycle of the instruction. This patch implements that TODO. This naming clarifies how to use these fields in the scheduler. In addition it was confusing that `StartAtCycle` was singular but `Cycles` was plural. This renaming fixes this inconsistency. This commit as previously reverted since it missed renaming that came down after rebasing. This version of the commit fixes those problems. Differential Revision: https://reviews.llvm.org/D158568
2023-08-24Revert "[TableGen] Rename ResourceCycles and StartAtCycle to clarify semantics"Michael Maitland3-6/+6
This reverts commit 5b854f2c23ea1b000cb4cac4c0fea77326c03d43. Build still failing.
2023-08-24[TableGen] Rename ResourceCycles and StartAtCycle to clarify semanticsMichael Maitland3-6/+6
D150312 added a TODO: TODO: consider renaming the field `StartAtCycle` and `Cycles` to `AcquireAtCycle` and `ReleaseAtCycle` respectively, to stress the fact that resource allocation is now represented as an interval, relatively to the issue cycle of the instruction. This patch implements that TODO. This naming clarifies how to use these fields in the scheduler. In addition it was confusing that `StartAtCycle` was singular but `Cycles` was plural. This renaming fixes this inconsistency. This commit as previously reverted since it missed renaming that came down after rebasing. This version of the commit fixes those problems. Differential Revision: https://reviews.llvm.org/D158568
2023-08-24Revert "[TableGen] Rename ResourceCycles and StartAtCycle to clarify semantics"Michael Maitland3-6/+6
This reverts commit 030d33409568b2f0ea61116e83fd40ca27ba33ac. This commit is causing build failures
2023-08-24[TableGen] Rename ResourceCycles and StartAtCycle to clarify semanticsMichael Maitland3-6/+6
D150312 added a TODO: TODO: consider renaming the field `StartAtCycle` and `Cycles` to `AcquireAtCycle` and `ReleaseAtCycle` respectively, to stress the fact that resource allocation is now represented as an interval, relatively to the issue cycle of the instruction. This patch implements that TODO. This naming clarifies how to use these fields in the scheduler. In addition it was confusing that `StartAtCycle` was singular but `Cycles` was plural. This renaming fixes this inconsistency. Differential Revision: https://reviews.llvm.org/D158568
2023-07-06[llvm-mca][RISCV] vsetivli and vsetvli act as instrumentsMichael Maitland3-46/+88
Since the LMUL data that is needed to create an instrument is avaliable statically from vsetivli and vsetvli instructions, LMUL instruments can be automatically generated so that clients of the tool do no need to manually insert instrument comments. Instrument comments may be placed after a vset{i}vli instruction, which will override instrument that was automatically inserted. As a result, clients of llvm-mca instruments do not need to update their existing instrument comments. However, if the instrument has the same LMUL as the vset{i}vli, then it is reccomended to remove the instrument comment as it becomes redundant. Differential Revision: https://reviews.llvm.org/D154526
2023-06-19[llvm-mca][TimelineView] Skip invalid entries when printing the json output.Andrea Di Biagio1-0/+4
2023-06-03[llvm-mca] Modernize MCACommentConsumer (NFC)Kazu Hirata1-2/+2
2023-05-22[llvm-mca] Print InstructionInfoView using Instrument information.Michael Maitland3-6/+21
Previous reports calculated the overall report using Instrument information but did not print out per-instruction data using Instrument information. This patch fixes that. Differential Revision: https://reviews.llvm.org/D150459
2023-05-22[llvm-mca][RISCV] Fix llvm-mca RISCVInstrument memory leakMichael Maitland4-15/+19
There was a memory leak that presented itself once the llvm-mca tests were committed. This leak was not checked for by the pre-commit tests. This change changes the shared_ptr to a unique_ptr to avoid this problem. We will know that this fix works once committed since I don't know whether it is possible to force a lit test to use LSan. I spent the day trying to build llvm with LSan enabled without much luck. If anyone knows how to build llvm with LSan for the lit-tests, I am happy to give it another try locally. Differential Revision: https://reviews.llvm.org/D150816
2023-05-03[llvm-mca] Fix duplicate symbols errorMichael Maitland1-8/+11
Parsing instruments and analysis regions causes us to see the same labels two times since we parse the same file twice under the same context. This change creates a seperate context for instrument parsing and another for analysis region parsing. I will post a follow up commit once I get some free cycles to parse analysis regions and instruments in one parsing pass under a single context. Differential Revision: https://reviews.llvm.org/D149781
2023-03-15[llvm] Use *{Map,Set}::contains (NFC)Kazu Hirata2-4/+4
2023-03-15[ADT] Allow `llvm::enumerate` to enumerate over multiple rangesJakub Kuderski1-7/+2
This does not work by a mere composition of `enumerate` and `zip_equal`, because C++17 does not allow for recursive expansion of structured bindings. This implementation uses `zippy` to manage the iteratees and adds the stream of indices as the first zipped range. Because we have an upfront assertion that all input ranges are of the same length, we only need to check if the second range has ended during iteration. As a consequence of using `zippy`, `enumerate` will now follow the reference and lifetime semantics of the `zip*` family of functions. The main difference is that `enumerate` exposes each tuple of references through a new tuple-like type `enumerate_result`, with the familiar `.index()` and `.value()` member functions. Because the `enumerate_result` returned on dereference is a temporary, enumeration result can no longer be used through an lvalue ref. Reviewed By: dblaikie, zero9178 Differential Revision: https://reviews.llvm.org/D144503
2023-03-14[llvm] Use *{Set,Map}::contains (NFC)Kazu Hirata2-2/+2
2023-02-10[NFC][TargetParser] Replace uses of llvm/Support/Host.hArchibald Elliott1-1/+1
The forwarding header is left in place because of its use in `polly/lib/External/isl/interface/extract_interface.cc`, but I have added a GCC warning about the fact it is deprecated, because it is used in `isl` from where it is included by Polly.
2023-01-28Use llvm::count{lr}_{zero,one} (NFC)Kazu Hirata2-2/+2
2022-12-20[Support] Move TargetParsers to new componentArchibald Elliott1-0/+1
This is a fairly large changeset, but it can be broken into a few pieces: - `llvm/Support/*TargetParser*` are all moved from the LLVM Support component into a new LLVM Component called "TargetParser". This potentially enables using tablegen to maintain this information, as is shown in https://reviews.llvm.org/D137517. This cannot currently be done, as llvm-tblgen relies on LLVM's Support component. - This also moves two files from Support which use and depend on information in the TargetParser: - `llvm/Support/Host.{h,cpp}` which contains functions for inspecting the current Host machine for info about it, primarily to support getting the host triple, but also for `-mcpu=native` support in e.g. Clang. This is fairly tightly intertwined with the information in `X86TargetParser.h`, so keeping them in the same component makes sense. - `llvm/ADT/Triple.h` and `llvm/Support/Triple.cpp`, which contains the target triple parser and representation. This is very intertwined with the Arm target parser, because the arm architecture version appears in canonical triples on arm platforms. - I moved the relevant unittests to their own directory. And so, we end up with a single component that has all the information about the following, which to me seems like a unified component: - Triples that LLVM Knows about - Architecture names and CPUs that LLVM knows about - CPU detection logic for LLVM Given this, I have also moved `RISCVISAInfo.h` into this component, as it seems to me to be part of that same set of functionality. If you get link errors in your components after this patch, you likely need to add TargetParser into LLVM_LINK_COMPONENTS in CMake. Differential Revision: https://reviews.llvm.org/D137838
2022-12-20[Support] Move Target/CPU Printing out of CommandLineArchibald Elliott1-0/+3
This change is rather more invasive than intended. The main intention here is to make CommandLine.cpp not rely on llvm/Support/Host.h. Right now, this reliance is only in 3 superficial places: - Choosing how to expand response files (in two places) - Printing the default triple and current CPU in `--version` output. The built in version system has a method for adding "extra version printers", commonly used by several tools (such as llc) to report the registered targets in the built version of LLVM. It was reasonably easy to move the logic for printing the default triple and current CPU into a similar function, and register it with any relevant binaries. The incompatible change here is that now, even if LLVM_VERSION_PRINTER_SHOW_HOST_TARGET_INFO is defined, most binaries will no longer print out the default target triple and cpu when provided with `--version`, for instance llvm-as and llvm-dis. This breakage is intended, but the changes in this patch keep printing the default target and detected in `llc` and `opt` as these were remarked as important binaries in the LLVM install. The change to expanding response files may also be controversial, but I believe that these macros should correspond exactly to the host triple introspection used before. Differential Revision: https://reviews.llvm.org/D137837
2022-12-17std::optional::value => operator*/operator->Fangrui Song1-1/+1
value() has undesired exception checking semantics and calls __throw_bad_optional_access in libc++. Moreover, the API is unavailable without _LIBCPP_NO_EXCEPTIONS on older Mach-O platforms (see _LIBCPP_AVAILABILITY_BAD_OPTIONAL_ACCESS). This fixes check-llvm.
2022-12-14[tools] llvm::Optional => std::optionalFangrui Song1-1/+1
2022-12-09[Alignment] Use Align in MCStreamer::emitCommonSymbolGuillaume Chatelet1-1/+1
Next patch after D139548 and D139439. Same expectations, the change seems safe with as far as llvm goes, we cannot check downstream implementations. Differential Revision: https://reviews.llvm.org/D139614
2022-12-07[reland][Alignment] Use Align in MCStreamer emitZeroFill/emitLocalCommonSymbolGuillaume Chatelet1-1/+1
Before performing this change, I checked that `ByteAlignment` was never `0` inside `MCAsmStreamer:emitZeroFill` and `MCAsmStreamer::emitLocalCommonSymbol`. I believe it is NFC as `0` values are illegal in `emitZeroFill` anyways, `Log2(ByteAlignment)` would be undefined. And currently, all calls to `emitLocalCommonSymbol` are provably `>0`. Differential Revision: https://reviews.llvm.org/D139439
2022-12-07Revert D139439 "[Alignment] Use Align in MCStreamer ↵Guillaume Chatelet1-1/+1
emitZeroFill/emitLocalCommonSymbol" This breaks Windows bots with `warning C4334: '<<': result of 32-bit shift implicitly converted to 64 bits (was 64-bit shift intended?)` Some shift operators are lacking a proper literal unit ('1ULL' instead of '1'). Will reland once fixed. This reverts commit c621c1a8e81856e6bf2be79714767d80466e9ede.
2022-12-07[Alignment] Use Align in MCStreamer emitZeroFill/emitLocalCommonSymbolGuillaume Chatelet1-1/+1
Before performing this change, I checked that `ByteAlignment` was never `0` inside `MCAsmStreamer:emitZeroFill` and `MCAsmStreamer::emitLocalCommonSymbol`. I believe it is NFC as `0` values are illegal in `emitZeroFill` anyways, `Log2(ByteAlignment)` would be undefined. And currently, all calls to `emitLocalCommonSymbol` are provably `>0`. Differential Revision: https://reviews.llvm.org/D139439
2022-11-22[llvm-mca] Fix class dominance warnings for parseCodeRegionsMichael Maitland1-0/+10
Fixes issue [59091](https://github.com/llvm/llvm-project/issues/59091). `CodeRegionGenerator::parseCodeRegions` is implemented by `AsmCodeRegionGenerator`. If it were to be implemented in `AnalysisRegionGenerator` or `InstrumentRegionGenerator`, then `parseCodeRegions` from an `AsmAnalysisRegionGenerator` or `AsmInstrumentRegionGenerator` object would be ambiguous. To solve this, `AsmAnalysisRegionGenerator` and `AsmInstrumentRegionGenerator` qualify their call to `AsmCodeRegionGenerator::parseCodeRegions`. Differential Revision: https://reviews.llvm.org/D138462
2022-11-18[RISCV][llvm-mca] Use LMUL Instruments to provide more accurate reports on RISCVMichael Maitland5-95/+444
On x86 and AArch, SIMD instructions encode all of the scheduling information in the instruction itself. For example, VADD.I16 q0, q1, q2 is a neon instruction that operates on 16-bit integer elements stored in 128-bit Q registers, which leads to eight 16-bit lanes in parallel. This kind of information impacts how the instruction takes to execute and what dependencies this may cause. On RISCV however, the data that impacts scheduling is encoded in CSR registers such as vtype or vl, in addition with the instruction itself. But MCA does not track or use the data in these registers. This patch fixes this problem by introducing Instruments into MCA. * Replace `CodeRegions` with `AnalysisRegions` * Add `Instrument` and `InstrumentManager` * Add `InstrumentRegions` * Add RISCV Instrument and `InstrumentManager` * Parse `Instruments` in driver * Use instruments to override schedule class * RISCV use lmul instrument to override schedule class * Fix unit tests to pass empty instruments * Add -ignore-im clopt to disable this change A prior version of this patch was commited in 5e82ee537321. 2323a4ee610f reverted that change because the unit test files caused build errors. The change with fixes were committed in b88b8307bf9e but reverted once again e8e92c8313a0 due to more build errors. This commit adds the prior changes and fixes the build error. Differential Revision: https://reviews.llvm.org/D137440
2022-11-15Revert "[RISCV][llvm-mca] Use LMUL Instruments to provide more accurate ↵Michael Maitland5-444/+95
reports on RISCV" This reverts commit b88b8307bf9e24f53e7ef3052abf2c506ff55fd2.
2022-11-15[RISCV][llvm-mca] Use LMUL Instruments to provide more accurate reports on RISCVMichael Maitland5-95/+444
On x86 and AArch, SIMD instructions encode all of the scheduling information in the instruction itself. For example, VADD.I16 q0, q1, q2 is a neon instruction that operates on 16-bit integer elements stored in 128-bit Q registers, which leads to eight 16-bit lanes in parallel. This kind of information impacts how the instruction takes to execute and what dependencies this may cause. On RISCV however, the data that impacts scheduling is encoded in CSR registers such as vtype or vl, in addition with the instruction itself. But MCA does not track or use the data in these registers. This patch fixes this problem by introducing Instruments into MCA. * Replace `CodeRegions` with `AnalysisRegions` * Add `Instrument` and `InstrumentManager` * Add `InstrumentRegions` * Add RISCV Instrument and `InstrumentManager` * Parse `Instruments` in driver * Use instruments to override schedule class * RISCV use lmul instrument to override schedule class * Fix unit tests to pass empty instruments * Add -ignore-im clopt to disable this change A prior version of this patch was commited in. It was reverted in 5e82ee5373211db8522181054800ccd49461d9d8. 2323a4ee610f5e1db74d362af4c6fb8c704be8f6 reverted that change because the unit test files caused build errors. This commit adds the original changes and the fixed test files. Differential Revision: https://reviews.llvm.org/D137440
2022-11-15Revert "[RISCV][llvm-mca] Use LMUL Instruments to provide more accurate ↵Michael Maitland5-445/+95
reports on RISCV" This reverts commit 5e82ee5373211db8522181054800ccd49461d9d8.
2022-11-15[RISCV][llvm-mca] Use LMUL Instruments to provide more accurate reports on RISCVMichael Maitland5-95/+445
On x86 and AArch, SIMD instructions encode all of the scheduling information in the instruction itself. For example, VADD.I16 q0, q1, q2 is a neon instruction that operates on 16-bit integer elements stored in 128-bit Q registers, which leads to eight 16-bit lanes in parallel. This kind of information impacts how the instruction takes to execute and what dependencies this may cause. On RISCV however, the data that impacts scheduling is encoded in CSR registers such as vtype or vl, in addition with the instruction itself. But MCA does not track or use the data in these registers. This patch fixes this problem by introducing Instruments into MCA. * Replace `CodeRegions` with `AnalysisRegions` * Add `Instrument` and `InstrumentManager` * Add `InstrumentRegions` * Add RISCV Instrument and `InstrumentManager` * Parse `Instruments` in driver * Use instruments to override schedule class * RISCV use lmul instrument to override schedule class * Fix unit tests to pass empty instruments * Add -ignore-im clopt to disable this change Differential Revision: https://reviews.llvm.org/D137440