path: root/llvm/tools/llvm-mca
Age | Commit message | Author | Files | Lines
2022-07-24[llvm] Remove redundant virtual specifiers (NFC)Kazu Hirata1-2/+2
Identified with modernize-use-override.
2022-07-13[llvm] Use value instead of getValue (NFC)Kazu Hirata1-1/+1
2022-07-12[MCA] Support multiple comma-separated -mattr featuresCullen Rhodes1-4/+14
Reviewed By: myhsu Differential Revision: https://reviews.llvm.org/D129479
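As a rough illustration of what accepting a comma-separated feature list involves, the sketch below splits a string into individual feature names. `splitFeatures` is a hypothetical helper written for this note, not code from the patch, and the feature strings are only examples.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not taken from the patch): split a comma-separated
// feature string such as "+sve,+bf16" into individual feature names.
std::vector<std::string> splitFeatures(const std::string &Attrs) {
  std::vector<std::string> Features;
  std::string::size_type Start = 0;
  while (Start <= Attrs.size()) {
    std::string::size_type End = Attrs.find(',', Start);
    if (End == std::string::npos)
      End = Attrs.size();
    if (End > Start) // skip empty entries such as the gap in "a,,b"
      Features.push_back(Attrs.substr(Start, End - Start));
    Start = End + 1;
  }
  return Features;
}
```

Each resulting entry could then be handed to whatever per-feature handling the tool already had for a single `-mattr` value.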
2022-06-25[llvm] Don't use Optional::hasValue (NFC)Kazu Hirata1-1/+1
This patch replaces Optional::hasValue with the implicit cast to bool in conditionals only.
2022-06-25Revert "Don't use Optional::hasValue (NFC)"Kazu Hirata1-2/+2
This reverts commit aa8feeefd3ac6c78ee8f67bf033976fc7d68bc6d.
2022-06-25Don't use Optional::hasValue (NFC)Kazu Hirata1-2/+2
2022-06-24[MCA] Introducing incremental SourceMgr and resumable pipelineMin-Yih Hsu1-1/+2
The new resumable mca::Pipeline capability introduced in this patch allows users to save the current state of the pipeline and resume from that checkpoint. It works best with (but does not require) the new IncrementalSourceMgr, where users can add mca::Instruction objects incrementally rather than supplying a fixed number of instructions ahead of time. Note that these new features are covered by unit tests, because integrating them into the `llvm-mca` tool would cause too much churn. Differential Revision: https://reviews.llvm.org/D127083
2022-06-18[llvm] Use value_or instead of getValueOr (NFC)Kazu Hirata1-1/+1
2022-06-07[MC] De-capitalize MCStreamer functionsFangrui Song1-2/+2
Follow-up to c031378ce01b8485ba0ef486654bc9393c4ac024 . The class is mostly consistent now.
2022-06-03[tools] Forward declare classes & remove includesClemens Wasser1-1/+2
Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D120208
2022-05-26[MC] Lower case the first letter of EmitCOFF* EmitWin* EmitCV*. NFCFangrui Song1-2/+2
2022-03-13[MCA] Moved six instruction flags from InstrDesc to InstructionBase.Patrick Holland2-14/+20
Differential Revision: https://reviews.llvm.org/D121508
2022-02-16[NFC][MC] remove unused argument `MCRegisterInfo` in `MCCodeEmitter`Shao-Ce SUN1-1/+1
Reviewed By: skan Differential Revision: https://reviews.llvm.org/D119846
2022-02-16Revert "[NFC][MC] remove unused argument `MCRegisterInfo` in `MCCodeEmitter`"Shao-Ce SUN1-1/+1
This reverts commit fe25c06cc5bdc2ef9427309f8ec1434aad69dc7a.
2022-02-16[NFC][MC] remove unused argument `MCRegisterInfo` in `MCCodeEmitter`Shao-Ce SUN1-1/+1
It seems that `MCRegisterInfo` has not been used by any target for ten years. Reviewed By: skan Differential Revision: https://reviews.llvm.org/D119846
2022-02-11Cleanup MCParser headersserge-sans-paille1-0/+1
As usual with that header cleanup series, some implicit dependencies now need to be explicit: llvm/MC/MCParser/MCAsmParser.h no longer includes llvm/MC/MCParser/MCAsmLexer.h Preprocessed lines to build llvm on my setup: after: 1068185081 before: 1068324320 So no compile time benefit to expect, but we still get the looser coupling between files which is great. Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup Differential Revision: https://reviews.llvm.org/D119359
2022-01-11[MCA] Switching from conservatively guessing which instructions arePatrick Holland3-9/+46
memory-barrier instructions to providing targets and developers a convenient way to explicitly declare which instructions are memory-barriers. Differential Revision: https://reviews.llvm.org/D116779
2022-01-08[llvm] Remove redundant member initialization (NFC)Kazu Hirata2-2/+2
Identified with readability-redundant-member-init.
2022-01-03Revert "[llvm] Remove redundant member initialization (NFC)"Kazu Hirata2-2/+2
This reverts commit fd4808887ee47f3ec8a030e9211169ef4fb094c3. This patch causes gcc to issue a lot of warnings like: warning: base class ‘class llvm::MCParsedAsmOperand’ should be explicitly initialized in the copy constructor [-Wextra]
2022-01-01[llvm] Remove redundant member initialization (NFC)Kazu Hirata2-2/+2
Identified with readability-redundant-member-init.
2021-12-13[llvm] Use llvm::reverse (NFC)Kazu Hirata1-2/+2
2021-12-07[MCA] Remove the warning about experimental support for in-order CPUAndrew Savonichev1-6/+1
There are not a lot of bug reports for this feature, so let's mark it stable. Differential Revision: https://reviews.llvm.org/D114701
2021-10-15[NFC] fix a typoShao-Ce SUN1-1/+1
2021-10-14[llvm-mca][timeline] Indicate output was stopped due to cycle limit.Daniel Sanders1-1/+3
It can be a bit confusing to stop with no explanation so we should indicate when further output was prevented by the cycle limit. Differential Revision: https://reviews.llvm.org/D111753
2021-10-08Move TargetRegistry.(h|cpp) from Support to MCReid Kleckner2-2/+2
This moves the registry higher in the LLVM library dependency stack. Every client of the target registry needs to link against MC anyway to actually use the target, so we might as well move this out of Support. This allows us to ensure that Support doesn't have includes from MC/*. Differential Revision: https://reviews.llvm.org/D111454
2021-08-27 [MCA][NFC] Removed unused method, and fixed a coverity issue.Andrea Di Biagio1-6/+0
The coverity issue was reported against class MCAOperand due to the lack of proper initialization for field Index. No functional change intended.
2021-08-25[MCA] Moved View.h and View.cpp from /tools/llvm-mca/ to /lib/MCA/.Patrick Holland12-74/+42
Moved View.h and View.cpp from /tools/llvm-mca/Views/ to /lib/MCA/ and /include/llvm/MCA/. This is so that targets can define their own Views within the /lib/Target/ directory (so that the View can use backend functionality). To enable these Views within mca, targets will need to add them to the vector of Views returned by their target's CustomBehaviour::getViews() methods. Differential Revision: https://reviews.llvm.org/D108520
2021-08-07[MCA] Simplify the rounding logic used in TimelineView::printWaitTimeEntry.Andrea Di Biagio1-7/+8
This is related to PR51392. Before this patch, the timeline view rounded doubles to the first decimal using logic similar to this:
```
double AverageTime = (double)Input / CumulativeExecutions;
double Result = floor((AverageTime * 10) + 0.5) / 10;
```
Here, Input and CumulativeExecutions are both unsigned integers. The last operation is what effectively performs the rounding of AverageTime. PR51392 was raised because, under specific -m32 configurations of GCC, one of the timeline tests reports slightly different values (due to a different rounding choice). This patch tries to minimise the propagation of floating-point error by hoisting the multiply by 10, so that it is performed on the unsigned integer:
```
double AverageTime = (double)(Input * 10) / CumulativeExecutions;
double Result = floor(AverageTime + 0.5) / 10;
```
So we are trading a floating-point multiply for an integer multiply (which can be expanded using a simple MUL or using an `ADD + LEA` sequence). Executing fewer floating-point operations should also help reduce the error in the computation. Strictly speaking, that computation will always be potentially subject to error (depending on what values are passed as input). However, this patch should improve the situation and make bugs like PR51392 less frequent.
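The two rounding variants described in this message can be placed side by side as a self-contained sketch. The function names `roundOld` and `roundNew` are made up for illustration, and the sketch assumes `Input * 10` does not overflow an `unsigned`.

```cpp
#include <cassert>
#include <cmath>

// Pre-patch style: divide first, then multiply by 10 in floating point.
double roundOld(unsigned Input, unsigned CumulativeExecutions) {
  assert(CumulativeExecutions != 0 && "division by zero");
  double AverageTime = (double)Input / CumulativeExecutions;
  return std::floor((AverageTime * 10) + 0.5) / 10;
}

// Post-patch style: the multiply by 10 is done on the unsigned integer,
// so one fewer floating-point operation feeds the rounding step.
double roundNew(unsigned Input, unsigned CumulativeExecutions) {
  assert(CumulativeExecutions != 0 && "division by zero");
  double AverageTime = (double)(Input * 10) / CumulativeExecutions;
  return std::floor(AverageTime + 0.5) / 10;
}
```

For most inputs the two functions agree; they can differ only where the intermediate value sits very close to a rounding boundary, which is the kind of case the bug report describes.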
2021-07-28[MCA] Moving the target specific CustomBehaviour impl. from /tools/llvm-mca/ ↵Patrick Holland7-169/+24
to /lib/Target/. Differential Revision: https://reviews.llvm.org/D106775
2021-07-16[llvm-mca][JSON] Store extra information about driver flags used for the ↵Marcos Horro3-7/+42
simulation Added information stored in PipelineOptions and the MCSubtargetInfo. Bug: https://bugs.llvm.org/show_bug.cgi?id=51041 Reviewed By: andreadb Differential Revision: https://reviews.llvm.org/D106077
2021-07-13[llvm-mca] [NFC] Formatting codeMarcos Horro13-50/+48
Applied clang-format to all files. Ignored the 80-column width violation in BottleneckAnalysis.h since it contains an example report. Caught some typos and minor style details. Reviewed By: andreadb Differential Revision: https://reviews.llvm.org/D105900
2021-07-10[llvm-mca][JSON] Teach the PipelinePrinter how to deal with anonymous code ↵Andrea Di Biagio1-6/+12
regions (PR51008) This patch addresses the last remaining problems reported in PR51008. Previous fixes for PR51008 worked under the wrong assumption that code regions are always named (except maybe for the default region, which was automatically named "main"). In reality, it is quite common for users to declare multiple anonymous regions. So we cannot really use the region name as the key string of a JSON object. In practice, code region names are completely optional. Using "main" for the default region was also problematic because there can be another region with that same name. This patch fixes these issues by introducing a json::array of regions. Each region has a "Name" field, which would default to the empty string for anonymous regions. Added a few more tests to verify that the JSON file format is still valid, and that multiple anonymous regions all appear in the final output.
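To make the array-of-regions idea concrete, here is a minimal sketch that serializes regions as a JSON array in which anonymous regions carry an empty "Name". It uses plain string building rather than LLVM's JSON support; the `Region` struct, its `NumInstructions` field and `regionsToJSON` are hypothetical, and names are not escaped.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical region record; only "Name" is mentioned in the commit message.
struct Region {
  std::string Name;         // empty for anonymous regions
  unsigned NumInstructions; // illustrative payload, an assumption
};

// Emit an array so that several regions may share a name (or have none),
// which would be impossible if the name were used as an object key.
std::string regionsToJSON(const std::vector<Region> &Regions) {
  std::string Out = "[";
  for (std::size_t I = 0; I < Regions.size(); ++I) {
    if (I != 0)
      Out += ",";
    Out += "{\"Name\":\"" + Regions[I].Name + "\",\"NumInstructions\":" +
           std::to_string(Regions[I].NumInstructions) + "}";
  }
  Out += "]";
  return Out;
}
```

Two anonymous regions simply appear as two array elements with `"Name":""`, so neither overwrites the other.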
2021-07-10[llvm-mca][JSON] Further refactoring of the JSON printing logic.Andrea Di Biagio6-38/+32
This patch renames object "Resources" to "TargetInfo". Moved the getJSONTargetInfo method from class InstructionView to the PipelinePrinter. Removed uses of std::stringstream. Removed unused method View::printViewJSON().
2021-07-09[llvm-mca] Refactor the logic that prints JSON files.Andrea Di Biagio7-58/+86
Moved most of the printing logic into the PipelinePrinter. This patch also fixes the JSON output when flag -instruction-tables is specified.
2021-07-09[llvm-mca] Fix -Wunused-private-field after D105618Fangrui Song2-6/+3
2021-07-09[llvm-mca] Fix JSON format for multiple regionsMarcos Horro5-21/+47
Instead of printing each region individually when using JSON format, this patch creates a JSON object which is updated with the values of each region, printing them at the end. New test is added for JSON output with multiple regions. Bug: https://bugs.llvm.org/show_bug.cgi?id=51008 Reviewed By: andreadb Differential Revision: https://reviews.llvm.org/D105618
2021-07-07Revert "[MCA] [AMDGPU] Adding an implementation to AMDGPUCustomBehaviour for ↵Patrick Holland2-344/+2
handling s_waitcnt instructions." Build failures when building with shared libraries. Reverting until I can fix. Differential Revision: https://reviews.llvm.org/D104730
2021-07-07[MCA] [AMDGPU] Adding an implementation to AMDGPUCustomBehaviour for ↵Patrick Holland2-2/+344
handling s_waitcnt instructions. This commit also makes some slight changes to the scheduling model for AMDGPU to set the RetireOOO flag for all scheduling classes. This flag is only used by llvm-mca and allows instructions to retire out of order. See the differential link below for a deeper explanation of everything. Differential Revision: https://reviews.llvm.org/D104730
2021-07-01[llvm-mca] Fix JSON output (PR50922)Marcos Horro9-4/+30
Based on the discussion in PR50922, minor changes have been made so that the output is valid JSON. Removed "not implemented" keys. Differential Revision: https://reviews.llvm.org/D105064
2021-06-24[MCA] Allow unlimited cycles in the timeline viewJay Foad2-7/+7
Change --max-timeline-cycles=0 to mean no limit on the number of cycles. Use this in AMDGPU tests to show all instructions in the timeline view instead of having it arbitrarily truncated. Differential Revision: https://reviews.llvm.org/D104846
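The "zero means unlimited" convention can be expressed as a small predicate. This is only a sketch: `isWithinTimeline` is a hypothetical name, and the zero-based cycle comparison is an assumption rather than the tool's actual check.

```cpp
// Hypothetical predicate: a limit of 0 is treated as "no limit".
// Assumes cycles are counted from 0, so a limit of N admits cycles 0..N-1.
bool isWithinTimeline(unsigned Cycle, unsigned MaxCycles) {
  return MaxCycles == 0 || Cycle < MaxCycles;
}
```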
2021-06-23[MCA][TimelineView] Fixed a bug that was causing instructions outside of the ↵Patrick Holland1-0/+9
timeline-max-cycles to still be printed. Differential Revision: https://reviews.llvm.org/D104815
2021-06-22[MCA] [In-order pipeline] Fix for 0 latency instruction causing assertion to ↵Patrick Holland1-2/+0
fail. 0 latency instructions now get processed and retired properly within the in-order pipeline. Had to fix a bug within TimelineView.cpp as well that would show up when a 0 latency instruction was the first instruction in the source. Differential Revision: https://reviews.llvm.org/D104675
2021-06-16[MCA] Anchoring the vtable of CustomBehaviourMin-Yih Hsu1-0/+1
Put the dtor of mca::CustomBehaviour into the cpp file to avoid undefined vtable when linking libLLVMMCACustomBehaviourAMDGPU as shared library. Differential Revision: https://reviews.llvm.org/D104401
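The technique of anchoring a vtable can be sketched in a few lines. The class below is a stand-in named `Behaviour`, not the real mca::CustomBehaviour, and both "files" are shown in one listing for brevity.

```cpp
// --- would live in a header ---
class Behaviour {
public:
  virtual ~Behaviour(); // declared only; the definition is out of line
  virtual unsigned check() const { return 0; }
};

// --- would live in exactly one .cpp file ---
// Under the Itanium C++ ABI, the first non-inline, non-pure virtual function
// is the "key function"; the vtable is emitted only in the translation unit
// that defines it. Defining the destructor here pins the vtable to this file.
Behaviour::~Behaviour() = default;
```

Without the out-of-line definition every virtual member is inline, so the vtable is emitted wherever it is used, which is the situation the commit message says led to an undefined vtable when linking the shared library.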
2021-06-16Reapply "[MCA] Adding the CustomBehaviour class to llvm-mca".Patrick Holland7-1/+185
The original change was pushed in main as commit f7a23ecece52. It was then reverted by commit a04f01bab2 because it caused linker failures on buildbots that don't build the AMDGPU target. -- Some instructions are not defined well enough within the target’s scheduling model for llvm-mca to be able to properly simulate its behaviour. The ideal solution to this situation is to modify the scheduling model, but that’s not always a viable strategy. Maybe other parts of the backend depend on that instruction being modelled the way that it is. Or maybe the instruction is quite complex and it’s difficult to fully capture its behaviour with tablegen. The CustomBehaviour class (which I will refer to as CB frequently) is designed to provide intuitive scaffolding for developers to implement the correct modelling for these instructions. More details are available in the original commit log message (f7a23ecece52). Differential Revision: https://reviews.llvm.org/D104149
2021-06-15Revert "[MCA] Adding the CustomBehaviour class to llvm-mca"Andrea Di Biagio7-178/+1
This reverts commit f7a23ecece524564a0c3e09787142cc6061027bb. It appears to break buildbots that don't build the AMDGPU backend.
2021-06-15[MCA] Adding the CustomBehaviour class to llvm-mcaPatrick Holland7-1/+178
Some instructions are not defined well enough within the target’s scheduling model for llvm-mca to be able to properly simulate their behaviour. The ideal solution to this situation is to modify the scheduling model, but that’s not always a viable strategy. Maybe other parts of the backend depend on that instruction being modelled the way that it is. Or maybe the instruction is quite complex and it’s difficult to fully capture its behaviour with tablegen. The CustomBehaviour class (which I will refer to as CB frequently) is designed to provide intuitive scaffolding for developers to implement the correct modelling for these instructions. Implementation details: llvm-mca does its best to extract relevant register, resource, and memory information from every MCInst when lowering it to an mca::Instruction. It then uses this information to detect dependencies and simulate stalls within the pipeline. For some instructions, the information that gets captured within the mca::Instruction is not enough for mca to simulate them properly. In these cases, there are two main possibilities: 1. The instruction has a dependency that isn’t detected by mca. 2. mca is incorrectly enforcing a dependency that shouldn’t exist. For the rest of this discussion, I will be focusing on (1), but I have put some thought into (2) and I may revisit it in the future. So we have an instruction that has dependencies that aren’t picked up by mca. The basic idea for both pipelines in mca is that when an instruction wants to be dispatched, we first check for register hazards and then we check for resource hazards. This is where CB is injected. If no register or resource hazards have been detected, we make a call to CustomBehaviour::checkCustomHazard() to give the target-specific CB the chance to detect and enforce any custom dependencies. The return value for checkCustomHazard() is an unsigned int representing the (minimum) number of cycles that the instruction needs to stall for. 
It’s fine to underestimate this value because when StallCycles gets down to 0, we’ll end up checking for all the hazards again before the instruction is actually dispatched. However, it’s important not to overestimate the value and the more accurate your estimate is, the more efficient mca’s execution can be. In general, for checkCustomHazard() to be able to detect these custom dependencies, it needs information about the current instruction and also all of the instructions that are still executing within the pipeline. The mca pipeline uses mca::Instruction rather than MCInst and the current information encoded within each mca::Instruction isn’t sufficient for my use cases. I had to add a few extra attributes to the mca::Instruction class and have them get set by the MCInst during instruction building. For example, the current mca::Instruction doesn’t know its opcode, and it also doesn’t know anything about its immediate operands (both of which I had to add to the class). With information about the current instruction, a list of all currently executing instructions, and some target specific objects (MCSubtargetInfo and MCInstrInfo which the base CB class has references to), developers should be able to detect and enforce most custom dependencies within checkCustomHazard. If you need more information than is present in the mca::Instruction, feel free to add attributes to that class and have them set during the lowering sequence from MCInst. Fortunately, in the in-order pipeline, it’s very convenient for us to pass these arguments to checkCustomHazard. The hazard checking is taken care of within InOrderIssueStage::canExecute(). This function takes a const InstRef as a parameter (representing the instruction that currently wants to be dispatched) and the InOrderIssueStage class maintains a SmallVector<InstRef, 4> which holds all of the currently executing instructions. 
For the out-of-order pipeline, it’s a bit trickier to get the list of executing instructions and this is why I have held off on implementing it myself. This is the main topic I will bring up when I eventually make a post to discuss and ask for feedback. CB is a base class where targets implement their own derived classes. If a target specific CB does not exist (or we pass in the -disable-cb flag), the base class is used. This base class trivially returns 0 from its checkCustomHazard() implementation (meaning that the current instruction needs to stall for 0 cycles aka no hazard is detected). For this reason, targets or users who choose not to use CB shouldn’t see any negative impacts to accuracy or performance (in comparison to pre-patch llvm-mca). Differential Revision: https://reviews.llvm.org/D104149
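The base-class/derived-class arrangement described in this message can be sketched as follows. Every type here is a simplified stand-in invented for illustration: `Instr`, `CustomBehaviourSketch`, `WaitTargetSketch` and the opcode values are assumptions, and the real class works with mca::Instruction, InstRef and target info objects instead.

```cpp
#include <algorithm>
#include <vector>

// Simplified stand-in for an in-flight instruction (an assumption).
struct Instr {
  unsigned Opcode;
  unsigned CyclesLeft; // cycles until the instruction finishes executing
};

// Base class: trivially reports "no hazard" by returning 0 stall cycles.
class CustomBehaviourSketch {
public:
  virtual ~CustomBehaviourSketch() = default;
  virtual unsigned checkCustomHazard(const std::vector<Instr> &,
                                     const Instr &) const {
    return 0;
  }
};

// Hypothetical target: opcode 1 is a "wait" that must not issue until every
// in-flight instruction with opcode 2 (say, a memory access) has completed.
class WaitTargetSketch : public CustomBehaviourSketch {
public:
  static constexpr unsigned WaitOpcode = 1;
  static constexpr unsigned MemOpcode = 2;

  unsigned checkCustomHazard(const std::vector<Instr> &Executing,
                             const Instr &Current) const override {
    if (Current.Opcode != WaitOpcode)
      return 0;
    unsigned Stall = 0;
    for (const Instr &I : Executing)
      if (I.Opcode == MemOpcode)
        Stall = std::max(Stall, I.CyclesLeft);
    return Stall; // 0 means the instruction may be dispatched now
  }
};
```

Because the hazard check is repeated once the stall count reaches 0, a derived class that cannot compute the exact figure would be better off returning a smaller value than a larger one, as the message notes.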
2021-05-31[MCA][NFCI] Minor changes to InstrBuilder and Instruction.Andrea Di Biagio1-1/+1
This is based on the assumption that most simulated instructions don't define more than one or two registers. This is true for example on x86, where most instruction definitions don't declare more than one register write. The default code region size has been increased from 8 to 16. This is based on the assumption that, for small microbenchmarks, the typical code snippet size is often less than 16 instructions. mca::Instruction now uses bitfields to pack flags. No functional change intended.
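Packing boolean flags into bitfields, as mentioned at the end of this message, looks roughly like the sketch below. The flag names are illustrative guesses and not the actual members of mca::Instruction.

```cpp
// One byte (or more) per flag when stored as plain bools.
struct UnpackedFlags {
  bool MayLoad;
  bool MayStore;
  bool HasSideEffects;
  bool BeginGroup;
  bool EndGroup;
  bool RetireOOO;
};

// One bit per flag when declared as bitfields; the exact layout is
// implementation-defined, but common ABIs fit these six into a single byte.
struct PackedFlags {
  bool MayLoad : 1;
  bool MayStore : 1;
  bool HasSideEffects : 1;
  bool BeginGroup : 1;
  bool EndGroup : 1;
  bool RetireOOO : 1;
};
```

Reads and writes use the same member syntax in both structs, so switching representation does not change calling code.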
2021-05-27[MCA] Refactor the InOrderIssueStage stage. NFCIAndrea Di Biagio1-2/+0
Moved the logic that checks for RAW hazards from the InOrderIssueStage to the RegisterFile. Changed how the InOrderIssueStage keeps track of backend stalls. Stall events are now generated from method notifyStallEvent(). No functional change intended.
2021-05-23[MC] Refactor MCObjectFileInfo initialization and allow targets to create ↵Philipp Krones1-4/+4
MCObjectFileInfo This makes it possible for targets to define their own MCObjectFileInfo. This MCObjectFileInfo is then used to determine things like section alignment. This is a follow up to D101462 and prepares for the RISCV backend defining the text section alignment depending on the enabled extensions. Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D101921
2021-05-19[MCA] llvm-mca MCTargetStreamer segfault fixPatrick Holland3-4/+35
In order to create the code regions for llvm-mca to analyze, llvm-mca creates an AsmCodeRegionGenerator and calls AsmCodeRegionGenerator::parseCodeRegions(). Within this function, both an MCAsmParser and MCTargetAsmParser are created so that MCAsmParser::Run() can be used to create the code regions for us. These parser classes were created for llvm-mc so they are designed to emit code with an MCStreamer and MCTargetStreamer that are expected to be setup and passed into the MCAsmParser constructor. Because llvm-mca doesn’t want to emit any code, an MCStreamerWrapper class gets created instead and passed into the MCAsmParser constructor. This wrapper inherits from MCStreamer and overrides many of the emit methods to just do nothing. The exception is the emitInstruction() method which calls Regions.addInstruction(Inst). This works well and allows llvm-mca to utilize llvm-mc’s MCAsmParser to build our code regions, however there are a few directives which rely on the MCTargetStreamer. llvm-mc assumes that the MCStreamer that gets passed into the MCAsmParser’s constructor has a valid pointer to an MCTargetStreamer. Because llvm-mca doesn’t setup an MCTargetStreamer, when the parser encounters one of those directives, a segfault will occur. In x86, each one of these 7 directives will cause this segfault if they exist in the input assembly to llvm-mca: .cv_fpo_proc .cv_fpo_setframe .cv_fpo_pushreg .cv_fpo_stackalloc .cv_fpo_stackalign .cv_fpo_endprologue .cv_fpo_endproc I haven’t looked at other targets, but I wouldn’t be surprised if some of the other ones also have certain directives which could result in this same segfault. My proposed solution is to simply initialize an MCTargetStreamer after we initialize the MCStreamerWrapper. The MCTargetStreamer requires an ostream object, but we don’t actually want any of these directives to be emitted anywhere, so I use an ostream created with the nulls() function. 
Since this needs to happen after the MCStreamerWrapper has been initialized, it needs to happen within the AsmCodeRegionGenerator::parseCodeRegions() function. The MCTargetStreamer also needs an MCInstPrinter which is easiest to initialize within the main() function of llvm-mca. So this MCInstPrinter gets constructed within main() then passed into the parseCodeRegions() function as a parameter. (If you feel like it would be appropriate and possible to create the MCInstPrinter within the parseCodeRegions() function, then feel free to modify my solution. That would stop us from having to pass it into the function and would limit its scope / lifetime.) My solution stops the segfault from happening and still passes all of the current (expected) llvm-mca tests. I also added a new test for x86 that checks for this segfault on an input that includes one of the .cv_fpo directives (this test fails without my solution, but passes with it). As far as I can tell, all of the functions that I modified are only called from within llvm-mca so there shouldn’t be any worries about breaking other tools. Differential Revision: https://reviews.llvm.org/D102709
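The wrapper idea described above (a streamer whose emit methods do nothing except for instructions, which are recorded into a region) can be sketched with stand-in types. `StreamerSketch` and `RegionCollector` are hypothetical and far simpler than MCStreamer; the sketch does not model the MCTargetStreamer that the fix adds.

```cpp
#include <string>
#include <vector>

// Stand-in for a streamer interface: most emit hooks default to no-ops.
class StreamerSketch {
public:
  virtual ~StreamerSketch() = default;
  virtual void emitInstruction(const std::string &Inst) = 0;
  virtual void emitLabel(const std::string &) {}
  virtual void emitDirective(const std::string &) {}
};

// Collects instructions into a region and silently ignores everything else,
// so a parser can drive it without any code actually being emitted.
class RegionCollector : public StreamerSketch {
  std::vector<std::string> &Region;

public:
  explicit RegionCollector(std::vector<std::string> &R) : Region(R) {}
  void emitInstruction(const std::string &Inst) override {
    Region.push_back(Inst);
  }
};
```

The crash described in the message arises one level below this: some directives reach for a target streamer object, so ignoring them at the wrapper level is not enough unless that object also exists.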