riscv-gnu-toolchain/llvm.git

Age	Commit message	Author	Files	Lines
2023-06-28	[MISched] Fix bug(s) in bottom-up scheduling.	Francesco Petrogalli	1	-3/+3
	BUG 1 - choosing the right cycle when booking a resource.
	---------------------------------------------------------

	Bottom-up scheduling should take into account the current cycle at the scheduling boundary when determining at what cycle a resource can be issued. Suppose the scheduling boundary is at cycle `C`, and that we want to check at what cycle a 3-cycle resource can be booked.

	We have two cases: A, in which the last seen resource cycle (LSRC) in which the resource is known to be used is 3 or more cycles away from the current cycle `C` (`C - LSRC >= 3`), and B, in which the LSRC is less than 3 cycles away from `C` (`C - LSRC < 3`).

	Note that in bottom-up scheduling the LSRC is always smaller than or equal to the current cycle `C`.

	The two cases can be schematized as follows:

	```
	... | C + 1 | C | C - 1 | C - 2 | C - 3 | C - 4 | ...
	    |       |   |       |       |       | LSRC  | -> Case A
	    |       |   |       | LSRC  |       |       | -> Case B

	// Before allocating the resource
	LSRC(A) = C - 4
	LSRC(B) = C - 2
	```

	In case A, the scheduler sees cycles `C`, `C-1` and `C-2` as available for booking the 3-cycle resource. Therefore the LSRC can be updated to `C`, and the resource can be scheduled from cycle `C` (the `X` in the table):

	```
	... | C + 1 | C | C - 1 | C - 2 | C - 3 | C - 4 | ...
	    |       | X |   X   |   X   |       |       | -> Case A

	// After allocating the resource
	LSRC(A) = C
	```

	In case B, the 3-cycle resource usage would clash with the LSRC if allocated starting from cycle `C`:

	```
	... | C + 1 | C | C - 1 | C - 2 | C - 3 | C - 4 | ...
	    |       | X |   X   |   X   |       |       | -> clash at cycle C - 2
	    |       |   |       | LSRC  |       |       | -> Case B
	```

	Therefore, the cycle in which the resource can be scheduled needs to be greater than `C`. In the example, the resource is booked in cycle `C + 1`.

	```
	... | C + 1 | C | C - 1 | C - 2 | C - 3 | C - 4 | ...
	    |   X   | X |   X   |       |       |       |

	// After allocating the resource
	LSRC(B) = C + 1
	```

	The behavior we need to correctly support cases A and B is obtained by computing the next value of the LSRC as the maximum between:
	1. the current cycle `C`;
	2. the previous LSRC plus the number of cycles CYCLES the resource will need.

	In formula:

	```
	LSRC(next) = max(C, LSRC(previous) + CYCLES)
	```

	BUG 2 - booking the resource for the correct number of cycles.
	--------------------------------------------------------------

	When storing the next LSRC, the function `getNextResourceCycle` was being invoked with the number of cycles a resource was using set to 0. The invocation of `getNextResourceCycle` now uses the value of `Cycles` instead of 0.

	Effects on code generation
	--------------------------

	This fix affects only AArch64, for the Cortex-A55 scheduling model (`-mcpu=cortex-a55`). The changes in the MIR tests caused by this patch show that the value now reported by `getNextResourceCycle` is correct. Other cortex-a55 tests have been touched by this change, where some instructions have been swapped. The final generated code is equivalent in terms of the total number of cycles.

	The test `llvm/test/CodeGen/AArch64/misched-detail-resource-booking-02.mir` shows in detail the correctness of the bottom-up scheduling, and the effect on the codegen changes that are visible in the test `llvm/test/CodeGen/AArch64/aarch64-smull.ll`.

	Reviewed By: andreadb, dmgreen
	Differential Revision: https://reviews.llvm.org/D153117
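The update rule for BUG 1 can be checked with a small standalone sketch. The function name `nextLSRC` and the concrete cycle numbers used below are hypothetical and chosen only for illustration; this is not the code in `MachineScheduler.cpp`.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical helper (not LLVM's API): the next "last seen resource cycle"
// when a resource is booked for `Cycles` cycles at boundary cycle `C`.
unsigned nextLSRC(unsigned C, unsigned PreviousLSRC, unsigned Cycles) {
  // The commit message states that, bottom-up, the LSRC never exceeds C.
  assert(PreviousLSRC <= C && "LSRC must not be ahead of the current cycle");
  // LSRC(next) = max(C, LSRC(previous) + CYCLES)
  return std::max(C, PreviousLSRC + Cycles);
}
```

Taking `C = 10` and a 3-cycle resource, case A (`LSRC = 6`, four cycles away) yields 10, while case B (`LSRC = 8`, two cycles away) yields 11, which matches the `C` and `C + 1` results in the tables above.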
2023-06-20	[llc][MISched] Add `-misched-detail-resource-booking` to llc.	Francesco Petrogalli	1	-1/+17
	The option `-misched-detail-resource-booking` prints the following information every time the method `SchedBoundary::getNextResourceCycle` is invoked: 1. counters of the resources that have already been booked; 2. the values returned by `getNextResourceCycle`, which is the next available cycle in which a resource can be booked. The method is useful to debug low-level checks inside the machine scheduler that make decisions based on the values returned by `getNextResourceCycle`. Reviewed By: andreadb Differential Revision: https://reviews.llvm.org/D153116
2023-06-20	Revert "[llc][MISched] Add `-misched-detail-resource-booking` to llc."	Francesco Petrogalli	1	-17/+1
	Reverting because of https://lab.llvm.org/buildbot#builders/75/builds/32485: llvm-project/llvm/lib/CodeGen/MachineScheduler.cpp:2374:7: error: use of undeclared identifier 'MischedDetailResourceBooking' if (MischedDetailResourceBooking) This reverts commit fc06262c1c365777e71207b6a5de281cba927c96.
2023-06-20	[llc][MISched] Add `-misched-detail-resource-booking` to llc.	Francesco Petrogalli	1	-1/+17
	The option `-misched-detail-resource-booking` prints the following information every time the method `SchedBoundary::getNextResourceCycle` is invoked: 1. counters of the resources that have already been booked; 2. the values returned by `getNextResourceCycle`, which is the next available cycle in which a resource can be booked. The method is useful to debug low-level checks inside the machine scheduler that make decisions based on the values returned by `getNextResourceCycle`. Reviewed By: andreadb Differential Revision: https://reviews.llvm.org/D153116
2023-06-13	[MISched][scheduleDump] Use stable_sort to prevent test failures.	Francesco Petrogalli	1	-14/+14
	When building the compiler with -DLLVM_ENABLE_EXPENSIVE_CHECKS=ON, resources that are dumped in schedule traces sometimes get reordered even if they are booked in the same cycle. Using `stable_sort` guarantees that such occasional reordering does not happen. This change should fix failures like the one seen in https://lab.llvm.org/buildbot/#/builders/16/builds/49592. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D152800
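As a rough illustration of why a stable sort helps here, the sketch below sorts trace entries by cycle only. The `Booking` type and `sortByCycle` function are hypothetical and are not taken from the LLVM sources.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical record for one line of a schedule trace (not an LLVM type).
struct Booking {
  unsigned Cycle;       // cycle in which the resource is booked
  std::string Resource; // resource name, in the order it was recorded
};

// Sort by cycle only. std::stable_sort keeps entries with equal cycles in
// their original relative order; std::sort makes no such promise.
void sortByCycle(std::vector<Booking> &Trace) {
  auto ByCycle = [](const Booking &A, const Booking &B) {
    return A.Cycle < B.Cycle;
  };
  std::stable_sort(Trace.begin(), Trace.end(), ByCycle);
  assert(std::is_sorted(Trace.begin(), Trace.end(), ByCycle));
}
```

Two entries booked in the same cycle compare as equivalent under this ordering, so only a stable algorithm guarantees that they are always printed in the same order.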
2023-06-12	[MISched] Use StartAtCycle in trace dumps.	Francesco Petrogalli	1	-14/+46
	This commit reworks the methods that dump traces with resource usage to take into account the StartAtCycle value added by https://reviews.llvm.org/D150310.

	For each i, the values of the lists StartAtCycle and ReservedCycles are printed as the interval [StartAtCycle[i], ReservedCycles[i]):

	```
	... | StartAtCycle[i] | ... | ReservedCycles[i] - 1 | ReservedCycles[i] | ...
	    | xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx |                   |
	```

	Reviewed By: andreadb
	Differential Revision: https://reviews.llvm.org/D150311
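The half-open interval convention can be illustrated with a short sketch. The function `renderUsage` is hypothetical and only mimics the idea of marking the cycles in `[Start, End)`; it is not the dump code from the patch.

```cpp
#include <cassert>
#include <string>

// Hypothetical renderer (not the LLVM dump code): one character per cycle,
// 'x' when the cycle lies in the half-open interval [Start, End).
std::string renderUsage(unsigned Start, unsigned End, unsigned NumCycles) {
  assert(Start <= End && "interval bounds must be ordered");
  std::string Row(NumCycles, ' ');
  for (unsigned Cycle = Start; Cycle < End && Cycle < NumCycles; ++Cycle)
    Row[Cycle] = 'x';
  return Row;
}
```

The cycle at index `End` is left unmarked, mirroring the empty `ReservedCycles[i]` column in the diagram.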
2023-06-09	[CodeGen] Fix a warning in release builds	Kazu Hirata	1	-2/+1
	This patch fixes: llvm/lib/CodeGen/MachineScheduler.cpp:4223:9: error: unused type alias 'IntervalTy' [-Werror,-Wunused-local-typedef]
2023-06-09	[MISched][rework] Introduce and use ResourceSegments.	Francesco Petrogalli	1	-21/+166
	Re-landing the code that was reverted because of the buildbot failure in https://lab.llvm.org/buildbot#builders/9/builds/27319. Original commit message ====================== The class `ResourceSegments` is used to keep track of the intervals that represent resource usage of a list of instructions that are being scheduled by the machine scheduler. The collection is made of intervals that are closed on the left and open on the right (represented by the standard notation `[a, b)`). These collections of intervals can be extended by `add`ing new intervals accordingly while scheduling a basic block. Unit tests are added to verify the possible configurations of intervals, and the relative possibility of scheduling a new instruction in these configurations. Specifically, the methods `getFirstAvailableAtFromBottom` and `getFirstAvailableAtFromTop` are tested to make sure that both bottom-up and top-down scheduling work when tracking resource usage across the basic block with `ResourceSegments`. Note that the scheduler tracks resource usage with two methods: 1. counters (via `std::vector<unsigned> ReservedCycles;`); 2. intervals (via `std::map<unsigned, ResourceSegments> ReservedResourceSegments;`). This patch can be considered a NFC test for existing scheduling models because the tracking system that uses intervals is turned off by default (field `bit EnableIntervals = false;` in the tablegen class `SchedMachineModel`). Reviewed By: andreadb Differential Revision: https://reviews.llvm.org/D150312
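To make the `[a, b)` convention concrete, here is a deliberately simplified sketch of a collection of half-open intervals. The class `SegmentSet` and its methods are hypothetical stand-ins; the real `ResourceSegments` class has a different interface and more functionality.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Simplified stand-in for the idea behind ResourceSegments (not its real
// interface). Each interval is half-open: [first, second).
class SegmentSet {
  std::vector<std::pair<int, int>> Segments;

public:
  // Two half-open intervals overlap iff each starts before the other ends.
  static bool overlaps(std::pair<int, int> A, std::pair<int, int> B) {
    return A.first < B.second && B.first < A.second;
  }

  // True when [Start, End) does not overlap any booked interval.
  bool isFree(int Start, int End) const {
    for (const auto &S : Segments)
      if (overlaps(S, {Start, End}))
        return false;
    return true;
  }

  // Book [Start, End); callers are expected to check isFree first.
  void add(int Start, int End) {
    assert(Start < End && "empty or reversed interval");
    assert(isFree(Start, End) && "interval already booked");
    Segments.push_back({Start, End});
  }
};
```

Because the right end is open, an interval ending at cycle 5 and another starting at cycle 5 do not clash, which is what lets back-to-back bookings share a boundary.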
2023-06-09	Revert "[MISched] Introduce and use ResourceSegments."	Francesco Petrogalli	1	-166/+21
	Reverted because it produces the following buildbot failure at https://lab.llvm.org/buildbot#builders/9/builds/27319:

	/b/ml-opt-rel-x86-64-b1/llvm-project/llvm/unittests/CodeGen/SchedBoundary.cpp: In member function ‘virtual void ResourceSegments_getFirstAvailableAtFromBottom_empty_Test::TestBody()’:
	/b/ml-opt-rel-x86-64-b1/llvm-project/llvm/unittests/CodeGen/SchedBoundary.cpp:395:31: error: call of overloaded ‘ResourceSegments(<brace-enclosed initializer list>)’ is ambiguous
	  395 |   auto X = ResourceSegments({});
	      |                               ^

	This reverts commit dc312f0331309692e8d6e06e93b3492b6a40989f.
2023-06-09	[MISched] Introduce and use ResourceSegments.	Francesco Petrogalli	1	-21/+166
	The class `ResourceSegments` is used to keep track of the intervals that represent resource usage of a list of instructions that are being scheduled by the machine scheduler. The collection is made of intervals that are closed on the left and open on the right (represented by the standard notation `[a, b)`). These collections of intervals can be extended by `add`ing new intervals accordingly while scheduling a basic block. Unit tests are added to verify the possible configurations of intervals, and the relative possibility of scheduling a new instruction in these configurations. Specifically, the methods `getFirstAvailableAtFromBottom` and `getFirstAvailableAtFromTop` are tested to make sure that both bottom-up and top-down scheduling work when tracking resource usage across the basic block with `ResourceSegments`. Note that the scheduler tracks resource usage with two methods: 1. counters (via `std::vector<unsigned> ReservedCycles;`); 2. intervals (via `std::map<unsigned, ResourceSegments> ReservedResourceSegments;`). This patch can be considered a NFC test for existing scheduling models because the tracking system that uses intervals is turned off by default (field `bit EnableIntervals = false;` in the tablegen class `SchedMachineModel`). Reviewed By: andreadb Differential Revision: https://reviews.llvm.org/D150312
2023-06-01	[CodeGen] Make use of MachineInstr::all_defs and all_uses. NFCI.	Jay Foad	1	-2/+2
	Differential Revision: https://reviews.llvm.org/D151424
2023-05-03	Restore CodeGen/MachineValueType.h from `Support`	NAKAMURA Takumi	1	-1/+1
	This is a rework of rG13e77db2df94 (r328395; MVT). Since `LowLevelType.h` has been restored to `CodeGen`, `MachineValueType.h` can be restored as well. Depends on D148767 Differential Revision: https://reviews.llvm.org/D149024
2023-03-30	[MachineScheduler] Rename postprocessDAG to postProcessDAG. NFC	jacquesguan	1	-3/+3
	Rename postprocessDAG to camel case. Reviewed By: foad Differential Revision: https://reviews.llvm.org/D146795
2023-01-26	[MISched] Dump the execution trace of the schedule.	Francesco Petrogalli	1	-0/+160
	The traces are printed only for bottom-up and top-down scheduling because the values of TopReadyCycle and BottomReadyCycle are inconsistent when obtained via bidirectional scheduling (see `BIDIRECTIONAL` checks in the test). Differential Revision: https://reviews.llvm.org/D142529
2023-01-14	MachineScheduler.cpp: Fixup D141707, suppress `MISchedDumpReservedCycles` conditionally.	NAKAMURA Takumi	1	-0/+2
	It is used in `LLVM_ENABLE_DUMP` regardless of `NDEBUG`.
2023-01-13	[CodeGen] Remove uses of Register::isPhysicalRegister/isVirtualRegister. NFC	Craig Topper	1	-8/+8
	Use isPhysical/isVirtual methods. Reviewed By: foad Differential Revision: https://reviews.llvm.org/D141715
2023-01-13	[CodeGen] Fix build failure due to missing declaration.	Francesco Petrogalli	1	-0/+1
	The failure was reported in https://github.com/llvm/llvm-project/issues/60011 FAILED: lib/CodeGen/CMakeFiles/LLVMCodeGen.dir/MachineScheduler.cpp.o "/build/llvm-toolchain-snapshot-16~++20230113111109+aba8983c9d86/build-llvm/./bin/clang++" -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -I"/build/llvm-toolchain-snapshot-16~++20230113111109+aba8983c9d86/build-llvm/tools/clang/stage2-bins/lib/CodeGen" -I"/build/llvm-toolchain-snapshot-16~++20230113111109+aba8983c9d86/llvm/lib/CodeGen" -I"/build/llvm-toolchain-snapshot-16~++20230113111109+aba8983c9d86/build-llvm/tools/clang/stage2-bins/include" -I"/build/llvm-toolchain-snapshot-16~++20230113111109+aba8983c9d86/llvm/include" -fstack-protector-strong -Wformat -Werror=format-security -Wno-unused-command-line-argument -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -fno-semantic-interposition -fvisibility-inlines-hidden -Werror=date-time -Werror=unguarded-availability-new -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -pedantic -Wno-long-long -Wc++98-compat-extra-semi -Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wsuggest-override -Wstring-conversion -Wmisleading-indentation -Wctad-maybe-unsupported -fdiagnostics-color -ffunction-sections -fdata-sections -ffile-prefix-map=/build/llvm-toolchain-snapshot-16~++20230113111109+aba8983c9d86/build-llvm/tools/clang/stage2-bins=build-llvm/tools/clang/stage2-bins -ffile-prefix-map=/build/llvm-toolchain-snapshot-16~++20230113111109+aba8983c9d86/= -no-canonical-prefixes -O2 -DNDEBUG -g1 -fno-exceptions -std=c++17 -MD -MT lib/CodeGen/CMakeFiles/LLVMCodeGen.dir/MachineScheduler.cpp.o -MF lib/CodeGen/CMakeFiles/LLVMCodeGen.dir/MachineScheduler.cpp.o.d -o lib/CodeGen/CMakeFiles/LLVMCodeGen.dir/MachineScheduler.cpp.o -c '/build/llvm-toolchain-snapshot-16~++20230113111109+aba8983c9d86/llvm/lib/CodeGen/MachineScheduler.cpp' 
/build/llvm-toolchain-snapshot-16~++20230113111109+aba8983c9d86/llvm/lib/CodeGen/MachineScheduler.cpp:2639:7: error: use of undeclared identifier 'MISchedDumpReservedCycles' if (MISchedDumpReservedCycles) ^ 1 error generated. Fixes #60011 Differential Revision: https://reviews.llvm.org/D141707
2023-01-13	Recommit [SchedBoundary] Add dump method for resource usage.	Francesco Petrogalli	1	-0/+27
	Summary: As supporting information, I have added an example that describes how the indexes of the vector of resources SchedBoundary::ReservedCycles are tracked by the field SchedBoundary::ReservedCyclesIndex. This is a minor rework of https://github.com/llvm/llvm-project/commit/b39a9a94f420a25a239ae03097c255900cbd660e which was reverted in https://github.com/llvm/llvm-project/commit/df6ae1779fafd9984e144a27315d6dd65b32c325 because the llc invocation of the test was missing the argument `-mtriple`. See for example the failure at https://lab.llvm.org/buildbot#builders/231/builds/7245 that reported the following when targeting a non-aarch64 native build: 'cortex-a55' is not a recognized processor for this target (ignoring processor) Reviewers: jroelofs Subscribers: Differential Revision: https://reviews.llvm.org/D141367
2023-01-13	Revert "[SchedBoundary] Add dump method for resource usage."	Francesco Petrogalli	1	-27/+0
	Reverting because of https://lab.llvm.org/buildbot#builders/16/builds/41860 When building on x86, I also need to specify -mtriple in the invocation of llc, otherwise the following error shows up: 'cortex-a55' is not a recognized processor for this target (ignoring processor) This reverts commit b39a9a94f420a25a239ae03097c255900cbd660e.
2023-01-13	[SchedBoundary] Add dump method for resource usage.	Francesco Petrogalli	1	-0/+27
	As supporting information, I have added an example that describes how the indexes of the vector of resources SchedBoundary::ReservedCycles are tracked by the field SchedBoundary::ReservedCyclesIndex. Reviewed By: jroelofs Differential Revision: https://reviews.llvm.org/D141367
2022-07-30	[CodeGen] Fixed undeclared MISchedCutoff in case of NDEBUG and LLVM_ENABLE_ABI_BREAKING_CHECKS	Dmitry Vassiliev	1	-1/+1
	This patch fixes the error llvm/lib/CodeGen/MachineScheduler.cpp(755): error C2065: 'MISchedCutoff': undeclared identifier in case of NDEBUG and LLVM_ENABLE_ABI_BREAKING_CHECKS. Note MISchedCutoff is declared under #ifndef NDEBUG. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D130425
2022-07-17	[CodeGen] Qualify auto variables in for loops (NFC)	Kazu Hirata	1	-1/+1

2022-07-14	[AMDGPU] SIMachineScheduler: Add support for several MachineScheduler features	Jannik Silvanus	1	-4/+3
	The SI machine scheduler inherits from ScheduleDAGMI. This patch adds support for a few features that are implemented in ScheduleDAGMI (or its base classes) but were missing so far because their support is implemented in overridden functions.
	* Support cl::opt -view-misched-dags. This option opens a graphical window showing the scheduling DAG.
	* Support cl::opt -misched-print-dags. This option prints the scheduling DAG in text form.
	* After constructing the scheduling DAG, call postprocessDAG() to apply any registered DAG mutations. Note that currently there are no mutations defined in AMDGPUTargetMachine.cpp in case SIScheduler is used. Still add this to avoid surprises in the future in case mutations are added.
	Differential Revision: https://reviews.llvm.org/D128808
2022-03-24	[CodeGen] Define ABI breaking class members correctly	Daniil Kovalev	1	-4/+4
	Non-static class members declared under #ifndef NDEBUG should be declared under #if LLVM_ENABLE_ABI_BREAKING_CHECKS to make headers library-friendly and allow cross-linking, as discussed in D120714. Differential Revision: https://reviews.llvm.org/D121549
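A minimal sketch of the pattern this commit describes. `Tracker`, its fields and `bump` are hypothetical, and the fallback `#define` exists only so the snippet compiles on its own; in LLVM the macro value comes from the build configuration.

```cpp
#include <cassert>

// Fallback for this standalone sketch only.
#ifndef LLVM_ENABLE_ABI_BREAKING_CHECKS
#define LLVM_ENABLE_ABI_BREAKING_CHECKS 0
#endif

// Hypothetical class, not taken from LLVM.
struct Tracker {
  unsigned Count = 0;
#if LLVM_ENABLE_ABI_BREAKING_CHECKS
  // Extra member that changes the object layout, so it is guarded by a
  // macro that the library and its clients agree on, rather than by NDEBUG.
  unsigned DebugId = 0;
#endif
};

// Assertions inside function bodies may still depend on NDEBUG, because
// they do not affect the layout of the class.
void bump(Tracker &T) {
  assert(T.Count != ~0u && "counter would overflow");
  ++T.Count;
}
```

If the member were guarded by `#ifndef NDEBUG` instead, a library built with assertions and a client built without them would disagree on `sizeof(Tracker)`.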
2022-03-16	Cleanup codegen includes	serge-sans-paille	1	-1/+0
	This is a (fixed) recommit of https://reviews.llvm.org/D121169 after: 1061034926 before: 1063332844 Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup Differential Revision: https://reviews.llvm.org/D121681
2022-03-10	Revert "Cleanup codegen includes"	Nico Weber	1	-0/+1
	This reverts commit 7f230feeeac8a67b335f52bd2e900a05c6098f20. Breaks CodeGenCUDA/link-device-bitcode.cu in check-clang, and many LLVM tests, see comments on https://reviews.llvm.org/D121169
2022-03-10	Cleanup codegen includes	serge-sans-paille	1	-1/+0
	after: 1061034926 before: 1063332844 Differential Revision: https://reviews.llvm.org/D121169
2022-02-07	[AMDGPU] Fix debug values in scheduler not placed correctly when reverting	Vang Thao	1	-4/+2
	Debug position data is cleared after ScheduleDAGMILive::schedule() due to it also calling placeDebugValues(). Make it so the data is not cleared after initial call to placeDebugValues since we will call it again after reverting a schedule. Secondly, since we skip debug instructions when reverting the schedule on AMDGPU, all debug instructions are now moved to the end of the scheduling region. RegionEnd points to the beginning of this chunk of debug instructions since it was not incremented when a debug instruction was skipped. RegionBegin may also point to the same debug instruction if Unsched.front() is a debug instruction thus shrinking the region to 1. Fix RegionBegin and RegionEnd so that they point to the current beginning and ending before calling placeDebugValues() since both vars will be used as reference points to move debug instructions back. Reviewed By: rampitec Differential Revision: https://reviews.llvm.org/D119022
2021-12-06	[llvm][Hexagon] Generalize VLIWResourceModel, VLIWMachineScheduler, and ConvergingVLIWScheduler	James Nagurne	1	-4/+8
	The Pre-RA VLIWMachineScheduler used by Hexagon is a relatively generic implementation that would make sense to use on other VLIW targets. This commit lifts those classes into their own header/source file with the root VLIWMachineScheduler. I chose this path rather than adding the strategy et al. into MachineScheduler to avoid bloating the file with other implementations. Target-specific behaviors have been captured and replicated through function overloads.
	- Added an overloadable DFAPacketizer creation member function. This is mainly done for our downstream, which has the capability to override the DFAPacketizer with custom implementations. This is an upstreamable TODO on our end. Currently, it always returns the result of TargetInstrInfo::CreateTargetScheduleState.
	- Added an extra helper which returns the number of instructions in the current packet. This is used in our downstream, and may be useful elsewhere.
	- Placed the priority heuristic values into the ConvergingVLIWScheduler class instead of defining them as local statics in the implementation.
	- Added an overridable helper in ConvergingVLIWScheduler so that targets can create their own VLIWResourceModel.
	Differential Revision: https://reviews.llvm.org/D113150
2021-12-04	[CodeGen] Use range-based for loops (NFC)	Kazu Hirata	1	-5/+4

2021-08-26	[MachineScheduler] Fix tracing	Jay Foad	1	-1/+1
	Consistently print a newline before "RegionInstrs:".
2021-07-01	[NFC][Scheduler] Refactor tryCandidate to return boolean	Qiu Chaofan	1	-28/+36
	This patch changes the return type of tryCandidate from void to bool: 1. Methods in some targets already follow this convention. 2. This would help if some target wants to re-use generic code. 3. It looks more intuitive if these try-methods return the same type. We may need to further change their return type from bool to some enum, to make it less confusing. Reviewed By: foad Differential Revision: https://reviews.llvm.org/D103951
2021-04-19	[CSSPGO] Exclude pseudo probes from slot index	Hongtao Yu	1	-3/+3
	Pseudo probes are currently given a slot index like other regular instructions. This affects register pressure and lifetime weight computation because of the enlarged lifetime length with pseudo probe instructions. As a consequence, a program could get different code generated with and without pseudo probes. I'm closing the gap by excluding pseudo probes from the slot index and from downstream register-allocation-related passes. Reviewed By: wmi Differential Revision: https://reviews.llvm.org/D100334
2021-04-19	[CodeGen] Use ProcResGroup information in SchedBoundary	David Penry	1	-7/+47
	When the ProcResGroup has BufferSize=0:
	1. if there is a subunit in the list of write resources for the scheduling class, do not attempt to schedule the ProcResGroup;
	2. if there is not a subunit in the list of write resources for the scheduling class, choose a subunit to use instead of the ProcResGroup;
	3. having both the ProcResGroup and any of its subunits in the resources implied by an InstRW is not supported.
	Used to model parallel uses from a pool of resources.
	Differential Revision: https://reviews.llvm.org/D98976
2021-02-16	[CodeGen] Use range-based for loops (NFC)	Kazu Hirata	1	-10/+8

2020-12-16	[DDG] Data Dependence Graph - DOT printer - recommit	Bardia Mahjour	1	-1/+1
	This is being recommitted to try and address the MSVC complaint. This patch implements a DDG printer pass that generates a graph in the DOT description language, providing a more visually appealing representation of the DDG. Similar to the CFG DOT printer, this functionality is provided under an option called -dot-ddg and can be generated in a less verbose mode under -dot-ddg-only option. Reviewed By: Meinersbur Differential Revision: https://reviews.llvm.org/D90159
2020-12-14	Revert "[DDG] Data Dependence Graph - DOT printer"	Bardia Mahjour	1	-1/+1
	This reverts commit fd4a10732c8bd646ccc621c0a9af512be252f33a, to investigate the failure on windows: http://lab.llvm.org:8011/#/builders/127/builds/3274
2020-12-14	[DDG] Data Dependence Graph - DOT printer	Bardia Mahjour	1	-1/+1
	This patch implements a DDG printer pass that generates a graph in the DOT description language, providing a more visually appealing representation of the DDG. Similar to the CFG DOT printer, this functionality is provided under an option called -dot-ddg and can be generated in a less verbose mode under -dot-ddg-only option. Differential Revision: https://reviews.llvm.org/D90159
2020-11-17	[MachineScheduler] Inform pass infra of post-ra scheduler's dependencies	Jon Roelofs	1	-2/+7
	Differential Revision: https://reviews.llvm.org/D91561
2020-11-02	[Scheduling] Fall back to the fast cluster algorithm if the DAG is too complex	QingShan Zhang	1	-9/+62
	We added a new load/store cluster algorithm in D85517. However, AArch64 sees some compile-time degradation with the new algorithm, because IsReachable() is not cheap (O(M+N)) if the DAG is complex. See https://bugs.llvm.org/show_bug.cgi?id=47966. So this patch adds a heuristic to switch to the old cluster algorithm if the DAG is too complex. Reviewed By: Owen Anderson Differential Revision: https://reviews.llvm.org/D90144
2020-10-28	[NFC] Use Register in RegisterPressure APIs	Mircea Trofin	1	-2/+2
	Some related changes as well. Differential Revision: https://reviews.llvm.org/D90268
2020-10-05	[CodeGen][MachineSched] Fixup function name typo. NFC	Jon Roelofs	1	-2/+2

2020-09-21	Revert "[NFC][ScheduleDAG] Remove unused EntrySU SUnit"	Alexander Belyaev	1	-2/+5
	This reverts commit 0345d88de654259ae90494bf9b015416e2cccacb. Google internal backend uses EntrySU, we are looking into removing dependency on it. Differential Revision: https://reviews.llvm.org/D88018
2020-09-18	[NFC][ScheduleDAG] Remove unused EntrySU SUnit	Francis Visoiu Mistrih	1	-5/+2
	EntrySU doesn't seem to be used at all when building the ScheduleDAG. Differential Revision: https://reviews.llvm.org/D87867
2020-08-26	[Scheduling] Implement a new way to cluster loads/stores	QingShan Zhang	1	-64/+72
	Before calling the target hook to determine if two loads/stores are clusterable, we put them into different groups to avoid fake clusters due to dependency. For now, we put the loads/stores into the same group if they have the same predecessor. We assume that if two loads/stores have the same predecessor, it is likely that they do not depend on each other. However, one SUnit might have several predecessors and, for now, we just pick the first predecessor that has a non-data/non-artificial dependency, which is too arbitrary, and we are struggling to fix it. So I am proposing a better implementation:
	1. Collect all the loads/stores that have memory info first, to reduce the complexity.
	2. Sort these loads/stores so that we can stop seeking as early as possible.
	3. For each load/store, seek the first non-dependent instruction in the sorted order, and check whether they can be clustered.
	Reviewed By: Jay Foad Differential Revision: https://reviews.llvm.org/D85517
2020-08-07	[NFC] Add the stats for load/store cluster	QingShan Zhang	1	-0/+4
	We have stats for MacroFusion but are missing them for load/store clustering.
2020-08-07	[Scheduling] Create the missing dependency edges for store cluster	QingShan Zhang	1	-10/+26
	For a load cluster, we don't need to create the dependency edges (SUb->reg) from SUb to SUa, as they both depend on the base register "reg":

	     +-------+
	+----> reg   |
	|    +---+---+
	|        ^
	|        |
	|        |
	|        |
	|    +---+---+
	|    |  SUa  |  Load 0(reg)
	|    +---+---+
	|        ^
	|        |
	|        |
	|    +---+---+
	+----+  SUb  |  Load 4(reg)
	     +-------+

	But for a store cluster, we need to create the edge as shown below, to prevent the instruction that the store depends on from being scheduled in between SUb and SUa:

	     +-------+
	+----> reg   |
	|    +---+---+
	|        ^
	|        |         Missing       +-------+
	|        | +-------------------->+   y   |
	|        | |                     +---+---+
	|    +---+-+-+                       ^
	|    |  SUa  |  Store x 0(reg)       |
	|    +---+---+                       |
	|        ^                           |
	|        |  +------------------------+
	|        |  |
	|    +---+--++
	+----+  SUb  |  Store y 4(reg)
	     +-------+

	Reviewed By: evandro, arsenm, rampitec, foad, fhahn
	Differential Revision: https://reviews.llvm.org/D72031
2020-08-03	Fix typo: s/epomymous/eponymous/ NFC	Jon Roelofs	1	-1/+1

2020-07-27	[Scheduling] Improve group algorithm for store cluster	QingShan Zhang	1	-1/+7
	Store Addr and Store Addr+8 are a clusterable pair. They have memory (ctrl) dependencies on different loads. The current implementation puts these two stores into different groups and fails to cluster them. Reviewed By: evandro Differential Revision: https://reviews.llvm.org/D84139
2020-07-17	[MachineScheduler] Fix the TopDepth/BotHeightReduce latency heuristics	Jay Foad	1	-2/+10
	tryLatency compares two sched candidates. For the top zone it prefers the one with lesser depth, but only if that depth is greater than the total latency of the instructions we've already scheduled -- otherwise its latency would be hidden and there would be no stall. Unfortunately it only tests the depth of one of the candidates. This can lead to situations where the TopDepthReduce heuristic does not kick in, but a lower priority heuristic chooses the other candidate, whose depth is greater than the already scheduled latency, which causes a stall. The fix is to apply the heuristic if the depth of either candidate is greater than the already scheduled latency. All this also applies to the BotHeightReduce heuristic in the bottom zone. Differential Revision: https://reviews.llvm.org/D72392
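The difference between the old and the new condition can be sketched as follows. Both functions and their parameter names are hypothetical simplifications of the top-zone check described above, not the code of `tryLatency`.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical simplification (not LLVM's tryLatency). Before the fix, only
// one candidate's depth was compared with the latency already scheduled.
bool heuristicAppliesBefore(unsigned TestedDepth, unsigned OtherDepth,
                            unsigned ScheduledLatency) {
  (void)OtherDepth; // ignored: this is the gap the commit describes
  return TestedDepth > ScheduledLatency;
}

// After the fix, the heuristic applies if either depth exceeds that latency.
bool heuristicAppliesAfter(unsigned TestedDepth, unsigned OtherDepth,
                           unsigned ScheduledLatency) {
  return std::max(TestedDepth, OtherDepth) > ScheduledLatency;
}
```

With a scheduled latency of 5, a tested depth of 3 and an other depth of 8, the old check skips the heuristic even though choosing the deeper candidate could stall, while the new check applies it.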