aboutsummaryrefslogtreecommitdiff
path: root/clang/lib/CodeGen/ModuleBuilder.cpp
diff options
context:
space:
mode:
authorMehdi Amini <mehdi.amini@apple.com>2015-01-17 01:35:56 +0000
committerMehdi Amini <mehdi.amini@apple.com>2015-01-17 01:35:56 +0000
commit37f316afaf69cbdfb1f5959342ac74a888943ed6 (patch)
tree381cf151e139aa31592266e1f99fb82f665c0461 /clang/lib/CodeGen/ModuleBuilder.cpp
parentdcd7bb022a304de69161fe990b12e6341313a69e (diff)
downloadllvm-37f316afaf69cbdfb1f5959342ac74a888943ed6.zip
llvm-37f316afaf69cbdfb1f5959342ac74a888943ed6.tar.gz
llvm-37f316afaf69cbdfb1f5959342ac74a888943ed6.tar.bz2
Improve DAG combine pass on certain IR vector patterns
Loading 2 2x32-bit float vectors into the bottom half of a 256-bit vector produced suboptimal code in AVX2 mode with certain IR combinations. In particular, the IR optimizer folded 2f32 + 2f32 -> 4f32, 4f32 + 4f32 (undef) -> 8f32 into a 2f32 + 2f32 -> 8f32, which seems more canonical, but then mysteriously generated rather bad code; the movq/movhpd combination didn't match. The problem lay in the BUILD_VECTOR optimization path. The 2f32 inputs would get promoted to 4f32 by the type legalizer, eventually resulting in a BUILD_VECTOR on two 4f32 into an 8f32. The BUILD_VECTOR then, recognizing these were both half the output size, concatted them and then produced a shuffle. However, the resulting concat + shuffle was more complex than it should be; in the case where the upper half of the output is undef, we probably want to generate shuffle + concat instead. This enhancement causes the vector_shuffle combine step to recognize this suboptimal pattern and correct it. I included it there instead of in BUILD_VECTOR in case the same suboptimal pattern occurs for other reasons. This results in the optimizer correctly producing the optimal movq + movhpd sequence for all three variations on this IR, even with AVX2. I've included a test case. Radar link: rdar://problem/19287012 Fix for PR 21943. From: Fiona Glaser <fglaser@apple.com> llvm-svn: 226360
Diffstat (limited to 'clang/lib/CodeGen/ModuleBuilder.cpp')
0 files changed, 0 insertions, 0 deletions