| author | Guray Ozen <guray.ozen@gmail.com> | 2024-04-24 12:00:12 +0200 |
|---|---|---|
| committer | GitHub <noreply@github.com> | 2024-04-24 12:00:12 +0200 |
| commit | 4d3308202e52b213a05023c8b8b470b346151de6 | |
| tree | 533094638c052fc79ee7898fb788fb231396a49e | |
| parent | 506c84a7198630b7476b02d985c6ed09338f757d | |
[mlir][nvgpu] NVGPU Tutorials (#87065)
I have a tutorial at EuroLLVM 2024 ([Zero to Hero: Programming Nvidia
Hopper Tensor Core with MLIR's NVGPU
Dialect](https://llvm.swoogo.com/2024eurollvm/session/2086997/zero-to-hero-programming-nvidia-hopper-tensor-core-with-mlir's-nvgpu-dialect)).
For that, I implemented the tutorial code in Python. The focus is on the nvgpu
dialect and how to use its advanced features. I thought it might be
useful to upstream this.
The tutorial chapters are as follows (a short reference sketch of the computations they implement follows the list):
- **Ch0.py:** Hello World
- **Ch1.py:** 2D Saxpy
- **Ch2.py:** 2D Saxpy using TMA
- **Ch3.py:** GEMM 128x128x64 using Tensor Core and TMA
- **Ch4.py:** Multistage performant GEMM using Tensor Core and TMA
- **Ch5.py:** Warp Specialized GEMM using Tensor Core and TMA
I might implement one more chapter:
- **Ch6.py:** Warp Specialized Persistent ping-pong GEMM
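For orientation, here is a plain-NumPy sketch of the computations the saxpy and GEMM chapters implement. Array names, shapes, and dtypes are illustrative and not taken from the tutorial sources; the tutorial's own host-side setup and verification may differ.

```python
import numpy as np

# 2D saxpy (Ch1/Ch2): Z = alpha * X + Y over 2D arrays.
# Sizes here are illustrative only.
M, N = 256, 32
alpha = 2.0
x = np.random.randn(M, N).astype(np.float32)
y = np.random.randn(M, N).astype(np.float32)
z_ref = alpha * x + y

# GEMM (Ch3-Ch5): D = A @ B with the 128x128x64 problem size from Ch3,
# i.e. A is 128x64 and B is 64x128. Hopper Tensor Cores typically take
# f16 inputs and accumulate in f32, so the reference mirrors that.
a = np.random.randn(128, 64).astype(np.float16)
b = np.random.randn(64, 128).astype(np.float16)
d_ref = np.matmul(a.astype(np.float32), b.astype(np.float32))

# A GPU result produced by an nvgpu kernel could then be checked with,
# e.g., np.allclose(gpu_out, z_ref) or np.allclose(gpu_out, d_ref).
```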
This PR also introduces the nvdsl class, making IR building in the
tutorial easier.
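To give a feel for the kind of boilerplate such a helper wraps, below is a generic sketch of "raw" IR construction with MLIR's upstream Python bindings. It does not reproduce the `nvdsl` API itself, and the `mlir` module layout shown assumes a build of LLVM/MLIR with the Python bindings enabled.

```python
# Minimal sketch of hand-written IR building with the MLIR Python bindings;
# a helper like the nvdsl class lets tutorial chapters avoid this plumbing.
from mlir.ir import Context, Location, Module, InsertionPoint, F32Type
from mlir.dialects import arith, func

with Context(), Location.unknown():
    module = Module.create()
    with InsertionPoint(module.body):
        f32 = F32Type.get()

        # A trivial function returning a + b for two f32 arguments.
        @func.FuncOp.from_py_func(f32, f32)
        def add(a, b):
            return arith.AddFOp(a, b).result

    # Print the textual IR for the module we just built.
    print(module)
```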