path: root/libclc/ptx-nvidiacl
Age | Commit message | Author | Files | Lines
2020-08-17 | libclc: Make all built-ins overloadable | Daniel Stone | 7 | -12/+11
The SPIR spec states that all OpenCL built-in functions should be overloadable and mangled, to ensure consistency. Add the overload attribute to functions which were missing them: work dimensions, memory barriers and fences, and events.
Reviewed By: tstellar, jenatali
Differential Revision: https://reviews.llvm.org/D82078
2017-10-09 | Implement mem_fence on ptx | Jeroen Ketema | 2 | -0/+16
PTX does not differentiate between read and write fences. Hence, these are lowered to a mem_fence call. The mem_fence function compiles to the “membar.cta” instruction, which commits all outstanding reads and writes of a thread such that these become visible to all other threads in the same CTA (i.e., work-group). The instruction does not differentiate between global and local memory. Hence, the flags parameter is ignored, except for deciding whether a “membar.cta” instruction should be issued at all.
Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 315235
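To make the lowering described above concrete, here is a minimal host-side C model. It is a sketch under stated assumptions, not the libclc source: the `model_` prefixed names, the flag values, and `membar_cta_stub` (a counter standing in for the real fence instruction) are all hypothetical and exist only for illustration.

```c
/* Illustrative flag values; the real constants come from the OpenCL headers. */
typedef unsigned int cl_mem_fence_flags;
#define CLK_LOCAL_MEM_FENCE  0x1u
#define CLK_GLOBAL_MEM_FENCE 0x2u

/* Hypothetical stand-in for the membar.cta instruction: it only counts
 * how many times a fence would have been issued. */
static int membar_cta_issued = 0;
static void membar_cta_stub(void) { membar_cta_issued++; }

/* The flags only decide whether a fence is issued at all; local and
 * global memory are not treated differently. */
void model_mem_fence(cl_mem_fence_flags flags) {
    if (flags & (CLK_LOCAL_MEM_FENCE | CLK_GLOBAL_MEM_FENCE))
        membar_cta_stub();
}

/* Read and write fences are not distinguished, so both forward to
 * the general fence. */
void model_read_mem_fence(cl_mem_fence_flags flags)  { model_mem_fence(flags); }
void model_write_mem_fence(cl_mem_fence_flags flags) { model_mem_fence(flags); }
```

In this model, passing 0 issues nothing, while any combination of the two flags issues exactly one fence per call.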
2017-10-09 | Make ptx barrier work irrespective of the cl_mem_fence_flags | Jeroen Ketema | 1 | -3/+1
This generates a "bar.sync 0" instruction, which not only causes the threads to wait, but also acts as a memory fence, as required by OpenCL. The fence does not differentiate between local and global memory. Unfortunately, there is no similar instruction which does not include a memory fence. Hence, we cannot optimize the case where neither CLK_LOCAL_MEM_FENCE nor CLK_GLOBAL_MEM_FENCE is passed.
llvm-svn: 315228
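The behaviour described above can be sketched as a small host-side C model. This is an illustration under stated assumptions rather than the actual libclc code: `model_barrier`, the flag values, and `bar_sync0_stub` (a counter standing in for the synchronisation instruction) are hypothetical names.

```c
/* Illustrative flag values; the real constants come from the OpenCL headers. */
typedef unsigned int cl_mem_fence_flags;
#define CLK_LOCAL_MEM_FENCE  0x1u
#define CLK_GLOBAL_MEM_FENCE 0x2u

/* Hypothetical stand-in for "bar.sync 0": it only counts invocations. */
static int bar_sync0_issued = 0;
static void bar_sync0_stub(void) { bar_sync0_issued++; }

/* The synchronisation is issued unconditionally. Because the instruction
 * always includes a fence, the flags cannot be used to skip it. */
void model_barrier(cl_mem_fence_flags flags) {
    (void)flags; /* intentionally unused */
    bar_sync0_stub();
}
```

Unlike a fence-only call, the model issues the instruction even when the flags are 0, because the work-items still have to wait for each other.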
2016-07-22 | AMDGPU: Implement get_global_offset builtin | Jan Vesely | 2 | -0/+6
Also fix get_global_id to consider the offset. No idea how to add this for ptx, so they are stuck with the old get_global_id implementation.
v2: split to a separate patch
v3: Switch R600 to use implicitarg.ptr
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 276443
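For context, the difference between an offset-aware get_global_id and one that ignores the offset can be sketched in plain C. The struct and function names here are hypothetical, and the "without offset" variant is only an assumption about what the old implementation computes; the "with offset" formula follows the OpenCL definition of the global ID for uniform work-group sizes.

```c
#include <stddef.h>

/* Hypothetical per-dimension description of one work-item. */
typedef struct {
    size_t group_id;      /* index of the work-group            */
    size_t local_size;    /* number of work-items per group     */
    size_t local_id;      /* index of the item within its group */
    size_t global_offset; /* offset passed at kernel enqueue    */
} work_item_dim;

/* Assumed shape of an implementation that ignores the offset. */
size_t global_id_without_offset(const work_item_dim *w) {
    return w->group_id * w->local_size + w->local_id;
}

/* Offset-aware variant: the global offset is added on top. */
size_t global_id_with_offset(const work_item_dim *w) {
    return w->global_offset + global_id_without_offset(w);
}
```

The two variants agree whenever the kernel is enqueued with a zero global offset, which is why the older behaviour only shows up when a non-zero offset is used.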
2016-07-22 | ptx: Fix builtin names after clang r274770 | Jan Vesely | 5 | -13/+13
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Acked-By: Aaron Watry <awatry@gmail.com>
llvm-svn: 276423
2012-09-05 | Add barrier.cl to SOURCES, spotted by Jin Wang. | Peter Collingbourne | 1 | -0/+1
llvm-svn: 163227
2012-08-05 | PTX: move implementations of work-item and synchronisation functions | Peter Collingbourne | 8 | -21/+19
to lib, and add header files in generic. Incorporates a patch by Tom Stellard! llvm-svn: 161313
2012-01-08 | Initial commit. | Peter Collingbourne | 8 | -0/+54
llvm-svn: 147756