|
This patch does the noisy work of moving the test opcodes from the
exported interface to an interface that is only visible in `libc`. The
benefit of this is that we both test the exported RPC registration more
directly, and we do not need to give this interface to users.
I have decided to export any opcode that is not a "core" libc feature as
having its MSB set in the opcode. We can think of these as non-libc
"extensions".
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/D154848
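The opcode convention described above can be illustrated with a minimal sketch. The 16-bit opcode width, the enumerator names, and the `is_extension` helper are assumptions made for illustration only; the actual values and types used in `libc` may differ.

```cpp
#include <cstdint>

// Assumption: opcodes are 16 bits wide, so the MSB is bit 15.
constexpr uint16_t EXTENSION_BIT = 1u << 15;

// Hypothetical opcode values, not the real libc enumerators.
enum Opcode : uint16_t {
  NOOP = 0,
  EXIT = 1,
  WRITE = 2,
  // Non-core "extension" opcodes have the most significant bit set.
  TEST_INCREMENT = EXTENSION_BIT | 0,
  TEST_STREAM = EXTENSION_BIT | 1,
};

// Returns true if the opcode belongs to the extension range.
constexpr bool is_extension(uint16_t opcode) {
  return (opcode & EXTENSION_BIT) != 0;
}
```

Reserving the top bit keeps the core and extension ranges disjoint without needing a central table of assigned values.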
|
|
This test is excessively slow on GPU targets, taking anywhere between 5
and 60 seconds to complete each time it's run. See
https://lab.llvm.org/buildbot/#/builders/55/builds/52203/steps/12/logs/stdio
for an example on the NVPTX buildbot. Simply disable testing this on the
GPU for now.
Reviewed By: michaelrj
Differential Revision: https://reviews.llvm.org/D155979
|
|
Currently we keep an internal buffer of device memory that is used to
indicate ownership of a port. Since we only use this as a single bit we
can simply turn this into a bitfield. I did this manually rather than
having a separate type as we need very special handling of the masks
used to interact with the locks.
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/D155511
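As a rough illustration of packing one ownership bit per port, the sketch below uses `std::atomic` on the host. The `PortLocks` type and its method names are hypothetical, and the real implementation needs additional handling of lane masks on the GPU that is not shown here.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical sketch: one ownership bit per port, packed 32 to a word.
template <uint32_t NumPorts> struct PortLocks {
  static constexpr uint32_t BITS = 32;
  std::atomic<uint32_t> words[(NumPorts + BITS - 1) / BITS] = {};

  // Try to take ownership of `port`; returns true if the bit was clear.
  bool try_lock(uint32_t port) {
    uint32_t mask = 1u << (port % BITS);
    uint32_t old =
        words[port / BITS].fetch_or(mask, std::memory_order_acquire);
    return (old & mask) == 0;
  }

  // Release ownership of `port` by clearing its bit.
  void unlock(uint32_t port) {
    uint32_t mask = 1u << (port % BITS);
    words[port / BITS].fetch_and(~mask, std::memory_order_release);
  }
};
```

Compared with one word per port, this uses 32 times less memory for the lock state, at the cost of computing a mask on every access.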
|
|
Was passing zeros to the string print function.
Reviewed By: jhuber6
Differential Revision: https://reviews.llvm.org/D155899
|
|
The functions are in a header, and so must be marked inline to avoid
symbol conflicts.
Differential Revision: https://reviews.llvm.org/D155892
|
|
The new printf writer design focuses on optimizing the fast path. It
inlines any write to a buffer or string, and by handling buffering
itself can more effectively work with both internal and external file
implementations. The overflow hook should allow for expansion to
asprintf with minimal extra code.
Reviewed By: sivachandra
Differential Revision: https://reviews.llvm.org/D153999
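The idea of a buffered writer with a fast path and an overflow hook can be sketched as follows. This is not the actual `libc` writer; the `BufferedWriter` class, the `OverflowHook` signature, and `append_to_string` are hypothetical names chosen for illustration.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical hook signature: receives bytes that must leave the buffer.
using OverflowHook = int (*)(const char *data, std::size_t len, void *target);

class BufferedWriter {
  char *buff;
  std::size_t cap;
  std::size_t pos = 0;
  OverflowHook hook;
  void *target;

public:
  BufferedWriter(char *buff, std::size_t cap, OverflowHook hook, void *target)
      : buff(buff), cap(cap), hook(hook), target(target) {}

  // Hand any buffered bytes to the hook and reset the buffer.
  int flush() {
    int err = pos ? hook(buff, pos, target) : 0;
    pos = 0;
    return err;
  }

  int write(const char *data, std::size_t len) {
    // Fast path: the data fits, so this is just a copy into the buffer.
    if (len <= cap - pos) {
      std::memcpy(buff + pos, data, len);
      pos += len;
      return 0;
    }
    // Slow path: drain the buffer, then pass the new data to the hook.
    if (int err = flush())
      return err;
    return hook(data, len, target);
  }
};

// Example hook that appends to a std::string, similar in spirit to how a
// growing destination such as asprintf might be supported.
inline int append_to_string(const char *data, std::size_t len, void *target) {
  static_cast<std::string *>(target)->append(data, len);
  return 0;
}
```

Because the writer owns the buffering, the same fast path can sit in front of a file, a fixed string, or a growable buffer by swapping only the hook.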
|
|
Clang supports the `-Wglobal-constructors` flag which will indicate if a
global constructor is being used. The current goal in `libc` is to make
the constructors `constexpr` to prevent this from happening with
straight construction. However, there are many other cases where we can
emit a constructor that this won't catch. This should give a warning if
someone accidentally introduces a global constructor.
Reviewed By: michaelrj
Differential Revision: https://reviews.llvm.org/D155721
|
|
When testing, there are some cases where we want to override the logic for
not building tests if the loader is not present. This allows users to
specify an external binary that fulfils the same duties which will force
the tests to be built even without meeting the dependencies.
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/D155837
|
|
If the clock_freq symbol isn't used, and is removed,
we don't need to abort the loader. We can instead just not set it.
Reviewed By: jhuber6
Differential Revision: https://reviews.llvm.org/D155832
|
|
This file required a global constructor due to copying the file stream
and having a non-constexpr constructor for the wrapper type. Also, I
changed `opterr` to be a pointer, because it seemed like it wasn't
being set correctly as an externally visible variable if we just
captured it by value.
Reviewed By: abrachet
Differential Revision: https://reviews.llvm.org/D155766
|
|
The `File` interface currently has a destructor to delete the buffer if
it is owned by the file. This is problematic for the globally allocated
`stdout`, `stdin`, and `stderr` files. This causes the file interface to
have global constructors to initialize the destructors to use these.
However, these never use the destructors because they don't own the
buffer. This patch removes the destructor and calls it manually in the
close implementation. The platform close should never need to access the
buffer and it needs to be done before clearing the whole thing, so this
should work.
Reviewed By: sivachandra
Differential Revision: https://reviews.llvm.org/D155762
|
|
HSA headers might be under a hsa/ directory or might not.
This scheme matches the one used by the openmp amdgpu plugin.
Reviewed By: jhuber6, jplehr
Differential Revision: https://reviews.llvm.org/D155812
|
|
Summary:
Simple cleanup of the interface so we do not depend on the installed
headers and get everything we need just including rpc_client.h.
|
|
The indirection here is for some reason causing an unnecessary
constructor. If we leave this uninitialized we will get the default
constructor which simply zero initializes the global. I've checked the
output and confirmed that it uses the `zeroinitializer` so this should
be safe.
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/D155720
|
|
This patch adds the `rpc_host_call` function as a GPU extension. This is
exported from the `libc` project to use the RPC interface to call a
function pointer via RPC and copying the arguments by-value. The
interface can only support a single void pointer argument much like
pthreads. The function call here is the bare-bones version of what's
required for OpenMP reverse offloading. Full support will require
interfacing with the mapping table, nowait support, etc.
I decided to test this interface in `libomptarget` as that will be the
primary consumer and it would be more difficult to make a test in `libc`
due to the testing infrastructure not really having a concept of the
"host" as it runs directly on the GPU as if it were a CPU target.
Reviewed By: jplehr
Differential Revision: https://reviews.llvm.org/D155003
|
|
Summary:
This caused test failures on the gfx90a buildbot. This works on my
gfx1030 and the Nvidia buildbots, so we'll need to investigate what is
going wrong here. For now revert it to get the bots green.
This reverts commit 05abcc579244b68162b847a6780d27b22bd58f74.
|
|
This patch mostly renames files so they better reflect the functions they declare.
Reviewed By: michaelrj
Differential Revision: https://reviews.llvm.org/D155607
|
|
Reviewed By: JonChesterfield, jhuber6
Differential Revision: https://reviews.llvm.org/D155597
|
|
The number of spaces to pad with is stored in the variable
padding_spaces. Previously the actual write calls used the same formula
to recalculate the value. This simplifies and clarifies the code by just
reusing the variable.
Reviewed By: lntue
Differential Revision: https://reviews.llvm.org/D155113
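A small sketch of the pattern: compute the padding once and reuse it in every branch that writes. Only the name `padding_spaces` comes from the commit message; the `pad_to_width` function and its parameters are hypothetical and are not the printf converter code itself.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: pad `s` with spaces up to `min_width` characters.
std::string pad_to_width(const std::string &s, std::size_t min_width,
                         bool left_justified) {
  // Computed once; both branches below reuse it instead of repeating
  // the subtraction.
  std::size_t padding_spaces =
      min_width > s.size() ? min_width - s.size() : 0;

  std::string out;
  if (left_justified) {
    out += s;
    out.append(padding_spaces, ' ');
  } else {
    out.append(padding_spaces, ' ');
    out += s;
  }
  return out;
}
```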
|
|
MPFR has a minimum precision of 2, but the strtofloat fuzz sometimes
would request a precision of 1 for the case of the minimum subnormal.
This patch tells the fuzzer to ignore any case where the precision would
go below 2.
Reviewed By: lntue
Differential Revision: https://reviews.llvm.org/D155130
|
|
Currently we keep an internal buffer of device memory that is used to
indicate ownership of a port. Since we only use this as a single bit we
can simply turn this into a bitfield. I did this manually rather than
having a separate type as we need very special handling of the masks
used to interact with the locks.
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/D155511
|
|
This reverts commit eca8b54a5f76c65a055bac05556b70c2a0ec63a1.
Another user reverted the patch this was based on leaving this one in a
broken state.
|
|
A previous patch made this cause an error on the GPU. We have not yet
dedicated time towards an optimal implementation there but we do not
want it to cause an error. We simply use the fallback routines.
Differential Revision: https://reviews.llvm.org/D155615
|
|
Broke amdgpu libc bot
This reverts commit a39c951730aa92894e27da038e834229d4613db1.
|
|
Reviewed By: gchatelet
Differential Revision: https://reviews.llvm.org/D155597
|
|
This is a follow up on D154800 and D154770 to make the code structure more principled and avoid too many nested #ifdef/#endif.
Reviewed By: courbet
Differential Revision: https://reviews.llvm.org/D155515
|
|
This is the target definition only. Currently they are treated the same
as GFX 11.0.x.
Differential Revision: https://reviews.llvm.org/D155429
|
|
This is a follow up on D154800 and D154770 to make the code structure more principled and avoid too many nested #ifdef/#endif.
Reviewed By: courbet
Differential Revision: https://reviews.llvm.org/D155181
|
|
This is a follow up on D154800 and D154770 to make the code structure more principled and avoid too many nested #ifdef/#endif.
Reviewed By: courbet
Differential Revision: https://reviews.llvm.org/D155174
|
|
This is a follow up on D154800 and D154770 to make the code structure more principled and avoid too many nested #ifdef/#endif.
Reviewed By: courbet
Differential Revision: https://reviews.llvm.org/D155099
|
|
This is a follow up on D154800 and D154770 to make the code structure more principled and avoid too many nested #ifdef/#endif.
Reviewed By: courbet
Differential Revision: https://reviews.llvm.org/D155076
|
|
Provide platform-specific x87 FPU definitions and operations
Differential Revision: https://reviews.llvm.org/D153823
|
|
This ensures that if someone calls the `rpc_shutdown` method multiple
times it will not segfault and gracefully continue. This was causing
problems in the OpenMP usage. This could point to other issues, but for
now this is a safe fix.
Differential Revision: https://reviews.llvm.org/D155005
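The guard described above can be sketched in a few lines. This is not the actual `rpc_shutdown` code; `ServerState`, `sketch_init`, `sketch_is_running`, and `sketch_shutdown` are hypothetical stand-ins, and the boolean return value is an assumption made so the behavior can be observed.

```cpp
// Hypothetical server state owned by the library.
struct ServerState {
  int num_devices;
};

static ServerState *state = nullptr;

// Create the state; fails if the server is already initialized.
bool sketch_init(int num_devices) {
  if (state)
    return false;
  state = new ServerState{num_devices};
  return true;
}

bool sketch_is_running() { return state != nullptr; }

// Safe to call any number of times: a second call sees the null pointer
// and returns without touching freed memory.
bool sketch_shutdown() {
  if (!state)
    return false;
  delete state;
  state = nullptr;
  return true;
}
```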
|
|
Subnormal floating point numbers have a lower effective precision than
normal floating point numbers. This can cause issues for the fuzz test
since the MPFR floats have a constant precision regardless of the
exponent, and the precision must match exactly or else create rounding
errors. To solve this problem, the precision of the MPFR floats is
dynamically calculated.
Reviewed By: lntue
Differential Revision: https://reviews.llvm.org/D154909
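To make the precision argument concrete, the sketch below computes the effective precision of a single-precision value from its bit pattern. It assumes IEEE 754 binary32 and uses a hypothetical `effective_precision` helper; it is not the fuzzer's code.

```cpp
#include <cstdint>

// Assumes IEEE 754 binary32: 8 exponent bits and 23 stored mantissa bits.
// Returns the number of significant bits the value can actually hold.
int effective_precision(uint32_t bits) {
  uint32_t exponent = (bits >> 23) & 0xFF;
  uint32_t mantissa = bits & 0x7FFFFF;

  // Normal numbers carry an implicit leading one: 24 bits of precision.
  // (Infinities and NaNs also land here; this sketch does not treat them
  // specially.)
  if (exponent != 0)
    return 24;

  // Subnormal numbers have no implicit bit, so the precision is the
  // position of the highest set mantissa bit. Zero yields 0.
  int precision = 0;
  while (mantissa != 0) {
    ++precision;
    mantissa >>= 1;
  }
  return precision;
}
```

Under this model the minimum subnormal (bit pattern `0x00000001`) has an effective precision of 1.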
|
|
CUDA generally requires a PTX feature in order to compile. Because the
`libcgpu.a` archive contains LLVM-IR, we need to have one present to
compile it. Currently, the wrapper fatbinary format we use to
incorporate these into single-source offloading languages has a special
option to provide this. Since this was not present in the builds, if the
user did not specify it via `-foffload-lto` it would not compile from
CUDA or OpenMP due to the missing PTX features. Fix this by passing it
to the packager invocation.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D154864
|
|
These functions have definitions differing between C and C++. GNU
respects the C++ definitions while the LLVM libc does not. This causes
many bugs and the current hack creates other issues. Rather than hack
around this I'd rather temporarily disable these than regress with the
integration into other offloading languages. We lose test support for
them but we should be able to re-enable these once the `libc` headers
provide these correctly.
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/D154850
|
|
This was accidentally omitted from D154746.
|
|
D152592 introduced LIBC_INCLUDE_DIR for the location of the include
directory, use it in relevant CMake rules.
Differential Revision: https://reviews.llvm.org/D154278
|
|
Follow up on https://reviews.llvm.org/D154770
Differential Revision: https://reviews.llvm.org/D154800
|
|
header
There will be subsequent patches to move things around and make the file layout more principled.
Differential Revision: https://reviews.llvm.org/D154770
|
|
This is a follow up to D154529 covering tests.
Differential Revision: https://reviews.llvm.org/D154746
|
|
This reverts commit 147c0640a3faae5e382c0d319c15112c92e06098 since
it broke GPU builds.
|
|
This is an alternate approach to the patches proposed in D153897 and
D153794. Rather than exporting a single header that can be included on
the GPU in all circumstances, this patch chooses to instead generate a
separate set of headers that only provides the declarations. This can
then be used by external tooling to set up what's on the GPU. This
leaves room for header hacks for offloading languages without needing to
worry about the `libc` implementation.
Currently this generates a set of headers that only contain the
declarations. These will then be installed to a new clang resource
directory called `llvm_libc_wrappers/` which will house the shim code.
We can then automatically include this from `clang` when offloading to
wrap around the headers while specifying what's on the GPU.
Reviewed By: jdoerfert, JonChesterfield
Differential Revision: https://reviews.llvm.org/D154036
|
|
This reverts commit 6e821f0b3a83fa6bc4f821b26611068675070a5a since
it broke the libc-aarch64-ubuntu-fullbuild-dbg bot.
|
|
D152592 introduced LIBC_INCLUDE_DIR for the location of the include
directory, use it in relevant CMake rules.
Differential Revision: https://reviews.llvm.org/D154278
|
|
This test uses libc headers and needs to explicitly include them.
Differential Revision: https://reviews.llvm.org/D154277
|
|
This patch adds the initial support for running an RPC server in
libomptarget to handle host services. We interface with the library
provided by the `libc` project to stand up a basic server. We introduce
a new type that is controlled by the plugin and has each device
initialize its interface. We then run a basic server to check the RPC
buffer.
This patch does not fully implement the interface. In the future each
plugin will want to define special handlers via the interface to support
things like malloc or H2D copies coming from RPC. We will also want to
allow the plugin to specify the number of ports. This is currently
capped in the implementation but will be adjusted soon.
Right now running the server is handled by whatever thread ends up doing
the waiting. This is probably not a completely sound solution but I am
not overly familiar with the behaviour of OpenMP tasks and what would be
required here. This works okay with synchronous regions, and somewhat
fine with `nowait` regions, but I've observed some weird behavior when
one of those regions calls `exit`.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D154312
|
|
AMDGPU supports aliases now, so we can drop this case and leave it only
for the NVPTX target. Unfortunately it's unlikely that NVPTX will be
able to support this in the future due to their PTX language being very
limited.
Reviewed By: lntue
Differential Revision: https://reviews.llvm.org/D154704
|
|
For machines with a lot of cores, hardware prefetchers can saturate the memory bus when utilization is high.
In this case it is desirable to turn off the hardware prefetcher completely.
This has a big impact on the performance of memory functions such as `memcpy` that rely on the fact that the next cache line will be readily available.
This patch adds the 'LIBC_COPT_MEMCPY_X86_USE_SOFTWARE_PREFETCHING' compile time option that generates a version of memcpy with software prefetching. While it does not fully restore the original performance, it mitigates the impact to an acceptable level.
Reviewed By: rtenneti
Differential Revision: https://reviews.llvm.org/D154494
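The general shape of a copy loop with software prefetching can be sketched as below. This is not the `libc` implementation: `memcpy_sw_prefetch`, the 64-byte line size, and the four-line prefetch distance are assumptions for illustration, and `__builtin_prefetch` is a GCC/Clang builtin rather than standard C++.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical sketch: copy one cache line at a time while requesting a
// line further ahead, so it is likely in cache when the loop reaches it.
void *memcpy_sw_prefetch(void *dst, const void *src, std::size_t n) {
  constexpr std::size_t LINE = 64;        // assumed cache line size
  constexpr std::size_t AHEAD = 4 * LINE; // assumed prefetch distance

  auto *d = static_cast<unsigned char *>(dst);
  auto *s = static_cast<const unsigned char *>(src);

  std::size_t off = 0;
  for (; off + LINE <= n; off += LINE) {
    // Only prefetch addresses that are still inside the source buffer.
    if (off + AHEAD < n)
      __builtin_prefetch(s + off + AHEAD, /*rw=*/0, /*locality=*/3);
    std::memcpy(d + off, s + off, LINE);
  }
  // Copy whatever is left after the last full line.
  std::memcpy(d + off, s + off, n - off);
  return dst;
}
```

The prefetch distance is a tuning parameter: too short and the data does not arrive in time, too long and it may be evicted before use.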
|
|
This reverts commit a4a26374aa11d48ac6bf65c78c2aaf8f16414287.
This was causing some problems with the CPU build and CUDA buildbot.
Revert until I can figure out what those issues are and fix them. I
believe it is just some CMake.
|