author | Patrick Palka <ppalka@redhat.com> | 2024-04-13 10:52:32 -0400 |
---|---|---|
committer | Patrick Palka <ppalka@redhat.com> | 2024-04-13 10:52:32 -0400 |
commit | 436ab7e8e8b16866d8a807af242560ad4fdff0d6 (patch) | |
tree | e1a8dd901a06321d1fa6c7d8e4c327fa9268794d /gcc/rust | |
parent | 5ec5791105caf859e06c47093188dd655862ccb3 (diff) | |
c++/modules: optimize tree flag streaming
One would expect consecutive calls to bytes_in/out::b for streaming
adjacent bits, as is done for tree flag streaming, to at least be
optimized by the compiler into individual bit operations using
statically known bit positions (and ideally combined into larger sized
reads/writes).
Unfortunately this doesn't happen because the compiler has trouble
tracking the values of this->bit_pos and this->bit_val across the
calls, likely because the compiler doesn't know the value of 'this'.
Thus for each consecutive bit stream operation, bit_pos and bit_val are
loaded from 'this', a check is made for whether buffering is needed, and
finally the bit is extracted from bit_val according to the (unknown)
bit_pos, even though relative to the previous operation (if we didn't
need to buffer) bit_val is unchanged and bit_pos is just 1 larger. This
ends up being quite slow, with tree_node_bools taking 10% of the time
when streaming in the std module.
This patch improves this by making tracking of bit_pos and bit_val
easier for the compiler. Rather than bit_pos and bit_val being members
of the (effectively global) bytes_in/out objects, this patch factors out
the bit streaming code/state into separate classes bits_in/out that get
constructed locally as needed for bit streaming. Since these objects
are now clearly local, the compiler can more easily track their values
and optimize away redundant buffering checks.
And since bit streaming is intended to be batched it's natural for these
new classes to be RAII-enabled such that the bit stream is flushed upon
destruction.
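The idea of a locally constructed, RAII-enabled bit streamer can be sketched as follows. This is only an illustration under stated assumptions, not the code in module.cc: byte_sink, bit_writer and write_flags are hypothetical names, and the 32-bit little-endian flush format is assumed for the example.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the byte-level stream (not GCC's bytes_out).
struct byte_sink {
  std::vector<std::uint8_t> data;

  // Append a 32-bit word in little-endian order (assumed format).
  void put_u32(std::uint32_t v) {
    for (int i = 0; i < 4; ++i)
      data.push_back(static_cast<std::uint8_t>(v >> (8 * i)));
  }
};

// Local bit writer: the bit position and pending value live in this
// short-lived object rather than in the long-lived stream, which makes
// them easier for an optimizer to track once b() is inlined.
class bit_writer {
  byte_sink &out_;
  std::uint32_t val_ = 0;
  unsigned pos_ = 0;

public:
  explicit bit_writer(byte_sink &out) : out_(out) {}
  bit_writer(const bit_writer &) = delete;
  bit_writer &operator=(const bit_writer &) = delete;

  // RAII: any pending bits are flushed when the writer is destroyed.
  ~bit_writer() { flush(); }

  // Stream one bit, flushing when the 32-bit buffer is full.
  void b(bool x) {
    val_ |= static_cast<std::uint32_t>(x) << pos_;
    if (++pos_ == 32)
      flush();
  }

  // Write out pending bits, if any, and reset the buffer.
  void flush() {
    if (pos_ != 0) {
      out_.put_u32(val_);
      val_ = 0;
      pos_ = 0;
    }
  }
};

// Streams three adjacent flags as one batch; the flush happens
// automatically when `bits` goes out of scope.
inline void write_flags(byte_sink &sink, bool a, bool b, bool c) {
  bit_writer bits(sink);
  bits.b(a);
  bits.b(b);
  bits.b(c);
}
```

Calling write_flags(sink, true, false, true) on an empty sink leaves a single flushed word in sink.data. Whether a given compiler actually folds the three b() calls into one store depends on inlining and the optimization level.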
In order to make the most of this improved tracking of bit position,
this patch changes parts where we conditionally stream a tree flag
to unconditionally stream (the flag or a dummy value). That way
the number of bits streamed and the respective bit positions are as
statically known as reasonably possible. In lang_decl_bools and
lang_type_bools this patch makes us flush the current bit buffer at the
start so that subsequent bit positions are in turn statically known.
And in core_bools, we can add explicit early exits utilizing invariants
that the compiler can't figure out itself (e.g. a tree code can't have
both TS_TYPE_COMMON and TS_DECL_COMMON, and if a tree code doesn't have
TS_DECL_COMMON then it doesn't have TS_DECL_WITH_VIS).
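The difference between conditionally and unconditionally streaming a flag can be sketched roughly as below. The flags record, its field names, and both pack functions are hypothetical and only illustrate the bit-position argument; they are not taken from module.cc.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical flag record; field names are illustrative only.
struct flags {
  bool has_vis;      // whether public_flag is meaningful for this node
  bool public_flag;
  bool other;
};

// Result of packing: the bits written and how many were used.
struct packed {
  std::uint32_t val;
  unsigned nbits;
};

// Conditional streaming: the position of `other` depends on has_vis,
// so it is only known at run time.
inline packed pack_conditional(const flags &f) {
  packed p{0, 0};
  if (f.has_vis) {
    p.val |= static_cast<std::uint32_t>(f.public_flag) << p.nbits;
    ++p.nbits;
  }
  p.val |= static_cast<std::uint32_t>(f.other) << p.nbits;
  ++p.nbits;
  return p;
}

// Unconditional streaming: a dummy 0 is written when the flag is not
// meaningful, so every bit has a fixed, statically known position.
inline packed pack_unconditional(const flags &f) {
  std::uint32_t val = 0;
  val |= static_cast<std::uint32_t>(f.has_vis && f.public_flag) << 0;
  val |= static_cast<std::uint32_t>(f.other) << 1;
  return packed{val, 2};
}
```

The unconditional form spends one extra bit when has_vis is false, in exchange for constant shift amounts and a constant bit count.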
This patch also moves the definitions of the relevant streaming classes
into anonymous namespaces so that the compiler can make more informed
decisions about inlining their member functions.
After this patch, compile time for a simple Hello World using the std
module is reduced by 7% with a release compiler. The on-disk size of
the std module increases by 0.4% (presumably due to the extra flushing
done in lang_decl_bools and lang_type_bools).
Bit stream-out performance isn't improved as much as stream-in, due to
the spans/lengths instrumentation performed on stream out (which maybe
should be disabled for release builds?).
gcc/cp/ChangeLog:
* module.cc: Update comment about classes defined within.
(class data): Enclose in an anonymous namespace.
(data::calc_crc): Moved from bytes::calc_crc.
(class bytes): Remove. Move bit_flush to namespace scope.
(class bytes_in): Enclose in an anonymous namespace. Inherit
directly from data and adjust accordingly. Move b and bflush
members to bits_in.
(class bytes_out): As above. Remove is_set static data member.
(bit_flush): Moved from class bytes.
(struct bytes_in::bits_in): Define.
(struct bytes_out::bits_out): Define.
(bytes_in::stream_bits): Define.
(bytes_out::stream_bits): Define.
(bytes_out::bflush): Moved to bits_out/in.
(bytes_in::bflush): Likewise.
(bytes_in::bfill): Removed.
(bytes_out::b): Moved to bits_out/in.
(bytes_in::b): Likewise.
(class trees_in): Enclose in an anonymous namespace.
(class trees_out): Enclose in an anonymous namespace.
(trees_out::core_bools): Add bits_out/in parameter and use it.
Unconditionally stream a bit for public_flag. Add early exits
as appropriate.
(trees_in::core_bools): Likewise.
(trees_out::lang_decl_bools): Add bits_out/in parameter and use
it. Flush the current bit buffer at the start. Unconditionally
stream a bit for module_keyed_decls_p.
(trees_in::lang_decl_bools): Likewise.
(trees_out::lang_type_bools): Add bits_out/in parameter and use
it. Flush the current bit buffer at the start.
(trees_in::lang_type_bools): Likewise.
(trees_out::tree_node_bools): Construct a bits_out object and
use/pass it.
(trees_in::tree_node_bools): Likewise.
(trees_out::decl_value): Likewise.
(trees_in::decl_value): Likewise.
(module_state::write_define): Likewise.
(module_state::read_define): Likewise.
Reviewed-by: Jason Merrill <jason@redhat.com>