aboutsummaryrefslogtreecommitdiff
path: root/docs
diff options
context:
space:
mode:
authorAlex Bennée <alex.bennee@linaro.org>2020-07-09 15:13:16 +0100
committerAlex Bennée <alex.bennee@linaro.org>2020-07-11 15:53:00 +0100
commit4d7fe02be39e855cdd11376b4d17a85e343fd5c9 (patch)
tree78283a7abca20df673578bf694de157a3245ec8c /docs
parentc8c06e520d389dcde5963cc5a73d5ecbaf6b8e55 (diff)
downloadqemu-4d7fe02be39e855cdd11376b4d17a85e343fd5c9.zip
qemu-4d7fe02be39e855cdd11376b4d17a85e343fd5c9.tar.gz
qemu-4d7fe02be39e855cdd11376b4d17a85e343fd5c9.tar.bz2
docs/devel: add some notes on tcg-icount for developers
This attempts to bring together my understanding of the requirements for icount behaviour into one reference document for our developer notes. Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Pavel Dovgalyuk <dovgaluk@ispras.ru> Cc: Peter Maydell <peter.maydell@linaro.org> Message-Id: <20200709141327.14631-3-alex.bennee@linaro.org>
Diffstat (limited to 'docs')
-rw-r--r--docs/devel/index.rst1
-rw-r--r--docs/devel/tcg-icount.rst97
2 files changed, 98 insertions, 0 deletions
diff --git a/docs/devel/index.rst b/docs/devel/index.rst
index 4ecaea3..ae6eac7 100644
--- a/docs/devel/index.rst
+++ b/docs/devel/index.rst
@@ -23,6 +23,7 @@ Contents:
decodetree
secure-coding-practices
tcg
+ tcg-icount
multi-thread-tcg
tcg-plugins
bitops
diff --git a/docs/devel/tcg-icount.rst b/docs/devel/tcg-icount.rst
new file mode 100644
index 0000000..8d67b6c
--- /dev/null
+++ b/docs/devel/tcg-icount.rst
@@ -0,0 +1,97 @@
+..
+ Copyright (c) 2020, Linaro Limited
+ Written by Alex Bennée
+
+
+========================
+TCG Instruction Counting
+========================
+
+TCG has long supported a feature known as icount which allows for
+instruction counting during execution. This should not be confused
+with cycle accurate emulation - QEMU does not attempt to emulate how
+long an instruction would take on real hardware. That is a job for
+other more detailed (and slower) tools that simulate the rest of a
+micro-architecture.
+
+This feature is only available for system emulation and is
+incompatible with multi-threaded TCG. It can be used to better align
+execution time with wall-clock time so a "slow" device doesn't run too
+fast on modern hardware. It can also provides for a degree of
+deterministic execution and is an essential part of the record/replay
+support in QEMU.
+
+Core Concepts
+=============
+
+At its heart icount is simply a count of executed instructions which
+is stored in the TimersState of QEMU's timer sub-system. The number of
+executed instructions can then be used to calculate QEMU_CLOCK_VIRTUAL
+which represents the amount of elapsed time in the system since
+execution started. Depending on the icount mode this may either be a
+fixed number of ns per instruction or adjusted as execution continues
+to keep wall clock time and virtual time in sync.
+
+To be able to calculate the number of executed instructions the
+translator starts by allocating a budget of instructions to be
+executed. The budget of instructions is limited by how long it will be
+until the next timer will expire. We store this budget as part of a
+vCPU icount_decr field which shared with the machinery for handling
+cpu_exit(). The whole field is checked at the start of every
+translated block and will cause a return to the outer loop to deal
+with whatever caused the exit.
+
+In the case of icount, before the flag is checked we subtract the
+number of instructions the translation block would execute. If this
+would cause the instruction budget to go negative we exit the main
+loop and regenerate a new translation block with exactly the right
+number of instructions to take the budget to 0 meaning whatever timer
+was due to expire will expire exactly when we exit the main run loop.
+
+Dealing with MMIO
+-----------------
+
+While we can adjust the instruction budget for known events like timer
+expiry we cannot do the same for MMIO. Every load/store we execute
+might potentially trigger an I/O event, at which point we will need an
+up to date and accurate reading of the icount number.
+
+To deal with this case, when an I/O access is made we:
+
+ - restore un-executed instructions to the icount budget
+ - re-compile a single [1]_ instruction block for the current PC
+ - exit the cpu loop and execute the re-compiled block
+
+The new block is created with the CF_LAST_IO compile flag which
+ensures the final instruction translation starts with a call to
+gen_io_start() so we don't enter a perpetual loop constantly
+recompiling a single instruction block. For translators using the
+common translator_loop this is done automatically.
+
+.. [1] sometimes two instructions if dealing with delay slots
+
+Other I/O operations
+--------------------
+
+MMIO isn't the only type of operation for which we might need a
+correct and accurate clock. IO port instructions and accesses to
+system registers are the common examples here. These instructions have
+to be handled by the individual translators which have the knowledge
+of which operations are I/O operations.
+
+When the translator is handling an instruction of this kind:
+
+* it must call gen_io_start() if icount is enabled, at some
+ point before the generation of the code which actually does
+ the I/O, using a code fragment similar to:
+
+.. code:: c
+
+ if (tb_cflags(s->base.tb) & CF_USE_ICOUNT) {
+ gen_io_start();
+ }
+
+* it must end the TB immediately after this instruction
+
+Note that some older front-ends call a "gen_io_end()" function:
+this is obsolete and should not be used.