Debug: suppress `PRNG seed ...` log messages when `gdbserver.py --list-tests <target>` used
|
|
|
|
|
|
testlib.py:run_all_tests()
|
|
https://github.com/riscv-software-src/riscv-tests/pull/531#issuecomment-2151081139
|
|
HwbpManual test was broken:
* Value read back from `tselect` was compared with `tdata1` value.
https://github.com/riscv-software-src/riscv-tests/blob/408e461da11e0b298c4b69e587729532787212f5/debug/gdbserver.py#L701-L703
This resulted in the test being reported as not supported, after all the
triggers were checked.
* `tdata1.type` field was not set to `mcontrol`.
* `tselect` value used to be changed by `handle_reset` and not restored.
https://github.com/riscv-software-src/riscv-tests/blob/408e461da11e0b298c4b69e587729532787212f5/debug/programs/entry.S#L79-L84
* Manual breakpoint used to be left behind.
Signed-off-by: Evgeniy Naydanov <evgeniy.naydanov@syntacore.com>
|
|
debug: workaround for sporadic failures of some tests due to unexpected data present in pexpect match
|
|
Memory sampling tests fail sporadically for spike targets. A typical
failure looks as follows (ROI from test log):
```
---------------------------------[ Message ]----------------------------------
139670831 not less than 124104544
--------------------------------[ Traceback ]---------------------------------
... SECTION IS SKIPPED FOR READABILITY ...
raise TestFailed(f"{a!r} not less than {b!r}", comment)
testlib.TestFailed
```
A few observations:
- 139670831 is 0x0853352f in hex, while 124104544 is 0x0765af60
- Now, the assert which is failing corresponds to the following
expression:
```
assertLess(value, previous_value + tolerance)
```
- The tolerance is `0x500000`, so the previous value is 124104544 - 0x500000 = 0x0715af60.
If we look at the sampling output for such failing test, we'll see:
```
...
0x1212340c5c: 0x0715af60
timestamp after: 878087500
timestamp before: 878088133
0x1212340c5c: 0x0853352f
...
```
The log above demonstrates the reason for the failure. Since memory
sampling occurs on every poll (which by default happens approximately
every 100ms), the counter may advance by more than the tolerance if the
time between subsequent polls increases (for whatever reason).
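The numbers from the log can be plugged into the failing comparison to confirm this. The snippet below only checks the arithmetic; the variable names mirror the `assertLess` expression above and are not taken from the test source.

```python
# Values copied from the failing log above.
TOLERANCE = 0x500000
previous_value = 0x0715af60   # sample taken before the delayed poll
value = 0x0853352f            # sample taken after the delayed poll

threshold = previous_value + TOLERANCE
assert threshold == 124104544  # right-hand side in the failure message
assert value == 139670831      # left-hand side in the failure message

# The comparison performed by assertLess(value, previous_value + tolerance).
within_tolerance = value < threshold
print(within_tolerance)  # False

# How far the counter advanced between the two samples, next to the tolerance.
print(hex(value - previous_value), hex(TOLERANCE))
```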
In my opinion the failing assert can be safely removed, since the check
it performs is quite brittle and cannot be generalized. Whether the assert
is violated depends on CPU performance and sporadic delays between
polls.
For now, instead of removing the assert, we just skip the checks between
memory sample bursts. This way we can still be certain that memory
samples are frequent enough, and hopefully this will avoid sporadic
failures.
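A minimal sketch of the idea, not the actual change: it assumes that bursts are delimited by the `timestamp ...` lines seen in the sampling output above, and the helper names `split_into_bursts` and `samples_increase_slowly` are hypothetical.

```python
import re

# Matches sample lines such as "0x1212340c5c: 0x0715af60".
SAMPLE_RE = re.compile(r"^0x[0-9a-fA-F]+:\s+(0x[0-9a-fA-F]+)$")

def split_into_bursts(lines):
    """Group sampled values into bursts separated by timestamp lines."""
    bursts, current = [], []
    for line in lines:
        line = line.strip()
        match = SAMPLE_RE.match(line)
        if match:
            current.append(int(match.group(1), 16))
        elif line.startswith("timestamp") and current:
            # A timestamp line closes the burst collected so far.
            bursts.append(current)
            current = []
    if current:
        bursts.append(current)
    return bursts

def samples_increase_slowly(bursts, tolerance=0x500000):
    """Compare neighbouring samples only when they share a burst."""
    return all(
        cur < prev + tolerance
        for burst in bursts
        for prev, cur in zip(burst, burst[1:])
    )

log = [
    "0x1212340c5c: 0x0715af60",
    "timestamp after: 878087500",
    "timestamp before: 878088133",
    "0x1212340c5c: 0x0853352f",
]
bursts = split_into_bursts(log)
print(samples_increase_slowly(bursts))  # True
```

With this grouping, the two samples from the failing log land in different bursts and are never compared with each other.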
|
|
present in pexpect match
Problem was observed on UnavailableMultiTest - this test was sporadically failing.
When the failure was observed the log of the failing test looked as follows:
```
File "/whatever/RISCVTests/debug/testlib.py", line 504, in <genexpr>
if all(targets[hart.id]["State"] == "running" for hart in harts):
~~~~~~~~~~~~~~~~^^^^^^^^^
KeyError: 'State'
```
I added this modification to testlib.py:
```
--- a/debug/testlib.py
+++ b/debug/testlib.py
@@ -498,6 +498,10 @@ class Openocd:
for line in lines[2:]:
if line.strip():
data.append(dict(zip(headers, line.split()[1:])))
+ str_data = str(data)
+ sys.stdout.flush()
+ sys.stdout.write(f"parsed targets:\n{result}\n===\n{str_data}\n---\n")
+ sys.stdout.flush()
return data
```
It allowed me to root-cause the issue. Namely, we have the following
situation:
```
parsed targets:
Exception ignored in: <function _TemporaryFileCloser.__del__ at 0x7f2dee64d1c0>
Traceback (most recent call last):
File "/usr/local/lib/python3.11/tempfile.py", line 450, in __del__
self.close()
File "/usr/local/lib/python3.11/tempfile.py", line 446, in close
unlink(self.name)
FileNotFoundError: [Errno 2] No such file or directory: '/tmp/gdb@38873-8s6ud03x.log'
...
TargetName Type Endian TapName State
-- ------------------ ---------- ------ ------------------ ------------
0 riscv.cpu0 riscv little riscv.cpu running
1* riscv.cpu1 riscv little riscv.cpu running
===
[{'Exception': '"/usr/local/lib/python3.11/tempfile.py",', 'ignored': 'line', 'in:': '450,', ...
```
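The mangled dictionary can be reproduced with a small sketch of the parsing loop. Only the three lines inside the `for` loop come from the diff above; taking `headers` from the first line of the output, the `parse_targets` name, and the exact column spacing of the sample table are assumptions made for illustration.

```python
def parse_targets(result):
    """Parse `targets` output the way the diff above suggests (sketch)."""
    lines = result.splitlines()
    # Assumption: column names are taken from the first line of the output.
    headers = lines[0].split()
    data = []
    # Skip the header and separator rows; drop the leading index column.
    for line in lines[2:]:
        if line.strip():
            data.append(dict(zip(headers, line.split()[1:])))
    return data

CLEAN = "\n".join([
    "    TargetName         Type       Endian TapName            State",
    "--  ------------------ ---------- ------ ------------------ ------------",
    " 0  riscv.cpu0         riscv      little riscv.cpu          running",
    " 1* riscv.cpu1         riscv      little riscv.cpu          running",
])

# The same table, preceded by the stray stderr output from the log above.
POLLUTED = "\n".join([
    "Exception ignored in: <function _TemporaryFileCloser.__del__ at 0x7f2dee64d1c0>",
    "Traceback (most recent call last):",
    '  File "/usr/local/lib/python3.11/tempfile.py", line 450, in __del__',
    "    self.close()",
]) + "\n" + CLEAN

print(parse_targets(CLEAN)[0]["State"])        # running
print("State" in parse_targets(POLLUTED)[0])   # False
```

Once the first line is no longer the table header, every row is keyed by words from the exception message, which is why the lookup of `"State"` raises `KeyError`.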
The only reasonable (to me) explanation for the observed behavior is below.
Here is how we connect to TCL-RPC server:
```
self.openocd_cli = pexpect.spawn(f"nc localhost {self.tclrpc_port}")
tty.setraw(self.openocd_cli.child_fd)
```
Later we request target list by issuing "targets" command:
```
self.command("targets")
```
Internally, pexpect.spawn is implemented as follows:
- we fork the process
- set up pty and then call execve
- all these steps are written in python
"Exception ignored" messages are the result of exceptions thrown from
finalizers of NamedTemporaryFile objects. When an exception is thrown from
a finalizer, Python unconditionally prints a "warning" to stderr. It
seems that these messages pollute our output from "nc", since the Python
GC can be invoked before the execve syscall.
The workaround is just to make sure that execve was executed before we
rely on the format of command output. To have such a guarantee we just
issue a dummy "echo" command and check that we have a proper reply in the
output stream.
While this explanation looks convincing, the behavior above still looks
strange, given that we have https://bugs.python.org/issue14548 which
was resolved long ago.
However, the proposed workaround fixes the issue.
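The idea behind the workaround, sending a dummy command and discarding everything up to its reply before trusting the output format, can be sketched without pexpect. The `drain_until_marker` helper, the `read_chunk` callable and the `sync-marker` string are hypothetical and only stand in for the real connection.

```python
def drain_until_marker(read_chunk, marker, max_reads=100):
    """Read and discard output until `marker` has been seen.

    `read_chunk` is any callable returning the next piece of output as str.
    Returns whatever followed the marker in the data read so far.
    """
    buffer = ""
    for _ in range(max_reads):
        buffer += read_chunk()
        position = buffer.find(marker)
        if position != -1:
            return buffer[position + len(marker):]
    raise TimeoutError(f"marker {marker!r} not seen in output")

# Simulated stream: stderr noise from the forked child, then the echo reply.
chunks = iter([
    "Exception ignored in: <function _TemporaryFileCloser.__del__>\n",
    "FileNotFoundError: [Errno 2] No such file or directory\n",
    "sync-marker\n",
])
leftover = drain_until_marker(lambda: next(chunks), "sync-marker\n")
print(repr(leftover))  # ''
```

Anything printed before the marker is thrown away, so later commands such as `targets` start from a clean stream.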
|
|
Remove old warning check in RepeatReadTest
|
|
debug: Fix nonexistent trigger registers trap handle in entry.S
|
|
This patch fixes the case when we are using an empty exception list (for example just a YAML file with comments but without any test items to skip).
|
|
The Spike simulator is very demanding of CPU resources. This causes debug
tests to sporadically fail on slower machines. Increasing gdb's
`remotetimeout` should get rid of such failures, unless we run the
testsuite on a potato.
The only downside is that if OpenOCD is broken, tests can run longer.
However, I think this is a sacrifice we can make, since execution time
is not affected if everything works as expected.
|
|
|
|
Check the mcontrol triggers, no other triggers.
|
|
Clear breakpoints so that gdb will not single step
|
|
improvements to debug tests infrastructure to help with triaging process
|
|
Add virtual memory synchronization after completing the page tables
|
|
Signed-off-by: liangzhen <zhen.liang@spacemit.com>
Change-Id: Iac914aef8080411e6acd9039c4bdfa728533103c
|
|
Signed-off-by: liangzhen <zhen.liang@spacemit.com>
Change-Id: I7a4a24972cfa2ddc307a5f06fe3fd5380794719f
|
|
Signed-off-by: liangzhen <zhen.liang@spacemit.com>
Change-Id: Ida1490338d204541c5c7f143aec3b8d79d83d7f4
|
|
fixes setting of `remotetimeout`. It was silently overwritten by default
values from platform definition even if user specified one.
|
|
introduce a new option to log communications over GDB remote serial
protocol which is helpful for debugging some tests.
|
|
Previously the seed was not printed, which made issues hard to
reproduce. It's still not ideal, since interactions between
spike/gdb/openocd are inherently non-deterministic (time is involved),
but at least we should now get the same sources for the same seed.
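A minimal sketch of the pattern, assuming a hypothetical `make_rng` helper and an illustrative message format: the seed is logged before it is used, so a failing run can be repeated with the same value.

```python
import random
import time

def make_rng(seed=None):
    """Create a PRNG and print its seed so that a run can be reproduced."""
    if seed is None:
        # No seed requested: derive one from the clock.
        seed = int(time.time() * 1000) & 0xFFFFFFFF
    print(f"PRNG seed: {seed}")
    return random.Random(seed)

# Re-running with the printed seed yields the same pseudo-random choices.
first = make_rng(1234)
second = make_rng(1234)
print(first.randrange(1000) == second.randrange(1000))  # True
```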
|
|
Signed-off-by: liangzhen <zhen.liang@spacemit.com>
|
|
log file
Quick and dirty fix for https://github.com/riscv-software-src/riscv-tests/issues/520
|
|
This is taking into account that the hardware limits count to 1.
Signed-off-by: liangzhen <zhen.liang@spacemit.com>
|
|
Make the non-existent csr configurable
|
|
Make CLINT address configurable
|
|
Signed-off-by: liangzhen <zhen.liang@spacemit.com>
|
|
Signed-off-by: liangzhen <zhen.liang@spacemit.com>
|
|
Test behavior when a hart becomes unavailable while halted.
|
|
Resolves #510.
|
|
Disable timer interrupt to fix some bugs
|
|
Signed-off-by: liangzhen <zhen.liang@spacemit.com>
|
|
debug: Add Openocd.set_available()
|
|
debug: Better interlock when interacting with gdb CLI.
|
|
This helper uses dmi_write commands to mark harts
available/unavailable.
|
|
debug: Make Openocd.targets() tolerate blank lines.
|
|
debug: Fix interrupt_all() to restore state.
|
|
Actually wait for the command to be echoed back. This means we won't be
confused if there are extra newlines in gdb output.
|
|
|
|
|
|
This lets you reproduce a test running on a specific hart.
|
|
During the github workflow this character is \n, while on my computer
it's ' '. I'm sure there's a good reason for that, but it doesn't seem
worth figuring out what that reason is.
|
|
|
|
Just doing this to make a change in the debug files, which should now
cause the pylint workflow to execute.
|
|
They have issues when run in a github workflow.
|
|
debug: Test OpenOCD behavior when harts become unavailable, using new spike mechanism
|