Opened 7 years ago
Closed 7 years ago
#1395 closed defect (fixed)
scsh-process: tests hang
Reported by: | Mario Domenech Goulart | Owned by: | |
---|---|---|---|
Priority: | major | Milestone: | someday |
Component: | extensions | Version: | 4.12.0 |
Keywords: | scsh-process, tests, hang | Cc: | |
Estimated difficulty: |
Description
For some months now scsh-process' tests have been randomly hanging on the Linux salmonella machines (maybe also on the FreeBSD machine?). When the test script hangs, it blocks the whole salmonella execution. On the Linux salmonella machines, it hangs at least twice per week (that's a very conservative estimate).
I think the problem started around June 9. I first noticed it on the 14th (I know that because I checked the IRC log from when I mentioned it). There is a gap between the 9th and the 14th in the salmonella Linux x86-64 reports: https://salmonella-linux-x86-64.call-cc.org/master-debugbuild/gcc/linux/x86-64/2017/06/ The logs of the FreeBSD machine also have a gap starting on the 9th: https://salmonella-freebsd-x86-64.call-cc.org/master/clang/freebsd/x86-64/2017/06/
Apparently, the CHICKEN core repository had no relevant changes around that time.
The issue might be related to releases 0.8 (2017-05-23) or 0.8.1 (2017-06-03) of scsh-process. The change in release 0.8.1 looks particularly suspicious (86310f955).
I've installed a hack on the Linux salmonella machines to periodically check if scsh-process' test script is hanging and kill the parent of the defunct process (see ec0cffb94f in the chicken-infrastructure repo).
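The watchdog idea described above (find a defunct child and kill its parent) could be sketched roughly as follows. This is not the contents of ec0cffb94f; the function name `hung_parents`, the pattern argument, and the `ps` column layout are assumptions made for illustration only.

```shell
#!/bin/sh
# Hypothetical helper: print the PIDs of processes that have at least one
# defunct (zombie) child and whose command line matches PATTERN.
# Expects "PID PPID STAT ARGS..." lines on stdin, for example from
#   ps -A -o pid=,ppid=,stat=,args=
hung_parents() {
    pattern=$1
    awk -v pat="$pattern" '
        {
            pid = $1; ppid = $2; stat = $3
            # Rebuild the command line from the remaining fields.
            args = ""
            for (i = 4; i <= NF; i++) args = args (i > 4 ? " " : "") $i
            cmd[pid] = args
            # A state starting with "Z" marks a defunct process.
            if (stat ~ /^Z/) zombie_parent[ppid] = 1
        }
        END {
            for (p in zombie_parent)
                if ((p in cmd) && cmd[p] ~ pat) print p
        }
    ' | sort -n
}
```

A periodic job could then run something like `for pid in $(ps -A -o pid=,ppid=,stat=,args= | hung_parents 'csi -s run[.]scm'); do kill "$pid"; done`. Matching on the parent's command line is a guess at how to avoid killing unrelated processes that happen to have zombie children.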
The bad news is that I cannot reproduce the issue on my machine. I tried to run csi -s run.scm < /dev/null > /dev/null 2>&1 in a loop and it ran flawlessly for more than 3500 iterations.
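A reproduction loop like the one described above might look like the following sketch. The function name `run_loop` and the per-iteration time limit are assumptions, and it relies on `timeout(1)` (GNU coreutils, also shipped with recent FreeBSD) so that a hang ends the loop instead of blocking it.

```shell
#!/bin/sh
# Hypothetical helper: run a command repeatedly, stopping at the first
# failure or hang.
# Usage: run_loop ITERATIONS TIMEOUT_SECONDS COMMAND [ARGS...]
run_loop() {
    iterations=$1
    limit=$2
    shift 2
    i=1
    while [ "$i" -le "$iterations" ]; do
        # timeout(1) exits with status 124 when the time limit is reached.
        timeout "$limit" "$@" < /dev/null > /dev/null 2>&1
        status=$?
        if [ "$status" -eq 124 ]; then
            echo "iteration $i: hung (killed after ${limit}s)"
            return 1
        elif [ "$status" -ne 0 ]; then
            echo "iteration $i: failed with status $status"
            return 1
        fi
        i=$((i + 1))
    done
    echo "completed $iterations iterations"
}
```

For example, `run_loop 5000 300 csi -s run.scm` would run the test script up to 5000 times and report the first iteration that exceeds five minutes; the 300-second limit is an arbitrary choice and should comfortably exceed a normal test run.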
Change History (3)
comment:1 Changed 7 years ago by
Description: | modified (diff) |
---|
comment:2 Changed 7 years ago by
Description: | modified (diff) |
---|
comment:3 Changed 7 years ago by
Resolution: | → fixed |
---|---|
Status: | new → closed |
Version 1.1.0 of scsh-process seems to fix the issue. At least, I could no longer make the tests hang on the test machines where I could previously reproduce the problem.
On my notebook (Debian Jessie, 8 CPU cores), tests ran for more than 5500 iterations without problems.
On the salmonella-linux-x86-64 machine (Debian Stretch, 4 CPU cores), tests ran for more than 5500 iterations without problems.
On the salmonella-linux-x86 machine (Debian Stretch, 1 CPU core), tests ran for more than 2000 iterations without problems.