Opened 7 years ago

Closed 6 years ago

#1395 closed defect (fixed)

scsh-process: tests hang

Reported by: Mario Domenech Goulart
Owned by:
Priority: major
Milestone: someday
Component: extensions
Version: 4.12.0
Keywords: scsh-process, tests, hang
Cc:
Estimated difficulty:

Description (last modified by Mario Domenech Goulart)

For some months now scsh-process' tests have been randomly hanging on the Linux salmonella machines (maybe also on the FreeBSD machine?). When the test script hangs, it blocks the whole salmonella execution. On the Linux salmonella machines, it hangs at least twice per week (that's a very conservative estimate).

I think the problem started around June 9. I noticed it on the 14th (I know that because I checked the IRC log from when I mentioned it). There is a gap between the 9th and the 14th in the logs of the salmonella Linux x86-64 machine: https://salmonella-linux-x86-64.call-cc.org/master-debugbuild/gcc/linux/x86-64/2017/06/ The logs of the FreeBSD machine also have a gap starting on the 9th: https://salmonella-freebsd-x86-64.call-cc.org/master/clang/freebsd/x86-64/2017/06/

Apparently, the chicken core repo didn't have any relevant change around that time.

The issue might be related to releases 0.8 (2017-05-23) or 0.8.1 (2017-06-03) of scsh-process. The change in release 0.8.1 looks particularly suspicious (86310f955).

I've installed a hack on the Linux salmonella machines to periodically check if scsh-process' test script is hanging and kill the parent of the defunct process (see ec0cffb94f in the chicken-infrastructure repo).
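For illustration only, a watchdog of this kind could be sketched roughly as below. This is not the script from ec0cffb94f; the function name zombie_parents is made up here, and the sketch assumes a procps-style ps that accepts -eo pid=,ppid=,stat=,comm=. It only lists the parents of defunct processes; the kill step is left as a comment and would have to be narrowed to the scsh-process test run before being used.

```shell
# Print the parent PID of every defunct (zombie) process, one per line.
# Expects "PID PPID STAT COMMAND" lines on stdin, for example the output
# of: ps -eo pid=,ppid=,stat=,comm=
zombie_parents() {
    # A process state starting with "Z" marks a defunct process;
    # field 2 is the PID of its parent.
    awk '$3 ~ /^Z/ { print $2 }' | sort -u
}

# Hypothetical periodic use (e.g. from cron), reporting only:
#   ps -eo pid=,ppid=,stat=,comm= | zombie_parents
# Killing the reported parents would additionally need a check that they
# belong to the hanging test script, which this sketch does not do.
```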

The bad news is that I cannot reproduce the issue on my machine. I tried to run csi -s run.scm < /dev/null > /dev/null 2>&1 in a loop and it ran flawlessly for more than 3500 iterations.
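A loop like that could look roughly like the sketch below. It is not the exact loop used for this ticket: the function name run_until_hang, the iteration count, and the time limit are assumptions, and it relies on timeout from GNU coreutils so that a hanging run is reported instead of blocking the loop forever.

```shell
# Run a command repeatedly and stop at the first run that exceeds the limit.
# Usage: run_until_hang ITERATIONS SECONDS COMMAND [ARGS...]
run_until_hang() {
    iterations=$1
    limit=$2
    shift 2
    i=1
    while [ "$i" -le "$iterations" ]; do
        status=0
        # timeout(1) exits with status 124 when it had to kill the command.
        timeout "$limit" "$@" < /dev/null > /dev/null 2>&1 || status=$?
        if [ "$status" -eq 124 ]; then
            echo "iteration $i: no exit within ${limit}s (possible hang)"
            return 1
        fi
        i=$((i + 1))
    done
    echo "completed $iterations iterations without a hang"
}

# Example with hypothetical limits, run from the egg's tests directory:
#   run_until_hang 3500 60 csi -s run.scm
```

Other non-zero exit statuses (ordinary test failures) are deliberately ignored here, since the goal is only to catch runs that never finish.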

Change History (3)

comment:1 Changed 7 years ago by Mario Domenech Goulart

Description: modified (diff)

comment:2 Changed 7 years ago by Mario Domenech Goulart

Description: modified (diff)

comment:3 Changed 6 years ago by Mario Domenech Goulart

Resolution: fixed
Status: new → closed

Version 1.1.0 of scsh-process seems to fix the issue. At least I could not make tests hang anymore on the test machines where I could previously reproduce the problem.

On my notebook (Debian Jessie, 8 CPU cores), tests ran for more than 5500 iterations without problems.

On the salmonella-linux-x86-64 machine (Debian Stretch, 4 CPU cores), tests ran for more than 5500 iterations without problems.

On the salmonella-linux-x86 machine (Debian Stretch, 1 CPU core), tests ran for more than 2000 iterations without problems.
