#1380 closed defect (invalid)
make check fails on Alpine Linux ppc64le
Reported by: | Roberto Oliveira | Owned by: | |
---|---|---|---|
Priority: | major | Milestone: | someday |
Component: | compiler | Version: | 4.12.0 |
Keywords: | | Cc: | |
Estimated difficulty: | | | |
Description
I compiled CHICKEN 4.12.0 on Alpine Linux ppc64le (make PLATFORM=linux), and when I tried to run the tests I got the following error:
$ make PLATFORM=linux check
cd tests; sh runtests.sh
======================================== version tests ...
Checking major and minor version numbers against chicken-version... ok
Checking the registered feature chicken-<major>.<minor>... ok
======================================== compiler tests ...
/home/alpine/aports/community/chicken/src/chicken-4.12.0/tests/../chicken 'compiler-tests.scm' -output-file 'a.c' -types ../types.db -ignore-repository -verbose -include-path /home/alpine/aports/community/chicken/src/chicken-4.12.0/tests/..
[panic] unrecoverable segmentation violation - execution terminated
I saw in the docs that CHICKEN is supported on the PowerPC platform, but I am not sure whether it works with musl libc.
Attachments (1)
Change History (15)
comment:1 Changed 7 years ago by
comment:2 Changed 7 years ago by
Are you absolutely sure you're on 4.12.0? Because in that version, the line on which it's failing reads:
serious_signal_occurred = 0;
That's just an assignment to a global variable. So if that's really the crash point, something very fishy is going on.
In any case, PowerPC is certainly supported, I test occasionally on PPC 32 bit. I don't have access to a 64-bit PowerPC machine, unfortunately.
Can you build a fresh CHICKEN with DEBUGBUILD=1 on each make invocation and try again? This adds lots of debugging info (it builds with -g but also adds several sanity checks that could catch errors sooner).
comment:3 Changed 7 years ago by
Yes, I am using version 4.12.0.
I compiled with what you suggested and when I tried to run ./csc -help, I got the same error.
It is really pointing to runtime.c:1505, but I noticed that before this line it does a call to "C_sigsetjmp(C_restart, 0);" and maybe it is corrupting the stack?
I tested CHICKEN 4.12.0 on Ubuntu (which uses glibc) and it worked, but on Alpine (which uses musl) I am getting this issue.
I have access to a ppc64le machine running Alpine, in case you need/want it.
comment:4 Changed 7 years ago by
I think that would be helpful. So far I've tried to build CHICKEN with musl (on x86_64), but failed due to Debian's musl package being totally crippled.
If I send you my SSH public key via a PGP-signed e-mail, do you consider that an acceptable level of security? My PGP key can be found at http://www.more-magic.net/peter-bex.asc and it's been signed by several other core committers.
comment:5 Changed 7 years ago by
I guess so. By the way, I will ping you in the 'chicken' channel on freenode. My username is 'rdutra'.
comment:6 Changed 7 years ago by
Resolution: | → invalid |
---|---|
Status: | new → closed |
rgdoliveira noticed that after the setjmp by the GC, register r2 contains a strange value.
According to the PowerPC processor ABI supplement, r2 contains a TOC which acts as a base pointer to the local data section (all the static stuff in a compilation unit).
When linking different compilation units together (and only if they're linked position-independently?) function calls may be patched up by the linker to restore r2 after returning. There's some blathering about function pointers being treated specially because they need to carry their TOC inside them, so the situations in which this makes a difference seem to be pretty specific.
The musl code for longjmp does not seem to restore r2 after returning, so it's quite likely that the crash on the local variable assignment is caused by this.
Therefore, I think this is not a CHICKEN issue but a musl issue. Please reopen if you disagree.
comment:7 Changed 7 years ago by
sjamaan,
The musl longjmp code you linked here is from powerpc (the 32-bit version). The implementation for ppc64le is this one: http://git.musl-libc.org/cgit/musl/tree/src/setjmp/powerpc64/longjmp.s and it does restore r2.
I will reopen the bug for now.
comment:8 Changed 7 years ago by
Resolution: | invalid |
---|---|
Status: | closed → reopened |
comment:9 Changed 7 years ago by
The problem seems to be that the call to C_alloc in runtime.c:1512 returns a pointer into an already used portion of the stack frame. The following C_memcpy will overwrite the register %r2 stored in $r1+24 after the C_setjmp in runtime.c:1502.
To reproduce the problem: set a breakpoint in runtime.c:1505, and continue until this line is executed the second time. The first non-local return from the setjmp is fine, then the memcpy will overwrite part of the stack frame and the next non-local return will pick up a clobbered value for $r2 and other registers later, including $r1, I think.
What causes this is not clear to me, perhaps an invalid interaction between musl and the C compiler, or even a compiler bug, as musl seems to use __builtin_alloca when compiled with gcc.
comment:10 Changed 7 years ago by
I compiled this test program and it crashes after the 1st longjmp.
I'm not sure if I'm doing something wrong, or whether alloca and setjmp are not supposed to work in this way. On my x86 machine this runs fine.
Changed 7 years ago by
simple test program using setjmp/alloca in the manner used in CHICKEN_run
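The attached program itself is not reproduced in this ticket text. Purely as an illustration, a minimal sketch of the setjmp/alloca pattern discussed in comments 9 and 10 might look like the following. All names (restart, trampoline, run_rounds) are hypothetical and are not taken from the attachment or from runtime.c, and the sketch assumes a platform that provides <alloca.h>.

```c
#include <alloca.h>
#include <assert.h>
#include <setjmp.h>
#include <string.h>

static jmp_buf restart;            /* hypothetical stand-in for C_restart */

/* Simulates a continuation that returns to the driver via longjmp. */
static void trampoline(const long *args, int n) {
    assert(n == 4 && args[0] == 1);    /* the copied arguments arrived intact */
    longjmp(restart, 1);
}

/* Returns the number of non-local returns observed. */
int run_rounds(int rounds) {
    volatile int seen = 0;         /* volatile: modified between setjmp and longjmp */
    long saved[4] = {1, 2, 3, 4};

    if (setjmp(restart) != 0)
        seen++;                    /* we came back through longjmp */

    if (seen < rounds) {
        /* Allocate in the SAME frame that called setjmp, then copy the
           arguments there; this is the step suspected of clobbering
           the saved r2 slot on the affected musl/ppc64le build. */
        long *p = alloca(sizeof saved);
        memcpy(p, saved, sizeof saved);
        trampoline(p, 4);
    }
    return seen;
}
```

Calling run_rounds(3) from a main function should return 3 on a platform where setjmp, longjmp and alloca interact correctly; on the affected setup one would expect a crash after the first longjmp instead, as reported above.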
comment:11 Changed 7 years ago by
I tried this also on arm and on the test machine (not in the VM), but linked with glibc on the latter and both work fine. I still don't quite get what the problem is, but can only assume now that the ppc implementation of setjmp in musl must be broken.
comment:12 Changed 7 years ago by
I sent a bug report to the musl maintainers: http://www.openwall.com/lists/musl/2017/07/31/5
(see also followups)
I think a workaround may be possible by factoring out the part that copies the argvector (done in CHICKEN_run and C_callback) into a separate function, and thus into a new stack frame that is not shared with the one in which the setjmp is done. Do we want to do this?
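A rough sketch of what such a workaround could look like, under the assumption that the copy only needs to live until the next longjmp. The names restart_point, copy_args_and_jump and run_with_helper are hypothetical and do not exist in runtime.c; the noinline attribute is a GCC/Clang extension used here so the helper really gets its own frame.

```c
#include <alloca.h>
#include <assert.h>
#include <setjmp.h>
#include <string.h>

static jmp_buf restart_point;      /* hypothetical stand-in for C_restart */

/* Hypothetical helper: the alloca'd copy lives in this function's own
   stack frame, not in the frame of the function that called setjmp. */
__attribute__((noinline))
static void copy_args_and_jump(const long *src, size_t n) {
    long *copy = alloca(n * sizeof *copy);
    memcpy(copy, src, n * sizeof *copy);
    assert(copy[0] == src[0]);     /* a real runtime would invoke the continuation here */
    longjmp(restart_point, 1);
}

/* Returns the number of non-local returns observed. */
int run_with_helper(int rounds) {
    volatile int seen = 0;         /* volatile: modified between setjmp and longjmp */
    const long args[3] = {10, 20, 30};

    if (setjmp(restart_point) != 0)
        seen++;
    if (seen < rounds)
        copy_args_and_jump(args, 3);   /* no alloca in the setjmp frame */
    return seen;
}
```

The design point is only that the frame holding the jmp_buf's caller never grows via alloca after setjmp has run; whether this would actually avoid the crash on the affected platform was not verified in this ticket.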
comment:13 Changed 7 years ago by
Resolution: | → invalid |
---|---|
Status: | reopened → closed |
I don't think a workaround is worthwhile. It only happens on one uncommon platform with one C library that's not very common either and it looks like musl has fixed it.
comment:14 Changed 7 years ago by
Thanks for your help, sjamaan and felix!
The fix provided by musl worked, and now CHICKEN builds fine on Alpine ppc64le and the tests run.
Just for the record, the ppc64le package is available at: http://rsync.alpinelinux.org/alpine/edge/community/ppc64le/chicken-4.12.0-r2.apk
Just an update, the backtrace from gdb shows:
(gdb) bt
#0 0x00003fffb7e59c68 in CHICKEN_run (toplevel=<optimized out>) at runtime.c:1505
#1 0x00003fffb7e5ba20 in CHICKEN_main (argc=<optimized out>, argv=<optimized out>,
#2 0x0000000020044720 in main (argc=<optimized out>, argv=<optimized out>) at chicken.c:1151