Opened 3 months ago

Closed 4 weeks ago

Last modified 4 weeks ago

#1380 closed defect (invalid)

make check fails on Alpine Linux ppc64le

Reported by: rgdoliveira
Owned by:
Priority: major
Milestone: someday
Component: compiler
Version: 4.12.0
Keywords:
Cc:
Estimated difficulty:

Description

I compiled CHICKEN 4.12.0 on Alpine Linux ppc64le (make PLATFORM=linux), and when I try to run the tests I get the following error:

$ make PLATFORM=linux check

cd tests; sh runtests.sh
======================================== version tests ...
Checking major and minor version numbers against chicken-version... ok
Checking the registered feature chicken-<major>.<minor>... ok
======================================== compiler tests ...
/home/alpine/aports/community/chicken/src/chicken-4.12.0/tests/../chicken 'compiler-tests.scm' -output-file 'a.c' -types ../types.db -ignore-repository -verbose -include-path /home/alpine/aports/community/chicken/src/chicken-4.12.0/tests/..
[panic] unrecoverable segmentation violation - execution terminated

I saw in the docs that CHICKEN is supported on the PowerPC platform, but I am not sure whether it works with musl libc.

Attachments (1)

x.c (317 bytes) - added by felix 8 weeks ago.
simple test program using setjmp/alloca in the manner used in CHICKEN_run

Change History (15)

comment:1 Changed 2 months ago by rgdoliveira

Just an update, the backtrace from gdb shows:

(gdb) bt
#0  0x00003fffb7e59c68 in CHICKEN_run (toplevel=<optimized out>) at runtime.c:1505
#1  0x00003fffb7e5ba20 in CHICKEN_main (argc=<optimized out>, argv=<optimized out>, toplevel=0x20045500 <C_toplevel>) at runtime.c:590
#2  0x0000000020044720 in main (argc=<optimized out>, argv=<optimized out>) at chicken.c:1151

comment:2 Changed 2 months ago by sjamaan

Are you absolutely sure you're on 4.12.0? Because in that version, the line on which it's failing reads:

  serious_signal_occurred = 0;

That's just an assignment to a global variable. So if that's really the crash point, something very fishy is going on.

In any case, PowerPC is certainly supported, I test occasionally on PPC 32 bit. I don't have access to a 64-bit PowerPC machine, unfortunately.

Can you build a fresh CHICKEN with DEBUGBUILD=1 on each make invocation and try again? This adds lots of debugging info (it builds with -g but also adds several sanity checks that could catch errors sooner)

comment:3 Changed 2 months ago by rgdoliveira

Yes, I am using version 4.12.0.

I compiled with what you suggested and when I tried to run ./csc -help, I got the same error.

It is really pointing to runtime.c:1505, but I noticed that before this line it does a call to "C_sigsetjmp(C_restart, 0);" and maybe it is corrupting the stack?

I tested CHICKEN 4.12.0 on Ubuntu (which uses glibc) and it worked, but on Alpine (which uses musl) I get this issue.

I have access to a ppc64le machine running Alpine, in case you need/want it.

comment:4 Changed 2 months ago by sjamaan

I think that would be helpful. So far I've tried to build CHICKEN with musl (on x86_64), but failed due to Debian's musl package being totally crippled.

If I send you my SSH public key via a PGP-signed e-mail, do you consider that an acceptable level of security? My PGP key can be found at http://www.more-magic.net/peter-bex.asc and it's been signed by several other core committers.

comment:5 Changed 2 months ago by rgdoliveira

I guess so. By the way, I will ping you in the 'chicken' channel on freenode. My user is 'rdutra'.

comment:6 Changed 2 months ago by sjamaan

  • Resolution set to invalid
  • Status changed from new to closed

rgdoliveira noticed that after the setjmp by the GC, register r2 contains a strange value.

According to the PowerPC processor ABI supplement, r2 contains a TOC which acts as a base pointer to the local data section (all the static stuff in a compilation unit).

When linking different compilation units together (and only if they're linked position-independently?) function calls may be patched up by the linker to restore r2 after returning. There's some blathering about function pointers being treated specially because they need to carry their TOC inside them, so the situations in which this makes a difference seem to be pretty specific.

The musl code for longjmp does not seem to restore r2 after returning, so it's quite likely that the crash on the local variable assignment is caused by this.

Therefore, I think this is not a CHICKEN issue but a musl issue. Please reopen if you disagree.

comment:7 Changed 2 months ago by rgdoliveira

sjamaan,

The musl longjmp code you linked is for powerpc (the 32-bit version). The implementation for ppc64le is this one: http://git.musl-libc.org/cgit/musl/tree/src/setjmp/powerpc64/longjmp.s and it restores r2.

I will reopen the bug for now.

comment:8 Changed 2 months ago by rgdoliveira

  • Resolution invalid deleted
  • Status changed from closed to reopened

comment:9 Changed 8 weeks ago by felix

The problem seems to be that the call to C_alloc in runtime.c:1512 returns a pointer into an already used portion of the stack frame. The following C_memcpy will overwrite the register %r2 stored in $r1+24 after the C_setjmp in runtime.c:1502.

To reproduce the problem: set a breakpoint in runtime.c:1505, and continue until this line is executed the second time. The first non-local return from the setjmp is fine, then the memcpy will overwrite part of the stack frame and the next non-local return will pick up a clobbered value for $r2 and other registers later, including $r1, I think.

What causes this is not clear to me, perhaps an invalid interaction between musl and the C compiler, or even a compiler bug, as musl seems to use __builtin_alloca when compiled with gcc.

comment:10 Changed 8 weeks ago by felix

I compiled this test program and it crashes after the 1st longjmp.

I'm not sure if I'm doing something wrong, or whether alloca and setjmp are not supposed to work in this way. On my x86 machine this runs fine.

Changed 8 weeks ago by felix

simple test program using setjmp/alloca in the manner used in CHICKEN_run

comment:11 Changed 8 weeks ago by felix

I also tried this on ARM and on the test machine (not in the VM), linking with glibc on the latter, and both work fine. I still don't quite get what the problem is, but can only assume now that the ppc implementation of setjmp in musl must be broken.

comment:12 Changed 7 weeks ago by felix

I sent a bug report to the musl maintainers: http://www.openwall.com/lists/musl/2017/07/31/5
(see also followups)

I think a workaround may be possible by factoring out the part that copies the argvector (done in CHICKEN_run and C_callback) into a separate function, and thus into a new stack frame that is not shared with the one in which the setjmp is done. Do we want to do this?

comment:13 Changed 4 weeks ago by sjamaan

  • Resolution set to invalid
  • Status changed from reopened to closed

I don't think a workaround is worthwhile. It only happens on one uncommon platform with one C library that's not very common either and it looks like musl has fixed it.

comment:14 Changed 4 weeks ago by rgdoliveira

Thanks for your help sjamaan and felix!

The fix provided by musl worked, and CHICKEN now builds on Alpine ppc64le and runs the tests.

Just for the record, the ppc64le package is available at: http://rsync.alpinelinux.org/alpine/edge/community/ppc64le/chicken-4.12.0-r2.apk
