Opened 8 years ago
Closed 8 years ago
#1322 closed defect (wontfix)
Locale can influence how CHICKEN reads numbers
Reported by: | sjamaan | Owned by: | |
---|---|---|---|
Priority: | major | Milestone: | 4.12.0 |
Component: | core libraries | Version: | 4.11.0 |
Keywords: | number parsing | Cc: | |
Estimated difficulty: | hard | | |
Description
Because CHICKEN uses the libc strtol/strtoll and strtod functions when reading fixnums and flonums, locale settings may influence how CHICKEN reads numbers, especially in decode_literal.
Hugo Arregui provided the following simple test:
    ;; Compile this with the -embedded option, since it defines its own main()
    (import chicken scheme foreign)

    #>
    #include <locale.h>

    int main(int argc, char** argv) {
      setlocale(LC_NUMERIC, "es_AR.UTF-8");
      CHICKEN_run(C_toplevel);
      return 0;
    }
    <#

    (return-to-host)
This fails because the runtime system contains several encoded floating-point numbers, which are no longer read correctly once the numeric locale changes. Also note that strtod might incorrectly "parse" a floating-point number like 1.002 if that text happens to be valid in the current locale, for example when the locale uses "." as its thousands separator.
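One way to observe the effect described above is to ask libc which decimal point the active locale uses and how much of a string strtod() actually consumes. The following is a minimal sketch; both helper names are hypothetical and are not part of CHICKEN's runtime.

```c
#include <assert.h>
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns 1 if the active LC_NUMERIC locale uses "."
   as its decimal point, 0 otherwise. */
int numeric_locale_uses_dot(void)
{
    const struct lconv *lc = localeconv();
    return strcmp(lc->decimal_point, ".") == 0;
}

/* Hypothetical helper: number of characters of text that strtod() accepts
   as a number under the active locale. */
size_t strtod_consumed(const char *text)
{
    char *end;
    assert(text != NULL);
    (void)strtod(text, &end);
    return (size_t)(end - text);
}
```

A C program starts in the "C" locale, so without any setlocale() call numeric_locale_uses_dot() returns 1 and strtod_consumed("1.002") returns 5. A smaller count after switching locales would indicate that parsing stopped early at the separator.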
Parsing floating-point numbers ourselves in C is going to be pretty damn tricky, so we might just use setlocale() to switch the locale to C before parsing and restore the previous locale afterwards. I have no idea what the effects are of calling these functions often in the same program, or whether there's a performance impact: the implementation might load the strings or formatting rules for the locale on the fly every single time, since it is designed for "normal" programs in which setlocale() is called only a handful of times.
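The save-and-restore idea above could look roughly like the sketch below. The function name parse_double_c_locale is hypothetical and this is not code from CHICKEN. Note that setlocale() changes process-wide state, so this approach is not safe if other threads format or parse numbers at the same time.

```c
#include <assert.h>
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: parse text as a double with LC_NUMERIC temporarily
   forced to "C". Returns 0 on success, -1 on failure. */
int parse_double_c_locale(const char *text, double *out)
{
    const char *current;
    char *saved = NULL;
    char *end;
    double value;
    int ok;

    assert(text != NULL && out != NULL);

    /* The string returned by setlocale() may be overwritten by a later
       call, so copy it before switching. */
    current = setlocale(LC_NUMERIC, NULL);
    if (current != NULL) {
        saved = malloc(strlen(current) + 1);
        if (saved == NULL)
            return -1;
        strcpy(saved, current);
    }

    setlocale(LC_NUMERIC, "C");
    value = strtod(text, &end);
    /* Accept only input that was consumed completely. */
    ok = (end != text && *end == '\0');

    /* Restore whatever locale was active before. */
    if (saved != NULL) {
        setlocale(LC_NUMERIC, saved);
        free(saved);
    }

    if (!ok)
        return -1;
    *out = value;
    return 0;
}
```

Each call performs two locale switches plus an allocation, which is exactly the cost the paragraph above worries about; measuring it on the target libc would be needed before relying on this.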
See also https://github.com/JuliaLang/julia/pull/5988 for example
Change History (4)
comment:1 Changed 8 years ago by
Component: | unknown → core libraries |
---|
comment:2 Changed 8 years ago by
Description: | modified (diff) |
---|
comment:3 Changed 8 years ago by
comment:4 Changed 8 years ago by
Resolution: | → wontfix |
---|---|
Status: | new → closed |
It's probably not worth fixing this in the 4 series.
Note that this particular situation has already been fixed in CHICKEN 5: we simply encode flonums as a packed byte sequence, and "large" fixnums (> 30 bits) as bignums, which are simplified to fixnums after reading. The bignum reader doesn't use strtod. Note that there's still some compatibility code in runtime.c that triggers the old code path. This is to make it possible to compile CHICKEN 5 through a boot-chicken with CHICKEN 4.
Regardless of this being fixed in CHICKEN 5, there could still be issues lurking due to locale mismatch; we should really try to figure out a way to catch these stupid bugs :(
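To illustrate why a packed byte encoding of flonums sidesteps the locale entirely, here is a minimal sketch in C. It is not CHICKEN 5's actual literal format: the function names, the 8-byte layout, and the big-endian byte order are all assumptions made for the example, and it assumes double is a 64-bit IEEE 754 value.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical encoder: store the bit pattern of value in out,
   most significant byte first. No text parsing is involved. */
void pack_double(double value, unsigned char out[8])
{
    uint64_t bits;
    int i;

    assert(sizeof(double) == sizeof bits);
    memcpy(&bits, &value, sizeof bits);
    for (i = 0; i < 8; i++)
        out[i] = (unsigned char)(bits >> (56 - 8 * i));
}

/* Hypothetical decoder: rebuild the double from the 8 bytes
   written by pack_double(). */
double unpack_double(const unsigned char in[8])
{
    uint64_t bits = 0;
    double value;
    int i;

    for (i = 0; i < 8; i++)
        bits = (bits << 8) | in[i];
    memcpy(&value, &bits, sizeof value);
    return value;
}
```

Because the round trip never goes through strtod() or printf(), the result is the same under any LC_NUMERIC setting.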