Opened 8 years ago

Last modified 8 years ago

#1322 closed defect

Locale can influence how CHICKEN reads numbers — at Version 2

Reported by: sjamaan
Owned by:
Priority: major
Milestone: 4.12.0
Component: core libraries
Version: 4.11.0
Keywords: number parsing
Cc:
Estimated difficulty: hard

Description (last modified by sjamaan)

Because CHICKEN uses the libc strtol/strtoll and strtod functions when reading flonums and fixnums, locale settings may influence how CHICKEN reads numbers, especially in decode_literal.

Hugo Arregui provided the following simple test:

;; Compile this with the -embedded option, since it defines its own main()
(import chicken scheme foreign)

#>
#include <locale.h>

int main(int argc, char** argv) {
   setlocale(LC_NUMERIC, "es_AR.UTF-8");
   CHICKEN_run(C_toplevel);
   return 0;
}
<#

(return-to-host)

This fails because the runtime system has several encoded floating-point numbers, which will no longer be read correctly. Also note that strtod might incorrectly "parse" a floating-point number like 1.002 if it happens to be valid in the current locale using thousands separators.
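To see the effect outside of CHICKEN, a small probe like the following can help. This is only a sketch: strtod_consumed_under is a hypothetical helper name, not something from the runtime, and whether es_AR.UTF-8 is installed depends on the system. It reports how many characters strtod() actually consumes under a given LC_NUMERIC setting, so the result for "C" can be compared with the result for a locale that uses a different decimal point.

```c
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical probe (not part of CHICKEN): returns how many characters
   of `text` strtod() consumes while LC_NUMERIC is set to `locale`, or -1
   if that locale cannot be selected.  The parsed value goes to *value. */
long strtod_consumed_under(const char *locale, const char *text, double *value)
{
    char saved[128];
    char *end;
    const char *current = setlocale(LC_NUMERIC, NULL);

    /* Copy the current setting; the string returned by setlocale() may
       be overwritten by the next call. */
    if (current == NULL || strlen(current) >= sizeof saved)
        return -1;
    strcpy(saved, current);

    if (setlocale(LC_NUMERIC, locale) == NULL)
        return -1;                    /* locale not available here */

    *value = strtod(text, &end);

    setlocale(LC_NUMERIC, saved);     /* put the old setting back */
    return (long)(end - text);
}
```

Calling strtod_consumed_under("C", "1.002", &v) should consume the whole string. With "es_AR.UTF-8" the count may well be smaller if the C library honours that locale's decimal separator, but the exact behaviour depends on the libc in use.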

Parsing floating-point numbers ourselves in C is going to be pretty damn tricky, so we might just use setlocale() to switch the locale to "C" before parsing and restore the previous locale afterwards. I have no idea what the effects are of calling these functions often in the same program, or whether there's a performance impact: setlocale() might load the strings or formatting rules for the locale on the fly every single time, since it is presumably designed for "normal" programs that call it only a handful of times.
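A minimal sketch of that save/switch/restore idea, assuming a hypothetical wrapper called strtod_c_locale (this is not the actual runtime code, and it says nothing about the performance question raised above):

```c
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical wrapper (not part of CHICKEN): run strtod() with
   LC_NUMERIC temporarily forced to "C", then restore the old setting. */
double strtod_c_locale(const char *s, char **end)
{
    double result;
    char *saved = NULL;
    const char *current = setlocale(LC_NUMERIC, NULL);

    /* Copy the locale name; the buffer returned by setlocale() may be
       overwritten by the next setlocale() call. */
    if (current != NULL) {
        saved = malloc(strlen(current) + 1);
        if (saved != NULL)
            strcpy(saved, current);
    }

    setlocale(LC_NUMERIC, "C");
    result = strtod(s, end);

    /* Restore the previous setting.  If the copy could not be made,
       the locale stays at "C". */
    if (saved != NULL) {
        setlocale(LC_NUMERIC, saved);
        free(saved);
    }
    return result;
}
```

Note that setlocale() changes process-wide state, so a wrapper like this is not safe if other threads parse or format numbers at the same time.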

See also https://github.com/JuliaLang/julia/pull/5988 for an example.

Change History (2)

comment:1 Changed 8 years ago by sjamaan

Component: unknown → core libraries

comment:2 Changed 8 years ago by sjamaan

Description: modified (diff)