Opened 5 days ago

#1851 new defect

utf8 egg: Missing char sets and outdated tables

Reported by: Zipheir
Owned by:
Priority: minor
Milestone: someday
Component: unknown
Version: 5.4.0
Keywords: unicode
Cc:
Estimated difficulty:

Description

The unicode-char-sets module of the utf8 egg is missing several character sets. In particular, there is no set for characters with the Numeric property (making it impossible to implement a Unicode-aware 'char-numeric?' in CHICKEN), nor for any of the punctuation properties. The utf8-srfi-14 module does include char-set:digit and char-set:punctuation, but these are throwaway ASCII-only implementations (in a file that begins with "Unicode capable char-sets", no less!). Unicode-aware versions of these sets should be added.
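
To make the gap concrete, here is a rough sketch of the difference between an ASCII-only digit test and Unicode-aware ones. It is written in Python rather than CHICKEN, only because Python's standard unicodedata module makes the properties easy to inspect; the function names are hypothetical and nothing here is the egg's API. The "Nd" general category and the "has a numeric value" test are assumptions about what a Unicode-aware digit or numeric set would plausibly be built from.

```python
import unicodedata


def ascii_digit(ch):
    """ASCII-only test: the behaviour the ticket attributes to char-set:digit."""
    return "0" <= ch <= "9"


def unicode_decimal_digit(ch):
    """True if ch has general category Nd (decimal digit)."""
    return unicodedata.category(ch) == "Nd"


def unicode_numeric(ch):
    """True if ch has any numeric value (digits, fractions, Roman numerals, ...)."""
    return unicodedata.numeric(ch, None) is not None


# ASCII digit, Arabic-Indic digit three, vulgar fraction one half,
# Roman numeral eight, and a plain letter as a negative case.
samples = ["7", "\u0663", "\u00bd", "\u2167", "x"]

for ch in samples:
    print(
        f"U+{ord(ch):04X}",
        ascii_digit(ch),
        unicode_decimal_digit(ch),
        unicode_numeric(ch),
    )
```

Only the first sample passes the ASCII-only test, while U+0663 is a decimal digit and U+00BD and U+2167 carry numeric values without being decimal digits. A Unicode-aware 'char-numeric?' needs a set along these lines to test against.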

Furthermore, the sets that unicode-char-sets does provide seem to be built on data that is extremely out of date. The header comment in unicode-char-sets.scm claims the tables were generated in 2007, which would predate Unicode 5.1 (released in 2008).

Change History (0)
