Last change on this file since 32685 was 32685, checked in by sjamaan, 5 years ago

Update roadmap with completion statuses

== CHICKEN 5 roadmap

[[toc:]]

Here's a proposed list of things we would like to see in CHICKEN 5.
Feel free to add more details if you know of a way to implement
something or have an idea how to improve some part.  Please, no
editing flamewars here!

=== Modularising the compiler [done]

This work has been completed: the compiler is now composed of modules
in the {{chicken-5}} branch (prefixed with {{chicken.compiler}}), but
the following "nice to haves" are not yet implemented:

* Define an "official API" for users of the compiler.  Basically everything that's currently being done through ugly {{##compiler#}} hacks should have a supported, documented way to do it.  Later, we can expose more features.
** Hooks for adding new foreign types. Used by {{bind}}.
** Hooks for adding new compiler literals?  Examples of this are the CHICKEN 4 {{numbers}} egg or when we want to turn srfi-4 into an egg.
** Some standard way to determine the current source file (ideally this would be a library procedure which works the same way in compiled and evaluated code).  Used for things like the {{s48-modules}} egg.
** Perhaps a way to define new compilation stages.

These should be considered after CHICKEN 5 is released.  Of course, if
you want to tackle one of these before, feel free to submit a patch.

=== Reworking the core modules ("units")

Right now the modules supplied by core are somewhat arbitrarily named,
and too many unrelated things are grouped together.  We should go
through the system and look at what we have, then make logical names.
Suggestion to appear later on this page, for further discussion.  We
should attempt to align it with the r7rs naming conventions, to make
things easy for that egg, and for people new to CHICKEN but familiar
with other r7rs implementations.  This probably means "scheme" should
be renamed and split up to "scheme.base", "scheme.load", etc.  A
possible generalisation (or "convenience hack") could be to define the
"scheme" module to import all of the underlying submodules.

* As I've posted to the mailing list, I think using hyphen makes more sense than using dot.  --John Cowan
** I think we're pretty much resolved to using dots, for various reasons appearing on the list (and because there's momentum in the other direction with e.g. the compiler modules). -- eh
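
The "convenience hack" described above could look roughly like the following.  This is only a sketch, assuming the {{reexport}} form from {{(chicken module)}}; the module names {{demo.greet}}, {{demo.math}} and {{demo}} are hypothetical stand-ins for submodules such as "scheme.base" and "scheme.load" and the umbrella "scheme" module.

```scheme
;; Two small submodules with dotted names (hypothetical).
(module demo.greet (greet)
  (import scheme)
  (define (greet name)
    (string-append "hello, " name)))

(module demo.math (square)
  (import scheme)
  (define (square x) (* x x)))

;; The umbrella module defines nothing itself; it only re-exports
;; the bindings of its submodules.
(module demo ()
  (import (chicken module))
  (reexport demo.greet demo.math))

;; Users import the umbrella module and see both submodules.
(import demo)
(display (greet "world"))
(newline)
```

Importing {{demo.greet}} on its own remains possible, so the umbrella module costs nothing for users who want fine-grained imports.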

==== Replacing SRFI-14 with cset implementation from irregex? [irrelevant]

This has been discussed ages ago.  It might be more memory-friendly
and performant.  One problem with the current SRFI-14 module is that
it assumes Latin1 encoding (and therefore can only handle 256
different characters), whereas most other CHICKEN components and eggs
assume UTF-8.

* Strong +1.  --John Cowan
* Note that in the "Proposed removal from core" section below, srfi-14 is proposed to be removed from the core. --mario

This is not needed, because SRFI-14 is no longer part of core.  The
egg could still benefit from it, but it's not something that will hold
up the CHICKEN 5 release.

==== Refactoring the CHICKEN test suite to use a core library? [status uncertain]

As we remove a lot of cruft from core which it doesn't need, it may be
a good idea to add some things that we ''do'' need.  Like the {{test}}
egg: there is a lot of macro code duplication in core's test suite.
It's probably better to ship a well-designed testing library with
core, which core itself can also use.  This would make it easier, if
we decide to do this later, to format test output on Salmonella in a
consistent manner for both core and eggs.

* That could even be done for CHICKEN 4, since it wouldn't break anything. -- mario

==== Proposed libraries [incomplete]

Let's follow R7RS for these:

* chicken.base
* chicken.case-lambda
* chicken.char
* chicken.complex
* chicken.cxr
* chicken.eval
* chicken.file
* chicken.inexact
* chicken.lazy
* chicken.load
* chicken.process-context
* chicken.read
* chicken.repl
* chicken.time (need this? want this?)
* chicken.write

What will we do with the SRFIs we implement?  It would make sense to
define the following, but it would be tedious to import all these:

; srfi-2 : and-let*
; srfi-8 : receive
; srfi-31 : rec
; srfi-26 : cut, cute
; srfi-17 : setter, getter-with-setter
; srfi-10 : define-reader-ctor
; srfi-39 : parameter objects

* I'm planning to propose some of these (2, 8, 31, 26, 17) in a single R7RS-large library, probably called (scheme control) or (scheme control simple).  --John Cowan
** Since this hasn't been standardised yet, and for improved compatibility and consistency with other Schemes, it's probably a good idea to define them as separate modules anyway.  Note that this does not preclude re-exporting them elsewhere as well. --Peter Bex
*** I agree that they should be defined in their own modules and then reexported by some larger modules, e.g. some {{chicken}} library includes {{and-let*}}, etc. -- eh
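
For readers unfamiliar with the small SRFIs listed above, here is a brief sketch of four of them in use.  It assumes a CHICKEN 5 installation in which these forms are exported from {{(chicken base)}}; under the per-SRFI module proposal each would need its own import instead.  All procedure names are made up for illustration.

```scheme
(import scheme (chicken base))

;; srfi-2: and-let* stops at the first clause that yields #f.
(define (double-if-number key alist)
  (and-let* ((pair (assq key alist))
             (val (cdr pair))
             ((number? val)))
    (* 2 val)))

;; srfi-8: receive binds multiple return values.
(define (first-and-count . args)
  (receive (head . tail) (apply values args)
    (list head (length tail))))

;; srfi-26: cut builds a procedure with "slots" for missing arguments.
(define add-one (cut + 1 <>))

;; srfi-31: rec names a procedure so that it can refer to itself.
(define factorial
  (rec (fact n)
    (if (= n 0) 1 (* n (fact (- n 1))))))
```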

Also, is it {{srfi-2}} or {{srfi.2}}?  The latter would match up with
{{(srfi 2)}} usage which is reserved by R7RS for SRFIs.

* If we get rid of dots, then it's just {{srfi-2}} without special-casing it as the R7RS egg apparently does right now.
** I don't quite understand this comment. {{srfi.2}} will map without special casing, is that what's meant by that? -- eh

The list below is just a proposal, can be changed at any time.  We
should also keep an eye on R7RS WG2, which may define a few things
CHICKEN currently defines already.

See also the concrete proposal taking shape
[[/core-libraries-reorganization|here]], which subsumes the following.

; chicken.modules : module, import, export, reexport, define-interface, module-environment, functor, use
; chicken.types : {{:}}, the, assume, define-type, define-specialization, compiler-typecase
; chicken.reader-extensions : set-read-syntax!, set-sharp-read-syntax!, set-parameterized-read-syntax!, copy-read-table, current-read-table (perhaps re-export define-reader-ctor?)
; chicken.fixnum : fx+, fx-, fx/, fx*, fx<, fx<=, fx=, fx>, fx>=, fxand, fxeven?, fxior, fxmax, fxmin, fxmod, fxneg, fxnot, fxodd?, fxshl, fxshr, fxxor, fixnum-bits(?), fixnum-precision, most-positive-fixnum, most-negative-fixnum, fixnum?
; chicken.flonum : fp+, fp-, fp/, fp*, fp<, fp<=, fp=, fp>, fp>=, fpfloor, fpceiling, fptruncate, fpround, fpsin, fpcos, fptan, fpasin, fpacos, fpatan, fpatan2 (?), fplog, fpexp, fpexpt, fpsqrt, fpabs, fpinteger?, maximum-flonum, minimum-flonum, flonum-radix, flonum-epsilon, flonum-precision, flonum-decimal-precision, flonum-maximum-exponent, flonum-minimum-exponent, flonum-maximum-decimal-exponent, flonum-minimum-decimal-exponent, flonum?
; chicken.syntax : er-macro-transformer, ir-macro-transformer, gensym(?), expand (is this useful at all?), get-line-number, strip-syntax.
; chicken.bitwise : the subset of srfi-60 we support: bit-set?, bitwise-and, bitwise-not, bitwise-ior, bitwise-xor.  Possibly complete it with the remaining operations, and call it just "srfi-60"?
; chicken.ports : The current stuff in ports, except for the string ports in scheme.base (also, see below).  Perhaps get rid of port-fold, copy-port, port-map, port-for-each?
; chicken.exceptions (or srfi-12?  Would make more sense, but what about our extensions?  Put those in chicken.srfi-12?) : All the exception handling stuff.
; chicken.load : If we want to keep them, load-noisily, load-relative, load-library
; chicken.format : {{[fs]?printf}}, format (do we need this?), pp, pretty-print, pretty-print-width
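
To give an idea of how the type-specific operators listed above read in practice, here is a small sketch.  It assumes the modules are importable as {{(chicken fixnum)}}, {{(chicken flonum)}} and {{(chicken bitwise)}}, as in released CHICKEN 5; the helper names are hypothetical.

```scheme
(import scheme (chicken fixnum) (chicken flonum) (chicken bitwise))

;; Fixnum-only arithmetic: average of two small integers via a shift.
(define (fx-average a b)
  (fxshr (fx+ a b) 1))

;; Flonum-only arithmetic: length of the hypotenuse.
(define (hypotenuse x y)
  (fpsqrt (fp+ (fp* x x) (fp* y y))))

;; Generic bitwise operation: keep only the lowest eight bits.
(define (low-byte n)
  (bitwise-and n 255))
```

The {{fx}} and {{fp}} operators only accept arguments of their own type, which is what lets the compiler skip generic dispatch.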

* If you put {{use}} in a module, how do you get access to that module?  I favor the R7RS solution, in which {{import}} does what Chicken {{use}} does, and is special-cased in terms of the module system so that it is always available.  --John Cowan
** Right now, I think the module import/export forms are always available inside a module form.  This is no different from special-casing it, I think (unless I'm misunderstanding something). --Peter Bex

* There will be R7RS (scheme fixnum) and (scheme flonum) modules.  I'm currently proposing to base the fixnums on R6RS and the flonums on {{math.h}} (not the egg of that name, but the whole C interface).  --John Cowan
** That sounds like they'll be somewhat different from the list of identifiers we have.  And it will take a while before it's finalized I guess, so it's safer to define our own and later add the r7rs versions if we deem it acceptable. --Peter Bex

==== Proposed removal from core

The list below is just one hacker's idea of what could go.  Please add more.

===== SRFIs [done]

SRFI-1, SRFI-13, SRFI-14, SRFI-18 might be removed. SRFI-69 will be
removed, as discussed in CR #1142.

As pointed out several times by John Cowan, SRFI-15 (fluid-let) is
unsafe in the presence of threads, and any use is most likely broken
and should be replaced with R7RS/SRFI-39 parameters.  Currently, core
uses it in a few places, in a possibly dangerous way.

Most importantly, there is no reason it has to be in core, because it
uses only basic primitives.  I think it's best to delegate it to an
egg.
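
The parameter-based replacement suggested above can be sketched as follows, assuming {{make-parameter}} and {{parameterize}} are available from {{(chicken base)}}.  The names {{verbosity}} and {{describe}} are hypothetical.

```scheme
(import scheme (chicken base))

;; A parameter object holding the current verbosity level.
(define verbosity (make-parameter 0))

(define (describe)
  (if (> (verbosity) 0) "verbose" "quiet"))

;; The new value is visible only within the dynamic extent of the body;
;; no shared global variable is assigned, unlike with fluid-let.
(define inside
  (parameterize ((verbosity 2))
    (describe)))

;; Outside the parameterize form the original value is back in effect.
(define outside (describe))
```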

===== queue datatype (data-structures), binary-search (data-structures), mmapped files (posix), object-evict (lolevel) [done]

Proposal already accepted in CR #1142.

* I'm proposing a queue library for R7RS-large.  --John Cowan
** It would be great if it could be inspired by CHICKEN's, but that's not strictly necessary, as there is plenty of room for multiple queue eggs --Peter Bex

===== combinators [status uncertain]

Some of the combinators from data-structures are very nice, but only
a handful of them are actually useful.  There is no technical
reason to keep them in core; they might fit better in an egg.

* I'm proposing a similar library for R7RS-large.  --John Cowan
** Maybe we can rip it out of core and wait for R7RS before implementing the egg. --Peter Bex

===== Various ill-conceived POSIX things [status uncertain]

I don't like these things, but that doesn't mean they *have* to go.
They can always be put in an egg, of course.

* file-select (but see the section about refactoring the scheduler!)
* file-control (no need to be in core)
* file-mkstemp (too tricky to use properly? maybe a different API)
* file-read and file-write (too low-level)
* file-stat (might be changed to return a record type?)
* set-file-position! (see the section on I/O refactoring)
* All the time stuff.  It's too broken/difficult to use, and might be better off in an egg.  Core uses some of it, so we may need to reconsider and just improve the API.
* terminal-name, terminal-port?, terminal-size (but chicken-status uses it!)
* The process-stuff.  There are too many procedures which is confusing.  Boil it down to just one or two essential ones.  Possibly make a "fork&exec" implementation, which maps better to the Windows model, and still works fine on UNIX.

===== Better API for continuations [status uncertain]

Nobody seems to use the "better API for continuations" by Feeley:
continuation-graft, continuation-capture, continuation-return,
continuation?

If it doesn't benefit anyone (core doesn't use it, only two eggs do:
shift-reset and continuations), it can be taken out.  It might be put
into an egg.
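
For reference, a small sketch of what using this API looks like.  It assumes the procedures are importable from {{(chicken continuation)}}, as in released CHICKEN 5; {{first-negative}} is a hypothetical example procedure.

```scheme
(import scheme (chicken continuation))

;; Return the first negative element of LST, or #f if there is none.
;; continuation-capture passes a continuation object K to the procedure;
;; continuation-return hands a value to K, which exits the loop early.
(define (first-negative lst)
  (continuation-capture
    (lambda (k)
      (for-each
        (lambda (x)
          (if (negative? x)
              (continuation-return k x)))
        lst)
      #f)))
```

The same early exit can be written with plain {{call/cc}}; the difference is that {{k}} here is a first-class continuation object rather than a procedure.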

* +1 for an egg.  I'm going to propose this for R7RS-large.  --John Cowan

* FWIW this seems to be pretty deeply-seated in core/runtime.c (to me at least!) -- eh

=== Reworking the way libraries are loaded [incomplete]

Right now there are just too many confusing things, like require, require-extension, use, import, load, load-library, require-library.

* Import (with the function of use) should be the main API.  Load is necessary because it can load things whose names are determined at run time.  It should be able to load either source or binaries.   Include also belongs here.  --John Cowan

Units and modules are confusing also.  This could just be a
documentation issue.

* Units should IMO be deprecated, with a compiler switch to turn off deprecation when compiling Chicken itself.  --John Cowan
** I disagree: there's no reason why core should be "special" in any way.  We could de-emphasize their importance in the manual, instead. --Peter Bex

* I disagree that units should be deprecated at all. I agree that import should be the primary API, with an alternative form for importing just identifiers (perhaps even {{(import-identifiers (foo bar))}}). -- eh

==== Make the library load path a search path [incomplete]

This keeps cropping up on IRC: people expect to be able to load libraries from their eggs using a search path containing multiple entries. This would allow you to {{(use ...)}} a module from your application without installing it as an egg.

This is rather tricky: what happens when you compile it and install the whole program into some other location? Also, changing the way it's implemented is nontrivial, as it has been attempted before (see [[http://bugs.call-cc.org/ticket/736|#736]]).

=== Refactoring the scheduler [incomplete]

One missing ability in the scheduler is for threads to block on more
than one object.  This would allow us to generalise {{file-select}} to
ports.

=== Refactoring the I/O (ports) system [incomplete]

Currently, ports are somewhat ill-defined: they're a hand-coded record
type with a bunch of slots, with comments indicating which slot is
used for what.  It would be cleaner and easier to understand the code
if this was changed to a "proper" record type.

The {{current-*-port}} identifiers should be rewritten to be proper
parameters instead of fake ones which are rebound through fluid-let.

Recently I discovered that set-file-position! does not work on string
ports.  Port position should be part of the official interface, so
that this is extensible, and if a port implements it, it can be
rewound.  This makes sense at least for file-backed ports and string
ports.

* Well, not all file-backed ports are seekable.  --John Cowan
** That's okay; they can throw a "not implemented" exception. --Peter Bex

This is also a good opportunity to look at why I/O is so slow.

One small improvement I'd like to make is to change write-string to
accept an offset into the string from which to write.  This would mean
writing substrings does not have the overhead of first having to copy
the substring to a new string and '''then''' writing it.  I ran into
this once and I thought it was a shame, because it's such a trivial
(but incompatible) modification.
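
The intended behaviour can be sketched with a helper that never allocates the intermediate substring.  {{write-substring}} is a hypothetical name, not an existing core procedure, and {{call-with-output-string}} is assumed to come from {{(chicken port)}}.

```scheme
(import scheme (chicken port))

;; Hypothetical helper: write the characters STR[START, END) to PORT
;; one at a time, so that no intermediate substring is allocated.
(define (write-substring str start end port)
  (do ((i start (+ i 1)))
      ((>= i end))
    (write-char (string-ref str i) port)))

;; Collect the output in a string port to show the result.
(define result
  (call-with-output-string
    (lambda (port)
      (write-substring "hello, world" 7 12 port))))
```

Writing character by character only demonstrates the semantics; an implementation inside core could hand the offset and length straight to the port's low-level write routine.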

=== Integrating the full numeric tower [done]

This work has been completed: full support for the complete numeric
tower is available in the {{chicken-5}} branch.  This includes support
for literals in compiled code as well as full integration with the
FFI.

=== String encoding [status uncertain]

==== Reject all NUL bytes

If we reject all NUL bytes inside strings, we can encode strings more conveniently
by adding a NUL terminator to all strings (nothing else changes).  If we do this,
the FFI does not need to copy strings, which makes it much more lightweight.

Things to look into:

* What if the foreign code mutates the string and inserts a NUL?
* How do we deal with the length?  Currently the internal operation is ##sys#size, which simply unmasks and returns the string's header.  The GC knows about this general principle.  By adding the NUL byte, we add another special case to the GC.  This is ugly and complicated.
* Possibly the operations we support on blobs need to be extended, so that all current abuse cases for strings can be handled by blobs.
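
Until such a rule is enforced by core, user code that passes strings to C has to check for embedded NULs itself.  A minimal sketch of such a check in portable Scheme; {{string-contains-nul?}} is a hypothetical name.

```scheme
(import scheme)

;; Return #t if STR contains a NUL character anywhere, #f otherwise.
(define (string-contains-nul? str)
  (let ((len (string-length str))
        (nul (integer->char 0)))
    (let loop ((i 0))
      (cond ((= i len) #f)
            ((char=? (string-ref str i) nul) #t)
            (else (loop (+ i 1)))))))
```

With the proposal in place, a string for which this returns {{#t}} could not be constructed in the first place, so the scan (and the copy in the FFI) would become unnecessary.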

==== Unicode

This at least needs some additional thought.  Do we want to make UTF-8
the "official" encoding?  If so, ideally, all string operations should
reject invalidly encoded byte sequences (should we still allow NUL
bytes to be represented?).  What to do with the Unicode case folding
lookup tables, string-ref?

* Go full Unicode.  If Chibi can do it, so can we.  R7RS is factored to push the big Unicode tables into (scheme char).  However, IMO the NUL character is completely worthless as a character: it has no semantics worth mentioning.  We can forbid it in strings, as R7RS-small allows.  --John Cowan
** Seems sensible. --Peter Bex

If we go full Unicode, the SRFI-4/blob types might need some
attention, because strings can no longer be (ab)used as byte vectors.

* Why are there both u8vectors and blobs?  IMO they should be the same thing, and should be R7RS bytevectors.  I'm working on a R7RS-large numeric vector library that allows either SRFI 4 style (separate data types for different kinds) or the style used in later SRFIs and R6RS (everything is just a view on top of bytevectors).  --John Cowan
** u8vectors are less "core" than blobs (which is a consequence of the low-level representation).  In fact, we might be able to take srfi-4 out of core. --Peter Bex

=== Improve the egg system [incomplete]

Since this is a rather comprehensive point, there is now a
[[chicken-5-roadmap-egg-system|separate document for it]].

=== Make set!'ing of unbound variables an error [incomplete]

R7RS recommends making this an error for modules but allowing it in the REPL.

* We already check for renaming already bound identifiers, maybe that's not so hard after all. I will investigate this --Christian Kellermann

=== Determine how to make CHICKEN 4 eggs live alongside CHICKEN 5 eggs [incomplete]

Currently, "THE SYSTEM" does not have any special considerations for
the major CHICKEN release used.  This could be considered an
oversight.  To make it possible to continue using CHICKEN 4 eggs while
CHICKEN 5 is being developed and matured, there needs to be some sort
of way to do this.

Currently, we have the master list of available eggs, which lives in
the svn repo.  THE SYSTEM is extremely simple and doesn't really care
much about how eggs are supplied, so we could just fire up a second
instance of henrietta-cache which fetches from a ''different'' master
list containing the CHICKEN 5 eggs.  However, what can we do to make
life easier for egg maintainers?

The official CHICKEN egg repo (SVN) already has taken care of this due
to the {{/release/N}} namespacing.  The thing that needs to be changed
is the location of the henrietta CGI, to include a version number, or
we could add an extra URL parameter and teach it about the versions.

For user repos, a simple way is to start a second repository
and call it a day.  However, this will probably result in awkward
names.  Making a new branch results in the same problem: the master
branch would correspond to an outdated release!

==== The simplest approach: just carry on

Just continuing in the old repository is possible, if no new releases
need to be tagged for the old CHICKEN release.  This mostly precludes
emergency bugfix releases, but these could be continued on a different
branch (release-info only takes into account tarballs which get
generated from a tag name, after all!).

To prevent version tag clashes, the egg's major version should be
bumped for CHICKEN 5.  Let's take for example an egg which has
released 1.0, 1.2 and 1.3 for CHICKEN 4.  If we bump the major
version, we can release 2.0, 2.1, etc. for CHICKEN 5.  If an important
bugfix needs to be made for the CHICKEN 4 version, we can continue
with 1.4.  If we don't bump the major version, the egg would be forced
to use micro version numbers for those, like 1.3.1.  Both approaches
are fine, depending on how much effort is expected to be put into the
"old" branch.

The old release-info file will be untouched and continue to be used by
the CHICKEN 4 version of henrietta-cache.  For CHICKEN 5, a new file
is made (e.g. myegg.chicken-5.release-info) which starts out empty and,
as new releases are made, will continue with the number where the
CHICKEN 4 branch left off.
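
As an illustration, the new file for the example egg above might look like this once two CHICKEN 5 releases have been tagged.  This is a sketch using the usual release-info forms ({{repo}}, {{uri}}, {{release}}); the egg name and URLs are hypothetical.

```scheme
;; myegg.chicken-5.release-info (hypothetical egg and URLs)
(repo git "https://example.com/myegg.git")
(uri targz "https://example.com/myegg/archive/{egg-release}.tar.gz")

;; CHICKEN 5 releases continue after the CHICKEN 4 series (1.0, 1.2, 1.3).
(release "2.0")
(release "2.1")
```

The original myegg.release-info keeps listing 1.0, 1.2 and 1.3, and would gain a 1.4 entry if a bugfix release for CHICKEN 4 is needed.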

==== Rework each egg's release namespace

Another, possibly cleaner, approach is the following:

* When an egg is ported to CHICKEN 5, rename or copy all existing tags, prefixing them like {{chicken-4/1.2}}, for example.
* Make a chicken-4 branch from master and update the release-info file's location in the master egg list for CHICKEN 4.
* Clear the release-info file in master, and submit its location for inclusion in the master egg list for CHICKEN 5.

This way, new eggs and old eggs will ''always'' have the master branch
point to the active version.  It does mean a little bit more work on
every major release.

To avoid having to clear the release-info file every time, we could
also extend it to include a major release version number (and if it's
missing, assume "4"?).  This means the release-info file would list
both CHICKEN 4 and CHICKEN 5 (and later CHICKEN 6) releases in the
same file.  This might make maintenance a little easier, but requires
a small change in henrietta-cache.

* IMO the brains should be in the henrietta web API.  --John Cowan
** I don't think that's necessary.  In any case, there '''must''' be some way for the egg authors to indicate for which CHICKEN version the egg is. --Peter Bex