Last change on this file since 20538 was 20538, checked in by Alex Shinn, 9 years ago

adding "The Core" section

((title . "Issue 5")
 (authors "Alex Shinn")
 (date . 1285489159))

== 0. Introduction

Welcome to issue 5 of the Chicken Gazette!  We're tentatively
switching the Gazette publication to Monday this week, to give people
something to read after coming back from the weekend.

== 1. The Hatching Farm - New Eggs & The Egg Repository

This week
[[http://wiki.call-cc.org/users/mario-domenech-goulart|Mario Domenech Goulart]]
released a new egg called
[[http://wiki.call-cc.org/eggref/4/accents-substitute|accents-substitute]]
to replace accented Latin characters with either HTML entities or their
non-accented ASCII equivalents, for when you need to work with ASCII-only
text.

[[http://wiki.call-cc.org/users/ivan-raikov|Ivan Raikov]]
has been busy, and in addition to many egg updates has released
a new egg called [[http://wiki.call-cc.org/egg/cis|cis]]
(compact integer sets) as a possible alternative to last week's featured egg
[[http://wiki.call-cc.org/egg/iset|iset]].  It's less efficient
in terms of time and space, but has a simpler implementation for
when performance doesn't matter.

Our fearless leader
[[http://wiki.call-cc.org/users/felix-winkelmann|Felix]] also added a
new egg [[http://wiki.call-cc.org/eggref/4/system|system]], inspired
by the CL defsystem macro.  {{system}} lets you define groups of files
and their dependencies which can be loaded or compiled, and even
re-loaded or re-compiled while keeping track of modified files.  Use it for
rapid development in the REPL (or in the near future as a {{make}}
alternative in your .setup files).

== 2. The Core - Bleeding Edge Development

It's been another busy week for core development:

Overflow detection for basic arithmetic ops (`+', `-', `*' and `/')
has been changed to use bit-twiddling instead of "parallel flonum"
computations, since 64-bit IEEE doubles do not have enough precision to
hold the full range of fixnums on 64-bit systems.

A serious compiler bug related to inlining was fixed (found with much
help from Sven Hartrumpf), and several other bugs reported by Kon Lovett
were fixed.

A new foreign type `pointer-vectors' (vectors of native unboxed
pointers) was added, with an API in lolevel.

A simpler alternative to `er-macro-transformer',
`ir-macro-transformer' (implicit renaming macros), was added by
[[http://wiki.call-cc.org/users/peter-bex|Peter Bex]].
See ticket #394 on trac.
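
As a minimal sketch of what implicit renaming looks like in practice
(the {{swap!}} macro and its {{tmp}} binding are illustrative names, not
taken from the ticket): symbols written in the macro body are renamed
automatically, so the temporary cannot capture a user variable that
happens to share its name.

<enscript highlight="scheme">
  ;; swap! is a hypothetical example macro.  The transformer receives
  ;; the whole form plus an `inject' and a `compare?' procedure.
  (define-syntax swap!
    (ir-macro-transformer
      (lambda (form inject compare?)
        (let ((a (cadr form))
              (b (caddr form)))
          ;; `tmp' below is renamed implicitly; a and b come from the
          ;; caller and keep their original meaning.
          `(let ((tmp ,a))
             (set! ,a ,b)
             (set! ,b tmp))))))

  (define x 1)
  (define tmp 2)
  (swap! x tmp)
  (list x tmp)  ; => (2 1)
</enscript>

With {{er-macro-transformer}} the same macro would need an explicit
rename call for {{let}}, {{set!}} and {{tmp}}.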

But the biggest change: irregex is now the official regex API and has
full library unit status.  The regex unit has been removed and is
available as an egg, which should be fully backwards compatible as long
as the "(use regex) (import irregex)" idiom is used (dependencies on the
regex unit are not yet in the egg repo, though).  The irregex version
was also upgraded, bringing many upstream bugfixes and
optimizations, with many thanks to
[[http://wiki.call-cc.org/users/peter-bex|Peter Bex]]
and [[http://wiki.call-cc.org/users/alex-shinn|Alex Shinn]].

And thanks to Felix for help with the summary!

== 3. Chicken Talk

The exciting news on the mailing list this week was a
[[http://lists.nongnu.org/archive/html/chicken-users/2010-09/msg00074.html|performance boost]]
mentioned by [[http://wiki.call-cc.org/users/mario-domenech-goulart|Mario Domenech Goulart]]
where the [[http://wiki.call-cc.org/eggref/4/awful|awful]]
web framework ran a benchmark almost 7x faster.  This is
likely due to a new GC improvement by Felix.

Taylor Venable [[http://lists.nongnu.org/archive/html/chicken-users/2010-09/msg00068.html|brought up an issue]]
in the new [[http://wiki.call-cc.org/eggref/4/coops|coops]]
object system involving class redefinition and {{define-method}}
on a generic not first provided with {{define-generic}}.
It turns out Chicken will do an implicit {{define-generic}} for
you as a convenience, but it's probably best to define each
generic once explicitly.  Also be aware that redefining a
class will create a completely new class which instances of
the old class will not belong to.
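
A rough sketch of both points, assuming the {{coops}} egg is installed;
the {{<point>}} class, the {{describe}} generic and the {{old-point}}
variable are made-up names for illustration and do not come from the
mailing list thread:

<enscript highlight="scheme">
  (use coops)

  ;; A class with two slots and no explicit superclass.
  (define-class <point> () (x y))

  ;; Declare the generic once, explicitly, before adding methods.
  (define-generic (describe obj))

  (define-method (describe (p <point>))
    (list 'point (slot-value p 'x) (slot-value p 'y)))

  (define p (make <point> 'x 1 'y 2))
  (describe p)

  ;; Evaluating the class definition again binds <point> to a new
  ;; class object; p is still an instance of the old one.
  (define old-point <point>)
  (define-class <point> () (x y))
  (eq? (class-of p) old-point)  ; p keeps its original class
  (eq? (class-of p) <point>)    ; expected to be #f, per the note above
</enscript>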

Richard Hollos [[http://lists.nongnu.org/archive/html/chicken-users/2010-09/msg00066.html|reported an error]]
compiling on AMD64 Linux, which turned out to be just a chicken and
egg problem.

Markus Klotzbuecher [[http://lists.nongnu.org/archive/html/chicken-users/2010-09/msg00075.html|provided a patch]]
for the {{cairo}} egg, showing activity on the graphical
library front.

Finally, Felix just
[[http://lists.nongnu.org/archive/html/chicken-users/2010-09/msg00083.html|announced]]
[[http://wiki.call-cc.org/eggref/4/coops|coops]] version 1.0!
If you've been using {{tinyclos}}, give {{coops}} a try.

== 4. Omelette Recipes - Tips and Tricks

We've got a wide variety of options for parsing in Chicken, from the
built-in {{read}} and extensions thereon, to regular expressions, to a
plethora of both specific and general purpose parsing libraries.  This
week, I want to take a look at one of the general purpose libraries,
[[http://wiki.call-cc.org/egg/packrat|the packrat egg]] by Tony
Garnock-Jones.

A packrat parser is a parser for Parsing Expression Grammars (PEGs):
essentially a recursive descent parser with backtracking, plus
memoization to guarantee a linear time parse.  PEGs look similar to
Context Free Grammars (CFGs), but are unambiguous by virtue of being
ordered - the leftmost matching alternative is always chosen.  PEGs also
allow {{and}} and {{not}} rules, which allow them to parse some languages
that can't be described by CFGs.
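
To make ordered choice and the {{not}} rule concrete, here is a small
sketch in plain Scheme.  It does not use the packrat egg and leaves out
memoization entirely; {{lit}}, {{seq}}, {{choice}} and {{not-ahead}} are
names invented for this illustration.

<enscript highlight="scheme">
  ;; A parser here is a procedure (str pos) -> new position, or #f.

  ;; Match the single character c.
  (define (lit c)
    (lambda (str pos)
      (and (< pos (string-length str))
           (char=? (string-ref str pos) c)
           (+ pos 1))))

  ;; Run parsers one after another, threading the position through.
  (define (seq . parsers)
    (lambda (str pos)
      (let loop ((ps parsers) (pos pos))
        (cond ((not pos) #f)
              ((null? ps) pos)
              (else (loop (cdr ps) ((car ps) str pos)))))))

  ;; Ordered choice: try alternatives left to right, keep the first match.
  (define (choice . parsers)
    (lambda (str pos)
      (let loop ((ps parsers))
        (and (pair? ps)
             (or ((car ps) str pos)
                 (loop (cdr ps)))))))

  ;; Not rule: succeed without consuming input only if p fails here.
  (define (not-ahead p)
    (lambda (str pos)
      (if (p str pos) #f pos)))

  ;; The first alternative wins even though the second would match more.
  (define a-or-ab (choice (lit #\a) (seq (lit #\a) (lit #\b))))
  (a-or-ab "ab" 0)  ; => 1

  ;; An "a" that is not followed by a "b".
  (define a-not-b (seq (lit #\a) (not-ahead (lit #\b))))
  (a-not-b "ac" 0)  ; => 1
  (a-not-b "ab" 0)  ; => #f
</enscript>

A packrat parser adds a cache of results per rule and input position on
top of this scheme, which is what bounds the cost of backtracking.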

The
[[http://bugs.call-cc.org/export/20226/release/4/packrat/doc/packrat.pdf|documentation for packrat]]
is not too user-friendly, and among other things doesn't include any
examples of parsing from actual text.  This is because {{packrat}} is
written to work over abstract streams of tokens, not just text, but
text is usually what people want to work with when looking at parsers.
So we'll start by writing a version of the expression parser example
to work on text.

<enscript highlight="scheme">
  (use packrat)

  ;; Start with the base textual generator from the documentation.
  ;; Each call returns the next position and a (kind . value) token,
  ;; here (char . char), or #f as the token at end of input:
  (define (generator filename port)
    (let ((ateof #f)
          (pos (top-parse-position filename)))
      (lambda ()
        (if ateof
            (values pos #f)
            (let ((x (read-char port)))
              (if (eof-object? x)
                  (begin
                    (set! ateof #t)
                    (values pos #f))
                  (let ((old-pos pos))
                    (set! pos (update-parse-position pos x))
                    (values old-pos (cons x x)))))))))

  ;; Now define a character-oriented version of the expression parser:
  (define expr-parser
    (packrat-parser expr
      (expr ((a <- mulexp '#\+ b <- mulexp) (+ a b)) ((a <- mulexp) a))
      (mulexp ((a <- simple '#\* b <- simple) (* a b)) ((a <- simple) a))
      (simple ((a <- num) a) (('#\( a <- expr '#\)) a))
      (num ((a <- digits) (string->number (list->string a))))
      (digits ((a <- digit b <- digits) (cons a b)) ((a <- digit) (list a)))
      (digit ((a <- '#\0) a) ((a <- '#\1) a) ((a <- '#\2) a)
             ((a <- '#\3) a) ((a <- '#\4) a) ((a <- '#\5) a)
             ((a <- '#\6) a) ((a <- '#\7) a) ((a <- '#\8) a)
             ((a <- '#\9) a))))

  ;; ... and a utility function to parse from strings or ports,
  ;; and return the semantic results on success or #f on failure:
  (define (expr-parse x)
    (let ((g (base-generator->results
              (generator #f (if (string? x) (open-input-string x) x)))))
      (let ((res (expr-parser g)))
        (and (parse-result-successful? res)
             (parse-result-semantic-value res)))))

  ;; Some examples:
  (expr-parse "2")        ; => 2
  (expr-parse "2+2")      ; => 4
  (expr-parse "2+2*7")    ; => 16
  (expr-parse "(2+2)*7")  ; => 28
  (expr-parse "3*4+5*6")  ; => 42
</enscript>

As you can see, writing the grammar is very natural.  We've chosen to
compute the values directly in the actions, but could also return them
as s-expressions, e.g. using {{`(+ ,a ,b)}} to get an AST to work
with.
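
As an illustration of working with such an AST, suppose the actions were
changed to {{`(+ ,a ,b)}} and {{`(* ,a ,b)}}, so that parsing "2+2*7"
yields {{(+ 2 (* 2 7))}}.  A separate pass could then walk the tree.  The
{{eval-ast}} procedure below is a hypothetical helper, not part of the
packrat egg:

<enscript highlight="scheme">
  ;; Evaluate an AST built from numbers and (+ a b) / (* a b) lists.
  (define (eval-ast ast)
    (cond ((number? ast) ast)
          ((and (pair? ast) (eq? (car ast) '+))
           (+ (eval-ast (cadr ast)) (eval-ast (caddr ast))))
          ((and (pair? ast) (eq? (car ast) '*))
           (* (eval-ast (cadr ast)) (eval-ast (caddr ast))))
          (else (error "unknown AST node" ast))))

  (eval-ast '(+ 2 (* 2 7)))  ; => 16
  (eval-ast '(* (+ 2 2) 7))  ; => 28
</enscript>

Keeping the tree around makes it possible to pretty-print, optimize or
compile the expression instead of only computing its value.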

The egg is still rather primitive, and is missing some useful features
such as charsets (which would simplify our "digit" rule greatly), and
arbitrary predicates, but is definitely something that can be built
on.  For similar efforts you can search for Scheme "parser
combinators," which often refer to the same thing without the
memoization.  A parser combinator project worth checking out is Taylor
Campbell's [[http://mumble.net/~campbell/darcs/parscheme/|parscheme]].
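
To show why predicates help, here is a sketch of a predicate-based
matcher in the parser combinator style, written in plain Scheme.  It is
neither the packrat nor the parscheme API; {{satisfy}}, {{many1}} and
{{parse-number}} are names made up for this example.

<enscript highlight="scheme">
  ;; A parser is a procedure (str pos) -> new position, or #f.

  ;; Match one character satisfying pred.
  (define (satisfy pred)
    (lambda (str pos)
      (and (< pos (string-length str))
           (pred (string-ref str pos))
           (+ pos 1))))

  ;; One or more repetitions of p, consuming as many as possible.
  (define (many1 p)
    (lambda (str pos)
      (let loop ((pos (p str pos)))
        (and pos
             (let ((next (p str pos)))
               (if next (loop next) pos))))))

  ;; The ten-alternative "digit" rule collapses to a single predicate.
  (define digits (many1 (satisfy char-numeric?)))

  ;; Parse a leading unsigned integer and return
  ;; (value . position-after-it), or #f if there is none.
  (define (parse-number str)
    (let ((end (digits str 0)))
      (and end
           (cons (string->number (substring str 0 end)) end))))

  (parse-number "123+4")  ; => (123 . 3)
  (parse-number "x")      ; => #f
</enscript>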

== 5. About the Chicken Gazette

The Gazette is produced weekly by a volunteer from the Chicken
community. The latest issue can be found at
[[http://gazette.call-cc.org]] or you can follow it in your feed
reader at [[http://gazette.call-cc.org/feed.atom]]. If you'd like to
write an issue,
[[http://bugs.call-cc.org/browser/gazette/README.txt|check out the instructions]]
and come and find us in #chicken on Freenode!