Last change on this file since 19692 was 19692, checked in by Mario Domenech Goulart, 11 years ago

crunch (wiki): semantic markup; scheme highlight for code snippet; added author section

[[tags: egg]]
[[toc:]]

== crunch

=== Introduction

'''crunch''' compiles a severely restricted statically typed subset of
[[http://www.schemers.org/Documents/Standards/R5RS|R5RS]] Scheme to
C++. It can be used to generate standalone executables or code
embedded into Scheme programs.

''This extension is highly experimental and likely to contain many bugs
and incomplete functionality.''

=== Author

[[/users/felix-winkelmann|Felix Winkelmann]]

=== Requirements

[[/eggref/4/miscmacros|miscmacros]],
[[/eggref/4/defstruct|defstruct]],
[[/eggref/4/matchable|matchable]]


=== Usage

To use crunch in your Scheme code, simply wrap toplevel forms to be compiled
in a {{(crunch ...)}} expression. All toplevel procedure definitions are
accessible as global or local (depending on the context where the {{crunch}}
form occurs) procedures callable from Scheme. The {{crunch}} macro can only
be used in compiled code. To use the macro, put

  (import crunch)

in your code.

Alternatively, the {{chicken-crunch}} program can be used to translate and
compile code in the crunch Scheme dialect into C++. The generated code has
no dependencies other than the header file {{crunch.h}}, which must be
available in your C++ compiler's include path. When installing this extension with
{{chicken-install}}, the file will be located in your default include path,
usually {{$PREFIX/include}}.

The compiler can also be used through its procedural API (see {{crunch-compile}}).
In that case, load the runtime part of the compiler with

  (require-extension crunch-compiler)

Crunched procedures are in every respect identical to C/C++ functions
called via the usual CHICKEN foreign function interface. Crunch does not know
anything about Scheme data or memory management. Translated code can call back
into Scheme (see {{define-crunch-callback}}). Callbacks are usually detected
automatically, and the generated Scheme wrapper function for a crunched procedure
will be of the appropriate type, if required.

Crunch uses its own macro expander, a modified version of Al
Petrofsky's ''alexpander'', an R5RS-compliant implementation of
{{syntax-rules}} macros.

No garbage collector is used. All dynamically allocated data (strings and
number vectors) are managed using reference counting.

The dialect of R5RS Scheme supported is extremely limited. See ''Bugs and limitations''
for more information.

Note that if you use the {{crunch}} macro in your code, you must compile the
file generated by chicken in C++ mode (just pass {{-c++}} to {{csc}} when compiling).

To get maximum performance, inlining must be enabled in the C++ compiler
when compiling crunch-generated code. The default optimization options do not
enable inlining unless the default C compiler options have been overridden
during installation of CHICKEN.
Passing {{-C -O3}} to {{csc}} for crunched code will usually optimize the C++
code considerably.


=== Programming interface

==== crunch-compile

<procedure>(crunch-compile EXPRESSION [PORT debug: DBGMODE entry-point: SYMBOL])</procedure>

Compiles the toplevel expression {{EXPRESSION}} into C++ code, writing the generated
code to {{PORT}}, which defaults to the value of {{(current-output-port)}}.
If {{DBGMODE}} is given, debugging output will be written to the current output port.
{{DBGMODE}} can be a boolean or a number between 1 and 3. Debug mode 1 shows some information
about each compiled procedure; debug mode 3 generates extensive diagnostic output about
the type-inference process and the expanded code.

If the entry-point name {{SYMBOL}} is given, then the (normally hidden) toplevel
variable of the same name holding a pointer to the associated C++ function can be
accessed from C/C++ code, i.e. it is exposed under the same name. Note that the
exposed variable is a ''pointer'' to a function.

Each invocation of {{crunch-compile}} creates its own private namespace, so global variables
are not visible in subsequent compilation runs in the same process. Syntax definitions
''are'' persistent over several invocations, though.
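
As a rough sketch of the procedural API, the following writes the C++ translation of a single
definition to a file. The file name {{square.cpp}} and the procedure {{square}} are made up for
illustration, and the {{::double}} declaration is there only to pin down the argument type;
whether type inference would accept the definition without it has not been checked.

<enscript highlight="scheme">
(require-extension crunch-compiler)

;; Hypothetical example: translate one toplevel definition and store the
;; generated C++ in "square.cpp" (file name chosen for illustration).
(call-with-output-file "square.cpp"
  (lambda (port)
    (crunch-compile
      '(define (square x::double) (* x::double x::double))
      port
      entry-point: 'square)))   ; exposes a function pointer named "square"
</enscript>

Passing {{debug: 1}} as well would presumably print a summary of each compiled procedure on the
current output port, separate from the code written to the file.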

==== crunch-expand

<procedure>(crunch-expand EXPRESSION)</procedure>

Expands all macros in the given toplevel expression and returns the expansion.

==== crunch

<macro>(crunch EXPRESSION ...)</macro>

Compiles the given toplevel expressions and expands into a set of
function definitions and an invocation of compiled toplevel expressions
in {{EXPRESSION}}. The form can be used in a definition context but ends
in a non-definition form, so with some macro systems it cannot be followed
by other definitions. Calls to Scheme callbacks are detected automatically
and generate the appropriate {{foreign-safe-lambda}} definition. The result
of the executed toplevel code is unspecified.

==== define-crunch-primitives

<macro>(define-crunch-primitives ((NAME ARGTYPE ...) -> RESULTTYPE [C-NAME]) ...)</macro>

Defines additional primitives with the given names and argument and result types. If {{C-NAME}}
is given, it specifies the name of the actual C/C++ function to be called. Otherwise {{NAME}}
is used.
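
A minimal sketch of how a C library function might be made available as a primitive. It assumes
that {{hypot(3)}} is declared in the generated C++ code (that is, that the math header is pulled
in), which this page does not confirm. The procedure name {{distance}} is hypothetical.

<enscript highlight="scheme">
(use crunch)

;; Assumption: hypot(3) from the C math library is visible to the
;; generated C++ code. No C-NAME is given, so NAME is used as-is.
(define-crunch-primitives
  ((hypot double double) -> double))

(crunch
  ;; The declared names include the "::double" suffix.
  (define (distance x::double y::double)
    (hypot x::double y::double)))

(print (distance 3.0 4.0))
</enscript>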

==== define-crunch-callback

<macro>(define-crunch-callback (NAME (ARGFTYPE1 VAR1) ...) RESULTFTYPE BODY ...)</macro>

Equivalent to {{define-external}}, but makes the callback accessible in subsequent
translations of crunch code.

Note that you have to pass {{-emit-external-prototypes-first}} to
{{csc}} (or {{chicken}}) when you use crunch callbacks to place
function prototypes for the callbacks in front of code generated by
crunch.
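
The following sketch shows a callback being defined and then called from crunched code. The names
{{report}} and {{report-twice}} are invented for this example. The file would have to be compiled
with both {{-c++}} and {{-emit-external-prototypes-first}} passed to {{csc}}.

<enscript highlight="scheme">
(use crunch)

;; Scheme callback, visible to crunch code translated after this point.
(define-crunch-callback (report (int n)) void
  (print "crunched code reported " n))

(crunch
  ;; Calls back into Scheme twice, then returns its argument so that
  ;; the crunched procedure has an int result.
  (define (report-twice k::int)
    (report k::int)
    (report k::int)
    k::int))

(report-twice 21)
</enscript>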


=== Standalone compiler

The program {{chicken-crunch}} can be used to generate a standalone program
or module that has no CHICKEN dependencies.

  usage: chicken-crunch OPTION | FILENAME ...
   
    -h            show this message
    -o FILENAME   set output filename
    -d            enable debug output
    -dd           enable more debug output
    -ddd          enable massive debug output
    -cc CC        select C++ compiler (default: "c++")
    -expand       only show code after expansion
    -entry NAME   set entry-point procedure
    -translate    only generate C++, don't compile
   
    All other options (arguments beginning with "-") are passed to
    the C++ compiler. FILENAME may be "-", which reads source code
    from stdin.

Provided the file {{crunch.h}} is in the include path, the generated
C++ code can be compiled by itself. To link, you may have to add the
{{-lm}} switch to the linker, depending on the platform on which you
are compiling the code.

=== The type system

Crunch performs type inference to find out the types of local and global variables.
It currently knows about these types:

<table>
<tr><th>Crunch type</th><th>C type</th><th>Description</th></tr>
<tr><td>{{int}} {{short}} {{long}}</td><td>{{int}} {{short}} {{long}}</td><td>integer numbers</td></tr>
<tr><td>{{float}} {{double}}</td><td>{{float}} {{double}}</td><td>floating point numbers</td></tr>
<tr><td>{{bool}}</td><td>{{bool}}</td><td>boolean type</td></tr>
<tr><td>{{char}}</td><td>{{char}}</td><td>characters</td></tr>
<tr><td>{{void}}</td><td>{{void}}</td><td>the type of the "unspecified" value</td></tr>
<tr><td>{{c-string}}</td><td>{{char *}}</td><td>strings</td></tr>
<tr><td>{{blob}}</td><td>{{void *}}</td><td>a shapeless byte sequence</td></tr>
<tr><td>{{c-pointer}}</td><td>{{void *}}</td><td>an opaque pointer</td></tr>
<tr><td>{{u8vector}} {{s8vector}} {{u16vector}} {{s16vector}} {{u32vector}} {{s32vector}} {{f32vector}}
{{f64vector}}</td>
<td>{{unsigned char *}} {{signed char *}} {{unsigned short *}} {{short *}} {{unsigned int *}}
{{int *}} {{float *}} {{double *}}</td>
<td>[[http://srfi.schemers.org/srfi-4/|SRFI-4]] homogeneous number vectors</td></tr>
</table>

Important: callbacks are likely to trigger a garbage
collection, which will invalidate references to number vectors or
strings allocated in normal Scheme code. This does not apply to data allocated
inside crunched code, which is not subject to garbage collection.

Variables defined with {{define}} or {{set!}} or bound with {{let}} or
in a {{lambda}} list can be declared to have a particular type by
suffixing them with {{::}} followed by a typename:

<enscript highlight=scheme>
  (crunch
    (let ((a::int (* 8 (sin 1))))
      (display a::int)))               ; shows "8"
</enscript>

Note that the name of the variable really is {{a::int}}, not {{a}}. You usually
don't need these declarations, though.
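
The same suffix notation applies to parameters in a lambda list. A small sketch, with the
procedure name {{scale}} and its parameter names chosen only for illustration:

<enscript highlight="scheme">
(use crunch)

(crunch
  ;; Both parameters are declared as doubles. Every reference has to
  ;; spell out the full name, suffix included.
  (define (scale x::double factor::double)
    (* x::double factor::double)))

(print (scale 1.5 2.0))
</enscript>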

Note also the absence of any other data types, in particular lists, vectors
or record structures.

Crunched functions may return results of the following types:

  char
  int
  short
  long
  float
  double
  c-string
  c-pointer

Polymorphic procedures are not supported.


=== Available syntax

The following non-standard macros are provided:

  cond-expand
  when
  unless
  switch
  rec

{{cond-expand}} recognizes the feature identifiers {{crunch}}, {{srfi-0}},
{{highlevel-macros}} and {{syntax-rules}}. When code is compiled to
a standalone program with {{chicken-crunch}}, the feature identifier
{{crunch-standalone}} is defined as well.


=== Available primitives

All primitives take a fixed number of arguments; optional or "rest"
arguments are not supported.  Primitives may not be redefined. Uses of
primitives in non-operator position are treated as
{{(lambda (tmp1 ...) (<primitive> tmp1 ...))}}.

Argument type abbreviations:

<table>
<tr><td>O O1 O2</td><td>any data object</td></tr>
<tr><td>X Y</td><td>number</td></tr>
<tr><td>N N1 N2</td><td>integer</td></tr>
<tr><td>K K1 K2</td><td>positive integer</td></tr>
<tr><td>R R1 R2</td><td>inexact number</td></tr>
<tr><td>S S1 S2</td><td>string</td></tr>
<tr><td>C C1 C2</td><td>character</td></tr>
<tr><td>B</td><td>blob</td></tr>
<tr><td>U8 S8 U16 S16 U32 S32 F32 F64</td><td>SRFI-4 number vector</td></tr>
<tr><td>P</td><td>pointer</td></tr>
</table>

The following R5RS procedures are provided:

  (not O)

  (eq? O1 O2)
  (eqv? O1 O2)
  (equal? O1 O2)

  (+ X Y)
  (- X Y)
  (* X Y)
  (/ X Y)
  (= X Y)
  (> X Y)
  (< X Y)
  (>= X Y)
  (<= X Y)
  (abs X)
  (acos R)
  (asin R)
  (atan R)
  (ceiling X)
  (cos R)
  (display O)
  (even? N)
  (exact? X)
  (exact->inexact X)
  (exp R)
  (expt R1 R2)
  (floor X)
  (inexact? X)
  (inexact->exact X)
  (integer? X)
  (log R)
  (max X Y)
  (min X Y)
  (modulo N1 N2)
  (negative? X)
  (odd? N)
  (positive? X)
  (quotient N1 N2)
  (remainder N1 N2)
  (round X)
  (sin R)
  (sqrt X)
  (tan R)
  (truncate X)
  (zero? X)

{{max}}, {{min}} and {{expt}} are not exactness preserving. {{expt}}
always returns an inexact result.

  (char=? C1 C2)
  (char>? C1 C2)
  (char<? C1 C2)
  (char>=? C1 C2)
  (char<=? C1 C2)
  (char->integer C)
  (char-alphabetic? C)
  (char-ci=? C1 C2)
  (char-ci>? C1 C2)
  (char-ci<? C1 C2)
  (char-ci>=? C1 C2)
  (char-ci<=? C1 C2)
  (char-downcase C)
  (char-lower-case? C)
  (char-numeric? C)
  (char-upper-case? C)
  (char-upcase C)
  (char-whitespace? C)
  (integer->char K)

  (number->string X K)
  (make-string N C)
  (string=? S1 S2)
  (string>? S1 S2)
  (string<? S1 S2)
  (string>=? S1 S2)
  (string<=? S1 S2)
  (string->number S K)
  (string-ci=? S1 S2)
  (string-ci>? S1 S2)
  (string-ci<? S1 S2)
  (string-ci>=? S1 S2)
  (string-ci<=? S1 S2)
  (string-append S1 S2)
  (string-copy S)
  (string-fill! S1 C)
  (string-length S)
  (string-ref S K)
  (string-set! S K C)
  (substring S K1 K2)

{{string->number}} does not detect invalid numerical syntax and simply
wraps {{strtol(3)}}/{{strtod(3)}}.  If a radix different from 10 is
given, the result will always be converted with {{strtol(3)}}.

{{number->string}} ignores the radix argument if the converted number is inexact.

  (display X)
  (newline)
  (write-char C)

{{write-char}}, {{display}} and {{newline}} always write to stdout.


Non-R5RS procedures (see [[/man/4|The User's Manual]] for more information):

  (add1 X)
  (atan2 R1 R2)
  (arithmetic-shift N1 N2)
  (bitwise-and N1 N2)
  (bitwise-ior N1 N2)
  (bitwise-not N)
  (bitwise-xor N1 N2)
  (sub1 X)

  (f32vector-length F32)
  (f32vector-ref F32 K)
  (f32vector-set! F32 K R)
  (f64vector-length F64)
  (f64vector-ref F64 K)
  (f64vector-set! F64 K R)
  (make-f32vector K R)
  (make-f64vector K R)
  (make-s16vector K N)
  (make-s32vector K N)
  (make-s8vector K N)
  (make-u16vector K1 K2)
  (make-u32vector K1 K2)
  (make-u8vector K1 K2)
  (s16vector-length S16)
  (s16vector-ref S16 K)
  (s16vector-set! S16 K N)
  (s32vector-length S32)
  (s32vector-ref S32 K)
  (s32vector-set! S32 K N)
  (s8vector-length S8)
  (s8vector-ref S8 K)
  (s8vector-set! S8 K N)
  (subf32vector F32 K1 K2)
  (subf64vector F64 K1 K2)
  (subs16vector S16 K1 K2)
  (subs32vector S32 K1 K2)
  (subs8vector S8 K1 K2)
  (subu16vector U16 K1 K2)
  (subu32vector U32 K1 K2)
  (subu8vector U8 K1 K2)
  (u16vector-length U16)
  (u16vector-ref U16 K)
  (u16vector-set! U16 K1 K2)
  (u32vector-length U32)
  (u32vector-ref U32 K)
  (u32vector-set! U32 K1 K2)
  (u8vector-length U8)
  (u8vector-ref U8 K)
  (u8vector-set! U8 K1 K2)

  (blob->f32vector B)
  (blob->f32vector/shared B)
  (blob->f64vector B)
  (blob->f64vector/shared B)
  (blob->s16vector B)
  (blob->s16vector/shared B)
  (blob->s32vector B)
  (blob->s32vector/shared B)
  (blob->s8vector B)
  (blob->s8vector/shared B)
  (blob->string B)
  (blob->string/shared B)
  (blob->u16vector B)
  (blob->u16vector/shared B)
  (blob->u32vector B)
  (blob->u32vector/shared B)
  (blob->u8vector B)
  (blob->u8vector/shared B)
  (f32vector->blob F32)
  (f32vector->blob/shared F32)
  (f64vector->blob F64)
  (f64vector->blob/shared F64)
  (s16vector->blob S16)
  (s16vector->blob/shared S16)
  (s32vector->blob S32)
  (s32vector->blob/shared S32)
  (s8vector->blob S8)
  (s8vector->blob/shared S8)
  (string->blob S)
  (string->blob/shared S)
  (u16vector->blob U16)
  (u16vector->blob/shared U16)
  (u32vector->blob U32)
  (u32vector->blob/shared U32)
  (u8vector->blob U8)
  (u8vector->blob/shared U8)

The {{.../shared}} conversion procedures return data objects that share the actual
storage with the argument objects. This can be used to view the same bytes under different types without copying.
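
One possible use of the shared conversions, sketched under the assumption that chaining two of
them behaves as described above: reinterpreting the storage of an {{s32vector}} as bytes. The
procedure name {{first-byte}} is hypothetical, and the value returned depends on the byte order
of the machine.

<enscript highlight="scheme">
(use crunch)

(crunch
  ;; Store the argument in a one-element s32vector and read the first
  ;; byte of its storage through a shared u8vector view.
  (define (first-byte n::int)
    (let ((v (make-s32vector 1 n::int)))
      (u8vector-ref
        (blob->u8vector/shared (s32vector->blob/shared v))
        0))))

(print (first-byte 258))
</enscript>

The vector is allocated inside crunched code on purpose: as noted under ''Bugs and limitations'',
vectors passed in from Scheme do not carry a known length.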

  (flush-output)

  (void)
  (error S)
  (exit N)
  (argc)
  (argv-ref K)

{{error}} shows a message and invokes {{abort(3)}}. {{argc}} returns the number of
arguments passed to the process (including the program name) and {{argv-ref}} returns
the command line argument with the given index (or the program name, when the index is zero).
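
As an illustration of {{argc}} and {{argv-ref}}, here is a sketch of a standalone source file that
echoes its arguments. The file name {{echo.scm}} is made up, and it is assumed that toplevel
expressions in a file given to {{chicken-crunch}} run when the program starts. It could be built
with something like {{chicken-crunch echo.scm -o echo}}.

<enscript highlight="scheme">
;; echo.scm (hypothetical): print each command-line argument on its own
;; line. Index 0 is the program name, so the loop starts at 1.
(do ((i 1 (add1 i)))
    ((>= i (argc)))
  (display (argv-ref i))
  (newline))
</enscript>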

  (pointer-u8-ref P N)
  (pointer-s8-ref P N)
  (pointer-u16-ref P N)
  (pointer-s16-ref P N)
  (pointer-u32-ref P N)
  (pointer-s32-ref P N)
  (pointer-f32-ref P N)
  (pointer-f64-ref P N)
  (pointer-u8-set! P N1 N2)
  (pointer-s8-set! P N1 N2)
  (pointer-u16-set! P N1 N2)
  (pointer-s16-set! P N1 N2)
  (pointer-u32-set! P N1 N2)
  (pointer-s32-set! P N1 N2)
  (pointer-f32-set! P N R)
  (pointer-f64-set! P N R)


=== Notes

* Pass {{-DDBGALLOC}} to the C++ compiler (either through {{chicken-crunch}} or to {{csc}} via {{-C -DDBGALLOC}}) to see log messages about the allocation and de-allocation of dynamic number vectors or strings.
* Runtime errors invoke {{abort(3)}} and thus cannot be caught.


=== Bugs and limitations

* Lexical scope is not supported: only references to global variables and variables local to the current {{lambda}} construct (including {{let}}-bound variables) are visible. Expressions of the form {{((lambda (...) ...) ...)}} are converted into the corresponding {{let}} construct.
* Local procedures are not available.
* {{letrec}} is not supported (it makes no sense without local procedures).
* Continuations are not supported.
* Multiple values are not supported.
* Tail calls are only detected in self-recursive functions.
* Rest-arguments (dotted lambda lists) are not supported.
* Numeric overflow of fixnum operations is not detected.
* Nearly no error checks are made at runtime.
* Named {{let}} is always assumed to be a looping construct; calls to the loop variable '''must''' be in tail position.
* {{do}} and named {{let}} loops always return an unspecified value.
* The correctness of the C++ template code is unclear. C++ is insane.
* If a homogeneous number vector or string is passed from Scheme to C++ code generated by crunch, then the length of the passed array is not known and the associated {{...-length}} primitive and primitives that require the length of the vector will abort.
* Type-related errors do not always produce particularly useful context information.
* Error messages are generally pretty bad.
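
Because loops return an unspecified value, a result computed inside a loop has to be accumulated
in a variable and returned afterwards. A minimal sketch of that pattern, with the procedure name
{{sum-to}} chosen for illustration:

<enscript highlight="scheme">
(use crunch)

(crunch
  ;; Sum the integers from 1 to n. The do loop itself yields nothing
  ;; useful, so the total is kept in acc and returned after the loop.
  (define (sum-to n::int)
    (let ((acc 0))
      (do ((i 1 (add1 i)))
          ((> i n::int))
        (set! acc (+ acc i)))
      acc)))

(print (sum-to 4))
</enscript>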


=== Examples

<enscript highlight="scheme">
(use crunch)

(crunch
  (define (string-reverse str)
    (let* ((n (string-length str))
           (s2 (make-string n #\space)))
      (do ((i 0 (add1 i)))
          ((>= i n))
        (string-set! s2 (sub1 (- n i)) (string-ref str i)))
      s2)))

(print (string-reverse "this is a test!"))
</enscript>


=== License

 Copyright (c) 2007-2009, Felix L. Winkelmann
 The "alexpander" is Copyright (c) 2002-2004, Al Petrofsky
 
 All rights reserved.
 
 Redistribution and use in source and binary forms, with or without
 modification, are permitted provided that the following conditions
 are met:
 
  Redistributions of source code must retain the above copyright
    notice, this list of conditions and the following disclaimer.
  Redistributions in binary form must reproduce the above copyright
    notice, this list of conditions and the following disclaimer in the
    documentation and/or other materials provided with the distribution.
  Neither the name of the author nor the names of its contributors
    may be used to endorse or promote products derived from this
    software without specific prior written permission.
 
 THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
 "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
 LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
 A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
 HOLDERS OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
 SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
 LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
 DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
 THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
 (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
 OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.


=== Version history

; 0.7.5 : fixed bug related to callbacks with {{void}} result type
; 0.7.4 : removed unused test files
; 0.7.3 : two bugfixes (Thanks to Jeronimo)
; 0.7.2 : fixed silly mistake
; 0.7.1 : fixed bug in setup script
; 0.7 : ported to CHICKEN 4
; 0.6 : updated to newest alexpander
; 0.5 : fixed buggy formatting directive
; 0.4 : support for libarena by Ivan Raikov
; 0.3 : fixed bugs in character handling [thanks to Alex Shinn]
; 0.2 : fixed bugs in naming of {{char->integer}} and {{integer->char}}
; 0.1 : initial release