Last change on this file since 36030 was 36030, checked in by Ivan Raikov, 2 years ago

added fmt C5 egg doc

[[tags: egg]]
[[toc:]]

== fmt

=== Introduction
7
8A library of procedures for formatting Scheme objects to text in
9various ways, and for easily concatenating, composing and extending
10these formatters efficiently without resorting to capturing and
11manipulating intermediate strings.
12
13Chicken eggs are provided for formatting of arbitrary objects
14('''fmt'''), Unicode ('''fmt-unicode'''), C code ('''fmt-c''') and
15ANSI color output ('''fmt-color''').
16
17This is a copy of the [[http://synthcode.com/scheme/fmt/|canonical
18documentation]] (off-site).
19
20
21=== Background
22
23There are several approaches to text formatting. Building strings to
24{{display}} is not acceptable, since it doesn't scale to very large
25output. The simplest realistic idea, and what people resort to in
26typical portable Scheme, is to interleave {{display}} and {{write}}
27and manual loops, but this is both extremely verbose and doesn't
28compose well. A simple concept such as padding space can't be achieved
29directly without somehow capturing intermediate output.
30
31The traditional approach is to use templates - typically strings,
32though in theory any object could be used and indeed Emacs' mode-line
33format templates allow arbitrary sexps. Templates can use either
34escape sequences (as in C's {{printf}} and
35[[http://www.harlequin.com/education/books/HyperSpec/|CL's]]
36{{format}}) or pattern matching (as in Visual Basic's Format,
37[[http://www.perl.com/lpt/a/819|Perl6's]] {{form}}, and SQL date
38formats). The primary disadvantage of templates is the relative
39difficulty (usually impossibility) of extending them, their
40opaqueness, and the unreadability that arises with complex
41formats. Templates are not without their advantages, but they are
42already addressed by other libraries such as
43[[http://srfi.schemers.org/srfi-28/|SRFI-28]] and
44[[http://srfi.schemers.org/srfi-48/|SRFI-48]].
45
46This library takes a combinator approach. Formats are nested chains of
47closures, which are called to produce their output as needed. The
48primary goal of this library is to have, first and foremost, a
49maximally expressive and extensible formatting library. The next most
50important goal is scalability - to be able to handle arbitrarily large
51output and not build intermediate results except where necessary. The
52third goal is brevity and ease of use.
53
54=== Usage
55
56The primary interface is the {{fmt}} procedure:
57
58<procedure>(fmt <output-dest> <format> ...)</procedure>
59
60where {{<output-dest>}} has the same semantics as with {{format}} -
61specifically it can be an output-port, {{#t}} to indicate the current
62output port, or {{#f}} to accumulate output into a string.
63
Each {{<format>}} should be a format closure as discussed below. As a
convenience, non-procedure arguments are also allowed and are
formatted similarly to {{display}}, so that

<enscript highlight=scheme>
(fmt #f "Result: " res nl)
</enscript>

would return the string {{"Result: 42\n"}}, assuming {{res}} is bound
to {{42}}.

{{nl}} is the newline format combinator.
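
As a minimal sketch of the three kinds of {{<output-dest>}} described
above (assuming the egg is loaded with {{(import fmt)}} under CHICKEN 5,
and with {{res}} and {{s}} as illustrative names):

<enscript highlight=scheme>
(import fmt)

(define res 42)

;; #t: write to the current output port.
(fmt #t "Result: " res nl)

;; An explicit output port works the same way.
(fmt (current-error-port) "Result: " res nl)

;; #f: accumulate the output and return it as a string.
(define s (fmt #f "Result: " res nl))
;; s is now "Result: 42\n"
</enscript>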
76
77
78=== Specification
79
80The procedure names have gone through several variations, and I'm
81still open to new suggestions. The current approach is to use
82abbreviated forms of standard output procedures when defining an
83equivalent format combinator (thus {{display}} becomes {{dsp}} and
84{{write}} becomes {{wrt}}), and to use an {{fmt-}} prefix for
85utilities and less common combinators. Variants of the same formatter
86get a {{/<variant>}} suffix.
87
88
89==== Formatting Objects
90
91===== dsp
92<procedure>(dsp <obj>)</procedure>
93
94Outputs {{<obj>}} using {{display}} semantics. Specifically, strings
95are output without surrounding quotes or escaping and characters are
96written as if by {{write-char}}. Other objects are written as with
97{{write}} (including nested strings and chars inside {{<obj>}}). This
98is the default behavior for top-level formats in {{fmt}}, {{cat}} and
99most other higher-order combinators.
100
101
102===== wrt
103<procedure>(wrt <obj>)</procedure>
104
105Outputs {{<obj>}} using write semantics. Handles shared structures as
106in [[http://srfi.schemers.org/srfi-38/|SRFI-38]].
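
The difference between {{dsp}} and {{wrt}} can be sketched as follows.
The variable names are illustrative, and the expected text in the
comments follows from the descriptions above:

<enscript highlight=scheme>
(import fmt)

;; dsp: a top-level string is output as-is, without quotes.
(define shown (fmt #f (dsp "hi")))             ; the text: hi

;; wrt: the same string is output with quotes, as by write.
(define written (fmt #f (wrt "hi")))           ; the text: "hi"

;; Strings and chars nested inside another object are written
;; with write semantics even under dsp.
(define nested (fmt #f (dsp (list "hi" #\a)))) ; the text: ("hi" #\a)
</enscript>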
107
108
109===== wrt/unshared
110<procedure>(wrt/unshared <obj>)</procedure>
111
112As above, but doesn't handle shared structures. Infinite loops can
113still be avoided if used inside a combinator that truncates data (see
114{{trim}} and {{fit}} below).
115
116
117===== pretty
118<procedure>(pretty <obj>)</procedure>
119
Pretty-prints {{<obj>}}. Also handles shared structures. Unlike many
other pretty printers, vectors and data lists (lists that don't begin
with a (nested) symbol) are printed in tabular format when there's
room, greatly saving vertical space.
124
125
126===== pretty/unshared
127<procedure>(pretty/unshared <obj>)</procedure>
128
129As above but without sharing.
130
131===== slashified
132<procedure>(slashified <str> [<quote-ch> <esc-ch> <renamer>])</procedure>
133
134Outputs the string {{<str>}}, escaping any quote or escape
135characters. If {{<esc-ch>}} is {{#f}} escapes only the {{<quote-ch>}}
136by doubling it, as in SQL strings and CSV values. If {{<renamer>}} is
137provided, it should be a procedure of one character which maps that
138character to its escape value, e.g. {{#\newline => #\n}}, or {{#f}} if
139there is no escape value.
140
<enscript highlight=scheme>
(fmt #f (slashified "hi, \"bob!\""))

=> "hi, \"bob!\""
</enscript>
146
147
148===== maybe-slashified
149<procedure>(maybe-slashified <str> <pred> [<quote-ch> <esc-ch> <renamer>])</procedure>
150
151Like {{slashified}}, but first checks if any quoting is required (by
152the existence of either any quote or escape characters, or any
153character matching {{<pred>}}), and if so outputs the string in quotes
154and with escapes. Otherwise outputs the string as is.
155
<enscript highlight=scheme>
(fmt #f (maybe-slashified "foo" char-whitespace? #\"))

=> "foo"

(fmt #f (maybe-slashified "foo bar" char-whitespace? #\"))

=> "\"foo bar\""

(fmt #f (maybe-slashified "foo\"bar\"baz" char-whitespace? #\"))

=> "\"foo\\\"bar\\\"baz\""
</enscript>
169
170
171==== Formatting Numbers
172
173===== num
174<procedure>(num <n> [<radix> <precision> <sign> <comma> <comma-sep> <decimal-sep>])</procedure>
175
176Formats a single number {{<n>}}. You can optionally specify any
177{{<radix>}} from {{2}} to {{36}} (even if {{<n>}} isn't an
178integer). {{<precision>}} forces a fixed-point format.
179
A {{<sign>}} of {{#t}} indicates to output a plus sign ({{+}}) for
positive numbers. However, if {{<sign>}} is a character, it means to
wrap the number with that character and its mirror opposite if the
number is negative. For example, {{#\(}} prints negative numbers in
parentheses, financial style: {{-3.14 => (3.14)}}.
185
186{{<comma>}} is an integer specifying the number of digits between
187commas. Variable length, as in subcontinental-style, is not yet
188supported.
189
190{{<comma-sep>}} is the character to use for commas, defaulting to
191{{#\,}}.
192
{{<decimal-sep>}} is the character to use for decimals, defaulting to
{{#\.}}, or to {{#\,}} (European style) if {{<comma-sep>}} is already
{{#\.}}.
196
197These parameters may seem unwieldy, but they can also take their
198defaults from state variables, described below.
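
A short sketch of the positional arguments to {{num}}. The variable
names are illustrative, and the expected strings in the comments are
derived from the parameter descriptions above rather than from a test
run:

<enscript highlight=scheme>
(import fmt)

;; <radix> 16; num does not add a radix prefix.
(define hex (fmt #f (num 90 16)))             ; expected "5a"

;; <radix> 10 with a fixed-point <precision> of 2 digits.
(define fixed (fmt #f (num 3.14159 10 2)))    ; expected "3.14"

;; A character <sign> wraps negative numbers, financial style.
(define debit (fmt #f (num -3.14 10 2 #\()))  ; expected "(3.14)"
</enscript>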
199
200
201===== num/comma
202<procedure>(num/comma <n> [<base> <precision> <sign>])</procedure>
203
204Shortcut for {{num}} to print with commas.
205
206<enscript highlight=scheme>
207(fmt #f (num/comma 1234567))
208
209=> "1,234,567"
210</enscript>
211
212===== num/si
213<procedure>(num/si <n> [<base> <suffix>])</procedure>
214
215Abbreviates {{<n>}} with an SI suffix as in the -h or --si option to
216many GNU commands. The base defaults to {{1024}}, using suffix names
217like Ki, Mi, Gi, etc. Other bases (e.g. the standard 1000) have the
218suffixes k, M, G, etc.
219
220The {{<suffix>}} argument is appended only if an abbreviation is used.
221
222<enscript highlight=scheme>
223(fmt #f (num/si 608))
224
225=> "608"
226
227(fmt #f (num/si 3986))
228
229=> "3.9Ki"
230
231(fmt #f (num/si 3986 1000 "B"))
232
233=> "4kB"
234</enscript>
235
236See [[http://www.bipm.org/en/si/si_brochure/chapter3/prefixes.html|http://www.bipm.org/en/si/si_brochure/chapter3/prefixes.html]].
237
238
239===== num/fit
240<procedure>(num/fit <width> <n> . <ARGS>)</procedure>
241
242Like {{num}}, but if the result doesn't fit in {{<width>}}, output
243instead a string of hashes (with the current {{<precision>}}) rather
244than showing an incorrectly truncated number. For example
245
246<enscript highlight=scheme>
247(fmt #f (fix 2 (num/fit 4 12.345))) => "#.##"
248</enscript>
249
250
251===== num/roman
252<procedure>(num/roman <n>)</procedure>
253
254Formats the number as a Roman numeral:
255
256<enscript highlight=scheme>
257(fmt #f (num/roman 1989)) => "MCMLXXXIX"
258</enscript>
259
260
261===== num/old-roman
262<procedure>(num/old-roman <n>)</procedure>
263
264Formats the number as an old-style Roman numeral, without the
265subtraction abbreviation rule:
266
267<enscript highlight=scheme>
268(fmt #f (num/old-roman 1989)) => "MDCCCCLXXXVIIII"
269</enscript>
270
271
272==== Formatting Space
273
274===== nl
275<constant>nl</constant>
276
277Outputs a newline.
278
279
280===== fl
281<constant>fl</constant>
282
283Short for "fresh line," outputs a newline only if we're not already at
284the start of a line.
285
286===== space-to
287<procedure>(space-to <column>)</procedure>
288
289Outputs spaces up to the given {{<column>}}. If the current column is
290already >= {{<column>}}, does nothing.
291
292
293===== tab-to
294<procedure>(tab-to [<tab-width>])</procedure>
295
296Outputs spaces up to the next tab stop, using tab stops of width
297{{<tab-width>}}, which defaults to {{8}}. If already on a tab stop,
298does nothing. If you want to ensure you always tab at least one space,
299you can use {{(cat " " (tab-to width))}}.
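
The column-aware spacing formatters can be sketched like this
(illustrative variable names; the expected strings assume columns are
counted from 0 at the start of the output):

<enscript highlight=scheme>
(import fmt)

;; "ab" ends at column 2, so space-to adds 3 spaces to reach column 5.
(define spaced (fmt #f "ab" (space-to 5) "cd"))    ; expected "ab   cd"

;; Already past column 5: space-to does nothing.
(define past (fmt #f "abcdefgh" (space-to 5) "|")) ; expected "abcdefgh|"

;; With tab stops every 4 columns, the next stop after column 2 is 4.
(define tabbed (fmt #f "ab" (tab-to 4) "cd"))      ; expected "ab  cd"

;; fl only emits a newline when not already at the start of a line.
(define fresh (fmt #f "abc" fl "def" nl fl))       ; expected "abc\ndef\n"
</enscript>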
300
301===== fmt-null
302<constant>fmt-null</constant>
303
304Outputs nothing (useful in combinators and as a default noop in
305conditionals).
306
307
308==== Concatenation
309
310===== cat
311<procedure>(cat <format> ...)</procedure>
312
313Concatenates the output of each {{<format>}}.
314
315
316===== apply-cat
317<procedure>(apply-cat <list>)</procedure>
318
319Equivalent to {{(apply cat <list>)}} but may be more efficient.
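
Because {{cat}} returns a formatter rather than a string, it is a
natural way to define new formatters. A minimal sketch, where
{{key-value}} is a hypothetical helper and not part of the library:

<enscript highlight=scheme>
(import fmt)

;; key-value returns a formatter; nothing is output until it is
;; passed to fmt.
(define (key-value key value)
  (cat key ": " (wrt value)))

(define report
  (fmt #f (key-value "name" "fmt") nl
          (key-value "version" 5) nl))
;; report is the two lines:
;;   name: "fmt"
;;   version: 5

;; apply-cat takes the formats as a single list.
(define joined (fmt #f (apply-cat (list "a" "b" 1))))  ; "ab1"
</enscript>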
320
321
322===== fmt-join
323<procedure>(fmt-join <formatter> <list> [<sep>])</procedure>
324
325Formats each element {{<elt>}} of {{<list>}} with {{(<formatter>
326<elt>)}}, inserting {{<sep>}} in between. {{<sep>}} defaults to the
327empty string, but can be any format.
328
329<enscript highlight=scheme>
330(fmt #f (fmt-join dsp '(a b c) ", "))
331
332=> "a, b, c"
333</enscript>
334
335
===== fmt-join/prefix, fmt-join/suffix
337<procedure>(fmt-join/prefix <formatter> <list> [<sep>])</procedure><br>
338<procedure>(fmt-join/suffix <formatter> <list> [<sep>])</procedure>
339
340<enscript highlight=scheme>
341(fmt #f (fmt-join/prefix dsp '(usr local bin) "/"))
342
343=> "/usr/local/bin"
344</enscript>
345
346As {{fmt-join}}, but inserts {{<sep>}} before/after every element.
347
348
349===== fmt-join/last
350<procedure>(fmt-join/last <formatter> <last-formatter> <list> [<sep>])</procedure>
351
352As {{fmt-join}}, but the last element of the list is formatted with
353{{<last-formatter>}} instead.
354
355
356===== fmt-join/dot
357<procedure>(fmt-join/dot <formatter> <dot-formatter> <list> [<sep>])</procedure>
358
359As {{fmt-join}}, but if the list is a dotted list, then formats the
360dotted value with {{<dot-formatter>}} instead.
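
A sketch of the remaining {{fmt-join}} variants. The lambdas and
variable names are illustrative; the expected strings for the first
two follow from the descriptions above, and no exact output is claimed
for {{fmt-join/dot}}:

<enscript highlight=scheme>
(import fmt)

;; The separator is emitted after every element, including the last.
(define stmts (fmt #f (fmt-join/suffix dsp '(a b c) ";")))
;; expected "a;b;c;"

;; The last element gets its own formatter.
(define series
  (fmt #f (fmt-join/last dsp
                         (lambda (x) (cat "and " x))
                         '(a b c)
                         ", ")))
;; expected "a, b, and c"

;; For a dotted list, the tail is passed to the dot formatter.
(define dotted
  (fmt #f (fmt-join/dot dsp
                        (lambda (x) (cat ". " x))
                        '(a b . c)
                        " ")))
</enscript>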
361
362
363==== Padding and Trimming
364
365===== pad, pad/left, pad/both
366<procedure>(pad <width> <format> ...)</procedure><br>
367<procedure>(pad/left <width> <format> ...)</procedure><br>
368<procedure>(pad/both <width> <format> ...)</procedure>
369
Analogs of [[http://srfi.schemers.org/srfi-13/srfi-13.html|SRFI-13]]
{{string-pad}}, these add extra space to the right ({{pad}}), left
({{pad/left}}) or both sides ({{pad/both}}) of the output generated by
the {{<format>}}s to pad it to {{<width>}}. If the output already
exceeds {{<width>}}, they have no effect. {{pad/both}} will include an
extra space on the right side of the output if the difference is odd.
376
377{{pad}} does not accumulate any intermediate data.
378
379Note these are column-oriented padders, so won't necessarily work with
380multi-line output (padding doesn't seem a likely operation for
381multi-line output).
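
A small sketch of the three padders, with brackets added to make the
padding visible. The variable names are illustrative, and the expected
strings assume that {{pad}} pads on the right and {{pad/left}} on the
left:

<enscript highlight=scheme>
(import fmt)

(define right  (fmt #f "[" (pad 6 "abc") "]"))       ; expected "[abc   ]"
(define left   (fmt #f "[" (pad/left 6 "abc") "]"))  ; expected "[   abc]"

;; The difference (3) is odd, so the extra space goes on the right.
(define both   (fmt #f "[" (pad/both 6 "abc") "]"))  ; expected "[ abc  ]"

;; Output wider than <width> is left untouched.
(define wide   (fmt #f "[" (pad 2 "abc") "]"))       ; expected "[abc]"
</enscript>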
382
383
384===== trim, trim/left, trim/both
385<procedure>(trim <width> <format> ...)</procedure><br>
386<procedure>(trim/left <width> <format> ...)</procedure><br>
387<procedure>(trim/both <width> <format> ...)</procedure>
388
Analogs of [[http://srfi.schemers.org/srfi-13/srfi-13.html|SRFI-13]]
{{string-trim}}, these truncate the output of the {{<format>}}s to force
it in under {{<width>}} columns. As soon as any of the {{<format>}}s
exceeds {{<width>}}, formatting stops and the result is truncated,
returning control to whoever called {{trim}}. If {{<width>}} is not
exceeded, they have no effect.

If a truncation ellipsis is set (e.g. with the {{ellipses}} procedure
below), then when any truncation occurs {{trim}} and {{trim/left}}
will append and prepend the ellipsis, respectively. {{trim/both}} will
both prepend and append. The length of the ellipsis is taken into
account when truncating the original string, so that the total width
will never be longer than {{<width>}}.
402
403<enscript highlight=scheme>
404(fmt #f (ellipses "..." (trim 5 "abcde")))
405
406=> "abcde"
407
408(fmt #f (ellipses "..." (trim 5 "abcdef")))
409
410=> "ab..."
411</enscript>
412
413
414===== trim/length
415<procedure>(trim/length <width> <format> ...)</procedure>
416
417A variant of {{trim}} which acts on the actual character count rather
418than columns, useful for truncating potentially cyclic data.
419
420
421===== fit, fit/left, fit/both
422<procedure>(fit <width> <format> ...)</procedure><br>
423<procedure>(fit/left <width> <format> ...)</procedure><br>
424<procedure>(fit/both <width> <format> ...)</procedure>
425
A combination of {{pad}} and {{trim}}, ensures the output width is
exactly {{<width>}}, truncating if it goes over and padding if it goes
under.
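
A sketch of {{fit}} on output that is too short and too long
(illustrative names; this assumes no truncation ellipsis has been set):

<enscript highlight=scheme>
(import fmt)

;; Too short: padded up to 5 columns.
(define short (fmt #f "[" (fit 5 "ab") "]"))       ; expected "[ab   ]"

;; Too long: truncated down to 5 columns.
(define long (fmt #f "[" (fit 5 "abcdefg") "]"))   ; expected "[abcde]"
</enscript>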
429
430
431==== Format Variables
432
433You may have noticed many of the formatters are aware of the current
434column. This is because each combinator is actually a procedure of one
435argument, the current format state, which holds basic information such
436as the row, column, and any other information that a format combinator
437may want to keep track of. The basic interface is:
438
439===== fmt-let, fmt-bind
440<procedure>(fmt-let <name> <value> <format> ...)</procedure><br>
441<procedure>(fmt-bind <name> <value> <format> ...)</procedure>
442
443{{fmt-let}} sets the name for the duration of the {{<format>}}s, and
444restores it on return. {{fmt-bind}} sets it without restoring it.
445
446A convenience control structure can be useful in combination with
447these states:
448
449===== fmt-if
450<procedure>(fmt-if <pred> <pass> [<fail>])</procedure>
451
452{{<pred>}} takes one argument (the format state) and returns a boolean
453result. If true, the {{<pass>}} format is applied to the state,
454otherwise {{<fail>}} (defaulting to the identity) is applied.
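
A minimal sketch of {{fmt-if}} and {{fmt-let}}. Here {{verbose?}} and
{{debug-line}} are hypothetical names, and the state variable name
{{'radix}} is an assumption based on the {{radix}} shortcut:

<enscript highlight=scheme>
(import fmt)

(define verbose? #t)   ; hypothetical application flag

;; The predicate receives the format state but ignores it here.
;; When it returns #f, the default <fail> outputs nothing.
(define (debug-line msg)
  (fmt-if (lambda (st) verbose?)
          (cat "debug: " msg nl)))

(define out (fmt #f "start" nl (debug-line "loading") "done" nl))
;; with verbose? true: "start\ndebug: loading\ndone\n"

;; fmt-let scopes a state variable to the enclosed formats only, so
;; the second (num 5) should be formatted in the default radix again.
(define scoped (fmt #f (fmt-let 'radix 2 (num 5)) " " (num 5)))
</enscript>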
455
456Many of the previously mentioned combinators have behavior which can
457be altered with state variables. Although {{fmt-let}} and {{fmt-bind}}
458could be used, these common variables have shortcuts:
459
460
461===== radix, fix
462<procedure>(radix <k> <format> ...)</procedure><br>
463<procedure>(fix <k> <format> ...)</procedure>
464
465These alter the radix and fixed point precision of numbers output with
466{{dsp}}, {{wrt}}, {{pretty}} or {{num}}. These settings apply
467recursively to all output data structures, so that
468
469<enscript highlight=scheme>
470(fmt #f (radix 16 '(70 80 90)))
471</enscript>
472
473will return the string {{"(#x46 #x50 #x5a)"}}. Note that read/write
474invariance is essential, so for {{dsp}}, {{wrt}} and {{pretty}} the
475radix prefix is always included when not decimal. Use {{num}} if you
476want to format numbers in alternate bases without this prefix. For
477example,
478
479<enscript highlight=scheme>
480(fmt #f (radix 16 "(" (fmt-join num '(70 80 90) " ") ")"))
481</enscript>
482
483would return {{"(46 50 5a)"}}, the same output as above without the
484{{"#x"}} radix prefix.
485
486Note that fixed point formatting supports arbitrary precision in
487implementations with exact non-integral rationals. When trying to
488print inexact numbers more than the machine precision you will
489typically get results like
490
491<enscript highlight=scheme>
492(fmt #f (fix 30 #i2/3))
493
494=> "0.666666666666666600000000000000"
495</enscript>
496
497but with an exact rational it will give you as many digits as you
498request:
499
500<enscript highlight=scheme>
501(fmt #f (fix 30 2/3))
502
503=> "0.666666666666666666666666666667"
504</enscript>
505
506
507===== decimal-align
508<procedure>(decimal-align <k> <format> ...)</procedure>
509
510Specifies an alignment for the decimal place when formatting numbers,
511useful for outputting tables of numbers.
512
513<enscript highlight=scheme>
514  (define (print-angles x)
515     (fmt-join num (list x (sin x) (cos x) (tan x)) " "))
516
517  (fmt #t (decimal-align 5 (fix 3 (fmt-join/suffix print-angles (iota 5) nl))))
518</enscript>
519
520would output
521
522   0.000    0.000    1.000    0.000
523   1.000    0.842    0.540    1.557
524   2.000    0.909   -0.416   -2.185
525   3.000    0.141   -0.990   -0.142
526   4.000   -0.757   -0.654    1.158
527
528
529===== comma-char, decimal-char
530<procedure>(comma-char <k> <format> ...)</procedure><br>
531<procedure>(decimal-char <k> <format> ...)</procedure>
532
533{{comma-char}} and {{decimal-char}} set the defaults for number
534formatting.
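
For instance, European-style digit grouping might be set up as in the
following sketch. The expected string follows from the description of
{{<comma-sep>}} under {{num}} and has not been checked against a
particular release:

<enscript highlight=scheme>
(import fmt)

;; Use "." between groups of three digits for everything inside.
(define grouped (fmt #f (comma-char #\. (num/comma 1234567))))
;; expected "1.234.567"
</enscript>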
535
536
537===== pad-char
538<procedure>(pad-char <k> <format> ...)</procedure>
539
540The {{pad-char}} sets the character used by {{space-to}}, {{tab-to}},
541{{pad/*}}, and {{fit/*}}, and defaults to {{#\space}}.
542
543<enscript highlight=scheme>
544  (define (print-table-of-contents alist)
545    (define (print-line x)
546      (cat (car x) (space-to 72) (pad/left 3 (cdr x))))
547    (fmt #t (pad-char #\. (fmt-join/suffix print-line alist nl))))
548
549  (print-table-of-contents
550   '(("An Unexpected Party" . 29)
551     ("Roast Mutton" . 60)
552     ("A Short Rest" . 87)
553     ("Over Hill and Under Hill" . 100)
554     ("Riddles in the Dark" . 115)))
555</enscript>
556
557would output
558
559  An Unexpected Party.....................................................29
560  Roast Mutton............................................................60
561  A Short Rest............................................................87
562  Over Hill and Under Hill...............................................100
563  Riddles in the Dark....................................................115
564
565
===== ellipses
<procedure>(ellipses <ell> <format> ...)</procedure>

Sets the truncation ellipsis to {{<ell>}}, which should be a string or
character.
571
572
573===== with-width
574<procedure>(with-width <width> <format> ...)</procedure>
575
576Sets the maximum column width used by some formatters. The default is
577{{78}}.
578
579
580
581==== Columnar Formatting
582
583Although {{tab-to}}, {{space-to}} and padding can be used to manually
584align columns to produce table-like output, these can be awkward to
585use. The optional extensions in this section make this easier.
586
587
588===== columnar
589<procedure>(columnar <column> ...)</procedure>
590
591Formats each {{<column>}} side-by-side, i.e. as though each were
592formatted separately and then the individual lines concatenated
593together. The current column width is divided evenly among the
594columns, and all but the last column are right-padded. For example
595
596<enscript highlight=scheme>
597(fmt #t (columnar (dsp "abc\ndef\n") (dsp "123\n456\n")))
598</enscript>
599
600outputs
601
602     abc     123
603     def     456
604
assuming a 16-char width (the left side gets half the width, or 8
spaces, and is left aligned). Note that we explicitly use {{dsp}} instead
of the strings directly. This is because {{columnar}} treats raw
strings as literals inserted into the given location on every line, to
be used as borders, for example:
610
611<enscript highlight=scheme>
612  (fmt #t (columnar "/* " (dsp "abc\ndef\n")
613                    " | " (dsp "123\n456\n")
614                    " */"))
615</enscript>
616
617would output
618
619  /* abc | 123 */
620  /* def | 456 */
621
622You may also prefix any column with any of the symbols {{'left}},
623{{'right}} or {{'center}} to control the justification. The symbol
624{{'infinite}} can be used to indicate the column generates an infinite
625stream of output.
626
627You can further prefix any column with a width modifier. Any positive
628integer is treated as a fixed width, ignoring the available width. Any
629real number between 0 and 1 indicates a fraction of the available
630width (after subtracting out any fixed widths). Columns with
631unspecified width divide up the remaining width evenly.
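
A sketch combining a width modifier and a justification prefix. The
order (width first, then justification) mirrors the {{line-numbers}}
example later in this document; the column contents are made up:

<enscript highlight=scheme>
(import fmt)

;; First column: fixed width of 4, right-justified.
;; " | " is a raw string, so it is repeated on every line as a border.
;; Last column: no width given, so it takes the remaining width.
(fmt #t (columnar 4 'right (dsp "1\n2\n10\n")
                  " | "
                  (dsp "alpha\nbeta\ngamma\n")))
</enscript>

The numbers should line up on their right edge, each followed by the
border and the corresponding word.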
632
633Note that {{columnar}} builds its output incrementally, interleaving
634calls to the generators until each has produced a line, then
635concatenating that line together and outputting it. This is important
636because as noted above, some columns may produce an infinite stream of
637output, and in general you may want to format data larger than can fit
638into memory. Thus columnar would be suitable for line numbering a file
639of arbitrary size, or implementing the Unix {{yes(1)}} command, etc.
640
641As an implementation detail, {{columnar}} uses first-class
642continuations to interleave the column output. The core {{fmt}} itself
643has no knowledge of or special support for {{columnar}}, which could
644complicate and potentially slow down simpler {{fmt}} operations. This
645is a testament to the power of {{call/cc}} - it can be used to
646implement coroutines or arbitrary control structures even where they
647were not planned for.
648
649===== tabular
650<procedure>(tabular <column> ...)</procedure>
651
652Equivalent to columnar except that each column is padded at least to
653the minimum width required on any of its lines. Thus
654
655<enscript highlight=scheme>
656(fmt #t (tabular "|" (dsp "a\nbc\ndef\n") "|" (dsp "123\n45\n6\n") "|"))
657</enscript>
658
659outputs
660
661 |a  |123|
662 |bc |45 |
663 |def|6  |
664
665This makes it easier to generate tables without knowing widths in
666advance. However, because it requires generating the entire output in
667advance to determine the correct column widths, {{tabular}} cannot
668format a table larger than would fit in memory.
669
670
671===== fmt-columns
672<procedure>(fmt-columns <column> ...)</procedure>
673
674The low-level formatter on which {{columnar}} is based. Each
675{{<column>}} must be a list of 2-3 elements:
676
677 (<line-formatter> <line-generator> [<infinite?>])
678
where {{<line-generator>}} is the column generator as above, and the
{{<line-formatter>}} is how each line is formatted. Raw concatenation of
each line is performed, without any spacing or width
adjustment. {{<infinite?>}}, if true, indicates this generator
produces an infinite number of lines and termination should be
determined without it.
685
686
687===== wrap-lines
688<procedure>(wrap-lines <format> ...)</procedure>
689
690Behaves like {{cat}}, except text is accumulated and lines are
691optimally wrapped to fit in the current width as in the Unix
692{{fmt(1)}} command.
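
Since {{wrap-lines}} wraps to the current width, it combines naturally
with {{with-width}}. A minimal sketch, with {{text}} as an illustrative
variable and no exact line breaks claimed:

<enscript highlight=scheme>
(import fmt)

(define text
  (string-append
   "Formats are nested chains of closures, which are called "
   "to produce their output as needed."))

;; Wrap to at most 30 columns instead of the default width.
;; fl ends the output with a newline only if one is still needed.
(fmt #t (with-width 30 (wrap-lines text)) fl)
</enscript>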
693
694===== justify
695<procedure>(justify <format> ...)</procedure>
696
Like {{wrap-lines}} except the lines are full-justified.
698
699<enscript highlight=scheme>
700  (define func
701    '(define (fold kons knil ls)
702       (let lp ((ls ls) (acc knil))
703         (if (null? ls) acc (lp (cdr ls) (kons (car ls) acc))))))
704
705  (define doc
706    (string-append
707      "The fundamental list iterator.  Applies KONS to each element "
708      "of LS and the result of the previous application, beginning "
709      "with KNIL.  With KONS as CONS and KNIL as '(), equivalent to REVERSE."))
710
711  (fmt #t (columnar (pretty func) " ; " (justify doc)))
712</enscript>
713
714outputs
715
716  (define (fold kons knil ls)          ; The   fundamental   list   iterator.
717    (let lp ((ls ls) (acc knil))       ; Applies  KONS  to  each  element  of
718      (if (null? ls)                   ; LS  and  the  result of the previous
719          acc                          ; application,  beginning  with  KNIL.
720          (lp (cdr ls)                 ; With  KONS  as CONS and KNIL as '(),
721              (kons (car ls) acc)))))  ; equivalent to REVERSE.
722
723
724===== fmt-file
725<procedure>(fmt-file <pathname>)</procedure>
726
727Simply displays the contents of the file {{<pathname>}} a line at a
728time, so that in typical formatters such as {{columnar}} only constant
729memory is consumed, making this suitable for formatting files of
730arbitrary size.
731
732
733===== line-numbers
734<procedure>(line-numbers [<start>])</procedure>
735
736A convenience utility, just formats an infinite stream of numbers (in
737the current radix) beginning with {{<start>}}, which defaults to
738{{1}}.
739
740The Unix {{nl(1)}} utility could be implemented as:
741
742<enscript highlight=scheme>
743  (fmt #t (columnar 6 'right 'infinite (line-numbers)
744                    " " (fmt-file "read-line.scm")))
745</enscript>
746
747     1
748     2 (define (read-line . o)
749     3   (let ((port (if (pair? o) (car o) (current-input-port))))
750     4     (let lp ((res '()))
751     5       (let ((c (read-char port)))
752     6         (if (or (eof-object? c) (eqv? c #\newline))
753     7             (list->string (reverse res))
754     8             (lp (cons c res)))))))
755
756
757=== C Formatting
758
759==== C Formatting Basics
760
For purposes such as writing wrappers, code-generators, compilers or
other language tools, people often need to generate or emit C
code. Without a decent library framework it's difficult to maintain
proper indentation. In addition, for the Scheme programmer it's
tedious to work with all the context sensitivities of C, such as the
expression vs. statement distinction, special rules for writing
preprocessor macros, and when precedence rules require
parentheses. Fortunately, context is one thing this formatting library
is good at keeping track of. The C formatting interface tries to make
it as easy as possible to generate C code without getting in your way.

There are two approaches to using the C formatting extensions -
procedural and sexp-oriented (described in "C as S-Expressions"
below). In the procedural interface, C operators are made available
as formatters with a "c-" prefix, literals are converted to their C
equivalents and symbols are output as-is (you're responsible for
making sure they are valid C identifiers). Indentation is handled
automatically.
779
780<enscript highlight=scheme>
781(fmt #t (c-if 1 2 3))
782</enscript>
783
784outputs
785
786  if (1) {
787      2;
788  } else {
789      3;
790  }
791
792In addition, the formatter knows when you're in an expression and when
793you're in a statement, and behaves accordingly, so that
794
795<enscript highlight=scheme>
796(fmt #t (c-if (c-if 1 2 3) 4 5))
797</enscript>
798
799outputs
800
801  if (1 ? 2 : 3) {
802      4;
803  } else {
804      5;
805  }
806
807
Similarly, {{c-begin}}, used for sequencing, will separate with
semi-colons in a statement and commas in an expression.
810
811Moreover, we also keep track of the final expression in a function and
812insert returns for you:
813
814<enscript highlight=scheme>
815(fmt #t (c-fun 'int 'foo '() (c-if (c-if 1 2 3) 4 5)))
816</enscript>
817
818outputs
819
820  int foo () {
821      if (1 ? 2 : 3) {
822          return 4;
823      } else {
824          return 5;
825      }
826  }
827
828although it knows that void functions don't return.
829
830Switch statements insert breaks by default if they don't return:
831
832<enscript highlight=scheme>
833  (fmt #t (c-switch 'y
834            (c-case 1 (c+= 'x 1))
835            (c-default (c+= 'x 2))))
836</enscript>
837
838  switch (y) {
839      case 1:
840          x += 1;
841          break;
842      default:
843          x += 2;
844          break;
845  }
846
847
848though you can explicitly fallthrough if you want:
849
850<enscript highlight=scheme>
851  (fmt #t (c-switch 'y
852            (c-case/fallthrough 1 (c+= 'x 1))
853            (c-default (c+= 'x 2))))
854</enscript>
855
856  switch (y) {
857      case 1:
858          x += 1;
859      default:
860          x += 2;
861          break;
862  }
863
864
865Operators are available with just a {{"c"}} prefix, e.g. {{c+}},
866{{c-}}, {{c*}}, {{c/}}, etc. {{c++}} is a prefix operator,
867{{c++/post}} is postfix. {{||}}, {{|}} and {{|=}} are written as
868{{c-or}}, {{c-bit-or}} and {{c-bit-or=}} respectively.
869
870Function applications are written with {{c-apply}}. Other control
871structures such as {{c-for}} and {{c-while}} work as expected. The
872full list is in the procedure index below.
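
As a sketch of a loop built from these operators (assuming the C
formatters are loaded with {{(import fmt fmt-c)}}; the C variable names
{{i}} and {{sum}} are made up):

<enscript highlight=scheme>
(import fmt fmt-c)

;; Symbols are emitted as C identifiers, numbers as C literals.
(fmt #t (c-while (c< 'i 10)
          (c+= 'sum 'i)
          (c++ 'i)))
</enscript>

This should print a {{while (i < 10)}} loop whose body contains the two
statements {{sum += i;}} and {{++i;}}, indented as in the examples
above.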
873
874When a C formatter encounters an object it doesn't know how to write
875(including lists and records), it outputs them according to the format
876state's current {{'gen}} variable. This allows you to specify generators
877for your own types, e.g. if you are using your own AST records in a
878compiler.
879
880If the {{'gen}} variable isn't set it defaults to the {{c-expr/sexp}}
881procedure, which formats an s-expression as if it were C code. Thus
882instead of {{c-apply}} you can just use a list. The full API is
883available via normal s-expressions - formatters that aren't keywords
884in C are prefixed with a {{%}} or otherwise made invalid C identifiers
885so that they can't be confused with function application.
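
The following sketch writes the same function body once with the
procedural operators and once as a plain list, relying on the default
{{'gen}} behaviour described above. The module name {{fmt-c}} and the
function name {{add1}} are assumptions for illustration:

<enscript highlight=scheme>
(import fmt fmt-c)

;; Procedural style: the body is built with c+.
(fmt #t (c-fun 'int 'add1 '((int x))
          (c+ 'x 1)))

;; S-expression style: the body is a quoted list, which the C
;; formatter is expected to treat as C code.
(fmt #t (c-fun 'int 'add1 '((int x))
          '(+ x 1)))
</enscript>

Both calls are expected to print the same definition, with a
{{return}} inserted before the final expression.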
886
887==== C Preprocessor Formatting
888
889C preprocessor formatters also properly handle their surrounding
890context, so you can safely intermix them in the normal flow of C code.
891
892<enscript highlight=scheme>
893  (fmt #t (c-switch 'y
894            (c-case 1 (c= 'x 1))
895            (cpp-ifdef 'H_TWO (c-case 2 (c= 'x 4)))
896            (c-default (c= 'x 5))))
897</enscript>
898
899  switch (y) {
900      case 1:
901          x = 1;
902          break;
903
904  #ifdef H_TWO
905      case 2:
906          x = 4;
907          break;
908  #endif /* H_TWO */
909      default:
910          x = 5;
911          break;
912  }
913
914
Macros can be handled with {{cpp-define}}, which knows to wrap
individual variable references in parentheses:
917
918<enscript highlight=scheme>
919(fmt #t (cpp-define '(min x y) (c-if (c< 'x 'y) 'x 'y)))
920</enscript>
921
922  #define min(x, y) (((x) < (y)) ? (x) : (y))
923
924As with all C formatters, the CPP output is pretty printed as needed,
925and if it wraps over several lines the lines are terminated with a
926backslash.
927
928To write a C header file that is included at most once, you can wrap
929the entire body in {{cpp-wrap-header}}:
930
931<enscript highlight=scheme>
932  (fmt #t (cpp-wrap-header "FOO_H"
933            (c-extern (c-prototype 'int 'foo '()))))
934</enscript>
935
936  #ifndef FOO_H
937  #define FOO_H
938
939  extern int foo ();
940
941  #endif /* ! FOO_H */
942
943
944==== Customizing C Style
945
946The output uses a simplified K&R style with 4 spaces for indentation
947by default. The following state variables let you override the style:
948
949===== 'indent-space
950
951how many spaces to indent bodies, default {{4}}
952
953===== 'switch-indent-space
954
955how many spaces to indent switch clauses, also defaults to {{4}}
956
957===== 'newline-before-brace?
958
959insert a newline before an open brace (non-K&R), defaults to {{#f}}
960
961===== 'braceless-bodies?
962
963omit braces when we can prove they aren't needed
964
965===== 'non-spaced-ops?
966
967omit spaces between operators and operands for groups of variables and literals (e.g. {{"a+b+3"}} instead of {{"a + b + 3"}})
968
969===== 'no-wrap?
970
Don't wrap function calls and long operator groups over multiple
972lines. Functions and control structures will still use multiple lines.
973
974The C formatters also respect the {{'radix}} and {{'precision}} settings.
975
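As a rough sketch of how these variables can be overridden, the following wraps a formatter in {{fmt-let}}, the state-binding formatter from the core {{fmt}} library. The {{(import fmt fmt-c)}} line assumes the CHICKEN 5 module names, and {{count-up}} is a hypothetical name used only for illustration.

<enscript highlight=scheme>
(import fmt fmt-c)

;; Hypothetical loop formatter, reused under two style settings.
(define count-up
  (c-while (c< 'i 'n)
    (c+= 'sum 'i)
    (c++ 'i)))

;; Default style: 4-space indentation, spaced operators.
(fmt #t count-up)

;; Same formatter with 2-space indentation and unspaced operators.
(fmt #t (fmt-let 'indent-space 2
          (fmt-let 'non-spaced-ops? #t count-up)))
</enscript>

Because the style lives in the formatting state rather than in the formatter, the same {{count-up}} value can be printed in either style without being rebuilt.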
976
==== C Formatter Index
978
979===== c-if
980<procedure>(c-if <condition> <pass> [<fail> [<condition2> <pass2> ...]])</procedure>
981
982Print a chain of if/else conditions. Use a final condition of
983{{'else}} for a final else clause.
984
985
986===== c-for, c-while
987<procedure>(c-for <init> <condition> <update> <body> ...)</procedure><br>
988<procedure>(c-while <condition> <body> ...)</procedure>
989
990Basic loop constructs.
991
992
993===== c-fun, c-prototype
<procedure>(c-fun <type> <name> <params> <body> ...)</procedure><br>
<procedure>(c-prototype <type> <name> <params>)</procedure>
996
Output a function or function prototype. The parameters should be a
list of 2-element lists of the form {{(<param-type> <param-name>)}},
which are output with DSP. A parameter can be abbreviated as just the
symbol name, or {{#f}} can be passed as the type, in which case the
{{'default-type}} state variable is used. The parameters may be a dotted
list, in which case an ellipsis for a C variadic is inserted - the
actual name of the dotted value is ignored.
1004
Types are typically just symbols, or lists of symbols such as
1006{{'(const char)}}. A complete description is given below in section "C
1007Types".
1008
These can also be accessed as {{%fun}} and {{%prototype}} at the head of
1010a list.
1011
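A minimal sketch of the parameter conventions described above, assuming the CHICKEN 5 module names {{fmt}} and {{fmt-c}}. The C names {{scale}} and {{log_msg}} are hypothetical, and the exact whitespace of the output depends on the current style settings.

<enscript highlight=scheme>
(import fmt fmt-c)

;; Prototype with fully typed parameters.
(fmt #t (c-prototype 'double 'scale '((double x) (int factor))))

;; Definition with the same signature and a single body expression.
(fmt #t (c-fun 'double 'scale '((double x) (int factor))
          (c* 'x 'factor)))

;; Dotted parameter list: the trailing name `rest` is ignored and
;; stands for the C variadic ellipsis.
(fmt #t (c-prototype 'int 'log_msg '(((const char *) format) . rest)))
</enscript>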
1012
1013===== c-var
1014<procedure>(c-var <type> <name> [<init-value>])</procedure>
1015
1016Declares and optionally initializes a variable. Also accessed as
1017{{%var}} at the head of a list.
1018
1019
1020===== c-begin
1021<procedure>(c-begin <expr> ...)</procedure>
1022
1023Outputs each of the {{<expr>}}s, separated by semi-colons if in a
1024statement or commas if in an expression.
1025
1026
1027===== c-switch, c-case, c-case/fallthrough, c-default
<procedure>(c-switch <expr> <clause> ...)</procedure><br>
1029<procedure>(c-case <values> <body> ...)</procedure><br>
1030<procedure>(c-case/fallthrough <values> <body> ...)</procedure><br>
1031<procedure>(c-default <body> ...)</procedure>
1032
1033Switch statements. In addition to using the clause formatters, clauses
1034inside a switch may be handled with a Scheme CASE-like list, with the
1035car a list of case values and the cdr the body.
1036
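The CASE-like list form might be used as in the sketch below, which mixes list clauses with an explicit {{c-default}}. This assumes the CHICKEN 5 module names {{fmt}} and {{fmt-c}}; the variable names {{state}} and {{x}} are hypothetical.

<enscript highlight=scheme>
(import fmt fmt-c)

;; Each quoted clause is a list whose car is a list of case values
;; and whose cdr is the body, written as an s-expression.
(fmt #t (c-switch 'state
          '((0) (= x 1))
          '((1 2) (= x 4))
          (c-default (c= 'x 5))))
</enscript>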
1037
1038===== c-label, c-goto, c-return, c-break, c-continue
1039<procedure>(c-label <name>)</procedure><br>
1040<procedure>(c-goto <name>)</procedure><br>
1041<procedure>(c-return [<result>])</procedure><br>
1042<constant>c-break</constant><br>
1043<constant>c-continue</constant>
1044
1045Manual labels and jumps. Labels can also be accessed as a list
1046beginning with a colon, e.g. {{'(: label1)}}.
1047
1048
1049===== c-const, c-static, c-volatile, c-restrict, c-register, c-auto, c-inline, c-extern
1050<procedure>(c-const <expr>)</procedure><br>
1051<procedure>(c-static <expr>)</procedure><br>
1052<procedure>(c-volatile <expr>)</procedure><br>
1053<procedure>(c-restrict <expr>)</procedure><br>
1054<procedure>(c-register <expr>)</procedure><br>
1055<procedure>(c-auto <expr>)</procedure><br>
1056<procedure>(c-inline <expr>)</procedure><br>
1057<procedure>(c-extern <expr>)</procedure>
1058
1059Declaration modifiers. May be nested.
1060
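As an illustration of nesting, the following sketch stacks two modifiers around a variable declaration, assuming the CHICKEN 5 module names {{fmt}} and {{fmt-c}}. The variable name {{max_users}} is hypothetical.

<enscript highlight=scheme>
(import fmt fmt-c)

;; Modifiers are applied outermost first, so this is intended to
;; declare a static const int.
(fmt #t (c-static (c-const (c-var 'int 'max_users 64))))
</enscript>
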
1061===== c-extern/C
1062<procedure>(c-extern/C <body> ...)</procedure>
1063
1064Wraps body in an
1065
1066  extern "C" { ... }
1067
1068for use with C++.
1069
1070
1071===== c-cast
1072<procedure>(c-cast <type> <expr>)</procedure>
1073
1074Casts an expression to a type. Also {{%cast}} at the head of a list.
1075
1076
1077===== c-typedef
<procedure>(c-typedef <type> <new-name> ...)</procedure>

Creates a new type definition with one or more names.
1080
1081
1082===== c-struct, c-union, c-class, c-attribute
1083<procedure>(c-struct [<name>] <field-list> [<attributes>])</procedure><br>
1084<procedure>(c-union [<name>] <field-list> [<attributes>])</procedure><br>
1085<procedure>(c-class [<name>] <field-list> [<attributes>])</procedure><br>
1086<procedure>(c-attribute <values> ...)</procedure>
1087
1088Composite type constructors. Attributes may be accessed as
1089{{%attribute}} at the head of a list.
1090
1091<enscript highlight=scheme>
1092  (fmt #f (c-struct 'employee
1093                      '((short age)
1094                        ((char *) name)
1095                        ((struct (year month day)) dob))
1096                      (c-attribute 'packed)))
1097</enscript>
1098
1099  struct employee {
1100      short age;
1101      char* name;
1102      struct {
1103          int year;
1104          int month;
1105          int day;
1106      } dob;
1107  } __attribute__ ((packed));
1108
1109
1110===== c-enum
1111<procedure>(c-enum [<name>] <enum-list>)</procedure>
1112
Enumerated types. The elements of {{<enum-list>}} may be strings,
symbols, or lists of a string or symbol followed by the enum's value.
1115
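A small sketch of an enum list that mixes bare symbols with an explicit value, assuming the CHICKEN 5 module names {{fmt}} and {{fmt-c}}. The enum name {{color}} and its members are hypothetical.

<enscript highlight=scheme>
(import fmt fmt-c)

;; `green` is given an explicit value; `red` and `blue` are left to
;; the C compiler's usual numbering.
(fmt #t (c-enum 'color '(red (green 5) blue)))
</enscript>
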
1116===== c-comment
1117<procedure>(c-comment <formatter> ...)</procedure>
1118
Outputs the {{<formatter>}}s wrapped in C's {{/* ... */}}
comment delimiters. Properly escapes nested comments in an
Emacs-friendly style.
1122
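For instance, a comment can be built from several formatters, as in this sketch (assuming the CHICKEN 5 module names {{fmt}} and {{fmt-c}}; the comment text is made up):

<enscript highlight=scheme>
(import fmt fmt-c)

;; The arguments are concatenated inside a single comment.
(fmt #t (c-comment "retry limit: " 3) nl)

;; Text that already contains comment delimiters is escaped rather
;; than emitted verbatim.
(fmt #t (c-comment "outer /* inner */ text") nl)
</enscript>
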
1123==== C Preprocessor Formatter Index
1124
1125===== cpp-include
1126<procedure>(cpp-include <file>)</procedure>
1127
If {{<file>}} is a string, outputs it in "quotes"; otherwise (as a symbol or
arbitrary formatter) outputs it in angle brackets.
1130
1131<enscript highlight=scheme>
1132(fmt #f (cpp-include 'stdio.h))
1133
1134=> "#include <stdio.h>\n"
1135
1136(fmt #f (cpp-include "config.h"))
1137
=> "#include \"config.h\"\n"
1139</enscript>
1140
1141
1142===== cpp-define
1143<procedure>(cpp-define <macro> [<value>])</procedure>
1144
Defines a preprocessor macro, which may be just a name or a list of
name and parameters. Properly wraps the value in parentheses and
escapes newlines. A dotted parameter list will use the C99 variadic
macro syntax, and will also substitute any references to the dotted
name with {{__VA_ARGS__}}:
1150
1151<enscript highlight=scheme>
1152(fmt #t (cpp-define '(eprintf . args) '(fprintf stderr args)))
1153</enscript>
1154
1155  #define eprintf(...) (fprintf(stderr, __VA_ARGS__))
1156
1157
===== cpp-if, cpp-ifdef, cpp-ifndef, cpp-elif, cpp-else
1159<procedure>(cpp-if <condition> <pass> [<fail> ...])</procedure><br>
1160<procedure>(cpp-ifdef <condition> <pass> [<fail> ...])</procedure><br>
1161<procedure>(cpp-ifndef <condition> <pass> [<fail> ...])</procedure><br>
1162<procedure>(cpp-elif <condition> <pass> [<fail> ...])</procedure><br>
1163<procedure>(cpp-else <body> ...)</procedure>
1164
1165Conditional compilation.
1166
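A minimal sketch of a conditional with both a pass and a fail branch, assuming the CHICKEN 5 module names {{fmt}} and {{fmt-c}}. The macro names {{DEBUG}} and {{LOG_LEVEL}} are hypothetical.

<enscript highlight=scheme>
(import fmt fmt-c)

;; The second argument is used when DEBUG is defined, the third
;; otherwise.
(fmt #t (cpp-ifdef 'DEBUG
          (cpp-define 'LOG_LEVEL 3)
          (cpp-define 'LOG_LEVEL 0)))
</enscript>
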
1167===== cpp-line
1168<procedure>(cpp-line <num> [<file>])</procedure>
1169
1170Line number information.
1171
1172
1173===== cpp-pragma, cpp-error, cpp-warning
<procedure>(cpp-pragma <args> ...)</procedure><br>
<procedure>(cpp-error <args> ...)</procedure><br>
<procedure>(cpp-warning <args> ...)</procedure>
1177
1178Additional preprocessor directives.
1179
1180
===== cpp-stringify
1182<procedure>(cpp-stringify <expr>)</procedure>
1183
1184Stringifies {{<expr>}} by prefixing the {{#}} operator.
1185
1186
1187===== cpp-sym-cat
1188<procedure>(cpp-sym-cat <args> ...)</procedure>
1189
1190Joins the {{<args>}} into a single preprocessor token with the {{##}}
1191operator.
1192
1193
1194===== cpp-wrap-header
1195<procedure>(cpp-wrap-header <name> <body> ...)</procedure>
1196
1197Wrap an entire header to only be included once.
1198
1199Operators:
1200
1201  c++ c-- c+ c- c* c/ c% c& c^ c~ c! c&& c<< c>> c== c!=
1202  c< c> c<= c>= c= c+= c-= c*= c/= c%= c&= c^= c<<= c>>=
1203  c++/post c--/post c-or c-bit-or c-bit-or=
1204
1205
1206
1207==== C Types
1208
1209Typically a type is just a symbol such as {{'char}} or {{'int}}. You
1210can wrap types with modifiers such as {{c-const}}, but as a
convenience you can just use a list such as in {{'(const unsigned char *)}}.
You can also nest these lists, so the previous example is
equivalent to {{'(* (const (unsigned char)))}}.
1214
1215Pointers may be written as {{'(%pointer <type>)}} for readability -
1216{{%pointer}} is exactly equivalent to {{*}} in types.
1217
Unnamed structs, classes, unions and enums may be used directly as types, using their respective keywords at the head of a list.
1219
1220Two special types are the {{%array}} type and function pointer
1221type. An array is written:
1222
1223<enscript highlight=scheme>
1224(%array <type> [<size>])
1225</enscript>
1226
1227where {{<type>}} is any other type (including another array or
1228function pointer), and {{<size>}}, if given, will print the array
1229size. For example:
1230
1231<enscript highlight=scheme>
1232(c-var '(%array (unsigned long) SIZE) 'table '#(1 2 3 4))
1233</enscript>
1234
1235 unsigned long table[SIZE] = {1, 2, 3, 4};
1236
1237A function pointer is written:
1238
1239<enscript highlight=scheme>
1240(%fun <return-type> (<param-types> ...))
1241</enscript>
1242
1243For example:
1244
1245<enscript highlight=scheme>
1246(c-typedef '(%fun double (double double int)) 'f)
1247</enscript>
1248
1249 typedef double (*f)(double, double, int);
1250
1251Wherever a type is expected but not given, the value of the
1252{{'default-type}} formatting state variable is used. By default this
1253is just {{'int}}.
1254
Type declarations work uniformly for variables and parameters, as well
as for casts and typedefs.
1257
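The sketch below applies the same type notation in a declaration, a cast and a prototype. It assumes the CHICKEN 5 module names {{fmt}} and {{fmt-c}} and uses {{fmt-let}} from the core library to rebind {{'default-type}}; the C names {{name}}, {{buf}} and {{count}} are hypothetical.

<enscript highlight=scheme>
(import fmt fmt-c)

;; %pointer reads left to right: a pointer to const char.
(fmt #t (c-var '(%pointer (const char)) 'name))

;; The same type notation in a cast.
(fmt #t (c-cast '(%pointer (unsigned char)) 'buf) nl)

;; #f as the return type and a bare parameter name both fall back to
;; 'default-type, bound here to long instead of the default int.
(fmt #t (fmt-let 'default-type 'long
          (c-prototype #f 'count '(n))))
</enscript>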
1258
1259==== C as S-Expressions
1260
1261Rather than building formatting closures by hand, it can be more
1262convenient to just build a normal s-expression and ask for it to be
1263formatted as C code. This can be thought of as a simple Scheme->C
1264compiler without any runtime support.
1265
In an s-expression, strings and characters are printed as C strings and
1267characters, booleans are printed as 0 or 1, symbols are displayed
1268as-is, and numbers are printed as C numbers (using the current
1269formatting radix if specified). Vectors are printed as comma-separated
1270lists wrapped in braces, which can be used for initializing arrays or
1271structs.
1272
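A short sketch of these data rules, assuming the CHICKEN 5 module names {{fmt}} and {{fmt-c}}. The C identifiers {{primes}}, {{puts}} and {{done}} are only illustrative.

<enscript highlight=scheme>
(import fmt fmt-c)

;; A vector becomes a brace-wrapped initializer list.
(fmt #t (c-expr '(%var (%array int) primes #(2 3 5 7))))

;; Strings print as C string literals; symbols print as-is.
(fmt #t (c-expr '(puts "hello")) nl)

;; Booleans print as 1 or 0.
(fmt #t (c-expr '(= done #t)) nl)
</enscript>
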
1273A list indicates a C expression or statement. Any of the existing C
1274keywords can be used to pretty-print the expression as described with
1275the c-keyword formatters above. Thus, the example above
1276
1277<enscript highlight=scheme>
1278(fmt #t (c-if (c-if 1 2 3) 4 5))
1279</enscript>
1280
1281could also be written
1282
1283<enscript highlight=scheme>
1284(fmt #t (c-expr '(if (if 1 2 3) 4 5)))
1285</enscript>
1286
1287C constructs that are dependent on the underlying syntax and have no
1288keyword are written with a {{%}} prefix ({{%fun}}, {{%var}},
1289{{%pointer}}, {{%array}}, {{%cast}}), including C preprocessor
1290constructs ({{%include}}, {{%define}}, {{%pragma}}, {{%error}},
1291{{%warning}}, {{%if}}, {{%ifdef}}, {{%ifndef}}, {{%elif}}). Labels are
1292written as {{(: <label-name>)}}. You can write a sequence as {{(%begin <expr> ...)}}.
1293
For example, the following definition of the Fibonacci sequence, which
1295apart from the return type of {{#f}} looks like a Lisp definition:
1296
1297<enscript highlight=scheme>
1298(fmt #t (c-expr '(%fun #f fib (n) (if (<= n 1) 1 (+ (fib (- n 1)) (fib (- n 2)))))))
1299</enscript>
1300
1301prints the working C definition:
1302
1303 int fib (int n) {
1304     if (n <= 1) {
1305         return 1;
1306     } else {
1307         return fib((n - 1)) + fib((n - 2));
1308     }
1309 }
1310
1311
1312=== Formatting with Color
1313
1314The {{fmt-color}} library provides the following utilities:
1315
1316  (fmt-red <formatter> ...)
1317  (fmt-blue <formatter> ...)
1318  (fmt-green <formatter> ...)
1319  (fmt-cyan <formatter> ...)
1320  (fmt-yellow <formatter> ...)
1321  (fmt-magenta <formatter> ...)
1322  (fmt-white <formatter> ...)
1323  (fmt-black <formatter> ...)
1324  (fmt-bold <formatter> ...)
1325  (fmt-underline <formatter> ...)
1326
1327and more generally
1328
1329<procedure>(fmt-color <color> <formatter> ...)</procedure>
1330
1331where {{<color>}} can be a symbol name or {{#xRRGGBB}} numeric
1332value. Outputs the formatters colored with ANSI escapes. In addition
1333
1334<procedure>(fmt-in-html <formatter> ...)</procedure>
1335
1336can be used to mark the format state as being inside HTML, which the
1337above color formats will understand and output HTML {{<span>}} tags
1338with the appropriate style colors, instead of ANSI escapes.
1339
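A brief usage sketch, assuming the CHICKEN 5 module names {{fmt}} and {{fmt-color}}. The message text and the variable name {{html-snippet}} are made up for illustration.

<enscript highlight=scheme>
(import fmt fmt-color)

;; Only the prefix is colored; the rest uses the terminal default.
(fmt #t (fmt-red "error: ") "disk full" nl)

;; Color formatters nest like any other formatter.
(fmt #t (fmt-bold (fmt-green "ok")) nl)

;; Inside fmt-in-html the same formatter is meant to produce a
;; styled <span> instead of ANSI escapes.
(define html-snippet (fmt #f (fmt-in-html (fmt-red "error"))))
(display html-snippet)
(newline)
</enscript>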
1340
1341
1342=== Unicode
1343
The {{fmt-unicode}} library provides the {{fmt-unicode}} formatter, which
1345just takes a list of formatters and overrides the string-length for
1346padding and trimming, such that Unicode double or full width
1347characters are considered 2 characters wide (as they typically are in
1348fixed-width terminals), while treating combining and non-spacing
1349characters as 0 characters wide.
1350
It also recognizes and ignores ANSI escapes, which is particularly useful if
1352you want to combine this with the {{fmt-color}} utilities.
1353
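The effect on padding can be sketched as follows, assuming the CHICKEN 5 module names {{fmt}} and {{fmt-unicode}} and a UTF-8 encoded source file. {{pad}} is the right-padding formatter from the core library.

<enscript highlight=scheme>
(import fmt fmt-unicode)

;; Three full-width characters should count as six columns here, so
;; the bar is expected to line up with an 8-column ASCII field.
(fmt #t (fmt-unicode (pad 8 "日本語") "|" nl))
(fmt #t (pad 8 "abcdef") "|" nl)
</enscript>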
1354
1355=== Optimizing
1356
1357The library is designed for scalability and flexibility, not speed,
1358and I'm not going to think about any fine tuning until it's more
1359stabilised. One aspect of the design, however, was influenced for the
1360sake of future optimizations, which is that none of the default format
1361variables are initialized by global parameters, which leaves room for
1362inlining and subsequent simplification of format calls.
1363
1364If you don't have an aggressively optimizing compiler, you can easily
1365achieve large speedups on common cases with CL-style compiler macros.
1366
1367
1368=== Common Lisp Format Cheat Sheet
1369
1370A quick reference for those of you switching over from Common Lisp's format.
1371
 format     fmt
 ~a         dsp
 ~c         dsp
 ~s         wrt/unshared
 ~w         wrt
 ~y         pretty
 ~x         (radix 16 ...) or (num <n> 16)
 ~o         (radix 8 ...) or (num <n> 8)
 ~b         (radix 2 ...) or (num <n> 2)
 ~f         (fix <digits> ...) or (num <n> <radix> <digits>)
 ~%         nl
 ~&         fl
 ~[...~]    normal if or fmt-if (delayed test)
 ~{...~}    (fmt-join ... <list> [<sep>])
1386
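As a rough illustration of the table, a few Common Lisp {{format}} calls are shown in comments next to one possible {{fmt}} equivalent. The data values and the name {{joined}} are made up, and the {{(import fmt)}} line assumes the CHICKEN 5 module name.

<enscript highlight=scheme>
(import fmt)

;; (format t "~a: ~s~%" "name" "Alice")
(fmt #t (dsp "name") ": " (wrt "Alice") nl)

;; (format t "~x~%" 255)
(fmt #t (num 255 16) nl)

;; (format t "~,2f~%" 3.14159)
(fmt #t (num 3.14159 10 2) nl)

;; (format nil "~{~a~^, ~}" '(1 2 3))
(define joined (fmt #f (fmt-join dsp '(1 2 3) ", ")))  ; => "1, 2, 3"
</enscript>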
1387
1388
1389=== Author
1390
1391[[/users/alex-shinn|Alex Shinn]]
1392
1393
1394=== Version history
1395
1396; 0.8.11 : ported to CHICKEN 5
1397; 0.802 : fix tests [Mario Domenech Goulart]
1398; 0.705 : fix error in wrap-lines for certain combinations of wrap widths and long lines [Jim Ursetto]
1399; 0.704 : prevent string-tokenize from usurping srfi-13 binding [noticed by Taylor Venable]
1400; 0.703 : patch wrap-lines to not fail on whitespace-only strings; disable last column padding skip optimization; 100% width no longer treated as fixed length 1; override string-tokenize to properly split strings containing utf8 sequences [Jim Ursetto]
1401; 0.700 : Sync to upstream fmt 0.7 [Jim Ursetto]
1402; 0.518 : initial fmt 0.5 port to Chicken 4 by Alex Shinn