source: project/wiki/eggref/5/bitstring @ 37426

Last change on this file since 37426 was 37426, checked in by Kooda, 2 years ago

Document bitstring for CHICKEN 5

== bitstring

== Description

Easy binary data manipulation.
Supports binary encoding and decoding with pattern matching and condition guards.
Implements a subset of the Erlang bit syntax.

== Authors

rivo

== Requirements

No requirements.

== API

== Basic syntax description

<enscript highlight="scheme">
(bitmatch binary-data
 ((pattern ...) expression)
 ...
 (else expression))

(bitconstruct
 pattern ...)
</enscript>

== Patterns description

;(NAME) : Read one byte from the stream and bind it to the variable NAME, or compare it with an immediate value if NAME is not a symbol. Supported immediate value types: integer, char, string.

;(NAME BITS) : Read a BITS-bit big-endian unsigned integer, bind-or-compare to NAME.

;(NAME BITS [big | little | host] [signed | unsigned]) : Read a {{BITS}}-bit integer with the given byte-order and signed/unsigned attributes.

;(NAME float [big | little | host]) : Read a single precision floating point value, bind-or-compare to NAME.

;(NAME double [big | little | host]) : Read a double precision floating point value, bind-or-compare to NAME.

;(NAME BITS float [big | little | host]) : Read a 32- or 64-bit floating point value, bind-or-compare to NAME. Generic way to manipulate single or double precision values.

;(NAME boolean) : Read an 8-bit boolean, bind-or-compare to NAME (#f if all bits are zero, #t otherwise).
;(NAME BITS boolean [big | little | host]) : A boolean may also be stored as an integer of arbitrary bit width, with an endian attribute. When constructing, NAME is written as 0 for #f and 1 for #t.

;(NAME BITS bitstring) : Read raw BITS from the stream, bind-or-compare to NAME.

;(NAME bitstring) : Greedy read, consumes all available bits.

;() : Empty bitstring.

;(PACKET-NAME bitpacket) : Read a packet defined by the (bitpacket PACKET-NAME ...) declaration. Binds each packet field in the current lexical scope. ''!!! bitpacket is an experimental feature !!!''

;(check EXPRESSION) : User guard EXPRESSION. Matching continues only when it evaluates to a true value.

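The following is a minimal sketch of how the literal, integer and guard patterns above can be combined in one {{bitmatch}} form. The record layout (a tag byte followed by a 16-bit little-endian length) and the name {{describe-record}} are made up for illustration and are not part of the egg.

<enscript highlight="scheme">
(import bitstring srfi-4)

;; Hypothetical 3-byte record: one tag byte, then a 16-bit
;; little-endian length.  Illustration only, not part of the egg.
(define (describe-record data)
  (bitmatch data
    ;; the tag byte must equal #x01; the guard restricts the bound length
    (((#x01) (len 16 little) (check (< len 1024)))
     (list 'small len))
    ;; same layout, tried when the guard above rejects the input
    (((#x01) (len 16 little))
     (list 'large len))
    (else 'unknown)))

(print (describe-record '#u8(1 2 0)))  ; len is #x0002, prints (small 2)
(print (describe-record '#u8(1 0 8)))  ; len is #x0800, prints (large 2048)
(print (describe-record '#u8(2 0 0)))  ; tag differs, prints unknown
</enscript>

Each clause here consumes exactly three bytes, since a clause only matches when its patterns account for all of the input.
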
== Matching inputs against bit patterns

<macro>(bitmatch binary-data patterns-list else-guard)</macro>

* Match {{binary-data}} against the patterns from {{patterns-list}}.
* {{binary-data}} may be a bitstring object or a value of any of the following data types: u8vector, string or regular vector.
* If nothing matches and an {{else-guard}} clause was not specified, an exception of type {{bitstring-match-failure}} is raised.
* The else guard is optional.

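As a rough sketch of the two failure modes described above, the first procedure below omits the else guard and catches whatever exception a failed match raises, while the second uses an else guard. The procedure names and the {{#xAA}} marker byte are hypothetical; {{handle-exceptions}} comes from CHICKEN 5's {{(chicken condition)}} module and catches any condition, so the sketch does not depend on how the {{bitstring-match-failure}} condition is structured.

<enscript highlight="scheme">
(import bitstring srfi-4 (chicken condition))

;; No else guard: a failed match raises an exception,
;; which is caught here and turned into #f.
(define (payload-or-false data)
  (handle-exceptions exn #f
    (bitmatch data
      (((#xAA) (payload 8)) payload))))

;; With an else guard no exception is involved.
(define (payload-or-default data)
  (bitmatch data
    (((#xAA) (payload 8)) payload)
    (else 'no-match)))

(print (payload-or-false '#u8(#xAA 7)))    ; prints 7
(print (payload-or-false '#u8(#xBB 7)))    ; prints #f
(print (payload-or-default '#u8(#xBB 7)))  ; prints no-match
</enscript>
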
== Constructing bitstrings from input based on bit patterns

<macro>(bitconstruct pattern)</macro>

* Construct a bitstring based on {{pattern}}.  The pattern constructs the bitstring from identifiers taken from the current lexical scope.
* If nothing matches, an exception of type {{bitstring-match-failure}} is raised.
* Supports a special pattern for concatenating bitstrings.

; ((EXPRESSION ...) bitstring) : EXPRESSION should evaluate to a bitstring during construction.

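A small sketch of {{bitconstruct}}, including the concatenation pattern, might look as follows. The message layout and the names {{make-message}} and {{framed}} are assumptions made for this example only.

<enscript highlight="scheme">
(import bitstring)

;; Hypothetical message: a 16-bit big-endian length followed by the payload.
(define (make-message payload)
  (bitconstruct
    ((string-length payload) 16)
    (payload bitstring)))

(define msg (make-message "AB"))
(print (bitstring-length msg))    ; 16 + 16 bits, prints 32
(print (bitstring->list msg 8))   ; prints (0 2 65 66)

;; Concatenation: an expression that evaluates to a bitstring
;; is spliced into the result.
(define framed
  (bitconstruct
    (#xFF 8)
    ((make-message "AB") bitstring)))
(print (bitstring->list framed 8)) ; prints (255 0 2 65 66)
</enscript>
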
== Defining custom bitstring forms

<macro>(bitpacket PACKET-NAME fields ...)</macro>

Define a well-known set of fields. The field syntax is the same as the bitmatch pattern syntax.

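As a minimal sketch, assuming a made-up packet {{Point2D}} with two byte-sized fields, a packet definition and its use in {{bitmatch}} could look like this. Because {{bitpacket}} is marked experimental, treat it as an illustration rather than a guaranteed recipe.

<enscript highlight="scheme">
(import bitstring srfi-4)

;; Hypothetical packet with two unsigned 8-bit fields.
(bitpacket Point2D
  (px 8)
  (py 8))

;; The packet's fields (px, py) are bound in the clause body.
(define (point->list data)
  (bitmatch data
    (((Point2D bitpacket)) (list px py))
    (else #f)))

(print (point->list '#u8(3 4)))
</enscript>
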
== Dealing with bitstring objects

Bitstring objects represent strings of bits of arbitrary length.  This
means they can store any number of unaligned bits, rather like
bitfields in C.  Bitstrings can also share memory with other bitstrings,
which means you can easily create sub-bitstrings from other bitstrings.

<procedure>(bitstring=? bitstring1 bitstring2)</procedure>

Compare bitstrings.

<procedure>(bitstring-bit-set? bitstring bit-index)</procedure>

Test the bit value at position {{bit-index}}.

(bitstring-bit-set? (->bitstring '#${80 00}) 0) => #t
(bitstring-bit-set? (->bitstring '#${00 01}) -1) => #t

<procedure>(bitstring->list bitstring [bits [byte-order]])</procedure>

Convert a bitstring to a list of bits.

The optional {{bits}} argument (default 1) indicates how many bits each
entry in the list should span.  For example, to see the contents
grouped by octet, use 8 here.

The optional {{byte-order}} argument is {{'little}} for little-endian, {{'big}} for big-endian,
or {{'host}} for the host system byte-order; the default is {{'big}}.  This has an effect only if {{bits}}
is larger than 8.

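The grouping and byte-order arguments can be tried out with a short sketch like the one below. The variable name {{bs}} is arbitrary; results are only noted where they follow directly from the descriptions above.

<enscript highlight="scheme">
(import bitstring)

(define bs (->bitstring "AB"))        ; bytes #x41 #x42

(print (bitstring-length bs))         ; prints 16
(print (bitstring->list bs 8))        ; grouped by octet, prints (65 66)

;; With 16-bit groups the byte-order argument starts to matter:
;; one entry is produced, read as big- or little-endian.
(print (bitstring->list bs 16 'big))
(print (bitstring->list bs 16 'little))

;; #x41 is 01000001, so bit 0 is clear and bit 1 is set.
(print (bitstring-bit-set? bs 1))     ; prints #t
</enscript>
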
<procedure>(bitstring-reverse bitstring [bits [byte-order]])</procedure>

Reverse a bitstring in groups of {{bits}} (optional, default 1) with {{byte-order}} (optional, default {{'big}}).

(bitstring-reverse (->bitstring '#${01 02}) 1) => #${40 80}

(bitstring-reverse (->bitstring '#${01 02}) 8 'little) => #${02 01}

<procedure>(bitstring-not bitstring)</procedure>

Invert each bit value.

(bitstring-not (->bitstring '#${FE 00})) => #${01 FF}

<procedure>(bitstring? obj)</procedure>

Returns {{#t}} or {{#f}} depending on whether {{obj}} is a bitstring
or another type of object.

<procedure>(bitstring-length bitstring)</procedure>

Return the length of the bitstring in bits.

<procedure>(bitstring-append bitstring1 bitstringN ...)</procedure>

Concatenate bitstrings.

<procedure>(bitstring-append! dest-bitstring bitstringN ...)</procedure>

Concatenate bitstrings and store the result in {{dest-bitstring}}.

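A brief sketch of how these procedures fit together, using throwaway names {{a}}, {{b}} and {{ab}}:

<enscript highlight="scheme">
(import bitstring)

(define a (->bitstring "A"))
(define b (->bitstring "BC"))
(define ab (bitstring-append a b))

(print (bitstring? ab))                        ; prints #t
(print (bitstring-length ab))                  ; 8 + 16 bits, prints 24
(print (bitstring->list ab 8))                 ; prints (65 66 67)
(print (bitstring=? ab (->bitstring "ABC")))   ; prints #t

;; #x41 inverted is #xBE (190).
(print (bitstring->list (bitstring-not a) 8))  ; prints (190)
</enscript>
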
<procedure>(->bitstring obj)</procedure>

Construct a bitstring from an arbitrary object.

<procedure>(vector->bitstring vec)</procedure>
Each element of vector {{vec}} should be an integer in the range 0 - 255.

<procedure>(u8vector->bitstring vec)</procedure>
<procedure>(string->bitstring str)</procedure>

<procedure>(bitstring->blob bitstring [zero-extending])</procedure>
<procedure>(bitstring->u8vector bitstring [zero-extending])</procedure>

If the bitstring is not aligned on an 8-bit boundary, the remaining bits are extended with zeroes.
The optional {{zero-extending}} argument selects how: with {{'left}} you get the integer value of the remaining bits,
while {{'right}} gives you the internal bitstring representation, where the bits follow one after another. The default is {{'left}}.

Zero-extending to the left:
<bitstring 0 9 (1 1 1 1 1 1 1 1 1)> turns into #u8(#xff #x01)

Zero-extending to the right, which may be useful when you want to store a bitstring to disk and load it back later:
<bitstring 0 9 (1 1 1 1 1 1 1 1 1)> turns into #u8(#xff #x80)

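The two zero-extending modes can be reproduced with a sketch like the following. Building the nine-bit value with {{(bitconstruct (#x1FF 9))}} is an assumption of this example; the expected results are the ones given in the description above.

<enscript highlight="scheme">
(import bitstring srfi-4)

;; Nine one-bits: #x1FF written as a 9-bit integer.
(define nine-ones (bitconstruct (#x1FF 9)))

(print (bitstring-length nine-ones))            ; prints 9

;; 'left pads the trailing bit on its left:  11111111 00000001
(print (bitstring->u8vector nine-ones 'left))   ; #u8(255 1)

;; 'right pads the trailing bit on its right: 11111111 10000000
(print (bitstring->u8vector nine-ones 'right))  ; #u8(255 128)
</enscript>
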
== Examples

<enscript highlight="scheme">

; Example 1. Tagged data structure.
;
; struct Tagged {
;  enum { IntegerType = 1, FloatType = 2 };
;  unsigned char Tag; // integer type = 1, float type = 2
;  union {
;   unsigned int IValue;
;   float FValue;
;  };
; };
;

(import bitstring)

; The following will print "integer:3721182122",
; which is the decimal value of #xDDCCBBAA
(bitmatch "\x01\xAA\xBB\xCC\xDD"
  (((#x01) (IValue 32 little))
      (print "integer:" IValue))
  (((#x02) (FValue 32 float))
      (print "float:" FValue)))

; Example 2. Fixed length string.
;
; struct FixedString {
;  short Length; // length of StringData array
;  char StringData[0];
; };
;

(import bitstring)

; This will print "StringData:(65 66 67 68 69)".
; First it reads the 16-bit little-endian length of 5, binds it to Length,
; and then it reads a bitstring with a length of that many octets.
(bitmatch "\x05\x00ABCDE"
  (((Length 16 little)
    (StringData (* 8 Length) bitstring))
      (print "StringData:" (bitstring->list StringData 8)))
  (else
      (print "invalid string")))

; Example 3. IP packet parsing.
;

(import bitstring srfi-4)

(define IPRaw '#u8( #x45 #x00 #x00 #x6c
        #x92 #xcc #x00 #x00
        #x38 #x06 #x00 #x00
        #x92 #x95 #xba #x14
        #xa9 #x7c #x15 #x95 ))

(bitmatch IPRaw
  (((Version 4)
    (IHL 4)
    (TOS 8)
    (TL 16)
    (Identification 16)
    (Reserved 1) (DF 1) (MF 1)
    (FragmentOffset 13)
    (TTL 8)
    (Protocol 8) (check (or (= Protocol 1)
                            (= Protocol 2)
                            (= Protocol 6)
                            (= Protocol 17)))
    (CheckSum 16)
    (SourceAddr 32 bitstring)
    (DestinationAddr 32 bitstring)
    (Optional bitstring))
      ; print packet fields
      (print "\n Version: " Version
             "\n IHL: " IHL
             "\n TOS: " TOS
             "\n TL:  " TL
             "\n Identification: " Identification
             "\n DF: " DF
             "\n MF: " MF
             "\n FragmentOffset: " FragmentOffset
             "\n Protocol: " Protocol
             "\n CheckSum: " CheckSum
             "\n SourceAddr: "
                 (bitmatch SourceAddr (((A)(B)(C)(D)) (list A B C D)))
             "\n DestinationAddr: "
                 (bitmatch DestinationAddr (((A)(B)(C)(D)) (list A B C D)))))
  (else
    (print "bad datagram")))


; Example 3.1 Using bitconstruct.

(define (construct-fixed-string str)
  (bitconstruct
    ((string-length str) 16) (str bitstring) ))

; The following will print "#t".  First, it reads a 16-bit number length
; and compares it to the immediate value of 7.  Then it will read a
; string and compare it to the immediate value of "qwerty.".  If there
; was any remaining data in the string, it would fail.
(bitmatch (construct-fixed-string "qwerty.")
  (((7 16) ("qwerty."))
    (print #t))
  (else
    (print #f)))

; Example 3.2 Concatenating bitstrings.

(define (construct-complex-object)
  (bitconstruct
    ((construct-fixed-string "A") bitstring)
    (#xAABB 16)
    ((construct-fixed-string "RRR") bitstring)
    (#\X)))

(print (construct-complex-object))

; Basic TGA image parser.
; Supports the True-Color image type and Run-Length-Encoding compression.
; SPEC: http://www.dca.fee.unicamp.br/~martino/disciplinas/ea978/tgaffs.pdf
; Full Source: https://bitbucket.org/rivo/bitstring/src/tip/tests?at=default
;
; WARNING!!! bitpacket feature is experimental !!!
(import bitstring srfi-4 (chicken file posix))

(bitpacket TGA-Header
  (ID-length 8)
  (ColorMapType 8)
  (ImageType 8)
  (TGA-ColorMapSpec bitpacket)
  (TGA-ImageSpec bitpacket))

(bitpacket TGA-ColorMapSpec
  (FirstEntryIndex 16 little)
  (ColorMapLength 16 little)
  (ColorMapEntrySize 8))

(bitpacket TGA-ImageSpec
  (X-Origin 16 little)
  (Y-Origin 16 little)
  (ImageWidth 16 little)
  (ImageHeight 16 little)
  (PixelDepth 8)
  (ImageTransferOrder 2)
  (#x00 2) ; reserved
  (AttributesBitsPerPixel 4))

(define (parse-tga file file-out)
  (let* ((fi (file-open file (+ open/rdonly open/binary)))
         (fo (file-open file-out (+ open/write open/creat open/trunc open/binary)))
         (size (file-size fi))
         (res (file-read fi size))
         (data (car res)))
    (bitmatch data
      ; True-Color uncompressed
      (((TGA-Header bitpacket)
        (check (and (= 0 ColorMapType) (= 2 ImageType)))
        (ID-data ID-length bitstring)
        (Image-data (* ImageWidth ImageHeight PixelDepth) bitstring)
        (Rest-data bitstring))
          (begin
            (print "True-Color uncompressed")
            (print ImageWidth "x" ImageHeight "x" PixelDepth)
            (parse-image-uncompressed
              (lambda (color)
                (file-write fo (bitstring->blob color)))
              PixelDepth Image-data)))
      ; True-Color compressed
      (((TGA-Header bitpacket)
        (check (and (= 0 ColorMapType) (= 10 ImageType)))
        (ID-data ID-length bitstring)
        (Image-data bitstring))
          (begin
            (print "True-Color compressed")
            (print ImageWidth "x" ImageHeight "x" PixelDepth)
            (parse-image-compressed
              (lambda (color)
                (file-write fo (bitstring->blob color)))
              PixelDepth Image-data))))))

(define (parse-image-uncompressed func depth image)
  (bitmatch image
    ((())
        'ok)
    (((Color depth bitstring) (Rest bitstring))
      (begin
        (func Color)
        (parse-image-uncompressed func depth Rest)))))

(define (parse-image-compressed func depth image)
  (bitmatch image
    ((())
        'ok)
    (((1 1) (Count 7) (Color depth bitstring) (Rest bitstring))
        (let loop ((i 0))
          (func Color)
          (if (< i Count)
            (loop (+ i 1))
            (parse-image-compressed func depth Rest))))
    (((0 1) (Count 7) (RAW-data (* depth (+ Count 1)) bitstring) (Rest bitstring))
        (begin
          (parse-image-uncompressed func depth RAW-data)
          (parse-image-compressed func depth Rest)))))

; Convert images to raw pixels
(parse-tga "tests/24compressed.tga" "tests/24c.raw")
(parse-tga "tests/24uncompressed.tga" "tests/24u.raw")

</enscript>

== License

BSD

== Repository

Bitstring is maintained in [[https://bitbucket.org/rivo/bitstring|a hg bitbucket repository]].

== Version History

1.35
Port to CHICKEN 5

1.34
Fix boolean parsing to support terms following boolean terms. (Jonathan Chan)

1.33
Fixed signed integers parsing.

Implemented boolean type.

New procs {{bitstring-bit-set?}} {{bitstring-reverse}} {{bitstring-not}}.

Removed support of half-float precision.

Implemented endian attribute for floating point types. Please note that big-endian is now the default endianness for floating point types; this can break or slow down your old code. Use the "host" attribute to restore the old behavior: (VALUE float) -> (VALUE float host).


1.11
zero-extending option for bitstring->u8vector, bitstring->blob.
Multiple arguments for bitstring-append.

1.1 Changed the naming style a bit. Speed-up with -O3 level compilation.

1.0
introduce bitstring-append, bitstring-append!
fixed bug in bitstring-append size calculation (sjamaan)
native "host" endian byteorder (sjamaan)
signed/unsigned integer attributes

0.5
Restore (check EXPRESSION) syntax, because this is not the same as matchable (?)
bytestring? function checks if a string is byte aligned.
bytestring-fold helper function.

0.4
Multiline user expressions. Bitconstruct now accepts only a single pattern. bitstring-compare renamed to bitstring=?. Optimized immediate value macro expansion. (check EXPRESSION) guard renamed to (? EXPRESSION).

0.3
install bugfixes

0.2
introduce bitconstruct

0.1
first public release