1 | == bitstring |
---|
2 | |
---|
3 | == Description |
---|
4 | |
---|
5 | Easy binary data manipulation. |
---|
6 | Support binary encoding-decoding with matching and condition guards. |
---|
7 | Implements the subset of Erlang bit syntax. |
---|
8 | |
---|
9 | == Authors |
---|
10 | |
---|
11 | rivo |
---|
12 | |
---|
13 | == Requirements |
---|
14 | |
---|
15 | No requirements. |
---|
16 | |
---|
17 | == API |
---|
18 | |
---|
19 | == Basic syntax description |
---|
20 | |
---|
21 | <enscript highlight="scheme"> |
---|
22 | (bitmatch binary-data |
---|
23 | ((pattern ...) expression) |
---|
24 | ... |
---|
25 | (else expression)) |
---|
26 | |
---|
27 | (bitconstruct |
---|
28 | pattern ...) |
---|
29 | </enscript> |
---|
30 | |
---|
31 | == Patterns description |
---|
32 | |
---|
33 | ;(NAME) : Read one byte from the stream and bind it to the variable NAME, or compare it with an immediate value if NAME is not a symbol. Supported immediate value types: integer, char, string. |
---|
34 | |
---|
35 | ;(NAME BITS) : Read a BITS-bit big-endian unsigned integer, bind-or-compare to NAME. |
---|
36 | |
---|
37 | ;(NAME BITS [big | little | host] [signed | unsigned]) : Read a {{BITS}}-bit integer with the given byte-order and signed/unsigned attributes, bind-or-compare to NAME. |
---|
38 | |
---|
39 | ;(NAME float [big | little | host]) : Read single precision floating point, bind-or-compare to NAME. |
---|
40 | |
---|
41 | ;(NAME double [big | little | host]) : Read double precision floating point, bind-or-compare to NAME. |
---|
42 | |
---|
43 | ;(NAME BITS float [big | little | host]) : Read a 32- or 64-bit floating point value, bind-or-compare to NAME. A generic way to handle single or double precision values. |
---|
44 | |
---|
45 | ;(NAME boolean) : Read an 8-bit boolean, bind-or-compare to NAME ({{#f}} if all bits are zero, {{#t}} otherwise). |
---|
46 | ;(NAME BITS boolean [big | little | host]) : A boolean may also be stored as an integer of arbitrary width with an endian attribute. When constructing, NAME is written as 0 for {{#f}} and 1 for {{#t}}. |
---|
47 | |
---|
48 | ;(NAME BITS bitstring) : Read raw BITS from stream, bind-or-compare to NAME. |
---|
49 | |
---|
50 | ;(NAME bitstring) : Greedy read, consume all available bits. |
---|
51 | |
---|
52 | ;() : Empty bitstring |
---|
53 | |
---|
54 | ;(PACKET-NAME bitpacket) : Read a packet defined by the (bitpacket PACKET-NAME ...) declaration. Bind each packet field to current lexical scope. ''!!! bitpacket is an experimental feature !!!'' |
---|
55 | |
---|
56 | ;(check EXPRESSION) : User guard EXPRESSION. Matching continues only if it evaluates to a true value. |
---|
57 | |
---|
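As an illustration of how several of the pattern forms above combine, here is a minimal sketch. The record layout, the procedure name {{parse-record}} and the field names are hypothetical, chosen only for this example; {{(use ...)}} follows the CHICKEN 4 style used elsewhere on this page.

```scheme
(use bitstring srfi-4)

; Hypothetical 4-byte record: tag byte, 4-bit kind, 4-bit flags,
; 16-bit little-endian size.
(define (parse-record data)
  (bitmatch data
    (((#x01)               ; first byte must equal the immediate value 1
      (Kind 4)             ; 4-bit big-endian unsigned integer
      (Flags 4)            ; next 4 bits of the same octet
      (Size 16 little)     ; 16-bit little-endian integer
      (check (> Size 0)))  ; guard: reject records with a zero size
     (list Kind Flags Size))
    (else #f)))

(print (parse-record (u8vector #x01 #x2F #x10 #x00))) ; (2 15 16)
(print (parse-record (u8vector #x02 #x2F #x10 #x00))) ; #f, tag mismatch
```

The second call falls through to the {{else}} clause because the first octet does not compare equal to the immediate value.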
58 | == Matching inputs against bit patterns |
---|
59 | |
---|
60 | <macro>(bitmatch binary-data patterns-list else-guard)</macro> |
---|
61 | |
---|
62 | * Match {{binary-data}} against the patterns from {{patterns-list}}. |
---|
63 | * {{binary-data}} may be a bitstring object or a value of any of the following data types: u8vector, string or regular vector. |
---|
64 | * If nothing matches and an {{else-guard}} clause was not specified, an exception of type {{bitstring-match-failure}} is raised. |
---|
65 | * The {{else-guard}} clause is optional. |
---|
66 | |
---|
67 | == Constructing bitstrings from input based on bit patterns |
---|
68 | |
---|
69 | <macro>(bitconstruct pattern)</macro> |
---|
70 | |
---|
71 | * Construct a bitstring based on the pattern. The values are taken from identifiers in the current lexical scope. |
---|
72 | * If nothing matches an exception of type {{bitstring-match-failure}} is raised. |
---|
73 | * Supports a special pattern for concatenating bitstrings: |
---|
74 | |
---|
75 | ; ((EXPRESSION ...) bitstring) : EXPRESSION should evaluate to bitstring during constructing. |
---|
76 | |
---|
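A minimal sketch of {{bitconstruct}} with identifiers from the lexical scope. The procedure name {{make-record}} and the record layout are assumptions made for this example, modelled on the fixed-string example further down this page.

```scheme
(use bitstring srfi-4)

; Hypothetical record: 8-bit tag, 16-bit big-endian payload size,
; then the payload bytes themselves.
(define (make-record tag payload)
  (let ((size (string-length payload)))
    (bitconstruct
      (tag 8)                 ; value of the local variable tag, 8 bits
      (size 16)               ; 16-bit big-endian integer
      (payload bitstring))))  ; the string contents, appended as-is

; Convert to a u8vector to inspect the resulting octets.
(print (bitstring->u8vector (make-record 1 "abc")))
```

The printed vector should hold the tag octet, the two size octets (0 and 3) and the character codes of the payload.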
77 | == Defining custom bitstring forms |
---|
78 | |
---|
79 | <macro>(bitpacket PACKET-NAME fields ...)</macro> |
---|
80 | |
---|
81 | Define a named set of fields. The field syntax is the same as the bitmatch pattern syntax. |
---|
82 | |
---|
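A small sketch of declaring a packet and then matching against it. The packet name {{Point}}, its fields and the procedure {{parse-point}} are hypothetical; since {{bitpacket}} is marked experimental, treat this as an illustration rather than a guaranteed interface.

```scheme
(use bitstring srfi-4)

; Hypothetical packet: two 16-bit big-endian coordinates.
(bitpacket Point
  (X 16)
  (Y 16))

(define (parse-point data)
  (bitmatch data
    (((Point bitpacket))   ; binds X and Y in the clause body
     (list X Y))
    (else #f)))

(print (parse-point (u8vector #x00 #x0A #x00 #x14))) ; (10 20)
```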
83 | == Dealing with bitstring objects |
---|
84 | |
---|
85 | Bitstring objects represent strings of bits of arbitrary length. This |
---|
86 | means they can store any number of unaligned bits, rather like |
---|
87 | bitfields in C. Bitstrings can also share memory with other bitstrings, |
---|
88 | which means you can easily create sub-bitstrings from other bitstrings. |
---|
89 | |
---|
90 | <procedure>(bitstring=? bitstring1 bitstring2)</procedure> |
---|
91 | |
---|
92 | Compare bitstrings. |
---|
93 | |
---|
94 | <procedure>(bitstring-bit-set? bitstring bit-index)</procedure> |
---|
95 | |
---|
96 | Test the bit value at position {{bit-index}}. A negative index counts from the end of the bitstring, as the second example shows. |
---|
97 | |
---|
98 | (bitstring-bit-set? (->bitstring '#${80 00}) 0) => #t |
---|
99 | (bitstring-bit-set? (->bitstring '#${00 01}) -1) => #t |
---|
100 | |
---|
101 | <procedure>(bitstring->list bitstring [bits [byte-order]])</procedure> |
---|
102 | |
---|
103 | Convert bitstring to list of bits. |
---|
104 | |
---|
105 | Optional group {{bits}}, default value 1, indicates how many bits each |
---|
106 | entry in the list should span. For example, to see the contents |
---|
107 | grouped by octet, use 8 here. |
---|
108 | |
---|
109 | Optional {{byte-order}}: {{'little}} for little-endian, {{'big}} for big-endian, |
---|
110 | {{'host}} for the host system byte order; the default is {{'big}}. This has an effect only if {{bits}} |
---|
111 | is larger than 8. |
---|
112 | |
---|
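For illustration, a short sketch of the grouping argument. It also uses {{string->bitstring}} and {{bitstring-length}}, which are described further down this page; the variable name {{bs}} is arbitrary.

```scheme
(use bitstring)

(define bs (string->bitstring "AB"))   ; two octets, 65 and 66

(print (bitstring-length bs))          ; 16 (length in bits)
(print (bitstring->list bs 8))         ; (65 66), grouped by octet
(print (length (bitstring->list bs)))  ; 16, one list entry per bit
```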
113 | <procedure>(bitstring-reverse bitstring [bits [byte-order]])</procedure> |
---|
114 | |
---|
115 | Reverse bitstring, optional group {{bits}} (default 1) with {{byte-order}} (default {{'big}}). |
---|
116 | |
---|
117 | (bitstring-reverse (->bitstring '#${01 02}) 1) => #${40 80} |
---|
118 | |
---|
119 | (bitstring-reverse (->bitstring '#${01 02}) 8 'little) => #${02 01} |
---|
120 | |
---|
121 | <procedure>(bitstring-not bitstring)</procedure> |
---|
122 | |
---|
123 | Invert each bit value. |
---|
124 | |
---|
125 | (bitstring-not (->bitstring '#${FE 00})) => #${01 FF} |
---|
126 | |
---|
127 | <procedure>(bitstring? obj)</procedure> |
---|
128 | |
---|
129 | Returns {{#t}} or {{#f}} depending on whether {{obj}} is a bitstring |
---|
130 | or another type of object. |
---|
131 | |
---|
132 | <procedure>(bitstring-length bitstring)</procedure> |
---|
133 | |
---|
134 | Return length of the bitstring in bits. |
---|
135 | |
---|
136 | <procedure>(bitstring-append bitstring1 bitstringN ...)</procedure> |
---|
137 | |
---|
138 | Concatenate bitstrings. |
---|
139 | |
---|
140 | <procedure>(bitstring-append! dest-bitstring bitstringN ...)</procedure> |
---|
141 | |
---|
142 | Concatenate bitstrings and store the result in {{dest-bitstring}}. |
---|
143 | |
---|
144 | <procedure>(->bitstring obj)</procedure> |
---|
145 | |
---|
146 | Construct bitstring from arbitrary object. |
---|
147 | |
---|
148 | <procedure>(vector->bitstring vec)</procedure> |
---|
149 | Each element of vector {{vec}} should be an integer in the range 0 to 255. |
---|
150 | |
---|
151 | <procedure>(u8vector->bitstring vec)</procedure> |
---|
152 | <procedure>(string->bitstring str)</procedure> |
---|
153 | |
---|
154 | <procedure>(bitstring->blob bitstring [zero-extending])</procedure> |
---|
155 | <procedure>(bitstring->u8vector bitstring [zero-extending])</procedure> |
---|
156 | |
---|
157 | If the bitstring is not aligned on an 8-bit boundary, the remaining bits are extended with zeroes. |
---|
158 | The optional {{zero-extending}} argument selects the direction: {{'left}} gives the integer value of the remaining bits, |
---|
159 | {{'right}} gives the internal bitstring representation, where bits follow one after another. The default is {{'left}}. |
---|
160 | |
---|
161 | Zero-extending to the left: |
---|
162 | <bitstring 0 9 (1 1 1 1 1 1 1 1 1)> turns into #u8(#xff #x01) |
---|
163 | |
---|
164 | Zero-extending to the right; this may be useful when you want to store a bitstring on disk and load it back later: |
---|
165 | <bitstring 0 9 (1 1 1 1 1 1 1 1 1)> turns into #u8(#xff #x80) |
---|
166 | |
---|
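The two directions can be tried out with a sketch like the following. The helper name {{first-nine-bits}} is hypothetical; it extracts a 9-bit bitstring of all ones with {{bitmatch}} and converts it with the requested zero-extending direction.

```scheme
(use bitstring srfi-4)

; Take the first 9 bits (all ones) of the two octets #xFF #x80
; and convert them using the given zero-extending direction.
(define (first-nine-bits direction)
  (bitmatch (u8vector #xFF #x80)
    (((Nine 9 bitstring) (Rest bitstring))
     (bitstring->u8vector Nine direction))))

(print (first-nine-bits 'left))
(print (first-nine-bits 'right))
```

Going by the description above, the first call should yield the octets {{#xff #x01}} and the second {{#xff #x80}}.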
167 | == Examples |
---|
168 | |
---|
169 | <enscript highlight="scheme"> |
---|
170 | |
---|
171 | ; Example 1. Tagged data structure. |
---|
172 | ; |
---|
173 | ; struct Tagged { |
---|
174 | ; enum { IntegerType = 1, FloatType = 2 }; |
---|
175 | ; unsigned char Tag; // integer type = 1, float type = 2 |
---|
176 | ; union { |
---|
177 | ; unsigned int IValue; |
---|
178 | ; float FValue; |
---|
179 | ; }; |
---|
180 | ; }; |
---|
181 | ; |
---|
182 | |
---|
183 | (use bitstring) |
---|
184 | |
---|
185 | ; The following will print "integer:3721182122", |
---|
186 | ; which is the decimal value of #xDDCCBBAA |
---|
187 | (bitmatch "\x01\xAA\xBB\xCC\xDD" |
---|
188 | (((#x01) (IValue 32 little)) |
---|
189 | (print "integer:" IValue)) |
---|
190 | (((#x02) (FValue 32 float)) |
---|
191 | (print "float:" FValue))) |
---|
192 | |
---|
193 | ; Example 2. Fixed length string. |
---|
194 | ; |
---|
195 | ; struct FixedString { |
---|
196 | ; short Length; // length of StringData array |
---|
197 | ; char StringData[0]; |
---|
198 | ; }; |
---|
199 | ; |
---|
200 | |
---|
201 | (use bitstring) |
---|
202 | |
---|
203 | ; This will print "StringData:(65 66 67 68 69)" |
---|
204 | ; First it reads the 16-bit little-endian length of 5, binds it to Length, and |
---|
205 | ; then reads a bitstring with a length of that many octets. |
---|
206 | (bitmatch "\x05\x00ABCDE" |
---|
207 | (((Length 16 little) |
---|
208 | (StringData (* 8 Length) bitstring)) |
---|
209 | (print "StringData:" (bitstring->list StringData 8))) |
---|
210 | (else |
---|
211 | (print "invalid string"))) |
---|
212 | |
---|
213 | ; Example 3. IP packet parsing. |
---|
214 | ; |
---|
215 | |
---|
216 | (use bitstring srfi-4) |
---|
217 | |
---|
218 | (define IPRaw `#u8( #x45 #x00 #x00 #x6c |
---|
219 | #x92 #xcc #x00 #x00 |
---|
220 | #x38 #x06 #x00 #x00 |
---|
221 | #x92 #x95 #xba #x14 |
---|
222 | #xa9 #x7c #x15 #x95 )) |
---|
223 | |
---|
224 | (bitmatch IPRaw |
---|
225 | (((Version 4) |
---|
226 | (IHL 4) |
---|
227 | (TOS 8) |
---|
228 | (TL 16) |
---|
229 | (Identification 16) |
---|
230 | (Reserved 1) (DF 1) (MF 1) |
---|
231 | (FragmentOffset 13) |
---|
232 | (TTL 8) |
---|
233 | (Protocol 8) (check (or (= Protocol 1) |
---|
234 | (= Protocol 2) |
---|
235 | (= Protocol 6) |
---|
236 | (= Protocol 17))) |
---|
237 | (CheckSum 16) |
---|
238 | (SourceAddr 32 bitstring) |
---|
239 | (DestinationAddr 32 bitstring) |
---|
240 | (Optional bitstring)) |
---|
241 | ; print packet fields |
---|
242 | (print "\n Version: " Version |
---|
243 | "\n IHL: " IHL |
---|
244 | "\n TOS: " TOS |
---|
245 | "\n TL: " TL |
---|
246 | "\n Identification: " Identification |
---|
247 | "\n DF: " DF |
---|
248 | "\n MF: " MF |
---|
249 | "\n FragmentOffset: " FragmentOffset |
---|
250 | "\n Protocol: " Protocol |
---|
251 | "\n CheckSum: " CheckSum |
---|
252 | "\n SourceAddr: " |
---|
253 | (bitmatch SourceAddr (((A)(B)(C)(D)) (list A B C D))) |
---|
254 | "\n DestinationAddr: " |
---|
255 | (bitmatch DestinationAddr (((A)(B)(C)(D)) (list A B C D))))) |
---|
256 | (else |
---|
257 | (print "bad datagram"))) |
---|
258 | |
---|
259 | ; Example 3.1 Using bitconstruct. |
---|
260 | |
---|
261 | (define (construct-fixed-string str) |
---|
262 | (bitconstruct |
---|
263 | ((string-length str) 16) (str bitstring) )) |
---|
264 | |
---|
265 | ; The following will print "#t". First, it reads a 16-bit number length |
---|
266 | ; and compares it to the immediate value of 7. Then it will read a |
---|
267 | ; string and compare it to the immediate value of "qwerty.". If there |
---|
268 | ; was any remaining data in the string, it would fail. |
---|
269 | (bitmatch (construct-fixed-string "qwerty.") |
---|
270 | (((7 16) ("qwerty.")) |
---|
271 | (print #t)) |
---|
272 | (else |
---|
273 | (print #f))) |
---|
274 | |
---|
275 | ; Example 3.2 Concatenating bitstrings. |
---|
276 | |
---|
277 | (define (construct-complex-object) |
---|
278 | (bitconstruct |
---|
279 | ((construct-fixed-string "A") bitstring) |
---|
280 | (#xAABB 16) |
---|
281 | ((construct-fixed-string "RRR") bitstring) |
---|
282 | (#\X))) |
---|
283 | |
---|
284 | (print (construct-complex-object)) |
---|
285 | |
---|
286 | ; Basic TGA image parser. |
---|
287 | ; Supports the True-Color image type and run-length-encoding compression. |
---|
288 | ; SPEC: http://www.dca.fee.unicamp.br/~martino/disciplinas/ea978/tgaffs.pdf |
---|
289 | ; Full Source: https://bitbucket.org/rivo/bitstring/src/tip/tests?at=default |
---|
290 | ; |
---|
291 | ; WARNING!!! bitpacket feature is experimental !!! |
---|
292 | (use bitstring posix srfi-4) |
---|
293 | |
---|
294 | (bitpacket TGA-Header |
---|
295 | (ID-length 8) |
---|
296 | (ColorMapType 8) |
---|
297 | (ImageType 8) |
---|
298 | (TGA-ColorMapSpec bitpacket) |
---|
299 | (TGA-ImageSpec bitpacket)) |
---|
300 | |
---|
301 | (bitpacket TGA-ColorMapSpec |
---|
302 | (FirstEntryIndex 16 little) |
---|
303 | (ColorMapLength 16 little) |
---|
304 | (ColorMapEntrySize 8)) |
---|
305 | |
---|
306 | (bitpacket TGA-ImageSpec |
---|
307 | (X-Origin 16 little) |
---|
308 | (Y-Origin 16 little) |
---|
309 | (ImageWidth 16 little) |
---|
310 | (ImageHeight 16 little) |
---|
311 | (PixelDepth 8) |
---|
312 | (ImageTransferOrder 2) |
---|
313 | (#x00 2) ; reserved |
---|
314 | (AttributesBitsPerPixel 4)) |
---|
315 | |
---|
316 | (define (parse-tga file file-out) |
---|
317 | (let* ((fi (file-open file (+ open/rdonly open/binary))) |
---|
318 | (fo (file-open file-out (+ open/write open/creat open/trunc open/binary))) |
---|
319 | (size (file-size fi)) |
---|
320 | (res (file-read fi size)) |
---|
321 | (data (car res))) |
---|
322 | (bitmatch data |
---|
323 | ; True-Color uncompressed |
---|
324 | (((TGA-Header bitpacket) |
---|
325 | (check (and (= 0 ColorMapType) (= 2 ImageType))) |
---|
326 | (ID-data (* 8 ID-length) bitstring) ; ID-length is given in bytes |
---|
327 | (Image-data (* ImageWidth ImageHeight PixelDepth) bitstring) |
---|
328 | (Rest-data bitstring)) |
---|
329 | (begin |
---|
330 | (print "True-Color uncompressed") |
---|
331 | (print ImageWidth "x" ImageHeight "x" PixelDepth) |
---|
332 | (parse-image-uncompressed |
---|
333 | (lambda (color) |
---|
334 | (file-write fo (bitstring->blob color))) |
---|
335 | PixelDepth Image-data))) |
---|
336 | ; True-Color compressed |
---|
337 | (((TGA-Header bitpacket) |
---|
338 | (check (and (= 0 ColorMapType) (= 10 ImageType))) |
---|
339 | (ID-data (* 8 ID-length) bitstring) ; ID-length is given in bytes |
---|
340 | (Image-data bitstring)) |
---|
341 | (begin |
---|
342 | (print "True-Color compressed") |
---|
343 | (print ImageWidth "x" ImageHeight "x" PixelDepth) |
---|
344 | (parse-image-compressed |
---|
345 | (lambda (color) |
---|
346 | (file-write fo (bitstring->blob color))) |
---|
347 | PixelDepth Image-data)))))) |
---|
348 | |
---|
349 | (define (parse-image-uncompressed func depth image) |
---|
350 | (bitmatch image |
---|
351 | ((()) |
---|
352 | 'ok) |
---|
353 | (((Color depth bitstring) (Rest bitstring)) |
---|
354 | (begin |
---|
355 | (func Color) |
---|
356 | (parse-image-uncompressed func depth Rest))))) |
---|
357 | |
---|
358 | (define (parse-image-compressed func depth image) |
---|
359 | (bitmatch image |
---|
360 | ((()) |
---|
361 | 'ok) |
---|
362 | (((1 1) (Count 7) (Color depth bitstring) (Rest bitstring)) |
---|
363 | (let loop ((i 0)) |
---|
364 | (func Color) |
---|
365 | (if (< i Count) |
---|
366 | (loop (+ i 1)) |
---|
367 | (parse-image-compressed func depth Rest)))) |
---|
368 | (((0 1) (Count 7) (RAW-data (* depth (+ Count 1)) bitstring) (Rest bitstring)) |
---|
369 | (begin |
---|
370 | (parse-image-uncompressed func depth RAW-data) |
---|
371 | (parse-image-compressed func depth Rest))))) |
---|
372 | |
---|
373 | ; Convert images to raw pixels |
---|
374 | (parse-tga "tests/24compressed.tga" "tests/24c.raw") |
---|
375 | (parse-tga "tests/24uncompressed.tga" "tests/24u.raw") |
---|
376 | |
---|
377 | </enscript> |
---|
378 | |
---|
379 | == License |
---|
380 | |
---|
381 | BSD |
---|
382 | |
---|
383 | == Repository |
---|
384 | |
---|
385 | Bitstring is maintained in [[https://bitbucket.org/rivo/bitstring|a Mercurial repository on Bitbucket]]. |
---|
386 | |
---|
387 | == Version History |
---|
388 | |
---|
389 | 1.34 |
---|
390 | Fix boolean parsing to support terms following boolean terms. (Jonathan Chan) |
---|
391 | |
---|
392 | 1.33 |
---|
393 | Fixed signed integers parsing. |
---|
394 | |
---|
395 | Implemented boolean type. |
---|
396 | |
---|
397 | New procs {{bitstring-bit-set?}} {{bitstring-reverse}} {{bitstring-not}}. |
---|
398 | |
---|
399 | Removed support of half-float precision. |
---|
400 | |
---|
401 | Implemented endian attribute for floating point types. Please note that big-endian is now the default endianness for floating point types; this can break or slow down your old code. Use the "host" attribute to restore the old behavior: (VALUE float) -> (VALUE float host). |
---|
402 | |
---|
403 | |
---|
404 | 1.11 |
---|
405 | zero-extending option for bitstring->u8vector,bitstring->blob. |
---|
406 | Multiple argument for bitstring-append. |
---|
407 | |
---|
408 | 1.1 Change a bit naming style. Speed-up with -O3 level compilation. |
---|
409 | |
---|
410 | 1.0 |
---|
411 | introduce bitstring-append, bitstring-append! |
---|
412 | fixed bug in bitstring-append size calculation (sjamaan) |
---|
413 | native "host" endian byteorder (sjamaan) |
---|
414 | signed/unsigned integer attributes |
---|
415 | |
---|
416 | 0.5 |
---|
417 | Restore (check EXPRESSION) syntax, because it is not the same as matchable's (?) |
---|
418 | bytestring? function checks if a string is byte aligned. |
---|
419 | bytestring-fold helper function. |
---|
420 | |
---|
421 | 0.4 |
---|
422 | Multiline user expressions. Bitconstruct now accepts only a single pattern. bitstring-compare renamed to bitstring=?. Optimized immediate value macro expansion. (check EXPRESSION) guard renamed to (? EXPRESSION). |
---|
423 | |
---|
424 | 0.3 |
---|
425 | install bugfixes |
---|
426 | |
---|
427 | 0.2 |
---|
428 | introduce bitconstruct |
---|
429 | |
---|
430 | 0.1 |
---|
431 | first public release |
---|