source: project/wiki/eggref/4/intarweb @ 14586

Last change on this file since 14586 was 14586, checked in by sjamaan, 10 years ago

Fix the examples a bit, update copyright notice

[[tags: egg]]

== Intarweb

[[toc:]]

=== Description

Intarweb is an advanced HTTP library.  It parses all headers into more
useful Scheme values.

=== Author

[[Peter Bex]]

=== Requirements

Requires the [[defstruct]], [[base64]] and [[uri-common]] extensions.

=== Documentation

The intarweb egg is designed to be used in a variety of
situations. For this reason, it does not try to be a full HTTP client
or server. If you need that kind of functionality, see eggs like
[[spiffy]].

=== Requests

A request object (a [[defstruct]]-type record) can be created using
the following procedure:

<procedure>(make-request #!key uri port (method 'GET) (major 1) (minor 1) (headers (make-headers '())))</procedure>

An existing request can be picked apart using the following procedures:

* <procedure>(request-uri REQUEST) => URI</procedure>
* <procedure>(request-port REQUEST) => PORT</procedure>
* <procedure>(request-method REQUEST) => SYMBOL</procedure>
* <procedure>(request-major REQUEST) => NUMBER</procedure>
* <procedure>(request-minor REQUEST) => NUMBER</procedure>
* <procedure>(request-headers REQUEST) => HEADERS</procedure>

The uri identifies the entity to retrieve on the server; it should be
a [[uri-common]]-type URI object. The port is the port the request is
written to or read from.  The method is a symbol that defines the
HTTP method to use (case sensitive). major and minor identify the
major and minor version of HTTP to use. Currently, 0.9, 1.0 and 1.1
are supported (but be careful with 0.9, it has some weird
consequences and is not widely supported). Headers must be a headers
object, which is described below.

The client will generally write requests, while the server will read them.
To write a request, use the following procedure:

<procedure>(write-request REQUEST) => REQUEST</procedure>

This will write a request line with headers to the server.  If the
request type has any body data, this should be written to the
request's port. Beware that this port can be modified by
write-request, so be sure to write to the port of the request that
write-request returns!
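
As a rough sketch of the above, the following writes a GET request to
a string port so that the result can be inspected. The {{use}} form
assumes CHICKEN 4, {{uri-reference}} comes from [[uri-common]], and
the URI and host name are placeholders:

<examples><example>
<expr>
(use intarweb uri-common)

;; Collect the written request in a string port (a stand-in for a socket).
(define out (open-output-string))

(define req
  (write-request
   (make-request uri: (uri-reference "http://example.com/index.html")
                 port: out
                 method: 'GET
                 headers: (headers '((host ("example.com" . 80)))))))

;; Any body data would be written to (request-port req), the port of
;; the request returned by write-request, not necessarily to "out".
(display (get-output-string out))
</expr>
</example></examples>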

<procedure>(read-request PORT) => REQUEST</procedure>

Reads a request object from the given input-port.  An optional request
body can be read from the request-port after calling this procedure.
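
The reading side can be sketched in the same way. This example feeds
a hand-written request to {{read-request}} through a string port; the
request text is made up for illustration and {{uri->string}} is
assumed to be available from [[uri-common]]:

<examples><example>
<expr>
(use intarweb uri-common)

;; A minimal HTTP/1.1 request, as a server might receive it.
(define in
  (open-input-string
   "GET /index.html HTTP/1.1\r\nHost: example.com\r\n\r\n"))

(define parsed (read-request in))

(print (request-method parsed))                        ; the method symbol
(print (uri->string (request-uri parsed)))             ; the request URI
(print (header-value 'host (request-headers parsed)))  ; the parsed Host header
</expr>
</example></examples>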

Requests are parsed using parse procedures, which can be customized
by overriding this parameter:

<parameter>(request-parsers [LIST])</parameter>

The value is a list of procedures, each of which accepts a request
line string and produces a request object from it, or returns {{#f}}
if the request is not of the type handled by that procedure.

The predefined request parsers are the following:

* <procedure>(http-0.9-request-parser STRING) => REQUEST</procedure>
* <procedure>(http-1.x-request-parser STRING) => REQUEST</procedure>
82Requests are written using unparse procedures, which can be
83customized by overriding this parameter:
84
85<parameter>(request-unparsers [LIST])</parameter>
86
87The list is one of procedures which accept a request object and write
88to the request's output port and return the new, possibly updated
89request object. If the request object is not unparsed by this
90handler, it returns {{#f}}.
91
92The predefined request unparsers are the following:
93
94* <procedure>(http-0.9-request-unparser REQUEST) => REQUEST</procedure>
95* <procedure>(http-1.x-request-unparser REQUEST) => REQUEST</procedure>
96
97They return the request, and as a side effect they write the
98request to the request object's port.
99
100=== Responses
101
102A response is also a [[defstruct]]-type record, much like a request:
103
104<procedure>(make-response #!key port (code 200) (reason "OK") (major 1) (minor 1) (headers (make-headers '())))</procedure>
105
106An existing response can be picked apart using the following procedures:
107* <procedure>(response-port RESPONSE) => PORT</procedure>
108* <procedure>(response-code RESPONSE) => NUMBER</procedure>
109* <procedure>(response-reason RESPONSE) => STRING</procedure>
110* <procedure>(response-major RESPONSE) => NUMBER</procedure>
111* <procedure>(response-minor RESPONSE) => NUMBER</procedure>
112* <procedure>(response-headers RESPONSE) => HEADERS</procedure>
113
114The port, major, minor and headers are the same as for requests. code
115and reason are an integer status code and the short message that
116belongs to it, as defined in the spec (examples include: 200 OK, 301
117Moved Permanently, etc).
118
119A server will usually write a response, a client will read it.
120To write a response, use the following procedure:
121
122<procedure>(write-response RESPONSE) => RESPONSE</procedure>
123
124If there is a response body, this must be written to the response-port
125after sending the response headers.
126
127<procedure>(read-response PORT) => RESPONSE</procedure>
128
129Reads a response object from the port. An optional response body can
130be read from the response-port after calling this procedure.
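
The two procedures can be tried together without a network
connection. This sketch writes a response with a short body to a
string port and reads it back as a client would; the header values
and the body text are made up, and the {{use}} form assumes CHICKEN 4:

<examples><example>
<expr>
(use intarweb)

(define res-out (open-output-string))

(define res
  (write-response
   (make-response port: res-out
                  code: 200
                  reason: "OK"
                  headers: (headers '((content-type text/plain)
                                      (content-length 6))))))

;; The body goes to the port of the response returned by write-response.
(display "hello\n" (response-port res))

;; Read the same message back, as a client would.
(define res-in
  (read-response (open-input-string (get-output-string res-out))))

(print (response-code res-in) " " (response-reason res-in))
(print (header-value 'content-length (response-headers res-in)))
</expr>
</example></examples>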

Responses are parsed using parse procedures, which can be customized
by overriding this parameter:

<parameter>(response-parsers [LIST])</parameter>

The value is a list of procedures, each of which accepts a response
line string and produces a response object from it, or returns {{#f}}
if the response is not of the type handled by that procedure.

The predefined response parsers are the following:

* <procedure>(http-0.9-response-parser STRING) => RESPONSE</procedure>
* <procedure>(http-1.x-response-parser STRING) => RESPONSE</procedure>

Responses are written using unparse procedures, which can be
customized by overriding this parameter:

<parameter>(response-unparsers [LIST])</parameter>

The value is a list of procedures, each of which accepts a response
object, writes it to the response's output port and returns the new,
possibly updated response object. If a procedure does not handle the
response object, it returns {{#f}}.

The predefined response unparsers are the following:

* <procedure>(http-0.9-response-unparser RESPONSE) => RESPONSE</procedure>
* <procedure>(http-1.x-response-unparser RESPONSE) => RESPONSE</procedure>

=== Headers

Requests and responses contain HTTP headers wrapped in a special
header-object to ensure they are properly normalized.

<procedure>(headers ALIST [HEADERS]) => HEADERS</procedure>

This creates a header object based on an input list.

<procedure>(headers->list HEADERS) => ALIST</procedure>

This converts the header object back to a list.

The above mentioned lists have header names (symbols) as keys, and
lists of values as values:

<examples><example>
<expr>
(headers `((host ("example.com" . 8080))
           (accept #(text/html ((q . 0.5)))
                   #(text/xml ((q . 0.1)))))
         old-headers)
</expr>
</example></examples>

This adds the named headers to the existing headers in
{{old-headers}}. The host header is either a string with the hostname
or a pair of hostname/port. The accept header is a list of allowed
mime-type symbols. As can be seen here, optional parameters or
"attributes" can be added to a header value by wrapping the value in a
vector of length 2. The first entry in the vector is the header value,
the second is an alist of attribute name/value pairs.

To obtain the values of any particular header, you can use:

<procedure>(header-values NAME HEADERS) => LIST</procedure>

The name of the header is a symbol. The procedure returns all the
values of the header (for example, the Accept header will have several
values that indicate the set of acceptable mime-types).

If you know in advance that a header has only one value, you can use:

<procedure>(header-value NAME HEADERS [DEFAULT]) => value</procedure>

This will return the first value in the list, or the provided default
if there is no value for that header.
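
The lookups above might be used as follows. This is a small sketch
under the assumption of CHICKEN 4's {{use}} form; the name {{hs}} and
the header values are chosen only for illustration:

<examples><example>
<expr>
(use intarweb)

(define hs
  (headers '((host ("example.com" . 8080))
             (accept #(text/html ((q . 0.5)))
                     #(text/xml ((q . 0.1)))))))

(header-values 'accept hs)            ; all values of the Accept header
(header-value 'host hs)               ; the single Host value
(header-value 'content-length hs 0)   ; header absent, so the default is returned
(headers->list hs)                    ; back to a list
</expr>
</example></examples>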

These are just shortcuts; the underlying procedures to query the raw
contents of a header are these:

* <procedure>(header-contents NAME HEADERS) => VECTOR</procedure>
* <procedure>(get-value VECTOR) => value</procedure>
* <procedure>(get-params VECTOR) => ALIST</procedure>
* <procedure>(get-param PARAM VECTOR [DEFAULT]) => value</procedure>

Header contents are 2-element vectors; the first element contains the
value for the header and the second element contains an alist with
"parameters" for that header value. Parameters are attribute/value
pairs that define further specialization of a header's value. For
example, the {{accept}} header consists of a list of mime-types, which
optionally can have a quality parameter that defines the preference
for that mime-type.  All parameter names are downcased symbols, just
like header names.
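
To make the vector layout concrete, the accessors can be applied to a
hand-built header value. The vector below is written out literally
for illustration rather than taken from a parsed message, and the
{{charset}} parameter is just an example of a missing parameter:

<examples><example>
<expr>
(use intarweb)

;; One raw header value: the mime-type text/xml with a quality parameter.
(define raw '#(text/xml ((q . 0.1))))

(get-value raw)                   ; the value itself
(get-params raw)                  ; the parameter alist
(get-param 'q raw)                ; the q parameter
(get-param 'charset raw 'utf-8)   ; no such parameter, so the default applies
</expr>
</example></examples>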

There are also special-purpose procedures for obtaining information
about specific header parameters when the spec defines certain
behaviour for them:

* <procedure>(get-quality PARAM VECTOR [DEFAULT]) => value</procedure>

This obtains the value of the {{q}} parameter for the given header
value, if any, or {{1}} if there is none defined.

==== Header types

The headers all have their own different types.  Here follows a list
of headers with their value types:

<table>
<tr><th>Header name</th><th>Value type</th><th>Example value</th></tr>
<tr>
<td>{{accept}}</td>
<td>List of mime-types (symbols), with optional {{q}} attribute
indicating "quality" (preference level)</td>
<td>{{(text/html #(text/xml ((q . 0.1))))}}</td>
</tr>
<tr>
<td>{{accept-charset}}</td>
<td>List of charset-names (symbols), with optional {{q}} attribute</td>
<td>{{(utf-8 #(iso-8859-5 ((q . 0.1))))}}</td>
</tr>
<tr>
<td>{{accept-encoding}}</td>
<td>List of encoding-names (symbols), with optional {{q}} attribute</td>
<td>{{(gzip #(identity ((q . 0))))}}</td>
</tr>
<tr>
<td>{{accept-language}}</td>
<td>List of language-names (symbols), with optional {{q}} attribute</td>
<td>{{(en-gb #(nl ((q . 0.5))))}}</td>
</tr>
<tr>
<td>{{accept-ranges}}</td>
<td>List of range types acceptable (symbols). The spec only defines
{{bytes}} and {{none}}.</td>
<td>{{(bytes)}}</td>
</tr>
<tr>
<td>{{age}}</td>
<td>Age in seconds (number)</td>
<td>{{(3600)}}</td>
</tr>
<tr>
<td>{{allow}}</td>
<td>List of methods that are allowed (symbols).</td>
<td>{{(GET POST PUT DELETE)}}</td>
</tr>
<tr>
<td>{{authorization}}</td>
<td>Authorization information. This consists of a symbol identifying the
authentication scheme, with scheme-specific attributes.</td>
<td>{{(digest #((username . "foo")))}}</td>
</tr>
<tr>
<td>{{cache-control}}</td>
<td>An alist of key/value pairs. If no value is applicable, it is {{#t}}</td>
<td>{{((public . #t) (max-stale . 10) (no-cache . (max-age set-cookie)))}}</td>
</tr>
<tr>
<td>{{connection}}</td>
<td>A list of connection options (symbols)</td>
<td>{{(close)}}</td>
</tr>
<tr>
<td>{{content-encoding}}</td>
<td>A list of encodings (symbols) applied to the entity-body.</td>
<td>{{(deflate gzip)}}</td>
</tr>
<tr>
<td>{{content-language}}</td>
<td>The natural language(s) of the "intended audience" (symbols)</td>
<td>{{(de nl en-gb)}}</td>
</tr>
<tr>
<td>{{content-length}}</td>
<td>The number of bytes (an exact number) in the entity-body</td>
<td>{{(10)}}</td>
</tr>
<tr>
<td>{{content-location}}</td>
<td>A location that the content can be retrieved from (a uri-common object)</td>
<td>{{(<#uri-common# ...>)}}</td>
</tr>
<tr>
<td>{{content-md5}}</td>
<td>The MD5 checksum (a string) of the entity-body</td>
<td>{{("12345ABCDEF")}}</td>
</tr>
<tr>
<td>{{content-range}}</td>
<td>Content range (pair with start- and endpoint) of the entity-body, if partially sent</td>
<td>{{((25 . 120))}}</td>
</tr>
<tr>
<td>{{content-type}}</td>
<td>The mime type of the entity-body (a symbol)</td>
<td>{{(text/html)}}</td>
</tr>
<tr>
<td>{{date}}</td>
<td>A timestamp (10-element vector, see {{string->time}}) at which the message originated</td>
<td>{{(#(42 23 15 20 6 108 0 309 #f 0))}}</td>
</tr>
<tr>
<td>{{etag}}</td>
<td>An entity-tag (pair, car being either the symbol weak or strong, cdr being a symbol) that uniquely identifies the resource contents.</td>
<td>{{((strong . foo123))}}</td>
</tr>
<tr>
<td>{{expect}}</td>
<td>Expectations of the server's behaviour (alist of symbol-string pairs), possibly with parameters.</td>
<td>{{(#(((100-continue . #t)) ()))}}</td>
</tr>
<tr>
<td>{{expires}}</td>
<td>Expiry timestamp (10-element vector, see {{string->time}}) for the entity</td>
<td>{{(#(42 23 15 20 6 108 0 309 #f 0))}}</td>
</tr>
<tr>
<td>{{from}}</td>
<td>The e-mail address (a string) of the human user who controls the client</td>
<td>{{("info@example.com")}}</td>
</tr>
<tr>
<td>{{host}}</td>
<td>The host to use (for virtual hosting). This is a pair of hostname and port</td>
<td>{{(("example.com" . 80))}}</td>
</tr>
<tr>
<td>{{if-match}}</td>
<td>Entity-tags (pair, weak/strong symbol and unique entity identifier symbol) which must match.</td>
<td>{{((strong . foo123) (strong . bar123))}}</td>
</tr>
<tr>
<td>{{if-modified-since}}</td>
<td>Timestamp (10-element vector, see {{string->time}}) which indicates since when the entity must have been modified.</td>
<td>{{(#(42 23 15 20 6 108 0 309 #f 0))}}</td>
</tr>
<tr>
<td>{{if-none-match}}</td>
<td>Entity tags (pair, weak/strong symbol and unique entity identifier symbol) which must not match.</td>
<td>{{((strong . foo123) (strong . bar123))}}</td>
</tr>
<tr>
<td>{{if-range}}</td>
<td>The range to request, if the entity was unchanged</td>
<td>TODO</td>
</tr>
<tr>
<td>{{if-unmodified-since}}</td>
<td>A timestamp (10-element vector, see {{string->time}}) since which the entity must not have been modified</td>
<td>{{(#(42 23 15 20 6 108 0 309 #f 0))}}</td>
</tr>
<tr>
<td>{{last-modified}}</td>
<td>A timestamp (10-element vector, see {{string->time}}) when the entity was last modified</td>
<td>{{(#(42 23 15 20 6 108 0 309 #f 0))}}</td>
</tr>
<tr>
<td>{{location}}</td>
<td>A location (a URI object) to which to redirect</td>
<td>{{(<#uri-object ...>)}}</td>
</tr>
<tr>
<td>{{max-forwards}}</td>
<td>The maximum number of proxies that can forward a request</td>
<td>{{(2)}}</td>
</tr>
<tr>
<td>{{pragma}}</td>
<td>An alist of symbols containing implementation-specific directives.</td>
<td>{{((no-cache . #t) (my-extension . my-value))}}</td>
</tr>
<tr>
<td>{{proxy-authenticate}}</td>
<td>Proxy authentication options (authentication scheme symbol, with parameters)</td>
<td>{{(digest #((username . "foo")))}}</td>
</tr>
<tr>
<td>{{proxy-authorization}}</td>
<td>Same as the above, only request-side instead of response-side</td>
<td>{{(digest #((username . "foo")))}}</td>
</tr>
<tr>
<td>{{range}}</td>
<td>The range of bytes (a pair of start and end) to request from the server.</td>
<td>{{((25 . 120))}}</td>
</tr>
<tr>
<td>{{referer}}</td>
<td>The referring URL (uri-common object) that linked to this one.</td>
<td>{{(<#uri-object ...>)}}</td>
</tr>
<tr>
<td>{{retry-after}}</td>
<td>Timestamp (10-element vector, see {{string->time}}) after which to retry the request if unavailable now.</td>
<td>{{(#(42 23 15 20 6 108 0 309 #f 0))}}</td>
</tr>
<tr>
<td>{{server}}</td>
<td>List of products the server uses (list of 3-tuple lists of strings; product name, product version, comment. Version and/or comment may be {{#f}}). Note that this is a single header, with a list inside it!</td>
<td>{{((("Apache" "2.2.9" "Unix") ("mod_ssl" "2.2.9" #f) ("OpenSSL" "0.9.8e" #f) ("DAV" "2" #f) ("mod_fastcgi" "2.4.2" #f) ("mod_apreq2-20051231" "2.6.0" #f)))}}</td>
</tr>
<tr>
<td>{{te}}</td>
<td>Allowed transfer-encodings (symbols, with optional q attribute) for the response</td>
<td>{{(deflate #(gzip ((q . 0.2))))}}</td>
</tr>
<tr>
<td>{{trailer}}</td>
<td>Names of header fields (symbols) available in the trailer/after body</td>
<td>{{(range etag)}}</td>
</tr>
<tr>
<td>{{transfer-encoding}}</td>
<td>The encodings (symbols) used in the body</td>
<td>{{(chunked)}}</td>
</tr>
<tr>
<td>{{upgrade}}</td>
<td>Product names to upgrade to (strings)</td>
<td>TODO</td>
</tr>
<tr>
<td>{{user-agent}}</td>
<td>List of products the user agent uses (list of 3-tuple lists of strings; product name, product version, comment. Version and/or comment may be {{#f}}). Note that this is a single header, with a list inside it!</td>
<td>{{((("Mozilla" "5.0" "X11; U; NetBSD amd64; en-US; rv:1.9.0.3") ("Gecko" "2008110501" #f) ("Minefield" "3.0.3" #f)))}}</td>
</tr>
<tr>
<td>{{vary}}</td>
<td>The names of headers that define variation in the resource body, to determine cacheability (symbols)</td>
<td>{{(range etag)}}</td>
</tr>
<tr>
<td>{{via}}</td>
<td>The intermediate hops through which the message is forwarded (strings)</td>
<td>TODO</td>
</tr>
<tr>
<td>{{warning}}</td>
<td>Warning code for special status</td>
<td>TODO</td>
</tr>
<tr>
<td>{{www-authenticate}}</td>
<td>If unauthorized, a challenge to authenticate (symbol, with attributes)</td>
<td>{{(digest #((username . "foo")))}}</td>
</tr>
<tr>
<td>{{set-cookie}}</td>
<td>Cookies to set (name/value string pair, with attributes)</td>
<td>{{(#(("foo" . "bar") ((max-age . 10))))}}</td>
</tr>
<tr>
<td>{{cookie}}</td>
<td>Cookies that were set (name/value string pair, with attributes)</td>
<td>{{(#(("foo" . "bar") (($path . "/"))))}}</td>
</tr>
</table>

Any unrecognised headers are assumed to be multi-headers, and the
entire header lines are put unparsed into a list, one entry per line.

==== Header parsers and unparsers

The parsers and unparsers used to read and write header values can be
customized with the following parameters:

* <parameter>(header-parsers [ALIST])</parameter>
* <parameter>(header-unparsers [ALIST])</parameter>

These parameters hold alists with the header name (a symbol) as key
and a procedure as value. The procedure accepts three arguments: the
name of the header (symbol), the contents of the header (a string,
without the leading header name and colon) and the preceding
headers. It should merge the new header with the preceding headers and
return the resulting headers.

Header parsers are supposed to call these procedures to add headers:

* <procedure>(replace-header-contents NAME CONTENTS HEADERS) => HEADERS</procedure>
* <procedure>(replace-header-contents! NAME CONTENTS HEADERS) => HEADERS</procedure>
* <procedure>(update-header-contents NAME CONTENTS HEADERS) => HEADERS</procedure>
* <procedure>(update-header-contents! NAME CONTENTS HEADERS) => HEADERS</procedure>

The {{replace}} procedures replace any existing contents of the named
header with new ones; the {{update}} procedures add these contents to
the existing header. The procedures with a name ending in bang are
linear update variants of the ones without the bang. The header
contents have to be normalized to be a 2-element vector, with the
first element being the actual value and the second element being an
alist (possibly empty) of parameters/attributes for that value.

The update procedures append the value to the existing header if it is
a multi-header, and act as a simple replace in the case of a
single-header.

Whether a header is allowed once or multiple times in a request or
response is determined by this parameter:

<parameter>(single-headers [LIST])</parameter>

The value is a list of symbols that define header-names which are
allowed to occur only once in a request/response.
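
Putting these pieces together, a parser for an application-specific
header could look roughly like this. Everything named
{{x-request-id}} is hypothetical, and the sketch assumes that the
alist entries are {{(name . procedure)}} pairs and that the contents
are passed as a single 2-element vector, as described above; check
this against the version of the egg you have installed:

<examples><example>
<expr>
(use intarweb)

;; Hypothetical parser for a made-up X-Request-Id header: store the raw
;; string as the value, with an empty parameter alist.
(define (x-request-id-parser name contents headers)
  (replace-header-contents name (vector contents '()) headers))

(define with-id
  (parameterize ((header-parsers
                  (cons (cons 'x-request-id x-request-id-parser)
                        (header-parsers)))
                 ;; Declare that the header may occur only once.
                 (single-headers
                  (cons 'x-request-id (single-headers))))
    (read-request
     (open-input-string
      "GET / HTTP/1.1\r\nX-Request-Id: abc123\r\n\r\n"))))

(print (header-value 'x-request-id (request-headers with-id)))
</expr>
</example></examples>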

* <procedure>(http-name->symbol-name STRING) => SYMBOL</procedure>
* <procedure>(symbol->http-name SYMBOL) => STRING</procedure>

The first procedure converts a string containing the name of a header
or attribute (parameter name) to a symbol representing the same. The
symbols are completely downcased.  The second procedure converts such
a symbol back to a string, capitalizing the initial letters of all the
words in the header name or attribute.
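
For example, following the description above (the results in the
comments are what that description implies, not captured output):

<examples><example>
<expr>
(use intarweb)

(http-name->symbol-name "Content-Type")   ; a downcased symbol, content-type
(symbol->http-name 'content-type)         ; capitalized again, "Content-Type"
</expr>
</example></examples>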

* <procedure>(remove-header name headers) => headers</procedure>
* <procedure>(remove-header! name headers) => headers</procedure>

These two procedures remove all headers with the given name.

=== Other procedures

<procedure>(keep-alive? request-or-response)</procedure>

Returns {{#t}} when the given request or response object belongs
to a connection that should be kept alive, and {{#f}} otherwise.
Remember that both parties must agree on whether the connection is to
be kept alive or not; HTTP/1.1 defaults to keep alive unless a
{{Connection: close}} header is sent, HTTP/1.0 defaults to closing
the connection, unless a {{Connection: Keep-Alive}} header is sent.
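
The defaults described above can be checked on freshly made objects.
This is a sketch only: the objects carry no uri or port, which is
assumed to be acceptable for this predicate, and the variable names
are made up:

<examples><example>
<expr>
(use intarweb)

;; HTTP/1.1 without a Connection header: kept alive by default.
(define ka-default (keep-alive? (make-request major: 1 minor: 1)))

;; HTTP/1.1 with "Connection: close".
(define ka-closed
  (keep-alive? (make-request major: 1 minor: 1
                             headers: (headers '((connection close))))))

;; HTTP/1.0 with "Connection: Keep-Alive".
(define ka-old
  (keep-alive? (make-response major: 1 minor: 0
                              headers: (headers '((connection keep-alive))))))

(print (list ka-default ka-closed ka-old))
</expr>
</example></examples>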

=== Changelog

* 0.1 Initial version

=== License

  Copyright (c) 2008-2009, Peter Bex
  All rights reserved.
 
  Redistribution and use in source and binary forms, with or without
  modification, are permitted provided that the following conditions are
  met:
 
  Redistributions of source code must retain the above copyright
  notice, this list of conditions and the following disclaimer.
 
  Redistributions in binary form must reproduce the above copyright
  notice, this list of conditions and the following disclaimer in the
  documentation and/or other materials provided with the distribution.
 
  Neither the name of the author nor the names of its contributors may
  be used to endorse or promote products derived from this software
  without specific prior written permission.
 
  THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
  "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
  LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
  FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
  COPYRIGHT HOLDERS OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT,
  INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
  (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
  SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
  HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT,
  STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
  ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED
  OF THE POSSIBILITY OF SUCH DAMAGE.