[[tags: egg]]

== http-client

[[toc:]]

=== Description

Http-client is a high-level HTTP client library.

=== Author

[[/users/peter-bex|Peter Bex]]

=== Requirements

Requires the [[intarweb]] and [[openssl]] extensions.

=== Documentation

This egg is still under development; the API might change a bit in
future versions.

==== Main request procedures

<procedure>(call-with-response request writer reader)</procedure>

This is the core http-client procedure. It is only necessary to use
this when you want the most control over the request/response cycle.
{{request}} is the request object that contains information about the
request to perform. {{writer}} is a procedure that receives the
request object and should write the request body. {{reader}} is a
procedure that receives the response object and should read the
''entire'' response body (any leftover data will cause errors on
subsequent requests with keepalive connections).

The {{writer}} should be prepared to be called several times; if the
response is a redirect or some other status that indicates the server
wants the client to perform a new request, the writer should be ready
to write a request body for this new request. In case digest
authentication with message integrity checking is used, {{writer}} is
always invoked at least twice, once to determine the message digest of
the request body and once to actually write the request body.

Returns three values: The result of the call to {{reader}}, the
request-uri of the last request and the response object. The
request-uri is useful because it should be used as the base URI of
the document. This can differ from the URI of the initial request in
the presence of redirects.

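As a rough illustration of the calling convention described above, the
following sketch performs a single {{GET}} with {{call-with-response}}.
It assumes that {{header-value}}, {{response-headers}},
{{response-port}} and {{response-code}} from [[intarweb]] behave as
documented there; {{example.com}} is only a placeholder host. The
reader consumes exactly {{content-length}} bytes when that header is
present, so that no data is left behind on a keepalive connection.

<enscript highlight="scheme">
(use http-client intarweb uri-common)

(receive (body uri response)
    (call-with-response
     (make-request method: 'GET
                   uri: (uri-reference "http://example.com/"))
     ;; The writer receives the request object; a GET has no body to write.
     (lambda (request) (void))
     ;; The reader receives the response object and must consume the body.
     (lambda (response)
       (let ((len (header-value 'content-length
                                (response-headers response))))
         ;; LEN is #f when no content-length header was sent; read-string
         ;; then reads until end of file.
         (read-string len (response-port response)))))
  (print "Fetched " (uri->string uri)
         " with status " (response-code response))
  (print body))
</enscript>

In most programs the wrappers documented below are more convenient,
since they take care of these details.
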
<procedure>(call-with-input-request uri-or-request writer reader)</procedure>

This procedure is a convenience wrapper around {{call-with-response}}.

It is much less strict: {{uri-or-request}} can be an [[intarweb]]
request object, but also a uri-common object or even a string with
the URI in it, in which case a request object will be automatically
constructed around the URI, using the {{GET}} method when {{writer}}
is {{#f}} or the {{POST}} method when {{writer}} is not {{#f}}.

{{writer}} can be either {{#f}} (in which case nothing is written and
the {{GET}} method chosen), a string containing the raw data to send,
an alist, or a procedure that accepts a port and writes the
request data to it. If you supply a procedure, do not forget to set
the {{content-length}} header! In the other cases, whenever possible,
the length is calculated and the header automatically set for you.

If you supplied an alist, the {{content-type}} header is automatically
set to {{application/x-www-form-urlencoded}} unless there's an alist
entry whose value is a list starting with the keyword {{file:}}, in
which case {{multipart/form-data}} is used. See the examples for
{{with-input-from-request}} below.

{{reader}} is a procedure which accepts a port and reads out the data.
If there is data left in the port when it returns, this will be
automatically discarded to avoid problems.

Returns three values: The result of the call to {{reader}}, the
request-uri of the last request and the response object. If the
response code is not in the 200 class, it will throw an exception of
type {{(exn http client-error)}}, {{(exn http server-error)}} or
{{(exn http unexpected-server-response)}}, depending on the response
code. This includes {{404 not found}} (which is a {{client-error}}).
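
The condition kinds listed above can be caught with {{condition-case}}.
The following is a minimal sketch, not part of the egg itself:
{{fetch-or-false}} is a hypothetical helper and the URL is a
placeholder. It keeps only the first of the three return values and
lets {{unexpected-server-response}} conditions propagate to the caller.

<enscript highlight="scheme">
(use http-client)

;; Hypothetical helper: returns the response body as a string,
;; or #f when the server answers with a 4xx or 5xx status.
(define (fetch-or-false url)
  (condition-case
      (receive (body uri response)
          (call-with-input-request url #f
                                   (lambda (port) (read-string #f port)))
        body)
    ((exn http client-error) #f)    ; 4xx, including 404
    ((exn http server-error) #f)))  ; 5xx

(fetch-or-false "http://localhost/no-such-page")
</enscript>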

When posting multipart form data, the value of a file entry is a list
of keyword-value pairs. The following keywords are recognised:

; {{file:}} : This indicates the file to read from. Can be either a string or a port. This ''must'' be specified, everything else is optional.
; {{filename:}} : This indicates the filename to pass on to the server. If not specified or {{#f}}, the {{file:}}'s string (or port-name in case of a port) will be used.
; {{headers:}} : Additional headers to send for this entry (an [[intarweb]] headers-object).

<procedure>(with-input-from-request uri-or-request writer-thunk reader-thunk)</procedure>

Same as {{call-with-input-request}}, except when you pass a procedure
as {{reader-thunk}} or {{writer-thunk}} it has to be a thunk (lambda
of no arguments) instead of a procedure of one argument. The
{{writer-thunk}} is executed with the current output port bound to the
request port, and the {{reader-thunk}} is executed with the current
input port bound to the response port.

You can still pass {{#f}} for both or an alist or string for
{{writer-thunk}}.

===== Examples

<enscript highlight="scheme">
(use http-client)

;; Start with a simple GET request:
(with-input-from-request "http://wiki.call-cc.org/" #f read-string)
 => ;; [the chicken wiki page HTML contents]

;; Perform a POST of the key "test" with value "value" to an echo service:
(with-input-from-request "http://localhost/echo-service"
                         '((test . "value")) read-string)
 => "You posted: test=value"

;; Posting a file to the same echo service:
(with-input-from-request "http://localhost/echo-service"
                         '((test . "value")
                           (test-file file: "/tmp/myfile" filename: "hello.txt"
                                      headers: ((content-type text/plain))))
                         read-string)
 => "You posted: test=value and a file named \"hello.txt\""


;; Performing a PUT request (a less commonly used method) requires
;; constructing your request object manually:

(use intarweb uri-common) ; Required for "make-request" and "uri-reference"

(with-input-from-request
  (make-request method: 'PUT
                uri: (uri-reference "http://example.com/blabla"))
  (lambda () (print "Page contents"))
  read-string)
</enscript>


==== Request handling parameters

<parameter>(max-retry-attempts [number])</parameter>

When a request fails because of an I/O or network problem (or simply
because the remote end closed a persistent connection while we were
doing something else), the library will try to establish a new
connection and perform the request again. This parameter controls how
many times this is allowed to be done. If {{#f}}, it will never give up.

Defaults to 1.

<parameter>(retry-request? [predicate])</parameter>

The procedure in this parameter is invoked before each retry, to
determine whether the retry should take place at all. It should be a
procedure accepting a request object and returning {{#f}} or a true
value. If the value is true, the new request will be sent. Otherwise,
the error that caused the retry attempt will be re-raised.

Defaults to {{idempotent?}}, from [[intarweb]]. This is because
non-idempotent requests cannot be safely retried when it is unknown
whether the previous request reached the server or not.

<parameter>(max-redirect-depth [number])</parameter>

The maximum number of allowed redirects, or {{#f}} if there is no
limit. Currently there's no automatic redirect loop detection
algorithm implemented. If zero, no redirects will be followed at all.

Defaults to 5.

<parameter>(client-software [software-spec])</parameter>

This is the list of names, versions and comments of the software
packages that the client is using, for use in the {{user-agent}}
header which is automatically added to each request.

Defaults to {{(("Chicken Scheme HTTP-client" VERSION #f))}}, where
{{VERSION}} is the version of this egg.

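Since these are ordinary parameters, they can be overridden for the
duration of a single request with {{parameterize}}. The values in the
sketch below are arbitrary examples rather than recommendations, and
the {{"my-crawler"}} product name is made up; {{request-method}} comes
from [[intarweb]].

<enscript highlight="scheme">
(use http-client intarweb)

(parameterize ((max-retry-attempts 3)
               ;; Only retry GET requests, which is stricter than the default.
               (retry-request? (lambda (request)
                                 (eq? 'GET (request-method request))))
               ;; Follow at most two redirects.
               (max-redirect-depth 2)
               ;; Put our own product in front of the default software list.
               (client-software (cons '("my-crawler" "0.1" #f)
                                      (client-software))))
  (with-input-from-request "http://localhost/echo-service" #f read-string))
</enscript>

Outside the {{parameterize}} form the previous values are back in
effect.
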
==== Connection management

<procedure>(close-connection! uri)</procedure>

Close the connection to the server associated with the URI.

<procedure>(close-all-connections!)</procedure>

Close all connections to all servers.


==== Cookie management

http-client's cookie management is supposed to be as automatic and
DWIMmy as possible. This means it will store any cookie as instructed
by a server and all stored cookies are automatically sent back to the
server upon a new request.

However, in some cases you may want to take control of how cookies are
stored.

The API described here should be considered unstable and it may change
dramatically when someone comes up with a better way to handle cookies.

<procedure>(get-cookies-for-uri uri)</procedure>

Fetch a list of all cookies which ought to be sent to the given URI.
Cookies are vectors of two elements: a name/value pair and an alist of
attributes. In other words, these are the exact same values you can
put in a {{cookie}} header.
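
To make the cookie layout more concrete, the sketch below prints the
cookies that would be sent to a placeholder URI. It assumes that
{{get-cookies-for-uri}} expects a uri-common object rather than a
string, which is not stated above.

<enscript highlight="scheme">
(use http-client uri-common)

(for-each
 (lambda (cookie)
   (let ((name+value (vector-ref cookie 0))   ; (name . value) pair
         (attributes (vector-ref cookie 1)))  ; alist of attributes
     (print (car name+value) "=" (cdr name+value) " " attributes)))
 ;; Assumption: the argument is a uri-common object.
 (get-cookies-for-uri (uri-reference "http://localhost/")))
</enscript>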

<procedure>(store-cookie! cookie-info set-cookie)</procedure>

Store a cookie in the cookiejar corresponding to the Set-Cookie header
given by {{set-cookie}}. This overwrites any cookie that is equal to
this cookie, as defined by RFC 2965, section 3.3.3. Practically, this
means that when the cookie's name, domain and path are equal to those
of an existing one, it will be overwritten by the new one. These
attributes are taken from the {{cookie-info}} alist and expected to be
there.

Generally, attributes should be taken from {{set-cookie}}, but if
missing they ought to be taken from the request URI that responded
with the {{set-cookie}}.

<procedure>(delete-cookie! cookie-name cookie-info)</procedure>

Removes any cookie from the cookiejar that is equal to the given
cookie (again, in the sense of RFC 2965, section 3.3.3).
The {{cookie-name}} must match and the {{path}} and {{domain}} values
in the {{cookie-info}} alist must match.

==== Authentication support

When a 401 Unauthorized response is received, in most interactive
clients, the user is normally asked to authenticate. To support this
type of interaction, http-client offers the following parameter:

<parameter>(determine-username/password [HANDLER])</parameter>

The procedure in this parameter is called whenever the remote
host requests authentication via a 401 Unauthorized response.

The {{HANDLER}} is a procedure of two arguments: the URI for the
resource currently being requested and the realm (a string) which
wants credentials. The procedure should return two string values:
the username and the password to use for authentication.

The default value is a procedure which extracts the username and
password components from the URI.

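A handler could, for instance, look up credentials by realm and defer
to the previously installed handler for anything it does not know
about. This is only a sketch: the {{credentials}} table, the realm name
and the account are invented for the example.

<enscript highlight="scheme">
(use http-client)

;; Hypothetical credential table: realm -> (username . password).
(define credentials
  '(("Staff area" . ("alice" . "secret"))))

;; Remember the handler that was installed before ours.
(define previous-handler (determine-username/password))

(determine-username/password
 (lambda (uri realm)
   (let ((entry (assoc realm credentials)))
     (if entry
         (values (cadr entry) (cddr entry))
         ;; Unknown realm: fall back to the previous behaviour.
         (previous-handler uri realm)))))
</enscript>
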
For proxy authentication support, see {{determine-proxy-username/password}}
in the next section.

==== Proxy support

http-client has support for sending requests through proxy servers.

<parameter>(determine-proxy [HANDLER])</parameter>

Whenever a request is sent, the library invokes the procedure stored
in this parameter to determine through what proxy to send the request,
if any.

The {{HANDLER}} procedure receives one argument, the URI about to be
requested, and returns either a uri-common absolute URI object
representing the proxy or {{#f}} if no proxy should be used.

The URI's path and query, if present, are ignored; only the scheme
and authority (host, port, username, password) are used.

The default value of this parameter is {{determine-proxy-from-environment}}.

If you just want to disable proxy support, you can do:

<enscript highlight="scheme">
(determine-proxy (constantly #f)) ; From unit data-structures
</enscript>
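
A custom handler can also choose a proxy per request. The sketch below
bypasses the proxy for local hosts and sends everything else through a
fixed proxy; the proxy address and the name {{corporate-proxy}} are
made up for illustration, and {{uri-host}} comes from uri-common.

<enscript highlight="scheme">
(use http-client uri-common)

;; Hypothetical proxy address, used for illustration only.
(define corporate-proxy
  (uri-reference "http://proxy.example.com:8080"))

(determine-proxy
 (lambda (uri)
   ;; Talk to local services directly; use the proxy for everything else.
   (if (member (uri-host uri) '("localhost" "127.0.0.1"))
       #f
       corporate-proxy)))
</enscript>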

<procedure>(determine-proxy-from-environment URI)</procedure>

This procedure implements the common behaviour of HTTP software under
UNIX:

* First it checks if the requested URI's host (or an asterisk) is listed in the {{NO_PROXY}} environment variable (if suffixed with a port number, the port is also compared). If a match is found, no proxy is used.
* Then it will check if the {{$(protocol)_proxy}} or the {{$(PROTOCOL)_PROXY}} variable (in that order) is set. If so, that's used. {{protocol}} here actually means "scheme", so the URI's scheme is used, suffixed with {{_proxy}}. This means {{http_proxy}} is used for HTTP requests and {{https_proxy}} is used for HTTPS requests.
* If there's still no match, it looks for {{all_proxy}} or {{ALL_PROXY}}, in that order. If one of these environment variables is set, that value is used as a fallback proxy.
* Finally, if none of these checks resulted in a proxy URI, no proxy will be used.

Some UNIX software expects plain hostnames or hostname/port
combinations separated by a colon, but (currently) this library
expects full URIs, like most modern UNIX programs.

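The lookup order above can be tried out by setting one of the
variables from within the program and calling the procedure directly.
This is a sketch under a few assumptions: the proxy address is a
placeholder, {{setenv}} is taken from the posix unit, and the outcome
also depends on whatever {{NO_PROXY}} and the other proxy variables
already contain in your environment.

<enscript highlight="scheme">
(use http-client uri-common posix)

;; Placeholder proxy address, given as a full URI.
(setenv "http_proxy" "http://proxy.example.com:3128")

(let ((proxy (determine-proxy-from-environment
              (uri-reference "http://example.com/index.html"))))
  (if proxy
      (print "Using proxy " (uri->string proxy))
      (print "Connecting directly")))
</enscript>
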
<parameter>(determine-proxy-username/password [HANDLER])</parameter>

The procedure in this parameter is called whenever the proxy requests
authentication via a 407 Proxy Authentication Required response. This
basically works the same as authentication against an origin server.

The {{HANDLER}} is a procedure of two arguments: the URI for the
''proxy'' currently being used and the realm (a string) which wants
credentials. The procedure should return two string values: the
username and the password to use for authentication.

The default value is a procedure which extracts the username and
password components from the proxy's URI.


=== Changelog

* 0.5 Improve detection of dropped connections (prevents unnecessary "connection reset" exceptions from propagating into the program). Simplify interface by switching to {{POST}} when a {{writer}} is given to {{with-input-from-request}} and {{call-with-input-request}}. Add support for multipart forms (file upload). Fix error in case of missing username when authorization was required (introduced by version 0.4.2). Put loop call in tail position (thanks to [[/users/felix-winkelmann|Felix]]). Automatically discard remaining data on the input port, if any, to avoid problems on subsequent requests. Add rudimentary support for parameterizable authentication schemes.
* 0.4.2 Allow missing passwords in URIs for authentication.
* 0.4.1 Fix connection status check so when the remote end closed the connection we don't try to read from it anymore (thanks to Daishi Kato and Thomas Hintz).
* 0.4 Fix redirection code on 303, and off-by-1 mistake in redirects count (thanks to Moritz Heidkamp). Add arguments to exn objects (thanks to Christian Kellermann). Also accept an empty alist for POST data. Fix URI path comparisons in cookies (thanks to Daishi Kato).
* 0.3 Fixed handling of missing Path parameters in set-cookie headers. Reported by Hugo Arregui. Improve set-cookie handling by only passing Path and Domain when matching Set-Cookie header included those parameters.
* 0.2 Added proxy support and many many bugfixes.
* 0.1 Initial version.

=== License

  Copyright (c) 2008-2011, Peter Bex
  Parts copyright (c) 2000-2004, Felix L. Winkelmann
  All rights reserved.
  
  Redistribution and use in source and binary forms, with or without
  modification, are permitted provided that the following conditions are
  met:
  
  Redistributions of source code must retain the above copyright
  notice, this list of conditions and the following disclaimer.
  
  Redistributions in binary form must reproduce the above copyright
  notice, this list of conditions and the following disclaimer in the
  documentation and/or other materials provided with the distribution.
  
  Neither the name of the author nor the names of its contributors may
  be used to endorse or promote products derived from this software
  without specific prior written permission.
  
  THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
  "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
  LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
  FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
  COPYRIGHT HOLDERS OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT,
  INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
  (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
  SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
  HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT,
  STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
  ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED
  OF THE POSSIBILITY OF SUCH DAMAGE.