Last change on this file since 13113 was 13113, checked in by sjamaan, 11 years ago

Merge uri-generic doc changes into release 3 docs

[[tags: eggs]]
[[toc:]]

== uri-generic

=== Description

The {{uri-generic}} library contains procedures for parsing and
manipulating Uniform Resource Identifiers
([[http://tools.ietf.org/html/rfc3986|RFC 3986]]). It is intended to
conform closely to the RFC, and uses combinator parsing and
character classes rather than regular expressions.

This library should be considered a ''basis'' for creating
scheme-specific URI parser libraries. It only parses
the generic components of a URI.  Any scheme-specific library can
further parse subcomponents. For this reason, encoding and decoding
of percent-encoded characters is not done automatically, except for
unreserved characters (see Constructors).
Anything beyond that should be handled by specific URI scheme implementations.

=== Library Procedures

==== Constructors

As specified in section 2.3 of RFC 3986, URI constructors
automatically decode percent-encoded octets in the range of unreserved
characters. This means that the following holds true:

 (equal? (uri-reference "http://example.com/foo-bar")
         (uri-reference "http://example.com/foo%2Dbar"))  => #t

<procedure>(uri-reference STRING) => URI</procedure>

A URI reference is either a URI or a relative reference (RFC 3986,
Section 4.1).  If the given string's prefix does not match the syntax
of a scheme followed by a colon separator, then the given string is
parsed as a relative reference.
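
As a rough illustration of this distinction, the sketch below parses one string with a scheme prefix and one without. The example strings are made up, and loading the egg with {{require-extension}} is an assumption about the local CHICKEN setup.

```scheme
;; Assumption: the uri-generic egg is installed.
(require-extension uri-generic)

;; Starts with a scheme followed by a colon, so it is parsed as a URI.
(define full (uri-reference "http://example.com/docs/index.html"))

;; No scheme prefix, so it is parsed as a relative reference.
(define rel (uri-reference "../images/logo.png"))

(display (uri-scheme full)) (newline) ; the scheme, as a symbol
(display (uri-scheme rel))  (newline) ; #f: a relative reference has no scheme
```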

<procedure>(absolute-uri STRING) => URI</procedure>

Parses the given string as an absolute URI, in which no fragments are
allowed (RFC 3986, Section 4.3).


==== Predicates and Accessors

* <procedure>(uri? URI) => BOOL</procedure>
* <procedure>(uri-authority URI) => URI-AUTH</procedure>
* <procedure>(uri-scheme URI) => SYMBOL</procedure>
* <procedure>(uri-path URI) => LIST</procedure>
* <procedure>(uri-query URI) => STRING</procedure>
* <procedure>(uri-fragment URI) => STRING</procedure>
* <procedure>(uri-host URI) => STRING</procedure>
* <procedure>(uri-port URI) => INTEGER</procedure>
* <procedure>(uri-username URI) => STRING</procedure>
* <procedure>(uri-password URI) => STRING</procedure>
* <procedure>(authority? URI-AUTH) => BOOL</procedure>
* <procedure>(authority-host URI-AUTH) => STRING</procedure>
* <procedure>(authority-port URI-AUTH) => INTEGER</procedure>
* <procedure>(authority-username URI-AUTH) => STRING</procedure>
* <procedure>(authority-password URI-AUTH) => STRING</procedure>

If a component is not defined in the given URI, then the corresponding
accessor returns {{#f}}.

* <procedure>(update-uri URI #!key authority scheme path query fragment host port username password) => URI</procedure>
* <procedure>(update-authority URI-AUTH #!key host port username password) => URI-AUTH</procedure>

Update the specified keys in the URI or URI-AUTH object in a
functional way (i.e., a new copy with the modifications is created).
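
A minimal sketch of a functional update, assuming the keyword arguments are passed in CHICKEN's usual {{name:}} keyword syntax. The URI and the new values are illustrative only.

```scheme
;; Assumption: the uri-generic egg is installed.
(require-extension uri-generic)

(define original (uri-reference "http://example.com/a"))

;; Build a modified copy; `original` itself is not changed.
(define changed (update-uri original scheme: 'https port: 8443))

(uri-scheme original) ; still the scheme that was parsed
(uri-scheme changed)  ; https
(uri-port changed)    ; 8443
```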


==== String and List Representations

<procedure>(uri->string URI USERINFO) => STRING</procedure>

Reconstructs the given URI into a string; uses the supplied function
{{USERINFO}}, of the form {{LAMBDA USERNAME PASSWORD -> STRING}}, to map
the userinfo part of the URI.
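
One possible use of the userinfo function is to keep passwords out of printed URIs. The sketch below assumes this; {{mask-password}} is a hypothetical helper, not part of the library, and the URI is a made-up example.

```scheme
;; Assumption: the uri-generic egg is installed.
(require-extension uri-generic)

(define u (uri-reference "http://alice:secret@example.com/private"))

;; Hypothetical userinfo mapper: keep the username, hide the password.
(define (mask-password username password)
  (string-append username ":******"))

(display (uri->string u mask-password))
(newline)
```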

<procedure>(uri->list URI USERINFO) => LIST</procedure>

Returns a list of the form {{(SCHEME SPECIFIC FRAGMENT)}};
{{SPECIFIC}} is of the form {{(AUTHORITY PATH QUERY)}}.

==== Reference Resolution

<procedure>(uri-relative-to URI URI) => URI</procedure>

Constructs an absolute URI given a relative URI and a base URI, in that
order (RFC 3986, Section 5.2.2).

<procedure>(uri-relative-from URI URI) => URI</procedure>

Constructs a new, possibly relative, URI which represents the location
of the first URI with respect to the second URI.
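
The two procedures can be sketched together as follows. The URIs are made up, and {{plain-userinfo}} is a hypothetical helper that is only there because {{uri->string}} takes a userinfo function. Under the RFC 3986 algorithm, resolving {{../d}} against a base path of {{/a/b/c}} yields the path {{/a/d}}.

```scheme
;; Assumption: the uri-generic egg is installed.
(require-extension uri-generic)

(define base (uri-reference "http://example.com/a/b/c"))
(define rel  (uri-reference "../d"))

;; Resolve the relative reference against the base
;; (relative reference first, base second).
(define resolved (uri-relative-to rel base))

;; The other direction: express `resolved` with respect to `base`.
(define back (uri-relative-from resolved base))

;; Hypothetical userinfo mapper; these URIs carry no userinfo.
(define (plain-userinfo username password)
  (if password (string-append username ":" password) username))

(display (uri->string resolved plain-userinfo)) (newline)
(display (uri->string back plain-userinfo)) (newline)
```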

==== String encoding and decoding

<procedure>(uri-encode-string STRING [CHAR-SET]) => STRING</procedure>

Returns the percent-encoded form of the given string.  The optional
char-set argument controls which characters should be encoded.
It defaults to the complement of {{char-set:uri-unreserved}}. This is
always safe, but often overly cautious; depending on the context,
certain characters may be left unencoded.
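
The effect of the optional argument can be sketched as follows, assuming it is an SRFI 14 character set like the constants exported by this egg. The input string is an arbitrary example.

```scheme
;; Assumption: the uri-generic egg and SRFI 14 are available.
(require-extension uri-generic srfi-14)

;; Default: every character outside char-set:uri-unreserved is encoded,
;; so both the space and the slash are escaped.
(display (uri-encode-string "a b/c")) (newline)

;; Explicit char-set: only the listed characters are encoded,
;; so the slash is left as it is.
(display (uri-encode-string "a b/c" (char-set #\space))) (newline)
```

Passing a narrower set like this is how a scheme-specific library might leave path separators intact while still escaping other characters.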

<procedure>(uri-decode-string STRING [CHAR-SET]) => STRING</procedure>

Returns the decoded form of the given string.  The optional char-set
argument controls which characters should be decoded.  It defaults to
{{char-set:full}}.
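
A short sketch of selective decoding, assuming the optional argument is an SRFI 14 character set. The input string is an arbitrary example.

```scheme
;; Assumption: the uri-generic egg and SRFI 14 are available.
(require-extension uri-generic srfi-14)

;; Default (char-set:full): all percent-encoded characters are decoded.
(display (uri-decode-string "a%20b%2Fc")) (newline)

;; Restrict decoding to spaces; the encoded slash stays encoded.
(display (uri-decode-string "a%20b%2Fc" (char-set #\space))) (newline)
```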


==== Normalization

<procedure>(uri-normalize-case URI) => URI</procedure>

Performs URI case normalization (RFC 3986, Section 6.2.2.1).

<procedure>(uri-normalize-path-segments URI) => URI</procedure>

Performs URI path segment normalization (RFC 3986, Section 6.2.2.3).
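
The two normalizations can be combined, as in the sketch below. According to the RFC sections cited, case normalization lowercases the scheme and host, and path segment normalization removes {{.}} and {{..}} segments, so the result here should be equivalent to {{http://example.com/a/c}}. The URI is a made-up example and {{plain-userinfo}} is a hypothetical helper.

```scheme
;; Assumption: the uri-generic egg is installed.
(require-extension uri-generic)

(define messy (uri-reference "HTTP://Example.COM/a/./b/../c"))

;; Apply both normalizations; each returns a new URI.
(define tidy
  (uri-normalize-path-segments (uri-normalize-case messy)))

;; Hypothetical userinfo mapper, needed only because uri->string takes one.
(define (plain-userinfo username password)
  (if password (string-append username ":" password) username))

(display (uri->string tidy plain-userinfo))
(newline)
```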


==== Character sets

As a convenience for sub-parsers or other special-purpose URI handling
code, uri-generic exports several character sets.

<constant>char-set:gen-delims</constant>

Generic delimiters.
  gen-delims  =  ":" / "/" / "?" / "#" / "[" / "]" / "@"

<constant>char-set:sub-delims</constant>

Sub-delimiters.
  sub-delims  =  "!" / "$" / "&" / "'" / "(" / ")" / "*" / "+" / "," / ";" / "="

<constant>char-set:uri-reserved</constant>

The union of {{gen-delims}} and {{sub-delims}}; all reserved URI characters.
  reserved    =  gen-delims / sub-delims

<constant>char-set:uri-unreserved</constant>

All unreserved characters that are allowed in a URI.
  unreserved  =  ALPHA / DIGIT / "-" / "." / "_" / "~"

Note that this is ''not'' the complement of {{char-set:uri-reserved}}!
There are several characters (even printable, non-control characters)
which are not allowed at all in a URI.
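
The point about complements can be checked with SRFI 14's {{char-set-contains?}}, assuming the exported constants are ordinary SRFI 14 character sets. The results in the comments follow from the grammar rules listed above.

```scheme
;; Assumption: the uri-generic egg and SRFI 14 are available.
(require-extension uri-generic srfi-14)

(char-set-contains? char-set:uri-unreserved #\~)     ; #t
(char-set-contains? char-set:uri-reserved #\?)       ; #t

;; A space is in neither set, which is why the two are not complements.
(char-set-contains? char-set:uri-unreserved #\space) ; #f
(char-set-contains? char-set:uri-reserved #\space)   ; #f
```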


=== Requires

* [[matchable]]
* [[defstruct]]

=== Version History

* 2.0 Export char-sets, add char-set arg to uri-encode/uri-decode, do not decode query args as x-www-form-urlencoded, change path representation.  Lots of bugfixes.
* 1.12 Fix relative path normalization when original path ends in a slash, remove consecutive slashes from paths in URIs
* 1.11 Added accessors for the authority components, functional update procedures. Fixed case-normalization.
* 1.10 Fixed edge case in {{uri-relative-to}} with empty path in base uri, fixed {{uri->string}} for URIs with query args, fixed {{uri->string}} to not add an extraneous slash after authority in case of empty path.
* 1.9 Fixed bug in uri-encode-string with reserved characters, added tests for decoding and encoding [Peter Bex]
* 1.8 Added uri-encode-string and uri-decode-string. URI constructors now perform automatic normalization of percent-encoded unreserved characters. [suggested by Peter Bex]
* 1.6 Added error message about missing scheme in absolute-uri.
* trunk Small bugfix in absolute-uri. [Peter Bex]
* 1.5 Bug fixes in uri->string and absolute-uri. [reported by Peter Bex]
* 1.3 Ported to Hygienic Chicken and the [[test]] egg [Peter Bex]
* 1.2 Now using defstruct instead of define-record [suggested by Peter Bex]
* 1.1 Added utf8 compatibility
* 1.0 Initial Release

=== License

Based on the
[[http://www.ninebynine.org/Software/ReadMe-URI-Haskell.txt|Haskell URI library]]
by Graham Klyne <gk@ninebynine.org>.


  Copyright 2008 Ivan Raikov, Peter Bex.
  All rights reserved.
 
  Redistribution and use in source and binary forms, with or without
  modification, are permitted provided that the following conditions are
  met:
 
  Redistributions of source code must retain the above copyright
  notice, this list of conditions and the following disclaimer.
 
  Redistributions in binary form must reproduce the above copyright
  notice, this list of conditions and the following disclaimer in the
  documentation and/or other materials provided with the distribution.
 
  Neither the name of the author nor the names of its contributors may
  be used to endorse or promote products derived from this software
  without specific prior written permission.
 
  THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
  "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
  LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
  FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
  COPYRIGHT HOLDERS OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT,
  INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
  (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
  SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
  HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT,
  STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
  ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED
  OF THE POSSIBILITY OF SUCH DAMAGE.