Last change on this file since 4751 was 4751, checked in by arto, 13 years ago

Added compilation information to the *-literals eggs' wiki pages.

[[tags: eggs literals regex regex-literals]]

[[toc:]]

== Introduction

A reader extension providing precompiled regular expression literals of the
form <code>#/[a-z0-9]+/i</code> and <code>#r{^/path/(to)/file$}</code>.


== Examples

=== Using regular expression literals in the interpreter

Loading {{regex-literals}} also loads the {{regex}} unit and allows
convenient use of regular expression literals as follows:

<enscript highlight=scheme>#;1> (use regex-literals)

#;2> #/[A-Za-z0-9]+/
#<regexp>

#;3> ,x #/^[a-z0-9]+$/i
(regexp "^[a-z0-9]+$" #t #f #f)

#;4> (string-match #/^(\d{2}):(\d{2})(..)/ "11:59pm")
("11:59pm" "11" "59" "pm")

#;5> (string-split-fields #/[^\s]+/ "the quick brown fox jumps over the lazy dog")
("the" "quick" "brown" "fox" "jumps" "over" "the" "lazy" "dog")

#;6> (string-split-fields #r{[^/]+} "/path/to/file")
("path" "to" "file")

#;7> (string-substitute #/(\w+)\s+(\w+)/u "\\2, \\1" "John Smith")
"Smith, John"
</enscript>


=== Using regular expression literals with the compiler

Passing a {{-X regex-literals}} command-line option to {{csc}} allows you to
conveniently make use of regular expression literals in your egg or compiled
program without making the {{regex-literals}} egg a runtime dependency.
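
As a rough illustration, a compiled program using a literal might look like
the sketch below. The file name {{time.scm}} and the procedure
{{parse-time}} are hypothetical, and the explicit {{(use regex)}} is an
assumption: since a literal expands into a call to {{regexp}}, the
{{regex}} unit presumably still has to be available at run time.

<enscript highlight=scheme>;; time.scm (hypothetical file name)
;; Assumed build command, using the -X option described above:
;;   csc -X regex-literals time.scm

;; Assumption: regexp and string-match come from the regex unit at run time.
(use regex)

;; Hypothetical helper: returns the hour, minute and am/pm fields of STR
;; as a list of strings, or #f if STR does not match.
(define (parse-time str)
  (let ((m (string-match #/^(\d{2}):(\d{2})(..)/ str)))
    (and m (cdr m))))

(print (parse-time "11:59pm"))
</enscript>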


== Author

[[Arto Bendiken]]


== Requires

* [[Unit regex|regex]]


== Reader extensions

This egg installs a reader extension for {{#\/}} that reads a regular
expression literal as described below in {{read-regex-literal}}, and another
reader extension for {{#\r}} that works similarly but supports a generalized
delimiter syntax as described in {{read-regex-literal/general}}.

Note that there are some caveats to using reader extensions when compiling;
for more details, refer to the relevant
[[faq#Why%20does%20{{define-reader-ctor}}%20not%20work%20in%20my%20compiled%20program?|FAQ entry]].


== Input and output

=== read-regex-literal

 [procedure] (read-regex-literal [PORT])

Reads a regular expression literal of the form {{#/.../}} from {{PORT}},
which defaults to the value of {{(current-input-port)}}. The literal is
converted to a precompiled regular expression object using the {{regexp}}
procedure provided by the [[Unit regex|regex]] unit.

A literal may be followed by one or more option characters, placed
immediately after the terminator, that modify the way the pattern matches
strings:

* {{#/.../i}} PCRE_CASELESS: case-insensitive mode; letters in the pattern
  match both uppercase and lowercase letters in the subject string.
* {{#/.../x}} PCRE_EXTENDED: extended mode; whitespace and comments inside
  the pattern are ignored, so a complex regular expression can be spread
  over several lines and annotated to make it more readable.
* {{#/.../u}} PCRE_UTF8: UTF-8 mode; the pattern and the strings it is
  matched against are treated as UTF-8 encoded text.
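
The options can also be read as arguments to {{regexp}}. The sketch below
assumes the correspondence suggested by the {{,x}} expansion in the Examples
section, where {{#/^[a-z0-9]+$/i}} becomes {{(regexp "^[a-z0-9]+$" #t #f #f)}}:
the three flags are taken to be case-insensitive, extended and UTF-8 mode, in
that order. The names {{rx-caseless}}, {{rx-extended}} and {{rx-utf8}} are
illustrative only.

<enscript highlight=scheme>(use regex)

;; Assumed equivalent of #/hello/i: the second argument enables
;; case-insensitive matching.
(define rx-caseless (regexp "hello" #t))
(string-match rx-caseless "HeLLo")            ; => ("HeLLo")

;; Assumed equivalent of #/(\d+) \s+ (\w+)/x: the third argument enables
;; extended mode, so the literal spaces in the pattern are ignored.
(define rx-extended (regexp "(\\d+) \\s+ (\\w+)" #f #t))
(string-match rx-extended "42 apples")

;; Assumed equivalent of #/(\w+)\s+(\w+)/u: the fourth argument enables
;; UTF-8 mode.
(define rx-utf8 (regexp "(\\w+)\\s+(\\w+)" #f #f #t))
(string-substitute rx-utf8 "\\2, \\1" "John Smith")
</enscript>

Note that the backslashes are doubled in the string forms, whereas the
literals in the Examples section are written with single backslashes.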


=== read-regex-literal/general

 [procedure] (read-regex-literal/general [PORT])

Reads a regular expression literal of the form {{#r(...)}} from {{PORT}},
which defaults to the value of {{(current-input-port)}}. This works
otherwise similarly to {{read-regex-literal}} but supports a generalized
delimiter syntax as follows:

* Matching delimiter pairs: {{#r{...}}}, {{#r(...)}}, {{#r[...]}} and
  {{#r<...>}}
* Any arbitrary character: {{#r!...!}}, {{#r|...|}}, {{#r@...@}}, and so
  forth.
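
A short sketch of why the choice of delimiter matters: a pattern that
contains slashes can be written without escaping them. The first call is
taken from the Examples section; the second assumes that an arbitrary
delimiter such as {{!}} reads the same pattern, as the list above suggests.

<enscript highlight=scheme>(use regex-literals)

;; Brace delimiters: the slash inside the pattern needs no escaping.
(string-split-fields #r{[^/]+} "/path/to/file")   ; => ("path" "to" "file")

;; Assumed to read the same pattern, using ! as the delimiter character.
(string-split-fields #r![^/]+! "/path/to/file")
</enscript>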


== License

  Copyright (c) 2006-2007 Arto Bendiken.
 
  Permission is hereby granted, free of charge, to any person obtaining a copy
  of this software and associated documentation files (the "Software"), to
  deal in the Software without restriction, including without limitation the
  rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
  sell copies of the Software, and to permit persons to whom the Software is
  furnished to do so, subject to the following conditions:
 
  The above copyright notice and this permission notice shall be included in
  all copies or substantial portions of the Software.
 
  THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
  IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
  FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
  AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
  LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
  FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
  IN THE SOFTWARE.


== Version history

;1.0.2 : Support for generalized {{#r(...)}} delimiters (by [[http://3e8.org/zb|Zbigniew]])
;1.0.1 : Added support for the {{#/.../i}}, {{#/.../x}} and {{#/.../u}} options.
;1.0.0 : Initial release of the {{regex-literals}} egg.