[[tags: egg]]

== string-utils

[[toc:]]


== Documentation

=== Memoized String

==== Usage

<enscript language=scheme>
(require-extension memoized-string)
</enscript>

==== make-string+

<procedure>(make-string+ COUNT [FILL]) => string</procedure>

A ''tabling'' {{make-string}}.

{{FILL}} is any valid {{char}}, including codepoints outside of the ASCII
range. As such UTF-8 strings can be memoized.

==== string+

<procedure>(string+ [CHAR...]) => string</procedure>

A ''tabling'' {{string}}.

{{CHAR}} is any valid {{char}}, including codepoints outside of the ASCII
range. As such UTF-8 strings can be memoized.

==== global-string

<procedure>(global-string STR) => string</procedure>

Share common string space.

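As a rough illustration of what ''tabling'' means in this section, the sketch
below keeps one shared copy per distinct string content. It is not the egg's
implementation: {{*string-table*}}, {{intern-string}} and
{{make-string-tabled}} are hypothetical names, and an association list stands
in for whatever table the egg actually uses.

<enscript language=scheme>
;; Hypothetical sketch of tabling; NOT the memoized-string implementation.
(define *string-table* '())

;; Return the shared copy of STR, registering STR itself on first sight.
(define (intern-string str)
  (let ((hit (assoc str *string-table*)))   ; assoc compares with equal?
    (if hit
        (cdr hit)
        (begin
          (set! *string-table* (cons (cons str str) *string-table*))
          str))))

;; A tabling make-string: equal requests yield the same (eq?) object.
(define (make-string-tabled count . fill)
  (intern-string
    (make-string count (if (null? fill) #\space (car fill)))))
</enscript>

Since the result is shared between callers, a string obtained this way is best
treated as immutable.
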
==== make-string* (DEPRECATED)

<procedure>(make-string* COUNT [FILL]) => string</procedure>

Deprecated in favor of {{make-string+}}.

=== String Hexadecimal

==== Usage

<enscript language=scheme>
(require-extension string-hexadecimal)
</enscript>

==== string->hex

<procedure>(string->hex STRING [START [END]]) => string</procedure>

Returns a hexadecimal representation of {{STRING}}. {{START}} and {{END}} are
substring limits.

{{STRING}} is treated as a string of bytes, a byte-vector.

==== hex->string

<procedure>(hex->string STRING [START [END]]) => string</procedure>

Returns the binary representation of a hexadecimal {{STRING}}. {{START}} and
{{END}} are substring limits.

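The byte-to-hexadecimal mapping described above can be sketched in plain
Scheme. This is an illustration only, not the egg's code:
{{bytes->hex-sketch}}, {{hex->bytes-sketch}} and {{hex-digit-value}} are
hypothetical names, lowercase digits are an assumption, and every character is
assumed to be a single byte (0 to 255), as in CHICKEN 4 strings.

<enscript language=scheme>
;; Hypothetical sketch; NOT the string-hexadecimal implementation.
(define hex-digits "0123456789abcdef")   ; lowercase is an assumption

;; Each input byte becomes two hexadecimal digits, high nibble first.
(define (bytes->hex-sketch str)
  (let* ((n (string-length str))
         (out (make-string (* 2 n))))
    (do ((i 0 (+ i 1)))
        ((= i n) out)
      (let ((b (char->integer (string-ref str i))))
        (string-set! out (* 2 i) (string-ref hex-digits (quotient b 16)))
        (string-set! out (+ (* 2 i) 1)
                     (string-ref hex-digits (remainder b 16)))))))

;; Value of one hexadecimal digit, accepting either case.
(define (hex-digit-value ch)
  (let ((c (char->integer (char-downcase ch))))
    (cond ((and (>= c 48) (<= c 57)) (- c 48))           ; #\0 .. #\9
          ((and (>= c 97) (<= c 102)) (+ 10 (- c 97)))   ; #\a .. #\f
          (else (error "not a hexadecimal digit" ch)))))

;; Each pair of digits becomes one output byte.
(define (hex->bytes-sketch hex)
  (let* ((n (quotient (string-length hex) 2))
         (out (make-string n)))
    (do ((i 0 (+ i 1)))
        ((= i n) out)
      (string-set! out i
        (integer->char
          (+ (* 16 (hex-digit-value (string-ref hex (* 2 i))))
             (hex-digit-value (string-ref hex (+ (* 2 i) 1)))))))))
</enscript>

With these definitions {{(bytes->hex-sketch "abc")}} yields {{"616263"}}, and
{{hex->bytes-sketch}} inverts it.
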
=== Unicode Utilities

The name of this extension is misleading. Only UTF-8 is currently supported.

For a better treatment of UTF-8 see the [[utf8]] extension.

==== Usage

<enscript language=scheme>
(require-extension unicode-utils)
</enscript>

==== ascii-codepoint?

<procedure>(ascii-codepoint? CHAR) => boolean</procedure>

==== unicode-char->string

<procedure>(unicode-char->string CHAR) => string</procedure>

Returns a string formed from Unicode codepoint {{CHAR}}.

''Note'' that the {{(string-length)}} of the result (except under [[utf8]])
may not be equal to {{1}}.

Generates an error should the codepoint be out-of-range.

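The reason the byte-oriented {{string-length}} can differ from the number of
codepoints is that UTF-8 encodes a codepoint in one to four bytes. The sketch
below computes that width; {{utf8-byte-count}} is a hypothetical helper, not
part of this egg, and it only checks the overall Unicode range.

<enscript language=scheme>
;; Hypothetical helper; NOT part of unicode-utils.
;; Number of bytes UTF-8 uses to encode the codepoint CP.
(define (utf8-byte-count cp)
  (cond ((or (< cp 0) (> cp #x10FFFF))
         (error "codepoint out of range" cp))
        ((< cp #x80) 1)        ; ASCII
        ((< cp #x800) 2)
        ((< cp #x10000) 3)
        (else 4)))
</enscript>

For example, U+03BB (Greek small letter lambda) takes 2 bytes, so a string
holding only that codepoint has a byte length of 2.
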
==== unicode-string

<procedure>(unicode-string [CHAR...]) => string</procedure>

Returns a string formed from Unicode codepoints {{CHAR...}}.

''Note'' that the {{(string-length)}} of the result (except under [[utf8]])
may not be equal to the length of {{CHAR...}}.

Generates an error should the codepoint be out-of-range.

==== *unicode-string

<procedure>(*unicode-string CHARS) => string</procedure>

Returns a string formed from Unicode codepoints {{CHARS}}, a {{(list-of
char)}}.

==== unicode-make-string

<procedure>(unicode-make-string COUNT [FILL]) => string</procedure>

Returns a string formed from {{COUNT}} occurrences of the Unicode codepoint
{{FILL}}. The {{FILL}} default is {{#\space}}.

''Note'' that the {{(string-length)}} of the result (except under [[utf8]])
may not be equal to {{COUNT}}.

Generates an error should the codepoint be out-of-range.

==== unicode-surrogate?

<procedure>(unicode-surrogate? NUM) => boolean</procedure>

==== unicode-surrogates->codepoint

<procedure>(unicode-surrogates->codepoint HIGH LOW) => (or boolean fixnum)</procedure>

Returns the codepoint for the valid surrogate pair {{HIGH}} and {{LOW}}.
Otherwise returns {{#f}}.

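The surrogate-pair arithmetic follows the standard UTF-16 rule and can be
sketched as below. {{surrogates->codepoint-sketch}} is a hypothetical name used
for illustration; the egg's own procedure may differ in how it validates its
arguments.

<enscript language=scheme>
;; Hypothetical sketch of UTF-16 surrogate decoding; NOT the egg's code.
;; HIGH must lie in D800..DBFF and LOW in DC00..DFFF, else the result is #f.
(define (surrogates->codepoint-sketch high low)
  (and (<= #xD800 high #xDBFF)
       (<= #xDC00 low #xDFFF)
       (+ #x10000
          (* (- high #xD800) #x400)   ; upper 10 bits
          (- low #xDC00))))           ; lower 10 bits
</enscript>

For instance, the pair {{#xD83D}}, {{#xDE00}} decodes to {{#x1F600}}, while a
non-surrogate first argument such as {{#x41}} gives {{#f}}.
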
=== String Utilities

Reexports all of the above.

==== Usage

<enscript language=scheme>
(require-extension string-utils)
</enscript>

=== Bytes to Hexadecimal

A common facility for converting bytevector-like objects to hexadecimal
strings.

No error checking is performed!

==== Usage

<enscript language=scheme>
(require-extension to-hex)
</enscript>

==== str_to_hex

<procedure>(str_to_hex OUT IN OFF LEN)</procedure>

Writes the ASCII hexadecimal representation of {{IN}} to {{OUT}}.

{{IN}} is a {{nonnull-string}}.

{{OFF}} is the byte offset.

{{LEN}} is the length of the bytes at {{OFF}}.

{{OUT}} is a {{string}} of length >= {{(* LEN 2)}}.

==== blob_to_hex

<procedure>(blob_to_hex OUT IN OFF LEN)</procedure>

Like {{str_to_hex}} except {{IN}} is a {{nonnull-blob}}.

==== u8vec_to_hex

<procedure>(u8vec_to_hex OUT IN OFF LEN)</procedure>

Like {{str_to_hex}} except {{IN}} is a {{nonnull-u8vector}}.

==== s8vec_to_hex

<procedure>(s8vec_to_hex OUT IN OFF LEN)</procedure>

Like {{str_to_hex}} except {{IN}} is a {{nonnull-s8vector}}.

==== mem_to_hex

<procedure>(mem_to_hex OUT IN OFF LEN)</procedure>

Like {{str_to_hex}} except {{IN}} is a {{nonnull-c-pointer}}.

==== hex_to_str

<procedure>(hex_to_str OUT IN OFF LEN)</procedure>

Reads the ASCII hexadecimal representation in {{IN}} and writes the decoded
bytes to {{OUT}}.

{{IN}} is a {{nonnull-string}}.

{{OFF}} is the byte offset.

{{LEN}} is the length of the bytes at {{OFF}}.

{{OUT}} is a {{string}} of length >= {{(/ LEN 2)}}.

==== hex_to_blob

<procedure>(hex_to_blob OUT IN OFF LEN)</procedure>

Like {{hex_to_str}} except {{OUT}} is a {{blob}} of size >= {{(/ LEN 2)}}.

=== String Interpolation

==== Usage

<enscript language=scheme>
(require-extension string-interpolation)
</enscript>

<enscript language=scheme>
(require-extension utf8-string-interpolation)
</enscript>

==== string-interpolate

<procedure>(string-interpolate STR [eval-tag: EVAL-TAG] [eval-env: EVAL-ENV]) => string</procedure>

Performs substitution of embedded Scheme expressions, evaluated in the
{{EVAL-ENV}}, prefixed with {{EVAL-TAG}} and optionally enclosed in curly
brackets. Two consecutive {{EVAL-TAG}}s are translated to a single
{{EVAL-TAG}}.

Similar to the {{#<#}} multi-line string.

{{STR}} is a {{string}}.

{{EVAL-TAG}} is a {{character}}, default {{#\#}}.

{{EVAL-ENV}} is an {{environment}}, default {{(interaction-environment)}}.

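The substitution rules above can be sketched with string ports, {{read}} and
{{eval}}. This is a simplified illustration, not the egg's implementation:
{{interpolate-sketch}} is a hypothetical name, the tag is fixed to {{#\#}},
evaluation uses a single-argument {{eval}} (as CHICKEN allows), and a braced
form is assumed to hold one parenthesized expression.

<enscript language=scheme>
;; Hypothetical sketch of tag-based interpolation; NOT the egg's code.
(define (interpolate-sketch str)
  (let ((in (open-input-string str))
        (out (open-output-string)))
    (let loop ()
      (let ((ch (read-char in)))
        (cond
          ;; End of input: return everything collected so far.
          ((eof-object? ch) (get-output-string out))
          ;; Ordinary character: copy through.
          ((not (char=? ch #\#)) (write-char ch out) (loop))
          ;; A tag at the very end is kept literally.
          ((eof-object? (peek-char in)) (write-char ch out) (loop))
          ;; Two consecutive tags collapse into one.
          ((char=? (peek-char in) #\#)
           (read-char in)
           (write-char #\# out)
           (loop))
          ;; Braced form: evaluate one expression, then skip to the brace.
          ((char=? (peek-char in) #\{)
           (read-char in)
           (display (eval (read in)) out)
           (let skip ()
             (let ((c (read-char in)))
               (unless (or (eof-object? c) (char=? c #\}))
                 (skip))))
           (loop))
          ;; Bare form: evaluate the next datum.
          (else (display (eval (read in)) out) (loop)))))))
</enscript>

Under these assumptions,
{{(interpolate-sketch "foo #(+ 1 2)bar #{(and 1 2)} baz")}} produces
{{"foo 3bar 2 baz"}}.
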
==== Usage

<enscript language=scheme>
(require-extension string-interpolation-syntax)
</enscript>

==== set-sharp-string-interpolation-syntax

<procedure>(set-sharp-string-interpolation-syntax PROC)</procedure>

Extends the read-syntax with {{#"..."}} where the {{"..."}} is evaluated using
{{(PROC "...")}}. When {{PROC}} is {{#f}} the read-syntax is cleared. When
{{PROC}} is {{#t}} then {{PROC}} is {{identity}}.

<enscript language=scheme>
(require-extension utf8-string-interpolation)
(require-extension string-interpolation-syntax)

(set-sharp-string-interpolation-syntax string-interpolate)
;#"foo #(+ 1 2)bar #{(and 1 2)} baz"
</enscript>


== Requirements

[[check-errors]]
[[miscmacros]]
[[utf8]]

[[setup-helper]]
[[test]]


== Author

[[/users/kon-lovett|Kon Lovett]]


== Version history

; 1.5.6 : Add types.
; 1.5.5 :
; 1.5.4 :
; 1.5.3 : {{memorize-string}} -> {{global-string}}.
; 1.5.2 : Fix {{string+}} & {{memorize-string}}.
; 1.5.1 : Fix {{string+}} unicode support.
; 1.5.0 : Deprecate {{make-string*}} for {{make-string+}}, add {{memorize-string}} & {{string+}}.
; 1.4.0 : Add string-interpolation modules.
; 1.3.1 : Fix {{hex_to_str}}, {{hex_to_blob}}.
; 1.3.0 : Add {{hex->string}}, {{hex_to_str}}, {{hex_to_blob}}.
; 1.2.5 : Remove [[lookup-table]].
; 1.2.2 : Unicode string construction a little faster. Removed {{blob->hex}}.
; 1.2.1 : Added {{blob->hex}}.
; 1.2.0 : Added "generic" bytes to hexadecimal string.
; 1.1.0 : Split into separate modules. Added some UTF-8 support.
; 1.0.0 : Hello


== License

Copyright (C) 2010-2017 Kon Lovett.  All rights reserved.

Permission is hereby granted, free of charge, to any person obtaining a
copy of this software and associated documentation files (the "Software"),
to deal in the Software without restriction, including without limitation
the rights to use, copy, modify, merge, publish, distribute, sublicense,
and/or sell copies of the Software, and to permit persons to whom the
Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included
in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR
OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
OTHER DEALINGS IN THE SOFTWARE.