source: project/wiki/eggref/5/string-utils @ 36061

Last change on this file since 36061 was 36061, checked in by kon, 12 months ago

rel 2.1.0

[[tags: egg]]

== string-utils

[[toc:]]


== Documentation

=== Memoized String

==== Usage

<enscript language=scheme>
(import memoized-string)
</enscript>

==== make-string+

<procedure>(make-string+ COUNT [FILL]) -> string</procedure>

A ''tabling'' (memoizing) {{make-string}}.

{{FILL}} is any valid {{char}}, including codepoints outside the ASCII
range, so UTF-8 strings can be memoized.

==== string+

<procedure>(string+ [CHAR...]) -> string</procedure>

A ''tabling'' (memoizing) {{string}}.

{{CHAR}} is any valid {{char}}, including codepoints outside the ASCII
range, so UTF-8 strings can be memoized.

==== global-string

<procedure>(global-string STR) -> string</procedure>

Memoizes {{STR}}: the result is the copy of {{STR}} held in the common string
space.

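
A minimal sketch of the memoizing procedures above. It assumes that ''tabling'' means a repeated request returns the very same string object. The {{eq?}} results are therefore expectations, not documented guarantees.

<enscript language=scheme>
(import scheme (chicken base) memoized-string)

;; Two requests with the same COUNT and FILL.
(define a (make-string+ 3 #\x))
(define b (make-string+ 3 #\x))

(print (string=? a b))  ; same contents
(print (eq? a b))       ; expected #t if the result is tabled (assumption)

;; string+ builds the string from individual characters.
(print (string+ #\a #\b #\c))

;; global-string memoizes an arbitrary string.
(define g1 (global-string "hello"))
(define g2 (global-string (string-append "hel" "lo")))
(print (eq? g1 g2))     ; expected #t if both map to one shared string (assumption)
</enscript>

If the results really are shared, mutating one of them with {{string-set!}} would affect every holder, so it seems safest to treat memoized strings as read-only.
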
==== make-string* (DEPRECATED)

<procedure>(make-string* COUNT [FILL]) -> string</procedure>

Deprecated since 1.5.0 in favor of {{make-string+}}.

=== String Hexadecimal

==== Usage

<enscript language=scheme>
(import string-hexadecimal)
</enscript>

==== string->hex

<procedure>(string->hex STRING [START [END]]) -> string</procedure>

Returns a hexadecimal representation of {{STRING}}. {{START}} and {{END}} are
substring limits.

{{STRING}} is treated as a string of bytes, a byte-vector.

==== hex->string

<procedure>(hex->string STRING [START [END]]) -> string</procedure>

Returns the binary representation of a hexadecimal {{STRING}}. {{START}} and
{{END}} are substring limits.

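
A short round-trip sketch of the two procedures. It assumes the usual convention that {{START}} is inclusive and {{END}} is exclusive, which the text above does not spell out.

<enscript language=scheme>
(import scheme (chicken base) string-hexadecimal)

;; "abc" is the byte sequence #x61 #x62 #x63, so the result
;; should hold two hexadecimal digits per byte.
(define hex (string->hex "abc"))
(print hex)

;; Decoding the digits should give back the original bytes.
(print (hex->string hex))

;; Restrict the conversion to part of the input
;; (assumed: START inclusive, END exclusive).
(print (string->hex "abcdef" 2 4))
</enscript>
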
=== Hexadecimal Procedures

==== Usage

<enscript language=scheme>
(import to-hex)
</enscript>

==== str_to_hex

<procedure>(str_to_hex OUT IN OFF LEN)</procedure>

Writes the ASCII hexadecimal representation of {{IN}} to {{OUT}}.

{{IN}} is a {{nonnull-string}}.

{{OFF}} is the byte offset into {{IN}}.

{{LEN}} is the number of bytes to convert, starting at {{OFF}}.

{{OUT}} is a {{string}} of length >= {{(* LEN 2)}}, since each byte is written
as two hexadecimal digits.

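
A hedged sketch of calling the low-level procedure directly. The buffer is sized at two output characters per input byte, which is what a hexadecimal encoding needs. That the result starts at the beginning of {{OUT}} is an assumption.

<enscript language=scheme>
(import scheme (chicken base) to-hex)

(define in "abc")
(define len (string-length in))               ; number of bytes to convert
(define out (make-string (* 2 len) #\space))  ; two characters per byte

;; Fills OUT in place; the return value is not relied upon here.
(str_to_hex out in 0 len)
(print out)
</enscript>

For everyday use the {{string->hex}} wrapper in {{string-hexadecimal}} is probably more convenient, since it allocates the result itself.
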
==== blob_to_hex

<procedure>(blob_to_hex OUT IN OFF LEN)</procedure>

Like {{str_to_hex}} except {{IN}} is a {{nonnull-blob}}.

==== u8vec_to_hex

<procedure>(u8vec_to_hex OUT IN OFF LEN)</procedure>

Like {{str_to_hex}} except {{IN}} is a {{nonnull-u8vector}}.

==== s8vec_to_hex

<procedure>(s8vec_to_hex OUT IN OFF LEN)</procedure>

Like {{str_to_hex}} except {{IN}} is a {{nonnull-s8vector}}.

==== mem_to_hex

<procedure>(mem_to_hex OUT IN OFF LEN)</procedure>

Like {{str_to_hex}} except {{IN}} is a {{nonnull-c-pointer}}.

==== hex_to_str

<procedure>(hex_to_str OUT IN OFF LEN)</procedure>

Reads the ASCII hexadecimal representation in {{IN}} and writes the decoded
bytes to {{OUT}}.

{{IN}} is a {{nonnull-string}}.

{{OFF}} is the byte offset into {{IN}}.

{{LEN}} is the number of bytes to read, starting at {{OFF}}.

{{OUT}} is a {{string}} of length >= {{(/ LEN 2)}}.

==== hex_to_blob

<procedure>(hex_to_blob OUT IN OFF LEN)</procedure>

Like {{hex_to_str}} except {{OUT}} is a {{blob}} of size >= {{(/ LEN 2)}}.

=== Unicode Utilities

The name of this extension is misleading. Only UTF-8 is currently supported.

For a better treatment of UTF-8 see the [[utf8]] extension.

==== Usage

<enscript language=scheme>
(import unicode-utils)
</enscript>

==== ascii-codepoint?

<procedure>(ascii-codepoint? CHAR) -> boolean</procedure>

==== unicode-char->string

<procedure>(unicode-char->string CHAR) -> string</procedure>

Returns a string formed from the Unicode codepoint {{CHAR}}.

''Note'' that the {{string-length}} of the result (except under [[utf8]]) may
not be equal to {{1}}.

Generates an error if the codepoint is out of range.

==== unicode-string

<procedure>(unicode-string [CHAR...]) -> string</procedure>

Returns a string formed from the Unicode codepoints {{CHAR...}}.

''Note'' that the {{string-length}} of the result (except under [[utf8]]) may
not be equal to the number of {{CHAR...}} arguments.

Generates an error if a codepoint is out of range.

==== *unicode-string

<procedure>(*unicode-string CHARS) -> string</procedure>

Returns a string formed from the Unicode codepoints {{CHARS}}, a {{(list-of
char)}}.

==== unicode-make-string

<procedure>(unicode-make-string COUNT [FILL]) -> string</procedure>

Returns a string formed from {{COUNT}} occurrences of the Unicode codepoint
{{FILL}}. The {{FILL}} default is {{#\space}}.

''Note'' that the {{string-length}} of the result (except under [[utf8]]) may
not be equal to {{COUNT}}.

Generates an error if the codepoint is out of range.

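
The length notes above can be illustrated with a small sketch. It assumes the core, byte-oriented {{string-length}} is in effect (that is, the [[utf8]] versions are not imported), so lengths count UTF-8 bytes and not characters.

<enscript language=scheme>
(import scheme (chicken base) unicode-utils)

;; U+03BB GREEK SMALL LETTER LAMDA: outside ASCII, two bytes in UTF-8.
(define lam (integer->char #x3BB))

(define s1 (unicode-char->string lam))
(define s2 (unicode-string lam #\x))
(define s3 (unicode-make-string 3 lam))

;; With byte-oriented strings these may report 2, 3 and 6
;; and not 1, 2 and 3 (assumption, see the notes above).
(print (string-length s1))
(print (string-length s2))
(print (string-length s3))

;; *unicode-string takes the characters as a list.
(print (string=? s2 (*unicode-string (list lam #\x))))
</enscript>
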
==== unicode-surrogate?

<procedure>(unicode-surrogate? NUM) -> boolean</procedure>

==== unicode-surrogates->codepoint

<procedure>(unicode-surrogates->codepoint HIGH LOW) -> (or boolean fixnum)</procedure>

Returns the codepoint for the surrogate pair {{HIGH}} and {{LOW}} when the pair
is valid; otherwise returns {{#f}}.

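
For reference, the surrogate-pair arithmetic defined by UTF-16 can be sketched in plain Scheme. The helper {{surrogates->codepoint*}} below is hypothetical and written only for illustration. It is not the egg's implementation, although {{unicode-surrogates->codepoint}} would be expected to agree with it.

<enscript language=scheme>
(import scheme (chicken base))

;; Hypothetical reference helper, not part of the egg.
;; High surrogates lie in #xD800..#xDBFF, low surrogates in #xDC00..#xDFFF.
(define (surrogates->codepoint* high low)
  (and (<= #xD800 high #xDBFF)
       (<= #xDC00 low #xDFFF)
       (+ #x10000
          (* (- high #xD800) #x400)   ; upper 10 bits
          (- low #xDC00))))           ; lower 10 bits

(print (surrogates->codepoint* #xD83D #xDE00))  ; 128512, i.e. #x1F600
(print (surrogates->codepoint* #x0041 #xDE00))  ; #f, #x0041 is not a high surrogate
</enscript>
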
=== String Utilities

==== Usage

<enscript language=scheme>
(import string-utils)
</enscript>

==== string-fixed-length

<procedure>(string-fixed-length S N [pad-char: #\space] [trailing: "..."]) -> string</procedure>

Returns the string {{S}} with the {{string-length}} fixed to {{N}}.

A shorter string is padded with {{pad-char}}. A longer string is truncated and
suffixed with the {{trailing}} string.

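
A brief sketch of both cases. The text above does not say on which side padding is added, so the comments describe only what is documented.

<enscript language=scheme>
(import scheme (chicken base) string-utils)

;; Shorter than N: padded with the default pad-char, #\space.
(print (string-fixed-length "abc" 6))

;; Longer than N: truncated and suffixed with the default trailing, "...".
(print (string-fixed-length "abcdefghij" 6))

;; Both defaults can be overridden with keyword arguments.
(print (string-fixed-length "abc" 6 pad-char: #\.))
(print (string-fixed-length "abcdefghij" 6 trailing: "~"))
</enscript>
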
=== String Interpolation

==== Usage

<enscript language=scheme>
(import string-interpolation-syntax)
</enscript>

==== set-sharp-string-interpolation-syntax

<procedure>(set-sharp-string-interpolation-syntax PROC)</procedure>

Extends the read-syntax with {{#"..."}}, where the {{"..."}} is evaluated using
{{(PROC "...")}}. When {{PROC}} is {{#f}} the read-syntax is cleared. When
{{PROC}} is {{#t}} then {{identity}} is used as {{PROC}}.

<enscript language=scheme>
(import string-interpolation-syntax utf8-string-interpolation)

(set-sharp-string-interpolation-syntax string-interpolate)
;#"foo #(+ 1 2)bar #{(and 1 2)} baz"
;=> "foo 3bar 2 baz"
</enscript>

==== Usage

<enscript language=scheme>
(import string-interpolation)
</enscript>

or

<enscript language=scheme>
(import utf8-string-interpolation)
</enscript>

==== string-interpolate

<procedure>(string-interpolate STR [eval-tag: EVAL-TAG] [eval-env: EVAL-ENV]) -> string</procedure>

Substitutes the embedded Scheme expressions in {{STR}} with their values. Each
expression is prefixed with {{EVAL-TAG}}, optionally enclosed in curly
brackets, and evaluated in {{EVAL-ENV}}. Two consecutive {{EVAL-TAG}}s are
translated to a single {{EVAL-TAG}}.

Similar to the {{#<#}} multi-line string syntax.

{{STR}} is a {{string}}.

{{EVAL-TAG}} is a {{character}}, default {{#\#}}.

{{EVAL-ENV}} is an {{environment}}, default {{(interaction-environment)}}.

Loading the module automatically invokes
{{(set-sharp-string-interpolation-syntax string-interpolate)}}.


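
A small sketch of calling {{string-interpolate}} as a procedure, first with the default tag and then with a custom {{eval-tag:}}. The strings are illustrative only. The expressions are evaluated at run time in the default {{(interaction-environment)}}.

<enscript language=scheme>
(import scheme (chicken base) string-interpolation)

;; Default tag #\# : the expression after the tag is replaced by its value.
(print (string-interpolate "1 + 2 = #(+ 1 2)"))

;; Custom tag #\$ : curly brackets delimit the expression,
;; and a doubled tag stands for one literal tag character.
(print (string-interpolate "total: ${(+ 1 2)} items, price $$5"
                           eval-tag: #\$))
</enscript>
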
== Requirements

[[check-errors]]
[[miscmacros]]
[[srfi-1]]
[[srfi-13]]
[[srfi-69]]
[[utf8]]

[[test]]


== Author

[[/users/kon-lovett|Kon Lovett]]


== Version history

; 2.1.0 : Add {{utf8-string-interpolation}}.
; 2.0.0 : C5 release.
; 1.6.0 : Add {{string-utils-extensions}}.
; 1.5.6 : Add types.
; 1.5.5 :
; 1.5.4 :
; 1.5.3 : {{memorize-string}} -> {{global-string}}.
; 1.5.2 : Fix {{string+}} & {{memorize-string}}.
; 1.5.1 : Fix {{string+}} unicode support.
; 1.5.0 : Deprecate {{make-string*}} for {{make-string+}}, add {{memorize-string}} & {{string+}}.
; 1.4.0 : Add string-interpolation modules.
; 1.3.1 : Fix {{hex_to_str}}, {{hex_to_blob}}.
; 1.3.0 : Add {{hex->string}}, {{hex_to_str}}, {{hex_to_blob}}.
; 1.2.5 : Remove [[lookup-table]].
; 1.2.2 : Unicode string construction a little faster. Removed {{blob->hex}}.
; 1.2.1 : Added {{blob->hex}}.
; 1.2.0 : Added "generic" bytes to hexadecimal string.
; 1.1.0 : Split into separate modules. Added some UTF-8 support.
; 1.0.0 : Hello


== License

Copyright (C) 2010-2017 Kon Lovett.  All rights reserved.

Permission is hereby granted, free of charge, to any person obtaining a
copy of this software and associated documentation files (the "Software"),
to deal in the Software without restriction, including without limitation
the rights to use, copy, modify, merge, publish, distribute, sublicense,
and/or sell copies of the Software, and to permit persons to whom the
Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included
in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR
OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
OTHER DEALINGS IN THE SOFTWARE.