[[tags: egg]]

== string-utils

[[toc:]]


== Documentation

=== Memoized String

==== Usage

<enscript language=scheme>
(import memoized-string)
</enscript>

==== make-string+

<procedure>(make-string+ COUNT [FILL]) -> string</procedure>

A ''tabling'' (memoizing) {{make-string}}.

{{FILL}} is any valid {{char}}, including codepoints outside of the ASCII
range. As such, UTF-8 strings can be memoized.
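
A minimal usage sketch, assuming the {{memoized-string}} module is installed. The {{eq?}} check only illustrates what tabling is expected to provide; whether two calls really return the identical object is an assumption, not something this page states.

<enscript language=scheme>
(import scheme (chicken base) memoized-string)

;; Request the same COUNT and FILL twice.
(define dashes-a (make-string+ 5 #\-))
(define dashes-b (make-string+ 5 #\-))

(print dashes-a)                      ; five dashes
(print (string=? dashes-a dashes-b))  ; the contents are equal

;; Assumption: if the table is hit, both names may denote one shared object.
(print (eq? dashes-a dashes-b))
</enscript>

Since a tabled string may be shared between callers, it seems prudent to treat the result as read-only.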

==== string+

<procedure>(string+ [CHAR...]) -> string</procedure>

A ''tabling'' (memoizing) {{string}}.

{{CHAR}} is any valid {{char}}, including codepoints outside of the ASCII
range. As such, UTF-8 strings can be memoized.
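
A short sketch of {{string+}}, assuming the {{memoized-string}} module is installed. The non-ASCII call is shown without a claimed result, because the byte length of that string depends on the encoding in use.

<enscript language=scheme>
(import scheme (chicken base) memoized-string)

;; Build a string from individual characters.
(define ab (string+ #\a #\b))
(print ab)                  ; the characters a and b
(print (string=? ab "ab"))

;; Codepoints outside the ASCII range are accepted as well.
(define lam (string+ #\x3bb))   ; GREEK SMALL LETTER LAMDA
(print lam)
</enscript>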

==== global-string

<procedure>(global-string STR) -> string</procedure>

Returns a string equal to {{STR}} that shares common string space with other
equal global strings.
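
A hedged sketch of how {{global-string}} might be used to share storage between equal strings built at different places. The {{eq?}} result is an assumption about the sharing behaviour and is printed rather than relied upon.

<enscript language=scheme>
(import scheme (chicken base) memoized-string)

;; Two equal strings created independently.
(define k1 (global-string (string-append "content" "-type")))
(define k2 (global-string "content-type"))

(print (string=? k1 k2))  ; the contents are equal
(print (eq? k1 k2))       ; assumption: #t when the storage is shared
</enscript>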

==== make-string* (DEPRECATED)

<procedure>(make-string* COUNT [FILL]) -> string</procedure>

Deprecated in favor of {{make-string+}}.

=== String Hexadecimal

==== Usage

<enscript language=scheme>
(import string-hexadecimal)
</enscript>

==== string->hex

<procedure>(string->hex STRING [START [END]]) -> string</procedure>

Returns a hexadecimal representation of {{STRING}}. {{START}} and {{END}} are
substring limits.

{{STRING}} is treated as a string of bytes, a byte-vector.
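
A small usage sketch, assuming the {{string-hexadecimal}} module is installed. The input is chosen so that every output digit is a decimal digit, which sidesteps the question of letter case. The second call assumes {{START}} is a zero-based index.

<enscript language=scheme>
(import scheme (chicken base) string-hexadecimal)

;; The bytes of "abc" are #x61 #x62 #x63.
(define hex (string->hex "abc"))
(print hex)                     ; 616263

;; Assumption: START is zero-based, so this covers "bc" only.
(print (string->hex "abc" 1))
</enscript>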

==== hex->string

<procedure>(hex->string STRING [START [END]]) -> string</procedure>

Returns the binary representation of a hexadecimal {{STRING}}. {{START}} and
{{END}} are substring limits.
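
A round-trip sketch, assuming the {{string-hexadecimal}} module is installed: encoding with {{string->hex}} and decoding with {{hex->string}} should give back the original bytes.

<enscript language=scheme>
(import scheme (chicken base) string-hexadecimal)

(define original "abc")
(define encoded (string->hex original))  ; two hex digits per byte
(define decoded (hex->string encoded))

(print (string=? original decoded))      ; #t when the round trip succeeds
(print (hex->string "616263"))           ; abc
</enscript>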

=== Hexadecimal Procedures

==== Usage

<enscript language=scheme>
(import to-hex)
</enscript>

==== str_to_hex

<procedure>(str_to_hex OUT IN OFF LEN)</procedure>

Writes the ASCII hexadecimal representation of {{IN}} to {{OUT}}.

{{IN}} is a {{nonnull-string}}.

{{OFF}} is the byte offset.

{{LEN}} is the length of the bytes at {{OFF}}.

{{OUT}} is a {{string}} of length >= {{(* LEN 2)}}.
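
A sketch of calling the low-level writer directly, assuming the {{to-hex}} module is installed. The caller allocates {{OUT}}. Each input byte produces two hexadecimal digits, so the buffer here is sized at twice {{LEN}}.

<enscript language=scheme>
(import scheme (chicken base) to-hex)

(define in "abc")
(define len (string-length in))

;; Pre-allocate the output: two hexadecimal digits per input byte.
(define out (make-string (* 2 len) #\0))

;; Encode LEN bytes of IN, starting at byte offset 0, into OUT.
(str_to_hex out in 0 len)
(print out)   ; 616263
</enscript>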

==== blob_to_hex

<procedure>(blob_to_hex OUT IN OFF LEN)</procedure>

Like {{str_to_hex}} except {{IN}} is a {{nonnull-blob}}.

==== u8vec_to_hex

<procedure>(u8vec_to_hex OUT IN OFF LEN)</procedure>

Like {{str_to_hex}} except {{IN}} is a {{nonnull-u8vector}}.

==== s8vec_to_hex

<procedure>(s8vec_to_hex OUT IN OFF LEN)</procedure>

Like {{str_to_hex}} except {{IN}} is a {{nonnull-s8vector}}.

==== mem_to_hex

<procedure>(mem_to_hex OUT IN OFF LEN)</procedure>

Like {{str_to_hex}} except {{IN}} is a {{nonnull-c-pointer}}.

==== hex_to_str

<procedure>(hex_to_str OUT IN OFF LEN)</procedure>

Decodes the ASCII hexadecimal digits in {{IN}} and writes the resulting bytes
to {{OUT}}.

{{IN}} is a {{nonnull-string}}.

{{OFF}} is the byte offset.

{{LEN}} is the length of the bytes at {{OFF}}.

{{OUT}} is a {{string}} of length >= {{(/ LEN 2)}}.
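
A sketch of the decoding direction, assuming the {{to-hex}} module is installed and that {{LEN}} counts the hexadecimal digits to read from {{IN}}. The output buffer holds one byte for every two digits.

<enscript language=scheme>
(import scheme (chicken base) to-hex)

(define in "616263")
(define len (string-length in))

;; One output byte for every two hexadecimal digits.
(define out (make-string (quotient len 2) #\space))

;; Decode LEN digits of IN, starting at offset 0, into OUT.
(hex_to_str out in 0 len)
(print out)   ; abc
</enscript>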

==== hex_to_blob

<procedure>(hex_to_blob OUT IN OFF LEN)</procedure>

Like {{hex_to_str}} except {{OUT}} is a {{blob}} of size >= {{(/ LEN 2)}}.

=== Unicode Utilities

The name of this extension is misleading. Only UTF-8 is currently supported.

For a better treatment of UTF-8 see the [[utf-8]] extension.

==== Usage

<enscript language=scheme>
(import unicode-utils)
</enscript>

==== ascii-codepoint?

<procedure>(ascii-codepoint? CHAR) -> boolean</procedure>

==== unicode-char->string

<procedure>(unicode-char->string CHAR) -> string</procedure>

Returns a string formed from Unicode codepoint {{CHAR}}.

''Note'' that the {{string-length}} of the result (except under [[utf-8]])
may not be equal to {{1}}.

Generates an error should the codepoint be out-of-range.
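
A brief sketch, assuming the {{unicode-utils}} module is installed. It contrasts an ASCII character with a non-ASCII one to show why the length caveat above matters. The length is printed rather than asserted, since it depends on how {{string-length}} counts.

<enscript language=scheme>
(import scheme (chicken base) unicode-utils)

;; An ASCII codepoint occupies a single byte.
(define a (unicode-char->string #\a))
(print a)                     ; a
(print (string-length a))     ; 1

;; A non-ASCII codepoint is encoded as several UTF-8 bytes.
(define lam (unicode-char->string #\x3bb))  ; GREEK SMALL LETTER LAMDA
(print (string-length lam))   ; may be 2 when the length is counted in bytes
</enscript>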

==== unicode-string

<procedure>(unicode-string [CHAR...]) -> string</procedure>

Returns a string formed from Unicode codepoints {{CHAR...}}.

''Note'' that the {{string-length}} of the result (except under [[utf-8]])
may not be equal to the length of {{CHAR...}}.

Generates an error should a codepoint be out-of-range.

==== *unicode-string

<procedure>(*unicode-string CHARS) -> string</procedure>

Returns a string formed from Unicode codepoints {{CHARS}}, a {{(list-of
char)}}.
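
A sketch comparing the variadic and list forms, assuming the {{unicode-utils}} module is installed. Both are expected to build the same string from the same codepoints.

<enscript language=scheme>
(import scheme (chicken base) unicode-utils)

;; The variadic form takes the characters as separate arguments.
(define s1 (unicode-string #\a #\b #\c))

;; The starred form takes the same characters as one list.
(define s2 (*unicode-string (list #\a #\b #\c)))

(print s1)                 ; abc
(print (string=? s1 s2))   ; both forms build the same string
</enscript>

The list form is convenient when the characters are already collected, since it avoids an {{apply}}.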

==== unicode-make-string

<procedure>(unicode-make-string COUNT [FILL]) -> string</procedure>

Returns a string formed from {{COUNT}} occurrences of the Unicode codepoint
{{FILL}}. The {{FILL}} default is {{#\space}}.

''Note'' that the {{string-length}} of the result (except under [[utf-8]])
may not be equal to {{COUNT}}.

Generates an error should the codepoint be out-of-range.
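
A sketch of {{unicode-make-string}} with and without an explicit {{FILL}}, assuming the {{unicode-utils}} module is installed. The non-ASCII length is printed without a claimed value.

<enscript language=scheme>
(import scheme (chicken base) unicode-utils)

(define xs (unicode-make-string 3 #\x))   ; three copies of x
(define blanks (unicode-make-string 2))   ; FILL defaults to #\space

(print xs)                                ; xxx
(print (string-length blanks))            ; 2

;; With a non-ASCII FILL the byte length can exceed COUNT.
(print (string-length (unicode-make-string 3 #\x3bb)))
</enscript>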

==== unicode-surrogate?

<procedure>(unicode-surrogate? NUM) -> boolean</procedure>

==== unicode-surrogates->codepoint

<procedure>(unicode-surrogates->codepoint HIGH LOW) -> (or boolean fixnum)</procedure>

Returns the codepoint for the valid surrogate pair {{HIGH}} and {{LOW}}.
Otherwise returns {{#f}}.
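
The arithmetic behind a surrogate pair can be sketched in plain Scheme. The helper {{surrogates->codepoint-sketch}} below is hypothetical and is not the egg's implementation. It applies the standard UTF-16 rule: a high surrogate lies in {{#xD800}}..{{#xDBFF}}, a low surrogate in {{#xDC00}}..{{#xDFFF}}, and the pair encodes an offset from {{#x10000}}.

<enscript language=scheme>
(import scheme (chicken base))

;; Hypothetical helper, for illustration only.
;; Returns the codepoint for a valid surrogate pair, otherwise #f.
(define (surrogates->codepoint-sketch high low)
  (and (<= #xD800 high #xDBFF)            ; high (leading) surrogate range
       (<= #xDC00 low #xDFFF)             ; low (trailing) surrogate range
       (+ #x10000
          (* (- high #xD800) #x400)       ; upper 10 bits of the offset
          (- low #xDC00))))               ; lower 10 bits of the offset

(print (surrogates->codepoint-sketch #xD83D #xDE00))  ; 128512, that is #x1F600
(print (surrogates->codepoint-sketch #x0041 #xDE00))  ; #f, not a valid pair
</enscript>

With the egg installed, {{(unicode-surrogates->codepoint #xD83D #xDE00)}} would be expected to agree with the first result.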

=== String Utilities

==== Usage

<enscript language=scheme>
(import string-utils)
</enscript>

==== string-fixed-length

<procedure>(string-fixed-length S N [pad-char: #\space] [trailing: "..."]) -> string</procedure>

Returns the string {{S}} with the {{string-length}} fixed to {{N}}.

A shorter string is padded with {{pad-char}}. A longer string is truncated and
suffixed with {{trailing}}.
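
A usage sketch, assuming the {{string-utils}} module is installed. Only the resulting lengths are stated, since this page does not say on which side the padding is applied.

<enscript language=scheme>
(import scheme (chicken base) string-utils)

;; A short input is padded up to 8 characters.
(define short (string-fixed-length "abc" 8))

;; A long input is truncated and ends with the trailing marker.
(define long (string-fixed-length "abcdefghijklmnop" 8))

(print (string-length short))  ; 8
(print (string-length long))   ; 8

;; Both keyword arguments can be overridden.
(print (string-fixed-length "abc" 8 pad-char: #\.))
(print (string-fixed-length "abcdefghijklmnop" 8 trailing: "~"))
</enscript>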

=== String Interpolation

==== Usage

<enscript language=scheme>
(import string-interpolation-syntax)
</enscript>

==== set-sharp-string-interpolation-syntax

<procedure>(set-sharp-string-interpolation-syntax PROC)</procedure>

Extends the read-syntax with {{#"..."}}, where the string {{"..."}} is
processed using {{(PROC "...")}}. When {{PROC}} is {{#f}} the read-syntax is
cleared. When {{PROC}} is {{#t}}, {{identity}} is used as {{PROC}}.

<enscript language=scheme>
(import string-interpolation-syntax utf8-string-interpolation)

(set-sharp-string-interpolation-syntax string-interpolate)
#"foo #(+ 1 2)bar #{(and 1 2)} baz"
;=> "foo 3bar 2 baz"
</enscript>

See [[http://wiki.call-cc.org/man/5/Extensions%20to%20the%20standard#multiline-string-constant-with-embedded-expressions|Multiline String Constant with Embedded Expressions]].

==== Usage

<enscript language=scheme>
(import string-interpolation) ;or (import utf8-string-interpolation)
</enscript>

==== string-interpolate

<procedure>(string-interpolate STR [eval-tag: EVAL-TAG]) -> list</procedure>

Performs substitution of embedded Scheme expressions, prefixed with
{{EVAL-TAG}} and optionally enclosed in curly brackets. Two consecutive
{{EVAL-TAG}}s are translated to a single {{EVAL-TAG}}.

Similar to the {{#<#}} multi-line string.

{{STR}} is a {{string}}.

{{EVAL-TAG}} is a {{character}}, default {{#\#}}.

Automatically invokes {{(set-sharp-string-interpolation-syntax
string-interpolate)}} on load.
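
A sketch of calling {{string-interpolate}} directly with a custom {{EVAL-TAG}}, assuming the {{string-interpolation}} module is installed. As the signature indicates, the result is a list. The exact shape of that list is not documented here, so the example only prints it.

<enscript language=scheme>
(import scheme (chicken base) string-interpolation)

;; Use $ instead of the default # to mark embedded expressions.
(define form (string-interpolate "sum: $(+ 1 2)" eval-tag: #\$))

(print (list? form))  ; the documented result type is a list
(print form)          ; inspect the generated form
</enscript>

The returned list is presumably an expression meant to be evaluated, which is how the {{#"..."}} read-syntax would use it. That reading is an assumption.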


== Requirements

[[check-errors]]
[[miscmacros]]
[[srfi-1]]
[[srfi-13]]
[[srfi-69]]
[[utf8]]

[[test]]


== Author

[[/users/kon-lovett|Kon Lovett]]


== Version history

; 2.2.0 : Fix {{string-interpolation}}.
; 2.1.0 : Add {{utf8-string-interpolation}}.
; 2.0.0 : C5 release.
; 1.6.0 : Add {{string-utils-extensions}}.
; 1.5.6 : Add types.
; 1.5.5 :
; 1.5.4 :
; 1.5.3 : {{memorize-string}} -> {{global-string}}.
; 1.5.2 : Fix {{string+}} & {{memorize-string}}.
; 1.5.1 : Fix {{string+}} unicode support.
; 1.5.0 : Deprecate {{make-string*}} for {{make-string+}}, add {{memorize-string}} & {{string+}}.
; 1.4.0 : Add string-interpolation modules.
; 1.3.1 : Fix {{hex_to_str}}, {{hex_to_blob}}.
; 1.3.0 : Add {{hex->string}}, {{hex_to_str}}, {{hex_to_blob}}.
; 1.2.5 : Remove [[lookup-table]].
; 1.2.2 : Unicode string construction a little faster. Removed {{blob->hex}}.
; 1.2.1 : Added {{blob->hex}}.
; 1.2.0 : Added "generic" bytes to hexadecimal string.
; 1.1.0 : Split into separate modules. Added some UTF-8 support.
; 1.0.0 : Hello


== License

Copyright (C) 2010-2017 Kon Lovett.  All rights reserved.

Permission is hereby granted, free of charge, to any person obtaining a
copy of this software and associated documentation files (the "Software"),
to deal in the Software without restriction, including without limitation
the rights to use, copy, modify, merge, publish, distribute, sublicense,
and/or sell copies of the Software, and to permit persons to whom the
Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included
in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR
OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
OTHER DEALINGS IN THE SOFTWARE.