<html><head><title>charconv</title>

<style type="text/css">
  <!--
      CODE {
             color: #666666;
           }
      EM {
           font-weight: bold;
           font-style: normal;
         }
      DT.function { 
                    background: #f5f5f5;
                    color: black;
                    padding: 0.1em;
                    border: 1px solid #bbbaaf;
                    font-family: monospace;
                  }
      PRE {
        background: #efeee0;
        padding: 0.1em;
        border: 1px solid #bbbaaf;
      }
    -->
</style></head>
  <body>

<center><img src="egg.jpg"></center>
<center><a href="index.html">back</a></center>

<h2>charconv</h2>

<h3>Description:</h3>

Character encoding utilities

<h3>Author:</h3>
Alex Shinn

<h3>Version:</h3>
<ul>
<li>1.2 Fixed a bug in pad-euc-input.  Wrapping a port with an unknown encoding now signals an error.
<li>1.1 Adapted to SRFI-69-compatible hash tables
<li>1.0
</li></ul>

<h3>Usage:</h3>
<pre>(require-extension charconv)
</pre>

<h3>Requires:</h3>
<a href="iconv.html"><code>iconv</code></a> and <a href="autoload.html"><code>autoload</code></a>

<h3>Download:</h3>
<a href="charconv.egg">charconv.egg</a>

<h3>Documentation:</h3>

This module provides a convenience layer on top of the iconv
module, as well as automatic detection of character encoding schemes.
It assumes that your strings are UTF-8 internally
(you can use the <a href="utf8.html">utf8</a> module to give strings UTF-8 semantics
as well).  Given that, all you need to specify is the external
encoding you are working with.

<p>INPUT/OUTPUT PROCEDURES:

<p>  The following are direct analogs of the R5RS procedures with the
  corresponding names, each taking an additional ENC argument that
  names the external encoding:

<ul>
  <li> <code>open-encoded-input-file FILE ENC</code>
  <li> <code>call-with-encoded-input-file FILE ENC PROC</code>
  <li> <code>with-input-from-encoded-file FILE ENC THUNK</code>
  <li> <code>open-encoded-output-file FILE ENC</code>
  <li> <code>call-with-encoded-output-file FILE ENC PROC</code>
  <li> <code>with-output-to-encoded-file FILE ENC THUNK</code>

<p>  Example:

<pre>
  (with-input-from-encoded-file "/usr/share/edict/edict" "EUC-JP"
    read-line)
</pre>

   <li> <code>read-encoded-string ENC [N [PORT]]</code>

<p>  An analog of <code>read-string</code> in which N is a byte count (not a character count).
  It may read additional bytes so that the read ends on a character
  boundary.  If you want exactly N bytes regardless of
  character boundaries, combine <code>read-string</code> with
  <code>ces-convert</code> below.
</ul>
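
<p>  As a rough sketch of how the file procedures listed above might be
  paired, the following writes a string out in one encoding and reads it
  back.  The helper names <code>save-as-latin1</code> and <code>load-latin1-line</code> are
  hypothetical and not part of this module, and the sketch assumes that
  "ISO-8859-1" is an encoding name the underlying iconv accepts:

<pre>
  (require-extension charconv)

  ;; Hypothetical helper: write the (internally UTF-8) string TEXT to
  ;; FILE through a port that is expected to re-encode it as ISO-8859-1.
  (define (save-as-latin1 file text)
    (call-with-encoded-output-file file "ISO-8859-1"
      (lambda (port) (display text port))))

  ;; Hypothetical helper: read the first line of FILE through a port
  ;; that is expected to decode it from ISO-8859-1.
  (define (load-latin1-line file)
    (call-with-encoded-input-file file "ISO-8859-1" read-line))
</pre>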

<p>UTILITY PROCEDURES:

<p>  The following are copied from the Gauche API.  CES stands for
  Character Encoding Scheme.

<ul>
  <li> <code>ces-equivalent? CES-A CES-B</code>

  <p>Returns #t if CES-A and CES-B are equivalent (aliases), #f otherwise.

  <li> <code>ces-upper-compatible? CES-A CES-B</code>

  <p>Returns #t if a string encoded in CES-B can be treated as a string
  in CES-A without conversion.

  <li> <code>ces-convert STR FROM [TO]</code>

  <p>Returns a new string containing STR converted from encoding FROM to encoding
  TO.
</ul>
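
<p>  One way the utility procedures above might be combined is to skip
  conversion when the target encoding already covers the source.  This
  is only a sketch: <code>convert-if-needed</code> is a hypothetical helper, not
  part of this module, and TO is always passed explicitly because the
  default for an omitted TO is not stated here.

<pre>
  (require-extension charconv)

  ;; Hypothetical helper: return STR unchanged when a string in FROM can
  ;; already be treated as a string in TO, and convert it otherwise.
  (define (convert-if-needed str from to)
    (if (ces-upper-compatible? to from)
        str
        (ces-convert str from to)))

  ;; Example call; the argument is assumed to hold Shift_JIS bytes.
  (define (sjis-to-utf8 str)
    (convert-if-needed str "Shift_JIS" "UTF-8"))
</pre>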

<p>DETECTION PROCEDURES:

  <p><ul>
  <li> <code>detect-file-encoding FILE [LOCALE]</code>
   <li> <code>detect-encoding STRING [LOCALE]</code>

  <p>The detection procedures can correctly identify most common 'types'
  of encodings, such as UTF-8/16/32, EUC-*, ISO-2022-*, Shift_JIS or
  single-byte, without any need to specify the locale.  However,
  they currently include no statistical or linguistic routines,
  so they cannot distinguish between EUC-JP and EUC-KR, or
  between any of the single-byte encodings (including ISO-8859-*).  In
  these cases you can specify a locale: for example, when a
  single-byte encoding is detected, a "de" locale results in the default
  German single-byte encoding, ISO-8859-1.

 <p>The <code>detect-file-encoding</code> procedure also recognizes the Emacs-style

<pre>
    -*- coding: foo -*-
</pre>

 signature in either of the first two lines.
</ul>
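
<p> A minimal sketch of calling the detection procedures directly, then
 opening the file with the result.  The helper name
 <code>first-line/detected</code> is hypothetical, and the sketch assumes that
 <code>detect-file-encoding</code> returns an encoding name that the input
 procedures accept as ENC, and that LOCALE is a string such as "de":

<pre>
  (require-extension charconv)

  ;; Hypothetical helper: guess FILE's encoding using LOCALE as a hint,
  ;; then read its first line through a port using that encoding.
  (define (first-line/detected file locale)
    (let ((enc (detect-file-encoding file locale)))
      (call-with-encoded-input-file file enc read-line)))
</pre>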

<p>AUTOMATIC DETECTION:

<p> You can also use automatic detection implicitly in the input
 procedures by specifying an encoding of "*" or "*&lt;LOCALE&gt;".  For
 example,

<pre>
   (open-encoded-input-file file "*")    ; guess with no locale
   (open-encoded-input-file file "*DE")  ; guess with a German locale
</pre>

 <p>For compatibility with the Gauche convention, the encoding "*JP"
 is equivalent to "*JA", the Japanese locale.



<h3>License:</h3>

<pre>
Copyright (c) 2004-2005, Alex Shinn
All rights reserved.

Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following
conditions are met:

  Redistributions of source code must retain the above copyright notice, this list of conditions and the following
    disclaimer.
  Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following
    disclaimer in the documentation and/or other materials provided with the distribution.
  Neither the name of the author nor the names of its contributors may be used to endorse or promote
    products derived from this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS
OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY
AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDERS OR
CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR
OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
POSSIBILITY OF SUCH DAMAGE.
</pre>


<hr><a href="index.html">back</a>


</body></html>