[[toc:]]

== Sentence Split

CHICKEN Scheme port of the Perl module Lingua::EN::Sentence [[https://github.com/kimryan/Lingua-EN-Sentence]].

=== Description

Sentence split extracts sentences from a body of text using a series of regular expressions. The original Perl project was started by Shlomo Yona; the Perl module is currently maintained by Kim Ryan.

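The regular-expression approach described above can be illustrated with a toy sketch. This is not the egg's implementation: the abbreviation list, the break rule and the procedure names ({{candidate-pieces}}, {{merge-pieces}}, {{split-sentences}}) are assumptions made for illustration only, and the code targets CHICKEN 4 with its bundled irregex unit.

```scheme
;; Toy sketch only; NOT the code used by the sentence-split egg.
;; CHICKEN 4 syntax (CHICKEN 5 needs different imports).
(use data-structures irregex srfi-13)

;; Hypothetical abbreviation list: lower case, no trailing period.
(define abbreviations '("dr" "mr" "lt" "col"))

;; Step 1: collapse whitespace, then break after ".", "?" or "!"
;; whenever the next word starts with a capital letter.
(define (candidate-pieces text)
  (let* ((flat   (irregex-replace/all "[ \t\r\n]+" text " "))
         (marked (irregex-replace/all "([.?!]) ([A-Z])" flat 1 "\n" 2)))
    (string-split marked "\n")))

;; Step 2: a piece whose last word is a known abbreviation
;; is not treated as a complete sentence.
(define (ends-with-abbreviation? piece)
  (let ((m (irregex-search "([A-Za-z]+)\\.$" piece)))
    (and m
         (member (string-downcase (irregex-match-substring m 1))
                 abbreviations)
         #t)))

;; Step 3: glue each incomplete piece onto the piece that follows it.
(define (merge-pieces pieces)
  (let loop ((rest pieces) (pending #f) (acc '()))
    (if (null? rest)
        (reverse (if pending (cons pending acc) acc))
        (let ((current (if pending
                           (string-append pending " " (car rest))
                           (car rest))))
          (if (ends-with-abbreviation? current)
              (loop (cdr rest) current acc)
              (loop (cdr rest) #f (cons current acc)))))))

(define (split-sentences text)
  (merge-pieces (candidate-pieces text)))

(for-each print (split-sentences "Dr. Smith met Lt. Jones. They talked."))
```

With the abbreviation list above, "Dr." and "Lt." should stay inside the first sentence instead of ending it. A real splitter needs many more rules (quotes, ellipses, initials, numbers), which is what the series of regular expressions in the Perl module and this port is for.
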
=== Project / Source Code Repository

[[https://gitlab.com/maxwell79/chicken-sentence-split]]

=== Maintainer

David Ireland (djireland79 at gmail dot com)

== API

<procedure>(get-sentences TEXT)</procedure>

Returns the list of sentences extracted from TEXT.

<procedure>(acronyms)</procedure>

Returns the list of currently defined acronyms.

<procedure>(end-of-sentence-marker)</procedure>

Returns the current end-of-sentence marker.

<procedure>(set-end-of-sentence-marker! MARKER)</procedure>

Sets the end-of-sentence marker to MARKER.

<procedure>(add-acronyms! ACRONYMS)</procedure>

Adds the acronyms in the list ACRONYMS to the currently defined acronyms.

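The configuration procedures above might be combined as in the following sketch. Only the procedure names come from this page; the argument formats are assumptions: acronyms are assumed to be strings without the trailing period (as in the Perl module), and the marker is assumed to be a string. Check the egg's source before relying on either.

```scheme
(use (prefix sentence-split ss:))

;; ASSUMPTION: acronyms are strings without the trailing period.
(ss:add-acronyms! '("gen" "adm"))

;; Inspect the acronym list after the addition.
(print (ss:acronyms))

;; Show the current marker, then replace it.
;; ASSUMPTION: the marker is a string; "<EOS>" is an arbitrary example value.
(print (ss:end-of-sentence-marker))
(ss:set-end-of-sentence-marker! "<EOS>")

;; Later calls to ss:get-sentences use the updated settings.
(for-each print (ss:get-sentences "He met Gen. Lee. They spoke briefly."))
```
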
== Example

<enscript highlight="scheme">
(use (prefix sentence-split ss:))

(define text "Pr. Chamberlain was also a col. in the army.
Chamberlain was commissioned a lt. col. in
the 20th Maine Volunteer Infantry Regiment.
On July 2nd, during the Battle of Gettysburg,
Chamberlain's regiment occupied the extreme
left of the Union lines at Little Round Top.")

;; Print each extracted sentence wrapped in angle brackets.
(for-each
  (lambda (x)
    (print "<" x ">"))
  (ss:get-sentences text))
</enscript>

'''Produces the following output:'''

<Pr. Chamberlain was also a col. in the army.>

<Chamberlain was commissioned a lt. col. in the 20th Maine Volunteer Infantry Regiment.>

<On July 2nd, during the Battle of Gettysburg, Chamberlain's regiment occupied the extreme left of the Union lines at Little Round Top.>

== Changelog

* 0.1 Initial Release

== License

This is free software; you can redistribute it and/or modify it under
the same terms as the Perl 5 programming language system itself.

Terms of the Perl programming language system itself

a) the GNU General Public License as published by the Free
Software Foundation; either version 1, or (at your option) any
later version, or
b) the "Artistic License"

--- The GNU General Public License, Version 1, February 1989 ---

This is free software, licensed under:

The GNU General Public License, Version 1, February 1989

GNU GENERAL PUBLIC LICENSE
Version 1, February 1989

Copyright (C) 1989 Free Software Foundation, Inc.
51 Franklin St, Suite 500, Boston, MA 02110-1335 USA

Everyone is permitted to copy and distribute verbatim copies
of this license document, but changing it is not allowed.

Preamble

The license agreements of most software companies try to keep users
at the mercy of those companies. By contrast, our General Public
License is intended to guarantee your freedom to share and change free
software--to make sure the software is free for all its users. The
General Public License applies to the Free Software Foundation's
software and to any other program whose authors commit to using it.
You can use it for your programs, too.

When we speak of free software, we are referring to freedom, not
price. Specifically, the General Public License is designed to make
sure that you have the freedom to give away or sell copies of free
software, that you receive source code or can get it if you want it,
that you can change the software or use pieces of it in new free
programs; and that you know you can do these things.

To protect your rights, we need to make restrictions that forbid
anyone to deny you these rights or to ask you to surrender the rights.
These restrictions translate to certain responsibilities for you if you
distribute copies of the software, or if you modify it.

For example, if you distribute copies of a such a program, whether
gratis or for a fee, you must give the recipients all the rights that
you have. You must make sure that they, too, receive or can get the
source code. And you must tell them their rights.

We protect your rights with two steps: (1) copyright the software, and
(2) offer you this license which gives you legal permission to copy,
distribute and/or modify the software.

Also, for each author's protection and ours, we want to make certain
that everyone understands that there is no warranty for this free
software. If the software is modified by someone else and passed on, we
want its recipients to know that what they have is not the original, so
that any problems introduced by others will not reflect on the original
authors' reputations.

The precise terms and conditions for copying, distribution and
modification follow.

GNU GENERAL PUBLIC LICENSE
TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION

0. This License Agreement applies to any program or other work which
contains a notice placed by the copyright holder saying it may be
distributed under the terms of this General Public License. The
"Program", below, refers to any such program or work, and a "work based
on the Program" means either the Program or any work containing the
Program or a portion of it, either verbatim or with modifications. Each
licensee is addressed as "you".

1. You may copy and distribute verbatim copies of the Program's source
code as you receive it, in any medium, provided that you conspicuously and
appropriately publish on each copy an appropriate copyright notice and
disclaimer of warranty; keep intact all the notices that refer to this
General Public License and to the absence of any warranty; and give any
other recipients of the Program a copy of this General Public License
along with the Program. You may charge a fee for the physical act of
transferring a copy.

2. You may modify your copy or copies of the Program or any portion of
it, and copy and distribute such modifications under the terms of Paragraph
1 above, provided that you also do the following:

a) cause the modified files to carry prominent notices stating that
you changed the files and the date of any change; and

b) cause the whole of any work that you distribute or publish, that
in whole or in part contains the Program or any part thereof, either
with or without modifications, to be licensed at no charge to all
third parties under the terms of this General Public License (except
that you may choose to grant warranty protection to some or all
third parties, at your option).

c) If the modified program normally reads commands interactively when
run, you must cause it, when started running for such interactive use
in the simplest and most usual way, to print or display an
announcement including an appropriate copyright notice and a notice
that there is no warranty (or else, saying that you provide a
warranty) and that users may redistribute the program under these
conditions, and telling the user how to view a copy of this General
Public License.

d) You may charge a fee for the physical act of transferring a
copy, and you may at your option offer warranty protection in
exchange for a fee.

Mere aggregation of another independent work with the Program (or its
derivative) on a volume of a storage or distribution medium does not bring
the other work under the scope of these terms.

3. You may copy and distribute the Program (or a portion or derivative of
it, under Paragraph 2) in object code or executable form under the terms of
Paragraphs 1 and 2 above provided that you also do one of the following:

a) accompany it with the complete corresponding machine-readable
source code, which must be distributed under the terms of
Paragraphs 1 and 2 above; or,

b) accompany it with a written offer, valid for at least three
years, to give any third party free (except for a nominal charge
for the cost of distribution) a complete machine-readable copy of the
corresponding source code, to be distributed under the terms of
Paragraphs 1 and 2 above; or,

c) accompany it with the information you received as to where the
corresponding source code may be obtained. (This alternative is
allowed only for noncommercial distribution and only if you
received the program in object code or executable form alone.)

Source code for a work means the preferred form of the work for making
modifications to it. For an executable file, complete source code means
all the source code for all modules it contains; but, as a special
exception, it need not include source code for modules which are standard
libraries that accompany the operating system on which the executable
file runs, or for standard header files or definitions files that
accompany that operating system.

4. You may not copy, modify, sublicense, distribute or transfer the
Program except as expressly provided under this General Public License.
Any attempt otherwise to copy, modify, sublicense, distribute or transfer
the Program is void, and will automatically terminate your rights to use
the Program under this License. However, parties who have received
copies, or rights to use copies, from you under this General Public
License will not have their licenses terminated so long as such parties
remain in full compliance.

5. By copying, distributing or modifying the Program (or any work based
on the Program) you indicate your acceptance of this license to do so,
and all its terms and conditions.

6. Each time you redistribute the Program (or any work based on the
Program), the recipient automatically receives a license from the original
licensor to copy, distribute or modify the Program subject to these
terms and conditions. You may not impose any further restrictions on the
recipients' exercise of the rights granted herein.

7. The Free Software Foundation may publish revised and/or new versions
of the General Public License from time to time. Such new versions will
be similar in spirit to the present version, but may differ in detail to
address new problems or concerns.

Each version is given a distinguishing version number. If the Program
specifies a version number of the license which applies to it and "any
later version", you have the option of following the terms and conditions
either of that version or of any later version published by the Free
Software Foundation. If the Program does not specify a version number of
the license, you may choose any version ever published by the Free Software
Foundation.

8. If you wish to incorporate parts of the Program into other free
programs whose distribution conditions are different, write to the author
to ask for permission. For software which is copyrighted by the Free
Software Foundation, write to the Free Software Foundation; we sometimes
make exceptions for this. Our decision will be guided by the two goals
of preserving the free status of all derivatives of our free software and
of promoting the sharing and reuse of software generally.

NO WARRANTY

9. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY
FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN
OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES
PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED
OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS
TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE
PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING,
REPAIR OR CORRECTION.

10. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING
WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR
REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES,
INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING
OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED
TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY
YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER
PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE
POSSIBILITY OF SUCH DAMAGES.

END OF TERMS AND CONDITIONS

Appendix: How to Apply These Terms to Your New Programs

If you develop a new program, and you want it to be of the greatest
possible use to humanity, the best way to achieve this is to make it
free software which everyone can redistribute and change under these
terms.

To do so, attach the following notices to the program. It is safest to
attach them to the start of each source file to most effectively convey
the exclusion of warranty; and each file should have at least the
"copyright" line and a pointer to where the full notice is found.

<one line to give the program's name and a brief idea of what it does.>
Copyright (C) 19yy <name of author>

This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 1, or (at your option)
any later version.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.

You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software
Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston MA 02110-1301 USA

Also add information on how to contact you by electronic and paper mail.

If the program is interactive, make it output a short notice like this
when it starts in an interactive mode:

Gnomovision version 69, Copyright (C) 19xx name of author
Gnomovision comes with ABSOLUTELY NO WARRANTY; for details type `show w'.
This is free software, and you are welcome to redistribute it
under certain conditions; type `show c' for details.

The hypothetical commands `show w' and `show c' should show the
appropriate parts of the General Public License. Of course, the
commands you use may be called something other than `show w' and `show
c'; they could even be mouse-clicks or menu items--whatever suits your
program.

You should also get your employer (if you work as a programmer) or your
school, if any, to sign a "copyright disclaimer" for the program, if
necessary. Here a sample; alter the names:

Yoyodyne, Inc., hereby disclaims all copyright interest in the
program `Gnomovision' (a program to direct compilers to make passes
at assemblers) written by James Hacker.

<signature of Ty Coon>, 1 April 1989
Ty Coon, President of Vice

That's all there is to it!

--- The Artistic License 1.0 ---

This software is Copyright (c) 2016 by Kim Ryan.

This is free software, licensed under:

The Artistic License 1.0

The Artistic License

Preamble

The intent of this document is to state the conditions under which a Package
may be copied, such that the Copyright Holder maintains some semblance of
artistic control over the development of the package, while giving the users of
the package the right to use and distribute the Package in a more-or-less
customary fashion, plus the right to make reasonable modifications.

Definitions:

- "Package" refers to the collection of files distributed by the Copyright
Holder, and derivatives of that collection of files created through
textual modification.
- "Standard Version" refers to such a Package if it has not been modified,
or has been modified in accordance with the wishes of the Copyright
Holder.
- "Copyright Holder" is whoever is named in the copyright or copyrights for
the package.
- "You" is you, if you're thinking about copying or distributing this Package.
- "Reasonable copying fee" is whatever you can justify on the basis of media
cost, duplication charges, time of people involved, and so on. (You will
not be required to justify it to the Copyright Holder, but only to the
computing community at large as a market that must bear the fee.)
- "Freely Available" means that no fee is charged for the item itself, though
there may be fees involved in handling the item. It also means that
recipients of the item may redistribute it under the same conditions they
received it.

1. You may make and give away verbatim copies of the source form of the
Standard Version of this Package without restriction, provided that you
duplicate all of the original copyright notices and associated disclaimers.

2. You may apply bug fixes, portability fixes and other modifications derived
from the Public Domain or from the Copyright Holder. A Package modified in such
a way shall still be considered the Standard Version.

3. You may otherwise modify your copy of this Package in any way, provided that
you insert a prominent notice in each changed file stating how and when you
changed that file, and provided that you do at least ONE of the following:

a) place your modifications in the Public Domain or otherwise make them
Freely Available, such as by posting said modifications to Usenet or an
equivalent medium, or placing the modifications on a major archive site
such as ftp.uu.net, or by allowing the Copyright Holder to include your
modifications in the Standard Version of the Package.

b) use the modified Package only within your corporation or organization.

c) rename any non-standard executables so the names do not conflict with
standard executables, which must also be provided, and provide a separate
manual page for each non-standard executable that clearly documents how it
differs from the Standard Version.

d) make other distribution arrangements with the Copyright Holder.

4. You may distribute the programs of this Package in object code or executable
form, provided that you do at least ONE of the following:

a) distribute a Standard Version of the executables and library files,
together with instructions (in the manual page or equivalent) on where to
get the Standard Version.

b) accompany the distribution with the machine-readable source of the Package
with your modifications.

c) accompany any non-standard executables with their corresponding Standard
Version executables, giving the non-standard executables non-standard
names, and clearly documenting the differences in manual pages (or
equivalent), together with instructions on where to get the Standard
Version.

d) make other distribution arrangements with the Copyright Holder.

5. You may charge a reasonable copying fee for any distribution of this
Package. You may charge any fee you choose for support of this Package. You
may not charge a fee for this Package itself. However, you may distribute this
Package in aggregate with other (possibly commercial) programs as part of a
larger (possibly commercial) software distribution provided that you do not
advertise this Package as a product of your own.

6. The scripts and library files supplied as input to or produced as output
from the programs of this Package do not automatically fall under the copyright
of this Package, but belong to whomever generated them, and may be sold
commercially, and may be aggregated with this Package.

7. C or perl subroutines supplied by you and linked into this Package shall not
be considered part of this Package.

8. The name of the Copyright Holder may not be used to endorse or promote
products derived from this software without specific prior written permission.

9. THIS PACKAGE IS PROVIDED "AS IS" AND WITHOUT ANY EXPRESS OR IMPLIED
WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF
MERCHANTIBILITY AND FITNESS FOR A PARTICULAR PURPOSE.

The End