source: project/wiki/eggref/5/npdiff @ 38031

Last change on this file since 38031 was 38031, checked in by Ivan Raikov, 2 months ago

initial npdiff doc for c5

File size: 5.2 KB
Line 
1[[tags:egg]]
2
3== npdiff
4
5Compute the longest common subsequence of two sequences, and generate
6a script to transform one sequence into the other.
7
8[[toc:]]
9
10== Documentation
11
12The {{npdiff}} library is an implementation of an algorithm for
13sequence comparison that has a worst-case running time of {{O(N*P)}},
14where {{P}} is the number of deletions required by the shortest edit
15script between the two sequences.  The algorithm is described in the
16following paper:
17
18 Sun Wu, Udi Manber, Eugene Myers, and Webb Miller. An O(NP) sequence comparison algorithm.
19 In Information Processing Letters, volume 35, pages 317--323, September 1990.
20
21This library defines a data type that describes the three basic
22operations of an edit script (remove, insert, change) and a procedure
23that implements the algorithm for sequences of an arbitrary type,
24provided an equality predicate for that type, and accessors for the
25sequence.
26
27=== Data types
28
29
30{{(define-datatype diffop diffop? ...)}}
31
32Representation of the three diff operations: insert, remove,
33change. The target in each operation is an index or a range of indices
34in the sequence being transformed (target sequence), and the source is
35a range of indices in the sequence being read from (source
36sequence). The operation definitions are:
37
38; {{(Insert target source seq context)}} : Insert operation
39* {{TARGET}} is the index at which the insertion takes place;
40* {{SOURCE}} is a range in the form (x . y) that specifies the indices of those elements in the source sequence that are inserted in the target sequence;
41* {{SEQ}} is the target sequence, and
42* {{CONTEXT}} is an optional context provided  in the form {{(BEFORE . AFTER)}} where {{BEFORE}} is a sequence of elements from the source preceding the elements being inserted, and {{AFTER}} is a sequence of elements from the source that follow the elements being inserted.
43; {{(Remove target seq context)}} : Remove operation
44* {{TARGET}} is a range of elements in the form (x . y) that are being deleted from the target sequence;
45* {{SEQ}} is the target sequence, and
46* {{CONTEXT}} is an optional context provided  in the form {{(BEFORE . AFTER)}} where {{BEFORE}} is a sequence of elements from the target preceding the elements being deleted, and {{AFTER}} is a sequence of elements from the target that follow the elements being deleted.
47; {{(Change target source seqin seqout contextin contextout)}} : Change operation
48* {{TARGET}} is a range of elements in the form (x . y) that are being modified in the target sequence;
49* {{SOURCE}} is a range of elements in the form (x . y) from the source sequence that are replacing the specified elements in the target sequence;
50* {{SEQIN}} is the source sequence, and {{SEQOUT}} is the target sequence;
51* {{CONTEXTOUT}} is an optional context provided  in the form {{(BEFORE . AFTER)}} where {{BEFORE}} is a sequence of elements from the target preceding the elements being changed (SEQOUT), and {{AFTER}} is a sequence of element from the target that follow the elements being changed (SEQOUT); {{CONTEXTIN}} is an optional context provided  in the form {{(BEFORE . AFTER)}} where {{BEFORE}} is a sequence of elements from the source preceding the replacement elements (SEQIN), and {{AFTER}} is a sequence of elements from the source that follow the replacement elements (SEQIN);
52
53=== Procedures
54
55
56<procedure>npdiff:: A B [CONTEXT-LEN] -> (HUNK ... )</procedure>
57
58
59Performs comparison on the input sequences
60where {{A}} and {{B}} are two sequences to be compared,
61and {{CONTEXT-LEN}} is an optional context length, which is used by
62the {{HUNKS}} procedure to create context in the hunks of the edit
63script
64
65== About this egg
66
67=== Author
68
69[[/users/ivan-raikov|Ivan Raikov]]
70
71=== Version history
72; 2.0 Ported to CHICKEN 5 and yasos collections
73; 1.16 : Added missing else clause when constructing hunks for completely different sequences [thanks to Felix Winkelmann]
74; 1.15 : Increased amount of context used for remove hunks [related to a bug discovered by Daishi Kato]
75; 1.14 : Proper handling of boundary cases when either input sequence is empty [thanks to Daishi Kato]
76; 1.13 : Make sure context is included even if the two sequences are completely different
77; 1.12 : Updated version in setup file
78; 1.11 : Documentation converted to wiki format
79; 1.10 : Eliminated dependency on matchable
80; 1.9 : Ported to Chicken 4
81; 1.8 : Added missing files to the file manifest
82; 1.7 : Removed dependence on stack egg
83; 1.5 : Build script updated for better cross-platform compatibility
84; 1.4 : eggdoc documentation fix
85; 1.3 : License upgrade to GPL v3
86; 1.2 : Added license text to source files
87; 1.1 : Documentation fixes
88; 1.0 : Initial release
89
90=== License
91
92
93 Copyright 2007-2019 Ivan Raikov.
94 
95 This program is free software: you can redistribute it and/or modify
96 it under the terms of the GNU General Public License as published by
97 the Free Software Foundation, either version 3 of the License, or
98 (at your option) any later version.
99 
100 This program is distributed in the hope that it will be useful, but
101 WITHOUT ANY WARRANTY; without even the implied warranty of
102 MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
103 General Public License for more details.
104 
105 A full copy of the GPL license can be found at
106 <http://www.gnu.org/licenses/>.
107
Note: See TracBrowser for help on using the repository browser.