source: project/gazette/src/issues/15.wiki @ 21838

Last change on this file since 21838 was 21838, checked in by Moritz Heidkamp, 9 years ago

gazette #15: fix a few typos (thanks florz), add chicken talk and update contribution note

File size: 20.4 KB
Line 
1((title . "Issue 15")
2 (authors "Alaric Snell-Pym" "Moritz Heidkamp")
3 (date . 1291685838))
4
5== 0. Introduction
6
7Welcome to issue 15 of the Chicken Gazette.
8
9== 1. The Hatching Farm
10
11* In his everlasting quest to provide the best documentation tooling
12  of all, [[user:jim-ursetto|Jim Ursetto]] created yet another
13  [[egg:chicken-doc|chicken-doc]] spin-off. Luckily it is not called
14  {{yacdso}}. Please welcome [[egg:manual-labor|manual-labor]]!
15 
16* [[user:ivan-raikov|Ivan Raikov]] went on and fixed an edge case
17   problem in the [[egg:npdiff|npdiff]] egg, leading to the release of
18   version 1.14. While he was at it he also released version 1.11 of
19   [[format-textdiff|format-textdiff]] fixing some bugs in the context
20   diff handler.
21
22* [[user:peter-lane|Peter Lane]] updated his
23  [[egg:dataset-utils|dataset-utils]] and
24  [[egg:classifiers|classifiers]] eggs. If machine learning or data
25  mining is your thing you might want to check those out!
26 
27* [[user:felix-winkelmann|Felix Winkelmann]] released version 0.2 of
28  the handy [[egg:progress-indicators|progress-indicators]] extension.
29 
30* Lastly, our daring [[user:peter-bex|Peter Bex]] wrestled with
31  [[egg:openssl|openssl]] once again. The new release 1.5 should
32  hopefully fix the remaining issues we had with the
33  [[http://wiki.call-cc.org/|Chicken Wiki]].
34
35
36== 2. Core development
37
38Chicken's {{master}} branch has seen two bug fix commits this
39week. One of them fixed a problem with {{thread-sleep!}} which failed
40when it was passed an inexact/flonum sleep-time. Thanks to Karel
41Miklav for reporting this! The other one fixed a bug which caused all
42kinds of trouble when {{cons}} got redefined. This was noticed by
43David Steiner when he tried following the code examples in
44[[http://mitpress.mit.edu/sicp/|SICP]] using Chicken.
45
46The {{experimental}} branch has been busy experimenting: In the
47aftermath of the [[egg:jbogenturfahi|jbogenturfa'i]] debugging session
48(see below), Felix decided to remove the lambda-lifter entirely since
49he figured it was rather ineffective at lifting lambdas. The
50scrutinizer on the other hand got some dents flattened and is here to
51stay. Finally, the aforementioned [[egg:manual-labor|manual-labor]] is
52the official (experimental) manual generator now!
53
54== 3. Chicken Talk
55
56Readers of [[/issues/11.html|issue 11]] may remember the
57[[http://www.call-cc.org/pictures/t-dose2010/index.html|T-DOSE picture gallery]]. Be
58informed that it now contains a few additional pictures, some of which
59are quite okay!
60
61The chicken-users list was dominated by [[user:alyn.post|Alan Post]]'s
62ongoing effort to create a Lojban parser this week, spanning across
63two threads. In the first of them he
64[[http://www.mail-archive.com/chicken-users@nongnu.org/msg12661.html|details his progress on optimizing the code's compilation]]
65while in the second
66[[http://www.mail-archive.com/chicken-users@nongnu.org/msg12705.html|he reports on problems he encountered when running it for the first time]]. As
67it turned out, they were caused by using {{equal?}} on a list
68containing circular data resulting in an endless recursion and finally
69in a stack overflow. A change request is imminent to ameliorate this
70situation.
71
72Christian Kellermann kindly announced the all new
73[[http://www.mail-archive.com/chicken-users@nongnu.org/msg12711.html|Gazette authorship organisation scheme]]. Check
74it out if you are interested in contributing or helping with a future
75Gazette issue!
76
77== 4. Omelette Recipes
78
79Today we're going to look at the [[egg:amb|amb]] egg. This egg
80implements the {{amb}} operator, much-loved as an example of the use
81of continuations to alter control flow in exciting ways, unusual
82evaluation orders, and a mind-altering image of a world where
83computers exploit parallel universes or quantum mechanics to explore
84multiple branches of a recursion.
85
86However, {{amb}} is sometimes a useful tool; there's been a few cases
87where I've wished it was available in projects I've done in other
88languages, and I'm going to simplify one of them into an example.
89
90But first, let's look at what {{amb}} does (and a little about how it
91works). Basically, {{amb}} is a form (it can be a macro or a
92procedure, and the difference in effect is immaterial for our purposes
93in this recipe) that has a variable number of arguments and, in
94principle, returns them all in separate "threads"; it can be thought
95of as a bit like POSIX's {{fork()}}, except lightweight. However, the
96intent is not to exploit parallelism (most implementations of {{amb}}
97do not provide any kind of pre-emption; each "thread" runs until it
98terminates, then lets another run), but to explore different possible
99control flows. As such, there is an {{amb-assert}} form that, if its
100argument is #f, kills the current "thread" and runs another.
101
102So every time your program performs an {{amb}}, multiple threads pop
103into existence; and whenever it performs an {{amb-assert}}, some threads
104are culled. The principle is, whenever you have a point in your
105program where a choice must be made, you can use {{amb}} to have the
106program split up and explore every possible choice; and when it turns
107out, further down the line, to have been the wrong choice, you can
108kill it, freeing the CPU to explore another choice.
109
110This is usually demonstrated with a program that solves a logic puzzle
111of the form "Peter, Jim, Moritz and Felix are hacking on the Chicken
112core. Peter is working only on source files written in Scheme. Moritz
113works only on {{posix.scm}}, whoever works on the expander also works
114on the compiler, you can't work on the scheduler without also working
115on {{csi}}, ...and so on... So, who introduced bug #232?". Which is
116all well and good for an undergraduate programming exercise, but who
117encounters such problems in real life?
118
119Well, the general class of problem is a "tree search problem". The
120idea is you have some situation that can be modelled as a tree of
121possibilities (potentially infinite!), with solutions dotted around
122the tree; and our goal is to find perhaps all the solutions, perhaps
123just any solution, or perhaps the solution nearest to the root of the
124tree. Practical examples include finding a record in a B-Tree (where
125the tree structure actually corresponds to physical index blocks on
126disk), solving logic puzzles (where the tree structure is a purely
127logical tree of possible situations, some of which may yield
128solutions), or choosing the best move to make in a game (where the
129tree represents possible states of the game, and the moves that can
130move between states).
131
132These kinds of algorithms are normally written as recursive functions,
133which take whatever represents a position on the tree as an argument,
134check to see if there's a solution there (and handle it
135appropriately if so, which may involve terminating the search if we've
136found enough solutions), then work out all the positions below this
137one in the tree and recursively call the function on each in
138turn. This forces a "depth-first" search, as the recursion will only
139bottom out if a solution is found (that terminates the search
140entirely) or a subtree is totally exhausted of options. Going for a
141"breadth-first" search, where every child of the current position is
142checked for solutions before starting to recurse down into the
143children of the children, is a little harder to implement; rather than
144just writing a recursive function that explores the tree, one must
145keep track of a queue of subtrees that will need exploring in future,
146pushing newly-found subtrees onto the back of the queue rather than
147exploring them as soon as they're found. However, breadth-first
148searches will find the solutions nearest to the top of the tree first,
149and will not flounder if they encounter infinite subtrees with no
150solutions in, so they are often more attractive than depth-first.
151
152However, rather than writing a depth-first recursive search function, or a
153breadth-first search function using a queue, you can also implement
154these search problems as a simple function using {{amb}}. One benefit
155of this is that one can swap the {{amb}} implementation used from
156depth-first to breadth-first to choose the best search strategy,
157without changing your program - but that's a side benefit. The real
158benefit is, of course, clearer, more maintainable code - which makes
159the {{amb}} egg worth its memory usage in leaked diplomatic cables!
160
161Let's look at a real example from my sordid past. I once wrote an app
162(in C, alas) to help people navigate the UK's complex welfare system,
163which basically lets people apply to receive money back from the
164government if they meet certain criteria. There's a lot of different
165benefits, each of which with different eligibility rules, and complex
166rules to work out how much benefit you're entitled to based on your
167circumstances. It's a nightmare, and exactly the sort of thing
168computers are good at. So I wrote a program to work it out. Now,
169working out eligibility for a benefit, and how much can be claimed,
170involves asking the user a lot of questions, which is tiresome. Many
171of these questions are shared between different benefits (or different
172branches of the complex flowchart you have to follow to work out some
173of the hairier benefits), and also, many of these questions need only
174be asked if certain paths are explored (questions about your child
175need only be asked if you're looking into benefits that are meant to
176help with the costs of raising children - which are only worth
177exploring at all if you actually have children; and the ones relating
178to pregnancy, well, there's no point in asking about any of them if
179the answer to "What is your biological gender?" was "Male".) We want
180to ask the minimum number of questions we can - so we shouldn't ask
181the same question twice, and we shouldn't ask questions unless we need
182the answers. The first problem can be solved by keeping a database of
183answers, and only asking the question if the desired information isn't
184already in the database; the second problem has to be solved by
185organising the control flow of the process to ask questions that can
186eliminate entire branches of enquiry up-front. Which implies a tree
187search!
188
189As it happens, in C, I wrote a horrible nest of tangled-looking code
190that basically followed all the rules to work out what benefits you
191were entitled to. It worked, but it was hard to maintain - and the
192benefits rules change frequently. Sometimes the order of questions
193being asked or calculations being performed mattered (you needed to
194ask things up front to eliminate possibilities) and sometimes it was
195irrelevant, as I had to put them in some order, and only my comments
196in the code made it clear which was which. A big part of the problem
197was that I had to arrange the computation around the asking of the
198questions; this was fine when we could ask a single question that
199would choose the path to take (if the client is male, don't ask about
200pregnancy), but sometimes we had to work out several benefits that
201were available, working out each case in turn (which was complex, as
202claiming some benefits altered your eligibility for others) to see
203which one would work out best for the client - rejecting any they
204turned out to not be eligible for. The resulting code was messy
205indeed.
206
207But enough prose - let's get to an example. With some code!
208
209I can't remember the details of the UK welfare system as of the late
2101990s, and it was incredibly tedious anyway. So we'll work on a
211fictitious over-simplified example, to get to the heart of the matter
212easily.
213
214Let's say we have these benefits available:
215
216* Low Income Credit, which is payable to people who earn less than
217  £10,000 a year, and whose partner (if the have one) earns less than
218  £15,000 a year. You get given £(30,000 - total income of
219  person/couple) / 10 per year, split into monthly payments.
220
221* Housing Benefit, which is payable to people or couples who earn less
222  than £20,000 a year between them, or £15,000 if they have children,
223  and who live in rented accomodation, and are not claiming Low Income
224  Credit. You get £2,500 a year, in monthly payments, if you have
225  children or earn less than £15,000; £2,000 a year if you do not have
226  children and earn more than £15,000.
227
228* Carer's Allowance, which is payable to people whose partners are so
229  ill that they need help with basic household tasks. The partner must
230  not be earning any income for this to be claimed. It's a flat £1,000
231  a year. However, being an "allowance" rather than a "credit" or a
232  "benefit", it counts as income so may affect other benefits.
233
234* Family Credit, which is available to the parent of a child (as long
235  as the other parent is not also claiming it for the same
236  child). It's a flat £1,000 a year, unless you (and your partner, if
237  you have one) together earn more than £30,000 a year.
238
239Now, on to the code. To avoid asking the same question more than once,
240we can keep a global alist of asked questions:
241
242<enscript highlight="scheme">
243 (use srfi-1)
244
245 (define *asked-questions* '())
246
247 (define (ask question)
248   (let ((existing (assoc question *asked-questions*)))
249     (if existing
250         (cdr existing)
251         (begin
252           (write question) (newline)
253           (let ((answer (read-line)))
254             (set! *asked-questions*
255                   (cons (cons question answer)
256                         *asked-questions*))
257             answer)))))
258</enscript>
259
260As we use an actual global variable, this state will be preserved
261even if we backtrack with {{amb}}.
262
263We can wrap that in a few functions that ask useful questions:
264
265<enscript highlight="scheme">
266 (define (income allowances)
267   (+ allowances (string->number
268      (ask "What is your income, not including any allowances? Enter 0 for none"))))
269
270 (define (has-partner?)
271   (string=? (ask "Do you have a partner?") "yes"))
272
273 (define (partner-is-ill?)
274   (if (has-partner?)
275       (string=? (ask "Does your partner need help around the house?") "yes")
276       #f))
277
278 (define (renting?)
279   (string=? (ask "Do you rent your home?") "yes"))
280
281 (define (partner-income)
282   (if (has-partner?)
283       (string->number
284        (ask "What is your partner's income, including any allowances? Enter 0 for none"))
285       0))
286
287 (define (family-income allowances)
288   (+ (income allowances) (partner-income)))
289
290 (define (num-children)
291   (string->number (ask "How many children do you have?")))
292</enscript>
293
294Now, clearly, "effective income" is the actual earnings of the person
295or couple, plus any "allowances" they receive, and what other benefits
296are being claimed may affect the computation of a benefit. So we will
297phrase our functions to work out the benefits as having an argument
298which is a record giving details of what else they're claiming. We can
299then write our basic benefit calculators as fairly direct translations
300of the rules, using {{amb-assert}} from the amb egg to reject any
301invalid benefits:
302
303<enscript highlight="scheme">
304 (use amb)
305
306 (define-record benefits
307   claimed allowances)
308
309 (define (claim-benefit existing-benefits benefit amount allowance?)
310   (make-benefits
311    (cons (cons benefit amount)
312          (benefits-claimed existing-benefits))
313    (if allowance?
314        (+ amount (benefits-allowances existing-benefits))
315        (benefits-allowances existing-benefits))))
316
317 (define (claiming? existing-benefits benefit)
318   (assq benefit (benefits-claimed existing-benefits)))
319
320 (define (compute-lic other-benefits)
321   (amb-assert (< (income (benefits-allowances other-benefits)) 10000))
322   (amb-assert (not (claiming? other-benefits 'hb)))
323   (if (has-partner?)
324       (amb-assert (< (partner-income) 15000)))
325   (claim-benefit
326    other-benefits
327    'lic (/ (- 30000 (family-income (benefits-allowances other-benefits))) 10) #f))
328
329 (define (compute-hb other-benefits)
330   (if (zero? (num-children))
331       (amb-assert (< (family-income (benefits-allowances other-benefits)) 20000))
332       (amb-assert (< (family-income (benefits-allowances other-benefits)) 25000)))
333   (amb-assert (renting?))
334   (amb-assert (not (claiming? other-benefits 'lic)))
335   (claim-benefit
336    other-benefits
337    'hb (if (and (zero? (num-children)) (> (family-income) 15000))
338            2000
339            2500) #f))
340
341 (define (compute-ca other-benefits)
342   (amb-assert (zero? (partner-income)))
343   (amb-assert (partner-is-ill?))
344   (claim-benefit
345    other-benefits
346    'ca 1000 #t))
347
348 (define (compute-fc other-benefits)
349   (amb-assert (not (zero? (num-children))))
350   (amb-assert (< (family-income (benefits-allowances other-benefits)) 30000))
351   (claim-benefit
352    other-benefits
353    'fc 1000 #f))
354</enscript>
355
356Having done that, we can try out all the possibilities using {{amb}}
357to decide whether we try to claim each benefit or not:
358
359<enscript highlight="scheme">
360 (define (compute-benefits)
361   (let* ((initial-state (make-benefits '() 0))
362          ;; Compute allowances
363          (claim-ca (amb #t #f))
364          (after-ca (if claim-ca (compute-ca initial-state) initial-state))
365          ;; Compute other benefits
366          (claim-fc (amb #t #f))
367          (after-fc (if claim-fc (compute-fc after-ca) after-ca))
368          (claim-hb (amb #t #f))
369          (after-hb (if claim-hb (compute-hb after-fc) after-fc))
370          (claim-lic (amb #t #f))
371          (after-lic (if claim-lic (compute-lic after-hb) after-hb)))
372     after-lic))
373</enscript>
374
375What does {{compute-benefits}} actually do? For each benefit, it
376"splits the world" using {{amb}} into two versions - one where we try
377and claim this benefit, and one where we don't. We compute Carer's
378Allowance first, as it is an allowance which might affect later
379calculations; we can't compute any income-dependent benefits until
380we've decided if we're claiming this one or not, so it has to go
381first. The order of the others was picked arbitrarily.
382
383{{compute-benefits}} then returns the final state of affairs as a
384benefits record, but it will return several times with different
385possible final states. We need to find them all, and pick the one with
386the highest benefits earned. The way to do that is with
387{{amb-collect}}, which takes an expression and makes a list of all the
388results that expression returns in different threads:
389
390<enscript highlight="scheme">
391 (define (total-benefits benefits)
392   (fold (lambda (claim sum-so-far)
393           (+ (cdr claim) sum-so-far))
394         0
395         (benefits-claimed benefits)))
396
397 (define (best-benefits-to-claim)
398   (let ((options
399          (sort
400           (amb-collect (compute-benefits))
401           (lambda (a b)
402             (< (total-benefits b)
403                (total-benefits a))))))
404     (display "Your best option is to claim the following: ")
405     (write (benefits-claimed (car options)))
406     (newline)))
407</enscript>
408
409Running {{(best-benefits-to-claim)}} will ask you some questions (if
410it's the first time you've run it, at any rate) and then suggest how
411much to claim of which benefits:
412
413 #;22> (best-benefits-to-claim)
414 "Do you have a partner?"
415 #;22> no
416 "How many children do you have?"
417 #;22> 2
418 "What is your income, not including any allowances? Enter 0 for none"
419 #;22> 15000
420 "Do you rent your home?"
421 #;22> yes
422 Your best option is to claim the following: ((hb . 2500) (fc . 1000))
423
424You can try again if you clear the {{*asked-questions*}} store:
425
426 #;25> (set! *asked-questions* '())
427 #;26> (best-benefits-to-claim)
428 "Do you have a partner?"
429 #;26> yes
430 "What is your partner's income, including any allowances? Enter 0 for none"
431 #;26> 0
432 "Does your partner need help around the house?"
433 #;26> yes
434 "How many children do you have?"
435 #;26> 2
436 "What is your income, not including any allowances? Enter 0 for none"
437 #;26> 14500
438 "Do you rent your home?"
439 #;26> yes
440 Your best option is to claim the following: ((hb . 2500) (fc . 1000) (ca . 1000))
441
442The resulting code is highly extensible. The entire complexities of
443choosing which benefits to go for are reduced to listing the
444requirements of each benefit inside its definition, using
445{{amb-assert}}s, then stacking up the benefits in the {{let*}} inside
446{{compute-benefits}}. If {{compute-benefits}} became much more
447complex, I'd be inclined to put the benefits functions into two
448lists (one for allowances, one for the rest, to ensure allowances are
449covered first) and iterate through them.
450
451This example is a bit simplified and contrived - imagine each benefit
452has tens to hundreds of computation steps, many of which involve
453asking a question, and many of which depend on the results of previous
454calculations or assumptions; and imagine there are fifteen different
455benefits of varying complexity levels. And then imagine that the rules
456for each benefit change periodically, so you need to minimize the
457amount of duplication of information in your formulation of the rules.
458
459Aren't you glad to be a smug Lisp weenie?
460
461== 5. About the Chicken Gazette
462
463The Gazette is produced weekly by a volunteer from the Chicken
464community. The latest issue can be found at
465[[http://gazette.call-cc.org]] or you can follow it in your feed
466reader at [[http://gazette.call-cc.org/feed.atom]]. If you'd like to
467write an issue, [[http://wiki.call-cc.org/gazette|consult the wiki]]
468for the schedule and instructions!
Note: See TracBrowser for help on using the repository browser.