Changeset 21833 in project for gazette


Ignore:
Timestamp:
12/07/10 02:55:31 (8 years ago)
Author:
alaric
Message:

gazette: Merged in my amb-based omelette recipe

File:
1 edited

Legend:

Unmodified
Added
Removed
  • gazette/src/issues/15.wiki

    r21832 r21833  
    4444== 4. Omelette Recipes
    4545
    46 Alaric's turn.
     46
     47Today we're going to look at the [[egg:amb|amb]] egg. This egg
     48implements the {{amb}} operator, much-loved as an example of the use
     49of continuations to alter control flow in exciting ways, unusual
     50evaluation orders, and a mind-altering image of a world where
     51computers exploit parallel universes or quantum mechanics to explore
     52multiple branches of a recursion.
     53
     54However, {{amb}} is sometimes a useful tool; there's been a few cases
     55where I've wished it was available in projects I've done in other
     56languages, and I'm going to simplify one of them into an example.
     57
     58But first, let's look at what {{amb}} does (and a little about how it
     59works). Basically, {{amb}} is a form (it can be a macro oor a
     60procedure, and the difference in effect is immaterial for our purposes
     61in this recipe) that has a variable number of arguments and, in
     62principle, returns them all in separate "threads"; it can be thought
     63of as a bit like POSIX's {{fork()}}, except lightweight. However, the
     64intent is not to exploit parallelism (most implementations of {{amb}}
     65do not provide any kind of pre-emption; each "thread" runs until it
     66terminates, then lets another run), but to explore different possible
     67control flows. As such, there is a {{amb-assert}} form that, if its
     68argument is #f, kills the current "thread" and runs another.
     69
     70So every time your program performs an {{amb}}, multiple threads pop
     71into existance; and whenever it performs an {{amb-assert}}, some threads
     72are culled. The principle is, whenever you have a point in your
     73program where a choice must be made, you can use {{amb}} to have the
     74program split up and explore every possible choice; and when it turns
     75out, further down the line, to have been the wrong choice, you can
     76kill it, freeing the CPU to explore another choice.
     77
     78This is usually demonstrated with a program that solves a logic puzzle
     79of the form "Peter, Jim, Moritz and Felix are hacking on the Chicken
     80core. Peter is working only on source files written in Scheme. Moritz
     81works only on {{posix.scm}}, whoever works on the expander also works
     82on the compiler, you can't work on the scheduler without also working
     83on {{csi}}, ...and so on... So, who introduced bug #232?". Which is
     84all well and good for an undergraduate programming exercise, but who
     85encounters such problems in real life?
     86
     87Well, the general class of problem is a "tree search problem". The
     88idea is you have some situation that can be modelled as a tree of
     89possibilities (potentially infinite!), with solutions dotted around
     90the tree; and our goal is to find perhaps all the solutions, perhaps
     91just any solution, or perhaps the solution nearest the root of the
     92tree. Practical examples include finding a record in a B-Tree (where
     93the tree structure actually corresponds to physical index blocks on
     94disk), solving logic puzzles (where the tree structure is a purely
     95logical tree of possible situations, some of which may yield
     96solutions), or choosing the best move to make in a game (where the
     97tree represents possible states of the game, and the moves that can
     98move between states).
     99
     100These kinds of algorithms are normally written as recursive functions,
     101which take whatever represents a position on the tree as an argument,
     102check to see if there's a solution there (and handle it
     103appropriately if so, which may involve terminating the search if we've
     104found enough solutions), then work out all the positions below this
     105one in the tree and recursively call the function on each in
     106turn. This forces a "depth-first" search, as the recursion will only
     107botton out if a solution is found (that terminates the search
     108entirely) or a subtree is totally exhausted of options. Going for a
     109"breadth-first" search, where every child of the current position is
     110checked for solutions before starting to recurse down into the
     111children of the children, is a little harder to implement; rather than
     112just writing a recursive function that explores the tree, one must
     113keep track of a queue of subtrees that will need exploring in future,
     114pushing newly-found subtrees onto the back of the queue rather than
     115exploring them as soon as they're found. However, breadth-first
     116searches will find the solutions nearest to the top of the tree first,
     117and will not flounder if they encounter infinite subtrees with no
     118solutions in, so they are often more attractive than depth-first.
     119
     120However, rather than writing a depth-first recursive search function, or a
     121breadth-first search function using a queue, you can also implement
     122these search problems as a simple function using {{amb}}. One benefit
     123of this is that one can swap the {{amb}} implementation used from
     124depth-first to breadth-first to choose the best search strategy,
     125without changing your program - but that's a side benefit. The real
     126benefit is, of course, clearer, more maintainable code - which makes
     127the {{amb}} egg worth its memory usage in leaked diplomatic cables!
     128
     129Let's look at a real example from my sordid past. I once wrote an app
     130(in C, alas) to help people navigate the UK's complex welfare system,
     131which basically lets people apply to receive money back from the
     132government if they meet certain criteria. There's a lot of different
     133benefits, each of which with different eligibility rules, and complex
     134rules to work out how much benefit you're entitled to based on your
     135circumstances. It's a nightmare, and exactly the sort of thing
     136computers are good at. So I wrote a program to work it out. Now,
     137working out eligibility for a benefit, and how much can be claimed,
     138involves asking the user a lot of questions, which is tiresome. Many
     139of these questions are shared between different benefits (or different
     140branches of the complex flowchart you have to follow to work out some
     141of the hairier benefits), and also, many of these questions need only
     142be asked if certain paths are explored (questions about your child
     143need only be asked if you're looking into benefits that are meant to
     144help with the costs of raising children - which are only worth
     145exploring at all if you actually have children; and the ones relating
     146to pregnancy, well, there's no point in asking about any of them if
     147the answer to "What is your biological gender?" was "Male".) We want
     148to ask the minimum number of questions we can - so we shouldn't ask
     149the same question twice, and we shouldn't ask questions unless we need
     150the answers. The first problem can be solved by keeping a database of
     151answers, and only asking the question if the desired information isn't
     152already in the database; the second problem has to be solved by
     153organising the control flow of the process to ask questions that can
     154eliminate entire branches of enquiry up-front. Which implies a tree
     155search!
     156
     157As it happens, in C, I wrote a horrible nest of tangled-looking code
     158that basically followed all the rules to work out what benefits you
     159were entitled to. It worked, but it was hard to maintain - and the
     160benefits rules change frequently. Sometimes the order of questions
     161being asked or calculations being performed mattered (you needed to
     162ask things up front to eliminate possibilities) and sometimes it was
     163irrelevant, as I had to put them in some order, and only my comments
     164in the code made it clear which was which. A big part of the problem
     165was that I had to arrange the computation around the asking of the
     166questions; this was fine when we could ask a single question that
     167would choose the path to take (if the client is male, don't ask about
     168pregnancy), but sometimes we had to work out several benefits that
     169were available, working out each case in turn (which was complex, as
     170claiming some benefits altered your eligibility for others) to see
     171which one would work out best for the client - rejecting any they
     172turned out to not be eligible for. The resulting code was messy
     173indeed.
     174
     175But enough prose - let's get to an example. With some code!
     176
     177I can't remember the details of the UK welfare system as of the late
     1781990s, and it was incredibly tedious anyway. So we'll work on a
     179fictitious over-simplified example, to get to the heart of the matter
     180easily.
     181
     182Let's say we have these benefits available:
     183
     184* Low Income Credit, which is payable to people who earn less than
     185  £10,000 a year, and whose partner (if the have one) earns less than
     186  £15,000 a year. You get given £(30,000 - total income of
     187  person/couple) / 10 per year, split into monthly payments.
     188
     189* Housing Benefit, which is payable to people or couples who earn less
     190  than £20,000 a year between them, or £15,000 if they have children,
     191  and who live in rented accomodation, and are not claiming Low Income
     192  Credit. You get £2,500 a year, in monthly payments, if you have
     193  children or earn less than £15,000; £2,000 a year if you do not have
     194  children and earn more than £15,000.
     195
     196* Carer's Allowance, which is payable to people whose partners are so
     197  ill that they need help with basic household tasks. The partner must
     198  not be earning any income for this to be claimed. It's a flat £1,000
     199  a year. However, being an "allowance" rather than a "credit" or a
     200  "benefit", it counts as income so may affect other benefits.
     201
     202* Family Credit, which is available to the parent of a child (as long
     203  as the other parent is not also claiming it for the same
     204  child). It's a flat £1,000 a year, unless you (and your partner, if
     205  you have one) together earn more than £30,000 a year.
     206
     207Now, on to the code. To avoid asking the same question more than once,
     208we can keep a global alist of asked questions:
     209
     210<enscript highlight="scheme">
     211 (use srfi-1)
     212
     213 (define *asked-questions* '())
     214
     215 (define (ask question)
     216   (let ((existing (assoc question *asked-questions*)))
     217     (if existing
     218         (cdr existing)
     219         (begin
     220           (write question) (newline)
     221           (let ((answer (read-line)))
     222             (set! *asked-questions*
     223                   (cons (cons question answer)
     224                         *asked-questions*))
     225             answer)))))
     226</enscript>
     227
     228As we use an actual global variable, this state will be preserved
     229even if we backtrack with {{amb}}.
     230
     231We can wrap that in a few functions that ask useful questions:
     232
     233<enscript highlight="scheme">
     234 (define (income allowances)
     235   (+ allowances (string->number
     236      (ask "What is your income, not including any allowances? Enter 0 for none"))))
     237
     238 (define (has-partner?)
     239   (string=? (ask "Do you have a partner?") "yes"))
     240
     241 (define (partner-is-ill?)
     242   (if (has-partner?)
     243       (string=? (ask "Does your partner need help around the house?") "yes")
     244       #f))
     245
     246 (define (renting?)
     247   (string=? (ask "Do you rent your home?") "yes"))
     248
     249 (define (partner-income)
     250   (if (has-partner?)
     251       (string->number
     252        (ask "What is your partner's income, including any allowances? Enter 0 for none"))
     253       0))
     254
     255 (define (family-income allowances)
     256   (+ (income allowances) (partner-income)))
     257
     258 (define (num-children)
     259   (string->number (ask "How many children do you have?")))
     260</enscript>
     261
     262Now, clearly, "effective income" is the actual earnings of the person
     263or couple, plus any "allowances" they receive, and what other benefits
     264are being claimed may affect the computation of a benefit. So we will
     265phrase our functions to work out the benefits as having an argument
     266which is a record giving details of what else they're claiming. We can
     267then write our basic benefit calculators as fairly direct translations
     268of the rules, using {{amb-assert}} from the amb egg to reject any
     269invalid benefits:
     270
     271<enscript highlight="scheme">
     272 (use amb)
     273
     274 (define-record benefits
     275   claimed allowances)
     276
     277 (define (claim-benefit existing-benefits benefit amount allowance?)
     278   (make-benefits
     279    (cons (cons benefit amount)
     280          (benefits-claimed existing-benefits))
     281    (if allowance?
     282        (+ amount (benefits-allowances existing-benefits))
     283        (benefits-allowances existing-benefits))))
     284
     285 (define (claiming? existing-benefits benefit)
     286   (assq benefit (benefits-claimed existing-benefits)))
     287
     288 (define (compute-lic other-benefits)
     289   (amb-assert (< (income (benefits-allowances other-benefits)) 10000))
     290   (amb-assert (not (claiming? other-benefits 'hb)))
     291   (if (has-partner?)
     292       (amb-assert (< (partner-income) 15000)))
     293   (claim-benefit
     294    other-benefits
     295    'lic (/ (- 30000 (family-income (benefits-allowances other-benefits))) 10) #f))
     296
     297 (define (compute-hb other-benefits)
     298   (if (zero? (num-children))
     299       (amb-assert (< (family-income (benefits-allowances other-benefits)) 20000))
     300       (amb-assert (< (family-income (benefits-allowances other-benefits)) 25000)))
     301   (amb-assert (renting?))
     302   (amb-assert (not (claiming? other-benefits 'lic)))
     303   (claim-benefit
     304    other-benefits
     305    'hb (if (and (zero? (num-children)) (> (family-income) 15000))
     306            2000
     307            2500) #f))
     308
     309 (define (compute-ca other-benefits)
     310   (amb-assert (zero? (partner-income)))
     311   (amb-assert (partner-is-ill?))
     312   (claim-benefit
     313    other-benefits
     314    'ca 1000 #t))
     315
     316 (define (compute-fc other-benefits)
     317   (amb-assert (not (zero? (num-children))))
     318   (amb-assert (< (family-income (benefits-allowances other-benefits)) 30000))
     319   (claim-benefit
     320    other-benefits
     321    'fc 1000 #f))
     322</enscript>
     323
     324Having done that, we can try out all the possibilities using {{amb}}
     325to decide whether we try to claim each benefit or not:
     326
     327<enscript highlight="scheme">
     328 (define (compute-benefits)
     329   (let* ((initial-state (make-benefits '() 0))
     330          ;; Compute allowances
     331          (claim-ca (amb #t #f))
     332          (after-ca (if claim-ca (compute-ca initial-state) initial-state))
     333          ;; Compute other benefits
     334          (claim-fc (amb #t #f))
     335          (after-fc (if claim-fc (compute-fc after-ca) after-ca))
     336          (claim-hb (amb #t #f))
     337          (after-hb (if claim-hb (compute-hb after-fc) after-fc))
     338          (claim-lic (amb #t #f))
     339          (after-lic (if claim-lic (compute-lic after-hb) after-hb)))
     340     after-lic))
     341</enscript>
     342
     343What does {{compute-benefits}} actually do? For each benefit, it
     344"splits the world" using {{amb}} into two versions - one where we try
     345and claim this benefit, and one where we don't. We compute Carer's
     346Allowance first, as it is an allowance which might affect later
     347calculations; we can't compute any income-dependent benefits until
     348we've decided if we're claiming this one or not, so it has to go
     349first. The order of the others was picked arbitrarily.
     350
     351{{compute-benefits}} then returns the final state of affairs as a
     352benefits record, but it will return several times with different
     353possible final states. We need to find them all, and pick the one with
     354the highest benefits earned. The way to do that is with
     355{{amb-collect}}, which takes an expression and makes a list of all the
     356results that expression returns in different threads:
     357
     358<enscript highlight="scheme">
     359 (define (total-benefits benefits)
     360   (fold (lambda (claim sum-so-far)
     361           (+ (cdr claim) sum-so-far))
     362         0
     363         (benefits-claimed benefits)))
     364
     365 (define (best-benefits-to-claim)
     366   (let ((options
     367          (sort
     368           (amb-collect (compute-benefits))
     369           (lambda (a b)
     370             (< (total-benefits b)
     371                (total-benefits a))))))
     372     (display "Your best option is to claim the following: ")
     373     (write (benefits-claimed (car options)))
     374     (newline)))
     375</enscript>
     376
     377Running {{(best-benefits-to-claim)}} will ask you some questions (if
     378it's the first time you've run it, at any rate) and then suggest how
     379much to claim of which benefits:
     380
     381 #;22> (best-benefits-to-claim)
     382 "Do you have a partner?"
     383 #;22> no
     384 "How many children do you have?"
     385 #;22> 2
     386 "What is your income, not including any allowances? Enter 0 for none"
     387 #;22> 15000
     388 "Do you rent your home?"
     389 #;22> yes
     390 Your best option is to claim the following: ((hb . 2500) (fc . 1000))
     391
     392You can try again if you clear the {{*asked-questions*}} store:
     393
     394 #;25> (set! *asked-questions* '())
     395 #;26> (best-benefits-to-claim)
     396 "Do you have a partner?"
     397 #;26> yes
     398 "What is your partner's income, including any allowances? Enter 0 for none"
     399 #;26> 0
     400 "Does your partner need help around the house?"
     401 #;26> yes
     402 "How many children do you have?"
     403 #;26> 2
     404 "What is your income, not including any allowances? Enter 0 for none"
     405 #;26> 14500
     406 "Do you rent your home?"
     407 #;26> yes
     408 Your best option is to claim the following: ((hb . 2500) (fc . 1000) (ca . 1000))
     409
     410The resulting code is highly extensible. The entire complexities of
     411choosing which benefits to go for are reduced to listing the
     412requirements of each benefit inside its definition, using
     413{{amb-assert}}s, then stacking up the benefits in the {{let*}} inside
     414{{compute-benefits}}. If {{compute-benefits}} became much more
     415complex, I'd be inclined to put the benefits functions into two
     416lists (one for allowances, one for the rest, to ensure allowances are
     417covered first) and iterate through them.
     418
     419This example is a bit simplified and contrived - imagine each benefit
     420has tens to hundreds of computation steps, many of which involve
     421asking a question, and many of which depend on the results of previous
     422calculations or assumptions; and imagine there are fifteen different
     423benefits of varying complexity levels. And then imagine that the rules
     424for each benefit change periodically, so you need to minimize the
     425amount of duplication of information in your formulation of the rules.
     426
     427Aren't you glad to be a smug Lisp weenie?
    47428
    48429== 5. About the Chicken Gazette
Note: See TracChangeset for help on using the changeset viewer.