source: project/gazette/src/issues/13.wiki @ 21514

Last change on this file since 21514 was 21514, checked in by sjamaan, 10 years ago

Add initial version of Gazette 13, with the omelette recipe since I don't have time to write it until sunday evening

File size: 14.9 KB
Line 
1((title . "Issue 13")
2 (authors "Peter Bex")
3 (date . 1290429658))
4
5== 0. Introduction
6
7Welcome to issue 13 of the Chicken Gazette!
8
9== 1. The Hatching Farm
10
11* [[user:sjamaan|Peter Bex]] released a fix for a problem in the [[egg:postgresql|postgresql]] egg with the builtin "name" datatype, as well as a problem with NUL bytes in byte-arrays.
12
13
14== 2. Core development
15
16Felix has added a workaround to the scheduler for a problem with
17threads that accidentally get blocked on a filedescriptor but are not
18marked as blocked for I/O, which may help fix a few race conditions.
19
20Felix has both fixed the {{,g}} debugging toplevel command and added
21some more feedback to it so it shows the variable being fetched.
22
23== 3. Chicken Talk
24
25Just before the previous gazette was issued, Nicolas Pelletier posted [[http://lists.nongnu.org/archive/html/chicken-users/2010-11/msg00101.html|a follow-up]] to [[http://lists.nongnu.org/archive/html/chicken-users/2010-07/msg00136.html|an old thread of his]] about an overflow problem with the {{current-time}} procedure. He reported that he had tested the changes Felix made in the "flonum-milliseconds" branch, and they were working perfectly for him.
26
27== 4. Omelette Recipes
28
29With this week's omelette recipe we'll dip our toes in some of the more advanced uses of the insanely great [[http://www.postgresql.org|PostgreSQL]] database.  With the [[egg:postgresql|postgresql egg]] I've been trying to unlock some advanced features through a high-level API, so it becomes a breeze to use them!
30
31But let's start simple. First, we create a database and matching user
32account so we can mess around with impunity:
33
34  $ psql -U postgres  # or psql -U pgsql postgres (depending on OS/package)
35  psql (8.4.4)
36  Type "help" for help.
37 
38  postgres=# CREATE USER gazette WITH PASSWORD 'whatever';
39  CREATE ROLE
40  postgres=# CREATE DATABASE gazette WITH OWNER gazette;
41  CREATE DATABASE
42
43If the server is running, but the connection cannot be established,
44you might need to configure {{pg_hba.conf}} to allow
45connections. Please see the
46[[http://www.postgresql.org/docs/current/interactive/auth-pg-hba-conf.html|pg_hba.conf section of the Postgres manual]] for more info.  The easiest is to
47set "trust" for "local" connections, assuming you can secure access to
48the unix domain socket file.
49
50If you ever need a reminder about SQL syntax, you can type "{{\h CREATE USER}}"
51or even just "{{\h CREATE}}".
52
53Now, let's fire up Chicken and connect to the database:
54
55<enscript highlight="scheme">
56(use postgresql)
57(define con (connect '((dbname . "gazette")
58                       (user . "gazette")
59                       (password . "whatever"))))
60</enscript>
61
62The alist you pass to the {{connect}} procedure can contain all the
63options described in the [[http://www.postgresql.org/docs/current/interactive/libpq-connect.html|list of PQconnectdbParams parameter keywords]].  You can also
64pass a connection string as accepted by {{PQconnectdb}}, which is just a space-separated string that looks like this: {{"dbname=gazette user=gazette password=whatever"}}
65
66Let us assume we're going to store daily average temperature
67measurements as part of a weather service. Assuming we have several
68stations across the nation that measure the temperature and we don't
69want to mess around with datetime types (as those are somewhat
70problematic in Chicken in my opinion), we could set up a table like
71this:
72
73<enscript highlight="scheme">
74(query con (conc "CREATE TABLE weather "
75                 "( year integer, "
76                 "  month integer, "
77                 "  day integer, "
78                 "  avg_temps float[], "
79                 "  PRIMARY KEY (year, month, day))"))
80</enscript>
81
82'''Disclaimer''': Note that the datatypes we're going to define are
83probably not the best way to deal with this dataset.  It's intended as
84just a simple demonstration.  Good database design is still an art,
85and what's good design depends on how you're going to use your data!
86Rule of thumb: think of the queries you're going to do most often and
87optimise for those.  It's just like choosing datatypes or algorithms
88in a programming language...
89
90The temperatures column contains arrays of floating point
91numbers. This maps to a vector in Chicken:
92
93<enscript highlight="scheme">
94;; Store temperatures at five measuring stations on 22 November
95(query con (conc "INSERT INTO weather (year, month, day, avg_temps) "
96                 "VALUES ($1, $2, $3, $4)")
97           2010 11 22 (vector 6 7.5 5.2 4.6 5.4))
98</enscript>
99
100The dollar signs are placeholders, on which the positional value
101corresponding to the number after the dollar gets inserted.  These
102values are the extra arguments to {{query}} following the query
103string.  The library is smart enough to know how to convert vectors to
104Postgres native arrays.  By the way, always use placeholders like
105this.  Never ''ever'' splice values directly into the query string!
106You'd risk a visit from [[http://www.bobby-tables.com|little Bobby Tables]]...
107
108When we request our data back into Scheme, we get a vector again:
109
110<enscript highlight="scheme">
111(value-at (query con (conc "SELECT avg_temps "
112                           "FROM weather "
113                           "WHERE year = $1 AND month = $2 AND day = $3")
114                     2010 11 22))
115=>
116#(6 7.5 5.2 4.6 5.4)
117</enscript>
118
119[[http://3e8.org/chickadee/postgresql#def:value-at|value-at]] can return
120any value located at any row or column from a result set matrix, but by
121default, it'll pick the first column of the first row.
122
123Now let's suppose that we want to store not only the average
124temperature, but also the amount of rain in millimeters that fell that day.
125We could just add another column, but let's say we prefer to keep the data
126provided by each measuring station together. In order to do so, we'll
127create a custom
128[[http://www.postgresql.org/docs/current/interactive/rowtypes.html|"composite type"]]:
129
130<enscript highlight="scheme">
131(query con "CREATE type weatherdata AS ( rainfall float, avg_temp float )")
132</enscript>
133
134And now we'll need to convert our existing data to make use of this
135new type.  For safety, we'll wrap it in a transaction so if we make a
136mistake we won't end up with a half-converted or corrupted table.  If
137you don't grok the UPDATE query, just skip it and assume it converts
138properly from the old to the new representation, using NULL values for
139the rainfall of the converted records.
140
141<enscript highlight="scheme">
142(with-transaction con
143  (lambda ()
144    (query con "ALTER TABLE weather ADD COLUMN data weatherdata[]")
145    (query con (conc "UPDATE weather w "
146                     ;; Set the data to a generated row array:
147                     "SET data=arrays.r "
148                     "FROM (SELECT array_agg(ROW(NULL, t[i])::weatherdata) AS r, "
149                     "             flat.year, flat.month, flat.day "
150
151                     ;; This next SELECT returns the valid indices for each
152                     ;; array and that array, so you'd get something like this:
153                     ;;
154                     ;;  1 | {a, b}    | year | month | day
155                     ;;  2 | {a, b}    | year | month | day
156                     ;;  1 | {c, d, e} | year | month | day
157                     ;;  2 | {c, d, e} | year | month | day
158                     ;;  3 | {c, d, e} | year | month | day
159                     ;; ... etc ...
160
161                     "      FROM (SELECT generate_subscripts(avg_temps, 1) i, "
162                     "                   avg_temps t, year, month, day "
163                     "            FROM weather "
164                     "            ORDER BY i) AS flat "
165                     "      GROUP BY flat.year, flat.month, flat.day) AS arrays "
166                     
167                     ;; Match up the data belonging to the row being updated
168                     "WHERE arrays.year = w.year "
169                     "  AND arrays.month = w.month "
170                     "  AND arrays.day = w.day"))
171    (query con "ALTER TABLE weather DROP COLUMN avg_temps")))
172</enscript>
173
174It's a little convoluted, but that's one of the disadvantages of
175working with arrays; all SQL operations are set-based so you will need
176to jump through some hoops to convert back and forth.
177
178The {{with-transaction}} protects us from stupid mistakes. If an
179exception is thrown inside, it will automatically roll back the
180transaction, and since Postgres supports transactions almost
181everywhere, the alter table and update statements will be undone.
182Anyone else using the table in the meanwhile will not even notice
183anything was going on with it.
184
185Anyway, we can now add some new values to the database for the next day,
186and retrieve what we have so far:
187
188<enscript highlight="scheme">
189(query con (conc "INSERT INTO weather (year, month, day, data)"
190                 "VALUES ($1, $2, $3, $4)")
191           2010 11 23 '#((11 4.2) (8.3 5.8) (6.0 7.1) (7.3 6.2) (12 4)))
192(row-fold-right cons '() (query con (conc "SELECT year, month, day, data "
193                                          "FROM weather ORDER BY day")))
194=>
195((2010 11 22 #((#<sql-null-type> 6)
196               (#<sql-null-type> 7.5)
197               (#<sql-null-type> 5.2)
198               (#<sql-null-type> 4.6)
199               (#<sql-null-type> 5.4)))
200 (2010 11 23 #((4.2 11) (5.8 8.3) (7.1 6) (6.2 7.3) (4 12))))
201</enscript>
202
203{{Row-fold-right}} simply performs a right fold of the supplied
204procedure over the resulting rows, with the column values provided in
205a list, in the order of the SELECT list's fields.
206
207Now, the composite values from our weatherdata type are represented as
208ordinary lists.  The position in the list corresponds to the position
209in the "ROW" type's constructor syntax, which itself corresponds to
210the ordering of the types in the {{CREATE TYPE}} statement we entered
211earlier.  Of course this is not the most convenient way to deal with
212types, so let's improve the situation by creating an abstract type in
213Scheme to match the one in SQL, using the [[egg:defstruct|defstruct egg]]:
214
215<enscript highlight="scheme">
216(use defstruct)
217(defstruct weather-data rainfall average-temperature)
218
219;; For debugging purposes:
220(define-record-printer (weather-data w out)
221  (fprintf out "{rain: ~A temp: ~A}"
222               (weather-data-rainfall w)
223               (weather-data-average-temperature w)))
224</enscript>
225
226This struct somehow needs to be serialized to an SQL row value (you can look
227up the exact expected syntax in the [[http://www.postgresql.org/docs/current/interactive/rowtypes.html|Composite Value Input section]] of the Postgres manual):
228
229<enscript highlight="scheme">
230(define (weather-data-unparser conn w)
231  (sprintf "(~A, ~A)"
232           (weather-data-rainfall w)
233           (weather-data-average-temperature w)))
234</enscript>
235
236In case of more complex types, you can also re-use the
237[[http://3e8.org/chickadee/postgresql#def:list-unparser|list-unparser procedure]],
238which simply unparses all elements in a list.  This is highly recommended when
239you are unparsing string values, since those may contain characters which
240have special meaning in the ROW syntax.  That would look like this:
241
242<enscript highlight="scheme">
243(define (weather-data-unparser conn w)
244  (list-unparser conn (list (weather-data-rainfall w)
245                            (weather-data-average-temperature w))))
246</enscript>
247
248While we're at it, let's also write a weather-data parser which can
249extract a weather data object from a string describing a ROW returned
250by the database:
251
252<enscript highlight="scheme">
253(define weather-data-parser
254  (let ((constituent-parser (make-composite-parser
255                              (list numeric-parser numeric-parser))))
256    (lambda (str)
257      (let ((l (constituent-parser str)))
258        (make-weather-data rainfall: (car l)
259                           average-temperature: (cadr l))))))
260</enscript>
261
262And now we hook these into the connection object:
263
264<enscript highlight="scheme">
265(update-type-parsers! con
266  (cons (cons "weatherdata" weather-data-parser)
267        (type-parsers con)))
268
269(update-type-unparsers! con
270  (cons (cons weather-data? weather-data-unparser)
271        (type-unparsers con)))
272</enscript>
273
274The first part says the "weatherdata" type (which you can find in the
275{{pg_type}} system table) can be parsed by the {{weather-data-parser}}
276procedure.  Some magic ensures that parsers for arrays and other
277composite types containing elements of this type are available.  If
278you need to, you could override the parser for weatherdata arrays by
279providing a parser for the type called "_weatherdata".
280
281The second part says that when {{query}} receives an argument for
282which the {{weather-data?}} predicate returns {{#t}}, it should be
283unparsed by the {{weather-data-unparser}} procedure.
284
285Finally, we can work with a more readable (if more verbose) interface:
286
287<enscript highlight="scheme">
288(query con (conc "INSERT INTO weather (year, month, day, data)"
289                 "VALUES ($1, $2, $3, $4)")
290           2010 11 24 (vector (make-weather-data rainfall: 12
291                                                 average-temp: 6)
292                              (make-weather-data rainfall: 11.2
293                                                 average-temp: 4)
294                              (make-weather-data rainfall: 10.1
295                                                 average-temp: 4.7)
296                              (make-weather-data rainfall: 9.4
297                                                 average-temp: 2.5)
298                              (make-weather-data rainfall: 11
299                                                 average-temp: 4.3)))
300(row-fold-right cons '() (query con (conc "SELECT year, month, day, data "
301                                          "FROM weather ORDER BY day")))
302=>
303((2010 11 22 #({rain: #<sql-null-type> temp: 6}
304               {rain: #<sql-null-type> temp: 7.5}
305               {rain: #<sql-null-type> temp: 5.2}
306               {rain: #<sql-null-type> temp: 4.6}
307               {rain: #<sql-null-type> temp: 5.4}))
308 (2010 11 23 #({rain: 11 temp: 4.2}
309               {rain: 8.3 temp: 5.8}
310               {rain: 6 temp: 7.1}
311               {rain: 7.3 temp: 6.2}
312               {rain: 12 temp: 4}))
313 (2010 11 24 #({rain: 12 temp: 6}
314               {rain: 11.2 temp: 4}
315               {rain: 10.1 temp: 4.7}
316               {rain: 9.4 temp: 2.5}
317               {rain: 11 temp: 4.3})))
318</enscript>
319
320If you want the parsers to be available for all connections, just
321parameterize the {{default-type-parsers}} and
322{{default-type-unparsers}}.
323
324Whew!  This omelette recipe grew a little longer than I expected it to
325become, but I hope it shows that PostgreSQL is an extensible database,
326and as such can be used a lot like Lispy languages; you can extend it
327to fit your program's problem domain, and best of all: you can
328integrate those database-extensions right into Chicken!
329
330For those who are motivated to go
331[[http://www.postgresql.org/docs/current/interactive/sql-createtype.html|further down the rabbit hole]],
332you can try to figure out how to define type ''hierarchies'' or custom
333''base types''; atomic types which have their own custom "reader
334syntax" inside Postgres.  You'd write those readers/writers in C or
335another language embedded in Postgres.  Any takers for making Chicken
336one of those?
337
338== 5. About the Chicken Gazette
339
340The Gazette is produced weekly by a volunteer from the Chicken
341community. The latest issue can be found at
342[[http://gazette.call-cc.org]] or you can follow it in your feed
343reader at [[http://gazette.call-cc.org/feed.atom]]. If you'd like to
344write an issue,
345[[http://bugs.call-cc.org/browser/gazette/README.txt|check out the instructions]]
346and come and find us in #chicken on Freenode!
Note: See TracBrowser for help on using the repository browser.