[[tags:egg]]
[[toc:]]
== mpi

Message Passing Interface (MPI)


== Documentation

[[http://www.mpi-forum.org/|MPI]] is a popular standard for
distributed-memory parallel programming. It offers both point-to-point
message passing and group communication operations (broadcast,
scatter/gather, etc.).

[[http://www.open-mpi.org/|Open MPI]] is an implementation of the MPI
standard that combines technologies and resources from several other
projects (FT-MPI, LA-MPI, LAM/MPI, and PACX-MPI) in order to build the
best MPI library available.

The Chicken MPI egg provides a Scheme interface to a large subset of
the MPI 1.2 procedures for communication. It is based on the
[[http://forge.ocamlcore.org/projects/ocamlmpi/|Ocaml MPI]]
library by Xavier Leroy. The {{mpi}} library has been tested with Open
MPI versions 1.2.4 - 1.10.1.

=== Initialization and time procedures


<procedure>MPI:init :: [ARG1 ...] -> UNDEFINED</procedure>

Initializes the MPI execution environment. This routine must be called
before any other MPI routine. MPI can be initialized at most once.



<procedure>MPI:spawn :: COMMAND * ARGUMENTS * MAXPROCS * LOCATIONS * ROOT * COMM -> (COMM * S32VECTOR)</procedure>

Spawns {{MAXPROCS}} identical copies of the MPI program specified by
{{COMMAND}} and returns an intercommunicator and a vector of status
values. {{ARGUMENTS}} is a list of command-line
arguments. {{LOCATIONS}} is a list of string pairs {{(HOST * WDIR)}}
that tell MPI the host and working directory in which to start each
process.

<procedure>MPI:finalize</procedure>

Terminates the MPI execution environment.

<procedure>MPI:wtime :: VOID -> SECONDS</procedure>

Returns the elapsed wall-clock time, in seconds, on
the calling process.
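
The following is a minimal sketch (not from the egg's test suite) that
times a computation with {{MPI:wtime}}; the busy loop is just a
stand-in for real work:

<enscript highlight="scheme">
;; Minimal timing sketch.  Run with e.g.: mpirun -np 2 csi -s wtime.scm
(import scheme (chicken base) (chicken format) mpi)

(MPI:init)

(define t0 (MPI:wtime))

;; Stand-in workload: sum the integers below one million.
(let loop ((i 0) (acc 0))
  (if (< i 1000000) (loop (+ i 1) (+ acc i)) acc))

(define t1 (MPI:wtime))
(printf "rank ~a: elapsed ~a seconds~%"
        (MPI:comm-rank (MPI:get-comm-world)) (- t1 t0))

(MPI:finalize)
</enscript>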


=== Handling of communicators


<procedure>MPI:comm? :: OBJ -> BOOL</procedure>

Returns true if {{OBJ}} is an MPI communicator object, false otherwise.

<procedure>MPI:get-comm-world :: VOID -> COMM</procedure>

Returns the default communicator created by {{MPI_Init}}; the group
associated with this communicator contains all processes.

<procedure>MPI:comm-size :: COMM -> INTEGER</procedure>

Returns the size of the group associated with communicator {{COMM}}.


<procedure>MPI:comm-rank :: COMM -> INTEGER</procedure>

Returns the rank of the calling process in communicator {{COMM}}.


<procedure>MPI:comm-equal? :: COMM1 * COMM2 -> BOOL</procedure>

Returns true if the two given communicators are for identical groups, false otherwise.


<procedure>MPI:comm-split :: COMM * COLOR * KEY -> COMM</procedure>

Creates new communicators by splitting the group of {{COMM}} into
disjoint subgroups: processes that supply the same {{COLOR}} are
placed in the same new communicator, ordered within it by {{KEY}}.
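
Below is a sketch of {{MPI:comm-split}} that partitions the world
communicator into even-rank and odd-rank sub-communicators:

<enscript highlight="scheme">
;; Split the world communicator by rank parity.
;; Run with: mpirun -np 4 csi -s split.scm
(import scheme (chicken base) (chicken format) mpi)

(MPI:init)

(define comm-world (MPI:get-comm-world))
(define myrank     (MPI:comm-rank comm-world))

;; Same color -> same sub-communicator; the rank doubles as the key,
;; so the original ordering is preserved within each subgroup.
(define subcomm (MPI:comm-split comm-world (modulo myrank 2) myrank))

(printf "world rank ~a -> sub rank ~a of ~a~%"
        myrank (MPI:comm-rank subcomm) (MPI:comm-size subcomm))

(MPI:finalize)
</enscript>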


<procedure>MPI:comm-create :: COMM * GROUP -> COMM</procedure>

Creates a new communicator whose communication group spans all
processes in {{GROUP}}, with a new context. See the procedures in
subsection ''Handling of communication groups'' for information on how
to create process group objects.



<procedure>MPI:make-cart :: COMM * DIMS * PERIODS * REORDER -> COMM</procedure>

Creates a new communicator with Cartesian topology
information. Argument {{DIMS}} is an SRFI-4 s32vector that contains
the number of processes in each dimension of the Cartesian grid (its
length is the number of dimensions). Argument {{PERIODS}}
is an SRFI-4 s32vector of the same length as {{DIMS}} that indicates
if the grid is periodic (1) or not (0) in each dimension. Argument
{{REORDER}} is a boolean value that indicates whether process ranking
may be reordered.



<procedure>MPI:make-dims :: NNODES * NDIMS -> DIMS</procedure>

Creates a division of processes in a Cartesian grid. Argument
{{NNODES}} is the number of nodes in the grid. Argument {{NDIMS}} is
the number of Cartesian dimensions. The return value is an SRFI-4
s32vector.



<procedure>MPI:cart-coords :: COMM * RANK -> COORDS</procedure>

Determines process coordinates in Cartesian topology, given a rank in
the group. The return value is an SRFI-4 s32vector of length {{NDIMS}}
(the number of dimensions in the Cartesian topology).
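
Here is a sketch tying these together: {{MPI:make-dims}} picks a
balanced 2-D grid, {{MPI:make-cart}} builds the Cartesian
communicator, and {{MPI:cart-coords}} reports this process's
coordinates:

<enscript highlight="scheme">
;; Cartesian topology sketch.  Run with: mpirun -np 4 csi -s cart.scm
(import scheme (chicken base) (chicken format) srfi-4 mpi)

(MPI:init)

(define comm-world (MPI:get-comm-world))
(define size       (MPI:comm-size comm-world))
(define myrank     (MPI:comm-rank comm-world))

;; Let MPI divide `size' processes over two dimensions,
;; e.g. #s32(2 2) for four processes.
(define dims (MPI:make-dims size 2))

;; Non-periodic in both dimensions; no rank reordering.
(define cart (MPI:make-cart comm-world dims (s32vector 0 0) #f))

(printf "rank ~a has grid coordinates ~a~%"
        myrank (MPI:cart-coords cart myrank))

(MPI:finalize)
</enscript>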


=== Handling of communication groups


<procedure>MPI:group? :: OBJ -> BOOL</procedure>

Returns true if {{OBJ}} is an MPI group object, false otherwise.



<procedure>MPI:comm-group :: COMM -> GROUP</procedure>
Returns the group associated with the given communicator.


<procedure>MPI:group-size :: GROUP -> INTEGER</procedure>
Returns the size of the group {{GROUP}}.


<procedure>MPI:group-rank :: GROUP -> INTEGER</procedure>
Returns the rank of the calling process in the given group.


<procedure>MPI:group-translate-ranks :: GROUP1 * RANKS * GROUP2 -> RANKS2</procedure>

Translates the ranks of processes in one group to those in another
group. The return value is an SRFI-4 s32vector.


<procedure>MPI:group-union :: GROUP1 * GROUP2 -> GROUP</procedure>

Returns a new group containing all members of {{GROUP1}}, followed by
the members of {{GROUP2}} that are not in {{GROUP1}}.


<procedure>MPI:group-difference :: GROUP1 * GROUP2 -> GROUP</procedure>

Returns a new group containing those members of {{GROUP1}} that are
not in {{GROUP2}}.


<procedure>MPI:group-intersection :: GROUP1 * GROUP2 -> GROUP</procedure>

Returns a new group containing the members common to {{GROUP1}} and
{{GROUP2}}.


<procedure>MPI:group-incl :: GROUP * RANKS -> GROUP</procedure>

Produces a group by reordering an existing group and taking only
members with the given ranks. Argument {{RANKS}} is an SRFI-4
s32vector.



<procedure>MPI:group-excl :: GROUP * RANKS -> GROUP</procedure>

Produces a group by reordering an existing group and taking only
members that do not have the given ranks. Argument {{RANKS}} is an
SRFI-4 s32vector.
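
A sketch of group handling: take the world group, keep only the even
ranks with {{MPI:group-incl}}, and build a communicator over the
subgroup with {{MPI:comm-create}}.  What {{MPI:comm-create}} returns
on processes outside the group is not documented here, so the sketch
only uses the result on members:

<enscript highlight="scheme">
;; Build a communicator over the even-ranked processes.
;; Run with: mpirun -np 4 csi -s groups.scm
(import scheme (chicken base) (chicken format) srfi-4 mpi)

(MPI:init)

(define comm-world (MPI:get-comm-world))
(define size       (MPI:comm-size comm-world))
(define myrank     (MPI:comm-rank comm-world))

;; Ranks 0, 2, 4, ... as an SRFI-4 s32vector.
(define even-ranks
  (list->s32vector
    (let loop ((i 0) (acc '()))
      (if (>= i size) (reverse acc) (loop (+ i 2) (cons i acc))))))

(define even-group (MPI:group-incl (MPI:comm-group comm-world) even-ranks))
(define even-comm  (MPI:comm-create comm-world even-group))

(when (even? myrank)
  (printf "world rank ~a is rank ~a among the even ranks~%"
          myrank (MPI:comm-rank even-comm)))

(MPI:finalize)
</enscript>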

=== MPI datatypes

<procedure>MPI:datatype? :: OBJ -> BOOL</procedure>

Returns true if {{OBJ}} is an MPI datatype object, false otherwise.

<procedure>MPI:type-extent :: DATATYPE -> (EXTENT LB)</procedure>

Returns the extent and lower bound of an MPI data type.

<procedure>MPI:type-size :: DATATYPE -> INT</procedure>

Returns the size of a datatype.


* MPI:type-char
* MPI:type-int
* MPI:type-fixnum
* MPI:type-flonum
* MPI:type-byte
* MPI:type-s8
* MPI:type-u8
* MPI:type-s16
* MPI:type-u16
* MPI:type-s32
* MPI:type-u32
* MPI:type-f32
* MPI:type-f64

Predefined MPI data types.

<procedure>MPI:make-type-struct :: FIELD-COUNT * BLOCK-LENS * FIELDTYS -> DATATYPE</procedure>

Given a field count, field lengths and field types, creates and
returns a new MPI structure data type with the given fields.
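
A hedged sketch of a derived datatype: a record with one int field and
two double fields.  The exact representation expected for
{{BLOCK-LENS}} and {{FIELDTYS}} (lists are assumed here) should be
checked against the egg's test suite:

<enscript highlight="scheme">
;; Derived datatype sketch: a struct of one int and two doubles.
(import scheme (chicken base) mpi)

(MPI:init)

;; Three fields, each a single element, of types int, f64, f64.
(define point-type
  (MPI:make-type-struct 3
                        (list 1 1 1)
                        (list MPI:type-int MPI:type-f64 MPI:type-f64)))

;; point-type can now be passed wherever a DATATYPE argument is
;; expected, e.g. to MPI:send, MPI:receive or MPI:broadcast.

(MPI:finalize)
</enscript>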


=== Point-to-point communication


Most communication procedures in this library come in several flavors:
for fixnums, integers, floating point numbers, bytevectors, and for
each of the SRFI-4 homogeneous vector types.

<procedure>MPI:send :: DATATYPE * DATA * DEST * TAG * COMM -> UNDEFINED</procedure>
<procedure>MPI:send-TYPE :: DATA * DEST * TAG * COMM -> UNDEFINED</procedure>

Performs a standard-mode blocking send. Argument {{DEST}} is the rank
of the destination process. Argument {{TAG}} is an integer message
tag. Argument {{DATATYPE}} is an MPI datatype object. {{TYPE}} is one
of the following: {{fixnum, int, flonum, bytevector, s8vector,
u8vector, s16vector, u16vector, s32vector, u32vector, f32vector,
f64vector}}


<procedure>MPI:receive :: DATATYPE * SOURCE * TAG * COMM -> DATA</procedure>
<procedure>MPI:receive-TYPE :: LENGTH * SOURCE * TAG * COMM -> DATA</procedure>

Performs a standard-mode blocking receive. Argument {{SOURCE}} is the
rank of the source process. Argument {{TAG}} is an integer message
tag. Argument {{LENGTH}} is present only in the vector
procedures. {{TYPE}} is one of the following: {{fixnum, int, flonum,
bytevector, s8vector, u8vector, s16vector, u16vector, s32vector,
u32vector, f32vector, f64vector}}



<procedure>MPI:probe :: DATATYPE * SOURCE * TAG * COMM -> (COUNT * SOURCE * TAG)</procedure>

Checks for an incoming message. This is a blocking call that returns
only after a matching message is found. Argument {{DATATYPE}} is an
MPI datatype object. Argument {{SOURCE}} can be
{{MPI:any-source}}. Argument {{TAG}} can be {{MPI:any-tag}}.
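
A hedged sketch of {{MPI:probe}} followed by a receive.  It assumes
the {{(COUNT * SOURCE * TAG)}} result is returned as multiple values;
if the egg returns a list instead, destructure accordingly:

<enscript highlight="scheme">
;; Probe for a message of unknown length, then receive it.
;; Run with exactly two processes: mpirun -np 2 csi -s probe.scm
(import scheme (chicken base) (chicken format) srfi-4 mpi)

(MPI:init)

(define comm-world (MPI:get-comm-world))
(define myrank     (MPI:comm-rank comm-world))

(if (zero? myrank)
    ;; Rank 0: wait until a message of doubles is pending, then
    ;; receive exactly that many elements.
    (let-values (((count source tag)
                  (MPI:probe MPI:type-f64 MPI:any-source MPI:any-tag
                             comm-world)))
      (let ((v (MPI:receive-f64vector count source tag comm-world)))
        (printf "received ~a doubles from rank ~a: ~a~%" count source v)))
    ;; Rank 1: send three doubles to rank 0 with tag 0.
    (MPI:send-f64vector (f64vector 1.0 2.0 3.0) 0 0 comm-world))

(MPI:finalize)
</enscript>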


=== Group communication


<procedure>MPI:barrier :: COMM -> UNDEFINED</procedure>
Barrier synchronization.


<procedure>MPI:broadcast :: DATATYPE * DATA * ROOT * COMM -> UNDEFINED</procedure>
<procedure>MPI:broadcast-TYPE :: DATA * ROOT * COMM -> UNDEFINED</procedure>

Broadcasts a message from the process with rank {{ROOT}} to all other
processes of the group. Argument {{DATATYPE}} is an MPI datatype
object. {{TYPE}} is one of the following: {{fixnum, int, flonum,
bytevector, s8vector, u8vector, s16vector, u16vector, s32vector,
u32vector, f32vector, f64vector}}
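
Since the broadcast variants return {{UNDEFINED}}, the sketch below
assumes the vector flavors fill the supplied buffer in place, with
non-root processes passing a buffer of the same size as the root's:

<enscript highlight="scheme">
;; Broadcast a vector of doubles from rank 0 to every process.
;; Run with: mpirun -np 4 csi -s bcast.scm
(import scheme (chicken base) (chicken format) srfi-4 mpi)

(MPI:init)

(define comm-world (MPI:get-comm-world))
(define myrank     (MPI:comm-rank comm-world))

(define buf
  (if (zero? myrank)
      (f64vector 1.0 2.0 3.0)    ;; the root supplies the data
      (f64vector 0.0 0.0 0.0)))  ;; other ranks supply a buffer to fill

(MPI:broadcast-f64vector buf 0 comm-world)
(printf "rank ~a now has ~a~%" myrank buf)

(MPI:finalize)
</enscript>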


<procedure>MPI:scatter :: DATATYPE * DATA * SENDCOUNT * ROOT * COMM -> DATA</procedure>
<procedure>MPI:scatter-TYPE :: DATA * SENDCOUNT * ROOT * COMM -> DATA</procedure>

Sends data from the root process to all processes in a group, and
returns the data received by the calling process. Argument
{{SENDCOUNT}} is the number of elements sent to each process. Argument
{{DATATYPE}} is an MPI datatype object. Argument {{DATA}} is only
required at the root process. All other processes can invoke this
procedure with (void) as {{DATA}}. {{TYPE}} is one of the following:
{{int, flonum, bytevector, s8vector, u8vector, s16vector, u16vector,
s32vector, u32vector, f32vector, f64vector}}



<procedure>MPI:scatterv :: DATATYPE * DATA * ROOT * COMM -> DATA</procedure>
<procedure>MPI:scatterv-TYPE :: DATA * ROOT * COMM -> DATA</procedure>

Sends variable-length data from the root process to all processes in a
group, and returns the data received by the calling process.  Argument
{{DATATYPE}} is an MPI datatype object. Argument {{DATA}} is only
required at the root process, and is a list of values of type
{{TYPE}}, where each element of the list is sent to the process of
corresponding rank. All other processes can invoke this procedure with
(void) as {{DATA}}. {{TYPE}} is one of the following: {{int, flonum,
bytevector, s8vector, u8vector, s16vector, u16vector, s32vector,
u32vector, f32vector, f64vector}}
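
A sketch of fixed-count scatter: with {{SENDCOUNT}} 2, each of the
{{N}} processes receives two consecutive elements of the root's
2x{{N}}-element vector:

<enscript highlight="scheme">
;; Scatter two ints to each process.
;; Run with: mpirun -np 4 csi -s scatter.scm
(import scheme (chicken base) (chicken format) srfi-1 srfi-4 mpi)

(MPI:init)

(define comm-world (MPI:get-comm-world))
(define size       (MPI:comm-size comm-world))
(define myrank     (MPI:comm-rank comm-world))

(define sendbuf
  (if (zero? myrank)
      (list->s32vector (iota (* 2 size)))  ;; 0 1 2 ... on the root
      (void)))                             ;; non-roots pass (void)

;; Rank r receives #s32(2r 2r+1).
(define mine (MPI:scatter-s32vector sendbuf 2 0 comm-world))
(printf "rank ~a received ~a~%" myrank mine)

(MPI:finalize)
</enscript>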


<procedure>MPI:gather :: DATATYPE * DATA * SENDCOUNT * ROOT * COMM -> DATA</procedure>
<procedure>MPI:gather-TYPE :: DATA * SENDCOUNT * ROOT * COMM -> DATA</procedure>

Gathers data from a group of processes, where each process sends data
of the same length.  Argument {{SENDCOUNT}} is the number of data
elements being sent by each process. Argument {{DATATYPE}} is an MPI
datatype object.  {{TYPE}} is one of the following: {{int, flonum,
bytevector, s8vector, u8vector, s16vector, u16vector, s32vector,
u32vector, f32vector, f64vector}}



<procedure>MPI:gatherv :: DATATYPE * DATA * ROOT * COMM -> DATA</procedure>
<procedure>MPI:gatherv-TYPE :: DATA * ROOT * COMM -> DATA</procedure>

Gathers data from a group of processes, where each process can send
data of variable length. Argument {{DATATYPE}} is an MPI datatype
object. {{TYPE}} is one of the following: {{int, flonum, bytevector,
s8vector, u8vector, s16vector, u16vector, s32vector, u32vector,
f32vector, f64vector}}



<procedure>MPI:allgather :: DATATYPE * DATA * ROOT * COMM -> DATA</procedure>
<procedure>MPI:allgather-TYPE :: DATA * ROOT * COMM -> DATA</procedure>

Gathers data of variable length from all processes and distributes it
to all processes. Argument {{DATATYPE}} is an MPI datatype
object. {{TYPE}} is one of the following: {{int, flonum, bytevector,
s8vector, u8vector, s16vector, u16vector, s32vector, u32vector,
f32vector, f64vector}}
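
A sketch of fixed-count gather: every rank contributes a single
element and the root receives them concatenated in rank order (the
value returned on non-root ranks is not used here):

<enscript highlight="scheme">
;; Gather one int (the rank squared) from every process at rank 0.
;; Run with: mpirun -np 4 csi -s gather.scm
(import scheme (chicken base) (chicken format) srfi-4 mpi)

(MPI:init)

(define comm-world (MPI:get-comm-world))
(define myrank     (MPI:comm-rank comm-world))

(define gathered
  (MPI:gather-s32vector (s32vector (* myrank myrank)) 1 0 comm-world))

;; With four processes the root sees #s32(0 1 4 9).
(when (zero? myrank)
  (printf "gathered: ~a~%" gathered))

(MPI:finalize)
</enscript>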


<procedure>MPI:reduce-TYPE :: DATA * OP * ROOT * COMM -> DATA</procedure>

Reduces values on all processes within a group, using a global reduce
operation, and returns the result at the root process. {{OP}} is one of
the following: {{MPI:i_max, MPI:i_min, MPI:i_sum, MPI:i_prod,
MPI:i_land, MPI:i_lor, MPI:i_xor}} (integer operations); and
{{MPI:f_max, MPI:f_min, MPI:f_sum, MPI:f_prod}} (floating point
operations). {{TYPE}} is one of the following: {{int, flonum,
bytevector, s8vector, u8vector, s16vector, u16vector, s32vector,
u32vector, f32vector, f64vector}}



<procedure>MPI:allreduce-TYPE :: DATA * OP * COMM -> DATA</procedure>

Reduces values on all processes within a group, using a global reduce
operation, and returns the result at each process. {{OP}} is one of the
following: {{MPI:i_max, MPI:i_min, MPI:i_sum, MPI:i_prod, MPI:i_land,
MPI:i_lor, MPI:i_xor}} (integer operations); and {{MPI:f_max,
MPI:f_min, MPI:f_sum, MPI:f_prod}} (floating point
operations). {{TYPE}} is one of the following: {{int, flonum,
bytevector, s8vector, u8vector, s16vector, u16vector, s32vector,
u32vector, f32vector, f64vector}}



<procedure>MPI:scan-TYPE :: DATA * OP * COMM -> DATA</procedure>

Computes a partial reduction across the processes in a group. {{OP}}
is one of the following: {{MPI:i_max, MPI:i_min, MPI:i_sum,
MPI:i_prod, MPI:i_land, MPI:i_lor, MPI:i_xor}} (integer operations);
and {{MPI:f_max, MPI:f_min, MPI:f_sum, MPI:f_prod}} (floating point
operations). {{TYPE}} is one of the following: {{int, flonum,
bytevector, s8vector, u8vector, s16vector, u16vector, s32vector,
u32vector, f32vector, f64vector}}
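
A sketch of {{MPI:allreduce-TYPE}}: each rank contributes one integer
and every rank receives the global sum:

<enscript highlight="scheme">
;; Sum (rank + 1) across all processes; every rank gets the total.
;; Run with: mpirun -np 4 csi -s reduce.scm
(import scheme (chicken base) (chicken format) mpi)

(MPI:init)

(define comm-world (MPI:get-comm-world))
(define myrank     (MPI:comm-rank comm-world))

;; With 4 processes the result on every rank is 1 + 2 + 3 + 4 = 10.
(define total (MPI:allreduce-int (+ 1 myrank) MPI:i_sum comm-world))
(printf "rank ~a: total = ~a~%" myrank total)

(MPI:finalize)
</enscript>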


=== Round-robin routines


The following variants of {{fold}}, {{map}} and {{for-each}} process
lists in round-robin fashion on MPI nodes: a node of rank {{n}}
processes only the list elements whose index is congruent to {{n}}
modulo the number of nodes.

<procedure>MPI-rr-fold :: FN * INITIAL * XS -> RESULT</procedure>

<procedure>MPI-rr-map :: FN * XS -> RESULT</procedure>

<procedure>MPI-rr-for-each :: FN * XS -> VOID</procedure>
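
A sketch of {{MPI-rr-for-each}}; each element is printed by the one
node whose rank matches the element's index modulo the node count:

<enscript highlight="scheme">
;; Round-robin for-each over a 16-element list.
;; Run with: mpirun -np 4 csi -s rr.scm
(import scheme (chicken base) (chicken format) srfi-1 mpi)

(MPI:init)

(define myrank (MPI:comm-rank (MPI:get-comm-world)))

(MPI-rr-for-each
 (lambda (x) (printf "rank ~a processes element ~a~%" myrank x))
 (iota 16))

(MPI:finalize)
</enscript>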


== Examples


=== Master/worker example

<enscript highlight="scheme">
;; Simple master/worker example
;; Can be run as follows: mpirun -np 4 csi -s master-worker.scm
;; where -np # indicates the number of processes

(import scheme (chicken base) (chicken blob) (chicken format) srfi-4 mpi)

(MPI:init)

;; MPI uses objects called communicators to define how processes
;; communicate with each other.  Almost all MPI routines require a
;; communicator as an argument.

;; `MPI:get-comm-world' returns the communicator object which can send
;; messages to all running MPI processes

(define comm-world  (MPI:get-comm-world))

;; `MPI:comm-size' returns the number of running MPI processes
;;  (including the current one)

(define size        (MPI:comm-size comm-world))

;; `MPI:comm-rank' returns the rank of the calling MPI process

(define myrank      (MPI:comm-rank comm-world))

;; We assign rank 0 to be the master process, and the rest will be
;; worker processes

(if (zero? myrank)
    (begin
      (printf "[~a/~a]: I am the master\n" myrank size)
      (let recur ((i 1))
        (if (< i size)
            (begin
              ;; Send a Hello message to the process of rank i
              (MPI:send-bytevector (string->blob (sprintf "Hello ~a..." i)) i 0 comm-world)
              (recur (+ 1 i)))
            ))

      (let recur ((i 1))
        (if (< i size)
             ;; Wait for a response from the process of rank i
            (let ((n (blob->string (MPI:receive-bytevector i MPI:any-tag comm-world))))
              (printf "[~a/~a]: received: ~a~%" myrank size n)
              (recur (+ 1 i)))
            ))
      )
    (begin
      (printf "[~a/~a]: I am a worker\n" myrank size)
      ;; Wait for a message from the master (process 0)
      (let ((n (blob->string (MPI:receive-bytevector 0 MPI:any-tag comm-world))))
        (printf "[~a/~a]: received: ~a\n" myrank size n)
        ;; Send a response back to the master
        (MPI:send-bytevector (string->blob (sprintf "Processor ~a reporting!" myrank))
                     0 0 comm-world))
      )
    )

(MPI:finalize)
</enscript>

=== Master/worker implemented with collective operations

<enscript highlight="scheme">
;; Master/worker example implemented with collective operations
;; Can be run as follows: mpirun -np 4 csi -s master-worker.scm

(import scheme (chicken base) (chicken blob) (chicken format) srfi-1 srfi-4 mpi)

(MPI:init)

;; MPI uses objects called communicators to define how processes
;; communicate with each other.  Almost all MPI routines require a
;; communicator as an argument.

;; `MPI:get-comm-world' returns the communicator object which can send
;; messages to all running MPI processes

(define comm-world  (MPI:get-comm-world))

;; `MPI:comm-size' returns the number of running MPI processes
;;  (including the current one)

(define size        (MPI:comm-size comm-world))

;; `MPI:comm-rank' returns the rank of the calling MPI process

(define myrank      (MPI:comm-rank comm-world))

;; We assign rank 0 to be the master process, and the rest will be
;; worker processes

(if (zero? myrank)
    (begin
      (printf "[~a/~a]: I am the master\n" myrank size)

      ;; data is a list of blobs to be sent to each process.  The
      ;; master process sends element i from the list to process i
      ;; (including itself). Note that each process must call scatterv
      ;; in order to receive its data. In this example, the master
      ;; ignores the result of its call to scatterv.

      (let ((data (list-tabulate size (lambda (i) (string->blob (sprintf "Hello ~a..." i))))))
        (MPI:scatterv-bytevector data 0 comm-world))

      ;; With gatherv, each process (master process included) sends
      ;; the contents of its send buffer to the master process. The
      ;; master process receives the messages and stores them in rank
      ;; order.

      (let ((v (MPI:gatherv-bytevector (string->blob "I am the master!") 0 comm-world)))
        (printf "[~a/~a]: received: ~a\n" myrank size (map blob->string v))
        ))
    (begin
      (printf "[~a/~a]: I am a worker\n" myrank size)

      ;; The worker collects its data via a call to scatterv. The data
      ;; argument is #f because the worker is not sending anything,
      ;; just receiving.

      (let ((n (blob->string (MPI:scatterv-bytevector #f 0 comm-world))))
        (printf "[~a/~a]: received: ~a\n" myrank size n)

        ;; The worker sends its result back to the master via a call to gatherv.
        (MPI:gatherv-bytevector (string->blob (sprintf "Processor ~a reporting!" myrank))
                                0 comm-world))
      )
    )

(MPI:finalize)
</enscript>


== About this egg


=== Author

[[/users/ivan-raikov|Ivan Raikov]]

=== Version history

; 2.2 : Ported to CHICKEN 5
; 2.1 : Support for MPI alltoall / alltoallv operations
; 2.0 : Added support for MPI derived datatypes
; 1.14 : Added simple round-robin routines
; 1.12 : Fixes to allgather-int and allgather-flonum (thanks to Peter Bex)
; 1.11 : Test script fixes
; 1.9 : Ensure test script returns non-zero on error (thanks to mario)
; 1.7 : Switched to wiki documentation
; 1.6 : Ported to Chicken 4
; 1.5 : Added a binding for MPI:spawn
; 1.3 : Bug fix in MPI:scatter-int
; 1.2 : Bug fix in the meta file
; 1.1 : Bug fixes and improvements to the regression tests
; 1.0 : Initial release

=== License


 Copyright 2007-2018 Ivan Raikov.
 
 Based on the Ocaml MPI library by Xavier Leroy.
 
 This program is free software: you can redistribute it and/or modify
 it under the terms of the GNU General Public License as published by
 the Free Software Foundation, either version 3 of the License, or (at
 your option) any later version.
 
 This program is distributed in the hope that it will be useful, but
 WITHOUT ANY WARRANTY; without even the implied warranty of
 MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
 General Public License for more details.
 
 A full copy of the GPL license can be found at
 <http://www.gnu.org/licenses/>.