1 | Some quick notes while everything's still fresh in my mind. These |
---|
2 | might also be useful when integrating this in core. |
---|
3 | |
---|
4 | === Integration of the numeric tower into the type system |
---|
5 | |
---|
6 | The basic idea in this code is that there are two distinct types of |
---|
7 | numbers: "basic" and "extended". The basic numbers are the |
---|
8 | fundamental ones that have always been known by CHICKEN core, with the |
---|
9 | new addition of bignums. The extended numbers exist *ONLY* in Scheme, |
---|
10 | which means that to C they're just structure/record objects (much like |
---|
11 | the way SRFI 4 vectors are currently handled in core as second-class |
---|
12 | citizens). This rule is broken in only a handful of places (eqv, assq) |
---|
13 | for performance reasons. |
---|
14 | |
---|
15 | In the CHICKEN 4 numbers egg, this is faked out because we can't truly |
---|
16 | extend the core number types, so bignums are structures as well. But |
---|
17 | for integration into core, this is changed to a true type. |
---|
18 | |
---|
19 | In intermediate versions of the "numbers" egg, we had to pass a |
---|
20 | failure continuation, which meant creating an extra closure object |
---|
21 | upon every call to numeric operations. But now, in order to avoid any |
---|
22 | performance impact, the Scheme procedures are invoked as an |
---|
23 | "exception", much like the way the error handler is invoked through |
---|
24 | barf() in cases of error. This allows us to only pass the arguments |
---|
25 | to a numeric operation, pretending the implementation is native C. |
---|
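To make the difference with the failure-continuation approach concrete,
here is a minimal sketch in C. Everything in it (the num struct,
set_extended_add_hook, generic_add, demo_extended_add) is invented for
illustration and is not the actual runtime API. It only shows the call
shape: the operation receives just its operands, and the fallback for
extended numbers is found through a hook that is installed once.

```c
#include <stddef.h>

/* Hypothetical value representation, for this sketch only. */
typedef enum { NUM_FIXNUM, NUM_EXTENDED } num_kind;

typedef struct {
    num_kind kind;
    long fix;         /* payload when kind == NUM_FIXNUM */
    const char *ext;  /* stand-in for a Scheme record when kind == NUM_EXTENDED */
} num;

num num_from_long(long n) { num r = { NUM_FIXNUM, n, NULL }; return r; }
num num_extended(const char *name) { num r = { NUM_EXTENDED, 0, name }; return r; }

/* A fallback receives exactly the operands, nothing else. */
typedef num (*binary_hook)(num x, num y);

/* Installed once, so no closure has to be allocated per call. */
static binary_hook extended_add_hook = NULL;
void set_extended_add_hook(binary_hook hook) { extended_add_hook = hook; }

/* Stand-in for the error path (barf() in the text): counts failures. */
int error_count = 0;
static num signal_bad_argument(void) { error_count++; return num_from_long(0); }

num generic_add(num x, num y) {
    if (x.kind == NUM_FIXNUM && y.kind == NUM_FIXNUM)
        return num_from_long(x.fix + y.fix);   /* fast path, stays in C */
    if (extended_add_hook != NULL)
        return extended_add_hook(x, y);        /* "exception" into Scheme */
    return signal_bad_argument();
}

/* Placeholder for the Scheme-side procedure. */
num demo_extended_add(num x, num y) { (void)x; (void)y; return num_extended("sum"); }
```

The caller never sees the hook, which is what lets the operation
pretend to be a plain native C function.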
26 | |
---|
27 | === Performance impact |
---|
28 | |
---|
29 | I've tried very hard to keep the performance of basic numeric |
---|
30 | operations exactly the same as in core. In particular, the various |
---|
31 | checks for number types are done in exactly the same order as |
---|
32 | everywhere in core: |
---|
33 | |
---|
34 | - Is it a fixnum? |
---|
35 | - Is it an immediate? If so, barf. |
---|
36 | - Does the header have a flonum tag? (before, this was combined with the immediate check) |
---|
37 | - Does the header have a bignum tag? (normally, we'd have an "else barf()" at this point) |
---|
38 | - Look up the numeric operation's matching Scheme procedure for extended numeric types, and call it (or barf, if it's not defined for these) |
---|
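The order of the checks above can be sketched as follows. This is only
an illustration under assumptions: the word layout (low bit marks a
fixnum, the two low bits mark any immediate), the header tags and all
names (classify, block_header, TAG_*) are hypothetical, and the
has_extended_handler flag stands in for looking up the Scheme
procedure.

```c
#include <stdint.h>

typedef uintptr_t word;

/* Hypothetical header tags for heap-allocated objects. */
enum { TAG_FLONUM = 1, TAG_BIGNUM = 2, TAG_RECORD = 3 };

typedef struct { int tag; } block_header;

typedef enum { IS_FIXNUM, IS_FLONUM, IS_BIGNUM, IS_EXTENDED, IS_ERROR } dispatch;

#define FIXNUM_BIT     ((word)1)  /* assumed: low bit set marks a fixnum */
#define IMMEDIATE_BITS ((word)3)  /* assumed: any low bit set marks an immediate */

word make_fixnum(intptr_t n) { return ((word)n << 1) | FIXNUM_BIT; }
word make_other_immediate(void) { return (word)2; }  /* e.g. a boolean */
word make_block(const block_header *h) { return (word)h; }

dispatch classify(word x, int has_extended_handler) {
    if (x & FIXNUM_BIT) return IS_FIXNUM;        /* 1. fixnum? */
    if (x & IMMEDIATE_BITS) return IS_ERROR;     /* 2. other immediate: barf */
    const block_header *h = (const block_header *)x;
    if (h->tag == TAG_FLONUM) return IS_FLONUM;  /* 3. flonum tag? */
    if (h->tag == TAG_BIGNUM) return IS_BIGNUM;  /* 4. bignum tag? */
    /* 5. extended number: call the Scheme procedure, or barf if none */
    return has_extended_handler ? IS_EXTENDED : IS_ERROR;
}
```

The first four steps cost the same as before for fixnums and flonums;
only values that used to hit the error path go further.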
39 | |
---|
40 | This means that "generic" numeric code should incur ZERO performance |
---|
41 | penalty for functions that are non-allocating and inlineable.... |
---|
42 | |
---|
43 | Unfortunately, that's where the good part ends. Any operation that |
---|
44 | results in a fresh number is no longer inlineable, because in case of |
---|
45 | bignums they will need to allocate an unknown quantity of memory, |
---|
46 | which may require a GC. The upshot is that every "allocating inline" |
---|
47 | procedure will now need to be called in primitive CPS context. This |
---|
48 | is a fundamental limitation that we can't do much about. |
---|
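A small sketch of why even a simple addition can no longer be a plain
inline operation. It assumes a fixnum range of 62 value bits plus sign
(as on a 64-bit build); FIXNUM_MAX, FIXNUM_MIN and
fixnum_add_needs_bignum are names made up for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed fixnum range: 62 value bits plus sign. */
#define FIXNUM_MAX (((int64_t)1 << 62) - 1)
#define FIXNUM_MIN (-((int64_t)1 << 62))

/* Precondition: a and b are both within the fixnum range.
   Returns true when a + b falls outside that range, i.e. when the
   result would have to be a freshly allocated bignum. */
bool fixnum_add_needs_bignum(int64_t a, int64_t b) {
    int64_t sum = a + b;  /* cannot overflow int64_t given the precondition */
    return sum > FIXNUM_MAX || sum < FIXNUM_MIN;
}
```

Whenever this returns true the operation needs heap memory, and
obtaining it may trigger a GC, which is why the call has to happen in
CPS context.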
49 | |
---|
50 | In addition, the comparison functions (=, <, >, <=, >=) are no longer |
---|
51 | inline. To compare a flonum with an exact integer correctly, the |
---|
52 | flonum has to be converted to a bignum first and then compared. We |
---|
53 | *could* decide to rip this out, but that would result in unexpected |
---|
54 | things, like: (< 19000000000000000.0 19000000000000001) => #f or |
---|
55 | (= 19000000000000000.0 19000000000000001) => #t |
---|
56 | Core currently behaves this way, too. This is due to precision loss from |
---|
57 | the fix->flo conversion (which means we drop from 62 bits to 53 bits). |
---|
58 | Because we _are_ comparing inexact numbers (which could already have |
---|
59 | lost information before comparing them), we could decide to ignore |
---|
60 | these edge cases and keep them like they are. For the "=" function |
---|
61 | that would mean it can remain inlined and non-allocating. For < and >, |
---|
62 | however, this doesn't help: in the case of ratnums we must multiply the |
---|
63 | numerator of x with the denominator of y and vice versa, and compare |
---|
64 | the results. This means we're stuck with an allocating, non-inlineable |
---|
65 | function. Because of this, I decided to keep the comparison functions |
---|
66 | all non-inlineable. |
---|
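Both effects described above can be reproduced in a few lines of C.
This is a sketch under stated assumptions, not the code from the egg:
the exact comparison only handles finite doubles within the int64_t
range (the real thing goes through bignums to lift that limit), and
the ratnum struct uses 32-bit components so the cross products fit in
64 bits. All function and type names are invented here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Naive: convert the integer to a double first (lossy above 2^53). */
bool naive_flo_less_int(double x, int64_t n) { return x < (double)n; }

/* Exact: compare floor(x) with n as integers.
   Precondition: x is finite and -2^63 <= x < 2^63. */
bool exact_flo_less_int(double x, int64_t n) {
    int64_t f = (int64_t)x;     /* truncates toward zero */
    if ((double)f > x) f -= 1;  /* turn truncation into floor for negatives */
    return f < n;               /* x < n  iff  floor(x) < n */
}

/* Hypothetical ratnum with small components; den > 0 is assumed. */
typedef struct { int32_t num; int32_t den; } ratnum;

/* a/b < c/d  iff  a*d < c*b  when b and d are positive. */
bool ratnum_less(ratnum x, ratnum y) {
    int64_t lhs = (int64_t)x.num * y.den;
    int64_t rhs = (int64_t)y.num * x.den;
    return lhs < rhs;
}
```

With arbitrary-size numerators and denominators the two products are
bignums themselves, which is where the allocation comes from.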
67 | |
---|
68 | Finally, the C implementation of the comparison functions as well as |
---|
69 | +, -, * and / are no longer vararg functions. Instead, the variadic |
---|
70 | part is handled in Scheme, and the C implementation only handles two |
---|
71 | numbers at a time. This shouldn't be too much of a performance impact |
---|
72 | considering they already have to be in CPS context anyway. Plus, |
---|
73 | calls with two arguments can easily be rewritten to a direct call, |
---|
74 | which leads us to... |
---|
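The split between a two-argument primitive and a variadic wrapper
looks roughly like this. In the real code the wrapper lives in Scheme;
it is written in C here only to keep the example self-contained, and
less2, less_chain, add2 and add_chain are hypothetical names operating
on plain longs.

```c
#include <stdbool.h>
#include <stddef.h>

/* Two-argument primitives: the only shape the C side has to provide. */
bool less2(long a, long b) { return a < b; }
long add2(long a, long b) { return a + b; }

/* Variadic comparison: every adjacent pair must be ordered. */
bool less_chain(const long *xs, size_t n) {
    for (size_t i = 1; i < n; i++) {
        if (!less2(xs[i - 1], xs[i])) return false;
    }
    return true;  /* zero or one argument is trivially ordered */
}

/* Variadic addition: fold the primitive over the arguments. */
long add_chain(const long *xs, size_t n) {
    long acc = 0;  /* identity, so no arguments yields 0 */
    for (size_t i = 0; i < n; i++) acc = add2(acc, xs[i]);
    return acc;
}
```

A call site with exactly two arguments can skip the wrapper and call
less2 or add2 directly.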
75 | |
---|
76 | ==== Specializations |
---|
77 | |
---|
78 | There's some light at the end of the tunnel: In critical |
---|
79 | number-crunching code, you'll usually be working with either integers |
---|
80 | or flonums (or you'd already be using the old numbers egg and |
---|
81 | everything would be shit-slow anyway). These two situations are |
---|
82 | catered to specifically by specialized versions. This is where the |
---|
83 | specialization/scrutiny stuff really shines: if we know something is a |
---|
84 | whole integer, we can use unsafe operations that only need to check a |
---|
85 | single bit to distinguish whether a number is fixnum or bignum, just |
---|
86 | like in the old situation we could do this to distinguish between fixnum |
---|
87 | and flonum. |
---|
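To illustrate the single-bit check: once a value is known to be an
integer, one test of the low bit decides between the fixnum and the
bignum path. The word layout (low bit set marks a fixnum) and every
name below (fixnum_word, bignum_stub, integer_is_negative) are
assumptions made for this sketch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uintptr_t word;
#define FIXNUM_BIT ((word)1)  /* assumed: low bit set marks a fixnum */

/* Hypothetical bignum object; the digits are omitted in this sketch. */
typedef struct { size_t ndigits; bool negative; } bignum_stub;

word fixnum_word(intptr_t n) { return ((word)n << 1) | FIXNUM_BIT; }
word bignum_word(const bignum_stub *b) { return (word)b; }

/* Relies on arithmetic right shift of negative values, which common
   compilers provide. */
intptr_t fixnum_value(word x) { return (intptr_t)x >> 1; }

/* Precondition: x is already known to be an integer, so no immediate
   check and no header tag check are needed. */
bool integer_is_negative(word x) {
    if (x & FIXNUM_BIT) return fixnum_value(x) < 0;  /* single-bit test */
    return ((const bignum_stub *)x)->negative;       /* must be a bignum */
}
```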
88 | |
---|
89 | This case should be easy to infer. There's one caveat: do not use the |
---|
90 | "/" division operator, because this may result in a ratnum. Instead, |
---|
91 | in fast code where you know you're dealing with integers, it's best to |
---|
92 | use "quotient". And of course, if you use the trigonometric |
---|
93 | operations you may get a flonum, which will also result in the generic |
---|
94 | number functions being used. |
---|
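The caveat about "/" comes down to whether the result stays inside the
integers. A tiny sketch on plain longs (function names invented here,
b is assumed to be non-zero):

```c
#include <stdbool.h>

/* True when a / b is not a whole number, i.e. when exact division
   would have to produce a ratnum. */
bool exact_division_yields_ratnum(long a, long b) { return a % b != 0; }

/* Truncating division, like Scheme's quotient: always an integer. */
long quotient(long a, long b) { return a / b; }
```

Since the first function can return true for integer inputs, the
result type of "/" cannot be narrowed to integer, while the result of
"quotient" can.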