Some quick notes while everything's still fresh in my mind.  These
might also be useful when integrating this in core.

=== Integration of the numeric tower into the type system

The basic idea in this code is that there are two distinct types of
numbers: "basic" and "extended".  The basic numbers are the
fundamental ones that have always been known by CHICKEN core, with the
new addition of bignums.  The extended numbers exist *ONLY* in Scheme,
which means that to C they're just structure/record objects.  This
rule is broken in only a handful of places (eqv, assq) for performance
reasons.

In the CHICKEN 4 numbers egg, this is faked because we can't truly
extend the core number types, so bignums are structures as well.  For
integration into core, bignums become a true type instead.

In intermediate versions of the "numbers" egg, we had to pass a
failure continuation, which meant creating an extra closure object
upon every call to a numeric operation.  Now, in order to avoid any
performance impact, the Scheme procedures are invoked as an
"exception", much like the way the error handler is invoked through
barf() in cases of error.  This allows us to pass only the arguments
to a numeric operation, pretending the implementation is native C.

=== Performance impact

I've tried very hard to keep the performance of basic numeric
operations exactly the same as in core.  In particular, the various
checks for number types are done in exactly the same order as
everywhere in core:

- Is it a fixnum?
- Is it an immediate? If so, barf.
- Does the header have a flonum tag? (Before, this was combined with the immediate check.)
- Does the header have a bignum tag? (Normally, we'd have an "else barf()" at this point.)
- Look up the numeric operation's matching Scheme procedure for extended numeric types, and call it (or barf, if it's not defined for these).

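The order of checks above can be sketched in C roughly as follows.
This is only an illustration: the value layout, the tag names and the
"hook_defined" flag are made up for this example and are not CHICKEN's
real definitions.  The point is just that the fixnum and flonum tests
stay first, and the extended types only cost something once all the
basic cases have failed.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout (NOT CHICKEN's real one): low bit set means
   fixnum, second bit set means some other immediate, anything else is
   a pointer to a tagged heap block. */
typedef uintptr_t word;

#define FIXNUM_BIT    ((word)1)
#define IMMEDIATE_BIT ((word)2)

enum block_tag { TAG_FLONUM, TAG_BIGNUM, TAG_RECORD };
struct block { enum block_tag tag; };

enum num_kind { KIND_FIXNUM, KIND_FLONUM, KIND_BIGNUM, KIND_EXTENDED, KIND_BARF };

static word make_fixnum(intptr_t n) { return ((word)n << 1) | FIXNUM_BIT; }

static word from_block(const struct block *b)
{
    /* Block pointers must be aligned so both tag bits are clear. */
    assert(((word)b & (FIXNUM_BIT | IMMEDIATE_BIT)) == 0);
    return (word)b;
}

/* hook_defined stands in for "a Scheme procedure for extended numbers
   is registered for this operation". */
static enum num_kind classify(word x, int hook_defined)
{
    if (x & FIXNUM_BIT) return KIND_FIXNUM;        /* 1. fixnum? */
    if (x & IMMEDIATE_BIT) return KIND_BARF;       /* 2. other immediate: barf */

    const struct block *b = (const struct block *)x;
    if (b->tag == TAG_FLONUM) return KIND_FLONUM;  /* 3. flonum tag? */
    if (b->tag == TAG_BIGNUM) return KIND_BIGNUM;  /* 4. bignum tag? */

    /* 5. extended number: hand over to Scheme, or barf if no hook */
    return hook_defined ? KIND_EXTENDED : KIND_BARF;
}
```

A record object only reaches the last branch, so fixnum and flonum
arguments take the same number of tests as they would without the
extended types.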
This means that "generic" numeric code should incur ZERO performance
penalty for functions that are non-allocating and inlineable.

Unfortunately, that's where the good part ends.  Any operation that
results in a fresh number is no longer inlineable, because when
bignums are involved it needs to allocate an unknown quantity of
memory, which may require a GC.  The upshot is that every "allocating
inline" procedure will now need to be called in primitive CPS context.
This is a fundamental limitation that we can't do much about.

In addition, the comparison functions (=, <, >, <=, >=) are no longer
inline.  This is because comparing a flonum with an exact integer
correctly requires converting the flonum to a bignum first.  We
*could* decide to rip this out, but that would result in unexpected
things, like:

  (< 19000000000000000.0 19000000000000001) => #f
  (= 19000000000000000.0 19000000000000001) => #t

Core currently behaves this way, too.  This is due to precision loss
from the fix->flo conversion (a fixnum can carry 62 bits, while a
flonum has only a 53-bit significand).  Because we _are_ comparing
inexact numbers (which could already have lost information before
comparing them), we could decide to ignore these edge cases and keep
them like they are.  For the "=" function that would mean it can
remain inlined and non-allocating.  For < and >, however, this doesn't
help: in the case of ratnums we must multiply the numerator of x with
the denominator of y and vice versa, and compare the results.  These
products can be bignums, so we're stuck with an allocating,
non-inlineable function.  Because of this, I decided to keep the
comparison functions all non-inlineable.

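Both effects can be reproduced in a few lines of C, assuming IEEE 754
doubles.  The helper names below are invented for this sketch, and
plain 64-bit integers stand in for fixnums and bignums, so the
cross-multiplication only works for small values here; the overflow
it would hit with larger ones is exactly why the real thing has to
allocate bignums.

```c
#include <assert.h>
#include <stdint.h>

/* Lossy comparison: convert the exact integer to a double first.
   This is what a cheap, inlineable version would do. */
static int lossy_less(double x, int64_t y)  { return x <  (double)y; }
static int lossy_equal(double x, int64_t y) { return x == (double)y; }

/* Exact comparison: convert the double to an exact integer instead.
   Only valid for integral doubles well inside the int64_t range. */
static int exact_less(double x, int64_t y)
{
    assert(x > -9.0e18 && x < 9.0e18);
    assert((double)(int64_t)x == x);
    return (int64_t)x < y;
}

/* Stand-in for a ratnum; the denominator is assumed to be positive. */
struct ratio { int64_t num, den; };

/* x.num/x.den < y.num/y.den  <=>  x.num * y.den < y.num * x.den
   The products may overflow int64_t for large inputs. */
static int ratio_less(struct ratio x, struct ratio y)
{
    assert(x.den > 0 && y.den > 0);
    return x.num * y.den < y.num * x.den;
}
```

With these, lossy_less(19000000000000000.0, 19000000000000001) is
false and lossy_equal on the same arguments is true, while exact_less
gives the mathematically correct answer.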
Finally, the C implementations of the comparison functions as well as
+, -, * and / are no longer vararg functions.  Instead, the variadic
part is handled in Scheme, and the C implementation only handles two
numbers at a time.  This shouldn't be too much of a performance impact
considering they already have to be in CPS context anyway.  Plus,
calls with two operands can easily be rewritten to a direct call,
which leads us to...

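To illustrate the split: a variadic comparison is just the two-argument
version applied to each adjacent pair.  In the egg that loop lives in
Scheme; the sketch below writes it in C purely for illustration, with
plain longs instead of real number objects and made-up function names.

```c
#include <assert.h>
#include <stddef.h>

/* Two-argument primitive, standing in for the C implementation. */
typedef int (*binary_pred)(long, long);

static int less2(long a, long b) { return a < b; }

/* (< a b c ...) holds iff every adjacent pair satisfies the binary
   predicate.  With zero or one argument there is nothing to check. */
static int chain_compare(binary_pred p, const long *args, size_t n)
{
    assert(p != NULL);
    for (size_t i = 0; i + 1 < n; i++) {
        if (!p(args[i], args[i + 1]))
            return 0;
    }
    return 1;
}
```

When a call site has exactly two arguments the loop disappears and the
call can go straight to the binary primitive.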
==== Specializations

There's some light at the end of the tunnel: in critical
number-crunching code, you'll usually be working with either integers
or flonums (or you'd already be using the old numbers egg and
everything would be shit-slow anyway).  These two situations are
catered to specifically by specialized versions.  This is where the
specialization/scrutiny stuff really shines: if we know something is
an integer, we can use unsafe operations that only need to check a
single bit to distinguish whether a number is a fixnum or a bignum,
just like in the old situation we could do this to distinguish between
fixnum and flonum.

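A rough sketch of what such a single-bit test looks like, again with an
invented value layout (low bit set means fixnum) rather than CHICKEN's
actual macros.  It assumes a two's complement machine with arithmetic
right shift, and it only picks the code path; the arithmetic itself is
left out.

```c
#include <assert.h>
#include <stdint.h>

typedef uintptr_t word;

#define FIXNUM_BIT ((word)1)   /* hypothetical tag bit */

static word make_fixnum(intptr_t n) { return ((word)n << 1) | FIXNUM_BIT; }

static intptr_t fixnum_value(word x)
{
    assert(x & FIXNUM_BIT);
    return (intptr_t)x >> 1;   /* assumes arithmetic shift */
}

enum path { FAST_FIXNUM, SLOW_BIGNUM };

/* Both operands are already known to be integers, so anything that is
   not a fixnum must be a bignum.  One AND and one bit test decide. */
static enum path integer_op_path(word x, word y)
{
    return ((x & y) & FIXNUM_BIT) ? FAST_FIXNUM : SLOW_BIGNUM;
}
```

No header needs to be loaded from memory on the fast path, which is
what makes the specialized integer versions cheap.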
This case should be easy to infer.  There's one caveat: do not use the
"/" division operator, because this may result in a ratnum.  In fast
code where you know you're dealing with integers, it's best to use
"quotient" instead.  And of course, if you use the trigonometric
operations you may get a flonum, which will also result in the generic
number functions being used.