Opened 10 years ago

Closed 10 years ago

Last modified 9 years ago

#579 closed defect (worksforme)

Numbers egg does not compute the same result as other schemes

Reported by: Christian Kellermann
Owned by: sjamaan
Priority: critical
Milestone: 4.9.0
Component: core libraries
Version: 4.7.x
Keywords: numbers egg
Cc:
Estimated difficulty:

Description

Dominic Pearson reported on #chicken the following problem:

When computing the sum from i = 0 to n, where n = 1000, of n^n, Chicken seems to produce an incorrect result.

I have attached the results and procedures used to this report.

Attachments (1)

numbers-error.txt (15.6 KB) - added by Christian Kellermann 10 years ago.


Change History (14)

Changed 10 years ago by Christian Kellermann

Attachment: numbers-error.txt added

comment:1 Changed 10 years ago by Christian Kellermann

I have tested this on a 64 bit linux machine:

Linux foo 2.6.35-28-generic #50-Ubuntu SMP Fri Mar 18 18:42:20 UTC 2011 x86_64 GNU/Linux

comment:2 Changed 10 years ago by sjamaan

Component: unknown → core libraries
Milestone: 4.7.0 → 4.8.0
Owner: changed from sjamaan to felix winkelmann
Status: new → assigned
Summary: Numbers egg does not compute the same result as other schemes → expt does not check errno after calling pow()

So far it appears to be a bug in Chicken core's implementation of expt; it doesn't check errno and produces fixnum results when pow() overflows.

Updating summary and reassigning to Felix.

comment:4 Changed 10 years ago by felix winkelmann

Would returning +inf on overflow, -inf on underflow and +nan on EDOM be sufficient?

comment:5 Changed 10 years ago by sjamaan

I'm not sure. Probably...

Maybe John Cowan has something sensible to say about this :)

comment:6 in reply to:  5 Changed 10 years ago by felix winkelmann

Keywords: egg added; eggs removed
Priority: major → critical

Replying to sjamaan:

> I'm not sure. Probably...
>
> Maybe John Cowan has something sensible to say about this :)

He often does, but what he says isn't automatically sensible...

I will add a slight variation of the C expt routine in runtime.c to numbers and change the default procedure to return +/-inf and signal an error on EDOM (which would otherwise return a complex number). The latter change should be a Change Request, I think, since old code may stop working (even if it computed incorrect results).

comment:7 Changed 10 years ago by felix winkelmann

Version: 4.6.x → 4.7.x

comment:8 Changed 10 years ago by felix winkelmann

I committed a change to numbers/trunk that uses a fixnum-specific version of C_expt, but still get an incorrect result on a 64-bit machine (well, actually I forgot what the correct result was, but it is different from Racket's result).

comment:9 Changed 10 years ago by felix winkelmann

Owner: changed from felix winkelmann to sjamaan

comment:10 Changed 10 years ago by felix winkelmann

Summary: expt does not check errno after calling pow() → Numbers egg does not compute the same result as other schemes

comment:11 Changed 10 years ago by sjamaan

I think the real problem is that 64-bit floating point numbers only have 52 mantissa bits (53 bits of precision) at their disposal; if both the top-most bits _and_ the lowest bit of a 64-bit integer are used, pow() does not seem to set errno:

gosh> (expt 999 6)
994014980014994001
gosh> (* 999 999 999 999 999 999)
994014980014994001
gosh> (* 999.0 999.0 999.0 999.0 999.0 999.0)
9.94014980014994e17
gosh> (inexact->exact (* 999.0 999.0 999.0 999.0 999.0 999.0))
994014980014994048

As you can see in the last call, precision is lost. This can be reproduced in Chicken, and even simplified to:

#;1> (inexact->exact (exact->inexact 994014980014994001))
994014980014994048

There is no way to get around this I'm afraid; on 64-bit systems, integers can hold values that doubles cannot represent exactly, namely those with both their lowest and highest bits set :(
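The same round trip can be reproduced in plain C, independent of any Scheme implementation. This is only a sketch; the function name roundtrip_through_double is made up for illustration. In the range between 2^59 and 2^60, adjacent doubles are 128 apart, so the integer is rounded to the nearest multiple of 128.

```c
#include <stdint.h>

/* Convert a 64-bit integer to double and back again.
 * Assumes the magnitude of n is far enough below 2^63 that the
 * conversion back to int64_t cannot overflow. */
int64_t roundtrip_through_double(int64_t n)
{
    double d = (double)n;   /* rounds to the nearest representable double */
    return (int64_t)d;      /* exact: d already holds an integer value */
}
```

Calling roundtrip_through_double(994014980014994001) yields 994014980014994048, matching the REPL output above.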

What's really annoying is that as far as I could find, nothing indicates this truncation even happened! The comparison between m1 and r doesn't help, since the truncation already happened before converting it to a C_word, so it compares as being equal.
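One way to notice the problem is to test the integer before it is ever converted, rather than comparing afterwards. The following C sketch is an assumption about how such a check could look and is not something the numbers egg or core does; the name exactly_representable is hypothetical. An integer survives conversion to double exactly when, after dropping its trailing zero bits, what remains fits in 53 bits.

```c
#include <stdbool.h>
#include <stdint.h>

/* Return true if n converts to an IEEE 754 double without rounding.
 * Trailing zero bits are absorbed by the exponent, so only the
 * remaining significant bits need to fit in the 53-bit significand. */
bool exactly_representable(int64_t n)
{
    /* Compute the magnitude in unsigned arithmetic so INT64_MIN is safe. */
    uint64_t mag = n < 0 ? 0 - (uint64_t)n : (uint64_t)n;

    if (mag == 0)
        return true;

    while ((mag & 1) == 0)
        mag >>= 1;

    return mag < (UINT64_C(1) << 53);
}
```

For the value from this ticket, exactly_representable(994014980014994001) is false, while it is true for the rounded value 994014980014994048.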

I ripped out the special case for fixnum args so it always takes the "slow" route in [23864] and the calculation now gives the correct result.
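To illustrate why staying in integer arithmetic avoids the rounding, here is a small C sketch of exponentiation by squaring with explicit overflow detection. It is not the code from [23864]; the names mul_u64 and exact_expt_u64 are hypothetical, and where this sketch gives up on overflow, a bignum-capable implementation would presumably switch to arbitrary-precision arithmetic.

```c
#include <stdbool.h>
#include <stdint.h>

/* Multiply two unsigned 64-bit values; return false if the product overflows. */
static bool mul_u64(uint64_t a, uint64_t b, uint64_t *out)
{
    if (a != 0 && b > UINT64_MAX / a)
        return false;
    *out = a * b;
    return true;
}

/* Compute base^exponent exactly by repeated squaring, never touching doubles.
 * Returns false if the result (or a needed intermediate) exceeds 64 bits. */
bool exact_expt_u64(uint64_t base, unsigned exponent, uint64_t *result)
{
    uint64_t acc = 1;

    while (exponent > 0) {
        if (exponent & 1) {
            if (!mul_u64(acc, base, &acc))
                return false;
        }
        exponent >>= 1;
        /* Square the base only if another round will use it. */
        if (exponent > 0 && !mul_u64(base, base, &base))
            return false;
    }

    *result = acc;
    return true;
}
```

With this sketch, exact_expt_u64(999, 6, &r) succeeds and leaves 994014980014994001 in r, the same exact value Gauche printed for (expt 999 6), while exact_expt_u64(999, 7, &r) returns false because the result no longer fits in 64 bits.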

If you agree with this and don't know a better solution either, please close the ticket and I'll tag a new release of "numbers". I'm unsure what should happen in core; I suppose it can just return the wrong result since there's nothing sane we can do.

God this sucks

comment:12 Changed 10 years ago by felix winkelmann

Resolution: worksforme
Status: assigned → closed

Yeah, I agree completely. Very unfortunate. Thanks for figuring this out.

comment:13 Changed 9 years ago by felix winkelmann

Milestone: 4.8.0 → 4.9.0

Milestone 4.8.0 deleted
