Opened 12 years ago

Last modified 8 years ago

#985 assigned defect

read-line blocks intermittantly on Atom based netbook

Reported by: Matt Welland Owned by: Matt Welland
Priority: minor Milestone: someday
Component: unknown Version: 4.8.x
Keywords: Cc:
Estimated difficulty: hard

Description

This problem first shows up in:

9eef92115fba6034a98cc21ef740fd82ea52387b is the first bad commit
commit 9eef92115fba6034a98cc21ef740fd82ea52387b
Author: Peter Bex <peter.bex@…>
Date: Sun Nov 18 21:03:51 2012 +0100

Fix select() buffer overrun vulnerability by using POSIX poll() on systems that support it, leaving only those few systems vulnerable that don't (ie, only Windows).


I used the following code snippet to test. It took as much as 10 runs to trigger the bug sometimes:

=============CODE=============
(use posix)

(define (conservative-read port)

(let loop ((res '()))

(if (not (eof-object? (peek-char port)))

(loop (cons (read-char port) res))
(apply conc (reverse res)))))

(define (cmd-run-with-stderr->list cmd . params)

(let-values (((fh fho pid fhe) (if (null? params)

(process* cmd)
(process* cmd params))))

(let loop ((curr (read-line fh))

(result '()))

(print "GOT HERE 1, curr=" curr)
(let ((errstr (conservative-read fhe)))

(print "GOT HERE 3, errstr=" errstr)
(if (not (string=? errstr ""))

(set! result (cons errstr result))))

(print "GOT HERE 2, result=" result)
(if (not (eof-object? curr))

(begin

(print "GOT HERE 4")
(loop (read-line fh)

(cons curr result)))

(begin

(print "GOT HERE 5")
(close-input-port fh)
(close-input-port fhe)
(close-output-port fho)
(reverse result))))))

(print "Got: " (cmd-run-with-stderr->list "ls"))

Attachments (1)

0001-Fix-985-by-making-process-ports-consistent-with-TCP-.patch (4.6 KB) - added by sjamaan 12 years ago.
A fix against master

Download all attachments as: .zip

Change History (8)

comment:1 in reply to:  description Changed 12 years ago by Matt Welland

Replying to kiatoa:

This problem first shows up in:

9eef92115fba6034a98cc21ef740fd82ea52387b is the first bad commit
commit 9eef92115fba6034a98cc21ef740fd82ea52387b
Author: Peter Bex <peter.bex@…>
Date: Sun Nov 18 21:03:51 2012 +0100

Fix select() buffer overrun vulnerability by using POSIX poll() on systems that support it, leaving only those few systems vulnerable that don't (ie, only Windows).


I used the following code snippet to test. It took as much as 10 runs to trigger the bug sometimes:

=============CODE=============
(use posix)

(define (conservative-read port)

(let loop ((res '()))

(if (not (eof-object? (peek-char port)))

(loop (cons (read-char port) res))
(apply conc (reverse res)))))

(define (cmd-run-with-stderr->list cmd . params)

(let-values (((fh fho pid fhe) (if (null? params)

(process* cmd)
(process* cmd params))))

(let loop ((curr (read-line fh))

(result '()))

(print "GOT HERE 1, curr=" curr)
(let ((errstr (conservative-read fhe)))

(print "GOT HERE 3, errstr=" errstr)
(if (not (string=? errstr ""))

(set! result (cons errstr result))))

(print "GOT HERE 2, result=" result)
(if (not (eof-object? curr))

(begin

(print "GOT HERE 4")
(loop (read-line fh)

(cons curr result)))

(begin

(print "GOT HERE 5")
(close-input-port fh)
(close-input-port fhe)
(close-output-port fho)
(reverse result))))))

(print "Got: " (cmd-run-with-stderr->list "ls"))

To trigger the issue I ran the above code 10-20 times in a row. Generally the problem would show up in the first or second run but sometimes it would take many iterations before the problem occurred.

comment:2 Changed 12 years ago by sjamaan

Interesting. I can reproduce the problem, but when I force the old "select()"-based implementation by ripping out HAVE_POSIX_POLL from Makefile.bsd, the same happens.

Seems this is more complicated than just the switch to poll.

comment:3 Changed 12 years ago by sjamaan

I just did a bisect and it came up with a different commit:

fa9ccaa030cf7acaceb15378a4d6c33464f0eb1f is the first bad commit
commit fa9ccaa030cf7acaceb15378a4d6c33464f0eb1f
Author: Peter Bex <peter.bex@…>
Date: Sat Feb 16 14:01:08 2013 +0100

Get rid of overflow situation in read-line causing lines to be read wrong


Signed-off-by: Christian Kellermann <ckeen@…>

That the bug was introduced somewhere in the read-line changes makes more sense to me.

comment:4 Changed 12 years ago by Jim Ursetto

But kiatoa said the problem occurs with 4.8.0.1, and that patch isn't in stability (obviously, since it just got applied 4 days ago!) Perhaps you are just not triggering the problem reliably?

Changed 12 years ago by sjamaan

A fix against master

comment:5 Changed 12 years ago by sjamaan

There's a possibility there are two distinct bugs here (or two different places where it manifested), since you ran into the problem with 4.8.0.1. Could you please try this patch against master?

comment:6 Changed 12 years ago by sjamaan

Note: this bug only gets triggered if you run the program from a directory with sufficiently many files with sufficiently long filenames, I think (from the chicken source directory, for example)

comment:7 Changed 8 years ago by sjamaan

Estimated difficulty: hard
Owner: set to Matt Welland
Status: newassigned

Needs feedback

Note: See TracTickets for help on using tickets.