Opened 5 years ago

Last modified 13 months ago

#985 assigned defect

read-line blocks intermittantly on Atom based netbook

Reported by: kiatoa Owned by: kiatoa
Priority: minor Milestone: someday
Component: unknown Version: 4.8.x
Keywords: Cc:
Estimated difficulty: hard

Description

This problem first shows up in:

9eef92115fba6034a98cc21ef740fd82ea52387b is the first bad commit
commit 9eef92115fba6034a98cc21ef740fd82ea52387b
Author: Peter Bex <peter.bex@…>
Date: Sun Nov 18 21:03:51 2012 +0100

Fix select() buffer overrun vulnerability by using POSIX poll() on systems that support it, leaving only those few systems vulnerable that don't (ie, only Windows).


I used the following code snippet to test. It took as much as 10 runs to trigger the bug sometimes:

=============CODE=============
(use posix)

(define (conservative-read port)

(let loop ((res '()))

(if (not (eof-object? (peek-char port)))

(loop (cons (read-char port) res))
(apply conc (reverse res)))))

(define (cmd-run-with-stderr->list cmd . params)

(let-values (((fh fho pid fhe) (if (null? params)

(process* cmd)
(process* cmd params))))

(let loop ((curr (read-line fh))

(result '()))

(print "GOT HERE 1, curr=" curr)
(let ((errstr (conservative-read fhe)))

(print "GOT HERE 3, errstr=" errstr)
(if (not (string=? errstr ""))

(set! result (cons errstr result))))

(print "GOT HERE 2, result=" result)
(if (not (eof-object? curr))

(begin

(print "GOT HERE 4")
(loop (read-line fh)

(cons curr result)))

(begin

(print "GOT HERE 5")
(close-input-port fh)
(close-input-port fhe)
(close-output-port fho)
(reverse result))))))

(print "Got: " (cmd-run-with-stderr->list "ls"))

Attachments (1)

0001-Fix-985-by-making-process-ports-consistent-with-TCP-.patch (4.6 KB) - added by sjamaan 5 years ago.
A fix against master

Download all attachments as: .zip

Change History (8)

comment:1 in reply to: ↑ description Changed 5 years ago by kiatoa

Replying to kiatoa:

This problem first shows up in:

9eef92115fba6034a98cc21ef740fd82ea52387b is the first bad commit
commit 9eef92115fba6034a98cc21ef740fd82ea52387b
Author: Peter Bex <peter.bex@…>
Date: Sun Nov 18 21:03:51 2012 +0100

Fix select() buffer overrun vulnerability by using POSIX poll() on systems that support it, leaving only those few systems vulnerable that don't (ie, only Windows).


I used the following code snippet to test. It took as much as 10 runs to trigger the bug sometimes:

=============CODE=============
(use posix)

(define (conservative-read port)

(let loop ((res '()))

(if (not (eof-object? (peek-char port)))

(loop (cons (read-char port) res))
(apply conc (reverse res)))))

(define (cmd-run-with-stderr->list cmd . params)

(let-values (((fh fho pid fhe) (if (null? params)

(process* cmd)
(process* cmd params))))

(let loop ((curr (read-line fh))

(result '()))

(print "GOT HERE 1, curr=" curr)
(let ((errstr (conservative-read fhe)))

(print "GOT HERE 3, errstr=" errstr)
(if (not (string=? errstr ""))

(set! result (cons errstr result))))

(print "GOT HERE 2, result=" result)
(if (not (eof-object? curr))

(begin

(print "GOT HERE 4")
(loop (read-line fh)

(cons curr result)))

(begin

(print "GOT HERE 5")
(close-input-port fh)
(close-input-port fhe)
(close-output-port fho)
(reverse result))))))

(print "Got: " (cmd-run-with-stderr->list "ls"))

To trigger the issue I ran the above code 10-20 times in a row. Generally the problem would show up in the first or second run but sometimes it would take many iterations before the problem occurred.

comment:2 Changed 5 years ago by sjamaan

Interesting. I can reproduce the problem, but when I force the old "select()"-based implementation by ripping out HAVE_POSIX_POLL from Makefile.bsd, the same happens.

Seems this is more complicated than just the switch to poll.

comment:3 Changed 5 years ago by sjamaan

I just did a bisect and it came up with a different commit:

fa9ccaa030cf7acaceb15378a4d6c33464f0eb1f is the first bad commit
commit fa9ccaa030cf7acaceb15378a4d6c33464f0eb1f
Author: Peter Bex <peter.bex@…>
Date: Sat Feb 16 14:01:08 2013 +0100

Get rid of overflow situation in read-line causing lines to be read wrong


Signed-off-by: Christian Kellermann <ckeen@…>

That the bug was introduced somewhere in the read-line changes makes more sense to me.

comment:4 Changed 5 years ago by zbigniew

But kiatoa said the problem occurs with 4.8.0.1, and that patch isn't in stability (obviously, since it just got applied 4 days ago!) Perhaps you are just not triggering the problem reliably?

Changed 5 years ago by sjamaan

A fix against master

comment:5 Changed 5 years ago by sjamaan

There's a possibility there are two distinct bugs here (or two different places where it manifested), since you ran into the problem with 4.8.0.1. Could you please try this patch against master?

comment:6 Changed 5 years ago by sjamaan

Note: this bug only gets triggered if you run the program from a directory with sufficiently many files with sufficiently long filenames, I think (from the chicken source directory, for example)

comment:7 Changed 13 months ago by sjamaan

  • Estimated difficulty set to hard
  • Owner set to kiatoa
  • Status changed from new to assigned

Needs feedback

Note: See TracTickets for help on using tickets.