Opened 12 years ago

Closed 9 years ago

#858 closed defect (fixed)

spiffy-request-vars' test shows "cannot write to socket - Broken pipe" at random

Reported by: Mario Domenech Goulart Owned by: Mario Domenech Goulart
Priority: critical Milestone:
Component: core libraries Version: 4.7.x
Keywords: spiffy-request-vars, cannot write to socket, broken pipe Cc:
Estimated difficulty:

Description

Since spiffy-request-vars' tests have been fixed (they were not working at all before), it's been presenting a strange behavior.

A particular test causes a "cannot write to socket - Broken pipe" error, although it seems to get the expected result.

This behavior is intermittent. It worked on May 19, 20, 21, 23 and 24, but caused the "cannot write to socket" error on 18, 22 and 25.

Change History (8)

comment:1 Changed 12 years ago by Mario Domenech Goulart

Priority: majorcritical

Salmonella reports for prerelease don't show that error: 18, 19, 20, 21, 22, 23, 24 and 25

comment:2 Changed 12 years ago by Mario Domenech Goulart

Here's a simpler code snippet that can be used to reproduce a similar problem:

Server part:

$ csi -e '(use spiffy) (start-server)'

Client part:

$ cat client.scm
(use http-client)

(define (get path)
  (let ((val (handle-exceptions exn
               'ignore
               (with-input-from-request
                (string-append "http://localhost:8080" path)
                #f
                read-string))))
    (close-all-connections!)
    val))

(let loop ((i 0))
  (unless (= i 10000)
    (print i)
    (get "/foo")
    (loop (+ i 1))))

$ csi -s client.scm

I suspect the error is caused by some client issue, since I don't get any error when hitting the server with ab.

By stracing the client, I could see it stops on select:

connect(4, {sa_family=AF_INET, sin_port=htons(8080), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EINPROGRESS (Operation now in progress)
select(5, NULL, [4], NULL, {0, 0})      = 1 (out [4], left {0, 0})
getsockopt(4, SOL_SOCKET, SO_ERROR, [0], [4]) = 0
fcntl64(4, F_GETFL)                     = 0x802 (flags O_RDWR|O_NONBLOCK)
fcntl64(4, F_SETFL, O_RDWR|O_NONBLOCK)  = 0
send(4, "GET", 3, 0)                    = 3
send(4, " ", 1, 0)                      = 1
send(4, "/foo", 4, 0)                   = 4
send(4, " HTTP/", 6, 0)                 = 6
send(4, "1", 1, 0)                      = 1
send(4, ".", 1, 0)                      = 1
send(4, "1", 1, 0)                      = 1
send(4, "\r\n", 2, 0)                   = 2
send(4, "Host", 4, 0)                   = 4
send(4, ": ", 2, 0)                     = 2
send(4, "localhost:8080", 14, 0)        = 14
send(4, "\r\n", 2, 0)                   = 2
send(4, "User-Agent", 10, 0)            = 10
send(4, ": ", 2, 0)                     = 2
send(4, "Chicken Scheme HTTP-client/0.5", 30, 0) = 30
send(4, "\r\n", 2, 0)                   = 2
send(4, "\r\n", 2, 0)                   = 2
recv(4, 0xb7704df8, 1024, 0)            = -1 EAGAIN (Resource temporarily unavailable)
gettimeofday({1338590646, 168171}, NULL) = 0
gettimeofday({1338590646, 168201}, NULL) = 0
gettimeofday({1338590646, 168229}, NULL) = 0
select(1024, [4], [], NULL, {60, 0}

comment:3 Changed 12 years ago by felix winkelmann

Component: unknowncore libraries

That strace output is strange: recv returns with EGAIN, which should produce "cannot _read_ from socket". Are you sure this is the correct error message?

We possible need to handle EGAIN in the tcp port implementation.

comment:4 Changed 12 years ago by felix winkelmann

Owner: set to Mario Domenech Goulart
Status: newassigned

Please try the current git master - I added support for EGAIN, so perhaps one of the problems you report can be handled.

comment:5 Changed 12 years ago by Mario Domenech Goulart

Thanks, Felix. I've run the test case above with your patch applied to chicken. It seems that it doesn't change much, unfortunately. The client still pauses sometimes. After the pauses, the server issues Error: cannot write to socket - Broken pipe 4 messages and go on.

The client still shows

send(4, "User-Agent", 10, 0)            = 10
send(4, ": ", 2, 0)                     = 2
send(4, "Chicken Scheme HTTP-client/0.5", 30, 0) = 30
send(4, "\r\n", 2, 0)                   = 2
send(4, "\r\n", 2, 0)                   = 2
recv(4, 0xb70dc71c, 1024, 0)            = -1 EAGAIN (Resource temporarily unavailable)
gettimeofday({1339504932, 16290}, NULL) = 0
gettimeofday({1339504932, 16338}, NULL) = 0
gettimeofday({1339504932, 16386}, NULL) = 0
select(1024, [4], [], NULL, {60, 0}

when paused.

I notice the pauses are much less frequent when the server is run with strace.

I'm using 5247d0dde5762d7ae984746b4c4f617b8e60ba17

comment:6 Changed 12 years ago by Mario Domenech Goulart

I just rerun the test case against chicken with 7e7d44aca43f5522c138668a45baaaae823f8877 (Correct typo in EAGAIN patch spotted by Marijn). Nothing seems to have changed, as far as I can see.

I still get Error: cannot write to socket - Broken pipe and several client pauses along its execution.

comment:7 Changed 12 years ago by felix winkelmann

Milestone: 4.8.0

comment:8 Changed 9 years ago by Mario Domenech Goulart

Resolution: fixed
Status: assignedclosed

I'm closing this one, since that behavior hasn't been observed recently.

Note: See TracTickets for help on using tickets.