Opened 13 years ago
Closed 12 years ago
#800 closed defect (fixed)
data seems to get lost in tcp unit when using http-client
Reported by: | Christian Kellermann | Owned by: | |
---|---|---|---|
Priority: | major | Milestone: | |
Component: | core libraries | Version: | 4.7.x |
Keywords: | tcp, http-client, connection reset by peer who-is-this-Peer-anyway | Cc: | |
Estimated difficulty: |
Description
The original sympthom:
In a podcast fetcher I am using http-client to retrieve data, one of the sources sends a HTTP 302 response that does not seem to get handled by http-client, wherever other redirects with this status code work. I am using with-input-from request for this task.
The effect is that a "Connection reset by peer" condition is raised to the callee on this particular URI: http://traffic.libsyn.com/amberstar/Zencast351.mp3
curl shows what's happening:
$ curl -v http://traffic.libsyn.com/amberstar/Zencast351.mp3 * About to connect() to traffic.libsyn.com port 80 (#0) * Trying 204.16.242.150... connected * Connected to traffic.libsyn.com (204.16.242.150) port 80 (#0) > GET /amberstar/Zencast351.mp3 HTTP/1.1 > User-Agent: curl/7.21.3 (x86_64-pc-linux-gnu) libcurl/7.21.3 OpenSSL/0.9.8o zlib/1.2.3.4 libidn/1.18 > Host: traffic.libsyn.com > Accept: */* > * HTTP 1.0, assume close after body < HTTP/1.0 302 Found < Date: Tue, 13 Mar 2012 16:03:42 GMT < Server: Apache/2.2.14 (Ubuntu) < X-Powered-By: PHP/5.3.9-ZS5.6.0 ZendServer/5.0 < Location: http://hw.libsyn.com/p/8/9/4/89406686a8fc50bc/Zencast351.mp3?sid=afdf5215984e979b156a06471cc24b22&l_sid=18965&l_eid=&l_mid=2881717&expiration=1331661584&hwt=4318c901816885f4a19f0e3dbe1fd49b < X-Libsyn-Host: traffic3.libsyn.com < Vary: Accept-Encoding < Content-Length: 0 < Connection: close < Content-Type: text/html < * Closing connection #0
The additional debug output on the follwoing is mine and does not effect the result...
What it shows (to me) is that in this case the http-client redirection handling is not called at all.
Instead it gets stuck in http-client after sending the request, when reading the response from the server, which throws and error and so the retry loop is restarted.
The master chicken says:
$ csi -R http-client -p '(with-input-from-request "http://traffic.libsyn.com/amberstar/Zencast351.mp3" #f read-lines)' Req: #<URI-common: scheme=http port=#f host="traffic.libsyn.com" path=(/ "amberstar" "Zencast351.mp3") query=() fragment=#f> send request: #<request> reading response reading from inport #<input port "(tcp)"> ((exn (location #f) (call-chain (#("intarweb.scm:35: display" #f #f) #("intarweb.scm:35: request-headers" #f #f) #("intarweb.scm:35: header-values" #f #f) #("header-parsers.scm:9: header-contents" #f #f) #("header-parsers.scm:5: alist-ref" #f #f) #("http-client.scm:34: print" #f #f) #("http-client.scm:34: writer" #f #f) #("http-client.scm:34: intarweb#request-port" #f #f) #("http-client.scm:34: write-data!" #f #f) #("http-client.scm:34: intarweb#request-port" #f #f) #("http-client.scm:34: flush-output" #f #f) #("http-client.scm:34: print" #f #f) #("http-client.scm:34: intarweb#read-response" #f #f) #("intarweb.scm:35: print" #f #f) #("intarweb.scm:35: read-line-limit" #f #f) #("intarweb.scm:35: read-line" #f #f))) (arguments (3)) (message "cannot read from socket - Connection reset by peer")) (i/o) (net)) Req: #<URI-common: scheme=http port=#f host="traffic.libsyn.com" path=(/ "amberstar" "Zencast351.mp3") query=() fragment=#f> send request: #<request> reading response reading from inport #<input port "(tcp)"> ((exn (location #f) (call-chain (#("intarweb.scm:35: display" #f #f) #("intarweb.scm:35: request-headers" #f #f) #("intarweb.scm:35: header-values" #f #f) #("header-parsers.scm:9: header-contents" #f #f) #("header-parsers.scm:5: alist-ref" #f #f) #("http-client.scm:34: print" #f #f) #("http-client.scm:34: writer" #f #f) #("http-client.scm:34: intarweb#request-port" #f #f) #("http-client.scm:34: write-data!" #f #f) #("http-client.scm:34: intarweb#request-port" #f #f) #("http-client.scm:34: flush-output" #f #f) #("http-client.scm:34: print" #f #f) #("http-client.scm:34: intarweb#read-response" #f #f) #("intarweb.scm:35: print" #f #f) #("intarweb.scm:35: read-line-limit" #f #f) #("intarweb.scm:35: read-line" #f #f))) (arguments (3)) (message "cannot read from socket - Connection reset by peer")) (i/o) (net)) Req: #<URI-common: scheme=http port=#f host="traffic.libsyn.com" path=(/ "amberstar" "Zencast351.mp3") query=() fragment=#f> send request: #<request> reading response reading from inport #<input port "(tcp)"> ((exn (location #f) (call-chain (#("intarweb.scm:35: display" #f #f) #("intarweb.scm:35: request-headers" #f #f) #("intarweb.scm:35: header-values" #f #f) #("header-parsers.scm:9: header-contents" #f #f) #("header-parsers.scm:5: alist-ref" #f #f) #("http-client.scm:34: print" #f #f) #("http-client.scm:34: writer" #f #f) #("http-client.scm:34: intarweb#request-port" #f #f) #("http-client.scm:34: write-data!" #f #f) #("http-client.scm:34: intarweb#request-port" #f #f) #("http-client.scm:34: flush-output" #f #f) #("http-client.scm:34: print" #f #f) #("http-client.scm:34: intarweb#read-response" #f #f) #("intarweb.scm:35: print" #f #f) #("intarweb.scm:35: read-line-limit" #f #f) #("intarweb.scm:35: read-line" #f #f))) (arguments (3)) (message "cannot read from socket - Connection reset by peer")) (i/o) (net)) Error: cannot read from socket - Connection reset by peer: 3 Call history: alist-ref http-client.scm:34: write http-client.scm:34: get-output-string http-client.scm:34: uri-common#uri-port uri-generic#uri-port uri-generic#uri-scheme alist-ref http-client.scm:34: uri-common#uri-port uri-generic#uri-port uri-generic#uri-scheme alist-ref http-client.scm:34: uri-common#uri-host http-client.scm:34: uri-common#uri-host http-client.scm:34: max-retry-attempts http-client.scm:34: max-retry-attempts http-client.scm:34: raise <--
The chicken in use is
CHICKEN (c)2008-2011 The Chicken Team (c)2000-2007 Felix L. Winkelmann Version 4.7.5 (rev 4a0e635) linux-unix-gnu-x86-64 [ 64bit manyargs dload ptables ] compiled 2012-03-13 on devpool08 (Linux)
This does *not* happen with Version 4.7.0.3-st (stability/4.7.0). With the stability chicken the redirect is handled as expected.
Well the following code works in the first attempt but fails at the second:
(use http-client tcp) ; This works (let-values (((i o) (tcp-connect "traffic.libsyn.com" 80))) (fprintf o "GET /amberstar/Zencast351.mp3 HTTP/1.1\r\n\r\n") (print "reading directly with tcp connect...") (printf "~s~%" (read-lines i)) (print "closing connection") (close-output-port o) (close-input-port i)) ; This raises a "connection reset by peer error" (print "Requesting url with-input-from-request...") (printf "Got ~a~%" (length (with-input-from-request "http://traffic.libsyn.com/amberstar/Zencast351.mp3" #f read-lines)))
Again this also works with the stability branch.
The http-client in use in both cases is version 0.5
Change History (8)
comment:1 Changed 13 years ago by
comment:2 Changed 13 years ago by
Indeed, that seems to be related to compiler optimisations, maybe inlining. Compiling the eggs with master and no optimisations (CSC_OPTIONS set to -O0) the test works just fine.
comment:4 Changed 13 years ago by
specifically -no-usual-integrations makes it work. Seems like the usual integrations are the usual suspects...
comment:5 Changed 13 years ago by
Changing the fprintf call in the write-request-line procedure to the following also makes it work:
(define (write-request-line request) (display (sprintf "~A ~A HTTP/~A.~A\r\n" (http-method->string (request-method request)) (uri->string (request-uri request)) (request-major request) (request-minor request)) (request-port request)))
comment:6 Changed 13 years ago by
Milestone: | 4.8.0 |
---|
"-debug 2" shows a correct fprintf compiler-macro expansion.
I assume that the fprintf expands into code that is too slow - it generates several procedure calls, coming in in separate packets, and the server is too impatient to wait for a complete response. It may be the case that the server reads one chunk of incoming data and assume nothing else will follow (a buffering issue, in other words).
So I propose to change http-client to make sure the request line is written in one chunk. The compiler-macro expansion is correct.
comment:7 Changed 12 years ago by
I've changed intarweb to do this now. Could you check, C-Keen, so we can close this ticket?
A git bisect reveals that commit
triggers the error first. This does not make much sense to me atm.
I will further investigate and come up with a patch...