Opened 14 years ago
Closed 13 years ago
#800 closed defect (fixed)
data seems to get lost in tcp unit when using http-client
| Reported by: | Christian Kellermann | Owned by: | |
|---|---|---|---|
| Priority: | major | Milestone: | |
| Component: | core libraries | Version: | 4.7.x | 
| Keywords: | tcp, http-client, connection reset by peer who-is-this-Peer-anyway | Cc: | |
| Estimated difficulty: | 
Description
The original sympthom:
In a podcast fetcher I am using http-client to retrieve data, one of the sources sends a HTTP 302 response that does not seem to get handled by http-client, wherever other redirects with this status code work. I am using with-input-from request for this task.
The effect is that a "Connection reset by peer" condition is raised to the callee on this particular URI: http://traffic.libsyn.com/amberstar/Zencast351.mp3
curl shows what's happening:
$ curl -v http://traffic.libsyn.com/amberstar/Zencast351.mp3 * About to connect() to traffic.libsyn.com port 80 (#0) * Trying 204.16.242.150... connected * Connected to traffic.libsyn.com (204.16.242.150) port 80 (#0) > GET /amberstar/Zencast351.mp3 HTTP/1.1 > User-Agent: curl/7.21.3 (x86_64-pc-linux-gnu) libcurl/7.21.3 OpenSSL/0.9.8o zlib/1.2.3.4 libidn/1.18 > Host: traffic.libsyn.com > Accept: */* > * HTTP 1.0, assume close after body < HTTP/1.0 302 Found < Date: Tue, 13 Mar 2012 16:03:42 GMT < Server: Apache/2.2.14 (Ubuntu) < X-Powered-By: PHP/5.3.9-ZS5.6.0 ZendServer/5.0 < Location: http://hw.libsyn.com/p/8/9/4/89406686a8fc50bc/Zencast351.mp3?sid=afdf5215984e979b156a06471cc24b22&l_sid=18965&l_eid=&l_mid=2881717&expiration=1331661584&hwt=4318c901816885f4a19f0e3dbe1fd49b < X-Libsyn-Host: traffic3.libsyn.com < Vary: Accept-Encoding < Content-Length: 0 < Connection: close < Content-Type: text/html < * Closing connection #0
The additional debug output on the follwoing is mine and does not effect the result...
What it shows (to me) is that in this case the http-client redirection handling is not called at all.
Instead it gets stuck in http-client after sending the request, when reading the response from the server, which throws and error and so the retry loop is restarted.
The master chicken says:
$ csi -R http-client -p '(with-input-from-request "http://traffic.libsyn.com/amberstar/Zencast351.mp3" #f read-lines)'
Req: #<URI-common: scheme=http port=#f host="traffic.libsyn.com" path=(/ "amberstar" "Zencast351.mp3") query=() fragment=#f>
send request: #<request>
reading response
reading from inport #<input port "(tcp)">
((exn (location #f)
      (call-chain
        (#("intarweb.scm:35: display" #f #f)
         #("intarweb.scm:35: request-headers" #f #f)
         #("intarweb.scm:35: header-values" #f #f)
         #("header-parsers.scm:9: header-contents" #f #f)
         #("header-parsers.scm:5: alist-ref" #f #f)
         #("http-client.scm:34: print" #f #f)
         #("http-client.scm:34: writer" #f #f)
         #("http-client.scm:34: intarweb#request-port" #f #f)
         #("http-client.scm:34: write-data!" #f #f)
         #("http-client.scm:34: intarweb#request-port" #f #f)
         #("http-client.scm:34: flush-output" #f #f)
         #("http-client.scm:34: print" #f #f)
         #("http-client.scm:34: intarweb#read-response" #f #f)
         #("intarweb.scm:35: print" #f #f)
         #("intarweb.scm:35: read-line-limit" #f #f)
         #("intarweb.scm:35: read-line" #f #f)))
      (arguments (3))
      (message "cannot read from socket - Connection reset by peer"))
 (i/o)
 (net))
Req: #<URI-common: scheme=http port=#f host="traffic.libsyn.com" path=(/ "amberstar" "Zencast351.mp3") query=() fragment=#f>
send request: #<request>
reading response
reading from inport #<input port "(tcp)">
((exn (location #f)
      (call-chain
        (#("intarweb.scm:35: display" #f #f)
         #("intarweb.scm:35: request-headers" #f #f)
         #("intarweb.scm:35: header-values" #f #f)
         #("header-parsers.scm:9: header-contents" #f #f)
         #("header-parsers.scm:5: alist-ref" #f #f)
         #("http-client.scm:34: print" #f #f)
         #("http-client.scm:34: writer" #f #f)
         #("http-client.scm:34: intarweb#request-port" #f #f)
         #("http-client.scm:34: write-data!" #f #f)
         #("http-client.scm:34: intarweb#request-port" #f #f)
         #("http-client.scm:34: flush-output" #f #f)
         #("http-client.scm:34: print" #f #f)
         #("http-client.scm:34: intarweb#read-response" #f #f)
         #("intarweb.scm:35: print" #f #f)
         #("intarweb.scm:35: read-line-limit" #f #f)
         #("intarweb.scm:35: read-line" #f #f)))
      (arguments (3))
      (message "cannot read from socket - Connection reset by peer"))
 (i/o)
 (net))
Req: #<URI-common: scheme=http port=#f host="traffic.libsyn.com" path=(/ "amberstar" "Zencast351.mp3") query=() fragment=#f>
send request: #<request>
reading response
reading from inport #<input port "(tcp)">
((exn (location #f)
      (call-chain
        (#("intarweb.scm:35: display" #f #f)
         #("intarweb.scm:35: request-headers" #f #f)
         #("intarweb.scm:35: header-values" #f #f)
         #("header-parsers.scm:9: header-contents" #f #f)
         #("header-parsers.scm:5: alist-ref" #f #f)
         #("http-client.scm:34: print" #f #f)
         #("http-client.scm:34: writer" #f #f)
         #("http-client.scm:34: intarweb#request-port" #f #f)
         #("http-client.scm:34: write-data!" #f #f)
         #("http-client.scm:34: intarweb#request-port" #f #f)
         #("http-client.scm:34: flush-output" #f #f)
         #("http-client.scm:34: print" #f #f)
         #("http-client.scm:34: intarweb#read-response" #f #f)
         #("intarweb.scm:35: print" #f #f)
         #("intarweb.scm:35: read-line-limit" #f #f)
         #("intarweb.scm:35: read-line" #f #f)))
      (arguments (3))
      (message "cannot read from socket - Connection reset by peer"))
 (i/o)
 (net))
Error: cannot read from socket - Connection reset by peer: 3
	Call history:
	alist-ref	  
	http-client.scm:34: write	  
	http-client.scm:34: get-output-string	  
	http-client.scm:34: uri-common#uri-port	  
	uri-generic#uri-port	  
	uri-generic#uri-scheme	  
	alist-ref	  
	http-client.scm:34: uri-common#uri-port	  
	uri-generic#uri-port	  
	uri-generic#uri-scheme	  
	alist-ref	  
	http-client.scm:34: uri-common#uri-host	  
	http-client.scm:34: uri-common#uri-host	  
	http-client.scm:34: max-retry-attempts	  
	http-client.scm:34: max-retry-attempts	  
	http-client.scm:34: raise	  	<--
The chicken in use is
CHICKEN (c)2008-2011 The Chicken Team (c)2000-2007 Felix L. Winkelmann Version 4.7.5 (rev 4a0e635) linux-unix-gnu-x86-64 [ 64bit manyargs dload ptables ] compiled 2012-03-13 on devpool08 (Linux)
This does *not* happen with Version 4.7.0.3-st (stability/4.7.0). With the stability chicken the redirect is handled as expected.
Well the following code works in the first attempt but fails at the second:
(use http-client tcp) ; This works (let-values (((i o) (tcp-connect "traffic.libsyn.com" 80))) (fprintf o "GET /amberstar/Zencast351.mp3 HTTP/1.1\r\n\r\n") (print "reading directly with tcp connect...") (printf "~s~%" (read-lines i)) (print "closing connection") (close-output-port o) (close-input-port i)) ; This raises a "connection reset by peer error" (print "Requesting url with-input-from-request...") (printf "Got ~a~%" (length (with-input-from-request "http://traffic.libsyn.com/amberstar/Zencast351.mp3" #f read-lines)))
Again this also works with the stability branch.
The http-client in use in both cases is version 0.5
Change History (8)
comment:1 Changed 14 years ago by
comment:2 Changed 14 years ago by
Indeed, that seems to be related to compiler optimisations, maybe inlining. Compiling the eggs with master and no optimisations (CSC_OPTIONS set to -O0) the test works just fine.
comment:4 Changed 14 years ago by
specifically -no-usual-integrations makes it work. Seems like the usual integrations are the usual suspects...
comment:5 Changed 14 years ago by
Changing the fprintf call in the write-request-line procedure to the following also makes it work:
(define (write-request-line request)
  (display
   (sprintf
    "~A ~A HTTP/~A.~A\r\n"
    (http-method->string (request-method request))
    (uri->string (request-uri request))
    (request-major request)
    (request-minor request))
    (request-port request)))
    comment:6 Changed 14 years ago by
| Milestone: | 4.8.0 | 
|---|
"-debug 2" shows a correct fprintf compiler-macro expansion.
I assume that the fprintf expands into code that is too slow - it generates several procedure calls, coming in in separate packets, and the server is too impatient to wait for a complete response. It may be the case that the server reads one chunk of incoming data and assume nothing else will follow (a buffering issue, in other words).
So I propose to change http-client to make sure the request line is written in one chunk. The compiler-macro expansion is correct.
comment:7 Changed 13 years ago by
I've changed intarweb to do this now.  Could you check, C-Keen, so we can close this ticket?

A git bisect reveals that commit
be1963374922e42024433f516f15d394a0b2e2f9 is the first bad commit commit be1963374922e42024433f516f15d394a0b2e2f9 Author: felix <felix@call-with-current-continuation.org> Date: Thu Dec 15 11:15:53 2011 +0100 fprintf wasn't internally registered as extended binding :100644 100644 fbe77ac1c8c32a4fee3c01ffbaeaa7a52d517188 fb9c4bdd49b4b79ff531e1c05ad34e86c54aa641 M c-platform.scm bisect run successtriggers the error first. This does not make much sense to me atm.
I will further investigate and come up with a patch...