Opened 5 months ago

Closed 5 months ago

Last modified 5 months ago

#1838 closed defect (fixed)

http-client does not multipart encode string ports properly

Reported by: Woodrow E Douglass Owned by:
Priority: major Milestone: someday
Component: unknown Version: 5.3.0
Keywords: http http-client Cc:
Estimated difficulty:

Description

When passing a string-port as part of the writer argument to with-input-from-request, the encoding step tries to read the port (i guess to get length?), closes it, and then tries to read it again and crashes.

I am relatively new at chicken scheme, and could not see a clean way to solve this issue. I worked around it with the below patch, by allowing a thunk to be passed that gets called for a new string port each time it's needed. this seems to work (although i am still testing)

If there is another place i should submit this patch, or if I've done anything obviously wrong (technically or project ettiquite), Please let me know. Thank you.

From 13eb8902e0505e26e6f41dd9b36e9f85460ba58a Mon Sep 17 00:00:00 2001
From: Woodrow Douglass <wdouglass@carnegierobotics.com>
Date: Mon, 24 Jun 2024 13:04:30 -0400
Subject: [PATCH] accept a procedure for file content, which is a thunk that
 returns a port

---
 http-client.scm | 40 +++++++++++++++++++++++-----------------
 1 file changed, 23 insertions(+), 17 deletions(-)

diff --git a/http-client.scm b/http-client.scm
index e9c181b..922ddfd 100644
--- a/http-client.scm
+++ b/http-client.scm
@@ -733,6 +733,7 @@
                 (list "--" boundary "\r\n" hs "\r\n"
                       (cond ((string? file) (cons 'file file))
                             ((port? file) (cons 'port file))
+                            ((procedure? file) (cons 'procedure file))
                             ((eq? keys #t) "")
                             (else (->string keys)))
                   ;; The next boundary must always start on a new line
@@ -744,12 +745,17 @@
   (for-each (lambda (entry)
               (for-each (lambda (chunk)
                           (if (pair? chunk)
-                              (let ((p (if (eq? 'file (car chunk))
-                                           (open-input-file (cdr chunk))
-                                           ;; Should be a port otherwise
-                                           (cdr chunk))))
+                              (let ((p (case (car chunk)
+                                         ((file) (open-input-file (cdr chunk)))
+                                         ((port) (cdr chunk))
+                                         ((procedure) ((cdr chunk)))
+                                         (else (http-client-error
+                                                'write-chunks
+                                                "The a file chunk must be either a string representing a filename, an open port, or a thunk that returns an open port"
+                                                '()
+                                                'multipart-file-error)))))
                                 (handle-exceptions exn
-                                  (begin (close-input-port p) (raise exn))
+                                    (begin (close-input-port p) (raise exn))
                                   (sendfile p output-port))
                                 (close-input-port p))
                               (display chunk output-port)))
@@ -770,18 +776,18 @@
      (fold (lambda (chunks total-size)
              (fold (lambda (chunk total-size)
                      (if (pair? chunk)
-                         (if (eq? 'port (car chunk))
-                             (let ((str-len (maybe-string-port-length (cdr chunk))))
-                               (if str-len
-                                   (+ total-size str-len)
-                                   ;; We can't calculate port lengths
-                                   ;; for non-string-ports.  Let's just
-                                   ;; punt and hope the server won't
-                                   ;; return "411 Length Required"...
-                                   ;; (TODO: maybe try seeking it?)
-                                   (return #f)))
-                             ;; Should be a file otherwise.
-                             (+ total-size (file-size (cdr chunk))))
+                         (if (eq? 'file (car chunk))
+                             (+ total-size (file-size (cdr chunk)))
+                             (let ((p (if (eq? 'port (car chunk)) (cdr chunk) ((cdr chunk)))))
+                               (let ((str-len (maybe-string-port-length (cdr chunk))))
+                                 (if str-len
+                                     (+ total-size str-len)
+                                     ;; We can't calculate port lengths
+                                     ;; for non-string-ports.  Let's just
+                                     ;; punt and hope the server won't
+                                     ;; return "411 Length Required"...
+                                     ;; (TODO: maybe try seeking it?)
+                                     (return #f)))))
                          (+ total-size (string-length chunk))))
                    total-size
                    chunks))
-- 
2.39.2

Attachments (4)

0001-accept-a-procedure-for-file-content-which-is-a-thunk.patch (4.2 KB) - added by Woodrow E Douglass 5 months ago.
client.scm (427 bytes) - added by Woodrow E Douglass 5 months ago.
redirect.scm (917 bytes) - added by Woodrow E Douglass 5 months ago.
redirect-string-port-test.patch (917 bytes) - added by Woodrow E Douglass 5 months ago.

Download all attachments as: .zip

Change History (18)

comment:1 Changed 5 months ago by sjamaan

Hi Woodrow,

Thanks for the bug report and the patch!

As far as I can see, the code specifically contains a case for the string port, there's even a procedure called {{maybe-string-port-length}} which obtains the string's length. So the port shouldn't be closed before it tries to read.

There's also a {{test-group}} in the test suite named {{"alist form data body with string port"}} which is supposed to test this particular case, but of course I wouldn't rule out any bugs. Perhaps you can have a look at that and see how it differs from your use case?

If you are sure it's a bug, I'd appreciate a self-contained sample (or better yet, a failing test case as a patch to the test suite) which fails.

comment:2 Changed 5 months ago by Woodrow E Douglass

So the test case works, and i think that test is valuable, but I've found an interesting thing:

The request in question is a PUT request against a PULP object server
https://pulpproject.org/

The same request that fails with

Error: (read-string!) port already closed: #<input port "(string)">

against the pulp server, succeeds when put against posttestserver.dev (which also has ssl enabled, so ssl is ruled out as a cause here)

I'll keep investigating, i'm not sure what the issue is yet. Sorry for the noise, this issue is more nuanced then i originally thought

Changed 5 months ago by Woodrow E Douglass

Attachment: client.scm added

Changed 5 months ago by Woodrow E Douglass

Attachment: redirect.scm added

comment:3 Changed 5 months ago by Woodrow E Douglass

So it seems the re-use attampt of a closed string port happens after a 301 redirect -- I've attached a client.scm and a redirect.scm pair that reproduce the issue

Changed 5 months ago by Woodrow E Douglass

comment:4 Changed 5 months ago by Woodrow E Douglass

redirect-string-port-test.patch modifies the string-port test to add a redirect, and this issue pops up

comment:5 Changed 5 months ago by sjamaan

Thanks for the investigation and test case, that makes more sense now. I'll look into it, hopefully within the next two weeks.

From a quick glance, I think the code for closing ports needs to be overhauled - I think we need to rewind the port position. I'm not sure what to do with custom ports in this case though! They don't have a seek operation defined on them.

One option would be to store the body somewhere (in memory, or on disk? What if it's an infinite stream?) and resubmit it, but that seems rather ugly. I'd also have to see how other HTTP libraries deal with this eventuality.

As an immediate workaround, you could simply change the URL in your application to use the final URL after redirection, to avoid getting redirected in the first place. This of course only works if the redirect is stable and predictable.

comment:6 Changed 5 months ago by sjamaan

BTW, why are you using a string port? It might make more sense to simply pass the underlying string.

This would remove all the problems you've been having, even if you must rely on being able to process the redirect.

comment:7 Changed 5 months ago by Woodrow E Douglass

I'm using a string port because the server wants that content as a "file" object (in a multipart/form-data body), and the file: argument of the writer alist interprets a string as a filename. The "procedure that returns a string port" solution is working for me for now -- another possible solution is to have the implicit redirect behavior be optional (maybe automatic if the writer argument or one of it's leaves is or isn't a port), and allow the user to set up another request for the redirected URI.

Thank you so much for being so responsive, and for this great library!

comment:8 Changed 5 months ago by Woodrow E Douglass

Just realized i can get the optional redirect behavior i want by setting max-redirect-path. I'll experiment with that today.

Version 0, edited 5 months ago by Woodrow E Douglass (next)

comment:9 Changed 5 months ago by Woodrow E Douglass

Hmm it seems (max-redirect-depth 0) means unlimited, back to the drawing board

comment:10 Changed 5 months ago by sjamaan

I'm using a string port because the server wants that content as a "file" object (in a multipart/form-data body), and the file: argument of the writer alist interprets a string as a filename.

Hm, that's unfortunate - the body being multipart shouldn't really make a difference, normally. Maybe a way to force that, even if all the entries are strings would be helpful as well...

comment:11 Changed 5 months ago by sjamaan

Hmm it seems (max-redirect-depth 0) means unlimited, back to the drawing board

No, that should indeed stop redirects altogether. Using (max-redirect-depth #f) disables it. The docs state this pretty clearly, and the code definitely checks for this.

Anyway, I just pushed a few commits to master, could you check it out if this makes it easier for you? See https://code.more-magic.net/http-client (you can clone it with git clone https://code.more-magic.net/http-client, change into the dir and type chicken-install to install it)

This new version changes two things:

  • It is possible to use data: with a string instead of file: and a port to force a multipart request even with static strings.
  • It detects whether ports passed in via file: are already closed and raises a specific exception (exn http port-already-consumed) with some hints as how to fix it.

comment:12 Changed 5 months ago by Woodrow E Douglass

the data: designater works as you describe, and solves my problem; thanks! also, (max-redirect-depth 0) does disable redirects, i'm not sure what i was doing wrong before. thank you very much!

comment:13 Changed 5 months ago by Woodrow E Douglass

Resolution: fixed
Status: newclosed

comment:14 Changed 5 months ago by sjamaan

Cool, I've released this as 1.2.2 and updated the documentation

Note: See TracTickets for help on using tickets.