
Comments (11)

jsolman avatar jsolman commented on May 10, 2024

I assume this means support for receiving files via multipart/form-data?

I was thinking about adding an issue to the list, but noticed this one.

For Content-Type: multipart/form-data
Currently, only non-control-character data appearing directly after the following header can be received:
Content-Disposition: form-data; name="whatever"

In order to support receiving files, it will likely be necessary to handle more of what is discussed in RFC 2388.
Specifically, parsing of the content-type, charset, and content-transfer-encoding needs handling, as in this example:

    --AaB03x
    content-disposition: form-data; name="field1"
    content-type: text/plain;charset=windows-1250
    content-transfer-encoding: quoted-printable

    Joe owes =80100.
    --AaB03x

Perhaps for each name/value pair added to the arguments[][] table it would be better to save a table containing, at index [1], the value data (as determined by what appears before the end boundary), along with the content-type under key ["content-type"], the charset under key ["charset"], and the content-transfer-encoding under key ["content-transfer-encoding"]. So for the prior example the following entry would be added to the arguments table:

    arguments["field1"][1] = { [1] = "Joe owes =80100.", ["content-type"] = "text/plain",
        ["charset"] = "windows-1250", ["content-transfer-encoding"] = "quoted-printable" }
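A minimal sketch of how one part's headers could be folded into such an entry — `parse_part` and the `headers` table are hypothetical names for illustration, not existing Turbo API:

```lua
-- Hypothetical sketch: build one arguments-table entry from the parsed
-- headers and body of a single multipart part. `headers` is assumed to
-- be a table keyed by lower-cased header name.
local function parse_part(headers, body)
    local entry = { body }  -- [1] = the raw value data
    local ct = headers["content-type"]
    if ct then
        entry["content-type"] = ct:match("^%s*([^;%s]+)")
        entry["charset"] = ct:match("charset=([%w%-]+)")
    end
    entry["content-transfer-encoding"] = headers["content-transfer-encoding"]
    return entry
end

local entry = parse_part({
    ["content-type"] = "text/plain;charset=windows-1250",
    ["content-transfer-encoding"] = "quoted-printable",
}, "Joe owes =80100.")
-- entry["content-type"] is "text/plain", entry["charset"] is "windows-1250"
```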

I am considering implementing this today to finish adding some file upload functionality to the site I am designing. Please let me know if you have any suggestions.

from turbo.

kernelsauce avatar kernelsauce commented on May 10, 2024

Yes, jsolman, that is exactly what it means. The ability to parse different body payloads is lacking at the moment. If you want to take a shot at it, I would be very glad :). The only suggestions I have are to keep the APIs compatible and to think a little about minimizing the possible attack vectors.


jsolman avatar jsolman commented on May 10, 2024

I am working on it; you should see something by the end of tomorrow, I expect. The code that was there before ran escape.unescape on the entire request body. I am only unescaping the actual content, but should that perhaps only happen when the content-type indicates JavaScript or JSON (i.e. application/javascript, application/json, application/x-javascript, text/x-javascript, text/x-json), or some others?
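One way that type check could look — a sketch only; `maybe_unescape` is a hypothetical helper name, and `escape.unescape` is assumed to behave as in the existing code:

```lua
-- Hypothetical helper: only unescape payloads whose content-type
-- indicates JavaScript/JSON; pass all other data through untouched.
local UNESCAPE_TYPES = {
    ["application/javascript"] = true,
    ["application/json"] = true,
    ["application/x-javascript"] = true,
    ["text/x-javascript"] = true,
    ["text/x-json"] = true,
}

local function maybe_unescape(content_type, data)
    if content_type and UNESCAPE_TYPES[content_type] then
        return escape.unescape(data)  -- turbo's existing escape module
    end
    return data  -- e.g. text/plain or binary types are left as-is
end
```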

Also, I believe the default content-type and content-transfer-encoding indicate plain text in US-ASCII according to RFC 2045 (Content-Type: text/plain; charset=us-ascii). In that case nothing should really be unescaped; however, to stay compatible with the design so far, maybe I should still unescape when no content-type or transfer-encoding is given, the way it was previously done. That effectively makes the default content-type JavaScript/JSON rather than text/plain. Is this the intent?

One other note: I would like to support content-transfer-encoding: base64 and implement decoding of the base64 data (I'll put the Lua decode function in util.lua). In that case, the data would be unescaped after base64 decoding only if the content-type indicated JavaScript/JSON. This would make uploading binary files easy: use base64 encoding with an appropriate binary content-type, and as long as it isn't JSON/JavaScript, the data will come out correctly.
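A pure-Lua Base64 decoder along those lines might look like the following — a sketch of the idea, not the actual util.lua code:

```lua
-- Hypothetical pure-Lua Base64 decoder, e.g. for util.lua.
local B64CHARS = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"
local B64LOOKUP = {}
for i = 1, #B64CHARS do
    B64LOOKUP[B64CHARS:sub(i, i)] = i - 1
end

local function base64_decode(data)
    data = data:gsub("[^%w%+%/%=]", "")  -- drop CRLF/whitespace
    local out, buffer, bits = {}, 0, 0
    for c in data:gmatch("[^=]") do      -- '=' padding carries no data
        buffer = buffer * 64 + B64LOOKUP[c]
        bits = bits + 6
        if bits >= 8 then
            bits = bits - 8
            out[#out + 1] = string.char(math.floor(buffer / 2 ^ bits) % 256)
            buffer = buffer % 2 ^ bits   -- keep only the leftover bits
        end
    end
    return table.concat(out)
end
-- base64_decode("SGVsbG8sIHdvcmxkIQ==") == "Hello, world!"
```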


kernelsauce avatar kernelsauce commented on May 10, 2024

Hi. This is probably one of the least tested and least thought-through parts of Turbo. The "intent" has probably been lost somewhere along the road :(. If removing the unescaping does not break the current integration tests and things still work as expected, I'm happy with removing it.

As for base64 encoding, I'd say go for it. It's a nice feature.


jsolman avatar jsolman commented on May 10, 2024

Great, I am almost finished. I have taken care to accept what the relevant RFCs specify as valid. For now I am only unescaping the JavaScript/JSON types I mentioned: application/javascript, application/json, application/x-javascript, text/x-javascript, text/x-json. Any other content type will be the application's responsibility to handle by looking at the content-type in the arguments table. In addition to what I described above, for the content-disposition I am storing any additional key/value pairs that appear with it in arguments[name][x]["content-disposition"] = { key = value }.

Some form-data, such as file uploads, contains key/value pairs like filename="uploadfilename.typ".
So suppose, for example, there is a form-data part with name=webmasterfile and filename=test.html.
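For reference, the raw part for such an upload might look like this (the boundary and content shown are illustrative):

    --AaB03x
    content-disposition: form-data; name="webmasterfile"; filename="test.html"
    content-type: text/html

    <html>...</html>
    --AaB03x--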

Example of what a request handler would do to get an uploaded file:

    web_file_args = self:get_arguments("webmasterfile")
    file_name = web_file_args["content-disposition"].filename
    file_data = web_file_args[1]


jsolman avatar jsolman commented on May 10, 2024

I finished this today and committed it to my repo that has axTLS support. The changes in this commit shouldn't conflict with anything in your master, so you can cherry-pick it and give it a look, or wait until I finish the merge with kernelsauce/turbo master that I am starting today, which will also add the axTLS secure-cookie support.


jsolman avatar jsolman commented on May 10, 2024

My master is now more closely merged with kernelsauce/turbo master, and I have tested multipart/form-data a little. The parsing code I added requires a boundary, whereas the previous code would work even if no boundary existed. Is it necessary to parse a single argument encoded as multipart/form-data without a boundary? My understanding is that the boundary is mandatory for multipart/form-data, so I think my current implementation is doing the right thing by not parsing anything when the boundary is missing.
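The boundary check described above could be sketched like this — `get_boundary` is a hypothetical helper name, not the actual implementation:

```lua
-- Hypothetical sketch: extract the mandatory boundary parameter from a
-- multipart/form-data Content-Type header. Returning nil means the body
-- is not parsed at all.
local function get_boundary(content_type)
    -- The boundary value may be quoted or unquoted per RFC 2046.
    return content_type:match('boundary="([^"]+)"')
        or content_type:match("boundary=([^;%s]+)")
end

-- get_boundary('multipart/form-data; boundary=AaB03x') yields "AaB03x";
-- a header with no boundary parameter yields nil, so parsing is skipped.
```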


jsolman avatar jsolman commented on May 10, 2024

This has been merged along with the axTLS merge. I just discovered an issue, though, that was introduced when I changed the default TURBO_SOCKET_BUFFER_SZ to be big enough to accommodate the axTLS default read-buffer size. The default buffer size for HTTP headers was being set to 16KB by a call to IOStream:set_max_buffer_size(). That 16KB was smaller than the new TURBO_SOCKET_BUFFER_SZ, which could cause unwanted body data to be read after the headers and lead to the socket being closed because the max buffer size was erroneously reached. Pull the latest commit from my fork for a fix that disallows setting the max buffer size below or equal to the internal TURBO_SOCKET_BUFFER_SZ.
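The shape of the fix might look like the following sketch — the constant's value and the clamping strategy here are assumptions for illustration, not the actual committed code:

```lua
-- Hypothetical sketch of the fix: refuse to set the IOStream max buffer
-- size at or below the internal socket read buffer size, since a single
-- socket read could then exceed the limit and close the connection.
local TURBO_SOCKET_BUFFER_SZ = 65536  -- illustrative value

local IOStream = {}
function IOStream:set_max_buffer_size(sz)
    if sz <= TURBO_SOCKET_BUFFER_SZ then
        sz = TURBO_SOCKET_BUFFER_SZ + 1  -- clamp above the socket buffer
    end
    self.max_buffer_size = sz
end
```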


kernelsauce avatar kernelsauce commented on May 10, 2024

The buffer-size issue has been handled, and I guess the file-upload convenience is complete. So let's close this.


ryannining avatar ryannining commented on May 10, 2024

Is there any example of parsing file uploads in turbo.web.RequestHandler?


kernelsauce avatar kernelsauce commented on May 10, 2024

See #158

