
The Content-Length header for string `data` counts Unicode characters in the string when it should count encoded bytes (requests, open)

bruceadams commented on July 20, 2024
The Content-Length header for string `data` counts Unicode characters in the string when it should count encoded bytes

from requests.

Comments (6)

goelbenj commented on July 20, 2024

> I do not understand what you are saying. What hack?

Ha, looks like we reached the same conclusion here. What I meant by the "hack" was requiring the user to encode their string data as UTF-8 so that the Content-Length header is initialized correctly.
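For readers who need the caller-side workaround in the meantime, a minimal sketch might look like the following. The helper name `encode_for_request` and the `text/plain` content type are assumptions made for illustration; nothing here is part of the Requests API.

```python
def encode_for_request(text, charset="utf-8"):
    """Hypothetical caller-side helper: encode the body before handing it over.

    Passing bytes means the length is measured in bytes, not characters.
    """
    body = text.encode(charset)
    # Declaring the charset is an assumption about what the server expects.
    headers = {"Content-Type": f"text/plain; charset={charset}"}
    return body, headers


body, headers = encode_for_request("gr\u00fc\u00dfe")  # 5 characters, 7 bytes
# With Requests installed, these could then be passed along, e.g.:
#   requests.post(url, data=body, headers=headers)  # url is a placeholder
```

This keeps the encoding decision with the caller, which is exactly the burden the "hack" label refers to.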


goelbenj commented on July 20, 2024

I assume that it is incorrect to require the data to be encoded as UTF-8, so I will work on a fix that removes the need for this hack. @bruceadams


bruceadams commented on July 20, 2024

> I assume that it is incorrect to require the data to be encoded as UTF-8, so I will work on a fix that removes the need for this hack. @bruceadams

I do not understand what you are saying. What hack?

A Python string can contain Unicode characters. To send a Python string as the body of an HTTP request, the string needs to be encoded into bytes. UTF-8 is a common encoding (and I see signs of UTF-8 being assumed elsewhere in the Requests code). In the behavior I saw in the wild, Requests did, in fact, encode the request body as UTF-8.
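To make the mismatch concrete, here is a minimal sketch in plain Python (no Requests calls; the sample string is an arbitrary choice) comparing the character count of a string with the length of its UTF-8 encoding:

```python
# Arbitrary sample text with non-ASCII characters (chosen for illustration):
# "naïve café ☕", written with escapes so the code points are unambiguous.
text = "na\u00efve caf\u00e9 \u2615"

char_count = len(text)                  # number of Unicode code points
byte_count = len(text.encode("utf-8"))  # number of bytes in the encoded body

print(char_count, byte_count)  # prints: 12 16
```

A Content-Length of 12 for a 16-byte body would tell the server to stop reading four bytes early.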


bruceadams commented on July 20, 2024

Ah! Your pull request lines up with how I thought this might be properly addressed! Nice! (I just created a similar pull request #6589.)


numblr commented on July 20, 2024

Can I fix this by downgrading to a previous version? I don't want to (and some users probably cannot) change the code to convert to bytes before passing it to the request.

I also don't really get your fixes: the body is at some point converted to bytes (there is a body_to_chunks in request.py), and that also seems to set the Content-Length header? But that is just a side note. I'm not familiar with the code, so ignore it if I'm talking nonsense.


sigmavirus24 commented on July 20, 2024

Bytes are the language of the Internet regardless of whether you think that. Many things try to paper over that. The right thing is typically to send bytes whose encoding you know, but barring that, we should always be dealing with bytes internally. Now that we have dropped 2.7 support, I'd support always encoding data parameters that are strs to bytes internally before doing anything else with them (e.g., calculating content length).
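A rough sketch of that "encode first, then measure" ordering could look like this. `prepare_body` is a hypothetical name, the UTF-8 default is an assumption, and this is not the actual Requests implementation:

```python
def prepare_body(data, encoding="utf-8"):
    """Hypothetical sketch: normalize str to bytes before any other step."""
    if isinstance(data, str):
        body = data.encode(encoding)  # encode first ...
    elif isinstance(data, (bytes, bytearray)):
        body = bytes(data)            # already bytes; pass through unchanged
    else:
        raise TypeError(f"unsupported body type: {type(data).__name__}")
    # ... then measure, so the header always reflects the encoded byte count.
    headers = {"Content-Length": str(len(body))}
    return body, headers


body, headers = prepare_body("caf\u00e9")  # 4 characters, 5 bytes
```

Because the length is taken from the bytes object, str and bytes inputs with the same content produce the same header.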

