Comments (15)
> This is definitely not true.

I was referring to the real-world behaviour of `Accept-Language`, and wondering if the same applies. Clearly that's not the case, so you are probably right that we should use the quality value here.
from aiohttp.
@bdraco I think handling the 1st task here for case-sensitivity is quite easy to do before #8063 so I'll submit that.
However, the 3rd task of completely parsing it correctly per the RFC is less of a priority to backport, IMO. And implementing it raises a larger question for me...
Although it's relatively easy in Python, shouldn't that parsing be done by llhttp, and probably the conversion to lowercase too? Meaning, by the time it hits the code to use it for picking or compressing the file, it should be lowered and in a form like `{"<encoding>": <quality>, ...}` instead of just the raw string. There are other headers that use this format -- `Accept`, `Accept-Language`, `TE` -- but I don't see any of those parsed anywhere in the code.
@Dreamsorcerer is this a question for you?
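For illustration, the parsed form could look something like this (a rough sketch only, not a concrete proposal for llhttp or aiohttp internals; real RFC 9110 parsing is stricter):

```python
import re

def parse_accept_encoding(header: str) -> dict[str, float]:
    """Parse an Accept-Encoding header into {encoding: quality}.

    Illustrative only: splits on commas, lowercases the coding names,
    and applies the default quality of 1 when no q= parameter is given.
    """
    parsed: dict[str, float] = {}
    for item in header.split(","):
        item = item.strip()
        if not item:
            continue
        coding, _, params = item.partition(";")
        quality = 1.0  # RFC 9110: missing q= means quality 1
        match = re.search(r"q\s*=\s*([01](?:\.\d{0,3})?)", params)
        if match:
            quality = float(match.group(1))
        parsed[coding.strip().lower()] = quality
    return parsed

parse_accept_encoding("br;q=0.9, gzip, identity;q=0")
# {'br': 0.9, 'gzip': 1.0, 'identity': 0.0}
```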
The 3rd task would be partially accomplished by #7679, which is drafted for the 4.0 milestone.
For the 2nd task regarding the behavior of the `force` parameter in `enable_compression()`, the change would be pretty easy; however, a decision is needed on intent and backporting. Either:

- Bug fix back to 3.9: In other words, the intent of `force` was never to ignore `Accept-Encoding`.
- Feature for 3.10: Perhaps add a new `prefer` parameter to make the intent clear, and potentially deprecate `force` in version 4.
- Breaking for v4: I don't recommend this, as it would be problematic for users if Brotli were implemented for 3.10 unless it was the default over deflate.
Thoughts?
I think we need to do some git archeology to figure out why `force` was added in the first place before a path forward can be determined.
> Although it's relatively easy in Python, shouldn't that parsing be done by llhttp, and probably the conversion to lowercase too? Meaning, by the time it hits the code to use it for picking or compressing the file, it should be lowered and in a form like `{"<encoding>": <quality>, ...}` instead of just the raw string. There are other headers that use this format -- `Accept`, `Accept-Language`, `TE` -- but I don't see any of those parsed anywhere in the code.
My impression is that llhttp is just passing us header values. Meaning that it has parsed it as HTTP, but not converted anything based on the particular header names, which is probably out-of-scope. So, such changes would likely go into `_http_parser.pyx` instead, if you want the compiled performance.
However, is there actually any merit in implementing this quality check? I've seen the same syntax for locales, and I think the general consensus is that this was over-engineered and basically pointless. The quality numbers fail to achieve anything more than defining an order of preference (i.e. the actual numbers are never used for anything, they're just sorted on). Therefore, browsers don't provide any way for users to customise these quality values, they just pick evenly spaced values according to order of user preference. Because of all this, browsers will also always order them with the highest quality number at the start, so in reality, you can just read the values left-to-right while ignoring the quality numbers and you'll get the same result. Parsing the quality numbers would therefore just be a waste of CPU-time.
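A quick sketch of what that left-to-right shortcut means in practice (illustrative code, not aiohttp's; it assumes qualities arrive in non-increasing order, as browsers send them):

```python
def preferred_order(header: str) -> list[str]:
    """Order codings by quality, highest first (stable on ties)."""
    items = []
    for token in header.split(","):
        coding, _, params = token.strip().partition(";")
        q = 1.0  # default quality when no q= parameter is present
        if params.strip().startswith("q="):
            q = float(params.strip()[2:])
        items.append((coding.strip(), q))
    # sorted() is stable, so equal qualities keep their left-to-right order
    return [c for c, q in sorted(items, key=lambda t: -t[1])]

# For a header with non-increasing qualities, the sorted order is
# identical to simply reading the codings left to right:
preferred_order("br, gzip;q=0.8, deflate;q=0.5")
# ['br', 'gzip', 'deflate']
```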
> I think we need to do some git archeology to figure out why `force` was added in the first place before a path forward can be determined.
It was a boolean before #403 added gzip, and the code before that made it clear that it was actually a "force" to ignore `Accept-Encoding`. The boolean was originally added in #239, but there are no comments to explain any use case for it.
I think the `force` parameter is pretty clear in that it forces a given compression. I'm not clear why you'd want to do this, so I'm not sure that turning it into a preference makes sense without understanding the use case (can we find any projects using the parameter?).
Clearly it's not correct behaviour to use compression that the client claims not to support, but it also seems weird to try and use a fixed compression method in the first place...
> I've seen the same syntax for locales, and I think the general consensus is that this was over-engineered and basically pointless.
OK, there seems to be an important difference between `Accept-Encoding` and `Accept-Language`. The former can disallow encodings with `q=0`, so maybe we need to at least check for that.

Edit: OK, not a difference, but it's explicitly mentioned under `Accept-Encoding`, whereas it'd be very weird to try and exclude languages.
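A minimal sketch of such a `q=0` check (illustrative only; `is_disallowed` is a hypothetical helper, not aiohttp's API):

```python
def is_disallowed(coding: str, accept_encoding: str) -> bool:
    """Return True if the client explicitly excluded a coding with q=0."""
    for item in accept_encoding.lower().split(","):
        name, _, params = item.strip().partition(";")
        # Normalize "q = 0" / "q=0.0" style variants before comparing
        if name.strip() == coding and params.replace(" ", "") in (
            "q=0", "q=0.0", "q=0.00", "q=0.000"
        ):
            return True
    return False

is_disallowed("gzip", "br, gzip;q=0")    # True: explicitly excluded
is_disallowed("gzip", "br, gzip;q=0.5")  # False: low preference, not excluded
```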
> My impression is that llhttp is just passing us header values. Meaning that it has parsed it as HTTP, but not converted anything based on the particular header names, which is probably out-of-scope. So, such changes would likely go into `_http_parser.pyx` instead, if you want the compiled performance.
Okay, makes sense 👍🏻
> However, is there actually any merit in implementing this quality check? I've seen the same syntax for locales, and I think the general consensus is that this was over-engineered and basically pointless. The quality numbers fail to achieve anything more than defining an order of preference (i.e. the actual numbers are never used for anything, they're just sorted on).
That's certainly correct - the numbers mean nothing but a sort order. I think the reason it's not just a list in preference order is that it would then be impossible to communicate equal preference for some or all items. That's why the default quality, if not specified, is 1.
To a lesser extent, it's also a way to have negations with a wildcard, e.g. `compress;q=0, *` says "I'll take any encoding except compress". Although even the spec admits this pattern has little to no practical value.
FWIW, Apache does use the qualities in its content negotiation.
> Therefore, browsers don't provide any way for users to customise these quality values, they just pick evenly spaced values according to order of user preference.
True they are not user-facing for sure (nor would they have any reason to be?), and the values are just picked to set the browser's preference.
> Because of all this, browsers will also always order them with the highest quality number at the start, so in reality, you can just read the values left-to-right while ignoring the quality numbers and you'll get the same result. Parsing the quality numbers would therefore just be a waste of CPU-time.
This is definitely not true. The spec is clear that values without a quality default to 1, and browsers take advantage of that. For example, both Chrome and Firefox send `Accept-Encoding: gzip, deflate, br`, but that certainly does not mean they prefer gzip over Brotli. Decoding time is inconsequential, so the best interest for a client is the smallest transfer size. Google even makes this clear in their Lighthouse audit for compression.
What they send for the `Accept` header also makes this clear:
- Firefox = `text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,*/*;q=0.8`
- Chrome = `text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.7`
> Clearly it's not correct behaviour to use compression that the client claims not to support, but it also seems weird to try and use a fixed compression method in the first place...
I completely agree - neither makes sense to me. I think there is a use case for changing the server's default encoding order, though. For example, a user might benchmark and find that sending large files over long distances is likely faster with gzip than deflate (i.e. the larger encoding time is offset by the faster transfer time). So they might want the order to be `["br", "gzip", ..others]`.
If I were to redesign the current API, it might look something like this to allow a set of encodings to be preferred first but then fallback to others supported:
```python
def __init__(self, ...) -> None:
    ...
    self._ordered_codings = deque(ContentCoding, len(ContentCoding))

def enable_compression(self, *prefer: ContentCoding) -> None:
    """Enables response compression with preferred encodings."""
    self._compression = True
    for encoding in prefer:
        self._ordered_codings.remove(encoding)
    self._ordered_codings.extend(prefer)
    self._ordered_codings.rotate(len(prefer))

async def _start_compression(self, request: "BaseRequest") -> None:
    # Encoding comparisons should be case-insensitive
    # https://www.rfc-editor.org/rfc/rfc9110#section-8.4.1
    accept_encoding = request.headers.get(hdrs.ACCEPT_ENCODING, "").lower()
    for coding in self._ordered_codings:
        if coding.value in accept_encoding:
            await self._do_start_compression(coding)
            return
```
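The deque reordering idea can be demonstrated in isolation (a sketch using plain strings in place of the `ContentCoding` members):

```python
from collections import deque

codings = deque(["br", "gzip", "deflate", "identity"])
prefer = ["gzip", "deflate"]

for encoding in prefer:
    codings.remove(encoding)   # pull preferred codings out of the default order
codings.extend(prefer)         # append them at the end, in preference order
codings.rotate(len(prefer))    # rotate them around to the front

print(list(codings))  # ['gzip', 'deflate', 'br', 'identity']
```

The preferred codings end up first, in the order given, while the remaining codings keep their original relative order as the fallback.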
Yeah, I'd be open to something like that. I wonder if it's worth adding this as a new feature (and potentially removing `force` if we can't find any uses for it).
> Yeah, I'd be open to something like that. I wonder if it's worth adding this as a new feature (and potentially removing `force` if we can't find any uses for it).
Here's my attempt at a search for `enable_compression()` uses with an argument, after a lot of filtering to get rid of just copies of `aiohttp` itself.

There are some results using the old boolean, some oddly specifying "identity" 😕, some just specifying "gzip" that clearly would have no harm by turning to a preference (e.g. ignore the Home Assistant results), and even one result that is doing the job `aiohttp` should be doing and checking the `Accept-Encoding` header 😉.

Passing "identity" is concerning because currently that would just do nothing, but under my proposal, it would be skipped because it's not going to be in the `Accept-Encoding` header, and then the algorithm would go on to deflate. I could either put an assertion in that disallows it, or just adjust the header check because it's implied as included (unless explicitly excluded).
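One hypothetical shape for that adjusted check, treating identity as implied unless explicitly excluded (the `accepts` helper is illustrative only, not aiohttp's API):

```python
def accepts(coding: str, accept_encoding: str) -> bool:
    """True if the client can accept the given content coding.

    identity is implied as acceptable even when not listed, unless it is
    explicitly excluded with q=0 (RFC 9110 section 12.5.3).
    """
    header = accept_encoding.lower().replace(" ", "")
    if coding in header:
        # Listed: acceptable unless excluded with q=0
        return f"{coding};q=0" not in header
    # Not listed: only identity is implied as acceptable
    return coding == "identity"

accepts("identity", "gzip, br")      # True: implied even though unlisted
accepts("identity", "identity;q=0")  # False: explicitly excluded
accepts("gzip", "br")                # False: not offered by the client
```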
> Here's my attempt at a search for `enable_compression()` uses with an argument, after a lot of filtering to get rid of just copies of `aiohttp` itself.
That gives me no results for some reason. I did create a script previously to search dependents, though it can be slow to run (it searches 1 repo every ~7 seconds due to rate limit): https://gist.github.com/Dreamsorcerer/70285fac0a11c3d9c26b577f7dd989a7
> There are some results using the old boolean
Might be interesting to see if any of them mention any rationale in the commit they were added...
> Passing "identity" is concerning because currently that would just do nothing, but under my proposal, it would be skipped because it's not going to be in the `Accept-Encoding` header and then the algorithm would go on to deflate. I could either put an assertion in that disallows it, or just adjust the header check because it's implied as included (unless explicitly excluded).
My counter-proposal is to just add a new `prefer` parameter which takes a sequence of strings. Then add a deprecation warning to the `force` parameter and remove it in v4. If we get complaints from the deprecation notice, then we can reconsider removing it.
> That gives me no results for some reason.
The markdown sanitizer keeps removing all the escapes from the regex parts. The search term should look like this with 20 results for me:
```
aiohttp AND /\.enable_compression\(.+\)/ language:python NOT is:archived NOT is:fork NOT repo:aio-libs/aiohttp NOT is:vendored NOT is:generated NOT path:/(^|\/)aiohttp\/(multipart|web_response|client_reqrep)\.py$/ NOT path:/tests\/test_(web_response|client_functional|http_writer)\.py$/ NOT repo:cndn/intelligent-code-completion
```
> Might be interesting to see if any of them mention any rationale in the commit they were added...
I didn't check, but one of the false arguments did have a comment mentioning it was a workaround for the double compression issue recently fixed by @bdraco.
> My counter-proposal is to just add a new `prefer` parameter which takes a sequence of strings. Then add a deprecation warning to the `force` parameter and remove it in v4. If we get complaints from the deprecation notice, then we can reconsider removing it.
Not as elegant, but safer of course. Also, consider that either way this becomes a "breaking" change. For example, the Home Assistant use is definitely not expecting a force, so a change would be required to specify `prefer=`.