colin-b / httpx_auth
Authentication classes to be used with httpx
License: MIT License
I am looking at using httpx and I like what you have done to add support for authentication to it here. I was wondering if you would be open to me working on and submitting a PR to add aws4auth?
https://developer.okta.com/docs/guides/implement-grant-type/clientcreds/main/#create-custom-scopes
The scopes parameter should be mandatory.
If you continuously create AWS4Auth instances with the security_token argument set, it will slowly leak memory and make request signing slower.
Our production service creates a new AWS4Auth instance for every request to AWS S3 (possibly we should just re-use them) and we noticed that after tens of thousands of requests, the requests were getting slower and slower. Restarting the service makes them fast again. Looks like the code below is causing the issue:
Lines 50 to 53 in e3bd739
Every time you create a new AWS4Auth instance, one more copy of x-amz-security-token gets appended to default_include_headers. Here's a Python REPL example demonstrating the problem:
>>> from httpx_auth.aws import AWS4Auth
>>> AWS4Auth("test", "test", "us-east-1", "s3", security_token="token").default_include_headers
['host', 'content-type', 'date', 'x-amz-*', 'x-amz-security-token']
>>> AWS4Auth("test", "test", "us-east-1", "s3", security_token="token").default_include_headers
['host', 'content-type', 'date', 'x-amz-*', 'x-amz-security-token', 'x-amz-security-token']
>>> AWS4Auth("test", "test", "us-east-1", "s3", security_token="token").default_include_headers
['host', 'content-type', 'date', 'x-amz-*', 'x-amz-security-token', 'x-amz-security-token', 'x-amz-security-token']
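The growth shown in the REPL output is what one would expect when an instance appends to a list that is shared at class level. The following is only a minimal sketch of that pattern and of one possible fix (copying the list per instance). `LeakyAuth` and `FixedAuth` are hypothetical stand-ins, not the actual httpx_auth classes.

```python
class LeakyAuth:
    # Class-level list shared by every instance (the assumed cause of the leak).
    default_include_headers = ["host", "content-type", "date", "x-amz-*"]

    def __init__(self, security_token=None):
        if security_token:
            # Mutates the shared list, so it grows on every instantiation.
            self.default_include_headers.append("x-amz-security-token")


class FixedAuth:
    default_include_headers = ["host", "content-type", "date", "x-amz-*"]

    def __init__(self, security_token=None):
        # Work on a per-instance copy so the class attribute is never mutated.
        self.include_headers = list(self.default_include_headers)
        if security_token:
            self.include_headers.append("x-amz-security-token")


for _ in range(3):
    leaky = LeakyAuth(security_token="token")
    fixed = FixedAuth(security_token="token")

# Count how often the token header ends up in each list after three instances.
leaky_count = leaky.default_include_headers.count("x-amz-security-token")
fixed_count = fixed.include_headers.count("x-amz-security-token")
print(leaky_count, fixed_count)
```

Re-using a single auth instance across requests would also avoid the growth, at the cost of having to rebuild it whenever the temporary credentials rotate.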
Add a section in documentation to explain how to use botocore with AWS Auth:
As described in #55 (comment)
as per #84 (comment)
There is an issue in the httpx-auth library where the decoding of base64-encoded JSON within JWT tokens corrupts JSON strings that contain nested JSON. This happens because the double quotes inside the nested JSON string are not correctly handled during the decoding process, leading to a failure when attempting to load the string back into a JSON object.
The issue can be reproduced with the following test case:
import jwt
import json
from httpx_auth._oauth2.tokens import decode_base64

def test_decode_base64_with_nested_json_string():
    # Encode a JSON inside the JWT
    dummy_token = jwt.encode({"data": json.dumps({"something": ["else"]})}, key="")
    header, body, signature = dummy_token.split(".")
    # Decode the body
    decoded_bytes = decode_base64(body)
    # Attempt to load JSON
    result = json.loads(decoded_bytes)
    assert result == {"data": '{"something": ["else"]}'}
Running this test results in a json.decoder.JSONDecodeError due to incorrect handling of the nested JSON string.
The decoded JSON string should be handled correctly, allowing for proper loading into a Python dictionary without JSON parsing errors.
The test raises the following error due to malformed JSON:
json.decoder.JSONDecodeError: Expecting ',' delimiter: line 1 column 12 (char 11)
This error is caused by the way double quotes inside the nested JSON are handled, which corrupts the JSON string during the base64 decoding step.
Python Version: 3.10.11
httpx-auth version: 0.22.0 (2024-03-02)
This issue impacts scenarios where JWT tokens contain nested JSON strings as part of their payload. A fix would likely involve adjusting the base64 decoding function to correctly handle nested JSON strings without corrupting them.
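For comparison, here is a minimal sketch of a decoding path that leaves escaped quotes untouched, using only the standard library. `decode_jwt_body` and `b64url` are hypothetical helpers written for this illustration. They are not the library's `decode_base64`, and the dummy token is unsigned.

```python
import base64
import json


def decode_jwt_body(token: str) -> dict:
    """Decode the payload segment of a JWT without verifying the signature."""
    _header, body, _signature = token.split(".")
    # base64url segments in a JWT drop their padding; restore it before decoding.
    padded = body + "=" * (-len(body) % 4)
    raw = base64.urlsafe_b64decode(padded)
    # Parse the bytes as they are: escaped quotes inside nested JSON stay escaped.
    return json.loads(raw)


def b64url(data: bytes) -> str:
    """Encode bytes as unpadded base64url, as used for JWT segments."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


# Build an unsigned dummy token whose payload holds a JSON document as a string.
payload = {"data": json.dumps({"something": ["else"]})}
dummy_token = ".".join(
    [b64url(b'{"alg":"none"}'), b64url(json.dumps(payload).encode("utf-8")), ""]
)
decoded = decode_jwt_body(dummy_token)
print(decoded)
```

The point of the sketch is that no string rewriting happens between the base64 step and `json.loads`, so the nested document survives as a plain string value.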
Line 91 in 2580abc
The value for authorization is actually being assigned wrongly, as I am getting: 'authorization': '[secure]'
But I've tried printing auth_str and it is printed correctly.
From reading the code of botocore it seems that they sign all headers except those in a blacklist:
SIGNED_HEADERS_BLACKLIST = [
    'expect',
    'user-agent',
    'x-amzn-trace-id',
]
httpx_auth, on the other hand, works with an include-list approach.
Why?
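To make the difference between the two strategies concrete, here is a rough sketch that selects the headers to sign both ways. The blacklist is the one quoted from botocore above. The include patterns are assumed from the `default_include_headers` values shown in an earlier issue on this page, and both helper functions are hypothetical.

```python
from fnmatch import fnmatch

SIGNED_HEADERS_BLACKLIST = ["expect", "user-agent", "x-amzn-trace-id"]
# Assumed include patterns, taken from the default_include_headers output above.
INCLUDE_PATTERNS = ["host", "content-type", "date", "x-amz-*"]


def headers_to_sign_blacklist(headers: dict) -> list:
    """Sign every header except the blacklisted ones."""
    names = (name.lower() for name in headers)
    return sorted(name for name in names if name not in SIGNED_HEADERS_BLACKLIST)


def headers_to_sign_include(headers: dict) -> list:
    """Sign only the headers matching one of the include patterns."""
    names = (name.lower() for name in headers)
    return sorted(
        name
        for name in names
        if any(fnmatch(name, pattern) for pattern in INCLUDE_PATTERNS)
    )


request_headers = {
    "Host": "example.amazonaws.com",
    "User-Agent": "python-httpx",
    "X-Amz-Date": "20240412T155102Z",
    "X-Custom": "1",
}
signed_with_blacklist = headers_to_sign_blacklist(request_headers)
signed_with_include = headers_to_sign_include(request_headers)
print(signed_with_blacklist, signed_with_include)
```

With these inputs the custom header is signed under the blacklist approach and left unsigned under the include approach, which is the behavioural difference the question is about.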
I've checked and the tests pass fine.
It seems that it is no longer possible to remove the localhost part from redirect_uri.
I'm getting this error, probably because I can't use exactly the same value as the one I typed in the admin panel of the Timely application:
See https://wakatime.com/developers, by default they send token in html content type, not JSON.
I do not believe that the token request for resource owner password flow requires that the server accept basic auth. See the spec: https://datatracker.ietf.org/doc/html/rfc6749#section-4.3
The spec does, however, require that the server support basic auth for the client credentials grant: https://datatracker.ietf.org/doc/html/rfc6749#section-2.3.1
I could not authenticate with the Drupal Simple OAuth server implementation with Resource Owner Password creds until I removed the basic auth in OAuth2ResourceOwnerPasswordCredentials::_configure_client:
httpx_auth/httpx_auth/authentication.py
Line 220 in 75e0d19
I think we could remove this altogether. If a server does require/support basic auth for this, then an httpx client can be provided to OAuth2ResourceOwnerPasswordCredentials with this configured. As it is currently implemented, basic auth will always be used. If this approach makes sense, I'm happy to provide a PR.
As suggested by #23
This package needs to be upgraded to the latest httpx version.
Hi!
I'm experiencing a problem that AWS4 authentication does not seem to work correctly when communicating with the Ceph RADOS Gateway Admin Operations API (https://docs.ceph.com/en/latest/radosgw/adminops/#admin-operations) which uses AWS Signature Version 4 to authenticate. Given the following snippets:
import requests
from requests_aws4auth import AWS4Auth

auth = AWS4Auth("access_key", "secret",
                "default", "s3")
with requests.Session() as client:
    response = client.put("https://ceph-host/admin/user", auth=auth,
                          params={"format": "json", "uid": "testtenant$testuser",
                                  "display-name": "Some Test Display name."})

import httpx
from httpx_auth import AWS4Auth

auth = AWS4Auth(access_id="access_key", secret_key="secret",
                region="default", service="s3")
with httpx.Client() as client:
    response = client.put("https://ceph-host/admin/user", auth=auth,
                          params={"format": "json", "uid": "testtenant$testuser",
                                  "display-name": "Some Test Display name."})
The first one works like a charm, but the second one using httpx and httpx_auth does not (RADOS Gateway responds with an HTTP status code of 403 and a "SignatureDoesNotMatch" error).
(I chose requests_aws4auth as a counter example here, because it's getting used internally by one of the python libraries the Ceph Admin Ops documentation suggests using: https://github.com/UMIACS/rgwadmin)
It seems that the reason behind this behaviour is that the implementation of httpx_auth does not expect query parameters with arbitrary spaces in them:
def _amz_cano_querystring(qs: str) -> str:
    """
    Parse and format querystring as per AWS4 auth requirements.
    Perform percent quoting as needed.
    qs -- querystring
    """
    safe_qs_amz_chars = "&=+"
    safe_qs_unresvd = "-_.~"
    qs = unquote(qs)  # 'Some%20Test%20Display%20name.' gets unquoted here, so split() produces an incorrect result
    space = " "
    qs = qs.split(space)[0]
    qs = quote(qs, safe=safe_qs_amz_chars)
    qs_items = {}
    for name, vals in parse_qs(qs, keep_blank_values=True).items():
        name = quote(name, safe=safe_qs_unresvd)
        vals = [quote(val, safe=safe_qs_unresvd) for val in vals]
        qs_items[name] = vals
    qs_strings = []
    for name, vals in qs_items.items():
        for val in vals:
            qs_strings.append("=".join([name, val]))
    qs = "&".join(sorted(qs_strings))
    return qs
I'm not very familiar with the AWS Signature Spec, so I can't tell whether this is intended behaviour or not.
If it is, maybe the function should raise something if it detects more than one space in the query string. If it isn't, and it should work with other object storage provider APIs such as Ceph, maybe one could change the implementation to act a bit more forgiving as in requests_aws4auth.
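As an illustration of the more forgiving behaviour, here is a minimal sketch of a canonical query string builder that keeps values containing spaces. `canonical_query_string` is a hypothetical helper, not the httpx_auth or requests_aws4auth implementation, and it assumes that a literal `+` in the incoming query means a space (which is how `parse_qsl` treats it).

```python
from urllib.parse import parse_qsl, quote

# Characters that stay unescaped in a SigV4 canonical query string
# (letters and digits are always left alone by quote()).
UNRESERVED = "-_.~"


def canonical_query_string(query: str) -> str:
    """Percent-encode each name and value, then sort the pairs."""
    # parse_qsl decodes each pair individually, so spaces inside a value survive.
    pairs = parse_qsl(query, keep_blank_values=True)
    encoded = [
        (quote(name, safe=UNRESERVED), quote(value, safe=UNRESERVED))
        for name, value in pairs
    ]
    return "&".join(f"{name}={value}" for name, value in sorted(encoded))


query = "format=json&uid=testtenant%24testuser&display-name=Some%20Test%20Display%20name."
canonical = canonical_query_string(query)
print(canonical)
```

Decoding pair by pair, rather than unquoting the whole string and splitting on a space, is the design choice that keeps `Some Test Display name.` intact.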
Hello!
Thank you for this project, it greatly simplifies my httpx calls against OAuth protected APIs, without having to deal with secrets exchange and response parsing and everything ^^
While running mypy I saw: Skipping analyzing "httpx_auth": found module but no type hints or library stubs. While looking at the code I think everything is typed, so you could indicate it to type checkers to improve everyone's code safety!
It should be as simple as adding an empty py.typed file, like in Pydantic or in FastAPI.
You can also add the PyPI classifier Typing :: Typed (see in FastAPI).
I can open a PR with this change if you want!
I am having an issue with my Python client making requests to an AWS API Gateway endpoint through HTTPX with HTTPX_AUTH AWS Signature Version 4. While version 0.19.0 of httpx_auth works correctly, any version above that results in a 403 Forbidden error.
The error message indicates that the AWS signature you are providing does not match what AWS is expecting. The message also helpfully provides the canonical string and string to sign that AWS generated based on your request.
The changelog for httpx_auth shows that between version 0.19.0 and 0.20.0 there was a significant overhaul of the AWS4Auth implementation to adhere more closely to the AWS documentation. This change may be the cause of the incompatibility you are experiencing.
HTTPx version: v0.27.0, HTTPx_AUTH version: ^v0.20.0 - Results in a 403 Forbidden error
HTTPx version: v0.26.0, HTTPx_AUTH version: v0.19.0 - Works just fine
Snippet of the error 2024-04-12 16:51:02.768 | ERROR | Client error '403 Forbidden' for url 'https://xxxx' Response: {'message': "The request signature we calculated does not match the signature you provided. Check your AWS Secret Access Key and signing method. Consult the service documentation for details.\n\nThe Canonical String for this request should have been\n'GET\n/xxx\n\nhost:xxx\nx-amz-content-sha256:xxxxx\nx-amz-date:20240412T155102Z\nx-amz-security-token:xxxx\n\nhost;x-amz-content-sha256;x-amz-date;x-amz-security-token\nexxxxx'\n\nThe String-to-Sign should have been\n'AWS4-HMAC-SHA256\nxxxxxxxxx/<region>/execute-api/aws4_request\n262xxx'\n"} For more information check: https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/403
I've tried including headers when making requests with httpx_auth v0.22.0, which results in a similar error. I am calling the service execute-api on AWS API Gateway.
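When debugging this kind of 403, it can help to recompute the string to sign locally and compare it with the one AWS prints in the error message. The sketch below follows the publicly documented Signature Version 4 steps using only the standard library. The secret key, the region and the canonical request are placeholder assumptions, and the helper names are hypothetical.

```python
import hashlib
import hmac


def _hmac_sha256(key: bytes, message: str) -> bytes:
    return hmac.new(key, message.encode("utf-8"), hashlib.sha256).digest()


def signing_key(secret_key: str, date: str, region: str, service: str) -> bytes:
    """Derive the SigV4 signing key; date is formatted as YYYYMMDD."""
    k_date = _hmac_sha256(("AWS4" + secret_key).encode("utf-8"), date)
    k_region = _hmac_sha256(k_date, region)
    k_service = _hmac_sha256(k_region, service)
    return _hmac_sha256(k_service, "aws4_request")


def string_to_sign(amz_date: str, region: str, service: str, canonical_request: str) -> str:
    """Build the string to sign from the hashed canonical request."""
    scope = f"{amz_date[:8]}/{region}/{service}/aws4_request"
    hashed = hashlib.sha256(canonical_request.encode("utf-8")).hexdigest()
    return "\n".join(["AWS4-HMAC-SHA256", amz_date, scope, hashed])


# Placeholder canonical request shaped like the one in the error message.
canonical_request = "\n".join([
    "GET",
    "/xxx",
    "",
    "host:xxx",
    "x-amz-date:20240412T155102Z",
    "",
    "host;x-amz-date",
    hashlib.sha256(b"").hexdigest(),  # hash of an empty request body
])
sts = string_to_sign("20240412T155102Z", "us-east-1", "execute-api", canonical_request)
key = signing_key("placeholder-secret", "20240412", "us-east-1", "execute-api")
signature = hmac.new(key, sts.encode("utf-8"), hashlib.sha256).hexdigest()
print(sts)
```

If the locally built canonical request differs from the one in the error message (for example in the list of signed headers), that difference is usually where the mismatch comes from.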
Sample code below provided under MIT and Apache 2.0 licenses. This runs and works on your computer if you install the gcloud cli and run gcloud auth login
import asyncio
import threading
import google.auth.transport.requests
import google.auth
import httpx

class GoogleAuth(httpx.Auth):
    """Adds required authorization for requests to Google Cloud Platform.

    This gets the default credentials for the running user, and uses them to
    generate valid tokens to attach to requests.
    """

    def __init__(self, scopes=("https://www.googleapis.com/auth/cloud-platform",)):
        self._sync_lock = threading.RLock()
        self._async_lock = asyncio.Lock()
        self.scopes = scopes
        self.creds = None

    def _refresh_creds(self):
        # Must only be called with a lock.
        if self.creds is None:
            self.creds, _ = google.auth.default(scopes=self.scopes)
        auth_req = google.auth.transport.requests.Request()
        self.creds.refresh(auth_req)

    def sync_auth_flow(self, request: httpx.Request):
        if self.creds is None or self.creds.expired:
            with self._sync_lock:
                self._refresh_creds()
        request.headers["Authorization"] = "Bearer " + self.creds.token
        yield request

    async def async_auth_flow(self, request: httpx.Request):
        if self.creds is None or self.creds.expired:
            async with self._async_lock:
                await asyncio.to_thread(self._refresh_creds)
        request.headers["Authorization"] = "Bearer " + self.creds.token
        yield request

client = httpx.Client(auth=GoogleAuth())
response = client.get("https://cloudresourcemanager.googleapis.com/v1/projects")
print(response.json())
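One detail worth noting in the sample above: expiry is checked before the lock is taken, so several threads that all see an expired token will each refresh in turn. A possible refinement is to check again once the lock is held. The sketch below shows that double-checked pattern with the standard library only. `RefreshingToken` and `fake_fetch` are hypothetical names and do not involve google.auth.

```python
import threading
import time


class RefreshingToken:
    """Caches a token and refreshes it at most once per expiry."""

    def __init__(self, fetch, lifetime_seconds: float):
        self._fetch = fetch
        self._lifetime = lifetime_seconds
        self._lock = threading.Lock()
        self._token = None
        self._expires_at = 0.0

    def _expired(self) -> bool:
        return self._token is None or time.monotonic() >= self._expires_at

    def get(self) -> str:
        if self._expired():
            with self._lock:
                # Re-check: another thread may have refreshed while we waited.
                if self._expired():
                    self._token = self._fetch()
                    self._expires_at = time.monotonic() + self._lifetime
        return self._token


calls = []


def fake_fetch() -> str:
    # Stand-in for a slow network call to the token endpoint.
    calls.append(1)
    time.sleep(0.05)
    return "token-%d" % len(calls)


holder = RefreshingToken(fake_fetch, lifetime_seconds=60)
threads = [threading.Thread(target=holder.get) for _ in range(8)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()
refresh_count = len(calls)
print(refresh_count)
```

The same idea carries over to the async flow by repeating the expiry check inside the `async with` block.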
Hi!
I have a daemon application that needs to call an API endpoint every X seconds, and whenever the token expires the application stops with the following error:
  File "/usr/local/lib/python3.8/site-packages/httpx/_client.py", line 787, in request
    return self.send(
  File "/usr/local/lib/python3.8/site-packages/httpx/_client.py", line 878, in send
    response = self._send_handling_auth(
  File "/usr/local/lib/python3.8/site-packages/httpx/_client.py", line 905, in _send_handling_auth
    request = next(auth_flow)
  File "/usr/local/lib/python3.8/site-packages/httpx/_auth.py", line 67, in sync_auth_flow
    request = next(flow)
  File "/usr/local/lib/python3.8/site-packages/httpx_auth/authentication.py", line 284, in auth_flow
    token = OAuth2.token_cache.get_token(
  File "/usr/local/lib/python3.8/site-packages/httpx_auth/oauth2_tokens.py", line 132, in get_token
    new_token = on_missing_token(**on_missing_token_kwargs)
  File "/usr/local/lib/python3.8/site-packages/httpx_auth/authentication.py", line 294, in request_new_token
    token, expires_in = request_new_grant_with_post(
  File "/usr/local/lib/python3.8/site-packages/httpx_auth/authentication.py", line 65, in request_new_grant_with_post
    with client:
  File "/usr/local/lib/python3.8/site-packages/httpx/_client.py", line 1239, in __enter__
    raise RuntimeError(msg)
RuntimeError: Cannot reopen a client instance, once it has been closed.
Looking at the implementation, it seems you are using a context manager to manage the client. This works correctly only once: when the same httpx client enters this function again, the context manager has already closed it, and that's why the error occurs.
To reproduce the issue just create an httpx client without a context manager and wait till the renewal needs to be done.
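The failure mode can be sketched without any network access. `StandInClient` below is a hypothetical stand-in that only mimics the one relevant behaviour from the traceback (a closed client cannot be re-entered). It is not httpx.Client, and the two request functions are illustrative rather than the library's code. The second function shows one possible fix: only close a client that the function created itself.

```python
class StandInClient:
    """Mimics a client that cannot be re-entered once it has been closed."""

    def __init__(self):
        self.closed = False

    def __enter__(self):
        if self.closed:
            raise RuntimeError("Cannot reopen a client instance, once it has been closed.")
        return self

    def __exit__(self, *exc_info):
        self.close()

    def close(self):
        self.closed = True

    def post(self, url: str) -> str:
        if self.closed:
            raise RuntimeError("Cannot send a request, as the client has been closed.")
        return "token"


def request_token_closing(client: StandInClient) -> str:
    # Problematic pattern: leaving the with block closes the caller's client.
    with client:
        return client.post("https://auth.example/token")


def request_token_reusable(client: StandInClient = None) -> str:
    # Only close the client when this function created it.
    owns_client = client is None
    client = client or StandInClient()
    try:
        return client.post("https://auth.example/token")
    finally:
        if owns_client:
            client.close()


shared = StandInClient()
first_call = request_token_closing(shared)
try:
    request_token_closing(shared)
    second_error = None
except RuntimeError as error:
    second_error = str(error)

kept_open = StandInClient()
reusable_results = [request_token_reusable(kept_open) for _ in range(2)]
print(first_call, second_error, reusable_results)
```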
Other OAuth2 flows accept scope and, if it's a list, convert it to a string. In the case of Implicit, scope is passed directly to the request, and if it's a list, it gets encoded with brackets.
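A small sketch of the difference, using `urllib.parse.urlencode` as a stand-in for the request encoding (the exact encoding performed by httpx may differ). `normalize_scope` is a hypothetical helper. Joining with a space follows the OAuth2 convention of space-delimited scopes.

```python
from urllib.parse import urlencode


def normalize_scope(scope):
    """Turn a list or tuple of scopes into a space-delimited string."""
    if isinstance(scope, (list, tuple)):
        return " ".join(scope)
    return scope


# Passing the list straight through encodes its Python repr, brackets included.
raw_query = urlencode({"scope": ["read", "write"]})
# Normalizing first yields the space-delimited form (space encoded as '+').
fixed_query = urlencode({"scope": normalize_scope(["read", "write"])})
print(raw_query)
print(fixed_query)
```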
Thank you @Colin-b and all contributors for this awesome package!
Just wanted to share some thoughts on possibility of async OAuth2 flow. I don't know all the corner cases this package handles, so please let me know whenever I write something naive.
OAuth2 flows use a global cache with a (threading) lock. It's a problem for async code (#48 (comment)), but also doesn't seem necessary. The token and lock could simply be instance variables. The only problem that I can think of is when a user creates multiple instances with the same auth server and key. Is there a valid case for such usage?
Actually the cache holds two locks, one for accessing cache, the other for refreshing the lock. Again this seems unnecessary:
The OAuth2 flows use locks within the .auth_flow() method, against the Auth documentation:
If the authentication scheme does I/O such as disk access or network calls, or uses synchronization primitives such as locks, you should override .sync_auth_flow() and/or .async_auth_flow() instead of .auth_flow() to provide specialized implementations that will be used by Client and AsyncClient respectively.
This is addressed in #48 patch.
Having two locks makes sense when storage and refresh were split. In this case acquiring the refresh lock could be pulled to .a/sync_auth_flow().
.auth_flow(), via .request_new_token(), uses its own instance of httpx.Client. Again, it doesn't seem necessary. Httpx already provides an a/sync-portable protocol for making HTTP requests from Auth instances: response = yield request.
If the auth server needs different transport options than the target server, mounts can be used. And if authentication is needed by the authentication server (meta-auth, client_auth param), the yielded requests need to be processed by meta-auth first.
Since all OAuth2 implementations in this package follow the pattern:
self.request_new_grant()
steps 1+2 run with a lock, and 3 is very lightweight, the entire flow can run with a single lock.
When the above are addressed, supporting both sync and async should be as simple as implementing this class:
import threading
from typing import AsyncGenerator, Generator

import anyio
from httpx import Auth, Request, Response

class LockingRefreshingAuth(Auth):
    def __init__(self):
        self._sync_lock = threading.Lock()
        self._async_lock = anyio.Lock()

    def sync_auth_flow(self, request: Request) -> Generator[Request, Response, None]:
        if self.requires_request_body:
            request.read()
        with self._sync_lock:
            # apply the loop over `self.auth_flow()`
            flow = self.auth_flow(request)
            ...

    async def async_auth_flow(self, request: Request) -> AsyncGenerator[Request, Response]:
        # like above, but async
        ...
Implementations would still yield their auth requests instead of directly using Client, and yield the user request before returning.
I've forked the repo to work on a POC. If it makes sense, and it's welcome, I'll make a PR. Meanwhile, comments are more than welcome.
edit 1: I've just noticed that auth token refresh requests also need to have authentication headers, and I guess that's why a client is used. But I still think it's unnecessary, and simply another Auth instance can be used
edit 2: It makes sense to have two locks if one lock protects a single Auth instance and the other the global cache.
I tested with the latest versions and the AWS auth is failing for temporary credentials (those that require a session token). The version from the original PR I pushed still works, so it's not a change in AWS. I will take a look soon (in the next couple of days) to see if I can figure out what broke it.
Hello Team,
I find your library very helpful, but I often have to automate scripts for Java Server Pages or Spring basic login as well.
These use a separate mechanism:
an action attribute like action="j_security_check"
an entry field for the username like name="j_username" and a field for the password like name="j_password"
I often use the mechanize package, but I think it would be better to implement this in your package, because I normally use the httpx library.
Best regards
The test seems to pass for me just by updating httpx and pytest-httpx.
The current code seems to use tokens exactly up to their expiry date (see httpx_auth/httpx_auth/oauth2_tokens.py, line 26 in 99ba755).
There could be two possible solutions:
1. Do not use the token right up until expiry is reached, but expire it a little earlier. E.g. if the token says to expire in 60 seconds, let it expire at 40 seconds and get a new one already. This leaves 20 seconds of leeway. The downside of this approach is that 20 seconds is just an arbitrary value and could still be insufficient to prevent the problem. It will also not solve issue 2).
class MyCustomAuth(httpx.Auth):
    def __init__(self, token):
        self.token = token

    def auth_flow(self, request):
        response = yield request
        if response.status_code == 401:
            # If the server issues a 401 response then resend the request,
            # with a custom `X-Authentication` header.
            request.headers['X-Authentication'] = self.token
            yield request
Of course there should be some kind of limit to the number of retries, so you don't get stuck indefinitely.
Unfortunately I don't have the time to prepare a proper PR, so I am using this as a workaround now (implementation of solution 1):
class _TokenCache(TokenMemoryCache):
    premature_expiration_seconds = 30

    def _add_token(self, key: str, token: str, expiry: float):
        """ If we use tokens exactly up to their expiry date, we will run into problems and race conditions,
        so we make the token expire a little earlier in the hope to prevent this problem, see
        https://github.com/Colin-b/httpx_auth/issues/23
        """
        return super()._add_token(key, token, expiry - self.premature_expiration_seconds)

OAuth2.token_cache = _TokenCache()
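The same early-expiry idea can be sketched independently of the library, which makes it easy to test with a fake clock. `EarlyExpiryCache` below is a hypothetical, simplified cache, not httpx_auth's TokenMemoryCache. The 30 second leeway is the same arbitrary value as in the workaround above.

```python
import time


class EarlyExpiryCache:
    """Stores tokens and treats them as expired `leeway_seconds` early."""

    def __init__(self, leeway_seconds: float = 30.0, clock=time.time):
        self._leeway = leeway_seconds
        self._clock = clock
        self._tokens = {}

    def add(self, key: str, token: str, expiry: float) -> None:
        # Shift the stored expiry so the token is dropped before the server rejects it.
        self._tokens[key] = (token, expiry - self._leeway)

    def get(self, key: str):
        entry = self._tokens.get(key)
        if entry is None:
            return None
        token, effective_expiry = entry
        if self._clock() >= effective_expiry:
            del self._tokens[key]
            return None
        return token


# A fake clock makes the behaviour observable without waiting.
now = [1000.0]
cache = EarlyExpiryCache(leeway_seconds=30.0, clock=lambda: now[0])
cache.add("api", "abc", expiry=1060.0)

before_leeway = cache.get("api")   # time 1000, effective expiry 1030
now[0] = 1031.0
inside_leeway = cache.get("api")   # token still valid server-side, but dropped
print(before_leeway, inside_leeway)
```

As noted in the issue, any fixed leeway remains a heuristic, so a bounded retry on a 401 response may still be needed on top of it.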
Following what is done here:
https://github.com/DavidMuller/aws-requests-auth/blob/2e1dd0f37e3815c417c3b0630215a77aab5af617/aws_requests_auth/boto_utils.py
Any interest in using boto3 to automatically retrieve AWS credentials?
If you want to install httpx_auth with the current version of httpx (0.27) with poetry you get this error:
Using version ^0.21.0 for httpx-auth
Updating dependencies
Resolving dependencies... (0.2s)
Because no versions of httpx-auth match >0.21.0,<0.22.0
and httpx-auth (0.21.0) depends on httpx (==0.26.*), httpx-auth (>=0.21.0,<0.22.0) requires httpx (==0.26.*).
So, because http-scripts depends on both httpx (^0.27.0) and httpx-auth (^0.21.0), version solving failed.