Comments (6)

JWCook commented on June 10, 2024

Thanks for looking into this. So the more general case of this is when a request results in a cache miss, but the same request with updated headers results in a cache hit. That isn't something I expected, but it makes sense after looking at your example with the GitHub API.

Specifically, this is a corner case with:

  • streaming requests
  • conditional requests
  • response redirects that change based on validation status

In this case, when the remote content has changed (a 200 response instead of a 304), GitHub redirects the request from api.github.com to codeload.github.com.
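
As a rough illustration of that redirect behavior, here is a toy model that needs no network access. Only the two host names come from the comment above; the paths, the ETag values, and the functions `toy_server` and `fetch` are hypothetical, and the real GitHub responses may differ in detail.

```python
# Toy model only: the hosts are taken from the discussion; the paths, ETag
# values, and status handling are illustrative assumptions.
API_URL = "https://api.github.com/archive"
DOWNLOAD_URL = "https://codeload.github.com/archive"


def toy_server(url, headers, current_etag):
    """Return (status, response_headers) like a very small fake server."""
    if url == API_URL:
        # Validator still matches: nothing changed, so no redirect happens.
        if headers.get("If-None-Match") == current_etag:
            return 304, {}
        # Content changed: send the client to the download host.
        return 302, {"Location": DOWNLOAD_URL}
    return 200, {"ETag": current_etag}


def fetch(url, headers, current_etag):
    """Follow redirects and return the list of (url, status) pairs visited."""
    visited = []
    while True:
        status, response_headers = toy_server(url, headers, current_etag)
        visited.append((url, status))
        if status != 302:
            return visited
        url = response_headers["Location"]
```

Calling `fetch(API_URL, {"If-None-Match": "v1"}, "v1")` visits only the API host, while the same call with a changed server-side ETag also visits the download host. That second URL is where a separate cache lookup can come into play.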

from requests-cache.

JWCook commented on June 10, 2024

Thanks for the bug report. That attribute is supposed to have been set in requests.Response.__init__() before being wrapped by OriginalResponse.

I'm not sure yet why that's happening, but it should be easy enough to fix.

netomi commented on June 10, 2024

I am not 100 percent sure why it happens; I will try to put together a test case. It happens when using the GitHub API to download an archive of a repository with stream enabled.

At the moment I use force refresh to work around the problem. I had not noticed it with a previous version of the cache, only since I updated to the latest version, but that might be a coincidence.

netomi commented on June 10, 2024

OK, so this test fails after the response has been cached in a filesystem cache.
Strangely, if I run the test a second time, it still works, but from the third run on it fails consistently with the same error as above. Something weird is going on that I will need to dig into further.

from requests_cache import CachedSession
from tempfile import TemporaryFile


def test():
    # Filesystem-backed cache stored relative to the working directory
    session = CachedSession(
        "test",
        backend="filesystem",
        use_cache_dir=False,
        cache_control=True,
        allowable_methods=["GET"],
    )

    headers = {
        "Accept": "application/vnd.github+json",
        "X-GitHub-Api-Version": "2022-11-28",
        "X-Github-Next-Global-ID": "1",
    }

    # refresh=True revalidates any cached response; stream=True defers the body download
    response = session.request(
        "GET",
        url="http://api.github.com/repos/requests-cache/requests-cache/zipball/main",
        headers=headers,
        refresh=True,
        stream=True,
    )

    # Consume the streamed body in chunks
    with TemporaryFile() as file:
        for chunk in response.iter_content(chunk_size=8192):
            file.write(chunk)

netomi commented on June 10, 2024

I do believe it's due to redirects. The URL that is requested is redirected to some other URL.
At some point in the _send_and_cache method we receive a CachedResponse that we then try to wrap in an OriginalResponse, which leads to the problem. If I change the code to the following:

        if isinstance(response, CachedResponse):
            return response
        else:
            return OriginalResponse.wrap_response(response, actions)

the problem goes away. Alternatively, this check could live in the wrap_response method itself. However, I do not fully understand whether the execution flow is correct in the case of redirects, so a clean fix might belong somewhere else.
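
For readers without the library source at hand, the guard can be sketched in isolation. This is a minimal sketch using hypothetical stand-in classes (`FakeCachedResponse`, `FakeLiveResponse`, `FakeWrappedResponse`) and a hypothetical helper `wrap_unless_cached`. It is not the requests-cache implementation and not necessarily the fix that was eventually merged.

```python
from dataclasses import dataclass


@dataclass
class FakeCachedResponse:
    """Hypothetical stand-in for a response read back from the cache."""
    url: str
    from_cache: bool = True


@dataclass
class FakeLiveResponse:
    """Hypothetical stand-in for a fresh response from the network."""
    url: str


@dataclass
class FakeWrappedResponse:
    """Hypothetical stand-in for the wrapper applied to fresh responses."""
    inner: FakeLiveResponse
    from_cache: bool = False


def wrap_unless_cached(response):
    """Wrap fresh responses only; pass cached responses through unchanged."""
    # A cached response is already in its final form, so return it untouched.
    if isinstance(response, FakeCachedResponse):
        return response
    # Anything else is treated as a fresh response and gets wrapped.
    return FakeWrappedResponse(inner=response)
```

The point of the pattern is that the wrapping step becomes safe to call on either kind of response, so a cached response that surfaces part-way through a redirect chain is no longer re-wrapped.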

netomi commented on June 10, 2024

Initially I thought it was related to an update of the requests-cache library, but I guess that was just a coincidence: I tested older versions and triggered the same behavior. GitHub must have added a redirect for that API call, which caused this behavior, as it had been working perfectly for a couple of months before.

Thank you for fixing this.
