Comments (10)

cooperlees commented on August 15, 2024

bandersnatch should already support http along with socks proxies - https://bandersnatch.readthedocs.io/en/latest/mirror_configuration.html#proxy

Do we need to update docs somewhere else to avoid confusion there? What made you think we couldn't do http?

https is not as simple due to limitations with aiohttp/asyncio that we use. Please see https://docs.aiohttp.org/en/stable/client_advanced.html#proxy-support

I would consider a PR that makes https work if that's now reliable with >= python3.10 ... I'd even accept it if it's >= 3.11 and we gate it appropriately.

Sadly, as of today we still have dependencies that don't work in 3.12. Would love any help there too, if 3.12 allows better https proxy support.

from bandersnatch.

encbladexp commented on August 15, 2024

Hi, thanks for your quick reply. Let's dig a little deeper. I am debugging an issue with the Katello Foreman plugin, which uses Pulp and bandersnatch to mirror parts of PyPI. The issue we are facing is that at the beginning of the process we see traffic on our proxy, but later on we see timeouts in the logs and no more traffic on the proxy.

While debugging this, I found this line in the code:

socks_connector = self._check_for_socks_proxy()

This suggested to me that bandersnatch supports only SOCKS proxies.

An HTTP proxy is enough; there is no need for HTTPS proxies as of now. In the end it's just HTTP CONNECT towards a proxy anyway.

Based on your feedback, I checked how Pulp Python passes the proxy: by using an environment variable.

Which means it is possible to configure a proxy by using environment variables in bandersnatch, while SOCKS proxies need a config entry, right?

I will of course create an issue for Pulp Python; they could probably write a configuration file that includes the needed proxy settings.

cooperlees commented on August 15, 2024

Yeah, I had to use a plugin for SOCKS proxy support someone requested long ago, everything else is native aiohttp proxy support.

I checked - We allow the aiohttp proxy env vars too so that should work, providing you don't supply a SOCKS proxy:

trust_env=True if not socks_connector else False,
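
The gating in that line can be sketched without bandersnatch itself. The helper below is hypothetical (`build_session_kwargs` is not a bandersnatch function); it only illustrates how the `trust_env` flag, which aiohttp's `ClientSession` accepts as a keyword argument, ends up disabled once a SOCKS connector is supplied.

```python
from typing import Any, Dict, Optional


def build_session_kwargs(socks_connector: Optional[Any] = None) -> Dict[str, Any]:
    """Hypothetical helper: keyword arguments one could pass to aiohttp.ClientSession.

    trust_env is enabled only when no SOCKS connector is configured, so the
    http_proxy / https_proxy environment variables are ignored with SOCKS.
    """
    return {
        "connector": socks_connector,
        # For "None vs. a connector object" this matches
        # `True if not socks_connector else False` from the line quoted above.
        "trust_env": socks_connector is None,
    }


if __name__ == "__main__":
    # Illustration only: a plain object stands in for a real SOCKS connector.
    print(build_session_kwargs())          # environment proxies would be honoured
    print(build_session_kwargs(object()))  # environment proxies would be ignored
```

With no SOCKS proxy in the config, the session would fall back to whatever proxy variables are present in the process environment.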

encbladexp commented on August 15, 2024

It took some time, but I was able to debug the issue, and I have at least a hotfix for my demo environment. I hardcoded self.proxy at the following line, and it worked immediately:

self.proxy = proxy

I also added some debug statements to find out whether the proxy environment variables are configured properly: they are. This means the proxy configuration is passed correctly from Katello to Pulp, from Pulp to Pulp Python, and finally to bandersnatch. The missing piece is now: why is aiohttp not using my configured proxy? Are we passing a configuration to aiohttp that disables proxy support?

Any guidance on how I could debug this further? The proxy we are using is a normal HTTP proxy, nothing special, and it works well after hardcoding self.proxy and with curl on a shell.
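
One way to check what the mirroring process itself sees is to print the proxy variables from inside Python. This is a minimal sketch using only the standard library; `report_proxy_env` is a name made up for this example, and it assumes that aiohttp's `trust_env` picks up the same `http_proxy` / `https_proxy` variables that `urllib.request.getproxies_environment()` reports.

```python
from typing import Dict
from urllib.request import getproxies_environment


def report_proxy_env() -> Dict[str, str]:
    """Print and return the proxy settings visible in this process's environment."""
    # Maps scheme -> proxy URL, built from *_proxy variables (either case).
    proxies = getproxies_environment()
    for scheme in ("http", "https"):
        print(f"{scheme}: {proxies.get(scheme, '<not set>')}")
    return proxies


if __name__ == "__main__":
    report_proxy_env()
```

Running this from the same service or worker that launches bandersnatch, rather than from an interactive shell, shows whether the variables actually reach that process.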

cooperlees commented on August 15, 2024

The env vars being set by the code above seem correct ... I did a test with uppercase and lowercase of http_proxy and https_proxy set and it worked for me ...

I did a quick test from my laptop, as I have squid proxies across my personal infra and it all worked as I imagined.

(Cloned just to get the CI config file that runs on GitHub actions as it downloads a small set of packages from PyPI)

git clone https://github.com/pypa/bandersnatch.git
cd bandersnatch
python3.11 -m venv --upgrade-deps /tmp/tb
/tmp/tb/bin/pip install bandersnatch
mkdir /tmp/pypi
export http_proxy=http://10.254.254.15:3128/
export https_proxy=http://10.254.254.15:3128/
/tmp/tb/bin/bandersnatch --debug  -c src/bandersnatch/tests/ci.conf mirror
  • Using tcpdump -n -nn -i any 'tcp port 3128' I see proxy connections working
20:41:04.958580 IP 10.254.254.15.3128 > 10.6.9.69.64544: Flags [P.], seq 218547034:218547058, ack 49518, win 501, options [nop,nop,TS val 251298194 ecr 2140719684], length 24
20:41:04.958580 IP 10.254.254.15.3128 > 10.6.9.69.64544: Flags [F.], seq 218547058, ack 49518, win 501, options [nop,nop,TS val 251298194 ecr 2140719684], length 0

So I don't know what else I can help debug here. From reading the code, a proxy set in the config also takes precedence over the environment variables, as we set it on the respective GET calls ...
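
The precedence described here can be written down as a small decision function. This is an illustrative sketch, not bandersnatch's code: `resolve_proxy` and its return labels are invented for the example, and the environment lookup is passed in as a plain dictionary.

```python
from typing import Mapping, Optional, Tuple


def resolve_proxy(
    config_proxy: Optional[str],
    env_proxies: Mapping[str, str],
    scheme: str = "https",
) -> Tuple[Optional[str], str]:
    """Return (proxy_url, source) for a request with the given URL scheme.

    Assumed order, following the discussion above:
    1. a proxy from the config file wins,
    2. otherwise the matching environment variable is used,
    3. otherwise the request goes out directly.
    """
    if config_proxy:
        return config_proxy, "config"
    env_proxy = env_proxies.get(scheme)
    if env_proxy:
        return env_proxy, "environment"
    return None, "direct"


if __name__ == "__main__":
    env = {"https": "http://10.254.254.15:3128/"}
    print(resolve_proxy(None, env))
    print(resolve_proxy("http://proxy.example:3128/", env))
```

The second call uses a placeholder hostname to show a config value overriding the environment.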

encbladexp commented on August 15, 2024

I assume you are running the most current version of bandersnatch, right? On our end we have bandersnatch 5.3.0 and aiohttp 3.8.3. Could this be the difference?

cooperlees commented on August 15, 2024

Maybe, but I would expect that version of aiohttp to work with an HTTP proxy (not HTTPS). Just so it's known, I'm also using Python 3.11. I'd use 3.12, but we still have dependencies not ready for 3.12. Much sad.

I tried the same with your versions of bandersnatch and aiohttp, but got an error (/tmp/tb/bin/pip install aiohttp==3.8.3 bandersnatch==5.3.0 ):

  File "/tmp/tb/lib64/python3.11/site-packages/pkg_resources/__init__.py", line 2522, in resolve
    module = __import__(self.module_name, fromlist=['__name__'], level=0)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/tmp/tb/lib64/python3.11/site-packages/bandersnatch_filter_plugins/latest_name.py", line 5, in <module>
    from packaging.version import LegacyVersion, Version, parse
ImportError: cannot import name 'LegacyVersion' from 'packaging.version' (/tmp/tb/lib64/python3.11/site-packages/packaging/version.py)

So I then tried latest bandersnatch with aiohttp 3.8.3 and it worked (/tmp/tb/bin/pip install -U bandersnatch):

07:59:23.155503 IP 10.254.254.15.3128 > 10.6.9.12.44204: Flags [P.], seq 7011372:7012696, ack 1154, win 501, options [nop,nop,TS val 714933405 ecr 1452479961], length 1324
07:59:23.155922 IP 10.6.9.12.44204 > 10.254.254.15.3128: Flags [.], ack 7012696, win 21745, options [nop,nop,TS val 1452479962 ecr 714933405], length 0
  • IP differs as I'm on my work Linux desktop this morning

So I would double check your environment is correctly set and unless bandersnatch 5.3.0 had a proxy bug, I don't think it's bandersnatch's fault here.

encbladexp commented on August 15, 2024

I am out of office for about a week. Afterwards I will run the same test as you did on the affected machine, so we know whether it is something weird in the Katello/Foreman integration, something host-specific, or something else.

But it's good to know that it works for you. 👍

encbladexp commented on August 15, 2024

git clone https://github.com/pypa/bandersnatch.git
cd bandersnatch
python3.9 -m venv --upgrade-deps /tmp/tb
/tmp/tb/bin/pip install bandersnatch==5.3.0 aiohttp==3.8.3 packaging==21.3
mkdir /tmp/pypi
export http_proxy=http://10.254.254.15:3128/
export https_proxy=http://10.254.254.15:3128/
/tmp/tb/bin/bandersnatch --debug  -c src/bandersnatch/tests/ci.conf mirror

I was able to get it working on the system which has issues. The error you got was related to packaging >= 22.0. It also works with the system-provided version of bandersnatch. Of course, I used the correct values for http_proxy and https_proxy.

So it's not working only in the Foreman/Katello case; in every other scenario I could test, it works well. Bandersnatch just uses aiohttp within the same process, right? So it could not be something weird or unexpected, like environment variables not being passed properly?

(Should be very unlikely.)

encbladexp commented on August 15, 2024

I updated our Demo Environment to the current Katello / Foreman release, and the issue disappeared.
