aivarsk / scrapy-proxies
Random proxy middleware for Scrapy
License: MIT License
Hi, Aivars!
I use your random proxy middleware for Scrapy (scrapy_proxies). It works fine, thanks a lot!
First, I build list.txt (the list of proxies) by scraping a free-proxy site (without proxy rotation).
Then I scrape another site with scrapy_proxies enabled.
When I run these as two different Scrapy projects, everything works well.
When I try to run both in one Scrapy project, unfortunately, it doesn't work. Probably because in that case it tries to use list.txt for proxy rotation while the file is still empty during the request to the free-proxy site.
Is there another way to handle this?
Thank you
Hello, on Reposhub I found this setting mentioned:
'use_real_when_empty': False,
Does it work? I couldn't find the corresponding function inside the code.
When I run it locally it works fine, but when I run it in the cloud, every request returns 403 Forbidden.
I tried several different approaches.
I added the file proxylist.txt to the same folder as the project's settings, and I also uploaded it to "https://dl.dropboxusercontent.com/s/esdm19mnvz2yguf/proxylist.txt"
I substituted the name in the setting:
PROXY_LIST = 'https://dl.dropboxusercontent.com/s/esdm19mnvz2yguf/proxylist.txt'
or
PROXY_LIST = 'proxylist.txt'
or
PROXY_LIST = '/proxylist.txt'
PROXY_LIST = '../proxylist.txt'
With PROXY_LIST = 'proxylist.txt' on my PC it works like a charm, but not once I deploy it to Scrapy Cloud.
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/site-packages/twisted/internet/defer.py", line 1299, in _inlineCallbacks
    result = g.send(result)
  File "/usr/local/lib/python2.7/site-packages/scrapy/crawler.py", line 90, in crawl
    six.reraise(*exc_info)
  File "/usr/local/lib/python2.7/site-packages/scrapy/crawler.py", line 72, in crawl
    self.engine = self._create_engine()
  File "/usr/local/lib/python2.7/site-packages/scrapy/crawler.py", line 97, in _create_engine
    return ExecutionEngine(self, lambda _: self.stop())
  File "/usr/local/lib/python2.7/site-packages/scrapy/core/engine.py", line 69, in __init__
    self.downloader = downloader_cls(crawler)
  File "/usr/local/lib/python2.7/site-packages/scrapy/core/downloader/__init__.py", line 88, in __init__
    self.middleware = DownloaderMiddlewareManager.from_crawler(crawler)
  File "/usr/local/lib/python2.7/site-packages/scrapy/middleware.py", line 58, in from_crawler
    return cls.from_settings(crawler.settings, crawler)
  File "/usr/local/lib/python2.7/site-packages/scrapy/middleware.py", line 36, in from_settings
    mw = mwcls.from_crawler(crawler)
  File "/app/python/lib/python2.7/site-packages/scrapy_proxies/randomproxy.py", line 55, in from_crawler
    return cls(crawler.settings)
  File "/app/python/lib/python2.7/site-packages/scrapy_proxies/randomproxy.py", line 35, in __init__
    fin = open(self.proxy_list)
IOError: [Errno 2] No such file or directory: '../proxylist.txt'
Please, I need some help.
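A relative PROXY_LIST is resolved against the process working directory, which differs on Scrapy Cloud. A sketch that builds an absolute path instead (assuming proxylist.txt ships inside the project package next to settings.py):

```python
# settings.py -- build an absolute path so the file is found
# regardless of the working directory (hypothetical layout:
# proxylist.txt sits next to this settings module).
import os

PROXY_LIST = os.path.join(os.path.dirname(os.path.abspath(__file__)), 'proxylist.txt')
```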
My proxies.txt file looks like:
http://username:password@IP:Port
http://username:password@IP1:Port
.........
There are over 100 dedicated proxies (all active), but using this library with PROXY_LIST = 'proxies.txt', all I get is the "All proxies are unusable" error.
Hi,
buddy,
does this support shadowsocks?
I looked through your code and found that the proxy meta value is set only when the proxy record contains a proxy_user_pass:
def process_request(self, request, spider):
    # Don't overwrite with a random one (server-side state for IP)
    if 'proxy' in request.meta:
        if request.meta["exception"] is False:
            return
    request.meta["exception"] = False
    if len(self.proxies) == 0:
        raise ValueError('All proxies are unusable, cannot proceed')
    if self.mode == ProxyMode.RANDOMIZE_PROXY_EVERY_REQUESTS:
        proxy_address = random.choice(list(self.proxies.keys()))
    else:
        proxy_address = self.chosen_proxy
    proxy_user_pass = self.proxies[proxy_address]
    if proxy_user_pass:
        request.meta['proxy'] = proxy_address
        basic_auth = 'Basic ' + base64.b64encode(proxy_user_pass.encode()).decode()
        request.headers['Proxy-Authorization'] = basic_auth
    else:
        log.debug('Proxy user pass not found')
    log.debug('Using proxy <%s>, %d proxies left' % (
        proxy_address, len(self.proxies)))
Have I missed something?
This is not an issue.
Sometimes, when we open a URL we get an HTTP 200 response but zero results, i.e. I got banned by that website.
Is there any way to force-remove such a proxy from the list?
Thank you! Any help appreciated :)
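There is no built-in hook for this, but a custom downloader middleware could treat a ban page served with HTTP 200 as a failure and drop the proxy. A sketch (the class name, marker string, and shared-dict wiring are all hypothetical; adjust the ban check to the target site):

```python
# Sketch of a custom downloader middleware that drops the current
# proxy when a 200 response looks like a ban page. Not part of
# scrapy-proxies; 'Access Denied' is a hypothetical marker.
class BanAwareProxyMiddleware:
    def __init__(self, proxies):
        # proxies: dict of proxy_address -> user_pass, shared with
        # the rotation logic
        self.proxies = proxies

    def process_response(self, request, response, spider):
        proxy = request.meta.get('proxy')
        if proxy and response.status == 200 and b'Access Denied' in response.body:
            # Remove the banned proxy and re-schedule the request so
            # it goes out through a different one
            self.proxies.pop(proxy, None)
            request.meta.pop('proxy', None)
            return request.replace(dont_filter=True)
        return response
```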
When I attempt to read the proxy from the request I get "KeyError: 'proxy'". Previously, I was able to get the IP address before using the proxies. Is there any way to get the proxy address that was used?
def parse_item(self, response):
    item = {}
    item['url'] = response.url
    item['download_latency'] = download_latency = response.request.meta['download_latency']
    item['proxy'] = response.request.meta['proxy']
A separate question from the previous one: is there any way to get the start and stop time for a request? I'm trying to get a better understanding of CONCURRENT_REQUESTS and how best to maximize requests per second.
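Since the middleware only sets meta['proxy'] when it actually attaches a proxy, one defensive option (a sketch, not the library's API) is reading it with dict.get() so the item carries None instead of raising:

```python
# Sketch of the callback above using meta.get(), so a request that
# went out without a proxy yields None rather than a KeyError.
def parse_item(self, response):
    item = {}
    item['url'] = response.url
    item['download_latency'] = response.request.meta.get('download_latency')
    item['proxy'] = response.request.meta.get('proxy')  # None if no proxy was attached
    return item
```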
The proxy is not set in meta if the proxy URL has no username and password.
In the process_request function, the proxy is passed to the request only if it has a proxy_user_pass; otherwise it only logs that the proxy is being used and how many are left. Does that mean a proxy like https://176.37.14.252:8080 does not work?
This is the function:
def process_request(self, request, spider):
    # Don't overwrite with a random one (server-side state for IP)
    if 'proxy' in request.meta:
        if request.meta["exception"] is False:
            return
    request.meta["exception"] = False
    if len(self.proxies) == 0:
        raise ValueError('All proxies are unusable, cannot proceed')
    if self.mode == Mode.RANDOMIZE_PROXY_EVERY_REQUESTS:
        proxy_address = random.choice(list(self.proxies.keys()))
    else:
        proxy_address = self.chosen_proxy
    proxy_user_pass = self.proxies[proxy_address]
    if proxy_user_pass:
        request.meta['proxy'] = proxy_address
        basic_auth = 'Basic ' + base64.b64encode(proxy_user_pass.encode()).decode()
        request.headers['Proxy-Authorization'] = basic_auth
    else:
        log.debug('Proxy user pass not found')
    log.debug('Using proxy <%s>, %d proxies left' % (
        proxy_address, len(self.proxies)))
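Both reports above describe the same behavior: request.meta['proxy'] is assigned only inside the if proxy_user_pass: branch, so credential-less entries are never attached. A minimal sketch of the reordered assignment (hypothetical helper name, not the library's API): set the proxy unconditionally, add the auth header only when credentials exist.

```python
import base64

def attach_proxy(request_meta, request_headers, proxy_address, proxy_user_pass):
    # Always attach the proxy; add Proxy-Authorization only when
    # credentials were parsed from the list entry.
    request_meta['proxy'] = proxy_address
    if proxy_user_pass:
        basic_auth = 'Basic ' + base64.b64encode(proxy_user_pass.encode()).decode()
        request_headers['Proxy-Authorization'] = basic_auth
```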
Hi, I'm getting this error many times. I used my own custom middleware and passed the proxy like this: http://username:password@104.120.33.32:12345
Error message:
scrapy.core.downloader.handlers.http11.TunnelError: Could not open CONNECT tunnel with proxy 104.120.33.32:12345 [{'status': 407, 'reason': b'Proxy Authentication Required'}]
I'm using scrapy-splash to crawl an AJAX site. When using scrapy-proxies with it, it seems the request is not sent through the proxy; the proxy is not used at all.
Hey there,
Just looking for some basic info. I'm trying to figure out how to properly build my ProxyList.txt file. I've got the IP addresses from HMA Pro, but I'm not sure how to locate the port that goes at the end. I've tried searching Google for how to find the ports but I'm still not sure. Is there another free service I could use to find the information I need (IP address and port)?
Thanks a ton
And what's the license of the code?
Thank you =)
Hi,
I use a proxy list to run my spider. However, it fails to pick a new proxy when a connection failure happens.
2016-09-20 17:48:25 [scrapy] DEBUG: Using proxy http://xxx.160.162.95:8080, 3 proxies left
2016-09-20 17:48:27 [scrapy] INFO: Removing failed proxy http://xxx.160.162.95:8080, 2 proxies left
2016-09-20 17:48:27 [scrapy] DEBUG: Retrying <GET http://jsonip.com/> (failed 1 times): User timeout caused connection failure: Getting http://jsonip.com/ took longer than 2.0 seconds..
2016-09-20 17:48:29 [scrapy] INFO: Removing failed proxy http://xxx.160.162.95:8080, 2 proxies left
2016-09-20 17:48:29 [scrapy] DEBUG: Retrying <GET http://jsonip.com/> (failed 2 times): User timeout caused connection failure: Getting http://jsonip.com/ took longer than 2.0 seconds..
2016-09-20 17:48:31 [scrapy] INFO: Removing failed proxy http://xxx.160.162.95:8080, 2 proxies left
2016-09-20 17:48:31 [scrapy] DEBUG: Gave up retrying <GET http://jsonip.com/> (failed 3 times): User timeout caused connection failure: Getting http://jsonip.com/ took longer than 2.0 seconds..
Please help me fix this problem.
Thanks a lot
Hi, is it possible to change the proxy on HTTP code 429?
If I get a 429 error, I want to switch to another proxy from the list.
So I want to run PROXY_MODE = 1,
but if I get a 429, I want to check for and change to a new proxy.
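One approach worth trying (a sketch; the exact behavior with PROXY_MODE = 1 depends on how the middleware removes failed proxies) is adding 429 to Scrapy's retryable status codes, so the retry cycle re-enters the proxy middleware:

```python
# settings.py -- treat 429 as a retryable failure so the retry
# cycle runs again and the proxy middleware gets another chance
# to pick or switch proxies.
RETRY_TIMES = 10
RETRY_HTTP_CODES = [500, 502, 503, 504, 400, 403, 404, 408, 429]
```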
if proxy_user_pass:
    request.meta['proxy'] = proxy_address
    basic_auth = 'Basic ' + base64.b64encode(proxy_user_pass.encode()).decode()
    request.headers['Proxy-Authorization'] = basic_auth
else:
    log.debug('Proxy user pass not found')
log.debug('Using proxy <%s>, %d proxies left' % (
    proxy_address, len(self.proxies)))
I am very confused here as a noob Python developer. From this part of the logic in the randomproxy file, it seems that if the proxy provided in list.txt is in the http://username:password@host:port format, then it works by assigning proxy_address to the request; otherwise it does nothing but log a debug message...
What am I missing here?
If possible, please add the ability to change the proxy every N-th request.
Add a variable for setting N and a new "Proxy mode" value for this.
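Until such a mode exists, the requested rotation could be sketched as a small picker that keeps the current proxy for N requests before drawing a new one (hypothetical class, not part of scrapy-proxies):

```python
import random

class EveryNRequestsProxyPicker:
    """Sketch of the requested mode: reuse one proxy for N requests,
    then pick a new random one from the pool."""
    def __init__(self, proxies, n):
        self.proxies = proxies          # list of proxy addresses
        self.n = n
        self.count = 0
        self.current = random.choice(self.proxies)

    def next_proxy(self):
        self.count += 1
        if self.count > self.n:
            # N requests served: reset the counter and rotate
            self.count = 1
            self.current = random.choice(self.proxies)
        return self.current
```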
When the file has an empty line, this code raises an error in many cases:
if parts.group(2): AttributeError: 'NoneType' object has no attribute 'group'
So, before calling group() we could check the match first, like this:
if parts:
    if parts.group(2):
        ...
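Expanding on that guard, a sketch of a parsing loop that skips blank and malformed lines entirely (hypothetical helper; the pattern has the same shape as the one the middleware uses):

```python
import re

def parse_proxy_lines(lines):
    # Sketch of the list-parsing loop guarding against blank lines
    # and entries the regex does not match, instead of calling
    # .group() on None.
    proxies = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip empty lines
        parts = re.match(r'(\w+://)([^:]+?:[^@]+?@)?(.+)', line)
        if not parts:
            continue  # skip malformed entries
        if parts.group(2):
            user_pass = parts.group(2)[:-1]  # drop trailing '@'
        else:
            user_pass = ''
        proxies[parts.group(1) + parts.group(3)] = user_pass
    return proxies
```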
Is there any way to detect failure through something other than the HTTP status code?
Maybe based on the response body, headers, or something else?
Can I pass the name of the proxy file as a variable to Scrapy?
Then, if I'm running multiple crawlers at the same time, I would be able to use a different list of proxies for each.
Thank you
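Since PROXY_LIST is read from Scrapy's settings, it can be overridden per run with the -s command-line option, so each concurrent crawl can point at its own file (spider names and paths here are hypothetical):

```shell
# Each crawl gets its own proxy file via a per-run setting override.
scrapy crawl spider_one -s PROXY_LIST=/path/to/proxies_one.txt
scrapy crawl spider_two -s PROXY_LIST=/path/to/proxies_two.txt
```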
Is it possible to use a proxy-list API like https://getproxylist.com/#the-api ?
I supplied a list of about 300 proxies and set CONCURRENT_REQUESTS = 64. Still, crawling seems very slow (about 1 page every few seconds on average), much slower than not using any proxy at all. DOWNLOAD_DELAY is low, of course.
From what I've seen, people should usually also increase CONCURRENT_REQUESTS_PER_DOMAIN in these cases (i.e. with a list of many possibly bad proxies), but even then it's still pretty slow.
I'm getting this error:
ValueError: All proxies are unusable, cannot proceed
2017-05-13 14:09:02 [scrapy.utils.log] INFO: Scrapy 1.3.3 started (bot: scrapy_bets)
2017-05-13 14:09:02 [scrapy.utils.log] INFO: Overridden settings: {'NEWSPIDER_MODULE': 'scrapy_bets.spiders', 'FEED_URI': 'matches.json', 'SPIDER_MODULES': ['scrapy_bets.spiders'], 'RETRY_TIMES': 10, 'BOT_NAME': 'scrapy_bets', 'RETRY_HTTP_CODES': [500, 503, 504, 400, 403, 404, 408], 'FEED_FORMAT': 'json'}
2017-05-13 14:09:02 [scrapy.middleware] INFO: Enabled extensions:
['scrapy.extensions.feedexport.FeedExporter',
'scrapy.extensions.logstats.LogStats',
'scrapy.extensions.telnet.TelnetConsole',
'scrapy.extensions.corestats.CoreStats']
2017-05-13 14:09:02 [scrapy.middleware] INFO: Enabled downloader middlewares:
['scrapy.downloadermiddlewares.retry.RetryMiddleware',
'scrapy_proxies.RandomProxy',
'scrapy.downloadermiddlewares.httpauth.HttpAuthMiddleware',
'scrapy.downloadermiddlewares.downloadtimeout.DownloadTimeoutMiddleware',
'scrapy.downloadermiddlewares.defaultheaders.DefaultHeadersMiddleware',
'scrapy.downloadermiddlewares.useragent.UserAgentMiddleware',
'scrapy.downloadermiddlewares.redirect.MetaRefreshMiddleware',
'scrapy.downloadermiddlewares.httpcompression.HttpCompressionMiddleware',
'scrapy.downloadermiddlewares.redirect.RedirectMiddleware',
'scrapy.downloadermiddlewares.cookies.CookiesMiddleware',
'scrapy.downloadermiddlewares.stats.DownloaderStats']
2017-05-13 14:09:02 [scrapy.middleware] INFO: Enabled spider middlewares:
['scrapy.spidermiddlewares.httperror.HttpErrorMiddleware',
'scrapy.spidermiddlewares.offsite.OffsiteMiddleware',
'scrapy.spidermiddlewares.referer.RefererMiddleware',
'scrapy.spidermiddlewares.urllength.UrlLengthMiddleware',
'scrapy.spidermiddlewares.depth.DepthMiddleware']
2017-05-13 14:09:02 [scrapy.middleware] INFO: Enabled item pipelines:
[]
2017-05-13 14:09:02 [scrapy.core.engine] INFO: Spider opened
2017-05-13 14:09:02 [scrapy.extensions.logstats] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
2017-05-13 14:09:02 [scrapy.extensions.telnet] DEBUG: Telnet console listening on 127.0.0.1:6023
2017-05-13 14:09:02 [scrapy.core.scraper] ERROR: Error downloading <GET http://url_to_parse>
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/site-packages/twisted/internet/defer.py", line 1301, in _inlineCallbacks
    result = g.send(result)
  File "/usr/local/lib/python2.7/site-packages/scrapy/core/downloader/middleware.py", line 37, in process_request
    response = yield method(request=request, spider=spider)
  File "/usr/local/lib/python2.7/site-packages/scrapy_proxies/randomproxy.py", line 63, in process_request
    raise ValueError('All proxies are unusable, cannot proceed')
ValueError: All proxies are unusable, cannot proceed
2017-05-13 14:09:02 [scrapy.core.scraper] ERROR: Error downloading <GET http://url_to_parse>
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/site-packages/twisted/internet/defer.py", line 1301, in _inlineCallbacks
    result = g.send(result)
  File "/usr/local/lib/python2.7/site-packages/scrapy/core/downloader/middleware.py", line 37, in process_request
    response = yield method(request=request, spider=spider)
  File "/usr/local/lib/python2.7/site-packages/scrapy_proxies/randomproxy.py", line 63, in process_request
    raise ValueError('All proxies are unusable, cannot proceed')
ValueError: All proxies are unusable, cannot proceed
2017-05-13 14:09:02 [scrapy.core.engine] INFO: Closing spider (finished)
2017-05-13 14:09:02 [scrapy.statscollectors] INFO: Dumping Scrapy stats:
{'downloader/exception_count': 2,
'downloader/exception_type_count/exceptions.ValueError': 2,
'finish_reason': 'finished',
'finish_time': datetime.datetime(2017, 5, 13, 13, 9, 2, 915138),
'log_count/DEBUG': 1,
'log_count/ERROR': 2,
'log_count/INFO': 7,
'scheduler/dequeued': 2,
'scheduler/dequeued/memory': 2,
'scheduler/enqueued': 2,
'scheduler/enqueued/memory': 2,
'start_time': datetime.datetime(2017, 5, 13, 13, 9, 2, 694730)}
2017-05-13 14:09:02 [scrapy.core.engine] INFO: Spider closed (finished)
Does anyone else experience timeout errors, specifically immediately after redirects?
I've only set this up today, but specifically https://www.game.co.uk/en/hardware/xbox-series-x/?contentOnly=&inStockOnly=true&listerOnly=&pageSize=100
I can fetch it OK with scrapy fetch, but if I use a spider that crawls the URL, I hit a 302 redirect and from that point my crawl errors out completely with immediate "response never received" failures. They're not long timeouts; it literally errors immediately.
Could somebody please help me? I'm fairly new to this and have no idea what the cause may be.
I'm using a pool of 10 HTTP proxies on port 80.
2018-07-26 10:26:02 [scrapy.extensions.logstats] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
2018-07-26 10:26:02 [scrapy.extensions.telnet] DEBUG: Telnet console listening on 127.0.0.1:6023
2018-07-26 10:26:02 [scrapy.proxies] DEBUG: Proxy user pass not found
2018-07-26 10:26:02 [scrapy.proxies] DEBUG: Using proxy https://185.93.3.70:8080, 1 proxies left
2018-07-26 10:26:03 [scrapy.downloadermiddlewares.retry] DEBUG: Retrying <GET https://piknu.com/u/isabel_sanzz/similar> (failed 1 times): 403 Forbidden
2018-07-26 10:26:03 [scrapy.proxies] DEBUG: Proxy user pass not found
2018-07-26 10:26:03 [scrapy.proxies] DEBUG: Using proxy https://185.93.3.70:8080, 1 proxies left
I'm getting this error when I run:
Traceback (most recent call last):
  File "/Library/Python/2.7/site-packages/twisted/internet/defer.py", line 1299, in _inlineCallbacks
    result = g.send(result)
  File "/Library/Python/2.7/site-packages/scrapy/crawler.py", line 90, in crawl
    six.reraise(*exc_info)
  File "/Library/Python/2.7/site-packages/scrapy/crawler.py", line 72, in crawl
    self.engine = self._create_engine()
  File "/Library/Python/2.7/site-packages/scrapy/crawler.py", line 97, in _create_engine
    return ExecutionEngine(self, lambda _: self.stop())
  File "/Library/Python/2.7/site-packages/scrapy/core/engine.py", line 69, in __init__
    self.downloader = downloader_cls(crawler)
  File "/Library/Python/2.7/site-packages/scrapy/core/downloader/__init__.py", line 88, in __init__
    self.middleware = DownloaderMiddlewareManager.from_crawler(crawler)
  File "/Library/Python/2.7/site-packages/scrapy/middleware.py", line 58, in from_crawler
    return cls.from_settings(crawler.settings, crawler)
  File "/Library/Python/2.7/site-packages/scrapy/middleware.py", line 34, in from_settings
    mwcls = load_object(clspath)
  File "/Library/Python/2.7/site-packages/scrapy/utils/misc.py", line 44, in load_object
    mod = import_module(module)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/importlib/__init__.py", line 37, in import_module
    __import__(name)
ImportError: No module named scrapy_proxies
I have an issue where proxies are not being used when accessing https:// websites (my actual IP is used instead).
I've verified that my proxies do support https:// (setting the environment variable HTTPS_PROXY=<proxy address> works).
Setting the proxies in my proxy list to http:// or https:// does not make a difference.
process_exception(request, exception, spider)
Scrapy calls process_exception() when a download handler or a process_request() (from a downloader middleware) raises an exception (including an IgnoreRequest exception)
There is a problem when the character "@" appears within the password; maybe we should make the regex pattern more permissive? :) Here is my solution:
parts = re.match('(\w+://)([^:]+?:.+@)?(.+)', line.strip())
instead of
parts = re.match('(\w+://)([^:]+?:[^@]+?@)?(.+)', line.strip())
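A quick comparison of the two patterns on a password containing "@" (illustrative credentials only) shows why the relaxed one is needed:

```python
import re

line = 'http://user:p@ssword@1.2.3.4:8080'  # password contains '@'

old = re.match(r'(\w+://)([^:]+?:[^@]+?@)?(.+)', line)
new = re.match(r'(\w+://)([^:]+?:.+@)?(.+)', line)

# The old pattern stops at the first '@' and mis-splits the password;
# the relaxed pattern consumes up to the last '@'.
print(old.group(2))  # 'user:p@'
print(new.group(2))  # 'user:p@ssword@'
print(new.group(3))  # '1.2.3.4:8080'
```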
Hi, I am getting the error below when using the DOWNLOADER_MIDDLEWARES indicated in the ReadMe (I added a proxy list, etc..). Read a bunch of threads on SO but couldn't fix my issue.
Appreciate any help
thanks
Traceback (most recent call last):
  File "/usr/local/Cellar/python3/3.6.3/Frameworks/Python.framework/Versions/3.6/lib/python3.6/base64.py", line 517, in _input_type_check
    m = memoryview(s)
TypeError: memoryview: a bytes-like object is required, not 'str'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/lib/python3.6/site-packages/twisted/internet/defer.py", line 1386, in _inlineCallbacks
    result = g.send(result)
  File "/usr/local/lib/python3.6/site-packages/scrapy/core/downloader/middleware.py", line 37, in process_request
    response = yield method(request=request, spider=spider)
  File "/usr/local/lib/python3.6/site-packages/scrapy_proxies/randomproxy.py", line 70, in process_request
    basic_auth = 'Basic ' + base64.encodestring(proxy_user_pass)
  File "/usr/local/Cellar/python3/3.6.3/Frameworks/Python.framework/Versions/3.6/lib/python3.6/base64.py", line 547, in encodestring
    return encodebytes(s)
  File "/usr/local/Cellar/python3/3.6.3/Frameworks/Python.framework/Versions/3.6/lib/python3.6/base64.py", line 534, in encodebytes
    _input_type_check(s)
  File "/usr/local/Cellar/python3/3.6.3/Frameworks/Python.framework/Versions/3.6/lib/python3.6/base64.py", line 520, in _input_type_check
    raise TypeError(msg) from err
TypeError: expected bytes-like object, not str
2017-10-16 23:19:20 [scrapy.core.engine] INFO: Closing spider (finished)
2017-10-16 23:19:20 [scrapy.statscollectors] INFO: Dumping Scrapy stats:
{'downloader/exception_count': 1,
'downloader/exception_type_count/builtins.TypeError': 1,
'finish_reason': 'finished
2017-07-12 14:35:33 [scrapy.proxies] DEBUG: Using proxy http://208.92.94.191:1080, 91 proxies left
Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.0; Acoo Browser; SLCC1; .NET CLR 2.0.50727; Media Center PC 5.0; .NET CLR 3.0.04506)
/search/category/2/10/g251p6?aid=79417082%2C20944119%2C67545588%2C512124%2C4665606%2C2517868%2C68124250%2C77336676%2C19331058%2C91955011%2C52802565%2C92076417&cpt=79417082%2C20944119%2C67545588%2C512124%2C4665606%2C2517868%2C68124250%2C77336676%2C19331058%2C91955011%2C52802565%2C92076417&tc=1 ==================
2017-07-12 14:35:34 [scrapy.downloadermiddlewares.retry] DEBUG: Retrying <GET https://www.dianping.com/shop/70170698> (failed 1 times): 403 Forbidden
2017-07-12 14:35:34 [scrapy.proxies] DEBUG: Using proxy http://110.244.119.139:80, 91 proxies left
Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.0; Acoo Browser; SLCC1; .NET CLR 2.0.50727; Media Center PC 5.0; .NET CLR 3.0.04506)
2017-07-12 14:35:35 [scrapy.downloadermiddlewares.retry] DEBUG: Retrying <GET https://www.dianping.com/shop/507618> (failed 1 times): 403 Forbidden
2017-07-12 14:35:35 [scrapy.proxies] DEBUG: Using proxy http://125.89.121.179:808, 91 proxies left
Mozilla/5.0 (Windows; U; Windows NT 5.1; zh-CN; rv:1.9) Gecko/20080705 Firefox/3.0 Kapiko/3.0
Is it possible to restart the proxy list when it gets to 0? I have dynamic proxies that refresh every 15 minutes, so I want Scrapy to reload the list when len(self.proxies) == 0.
Thanks!
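One way to sketch this (a hypothetical helper, not part of scrapy-proxies) is a reload-on-empty guard that re-reads the list file before giving up:

```python
def ensure_proxies(proxies, list_path, parse_line):
    # Sketch: when every proxy has been removed, re-read the list
    # file (refreshed externally, e.g. every 15 minutes) instead of
    # raising immediately. parse_line is a hypothetical callable that
    # returns (address, user_pass) or None for unusable lines.
    if proxies:
        return proxies
    with open(list_path) as fin:
        for line in fin:
            parsed = parse_line(line)
            if parsed:
                address, user_pass = parsed
                proxies[address] = user_pass
    if not proxies:
        raise ValueError('All proxies are unusable, cannot proceed')
    return proxies
```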
I see this is caused by line 83:
if 'proxy' in request.meta:
    if request.meta["exception"] is False:
        return
If we have set a proxy in the start_requests function, this issue arises, which makes sense because "exception" is not defined in meta at that point for our first request.
I guess most of us use either a random proxy or a custom proxy, so nobody ever bothered about it.
I think line 83 is important because it enables changing proxies on each retry or after an exception.
def start_requests(self):
    yield scrapy.Request('http://quotes.toscrape.com/', callback=self.parse, meta={'proxy': 'http://xxxx:xxxx@xxxx:xxxx'})
Also, to change the proxy on retry, comment this out in process_exception (#15):
if 'proxy' not in request.meta:
    return
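The defensive check can be sketched as a standalone predicate (hypothetical helper name) that treats a missing "exception" flag as False via meta.get(), which avoids the KeyError for proxies attached in start_requests:

```python
def should_keep_existing_proxy(meta):
    # Sketch of the guard at the top of process_request: keep a
    # caller-supplied proxy unless the previous attempt failed.
    # meta.get() avoids the KeyError when 'exception' was never set,
    # as with proxies attached in start_requests.
    if 'proxy' in meta and not meta.get('exception', False):
        return True
    meta['exception'] = False
    return False
```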