aox-lei / aox_proxy_pool


This project addresses the problems that scraped proxy IPs expire quickly and are unstable, and that proxy IPs are inconvenient to use.

License: Apache License 2.0

Python 100.00%

aox_proxy_pool's Issues

Running scrapy crawl xici raises an error?

I am using Windows 7 + Python 3.7.
F:\aox_proxy_pool\proxy_pool>scrapy crawl xici
2019-07-14 08:17:05,617 - log.py[line:146] - INFO: Scrapy 1.6.0 started (bot: proxy_pool)
2019-07-14 08:17:05 [scrapy.utils.log] INFO: Scrapy 1.6.0 started (bot: proxy_pool)
2019-07-14 08:17:05,630 - log.py[line:149] - INFO: Versions: lxml 4.2.5.0, libxml2 2.9.5, cssselect 1.0.3, parsel 1.5.1, w3lib 1.20.0, Twisted 19.2.1, Python 3.7.0 (v3.7.0:1bf9cc5093, Jun 27 2018, 04:06:47) [MSC v.1914 32 bit (Intel)], pyOpenSSL 18.0.0 (OpenSSL 1.1.0h 27 Mar 2018), cryptography 2.3, Platform Windows-7-6.1.7601-SP1
2019-07-14 08:17:05 [scrapy.utils.log] INFO: Versions: lxml 4.2.5.0, libxml2 2.9.5, cssselect 1.0.3, parsel 1.5.1, w3lib 1.20.0, Twisted 19.2.1, Python 3.7.0 (v3.7.0:1bf9cc5093, Jun 27 2018, 04:06:47) [MSC v.1914 32 bit (Intel)], pyOpenSSL 18.0.0 (OpenSSL 1.1.0h 27 Mar 2018), cryptography 2.3, Platform Windows-7-6.1.7601-SP1
2019-07-14 08:17:05,651 - crawler.py[line:38] - INFO: Overridden settings: {'BOT_NAME': 'proxy_pool', 'LOG_LEVEL': 'INFO', 'NEWSPIDER_MODULE': 'proxy_pool.spiders', 'ROBOTSTXT_OBEY': True, 'SPIDER_MODULES': ['proxy_pool.spiders'], 'USER_AGENT': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/69.0.3497.92 Safari/537.36'}
2019-07-14 08:17:05 [scrapy.crawler] INFO: Overridden settings: {'BOT_NAME': 'proxy_pool', 'LOG_LEVEL': 'INFO', 'NEWSPIDER_MODULE': 'proxy_pool.spiders', 'ROBOTSTXT_OBEY': True, 'SPIDER_MODULES': ['proxy_pool.spiders'], 'USER_AGENT': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/69.0.3497.92 Safari/537.36'}
2019-07-14 08:17:05,754 - telnet.py[line:60] - INFO: Telnet Password: 1c62fc3f0d34ef85
2019-07-14 08:17:05 [scrapy.extensions.telnet] INFO: Telnet Password: 1c62fc3f0d34ef85
2019-07-14 08:17:05,873 - middleware.py[line:48] - INFO: Enabled extensions:
['scrapy.extensions.corestats.CoreStats',
'scrapy.extensions.telnet.TelnetConsole',
'scrapy.extensions.logstats.LogStats']
2019-07-14 08:17:05 [scrapy.middleware] INFO: Enabled extensions:
['scrapy.extensions.corestats.CoreStats',
'scrapy.extensions.telnet.TelnetConsole',
'scrapy.extensions.logstats.LogStats']
2019-07-14 08:17:07,454 - __init__.py[line:58] - ERROR: Loading "scrapy.core.downloader.handlers.http.HTTPDownloadHandler" for scheme "http"
Traceback (most recent call last):
  File "c:\python37-32\lib\site-packages\scrapy\core\downloader\handlers\__init__.py", line 48, in load_handler
    dhcls = load_object(path)
  File "c:\python37-32\lib\site-packages\scrapy\utils\misc.py", line 44, in load_object
    mod = import_module(module)
  File "c:\python37-32\lib\importlib\__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1006, in _gcd_import
  File "<frozen importlib._bootstrap>", line 983, in _find_and_load
  File "<frozen importlib._bootstrap>", line 967, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 677, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 728, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "c:\python37-32\lib\site-packages\scrapy\core\downloader\handlers\http.py", line 3, in <module>
    from .http11 import HTTP11DownloadHandler as HTTPDownloadHandler
  File "c:\python37-32\lib\site-packages\scrapy\core\downloader\handlers\http11.py", line 16, in <module>
    from twisted.web.client import Agent, ProxyAgent, ResponseDone,
  File "c:\python37-32\lib\site-packages\twisted\web\client.py", line 42, in <module>
    from twisted.internet.endpoints import HostnameEndpoint, wrapClientTLS
  File "c:\python37-32\lib\site-packages\twisted\internet\endpoints.py", line 41, in <module>
    from twisted.internet.stdio import StandardIO, PipeAddress
  File "c:\python37-32\lib\site-packages\twisted\internet\stdio.py", line 30, in <module>
    from twisted.internet import _win32stdio
  File "c:\python37-32\lib\site-packages\twisted\internet\_win32stdio.py", line 9, in <module>
    import win32api
ModuleNotFoundError: No module named 'win32api'
2019-07-14 08:17:07 [scrapy.core.downloader.handlers] ERROR: Loading "scrapy.core.downloader.handlers.http.HTTPDownloadHandler" for scheme "http"
Traceback (most recent call last):
  File "c:\python37-32\lib\site-packages\scrapy\core\downloader\handlers\__init__.py", line 48, in load_handler
    dhcls = load_object(path)
  File "c:\python37-32\lib\site-packages\scrapy\utils\misc.py", line 44, in load_object
    mod = import_module(module)
  File "c:\python37-32\lib\importlib\__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1006, in _gcd_import
  File "<frozen importlib._bootstrap>", line 983, in _find_and_load
  File "<frozen importlib._bootstrap>", line 967, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 677, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 728, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "c:\python37-32\lib\site-packages\scrapy\core\downloader\handlers\http.py", line 3, in <module>
    from .http11 import HTTP11DownloadHandler as HTTPDownloadHandler
  File "c:\python37-32\lib\site-packages\scrapy\core\downloader\handlers\http11.py", line 16, in <module>
    from twisted.web.client import Agent, ProxyAgent, ResponseDone,
  File "c:\python37-32\lib\site-packages\twisted\web\client.py", line 42, in <module>
    from twisted.internet.endpoints import HostnameEndpoint, wrapClientTLS
  File "c:\python37-32\lib\site-packages\twisted\internet\endpoints.py", line 41, in <module>
    from twisted.internet.stdio import StandardIO, PipeAddress
  File "c:\python37-32\lib\site-packages\twisted\internet\stdio.py", line 30, in <module>
    from twisted.internet import _win32stdio
  File "c:\python37-32\lib\site-packages\twisted\internet\_win32stdio.py", line 9, in <module>
    import win32api
ModuleNotFoundError: No module named 'win32api'
2019-07-14 08:17:07,653 - __init__.py[line:58] - ERROR: Loading "scrapy.core.downloader.handlers.http.HTTPDownloadHandler" for scheme "https"
Traceback (most recent call last):
  File "c:\python37-32\lib\site-packages\scrapy\core\downloader\handlers\__init__.py", line 48, in load_handler
    dhcls = load_object(path)
  File "c:\python37-32\lib\site-packages\scrapy\utils\misc.py", line 44, in load_object
    mod = import_module(module)
  File "c:\python37-32\lib\importlib\__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1006, in _gcd_import
  File "<frozen importlib._bootstrap>", line 983, in _find_and_load
  File "<frozen importlib._bootstrap>", line 967, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 677, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 728, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "c:\python37-32\lib\site-packages\scrapy\core\downloader\handlers\http.py", line 3, in <module>
    from .http11 import HTTP11DownloadHandler as HTTPDownloadHandler
  File "c:\python37-32\lib\site-packages\scrapy\core\downloader\handlers\http11.py", line 16, in <module>
    from twisted.web.client import Agent, ProxyAgent, ResponseDone,
  File "c:\python37-32\lib\site-packages\twisted\web\client.py", line 42, in <module>
    from twisted.internet.endpoints import HostnameEndpoint, wrapClientTLS
  File "c:\python37-32\lib\site-packages\twisted\internet\endpoints.py", line 41, in <module>
    from twisted.internet.stdio import StandardIO, PipeAddress
  File "c:\python37-32\lib\site-packages\twisted\internet\stdio.py", line 30, in <module>
    from twisted.internet import _win32stdio
  File "c:\python37-32\lib\site-packages\twisted\internet\_win32stdio.py", line 9, in <module>
    import win32api
ModuleNotFoundError: No module named 'win32api'
2019-07-14 08:17:07 [scrapy.core.downloader.handlers] ERROR: Loading "scrapy.core.downloader.handlers.http.HTTPDownloadHandler" for scheme "https"
Traceback (most recent call last):
  File "c:\python37-32\lib\site-packages\scrapy\core\downloader\handlers\__init__.py", line 48, in load_handler
    dhcls = load_object(path)
  File "c:\python37-32\lib\site-packages\scrapy\utils\misc.py", line 44, in load_object
    mod = import_module(module)
  File "c:\python37-32\lib\importlib\__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1006, in _gcd_import
  File "<frozen importlib._bootstrap>", line 983, in _find_and_load
  File "<frozen importlib._bootstrap>", line 967, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 677, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 728, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "c:\python37-32\lib\site-packages\scrapy\core\downloader\handlers\http.py", line 3, in <module>
    from .http11 import HTTP11DownloadHandler as HTTPDownloadHandler
  File "c:\python37-32\lib\site-packages\scrapy\core\downloader\handlers\http11.py", line 16, in <module>
    from twisted.web.client import Agent, ProxyAgent, ResponseDone,
  File "c:\python37-32\lib\site-packages\twisted\web\client.py", line 42, in <module>
    from twisted.internet.endpoints import HostnameEndpoint, wrapClientTLS
  File "c:\python37-32\lib\site-packages\twisted\internet\endpoints.py", line 41, in <module>
    from twisted.internet.stdio import StandardIO, PipeAddress
  File "c:\python37-32\lib\site-packages\twisted\internet\stdio.py", line 30, in <module>
    from twisted.internet import _win32stdio
  File "c:\python37-32\lib\site-packages\twisted\internet\_win32stdio.py", line 9, in <module>
    import win32api
ModuleNotFoundError: No module named 'win32api'
2019-07-14 08:17:07,830 - __init__.py[line:58] - ERROR: Loading "scrapy.core.downloader.handlers.s3.S3DownloadHandler" for scheme "s3"
Traceback (most recent call last):
  File "c:\python37-32\lib\site-packages\scrapy\core\downloader\handlers\__init__.py", line 48, in load_handler
    dhcls = load_object(path)
  File "c:\python37-32\lib\site-packages\scrapy\utils\misc.py", line 44, in load_object
    mod = import_module(module)
  File "c:\python37-32\lib\importlib\__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1006, in _gcd_import
  File "<frozen importlib._bootstrap>", line 983, in _find_and_load
  File "<frozen importlib._bootstrap>", line 967, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 677, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 728, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "c:\python37-32\lib\site-packages\scrapy\core\downloader\handlers\s3.py", line 6, in <module>
    from .http import HTTPDownloadHandler
  File "c:\python37-32\lib\site-packages\scrapy\core\downloader\handlers\http.py", line 3, in <module>
    from .http11 import HTTP11DownloadHandler as HTTPDownloadHandler
  File "c:\python37-32\lib\site-packages\scrapy\core\downloader\handlers\http11.py", line 16, in <module>
    from twisted.web.client import Agent, ProxyAgent, ResponseDone,
  File "c:\python37-32\lib\site-packages\twisted\web\client.py", line 42, in <module>
    from twisted.internet.endpoints import HostnameEndpoint, wrapClientTLS
  File "c:\python37-32\lib\site-packages\twisted\internet\endpoints.py", line 41, in <module>
    from twisted.internet.stdio import StandardIO, PipeAddress
  File "c:\python37-32\lib\site-packages\twisted\internet\stdio.py", line 30, in <module>
    from twisted.internet import _win32stdio
  File "c:\python37-32\lib\site-packages\twisted\internet\_win32stdio.py", line 9, in <module>
    import win32api
ModuleNotFoundError: No module named 'win32api'
2019-07-14 08:17:07 [scrapy.core.downloader.handlers] ERROR: Loading "scrapy.core.downloader.handlers.s3.S3DownloadHandler" for scheme "s3"
Traceback (most recent call last):
  File "c:\python37-32\lib\site-packages\scrapy\core\downloader\handlers\__init__.py", line 48, in load_handler
    dhcls = load_object(path)
  File "c:\python37-32\lib\site-packages\scrapy\utils\misc.py", line 44, in load_object
    mod = import_module(module)
  File "c:\python37-32\lib\importlib\__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1006, in _gcd_import
  File "<frozen importlib._bootstrap>", line 983, in _find_and_load
  File "<frozen importlib._bootstrap>", line 967, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 677, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 728, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "c:\python37-32\lib\site-packages\scrapy\core\downloader\handlers\s3.py", line 6, in <module>
    from .http import HTTPDownloadHandler
  File "c:\python37-32\lib\site-packages\scrapy\core\downloader\handlers\http.py", line 3, in <module>
    from .http11 import HTTP11DownloadHandler as HTTPDownloadHandler
  File "c:\python37-32\lib\site-packages\scrapy\core\downloader\handlers\http11.py", line 16, in <module>
    from twisted.web.client import Agent, ProxyAgent, ResponseDone,
  File "c:\python37-32\lib\site-packages\twisted\web\client.py", line 42, in <module>
    from twisted.internet.endpoints import HostnameEndpoint, wrapClientTLS
  File "c:\python37-32\lib\site-packages\twisted\internet\endpoints.py", line 41, in <module>
    from twisted.internet.stdio import StandardIO, PipeAddress
  File "c:\python37-32\lib\site-packages\twisted\internet\stdio.py", line 30, in <module>
    from twisted.internet import _win32stdio
  File "c:\python37-32\lib\site-packages\twisted\internet\_win32stdio.py", line 9, in <module>
    import win32api
ModuleNotFoundError: No module named 'win32api'
Unhandled error in Deferred:
2019-07-14 08:17:08,096 - _legacy.py[line:154] - CRITICAL: Unhandled error in Deferred:
2019-07-14 08:17:08 [twisted] CRITICAL: Unhandled error in Deferred:

Traceback (most recent call last):
  File "c:\python37-32\lib\site-packages\scrapy\crawler.py", line 172, in crawl
    return self._crawl(crawler, *args, **kwargs)
  File "c:\python37-32\lib\site-packages\scrapy\crawler.py", line 176, in _crawl
    d = crawler.crawl(*args, **kwargs)
  File "c:\python37-32\lib\site-packages\twisted\internet\defer.py", line 1613, in unwindGenerator
    return _cancellableInlineCallbacks(gen)
  File "c:\python37-32\lib\site-packages\twisted\internet\defer.py", line 1529, in _cancellableInlineCallbacks
    _inlineCallbacks(None, g, status)
--- <exception caught here> ---
  File "c:\python37-32\lib\site-packages\twisted\internet\defer.py", line 1418, in _inlineCallbacks
    result = g.send(result)
  File "c:\python37-32\lib\site-packages\scrapy\crawler.py", line 80, in crawl
    self.engine = self._create_engine()
  File "c:\python37-32\lib\site-packages\scrapy\crawler.py", line 105, in _create_engine
    return ExecutionEngine(self, lambda _: self.stop())
  File "c:\python37-32\lib\site-packages\scrapy\core\engine.py", line 69, in __init__
    self.downloader = downloader_cls(crawler)
  File "c:\python37-32\lib\site-packages\scrapy\core\downloader\__init__.py", line 88, in __init__
    self.middleware = DownloaderMiddlewareManager.from_crawler(crawler)
  File "c:\python37-32\lib\site-packages\scrapy\middleware.py", line 53, in from_crawler
    return cls.from_settings(crawler.settings, crawler)
  File "c:\python37-32\lib\site-packages\scrapy\middleware.py", line 34, in from_settings
    mwcls = load_object(clspath)
  File "c:\python37-32\lib\site-packages\scrapy\utils\misc.py", line 44, in load_object
    mod = import_module(module)
  File "c:\python37-32\lib\importlib\__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1006, in _gcd_import
  File "<frozen importlib._bootstrap>", line 983, in _find_and_load
  File "<frozen importlib._bootstrap>", line 967, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 677, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 728, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "c:\python37-32\lib\site-packages\scrapy\downloadermiddlewares\retry.py", line 20, in <module>
    from twisted.web.client import ResponseFailed
  File "c:\python37-32\lib\site-packages\twisted\web\client.py", line 42, in <module>
    from twisted.internet.endpoints import HostnameEndpoint, wrapClientTLS
  File "c:\python37-32\lib\site-packages\twisted\internet\endpoints.py", line 41, in <module>
    from twisted.internet.stdio import StandardIO, PipeAddress
  File "c:\python37-32\lib\site-packages\twisted\internet\stdio.py", line 30, in <module>
    from twisted.internet import _win32stdio
  File "c:\python37-32\lib\site-packages\twisted\internet\_win32stdio.py", line 9, in <module>
    import win32api
builtins.ModuleNotFoundError: No module named 'win32api'

2019-07-14 08:17:08,236 - _legacy.py[line:154] - CRITICAL:
Traceback (most recent call last):
  File "c:\python37-32\lib\site-packages\twisted\internet\defer.py", line 1418, in _inlineCallbacks
    result = g.send(result)
  File "c:\python37-32\lib\site-packages\scrapy\crawler.py", line 80, in crawl
    self.engine = self._create_engine()
  File "c:\python37-32\lib\site-packages\scrapy\crawler.py", line 105, in _create_engine
    return ExecutionEngine(self, lambda _: self.stop())
  File "c:\python37-32\lib\site-packages\scrapy\core\engine.py", line 69, in __init__
    self.downloader = downloader_cls(crawler)
  File "c:\python37-32\lib\site-packages\scrapy\core\downloader\__init__.py", line 88, in __init__
    self.middleware = DownloaderMiddlewareManager.from_crawler(crawler)
  File "c:\python37-32\lib\site-packages\scrapy\middleware.py", line 53, in from_crawler
    return cls.from_settings(crawler.settings, crawler)
  File "c:\python37-32\lib\site-packages\scrapy\middleware.py", line 34, in from_settings
    mwcls = load_object(clspath)
  File "c:\python37-32\lib\site-packages\scrapy\utils\misc.py", line 44, in load_object
    mod = import_module(module)
  File "c:\python37-32\lib\importlib\__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1006, in _gcd_import
  File "<frozen importlib._bootstrap>", line 983, in _find_and_load
  File "<frozen importlib._bootstrap>", line 967, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 677, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 728, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "c:\python37-32\lib\site-packages\scrapy\downloadermiddlewares\retry.py", line 20, in <module>
    from twisted.web.client import ResponseFailed
  File "c:\python37-32\lib\site-packages\twisted\web\client.py", line 42, in <module>
    from twisted.internet.endpoints import HostnameEndpoint, wrapClientTLS
  File "c:\python37-32\lib\site-packages\twisted\internet\endpoints.py", line 41, in <module>
    from twisted.internet.stdio import StandardIO, PipeAddress
  File "c:\python37-32\lib\site-packages\twisted\internet\stdio.py", line 30, in <module>
    from twisted.internet import _win32stdio
  File "c:\python37-32\lib\site-packages\twisted\internet\_win32stdio.py", line 9, in <module>
    import win32api
ModuleNotFoundError: No module named 'win32api'
2019-07-14 08:17:08 [twisted] CRITICAL:
Traceback (most recent call last):
  File "c:\python37-32\lib\site-packages\twisted\internet\defer.py", line 1418, in _inlineCallbacks
    result = g.send(result)
  File "c:\python37-32\lib\site-packages\scrapy\crawler.py", line 80, in crawl
    self.engine = self._create_engine()
  File "c:\python37-32\lib\site-packages\scrapy\crawler.py", line 105, in _create_engine
    return ExecutionEngine(self, lambda _: self.stop())
  File "c:\python37-32\lib\site-packages\scrapy\core\engine.py", line 69, in __init__
    self.downloader = downloader_cls(crawler)
  File "c:\python37-32\lib\site-packages\scrapy\core\downloader\__init__.py", line 88, in __init__
    self.middleware = DownloaderMiddlewareManager.from_crawler(crawler)
  File "c:\python37-32\lib\site-packages\scrapy\middleware.py", line 53, in from_crawler
    return cls.from_settings(crawler.settings, crawler)
  File "c:\python37-32\lib\site-packages\scrapy\middleware.py", line 34, in from_settings
    mwcls = load_object(clspath)
  File "c:\python37-32\lib\site-packages\scrapy\utils\misc.py", line 44, in load_object
    mod = import_module(module)
  File "c:\python37-32\lib\importlib\__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1006, in _gcd_import
  File "<frozen importlib._bootstrap>", line 983, in _find_and_load
  File "<frozen importlib._bootstrap>", line 967, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 677, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 728, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "c:\python37-32\lib\site-packages\scrapy\downloadermiddlewares\retry.py", line 20, in <module>
    from twisted.web.client import ResponseFailed
  File "c:\python37-32\lib\site-packages\twisted\web\client.py", line 42, in <module>
    from twisted.internet.endpoints import HostnameEndpoint, wrapClientTLS
  File "c:\python37-32\lib\site-packages\twisted\internet\endpoints.py", line 41, in <module>
    from twisted.internet.stdio import StandardIO, PipeAddress
  File "c:\python37-32\lib\site-packages\twisted\internet\stdio.py", line 30, in <module>
    from twisted.internet import _win32stdio
  File "c:\python37-32\lib\site-packages\twisted\internet\_win32stdio.py", line 9, in <module>
    import win32api
ModuleNotFoundError: No module named 'win32api'
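
Every traceback in this log ends at the same statement, import win32api, reached through Twisted's Windows-only stdio module. The win32api module is provided by the pywin32 package, so installing it into the same interpreter (pip install pywin32) is the usual remedy. As a minimal sketch, the following check reports which modules a given interpreter cannot find; the helper name missing_modules is hypothetical and not part of Scrapy, Twisted, or this project.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that this interpreter cannot locate."""
    # find_spec() returns None when no importable module of that name exists.
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # On the Windows setup from the log above, 'win32api' would be listed
    # until pywin32 is installed.
    print(missing_modules(["win32api", "twisted", "scrapy"]))
```

Run it with the same Python that runs scrapy (here c:\python37-32), since packages installed for a different interpreter or virtual environment will not be visible.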

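Why do I get an error when running the check-ip command in manager.py?

Hello! Here is the situation: on Windows, I ran the three crawling spiders and then executed python manager.py check-ip, which produced the following error (The system cannot find the path specified.):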
(proxy_demo-V1UiAS_e) C:\Users\CLOUDING\proxy_demo\proxy_pool>python manager.py check-ip
2019-04-07 16:00:27,013 - check_proxy.py[line:112] - WARNING: 58.254.220.116:52470 ---- The port is not open
2019-04-07 16:00:31,171 - check_proxy.py[line:129] - WARNING: 182.92.113.183:8118 ------ can"t access
The system cannot find the path specified.
2019-04-07 16:00:33,774 - check_proxy.py[line:112] - WARNING: 121.225.25.134:3128 ---- The port is not open
2019-04-07 16:00:33,779 - check_proxy.py[line:112] - WARNING: 125.40.109.154:44641 ---- The port is not open
2019-04-07 16:00:35,346 - check_proxy.py[line:112] - WARNING: 211.147.239.101:60999 ---- The port is not open
2019-04-07 16:00:35,843 - check_proxy.py[line:112] - WARNING: 182.88.191.79:8123 ---- The port is not open
2019-04-07 16:00:36,177 - check_proxy.py[line:112] - WARNING: 116.209.55.39:9999 ---- The port is not open
2019-04-07 16:00:37,116 - check_proxy.py[line:129] - WARNING: 124.235.135.87:80 ------ can"t access
The system cannot find the path specified.

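I then checked the database (MySQL), connecting with pymysql, and found that only update_time had been updated. Unusable proxies had their score set to 0, but the score and weight of the usable proxies were not updated: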
image
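After running for quite a while, there was still not a single IP with weight > 0, and speed was 0 for all of them, even though the log clearly reported some proxies as usable, along with their speed: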
2019-04-07 16:05:48,644 - check_proxy.py[line:137] - INFO: 119.102.188.101:9999 ------ active proxy, speed:3607, open_ports:9999,22,80
The system cannot find the path specified.
image

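So I wonder whether this error is what breaks the database update, or whether I am using something incorrectly. Could you please take a look? Many thanks.
Here is my database configuration file: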
[mysql]
dsn = mysql+pymysql://root:[email protected]:3306/proxy
Configuration in the __init__.py file:

# -*- coding: utf-8 -*-

import logging
import configparser
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker

logging.basicConfig(
    level=logging.INFO,
    format='%(asctime)s - %(filename)s[line:%(lineno)d] - %(levelname)s: %(message)s')
# engine = create_engine('mysql+mysqldb://scott:tiger@localhost/foo')
config = configparser.ConfigParser()
# Raw string, so the backslashes in the Windows path are not parsed as escape sequences
config.read(r'C:\Users\CLOUDING\proxy_demo\proxy_pool\proxy_pool\config.ini')
engine = create_engine(config.get('mysql', 'dsn'), echo=False, pool_size=500, pool_recycle=3600)
# engine = create_engine('mysql+pymysql://root:[email protected]:3306/proxy', echo=False, pool_size=500, pool_recycle=3600)
Session = sessionmaker(engine)
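
One thing worth ruling out with a hard-coded config path like the one above: ConfigParser.read() does not raise when the file is missing; it skips it and returns the list of files it actually parsed. The sketch below makes that case fail loudly. The function name load_dsn is hypothetical, and this does not claim to explain the "The system cannot find the path specified." message, whose origin is not visible in the logs.

```python
import configparser


def load_dsn(config_path):
    """Read the [mysql] dsn value, failing loudly if the file was not read."""
    config = configparser.ConfigParser()
    # read() silently skips unreadable files and returns the ones it parsed,
    # so an empty result means the path was wrong or the file is missing.
    parsed = config.read(config_path, encoding="utf-8")
    if not parsed:
        raise FileNotFoundError("config file not found: {}".format(config_path))
    return config.get("mysql", "dsn")
```

The returned string can then be passed to create_engine() as in the post. Building config_path relative to the package, for example from pathlib.Path(__file__).parent, avoids tying the code to one machine's directory layout.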

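Found several errors

  1. Same problem as the previous issue: many required packages are not mentioned in the usage documentation.
  2. The Pipfile specifies python3.6, so the pipenv --three command given in the instructions is wrong; you need to specify the Python version you actually have installed, e.g. pipenv --python 3.7
  3. Even though the MySQL service starts fine on my machine, running scrapy crawl xici
     returns the error: sqlalchemy: 2003, "Can't connect to MySQL server on '127.0.0.1'
     I couldn't be bothered to fix it, so I gave up.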
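
For the third point, MySQL client error 2003 means the client could not open a connection to the server at all, so a first step is to check whether anything is listening on the host and port named in the dsn. The following is a minimal sketch of such a check; the helper name port_is_open is hypothetical, and 3306 is assumed only because it is MySQL's default port.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) succeeds within the timeout."""
    try:
        # create_connection() raises OSError on refusal, timeout, or bad address.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Example: port_is_open("127.0.0.1", 3306)
```

If this returns False while the MySQL service is running, the server may be listening on a different port or interface than the one in the dsn.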
