Hi, first of all, thank you for the excellent project and tutorial you put together. I'm building a similar project with the einforma website and a list of companies I'd love to categorize, but when I reproduce your exact example I hit the same error as when I run my modified version. Following the tutorial's steps, after cloning your git repo and running scrapy, I get:
(mercado) root@kali:~/mercado/mercado# scrapy crawl mercado -t csv
/root/mercado/mercado/mercado/spiders/spider.py:6: ScrapyDeprecationWarning: Module `scrapy.spider` is deprecated, use `scrapy.spiders` instead
  from scrapy.spider import CrawlSpider, Rule
2018-03-01 12:50:17 [scrapy.utils.log] INFO: Scrapy 1.5.0 started (bot: mercado)
2018-03-01 12:50:17 [scrapy.utils.log] INFO: Versions: lxml 4.1.1.0, libxml2 2.9.7, cssselect 1.0.3, parsel 1.4.0, w3lib 1.19.0, Twisted 17.9.0, Python 3.6.4 (default, Jan 5 2018, 02:13:53) - [GCC 7.2.0], pyOpenSSL 17.5.0 (OpenSSL 1.1.0g 2 Nov 2017), cryptography 2.1.4, Platform Linux-4.13.0-kali1-amd64-x86_64-with-Kali-kali-rolling-kali-rolling
2018-03-01 12:50:18 [scrapy.crawler] INFO: Overridden settings: {'BOT_NAME': 'mercado', 'DOWNLOAD_DELAY': 2, 'NEWSPIDER_MODULE': 'mercado.spiders', 'ROBOTSTXT_OBEY': True, 'SPIDER_MODULES': ['mercado.spiders']}
2018-03-01 12:50:18 [scrapy.middleware] INFO: Enabled extensions:
['scrapy.extensions.corestats.CoreStats',
'scrapy.extensions.telnet.TelnetConsole',
'scrapy.extensions.memusage.MemoryUsage',
'scrapy.extensions.logstats.LogStats']
2018-03-01 12:50:18 [scrapy.middleware] INFO: Enabled downloader middlewares:
['scrapy.downloadermiddlewares.robotstxt.RobotsTxtMiddleware',
'scrapy.downloadermiddlewares.httpauth.HttpAuthMiddleware',
'scrapy.downloadermiddlewares.downloadtimeout.DownloadTimeoutMiddleware',
'scrapy.downloadermiddlewares.defaultheaders.DefaultHeadersMiddleware',
'scrapy.downloadermiddlewares.useragent.UserAgentMiddleware',
'scrapy.downloadermiddlewares.retry.RetryMiddleware',
'scrapy.downloadermiddlewares.redirect.MetaRefreshMiddleware',
'scrapy.downloadermiddlewares.httpcompression.HttpCompressionMiddleware',
'scrapy.downloadermiddlewares.redirect.RedirectMiddleware',
'scrapy.downloadermiddlewares.cookies.CookiesMiddleware',
'scrapy.downloadermiddlewares.httpproxy.HttpProxyMiddleware',
'scrapy.downloadermiddlewares.stats.DownloaderStats']
2018-03-01 12:50:18 [scrapy.middleware] INFO: Enabled spider middlewares:
['scrapy.spidermiddlewares.httperror.HttpErrorMiddleware',
'scrapy.spidermiddlewares.offsite.OffsiteMiddleware',
'scrapy.spidermiddlewares.referer.RefererMiddleware',
'scrapy.spidermiddlewares.urllength.UrlLengthMiddleware',
'scrapy.spidermiddlewares.depth.DepthMiddleware']
Unhandled error in Deferred:
2018-03-01 12:50:18 [twisted] CRITICAL: Unhandled error in Deferred:
2018-03-01 12:50:18 [twisted] CRITICAL:
Traceback (most recent call last):
File "/root/mercado/mercado/lib/python3.6/site-packages/twisted/internet/defer.py", line 1386, in _inlineCallbacks
result = g.send(result)
File "/root/mercado/mercado/lib/python3.6/site-packages/scrapy/crawler.py", line 80, in crawl
self.engine = self._create_engine()
File "/root/mercado/mercado/lib/python3.6/site-packages/scrapy/crawler.py", line 105, in _create_engine
return ExecutionEngine(self, lambda _: self.stop())
File "/root/mercado/mercado/lib/python3.6/site-packages/scrapy/core/engine.py", line 70, in __init__
self.scraper = Scraper(crawler)
File "/root/mercado/mercado/lib/python3.6/site-packages/scrapy/core/scraper.py", line 71, in __init__
self.itemproc = itemproc_cls.from_crawler(crawler)
File "/root/mercado/mercado/lib/python3.6/site-packages/scrapy/middleware.py", line 58, in from_crawler
return cls.from_settings(crawler.settings, crawler)
File "/root/mercado/mercado/lib/python3.6/site-packages/scrapy/middleware.py", line 34, in from_settings
mwcls = load_object(clspath)
File "/root/mercado/mercado/lib/python3.6/site-packages/scrapy/utils/misc.py", line 44, in load_object
mod = import_module(module)
File "/root/mercado/mercado/lib/python3.6/importlib/__init__.py", line 126, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
File "<frozen importlib._bootstrap>", line 994, in _gcd_import
File "<frozen importlib._bootstrap>", line 971, in _find_and_load
File "<frozen importlib._bootstrap>", line 955, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 665, in _load_unlocked
File "<frozen importlib._bootstrap_external>", line 678, in exec_module
File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
File "/root/mercado/mercado/mercado/pipelines.py", line 11, in <module>
from scrapy.pipelines.images import ImagesPipeline
File "/root/mercado/mercado/lib/python3.6/site-packages/scrapy/pipelines/images.py", line 15, in <module>
from PIL import Image
ModuleNotFoundError: No module named 'PIL'
Could you tell me what's causing this? Many thanks in advance.
Best regards
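Edit: for anyone landing on the same trace, the last line is the giveaway. The project's pipelines.py imports Scrapy's `ImagesPipeline`, which in turn does `from PIL import Image`; the `PIL` package is provided by Pillow, which apparently isn't installed in the virtualenv. A minimal check, as a sketch (the `pillow_installed` helper is just illustrative):

```python
# Sketch: check whether the dependency behind the final traceback line
# ("No module named 'PIL'") is importable in the current environment.
# Scrapy's ImagesPipeline imports PIL, which the Pillow package provides,
# so the usual fix is: pip install Pillow   (run inside the virtualenv).
import importlib.util


def pillow_installed():
    # find_spec returns None when the 'PIL' package cannot be located.
    return importlib.util.find_spec("PIL") is not None


if __name__ == "__main__":
    if pillow_installed():
        print("Pillow found: ImagesPipeline should import cleanly")
    else:
        print("Pillow missing: run 'pip install Pillow' in the virtualenv")
```

(The ScrapyDeprecationWarning at the top of the log is unrelated: it goes away by importing `CrawlSpider` and `Rule` from `scrapy.spiders` instead of `scrapy.spider` in spider.py.)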