3 Scrapy spiders implemented with ItemLoader
This repository contains three spiders, each implemented both with and without ItemLoader. After reading this readme to the end, you can decide whether to adopt ItemLoader for your own spiders.
Table of Contents
- Introduction
- Migration steps
- Installation
- Scraping
- More spiders
- Summary
- License
Introduction
If you are fairly new to Scrapy but have already written several spiders and plan to write more, you should consider using ItemLoader if you don't already. I won't describe the features of ItemLoader and its processors here; the official docs cover that. Instead, I will show how to migrate real-world spiders that don't use ItemLoader to spiders that do.
Migration steps
- Replace bare item field assignments with ItemLoader
- Use selector context to simplify the code
- Required step: add output processors
- Set a default output processor
- Optional step: extend ItemLoader
Installation
$ pip install scrapy
$ git clone [email protected]:taroved/3spiders-with-itemloader.git
# check contracts for spiders
$ cd 3spiders-with-itemloader
$ scrapy check
Scraping
$ scrapy crawl apple
Output the scraped data to a file and write a log file:
$ scrapy crawl apple -o apple.json --logfile=apple.log
More spiders
The second spider scrapes store locations from wetseal.com:
$ scrapy crawl wetseal -o wetseal.json
The third spider scrapes products from hhgregg.com:
$ scrapy crawl hhgregg -o hhgregg.json
Summary
I haven't said much here, but you can look at the full diff between the spider versions without and with ItemLoader and make your own decision.
License
WTFPL