flipkart_scraping_using_scrapy's Introduction

Flipkart Web Scraping with Scrapy

Overview

This Python project scrapes product information from Flipkart for a given search query and a chosen number of result pages. It uses the Scrapy framework for efficient, structured web scraping.

Features

  • Scraping Flipkart: The script scrapes product details (e.g., name, price, rating) from Flipkart based on your search query.
  • Customizable Page Limit: You can specify the number of pages to scrape for more extensive results.
  • Structured Data: The scraped data is organized into structured output formats (e.g., JSON, CSV).
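As a rough illustration of what structured output could look like, the sketch below serializes two made-up product records to JSON and CSV using only the standard library. The field names (name, price, rating) follow the examples in the list above; the sample values are invented, and the fields the real spider yields may differ.

```python
import csv
import io
import json

# Invented sample records; the field names mirror the examples in the
# feature list, but the actual spider may yield different fields.
products = [
    {"name": "Example Phone A", "price": 12999, "rating": 4.3},
    {"name": "Example Phone B", "price": 8499, "rating": 4.1},
]


def to_json(items):
    """Serialize items as a JSON array, similar in shape to a .json export."""
    return json.dumps(items, indent=2)


def to_csv(items):
    """Serialize items as CSV with a header row, similar to a .csv export."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=["name", "price", "rating"], lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(items)
    return buffer.getvalue()


print(to_json(products))
print(to_csv(products))
```

In practice Scrapy writes these files itself when an output file is given with -o, so this only shows the shape of the data, not how the project produces it.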

Prerequisites

Before running the script, make sure you have the following installed:

  • Python 3.x
  • Scrapy framework (install with pip install scrapy)

Installation

  1. Clone this repository to your local machine:
git clone https://github.com/yourusername/flipkart-scrapy.git
  2. Navigate to the project directory:
cd flipkart

Usage

  1. Modify the Scrapy spider, located in flipkart/spiders/flipkart_spider.py, to include your search query and any additional details you want to scrape.

  2. In the project's root directory, run the Scrapy spider with the following command:

scrapy crawl flipkart -a search_query="your_query" -a pages_to_scrape=5 -o output.json
  • Replace your_query with the search query you want to use.
  • Adjust pages_to_scrape to specify the number of pages to scrape.
  • The scraped data will be saved in the specified output file (output.json in this example).
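The two arguments in the command above could be turned into one request URL per page roughly as follows. This is a minimal sketch, not the project's actual spider code: the build_search_urls helper is hypothetical, and the search endpoint and its q and page parameters are assumptions that should be checked against the URLs the spider really requests.

```python
from urllib.parse import urlencode

# Assumed search endpoint and parameter names; verify against the URLs
# the spider actually requests.
SEARCH_URL = "https://www.flipkart.com/search"


def build_search_urls(search_query, pages_to_scrape=1):
    """Return one search URL per page, from page 1 to pages_to_scrape."""
    # Scrapy passes -a arguments to the spider as strings, so convert
    # the page count before using it as a number.
    pages = int(pages_to_scrape)
    if pages < 1:
        raise ValueError("pages_to_scrape must be at least 1")
    return [
        f"{SEARCH_URL}?{urlencode({'q': search_query, 'page': page})}"
        for page in range(1, pages + 1)
    ]


for url in build_search_urls("wireless mouse", "3"):
    print(url)
```

A spider could call a helper like this from its start_requests method and yield one request per URL; keeping it as a plain function makes the paging logic easy to check without running a crawl.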

Customization

To extract additional fields or add other scraping behavior, edit the spider script (flipkart/spiders/flipkart_spider.py).
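One common customization is cleaning scraped text before it is saved. The sketch below shows two hypothetical helpers, parse_price and parse_rating, that are not part of this project; they assume prices arrive as strings with a currency symbol and comma separators, and ratings as plain decimal text.

```python
import re


def parse_price(raw):
    """Convert a price string such as '₹1,29,999' to an int, or None.

    Takes the first run of digits and commas and ignores any fractional part.
    """
    if raw is None:
        return None
    match = re.search(r"\d[\d,]*", raw)
    if match is None:
        return None
    return int(match.group().replace(",", ""))


def parse_rating(raw):
    """Convert a rating string such as ' 4.3 ' to a float, or None."""
    try:
        return float(raw.strip())
    except (AttributeError, ValueError):
        # AttributeError covers None; ValueError covers non-numeric text.
        return None


print(parse_price("₹1,29,999"))
print(parse_rating(" 4.3 "))
```

Helpers like these could be called inside the spider's parse callback or from a Scrapy item pipeline, so that the exported JSON or CSV holds numbers rather than raw strings.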

Contributing

If you'd like to contribute to this project, please follow these steps:

  1. Fork the repository.
  2. Create a new branch (git checkout -b feature/fooBar).
  3. Make your changes.
  4. Commit your changes (git commit -am 'Add some fooBar').
  5. Push to the branch (git push origin feature/fooBar).
  6. Create a new Pull Request.

License

This project is licensed under the MIT License - see the LICENSE.md file for details.
