olx-scraper's Introduction

OLX Scraper

A tool for those who, like me, spend a lot of time looking for useless things on OLX (a kind of Brazilian Craigslist).

After spending a good few hours studying web scraping, Groovy and its libraries, and building APIs with Spring, I decided to improve one of the projects proposed during the internship selection process at ZG Soluções.

As originally presented to me, the project was relatively simple: a web scraper for ads published on the OLX website (https://www.olx.com.br) that ran locally and saved the ads it found to a spreadsheet at the end of its execution. My improvement is an API built with Spring. The user sends the title of the advertisement and the Brazilian state in which it is located, and the response contains the average price of the matching ads, the cheapest ad, the most expensive ad, the number of ads found and, finally, the full list of ads with their title, value, address and URL.

Reviews and suggestions are always welcome.

Main URL:

https://olx-scraper.herokuapp.com

Stack used

Backend: Spring with Groovy and jsoup.

API documentation

Main request

  POST /item
| Parameter    | Type | Description                                  |
|--------------|------|----------------------------------------------|
| RequestModel | json | The parameter is sent in the request body.   |

Example

{
    "state":"go",
    "title":"Fiat Argo"
}
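
As a rough illustration of how a client might send this request, here is a minimal sketch using the JDK's built-in `java.net.http` client (Java 11+). The class name `OlxScraperClient` and its helper methods are hypothetical and not part of the project. The JSON body is assembled by hand with only basic escaping, which is an assumption that holds for plain titles like the one above.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

// Hypothetical client, for illustration only; not part of olx-scraper.
public class OlxScraperClient {

    // Minimal escaping (backslash and double quote only). This is an
    // assumption that suits plain titles, not a general JSON encoder.
    static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    // Builds the request body with the two documented fields.
    static String buildBody(String state, String title) {
        return "{\"state\":\"" + escape(state) + "\",\"title\":\"" + escape(title) + "\"}";
    }

    // Builds a POST request to the documented /item endpoint.
    static HttpRequest buildRequest(String baseUrl, String state, String title) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/item"))
                .header("Content-Type", "application/json")
                .timeout(Duration.ofSeconds(60))
                .POST(HttpRequest.BodyPublishers.ofString(buildBody(state, title)))
                .build();
    }

    public static void main(String[] args) throws InterruptedException {
        // The base URL can be overridden with the first argument.
        String baseUrl = args.length > 0 ? args[0] : "https://olx-scraper.herokuapp.com";
        HttpRequest request = buildRequest(baseUrl, "go", "Fiat Argo");
        try {
            HttpResponse<String> response = HttpClient.newHttpClient()
                    .send(request, HttpResponse.BodyHandlers.ofString());
            System.out.println(response.statusCode());
            System.out.println(response.body());
        } catch (IOException e) {
            System.out.println("Request failed: " + e.getMessage());
        }
    }
}
```

When the application is running locally, passing `http://localhost:8080` as the first argument should target it instead, assuming Spring Boot's default port has not been changed.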

Returns a response with a lot (A LOT) of information.

{
    "searchTitle": "Fiat Argo",
    "avaragePrice": 367322.08,
    "cheapestItem": {
        "title": "FIAT ARGO 2018 ( PARCELO NO BOLETO )",
        "value": 56.0,
        "address": "Goiânia, Setor Central - DDD 62",
        "adURL": "https://go.olx.com.br/grande-goiania-e-anapolis/autos-e-pecas/pecas-e-acessorios/carros-vans-e-utilitarios/fiat-argo-2018-parcelo-no-boleto-1118460572"
    },
    "moreExpensiveItem": {
        "title": "Fiat Argo 1.0",
        "value": 4.5E7,
        "address": "Itapaci - DDD 62",
        "adURL": "https://go.olx.com.br/grande-goiania-e-anapolis/autos-e-pecas/carros-vans-e-utilitarios/fiat-argo-1-0-1118389270"
    },
    "adAmount": 145,
    "advertisements": [
        {
            "title": "FIAT ARGO 1.0 FIREFLY FLEX DRIVE MANUAL",
            "value": 65644.0,
            "address": "Goiânia, Setor Aeroporto - DDD 62",
            "adURL": "https://go.olx.com.br/grande-goiania-e-anapolis/autos-e-pecas/carros-vans-e-utilitarios/fiat-argo-1-0-firefly-flex-drive-manual-1121983631"
        },
        ...,
        {
            "title": "FIAT ARGO 2021/2021 1.0 FIREFLY FLEX MANUAL",
            "value": 60380.0,
            "address": "Goiânia, Setor Marista - DDD 62",
            "adURL": "https://go.olx.com.br/grande-goiania-e-anapolis/autos-e-pecas/carros-vans-e-utilitarios/fiat-argo-2021-2021-1-0-firefly-flex-manual-1123260717"
        }
    ]
}
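
To make the summary fields concrete, the sketch below shows one way the average, cheapest item, most expensive item and ad count could be derived from a list of ads. It is not the project's actual implementation (which is written in Groovy), and the `AdSummary`, `Ad` and `Summary` names are hypothetical. Note that the real response spells its keys `avaragePrice` and `moreExpensiveItem`, so a client parsing it has to use those exact names.

```java
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch; not the project's actual code. Requires Java 16+ (records).
public class AdSummary {

    // Mirrors the fields of one entry in "advertisements".
    record Ad(String title, double value, String address, String adURL) {}

    // Mirrors the top-level summary fields of the response.
    record Summary(double averagePrice, Ad cheapestItem, Ad mostExpensiveItem, int adAmount) {}

    static Summary summarize(List<Ad> ads) {
        if (ads.isEmpty()) {
            throw new IllegalArgumentException("no ads to summarize");
        }
        Comparator<Ad> byValue = Comparator.comparingDouble(Ad::value);
        double average = ads.stream().mapToDouble(Ad::value).average().orElseThrow();
        Ad cheapest = ads.stream().min(byValue).orElseThrow();
        Ad mostExpensive = ads.stream().max(byValue).orElseThrow();
        return new Summary(average, cheapest, mostExpensive, ads.size());
    }

    public static void main(String[] args) {
        // Titles, values and addresses are taken from the example response; URLs are shortened placeholders.
        List<Ad> ads = List.of(
                new Ad("FIAT ARGO 1.0 FIREFLY FLEX DRIVE MANUAL", 65644.0,
                        "Goiânia, Setor Aeroporto - DDD 62", "placeholder-url-1"),
                new Ad("FIAT ARGO 2021/2021 1.0 FIREFLY FLEX MANUAL", 60380.0,
                        "Goiânia, Setor Marista - DDD 62", "placeholder-url-2"));
        Summary summary = summarize(ads);
        System.out.println(summary.averagePrice());          // 63012.0
        System.out.println(summary.cheapestItem().title());
        System.out.println(summary.adAmount());              // 2
    }
}
```

A plain mean is sensitive to outliers. In the example response, a single listing priced at 4.5E7 and another at 56.0 sit alongside cars priced around 60,000, which is why the reported average (367322.08) is far above a typical listing.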

Running locally

Clone the project

  git clone https://github.com/mourarezendecas/olx-scraper

Navigate to the project directory

  cd olx-scraper

Run the application

  mvn spring-boot:run

olx-scraper's People

Contributors

mourarezendecas

olx-scraper's Issues

Running locally

Hi,
I am trying to run olx-scraper locally, but I can't access the page at http://localhost:8080. Could you please explain in more detail how to get it working?