whimfoome / web_scraper

This project forked from tusharojha/web_scraper

Fixes a RegExp overflow in the web_scraper package that occurs with very large files in optimized builds.

Home Page: https://github.com/Whimfoome/web_scraper/tree/overflowregexp

License: Apache License 2.0

Ruby 12.50% Objective-C 0.14% Kotlin 1.63% Dart 84.17% Swift 1.55%

A Simple Web Scraper for Dart & Flutter

A very basic web scraper implementation for scraping HTML elements from a web page.

Pull requests are most welcome.

Getting Started

In the pubspec.yaml at your project root, add:

dependencies:
  web_scraper:

then,

import 'package:web_scraper/web_scraper.dart';

Note that as of version 0.0.6, the project supports not only Flutter projects, but also Dart projects.
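
To depend on this fork instead of the published package, pub's git dependency syntax can point at the branch named in the Home Page URL above. This is a sketch only: it assumes the repository can be cloned from that GitHub URL and that the overflowregexp branch is the one you want.

```yaml
dependencies:
  web_scraper:
    git:
      # Assumed clone URL, derived from the Home Page link above.
      url: https://github.com/Whimfoome/web_scraper.git
      # Branch name taken from the same link.
      ref: overflowregexp
```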

Implementation

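    import 'package:web_scraper/web_scraper.dart';

    Future<void> main() async {
      final webScraper = WebScraper('https://webscraper.io');
      // loadWebPage takes a route relative to the base URL given above.
      if (await webScraper.loadWebPage('/test-sites/e-commerce/allinone')) {
        // First argument: CSS selector. Second argument: attributes to read.
        List<Map<String, dynamic>> elements = webScraper.getElement('div.thumbnail > div.caption', ['h4']);
        print(elements);
      }
    }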

Check out the web_scraper_test.dart file for a closer look at all the functionality.
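
When only the text of the matched elements is needed, getElementTitle from the Methods table is a lighter alternative to getElement. The following is a minimal sketch rather than the project's own example: the selector 'div.thumbnail > div.caption > h4' is an assumption about the test page's markup, and the code assumes that a failed load is reported through the returned bool.

```dart
import 'package:web_scraper/web_scraper.dart';

Future<void> main() async {
  final webScraper = WebScraper('https://webscraper.io');

  // Same test page as in the example above.
  final loaded =
      await webScraper.loadWebPage('/test-sites/e-commerce/allinone');
  if (!loaded) {
    print('Page could not be loaded.');
    return;
  }

  // Hypothetical selector: the h4 headings inside each product caption.
  final titles =
      webScraper.getElementTitle('div.thumbnail > div.caption > h4');

  // Print one title per line.
  for (final title in titles) {
    print(title);
  }
}
```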

Methods

| Method | Description | Arguments | Return Type |
| --- | --- | --- | --- |
| loadWebPage | Loads the webpage into the response object and then parses it into the document object | String route | Future<bool> |
| loadFromURL | Loads the webpage from the provided URL into the response object and then parses it into the document object | String page | Future<bool> |
| loadFromString | Loads the webpage from a String (usually stored by the getPageContent method) into the response object and then parses it into the document object. This operation is completely synchronous and exists as a helper method for running Flutter compute() operations and avoiding jank | String responseBodyAsString | Future<bool> |
| getPageContent | Returns the webpage's HTML in string format | void | String body |
| getElement | Returns a list of the elements found at the specified address | String address, List<String> attributes | List<Map<String, dynamic>> |
| getElementTitle | Returns a list of the element titles found at the specified address | String address | List<String> |
| getElementAttribute | Returns a list of a single attribute's values for the elements found at the specified address (to get multiple attributes at once, use getElement instead) | String address, List<String> attributes | List<String> |
| getAllScripts | Returns a list of all data enclosed in the script tags of the document | void | List<String> |
| getScriptVariables | Returns a map from the given variable names to a list of their occurrences in the script tags | List<String> variableNames | Map<String, dynamic> |
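
The getPageContent and loadFromString entries are meant to work as a pair: fetch the HTML once, keep it as a String, and parse it again later without a second request. A hedged sketch of that round trip follows. The selector is hypothetical, and because the table describes loadFromString as synchronous while listing Future<bool> as its return type, the call is awaited so that it behaves the same in either case.

```dart
import 'package:web_scraper/web_scraper.dart';

Future<void> main() async {
  final online = WebScraper('https://webscraper.io');
  if (!await online.loadWebPage('/test-sites/e-commerce/allinone')) {
    return;
  }

  // Keep the raw HTML so it can be parsed again without another request.
  final html = online.getPageContent();

  // A second instance that never touches the network.
  final offline = WebScraper('https://webscraper.io');

  // Awaiting is harmless if loadFromString returns a plain bool.
  final loaded = await offline.loadFromString(html);
  if (loaded) {
    // Hypothetical selector, as in the earlier examples.
    print(offline.getElementTitle('div.thumbnail > div.caption > h4'));
  }
}
```

In a Flutter app the stored string could be handed to compute() so that parsing happens off the UI thread, which is the use case the table describes.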

Contributing

  • Please branch from develop to implement a bug fix or new feature.
  • Ensure that the code is formatted according to the standard Dart rules and uses the latest stable version of Dart.
  • Open a PR targeting develop, with a clear description.

Contributors

defuncart, evandrmb, praharshbhatt, tushargupta00, tusharojha
