
mariostsatsos / container-web-scraper-example


This project forked from aws-samples/container-web-scraper-example


An example of a web scraper in a Docker container running on AWS

License: MIT No Attribution

Shell 4.63% Python 25.32% Dockerfile 70.05%


A web scraper in a Docker container hosted on AWS

This example shows how to build and run a Docker image that contains the Firefox web browser and Python libraries such as Selenium, in order to host a web scraper on AWS.

Instructions

The example contains a CloudFormation script that rebuilds the project and its infrastructure automatically on AWS. Update the home.py file in the code folder with your own logic. Then archive the contents of the code folder as code.zip and upload the archive to your S3 bucket, which is specified in the CloudFormation script under the name S3HostingBucket.
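The archiving step can be scripted. The following is a minimal sketch, not part of the original project: the helper name `zip_folder_contents` is hypothetical, and it assumes that the files should sit at the root of code.zip (that is, home.py rather than code/home.py), which is one reading of "the contents of the code folder".

```python
import zipfile
from pathlib import Path


def zip_folder_contents(src_dir: str, archive_path: str) -> list[str]:
    """Zip every file under src_dir, storing paths relative to src_dir.

    Hypothetical helper: assumes the archive root should mirror the
    inside of the folder, not contain the folder itself.
    Returns the archive member names that were written.
    """
    src = Path(src_dir)
    archive = Path(archive_path).resolve()
    names = []
    with zipfile.ZipFile(archive, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(src.rglob("*")):
            # Skip directories, and skip the archive itself in case it
            # was created inside src_dir.
            if not path.is_file() or path.resolve() == archive:
                continue
            arcname = path.relative_to(src).as_posix()
            zf.write(path, arcname)
            names.append(arcname)
    return names
```

Calling `zip_folder_contents("code", "code.zip")` from the repository root would produce the archive. It can then be uploaded to the bucket named by S3HostingBucket, for example through the S3 console or with `aws s3 cp`.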

Note

The solution relies on the Firefox browser, which receives frequent updates that include important security fixes. Make sure that you are running the latest version, or consider alternatives. For example, Selenium requires a web browser, whereas other scraping libraries can run without one.
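To illustrate the browser-free alternative, here is a minimal sketch that extracts links from an HTML string using only the Python standard library. It is not taken from this project: `LinkCollector` and `extract_links` are hypothetical names. It only handles static HTML, so pages that build their content with JavaScript would still need a browser-driven tool such as Selenium.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect absolute URLs from the href attribute of <a> tags."""

    def __init__(self, base_url: str):
        super().__init__()
        self.base_url = base_url
        self.links: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            # Anchors without an href (or with an empty one) are ignored.
            if name == "href" and value:
                self.links.append(urljoin(self.base_url, value))


def extract_links(html: str, base_url: str) -> list[str]:
    """Return the links found in html, resolved against base_url."""
    parser = LinkCollector(base_url)
    parser.feed(html)
    parser.close()
    return parser.links
```

In practice, the HTML string would come from an HTTP request, for example one made with `urllib.request.urlopen`, so no browser has to be installed or kept up to date inside the container.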

Security

See CONTRIBUTING for more information.

License

This library is licensed under the MIT-0 License. See the LICENSE file.

Contributors

amazon-auto, brycahta, kafka399
