A simple web scraper that uses Puppeteer under the hood to scrape an Airbnb listing and store it in MongoDB.
Below is a quick outline of the structure of the app:
.
├── api # API routes
│   ├── listing.ts
│   └── ...
├── config # Various configuration objects
│   ├── db.ts
│   └── ...
├── interfaces # TS interfaces
│   ├── listing.ts
│   └── ...
├── middlewares # Custom Express middlewares
│   ├── errorHandler.ts
│   └── ...
├── models # MongoDB models
│   ├── listing.ts
│   └── ...
├── modules # Modules are used to separate code to make it more testable
│   ├── puppeteer # This handles all the Puppeteer magic
│   │   ├── scrapeListing.ts
│   │   ├── index.ts
│   │   └── ...
│   └── ...
├── schemas # MongoDB schemas
│   ├── listing.ts
│   └── ...
├── index.ts # Entrypoint - starts the server.
└── server.ts # This is where the Express app is set up and configured.
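To illustrate why keeping the scraping logic in its own module helps with testing, here is a minimal sketch. It is not the project's actual code: the names `Listing`, `ScrapeFn`, `SaveFn` and `createListing` are hypothetical, and the listing fields are assumptions. The only point is that a function which receives its scraper and its persistence step as arguments can be exercised with stubs, without launching Puppeteer or MongoDB.

```typescript
// Hypothetical shape of a scraped listing; the real interface in
// interfaces/listing.ts may look different.
interface Listing {
  url: string;
  title: string;
}

// The scraper and the persistence layer are passed in as plain functions,
// so tests can swap in stubs for Puppeteer and MongoDB.
type ScrapeFn = (url: string) => Promise<Listing>;
type SaveFn = (listing: Listing) => Promise<Listing>;

// Validates the input, scrapes the listing, then hands it to the save step.
async function createListing(
  url: string,
  scrape: ScrapeFn,
  save: SaveFn
): Promise<Listing> {
  if (!url) {
    throw new Error("A listing URL is required");
  }
  const listing = await scrape(url);
  return save(listing);
}
```

In a test, `scrape` could be a stub that returns a fixed object and `save` a stub that records what it was given, which keeps the test fast and independent of the network.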
The instructions below explain how to get up and running, either with Docker or with a local Node.js and MongoDB install.
- Install Docker
- Install Docker Compose
- Build the image and start the container:
docker-compose up
- Once the container is up, the app will be available at: http://localhost:1337
- Alternatively, to run without Docker, install Node.js v6.11.0+
- Install MongoDB
- Install Yarn
- (Optional) Install a MongoDB GUI client. I recommend MongoDB Compass, for no other reason than that I use it and couldn't be arsed to test others.
- Install the node_modules:
yarn install
- Copy the .env.example into a .env file using:
cp -n .env.example .env
- Ensure MongoDB is running.
- Start the server:
yarn start
- You can check the API using the following cURL command:
curl -X POST \
http://localhost:1337/api/listing \
-H 'Content-Type: application/json' \
-d '{"url": "https://www.airbnb.co.uk/rooms/28299515?location=London%2C%20United%20Kingdom&toddlers=0&_set_bev_on_new_domain=1572300146_ZKC6996OiM8G0CT3&source_impression_id=p3_1572300147_bRb1KSr%2FXjuPRPDg&guests=1&adults=1"}'
- (Optional) If you installed a MongoDB GUI client, you can now see the listing has been created/updated.
- You can run the tests using:
yarn test
- You shouldn't need your mongod running (or even installed), as the tests spin up an in-memory MongoDB and tear it down at the end.
- Credit to nodkz and his mongodb-memory-server. I discovered it building this project and I am thoroughly impressed!
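The cURL request shown earlier can also be built from TypeScript. The following is a minimal sketch rather than part of the project: `API_BASE`, `ListingRequest` and `buildListingRequest` are hypothetical names, and the base URL simply assumes the server is listening on port 1337 as described above.

```typescript
// Assumed base URL, taken from the port the server listens on.
const API_BASE = "http://localhost:1337";

// Hypothetical container for the endpoint and the request options.
interface ListingRequest {
  endpoint: string;
  init: {
    method: "POST";
    headers: Record<string, string>;
    body: string;
  };
}

// Builds the same request as the cURL command: a JSON POST to /api/listing
// whose body carries the Airbnb listing URL under the "url" key.
function buildListingRequest(listingUrl: string): ListingRequest {
  return {
    endpoint: `${API_BASE}/api/listing`,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ url: listingUrl }),
    },
  };
}
```

With the server running, the result could be passed to the global `fetch` that ships with Node.js 18+ (older versions would need a polyfill), for example `fetch(request.endpoint, request.init)`. The shape of the response is not documented here, so inspect it before relying on any particular field.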