👋 Hi there!
In this project, you'll build a simple API that fetches some info about a given URL/webpage and makes the results accessible. The goal of this project is to see how you tackle a problem and a set of requirements with few constraints on implementation.
To get started, make sure you have Node.js installed; we recommend the active LTS release. Then clone this repository. The project contains an empty `index.js` file you're free to begin working in. If you have another approach in mind, just delete this file.
FYI, our stack is largely based on TypeScript & Node.js. We use PostgreSQL as our primary database, but any relational database is fine. How you tackle this project is entirely up to you!
Develop a RESTful API to complete the following:
- Add an endpoint that accepts a URL in the request body and creates and returns a new `Reference` record as JSON.
  - During this process, you should also initiate an asynchronous task to fetch data from the URL saved in the `Reference`. More information on fetching data is below.
  - Note: the endpoint should return the `Reference` record without waiting for it to be processed.
- Implement an async worker function that processes the reference. This function should take a `Reference` as an argument.
  - Given the `Reference`'s `url` field, get the text content from the page's `title` and any `meta` elements (if they exist), with their names and values serialized to create a semi-structured representation of the page's title & metadata.
  - Return the data as an object and create a new `Result` record in the database, storing the info as JSON or a serialized string in the record's `data` column.
- Add a new GET endpoint that allows a user to fetch results for a given `Reference` ID. This endpoint should return a list of saved `Result`s for a given `Reference` as JSON. Don't forget to keep it RESTful and keep resource-naming best practices in mind as you go.
In your processing task, you'll need to fetch the contents of a webpage and extract information from its DOM. To do this, we recommend fetching and working with the page content using a browser automation tool like Puppeteer or Playwright.
Fetching HTML via plain HTTP and extracting your information without any additional effort is becoming less common these days, with the rise of JS-dependent rendering, SPAs, and other complexities like bot detection and browser fingerprinting. If you'd like to challenge yourself a bit further, check out ToScrape, which has a number of great scenarios already laid out and designed to be extracted!
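Once the rendered HTML is in hand (for example, the string returned by Playwright's `page.content()`), the title/meta serialization step can be illustrated in isolation. The regexes below are a deliberately naive sketch for static HTML — a real implementation should query the browser's DOM (or use a proper HTML parser) instead:

```typescript
// Semi-structured representation of a page's title & metadata.
export interface PageMetadata {
  title: string | null;
  meta: Record<string, string>;
}

// Sketch: pull the <title> text and any name/content (or property/content)
// pairs out of <meta> tags in an HTML string. Assumes double-quoted
// attributes; not robust against real-world markup.
export function extractMetadata(html: string): PageMetadata {
  const titleMatch = html.match(/<title[^>]*>([\s\S]*?)<\/title>/i);
  const meta: Record<string, string> = {};
  for (const tag of html.match(/<meta\b[^>]*>/gi) ?? []) {
    const name = tag.match(/(?:name|property)\s*=\s*"([^"]+)"/i)?.[1];
    const content = tag.match(/content\s*=\s*"([^"]*)"/i)?.[1];
    if (name !== undefined && content !== undefined) meta[name] = content;
  }
  return { title: titleMatch ? titleMatch[1].trim() : null, meta };
}
```

The returned object is ready to be stored as JSON (or a serialized string) in a `Result`'s `data` column.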
A reference is created when a user makes a call to `POST /references`.
| Field | Type | Description |
| --- | --- | --- |
| `id` | primary key | the reference identifier |
| `url` | string | a valid web address |
| `created_at` | timestamp | reference created time |
A result is created after a data fetching task for a `Reference` is completed.
| Field | Type | Description |
| --- | --- | --- |
| `id` | primary key | the result identifier |
| `reference_id` | foreign key | the related reference |
| `data` | json | result from the fetching task |
| `created_at` | timestamp | result created time |
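In PostgreSQL, the two tables above might be declared roughly as follows. This is a sketch only — the column types (`uuid`, `jsonb`, `timestamptz`) and defaults are assumptions, and any relational database is fine:

```sql
-- "references" is quoted because REFERENCES is a reserved word in PostgreSQL.
CREATE TABLE "references" (
  id          uuid PRIMARY KEY DEFAULT gen_random_uuid(),
  url         text NOT NULL,
  created_at  timestamptz NOT NULL DEFAULT now()
);

CREATE TABLE results (
  id            uuid PRIMARY KEY DEFAULT gen_random_uuid(),
  reference_id  uuid NOT NULL REFERENCES "references" (id) ON DELETE CASCADE,
  data          jsonb NOT NULL,
  created_at    timestamptz NOT NULL DEFAULT now()
);
```

`gen_random_uuid()` is built into PostgreSQL 13+; serial integer keys would work just as well.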
Other things that are not required, but we would love to see:
- Test coverage (We tend to use Jest)
- Additional validations
- More endpoints (fetch all references, delete a reference & its results, etc.)
- Make use of an actual job queue (Redis, ElasticMQ, etc.)
- Scheduling/interval-based reprocessing of existing references to monitor changes
- Anything else you can think of!
If you don't implement the bonus items, no worries. Feel free to share some notes on what you might do and how you might have gone about it given more time.
When you have finished the exercise, please create a bundle of your work by running `npm run bundle` in the project root.
This will create a bundle file called `take-home-challenge.bundle` based on your local `main` branch. Send the file to us via email, or, if you received a submission link from your hiring manager, please upload it there.
Thank you, and good luck! 🍀