MicroLink

URL shortener service

This is my attempt at the assignment. I had no prior knowledge of GCP or Kubernetes. I incorporated patterns that I have used in previous large-scale projects so that this solution can scale to millions of concurrent requests.

There is more work to do but only so much time (see TODO). For instance, the autoscaling rules are not configured. The project is also coupled to GCP in some parts, which is not necessary.

WEB UI:

API:

Example shorten URL:

curl 'http://34.111.237.154/shorten_url?long_url=google.com' -X POST -H 'Cache-Control: no-cache' -H 'Content-Length: 0'

{"short_url":"34.111.237.154/baea954b"}

Problem

Context: At OMITTED, we have a semi-realtime environment with thousands of changes happening concurrently.

Project: Create a web service that shortens URLs for 1000s of concurrent users. Users should be able to submit a long URL, then receive a unique shortened URL that redirects to the long URL.

Requirements:

  • Users must be able to send a long URL and receive a shortened URL.
  • Implement the logic using Python 3 and asyncio.
  • Prepare the app for deployment to a Kubernetes cluster.
  • Document the app interfaces, choice of database (if any), metrics and logging.
  • Be ready to discuss how you would scale the app for millions of concurrent users.

Note: Don't spend effort on front-end layout/design unless you have spare time. We will only be assessing functionality.

Bonus: Actually deploy the app somewhere and provide us the URL.

Solution

To fulfill the requirements of this project, I would recommend using FastAPI as a web framework for Python, and Redis for the database.

Interfaces:

The interface for submitting a long URL would be a POST endpoint, "/shorten_url". The design calls for the long URL to be passed as a JSON parameter; the deployed example above passes it as the "long_url" query parameter. The interface for retrieving the original long URL from a shortened URL would be a GET endpoint, "/{short_url}", where "{short_url}" is the unique shortened URL hash.

Database:

Redis would be a good choice for this project as it provides fast data retrieval and storage, and is easy to use with Python through the redis-py library. In the Redis database, each long URL would be stored with a unique key generated by hashing the long URL. The shortened URL would be the key itself, prefixed by the domain name.
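The key scheme described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: SHA-256, the 8-character truncation (chosen to match the length of the key in the example response), and the helper names make_short_key, store_url and build_short_url are all assumptions. A plain dict stands in for Redis so the sketch runs on its own.

```python
import hashlib

KEY_LENGTH = 8  # assumption: matches the 8-character key in the example response


def make_short_key(long_url: str, length: int = KEY_LENGTH) -> str:
    """Derive a deterministic hex key from the long URL (SHA-256 is an assumption)."""
    digest = hashlib.sha256(long_url.encode("utf-8")).hexdigest()
    return digest[:length]


def store_url(store: dict, long_url: str) -> str:
    """Save the mapping unless the key is already taken by a different URL.

    `store` is a plain dict standing in for Redis; with redis-py the same
    set-if-absent step could use `set(key, long_url, nx=True)`.
    """
    key = make_short_key(long_url)
    existing = store.setdefault(key, long_url)
    if existing != long_url:
        # A truncated hash can collide; a real service would need a fallback here.
        raise ValueError(f"key collision for {key!r}")
    return key


def build_short_url(domain: str, key: str) -> str:
    """Prefix the key with the domain name to form the shortened URL."""
    return f"{domain}/{key}"
```

Because the key is derived from the URL itself, submitting the same long URL twice yields the same short URL. The trade-off is that truncating the hash makes collisions between different URLs possible, which is why the sketch checks for them.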

Metrics and Logging:

To monitor the performance of the app, I would recommend using Prometheus for metrics collection and Grafana for dashboard visualization. For logging, I would recommend using the logging module in Python and sending the logs to a centralized logging service such as Logstash or Fluentd.
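For the logging side, one possible setup is to emit one JSON object per line to stdout, which collectors such as Fluentd or Logstash can typically pick up from container output. The sketch below uses only the standard library; the logger name "microlink" and the field names in the payload are assumptions, not taken from the project.

```python
import json
import logging
import sys


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "time": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        return json.dumps(payload)


def configure_logging(level: int = logging.INFO) -> logging.Logger:
    """Attach a stdout handler with JSON output to a hypothetical app logger."""
    handler = logging.StreamHandler(sys.stdout)
    handler.setFormatter(JsonFormatter())
    logger = logging.getLogger("microlink")  # hypothetical logger name
    logger.handlers = [handler]
    logger.setLevel(level)
    logger.propagate = False  # avoid duplicate lines via the root logger
    return logger
```

Calling configure_logging() once at startup and then logger.info("shortened %s", key) would produce one structured line per event.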

Scaling:

To scale the app for millions of concurrent users, I would recommend splitting the API and the web server into separate microservices, each deployed on multiple replicas for redundancy. I would also consider using a load balancer such as Nginx or HAProxy to distribute incoming traffic evenly across the replicas. Additionally, I would consider using a caching layer such as Varnish or Nginx to cache frequently accessed URLs and reduce the load on the Redis database.
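The caching idea can be illustrated in-process. Note that this is not Varnish or Nginx: it is a small read-through cache with a time-to-live, shown only to make the principle concrete (serve frequently requested keys from memory, fall back to the database otherwise). The class name TTLCache, the fetch callback, and the 60-second default are assumptions.

```python
import asyncio
import time
from typing import Awaitable, Callable, Optional


class TTLCache:
    """Tiny read-through cache: serve fresh entries from memory, else ask the backend."""

    def __init__(
        self,
        fetch: Callable[[str], Awaitable[Optional[str]]],
        ttl_seconds: float = 60.0,
    ) -> None:
        self._fetch = fetch  # e.g. a coroutine that reads the key from Redis
        self._ttl = ttl_seconds
        self._entries: dict[str, tuple[float, str]] = {}

    async def get(self, key: str) -> Optional[str]:
        now = time.monotonic()
        hit = self._entries.get(key)
        if hit is not None and hit[0] > now:
            return hit[1]  # still fresh: no backend call
        value = await self._fetch(key)
        if value is not None:
            # Only cache keys that exist, with an expiry timestamp.
            self._entries[key] = (now + self._ttl, value)
        return value
```

A short TTL keeps memory bounded in spirit but the sketch never evicts entries; a production cache would also need a size limit.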

Deployment:

For the deployment to a Kubernetes cluster, I would recommend using a GitOps approach with tools such as Argo CD or Flux CD to automate the deployment process and ensure consistency across environments. For this project I went with a simpler approach and used GitHub Actions to deploy to a GKE cluster. The deployment process would involve creating and applying Kubernetes manifests for the various components, such as the API, web server, Redis, load balancer, and metrics/logging components.

Note: The deployment to a live environment would require a proper domain name and a valid SSL certificate for secure communication. I have skipped this effort due to time constraints.

Closing Note:

An IngressController is used as the entry point to the Kubernetes cluster. It routes requests to the appropriate service (api, web). This matters because the web service is not used to resolve and redirect shortened URLs; doing so would put additional strain on that service and break the single responsibility principle.

First time project setup

  1. Set up the Python environment
pyenv install 3.11.0b5
pyenv local 3.11.0b5

pip install -r requirements-api.txt
pip install -r requirements-web.txt
  2. Run pytest
export APP_ENV=dev
pytest .
  3. Build images and deploy to Kubernetes

Note: The current K8S files expect the images to be hosted for GKE. Replace 'image' in those files with the images created here.

docker build -t docker_image_api -f Dockerfile_api .;
docker tag docker_image_api:latest <DOCKER-HUB-USER>/microlink_api:latest;
docker push <DOCKER-HUB-USER>/microlink_api:latest

docker build -t docker_image_web -f Dockerfile_web .;
docker tag docker_image_web:latest <DOCKER-HUB-USER>/microlink_web:latest;
docker push <DOCKER-HUB-USER>/microlink_web:latest

kubectl apply -f k8s/;
kubectl get pods;

TODO

  • SSL Certificate and HTTPS enforcement
  • Domain name for Microlink
  • Prometheus and Grafana created and deployed (GCP provides equivalents for now)
  • Switch to NginxIngressController
  • Introduce FastAPI Models instead of using Dict for request/response
  • Additional unit tests + pytest Github action
  • Put 'src' on the path to remove the need to reference imports by 'src.*'
  • Use the OpenAPI spec in the FastAPI views.py (todo-url-shortener-api.yaml)
  • Set up a deep health check which runs as part of the deployment.
  • Clean-up TODO comments in code related to these items
  • VPC setup and network partitioning
  • Handle the case when the client provides a URL without an http or https prefix.
  • Implement strict Kubernetes security controls (least privilege)
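
As an illustration of the TODO item about URLs submitted without an http or https prefix (such as "google.com" in the curl example), a minimal sketch could look like this. The function name normalize_url and the choice of "https" as the default scheme are assumptions, not decisions made in the project.

```python
from urllib.parse import urlsplit


def normalize_url(raw: str, default_scheme: str = "https") -> str:
    """Add a scheme when missing and reject anything that is not http(s)."""
    candidate = raw.strip()
    if not candidate:
        raise ValueError("empty URL")
    if "://" not in candidate:
        # Assumption: bare hosts like "google.com" default to https.
        candidate = f"{default_scheme}://{candidate}"
    parts = urlsplit(candidate)
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError(f"unsupported URL: {raw!r}")
    return candidate
```

Normalizing before hashing would also mean that "google.com" and "https://google.com" map to the same short key under this assumption.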

Contributors

dependabot[bot], tom-xyz
