
Andrii Voitkiv - portfolio

About

This is a repository to showcase skills, share projects and track my progress in Data Science / Machine Learning related topics.

Study projects

In this section I provide links to my GitHub repositories containing the code and Jupyter notebooks I created while taking courses.

Master of Data Science and Analytics

This is a 12-month master's program at the University of Calgary, Canada. For more details, go to repo...
List of courses:

PyTorch fundamentals

Code: go to repo...
Status: In progress

NLP Cook Dishes Project

Code: go to repo...
Description: This project explores several NLP models for generating text from a dataset of recipes.
Skills in focus: N-gram Language Model, Neural Language Models (RNN-LSTM, Convolutional), Sampling strategies, Language model evaluation
Status: Completed in November 2023
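
To illustrate the skills listed above, here is a minimal sketch of a bigram language model with temperature sampling and a perplexity check. It is not the project's code: the toy corpus, the function names, and the choice of an unsmoothed bigram model are all assumptions made for illustration.

```python
import math
import random
from collections import Counter, defaultdict

# Hypothetical toy corpus; the real project uses a recipe dataset.
CORPUS = [
    "mix the flour and the sugar",
    "add the eggs and mix well",
    "bake the cake and serve",
]


def train_bigram(sentences):
    """Count bigrams, using <s> and </s> as sentence boundary tokens."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        for prev, nxt in zip(tokens, tokens[1:]):
            counts[prev][nxt] += 1
    return counts


def sample_next(counts, prev, temperature=1.0, rng=random):
    """Sample a successor of `prev`.

    Weights are count ** (1 / T): T < 1 sharpens the distribution,
    T > 1 flattens it, and T = 1 samples proportionally to the counts.
    """
    candidates = list(counts[prev].keys())
    weights = [c ** (1.0 / temperature) for c in counts[prev].values()]
    return rng.choices(candidates, weights=weights, k=1)[0]


def generate(counts, max_len=20, temperature=1.0, seed=None):
    """Generate tokens until </s> is sampled or max_len is reached."""
    rng = random.Random(seed)
    token, out = "<s>", []
    for _ in range(max_len):
        token = sample_next(counts, token, temperature, rng)
        if token == "</s>":
            break
        out.append(token)
    return " ".join(out)


def perplexity(counts, sentence):
    """Perplexity under the unsmoothed model; inf if any bigram is unseen."""
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    log_prob = 0.0
    for prev, nxt in zip(tokens, tokens[1:]):
        count = counts[prev][nxt]
        if count == 0:
            return float("inf")
        log_prob += math.log(count / sum(counts[prev].values()))
    return math.exp(-log_prob / (len(tokens) - 1))


if __name__ == "__main__":
    model = train_bigram(CORPUS)
    print(generate(model, temperature=0.7, seed=0))
    print(perplexity(model, "bake the cake and serve"))
```

Because the model is unsmoothed, any sentence containing an unseen bigram gets infinite perplexity; a real evaluation would normally add smoothing or back-off.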

Portfolio end-to-end projects

In this section I list projects and briefly describe the technology stack used in each.

Client Segmentation for targeted marketing in a credit union

Code: go to repo...
Presentation: go to Google Slides...
Industry: Banking and Finance
Description: The project builds a flexible, automated ML pipeline for running experiments. A series of automated workflows then deploys the best model to an app.
Skills in focus: Clustering, Model selection, Data and model versioning, Experimentations, CI/CD pipelines
Tools:

  • Environment: GitHub Codespaces, devcontainer, Docker, venv, Hydra
  • Data Management: DVC (Data Version Control), AWS S3
  • DS and ML: scikit-learn (PCA, clustering algorithms), Keras (autoencoder)
  • Continuous Integration: GitHub Actions, CML, AWS EC2
  • Continuous Deployment: FastAPI, Heroku

Results: The resulting client segments help the credit union make better decisions about how to reach out to different groups of clients.
Status: Completed in August 2023.

Predicting job salary

Code: go to repo...
Description: This is from a Kaggle competition: "Adzuna wants to build a prediction engine for the salary of any UK job ad, so they can make huge improvements in the experience of users searching for jobs, and help employers and jobseekers figure out the market worth of different positions."
Data: A large dataset (hundreds of thousands of records) that is mostly unstructured text, with a few structured fields.
Skills in focus: Regression, Tokenization, Categorical Vectorization, Neural Networks, OOP, ML Pipeline (Azure CLI), Components (Azure CLI), Deployment
Tools:

  • Environment: GitHub Codespaces, devcontainer, conda, Azure CLI, Azure ML Studio
  • DS and ML: PyTorch, scikit-learn

Status: Completed in September 2023.
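
As a rough illustration of the tokenization and categorical vectorization steps named above, the sketch below turns a job ad's free text into a bag-of-words vector and a structured field into a one-hot vector. It uses only the standard library. The example ads, the contract-type categories, and all function names are hypothetical and are not taken from the project.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_vocab(texts, min_count=1):
    """Map each token seen at least min_count times to an index.

    Index 0 is reserved for unknown tokens.
    """
    freq = Counter(tok for text in texts for tok in tokenize(text))
    vocab = {"<unk>": 0}
    for tok, count in sorted(freq.items()):
        if count >= min_count:
            vocab[tok] = len(vocab)
    return vocab


def bag_of_words(text, vocab):
    """Count token occurrences; out-of-vocabulary tokens go to index 0."""
    vec = [0] * len(vocab)
    for tok in tokenize(text):
        vec[vocab.get(tok, 0)] += 1
    return vec


def one_hot(value, categories):
    """One-hot encode a categorical value; unseen values map to all zeros."""
    return [1 if value == c else 0 for c in categories]


if __name__ == "__main__":
    # Hypothetical job ad titles and contract types.
    titles = ["Data Engineer", "Data Scientist"]
    vocab = build_vocab(titles)
    features = bag_of_words("Senior Data Scientist", vocab) + one_hot(
        "permanent", ["contract", "permanent"]
    )
    print(features)
```

The concatenated vector can then be fed to a regression model; a neural network would more commonly use the token indices with an embedding layer instead of raw counts.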

Predicting diabetes on Azure ML with GitHub Actions

Code: go to repo...
Industry: Healthcare
Description:
Skills in focus: Logistic Regression, CI/CD pipelines, Linting, Testing, Package and Register the Model
Tools:

  • Environment: GitHub Codespaces, devcontainer, Docker, venv
  • Data Management: Azure ML Datastore
  • DS and ML: scikit-learn (Logistic regression)
  • Continuous Integration: GitHub Actions, Azure ML Resources (Job, Compute, Environment), flake8, pytest
  • Continuous Deployment: MLflow

Results: An automated workflow that is triggered when a new model is registered and deploys that model to the production environment.
Status: Completed in October 2023.

Fine-Tuning-LLM-with-SkyPilot-and-DVC

Code: go to repo...
Description: Fine-tune a foundation LLM for sentiment classification of hotel reviews, running in the cloud on GPUs.
Skills in focus: Text classification, LLM fine-tuning, Infrastructure provisioning, Checkpointing
Tools:

  • Environment: GitHub Codespaces, devcontainer, Docker, venv
  • Infrastructure Management: SkyPilot
  • DS and ML: Transformer, PyTorch
  • Continuous ML: DVC, Weights and Biases

Results: A cost-optimized cloud setup for fine-tuning an LLM with continuous machine learning.
Status: Completed in October 2023.
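
The checkpointing idea listed above can be sketched as follows: training state is saved periodically so that an interrupted cloud job can resume from the last saved step instead of starting over. This is a standard-library stand-in, not the project's code. A real fine-tuning run would save model and optimizer state (for example with PyTorch) rather than a JSON file, and the function names, the `stop_at` parameter, and the placeholder loss are assumptions made for illustration.

```python
import json
import os
import tempfile


def save_checkpoint(path, state):
    """Write the state atomically: dump to a temp file, then rename it."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w") as f:
        json.dump(state, f)
    os.replace(tmp_path, path)


def load_checkpoint(path):
    """Return the saved state, or a fresh state if no checkpoint exists."""
    if not os.path.exists(path):
        return {"step": 0, "loss_history": []}
    with open(path) as f:
        return json.load(f)


def train(path, total_steps, save_every=10, stop_at=None):
    """Run (or resume) a dummy training loop.

    stop_at simulates an interruption: the loop exits at that step
    without saving, so progress since the last checkpoint is lost.
    """
    state = load_checkpoint(path)
    for step in range(state["step"], total_steps):
        if stop_at is not None and step == stop_at:
            return state
        # Placeholder for a real training step and its loss value.
        state["loss_history"].append(1.0 / (step + 1))
        state["step"] = step + 1
        if state["step"] % save_every == 0:
            save_checkpoint(path, state)
    save_checkpoint(path, state)
    return state


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        ckpt = os.path.join(workdir, "checkpoint.json")
        train(ckpt, total_steps=50, stop_at=25)   # interrupted run
        print(load_checkpoint(ckpt)["step"])      # last saved step
        print(train(ckpt, total_steps=50)["step"])  # resumed run
```

Writing to a temporary file and renaming it keeps the previous checkpoint intact if the job is killed in the middle of a save.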

Contacts
