
edumunozsala / llama-2-7b-4bit-python-coder


Fine-tune and quantize Llama-2-like models to generate Python code using QLoRA, Axolotl, ...

License: GNU General Public License v3.0

Jupyter Notebook 99.23% Python 0.77%

llama-2-7b-4bit-python-coder's Introduction

Hello, my name is Eduardo Muñoz Sala


I'm an experienced Software Engineer, Data Scientist, and Machine Learning Engineer on a continuous learning path. I love building projects in ML and Generative AI and testing tools for MLOps architectures. I have strong skills in data analysis, project management, and leading software development teams.

  • I'm always taking courses on Coursera, DeepLearning.AI, Udemy, etc., to improve my skills in ML, Generative AI, and software engineering
  • I love testing new tools and platforms for vector databases, data management, and MLOps
  • Great interest in NLP in Spanish: training and fine-tuning models and testing for improvements
  • I'm looking to write more articles on Medium

Detailed board on my Studies & Personal Projects on Data Science and Machine Learning

Certifications

  • ๐Ÿ“ Microsoft Certified: Azure Data Scientist Associate
  • ๐Ÿ“ Servicenow Certified System Administrator
  • ๐Ÿ“ Microsoft Certified: Azure AI Fundamentals
  • ๐Ÿ“ Microsoft Certified: Azure Data Fundamentals
  • ๐Ÿ“ AWS Certified Machine Learning Specialty
  • ๐Ÿ“ Deeplearningai Specializations: Generative AI with LLM, Practical Data Science on AWS, NLP, GANs, Deep Learning, Tensorflow
  • ๐Ÿ“ Machine Learning Engineer Nanodegree on Udacity
  • ๐Ÿ“ Data engineer, Big Data and Machine Learning in Google Cloud Platform
  • ๐Ÿ“ Microsoft Professional Program in Artificial Intelligence
  • ๐Ÿ“ Microsoft Professional Program in Data Science
  • ๐Ÿ“ Applied Data Science in Python Specialization Michigan University

Connect with me:

edumunozsala | LinkedIn edumunozsala | Medium

Languages and Tools:

Python, PyTorch, Keras, TensorFlow, Scikit, AWS, GoogleCloud, SQL, git, Oracle

Latest Projects

  • Fine-Tuning a Llama-2 7B Model for Python Code Generation. Blog Code
  • Embeddings with Sentence Transformers and Pinecone for Question Answering in Spanish. Blog Code
  • An end-to-end AWS SageMaker Pipeline to prepare data, train, evaluate, and conditionally deploy a Keras Text Classification model. Code
  • Training a PyTorch Text Summarization Model on Vertex AI using a pretrained Huggingface model. Code
  • Train and Evaluate a Sentence Transformer model for news in Spanish on AWS SageMaker. Code
  • Train and Deploy on AWS SageMaker a Sentiment Classifier on IMDb Reviews in Spanish. Code
  • Fine-tune a RoBERTa Encoder-Decoder model trained on MLM for Text Generation. Blog Code
  • Create a Tokenizer and Train a Huggingface RoBERTa Model from Scratch. Blog Code
  • Apply Machine Learning in an Easy and Fast Way with AWS AutoPilot: Diving Deep on AutoML. Blog Code
  • Fine-tune a Huggingface T5 model for Text Summarization using AWS SageMaker and Weights and Biases. Code
  • Text Summarization Guide: EDA for text data, Encoder-Decoder with Attention, and Transformer using TensorFlow 2, AWS SageMaker and Weights & Biases. Blog Code
  • A Guide to the Encoder-Decoder Model and the Attention Mechanism: build an encoder-decoder architecture using LSTM units with Luong's Attention in TF2 - Blog Code
  • Character-level text generator with PyTorch and Amazon SageMaker - Blog Code
  • Intro to NLP and Text Classification: Compilation of notebooks introducing some relevant topics and concepts to get started in Natural Language Processing. - Blog Code
  • Malaria Predictor based on Image Classification: An image classifier to predict whether a cell is infected with malaria. CNNs, data augmentation and transfer learning using Azure Machine Learning Services - Code
  • Forest Cover Type Predictor: EDA to analyze the problem and develop an XGBoost model and a stacked ensemble model to predict the cover type in Roosevelt National Forest. - Code
  • Using Machine Learning to identify accents in spectrograms of speech: Capstone Project in the Microsoft Professional Program in Artificial Intelligence, a CNN to identify accents in spectrograms using Azure Machine Learning Services - Blog Code
  • Predicting Mortgage Approvals: A capstone project in the Microsoft Professional Program in Data Science, with an Exploratory Data Analysis on applications and a machine learning model in Azure Machine Learning Studio. - Blog Code


llama-2-7b-4bit-python-coder's People

Contributors

edumunozsala


llama-2-7b-4bit-python-coder's Issues

Can't run the code on Colab

Hi,

Thanks for such a comprehensive guide; it is really helpful for understanding the state of the art in the field. However, I could not run the code you shared on Colab.

These are the errors:
#1 In the "# Define the training arguments" cell: ValueError: Your setup doesn't support bf16/gpu. You need torch>=1.10, using Ampere GPU with cuda>=11.0
So I changed "fp16 = False, bf16 = True" to "fp16 = True, bf16 = False" in the "Setting Global Parameters" cell. Is it OK to change it?
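
On the first error: bf16 needs an Ampere or newer GPU (CUDA compute capability 8.0+), so falling back to fp16 on an older card is a common choice. The following is a minimal sketch of making that switch automatic, not the notebook's own code. The helper names `pick_precision_flags` and `detect_compute_capability` are hypothetical; only `torch.cuda.is_available()` and `torch.cuda.get_device_capability()` are real PyTorch calls.

```python
def pick_precision_flags(compute_capability):
    """Choose mixed-precision flags from a CUDA compute capability tuple.

    Assumption: bf16 is used on compute capability 8.0+ (Ampere or newer),
    fp16 on older CUDA GPUs, and neither when no GPU is present.
    """
    if compute_capability is None:  # no CUDA GPU: stay in full precision
        return {"fp16": False, "bf16": False}
    major, _minor = compute_capability
    use_bf16 = major >= 8
    return {"fp16": not use_bf16, "bf16": use_bf16}


def detect_compute_capability():
    """Return (major, minor) for GPU 0, or None if torch or CUDA is missing."""
    try:
        import torch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    return torch.cuda.get_device_capability(0)


# The resulting flags could replace the hard-coded fp16/bf16 globals.
flags = pick_precision_flags(detect_compute_capability())
print(flags)
```

With a pre-Ampere GPU such as compute capability 7.5, this yields `fp16=True, bf16=False`, which matches the manual change described above.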

#2 After running the "# train" cell, an OutOfMemoryError was raised: "CUDA out of memory. Tried to allocate 2.00 GiB (GPU 0; 14.75 GiB total capacity; 10.23 GiB already allocated; 790.81 MiB free; 12.92 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF"
This time I have no idea how to fix it :(
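
On the second error: in the message above, reserved memory (12.92 GiB) is not far above allocated memory (10.23 GiB), so the GPU may simply be close to full rather than fragmented. The sketch below shows two common mitigations under that assumption; it is not a fix confirmed by the repository author. `shrink_batch` and the starting values are hypothetical. `PYTORCH_CUDA_ALLOC_CONF` is the variable named in the error message, and the keys in `overrides` are existing Hugging Face `TrainingArguments` parameters.

```python
import os

# Allocator hint named in the error message. It has to be set before the
# first CUDA allocation (simplest: before importing torch).
# 128 is an illustrative value, not a tuned one.
os.environ.setdefault("PYTORCH_CUDA_ALLOC_CONF", "max_split_size_mb:128")


def shrink_batch(per_device_batch_size, gradient_accumulation_steps):
    """Halve the per-device batch and double the accumulation steps.

    For an even batch size the effective batch size (the product of the
    two values) is unchanged, while fewer samples are held in memory at once.
    """
    if per_device_batch_size <= 1:
        return per_device_batch_size, gradient_accumulation_steps
    return per_device_batch_size // 2, gradient_accumulation_steps * 2


batch, accum = 4, 1  # hypothetical starting values, not the notebook's
batch, accum = shrink_batch(batch, accum)

# Values that could be passed on to the training arguments.
overrides = {
    "per_device_train_batch_size": batch,
    "gradient_accumulation_steps": accum,
    "gradient_checkpointing": True,  # trades extra compute for lower memory
}
print(overrides)
```

If the error persists at a per-device batch size of 1, shortening the maximum sequence length is another lever that is often tried.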
