mathematicator / crypto_research

In this project, we try to find the underlying relationships between the 14 coins, as suggested in the competition host's introductory notebook.

Python 0.01% Jupyter Notebook 99.99% CSS 0.01%

crypto_research's Introduction

🚀 Welcome to my GitHub page

---

Hello! 🖐️ I am a Senior AI & Data Science Engineer with 6 years of experience. As a Senior Data Scientist, Machine Learning Engineer, and MLOps expert, I specialize in Natural Language Processing (NLP) and Large Language Models (LLMs) in the field of Generative AI. I excel in designing and implementing data-driven models tailored to client needs.

🔍 Areas of Passion:

  • Advanced NLP techniques including:
    • LLMs 🤖
    • QLoRA 🦄
    • LoRA 🦜
    • RAG (retrieval-augmented generation) 🕸️
  • 🤿 Currently, I'm immersed in exploring augmented data generation techniques for NLP tasks.
  • 💬 Got queries about NLP, AI, or machine learning? Don't hesitate to ask! 🧠

🧠 Expertise Areas

  • Large Language Models (LLMs) 🤖
  • MLE | MLOps 🚀
  • Data Analysis 📊
  • Natural Language Processing 📝
  • Dashboard Development 📈
  • Business Intelligence 📉
  • Data Management & Transformation ⚙️
  • Machine Learning 🦾
  • Deep Learning 🧬
  • Data Visualization 🖼

Whether you're looking to tell a compelling story with your data, develop a real-time dashboard with KPIs for monitoring your company's health, or explore natural language processing solutions 🗣, I can assist you in your next venture.

Over the years, I've successfully managed projects worth over €1M across 12+ countries 🌍 for major clients including the European Parliament, Kering, Atos, Renault Nissan Mitsubishi, Damart, and more. I wear multiple hats as a 🧑‍🔬 Data Scientist, 📊 Data Analyst, and 🧑‍💼 Project Manager with both functional and managerial expertise.

🏆 Certifications

🏆 Hackathons

🏆 1st Prize "Hack to Act" Kering

Location: Paris (October 2020)
Prize: €10,000

  • Objective: Development of a prediction/recommendation platform based on AI.
  • Description:
    • Prediction of the environmental impact of Kering's various activities throughout the supply chain.
    • Accurate evaluation of environmental impacts: resource depletion, biodiversity, greenhouse gases.
    • Decision support for designers, material researchers, and consumers in their choices to reduce the impact of luxury.
    • Automatic creation of predictive models from user-provided data or directly from Kering data and integrated native models.
    • Recommendation technique: Collaborative-based method, Content-based method, Hybrid method.
  • Results: The team won 1st place for the best platform for predicting the environmental footprint of Kering's products.

🏆 1st Prize Accenture Hackathon

Location: Paris (February 2019)

🧑‍🎨 My Recent Roles

📊 Senior Data Scientist | LLM Expert - TOTAL ENERGIES, Paris

Objective: SQL Chatbot for Database Management

  • Led the development of an advanced SQL chatbot to enhance database querying and data visualization using NLP and LLMs.
  • Architected the SQL chatbot leveraging LangChain and OpenAI's GPT-4, enabling intuitive data visualizations and command translations.
  • Enhanced model efficiency & performance using LLMs.
  • Designed a full-stack solution hosted on Azure SQL Database, integrating Azure Bot Services and Azure Language Understanding (LUIS) for a dynamic user interface.
  • Optimized model performance with hyperparameter tuning.
  • Adapted the model to cater to different building types.
  • Result: Enhanced model efficiency & performance using advanced NLP techniques & LLMs.
  • Technical Stack: Python, Azure, PostgreSQL, LangChain, Streamlit, GitLab, Azure DevOps

🧑‍💼 IFACI – LLM Expert

Objective: Generative AI Assistant for the Auditing Profession

  • Role: Generative AI Assistant for the Auditing Profession
  • Developed a generative AI base and an assistant for field agents, enhancing natural language understanding capabilities using spaCy.
  • Utilized a combination of Azure, LanceDB, RAG, vector stores, HNSW, and hybrid search technologies to optimize performance.
  • Result: Improved efficiency and accuracy in the auditing process through the implementation of advanced AI techniques.
  • Technical Stack: Python, Azure, Neo4j, Azure DevOps, LanceDB, Chroma, Milvus, MLflow, HNSW, Hybrid search

🧑‍💼 GSF – Senior Data Scientist / MLE Architect

Objective: Predictive Maintenance for Cleaning Services

  • Role: Workplace Accident Prediction + Explainability
  • Spearheaded the implementation of CI/CD pipelines and developed a system for the evaluation of prediction explainability.
  • Utilized Azure Machine Learning, Azure Data Factory, Azure Pipelines, Azure DevOps, and integrated with Snowflake and Control-M for workflow management.
  • Result: Improved workplace safety through accurate accident prediction and enhanced model explainability.
  • Technical Stack: Python, Azure, MLflow, Databricks, Pandas, Terraform, TensorFlow, Scikit-learn, PyTest, Docker, Azure Machine Learning, Azure Data Factory, Azure Pipelines, Azure DevOps, Snowflake, Control-M

📧 NLP/LLM Expert / Senior Data Scientist - THUASNE

Objective: Email Order System (1K orders/day)

  • Created an email order management system and a multimodal model for information extraction employing BERT, Azure, ChatGPT, NLP, LLMs, and Melusine.
  • Enhanced the system with Azure Document AI for advanced document processing.
  • Improved order processing efficiency & anomaly detection.
  • Used explainability tools like LIMETextExplainer, ELI5NLP, SHAP, and AnchorsNLP.
  • Optimized system performance with hyperparameter tuning.
  • Adapted the system to cater to different orthopedic domains.
  • Result: Enhanced system efficiency & performance using advanced NLP techniques, LLMs, and the Melusine tool.
  • Technical Stack: BERT, Azure, OpenAI, NLP, LLMs, Melusine, Azure Document AI, Scikit-learn, Docker

📨 Senior Data Scientist - ADELAIDE

Objective: Automatic Email Processing (10K emails/day)

  • Developed an explainability module for email classification and automatic responses using open-source tools such as Melusine, LIMETextExplainer, ELI5NLP, SHAP, AnchorsNLP, and integrations with Hugging Face and RASA.
  • Managed version control and continuous integration using Git and CI/CD practices.
  • Result: Streamlined email processing and improved response accuracy through the implementation of advanced NLP techniques and explainability tools.
  • Technical Stack: Melusine, LIMETextExplainer, CNN, ELI5NLP, SHAP, AnchorsNLP, Hugging Face, RASA, Git, CI/CD

🕵️‍♂️ LLM Expert - ACOSS/URSSAF/CNAF/CNAM

Objective: Documentary AI for Social Fraud Prevention

  • Developed a demonstrator for multimodal processing of large data volumes using Transformers, LayoutLM, OCR, NLP, and Topic Modeling to detect fraud.
  • Result: Enhanced fraud detection capabilities through the implementation of advanced AI techniques for multimodal data processing.
  • Technical Stack: Transformers, OpenCV, PyTorch, CNN, LayoutLM, OCR, NLP, Topic Modeling

🕵️‍♀️ Senior Data Scientist - Quantmetry, Paris

Objective: Documentary AI Fraud Demonstrator

  • Designed a demonstrator for document processing (insurance invoices).
  • Detected fraudulent patterns & document falsification.
  • Extracted key invoice fields & verified their consistency.
  • Identified potentially suspicious overbilling cases.
  • Result: Significant improvement in fraud detection using AI, outperforming traditional OCR techniques.
  • Technical Stack: Python, Azure, OCR, OpenCV, PyTorch, CNN, NLP, Machine Learning, Scikit-learn, Docker

🚗 Senior Data Scientist - Stellantis, Paris

Objective: Part Forecasting: PFO – Technical Lead / Technical Expert

  • Provided 18-month forecasts to suppliers, mitigating semiconductor crisis impact.
  • Developed PFO architecture as a Streamlit web app hosted in Azure.
  • Integrated data from Oracle Exadata Database.
  • Used Azure Data Factory for file transfer & processing tasks.
  • Containerized the PFO app using Docker & deployed to Azure Container Registry.
  • Result: Enhanced inventory management & supplier collaboration, improving part prediction accuracy.
  • Technical Stack: Streamlit, MLflow, Airflow, Terraform, PyTest, Databricks, Azure, Oracle Exadata Database, Azure Data Factory, Docker, Azure Container Registry

💸 Project Manager / Technical Expert - ATOS, Grenoble

Objective: Travel & Expense Dashboard Atos (€10M+/year)

  • Developed a KPI dashboard to monitor Atos' expenses in real time, including geolocation and the carbon footprint of travel.
  • Recovered 10%+ of VAT and billable expense reports (€1M+ annual gain).
  • Saved more than 3,214 hours of work per year.
  • Result: Automation of weekly reports, significant cost savings, and improved efficiency through real-time expense monitoring and analysis.
  • Technical Stack: Pandas, Matplotlib, NumPy, Scikit-learn, Jupyter Notebook, Power BI

🗣️ Lead Data Scientist - ATOS, Grenoble

Objective: R&D – Expressive TTS System

  • Collected and adapted a large corpus of interactive behaviors in English (LJ Speech) and French (M-AILABS).
  • Developed and trained an expressive TTS system based on the Tacotron2 model by NVIDIA.
  • Implemented a methodology for evaluating the learning quality of the prototype based on the distribution of lengths (number of spectrogram frames) of the predicted clips compared to the originals.
  • Prepared a scientific paper: "Linking Utterances via Punctuations for Improved End-to-End Speech Synthesis".
  • Captured the variability of speaking styles and emotional states and linked their synthesis to the user profile, to improve prediction in the text-to-speech (TTS) system.
  • Result: Improved robustness and accuracy of TTS mel-spectrogram generation, and better control and generation of styles, verbal behaviors, and prosody based on the user.
  • Technical Stack: PyTorch, TensorFlow, Python, LSTM, Transformers, Attention Mechanism
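
The length-based evaluation mentioned above (comparing the number of spectrogram frames in predicted clips with the originals) could look roughly like the sketch below. This is not the project's actual code: the function names `ks_statistic` and `length_report`, the use of a two-sample Kolmogorov-Smirnov statistic as the distance between the two length distributions, and the sample frame counts are all assumptions made for illustration.

```python
from bisect import bisect_right
from statistics import mean


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic.

    Returns the largest absolute gap between the empirical CDFs of the
    two samples: 0.0 for identical distributions, 1.0 for disjoint ones.
    """
    a, b = sorted(sample_a), sorted(sample_b)
    if not a or not b:
        raise ValueError("both samples must be non-empty")
    gap = 0.0
    # The empirical CDFs only change at observed values, so it is
    # enough to compare them at every value seen in either sample.
    for x in set(a) | set(b):
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap


def length_report(predicted_frames, reference_frames):
    """Summarize how predicted clip lengths compare with the originals.

    Both arguments are lists of frame counts, one entry per clip
    (hypothetical inputs; how they are extracted is left out here).
    """
    return {
        "mean_predicted": mean(predicted_frames),
        "mean_reference": mean(reference_frames),
        "ks": ks_statistic(predicted_frames, reference_frames),
    }


if __name__ == "__main__":
    # Made-up frame counts, for illustration only.
    predicted = [410, 395, 620, 388]
    reference = [400, 402, 415, 390]
    print(length_report(predicted, reference))
```

Under these assumptions, a `ks` value near 0 means the predicted clips have roughly the same length distribution as the originals, while a value near 1 could point to a systematic problem such as clips that are cut short or run far too long.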

🚗 Data Scientist - Renault Nissan Mitsubishi, Paris

Objective: Automotive Industry – Home to Car Next Generation Alliance

  • Built prototypes and developed the first generation of voice assistants for the Renault Nissan Mitsubishi alliance.
  • Integrated the Google Assistant with Nissan cars to receive information from the car and control it remotely from your phone or from a Google Home.
  • Connected to the authentication servers of the RNM Alliance and complied with cybersecurity specifications.
  • Deployed Alexa and Google Actions project environments fully configured and ready to use.
  • Documented user journey to configure the service.
  • Result: Launched these features with the Nissan Juke at the 2019 Frankfurt Motor Show.
  • Technical Stack: Python, Azure, TensorFlow, Keras, Dialogflow, LUIS, Reddit, Alexa Skill, Bot Framework

🏨 Data Scientist - ATOS (European Parliament), Grenoble

Objective: Service – SAMBOT, an intelligent conversational agent for room reservation (3,000+ users)

  • Developed a multilingual chatbot for room reservation in natural language (text and voice).
  • Implemented a recommendation system based on user habits, locations, and room occupancy.
  • Integrated with Outlook calendars & email systems (Skype).
  • Documented functional, technical, and user journey aspects.
  • Result: Completed room reservations in record time by taking user habits into account.
  • Technical Stack: Python, Azure, OCR, OpenCV, PyTorch, CNN, NLP, Machine Learning, Scikit-learn, Docker

👨‍💻 Languages

Python PySpark R C++

☁️ Cloud Services

Microsoft Azure AWS Google Cloud

🐳 Containers & Orchestration

Docker Kubernetes

🤖 Machine Learning

TensorFlow PyTorch Scikit-Learn Airflow

🌐 NLP

NLTK spaCy Hugging Face Lambda ChatGPT Llama

📊 Data Science

Jupyter Notebook LlamaIndex NumPy Pandas Matplotlib TensorFlow PyTorch Scikit-learn OpenCV Power BI Tableau Databricks Qlik SAS Microsoft Azure Google Cloud

🌐 Web Development

Flask Django

🔧 Infrastructure

Terraform GitLab Jenkins Kafka Linux Pytest

🗃️ Databases

MySQL PostgreSQL MongoDB Microsoft SQL Server Oracle Neo4j

🛠️ Engineering

Git MLflow Terraform Ansible Puppet Chef

📈 Big Data

Apache Kafka Apache Spark Apache Hadoop Apache Cassandra Apache Hive Apache Storm Apache NiFi

📋 Functional/Managerial Skills 📋

  • Project Management 📅
    • Preparation 📝
    • Planning 🗓️
    • Management 📊
    • Evaluation 📈
    • Monitoring and Control of:
      • Resources 💰
      • Calendar 📆
      • Costs 💸
      • Scope 🎯
      • Risk 🚨
      • Quality 🌟
      • Requirements 📑
      • Value 💎
      • Satisfaction 😊
  • Tools: TFS, MS Project, Gantt, PERT

Let's connect and collaborate! 🚀
