
rishabbh-sahu / information_retrieval


Given a document, identify the closest documents in a list of documents using a tf-idf matrix and cosine similarity.
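The tf-idf and cosine similarity approach described above can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the repository's actual code, which may differ (for example, it may rely on a library vectorizer). The function names (`tokenize`, `build_tfidf`, `vectorize`, `cosine`, `closest_documents`), the whitespace tokenizer, the smoothed IDF formula, and the sample sentences are all assumptions made for this example.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def vectorize(tokens, idf):
    """Weight raw term counts by IDF; terms unseen in the corpus are dropped."""
    counts = Counter(tokens)
    return {term: count * idf[term] for term, count in counts.items() if term in idf}


def build_tfidf(docs):
    """Return (vectors, idf), with one sparse {term: weight} dict per document."""
    tokenized = [tokenize(doc) for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: the number of documents each term appears in.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    # Smoothed IDF (an assumption), so terms found in every document keep a positive weight.
    idf = {term: math.log((1 + n_docs) / (1 + count)) + 1 for term, count in df.items()}
    return [vectorize(tokens, idf) for tokens in tokenized], idf


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def closest_documents(query, docs, top_k=3):
    """Return up to top_k (doc_index, similarity) pairs, most similar first."""
    vectors, idf = build_tfidf(docs)
    query_vec = vectorize(tokenize(query), idf)
    scored = [(i, cosine(query_vec, vec)) for i, vec in enumerate(vectors)]
    # Sort by descending similarity; ties fall back to the original order.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:top_k]


if __name__ == "__main__":
    # Hypothetical sample corpus, for illustration only.
    corpus = [
        "the pump failed due to overheating",
        "invoice payment was delayed",
        "overheating caused the motor to fail",
    ]
    for index, score in closest_documents("pump overheating failure", corpus):
        print(index, round(score, 3), corpus[index])
```

Sparse dictionaries keep the sketch short. For a large corpus, a matrix representation with L2-normalized rows would turn the whole ranking step into a single matrix-vector product.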

Python 100.00%
tfidf-vectorizer text-vectorization information-retrieval matrix-multiplication similarity-search similar-patterns root-cause-analysis lookalike-queries

information_retrieval's Introduction

Hi 👋, I'm Rishabbh

Data scientist and consultant with experience across multiple industries, including retail, manufacturing, FMCG, banking (fintech), insurance, and consumer business.

I'm currently:
  • 🔭 Working on deep learning projects such as transfer learning, pre-trained models (transformers), incremental models, layer pruning, quantization and distillation, and language models for downstream NLP tasks such as a document parsing pipeline (banking sector), summarization, Q&A, NLU/NLG, and context-based auto-completion; also REST APIs (Flask endpoints), deployment and productionization, Docker images and containers, Kubernetes, GCR/GCP, CI/CD pipelines, GitHub hooks/actions (pre/post commit, workflows), DVC pipelines, MongoDB, Google OCR, Label Studio, Transformer model interpretability, etc.

  • 🌱 Learning autoencoders, self-supervised learning, optimization, time series analysis using deep learning (DeepState model in GluonTS), linear programming (optimization), anomaly detection, feature learning, data compression techniques (SVD, matrix factorization), MLOps (ML pipelines), model interpretability (explainable AI), ablation studies, and TextRank (graph representation of text with PageRank)

  • 👀 Interested in research work and consulting assignments, and in sharing with, learning from, and exploring open source communities. Motivated to create numerous AI/ML/DL projects with a focus on deploying them to production.

  • 👯 Looking forward to collaborating on ML/DL projects and Kaggle competitions


About my work:
  • 👨‍💻 All of my projects are available at https://github.com/Rishabbh-Sahu

  • 💬 Ask me about NLU/NLP, intent/sequence/text/email classification, NER (named entity recognition), sentence/document/semantic similarity, information retrieval (tf-idf and context-based), time series analysis (forecasting), deep learning, data augmentation, dialog systems (voice models), PLMs (pre-trained language models), model ensembling/stacking, feature selection methods, segmentation, text tokenization, optimization, recommendation engines, customer-360 analysis, statistics, retail analytics, supply chain analytics, dimensionality reduction, regularization techniques, bias and variance, DOE (design of experiments: ANOVA, T/F/Chi^2/Z-tests), KS-test, sampling methods, A/B testing, crowd-sourcing (Toloka, MTurk, etc.), BigQuery, SQL

  • 📫 You can reach me on www.linkedin.com/in/rishabbh-sahu-pmp



Languages and Tools:

Python, TensorFlow, R, AWS, Azure, Git, Linux, GCP, MySQL, SQLite, Hadoop, Hive, Jenkins, MSSQL, Oracle, Postman, C, C++


