ML-You-Can-Use

Practical Machine Learning and Natural Language Processing with examples.

Featuring

Interesting applications of ML, NLP, and Computer Vision
Practical demonstration notebooks
Reproducible experiments
Illustrated best practices:
- Code extracted from notebooks for:
  - Automatic formatting with Black
  - Type checking via MyPy annotations
  - Linting via Pylint
  - Doctests whenever possible
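
As a rough illustration of how these practices fit together, here is a minimal sketch of a helper written in that style: formatted as Black would format it, annotated for MyPy, and documented with doctests. The function `normalize_whitespace` is hypothetical and is not taken from this repo.

```python
import doctest
import re


def normalize_whitespace(text: str) -> str:
    """Collapse runs of whitespace into single spaces and trim both ends.

    >>> normalize_whitespace("  arma   virumque cano ")
    'arma virumque cano'
    >>> normalize_whitespace("")
    ''
    """
    # Hypothetical example helper; not part of this repo's code.
    return re.sub(r"\s+", " ", text).strip()


if __name__ == "__main__":
    # Run the examples embedded in the docstrings; silent when they all pass.
    doctest.testmod()
```

Keeping the examples in the docstring means the documentation is checked every time the doctests run, so it is less likely to drift from the code.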

Setup

Clone this repo with git, then fetch it together with its submodules, e.g.:

git pull --recurse-submodules

The submodules pull in some of the data, along with external utilities used to preprocess it.

Install Python 3

Create Virtual Environment

mkdir p3
`which python3` -m venv ./p3
source setPythonHashSeed.sh
source p3/bin/activate
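
The script name suggests that `setPythonHashSeed.sh` exports the `PYTHONHASHSEED` environment variable, though its contents are not shown here, so treat that as an assumption. The sketch below shows why pinning that variable can help reproducibility: Python randomizes string hashes per process unless the seed is fixed. The helper `string_hash_in_subprocess` and the seed value `"0"` are illustrative choices, not something taken from this repo.

```python
import os
import subprocess
import sys


def string_hash_in_subprocess(text: str, seed: str) -> str:
    """Return hash(text) as printed by a fresh interpreter run with PYTHONHASHSEED=seed."""
    # Copy the current environment and pin the hash seed for the child process.
    env = dict(os.environ, PYTHONHASHSEED=seed)
    result = subprocess.run(
        [sys.executable, "-c", f"print(hash({text!r}))"],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


# With a pinned seed, two separate interpreter processes agree on the hash value.
first = string_hash_in_subprocess("corpus", seed="0")
second = string_hash_in_subprocess("corpus", seed="0")
print(first == second)
```

The interpreter reads `PYTHONHASHSEED` at startup, so the variable has to be set before Python is launched.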

Install Requirements

pip install -r requirements.txt

For running all notebook examples

pip install -r requirements-dev.txt

Note: some examples include a conda `environment.yaml` file; use it to create the environment for those examples.

Installing Test Corpora

Many notebooks use data that must be installed first. To install it, run the install script:

install_corpora.sh

This script:
- Installs Python SSL certificates
- Installs CLTK data for Latin and Greek
- Installs NLTK data

Testing

./runUnitTests.sh

Interactivity

jupyter notebook

Notebooks

Getting data

Extracting Occupation and Employer data from Wikidata

Labeling Data

Modeling Language

Detecting Duplicate Documents

Merge corpora by detecting and filtering duplicate documents

Classifying Texts

Detecting Loanwords

Wikipedia Corpus Processing

Quality Embeddings

Computer Vision - Object Detection

Summarizing Texts

Assessing Headline Generation

Searching and Search Relevance

Search Results Relevance using BERT

Repository: todd-cook / ml-you-can-use
