mohd-faizy / machine-learning-algorithms

This repository contains some popular machine learning algorithms, covering both theory and code implementations in Jupyter Notebooks.

License: MIT License

Jupyter Notebook 100.00%
machine-learning-algorithms linear-regression logistic-regression decision-trees support-vector-machines naive-bayes-classifier k-nearest-neighbors k-means-clustering random-forest dimensionality-reduction-algorithms

machine-learning-algorithms's Introduction

Machine Learning Algorithms

Classification by learning style:

⚫ Supervised learning

⚪ Unsupervised learning

⚫ Semi-supervised learning

⚪ Reinforcement learning


Classification by function:

 Regression algorithm

  • Linear regression
  • Logistic regression
  • Multivariate Adaptive Regression Splines (MARS)
  • Locally Estimated Scatterplot Smoothing (LOESS)

 Instance-based Learning Algorithm

  • k-Nearest Neighbors (kNN)
  • Learning Vector Quantization (LVQ)
  • Self-Organizing Map (SOM)
  • Locally Weighted Learning (LWL)

 Regularization Algorithm

  • Ridge Regression
  • LASSO(Least Absolute Shrinkage and Selection Operator)
  • Elastic Net
  • Least-Angle Regression (LARS)

 Decision tree Algorithm

  • Classification and Regression Tree (CART)
  • ID3 algorithm (Iterative Dichotomiser 3)
  • C4.5 and C5.0
  • CHAID(Chi-squared Automatic Interaction Detection)
  • Random Forest
  • Multivariate Adaptive Regression Splines (MARS)
  • Gradient Boosting Machine (GBM)

 Bayesian Algorithm

  • Naive Bayes
  • Gaussian Naive Bayes
  • Multinomial Naive Bayes
  • AODE(Averaged One-Dependence Estimators)
  • Bayesian Belief Network

 Kernel-based Algorithm

  • Support vector machine (SVM)
  • Radial Basis Function (RBF)
  • Linear Discriminant Analysis (LDA)
 

 Clustering Algorithm

  • k-Means
  • k-Medians
  • Expectation-Maximization (EM) algorithm
  • Hierarchical clustering
 

 Association Rule Learning

  •  Apriori algorithm
  •  Eclat algorithm
 

 Neural Networks

  • Perceptron
  • Backpropagation algorithm (BP)
  • Hopfield network
  • Radial Basis Function Network (RBFN)

 Deep Learning

  • Deep Boltzmann Machine (DBM)
  • Convolutional Neural Network (CNN)
  • Recurrent neural network (RNN, LSTM)
  • Stacked Auto-Encoder

 Dimensionality Reduction Algorithm

  • Principal Component Analysis (PCA)
  • Principal component regression (PCR)
  • Partial least squares regression (PLSR)
  • Sammon mapping
  • Multidimensional scaling analysis (MDS)
  • Projection pursuit method (PP)
  • Linear Discriminant Analysis (LDA)
  • Mixture Discriminant Analysis (MDA)
  • Quadratic Discriminant Analysis (QDA)
  • Flexible Discriminant Analysis (FDA)

 Ensemble Algorithm

 
  • Boosting
  • Bagging
  • AdaBoost
  • Stacked generalization (blending)
  • GBM algorithm
  • GBRT algorithm
  • Random forest

 Other Algorithms

 
  • Feature selection algorithm
  • Performance evaluation algorithm
  • Natural language processing
  • Computer vision
  • Recommender systems
  • Reinforcement learning
  • Transfer learning

 


Popular Machine Learning Algorithms

1️⃣Linear Regression:

# Import Library
# Import other necessary libraries like pandas, numpy...

from sklearn import linear_model

# Load train and test datasets
# Identify the feature and response variable(s);
# values must be numeric numpy arrays

x_train = input_variables_values_training_datasets
y_train = target_variables_values_training_datasets  
x_test = input_variables_values_test_datasets

# Create linear regression object
linear = linear_model.LinearRegression()

# Train the model using the training sets and
# check score (R^2 on the training data)

linear.fit(x_train, y_train)
linear.score(x_train, y_train)

# Equation coefficient and Intercept

print('Coefficient: \n', linear.coef_)
print('Intercept: \n', linear.intercept_)

#Predict Output 
predicted = linear.predict(x_test) 
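
To make the coefficient and intercept printed above more concrete, here is a minimal sketch of ordinary least squares for a single feature, written in plain Python. The function name `fit_simple_linear_regression` and the toy data are illustrative assumptions, not part of the repository; scikit-learn's `LinearRegression` handles multiple features and uses a different numerical routine.

```python
def fit_simple_linear_regression(xs, ys):
    """Return (slope, intercept) of the least-squares line for one feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = covariance(x, y) / variance(x)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    # The fitted line always passes through the point (mean_x, mean_y)
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Toy data generated from y = 2x + 1 (illustrative values only)
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
slope, intercept = fit_simple_linear_regression(xs, ys)
print('Coefficient:', slope)
print('Intercept:', intercept)
```

In this one-feature case, `slope` plays the role of `linear.coef_[0]` and `intercept` the role of `linear.intercept_`.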

2️⃣Logistic Regression:

# Import Library 
from sklearn.linear_model import LogisticRegression

# Assumes you have X (predictors) and y (target)
# for the training data set and x_test (predictors) for the test data set

# Create logistic regression object 
model = LogisticRegression()

# Train the model using the training sets and check score 
model.fit(X, y)
model.score(X, y)

# Equation coefficient and Intercept 
print('Coefficient: \n', model.coef_) 
print('Intercept: \n', model.intercept_)

# Predict Output
predicted = model.predict(x_test)
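
As a rough illustration of how a fitted coefficient vector and intercept turn into a prediction, the sketch below applies the logistic (sigmoid) function to a linear score for a binary problem. The helper names and the numeric values of `coef` and `intercept` are hypothetical, chosen only for the example; they are not taken from a trained model.

```python
import math


def predict_probability(features, coef, intercept):
    """Probability of the positive class for one sample under a logistic model."""
    # Linear score: intercept + w1*x1 + w2*x2 + ...
    z = intercept + sum(w * x for w, x in zip(coef, features))
    # The sigmoid squashes the score into the range (0, 1)
    return 1.0 / (1.0 + math.exp(-z))


def predict_label(features, coef, intercept, threshold=0.5):
    """Return 1 if the predicted probability reaches the threshold, else 0."""
    return 1 if predict_probability(features, coef, intercept) >= threshold else 0


# Hypothetical fitted values, for illustration only
coef = [1.5, -0.5]
intercept = -1.0

print(predict_probability([2.0, 1.0], coef, intercept))
print(predict_label([2.0, 1.0], coef, intercept))
print(predict_label([0.0, 2.0], coef, intercept))
```

A score of zero maps to a probability of exactly 0.5, which is why 0.5 is the usual decision threshold.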

3️⃣Decision Tree:

# Import Library
# Import other necessary libraries like pandas, numpy...

from sklearn import tree

# Assumes you have X (predictors) and y (target) for the
# training data set and x_test (predictors) for the test data set

# Create tree object for classification. The split criterion can be
# 'gini' or 'entropy' (information gain); the default is 'gini'.
model = tree.DecisionTreeClassifier(criterion='gini')

# For regression, use this instead:
# model = tree.DecisionTreeRegressor()

# Train the model using the training sets and check score 
model.fit(X, y)
model.score(X, y) 

# Predict Output 
predicted = model.predict(x_test) 
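
The two split criteria mentioned above can be sketched in a few lines of plain Python. This is only an illustration of the impurity measures themselves, under the assumption that a node is represented as a list of class labels; the function names are made up for the example, and scikit-learn's tree builder is considerably more involved.

```python
from collections import Counter
import math


def gini_impurity(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def entropy(labels):
    """Shannon entropy in bits, the basis of the information-gain criterion."""
    n = len(labels)
    return -sum(
        (count / n) * math.log2(count / n) for count in Counter(labels).values()
    )


mixed_node = [0, 0, 1, 1]   # evenly split between two classes
pure_node = [1, 1, 1, 1]    # a single class

print(gini_impurity(mixed_node), entropy(mixed_node))
print(gini_impurity(pure_node), entropy(pure_node))
```

Both measures are zero for a pure node and largest for an evenly mixed one, so a tree prefers splits whose child nodes score lower than the parent.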

4️⃣Support Vector Machine(SVM):

# Import Library
from sklearn import svm

# Assumes you have X (predictors) and y (target) for the
# training data set and x_test (predictors) for the test data set

# Create SVM classification object
model = svm.SVC()

# SVC has many options (kernel, C, gamma, ...); the defaults give a simple classifier.

# Train the model using the training sets & check the score
model.fit(X, y)
model.score(X, y)

# Predict Output 
predicted = model.predict(x_test) 

5️⃣Naive Bayes:

# Import Library
from sklearn.naive_bayes import GaussianNB

# Assumes you have X (predictors) and y (target) for the
# training data set and x_test (predictors) for the test data set

# Create Gaussian Naive Bayes object
model = GaussianNB()

# Other variants exist for other feature distributions, e.g. MultinomialNB and BernoulliNB

# Train the model using the training sets
model.fit(X, y)

# Predict Output 
predicted = model.predict(x_test) 

6️⃣K-Nearest Neighbors(kNN):

# Import Library 
from sklearn.neighbors import KNeighborsClassifier

# Assumes you have X (predictors) and y (target) for the
# training data set and x_test (predictors) for the test data set

# Create KNeighbors classifier object
model = KNeighborsClassifier(n_neighbors=6)  # default value for n_neighbors is 5


# Train the model using the training sets and check score
model.fit(X, y)

# Predict Output
predicted = model.predict(x_test) 
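
To show what the `n_neighbors` setting controls, here is a minimal brute-force sketch of k-nearest-neighbors classification in plain Python: find the closest training points by Euclidean distance and take a majority vote. The function name `knn_predict` and the toy points are assumptions made for illustration; `KNeighborsClassifier` offers more options (distance weighting, tree-based search) than this sketch.

```python
from collections import Counter
import math


def knn_predict(train_points, train_labels, query, n_neighbors=5):
    """Classify `query` by majority vote among its nearest training points."""
    # Sort training samples by Euclidean distance to the query point
    by_distance = sorted(
        zip(train_points, train_labels),
        key=lambda pair: math.dist(pair[0], query),
    )
    nearest_labels = [label for _, label in by_distance[:n_neighbors]]
    # The most common label among the neighbors wins
    return Counter(nearest_labels).most_common(1)[0][0]


# Two small, well-separated groups (illustrative values only)
train_points = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
train_labels = ["a", "a", "a", "b", "b", "b"]

print(knn_predict(train_points, train_labels, (0.5, 0.5), n_neighbors=3))
print(knn_predict(train_points, train_labels, (5.5, 5.5), n_neighbors=3))
```

A larger `n_neighbors` smooths the decision boundary, while a smaller one makes predictions more sensitive to individual training points.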

7️⃣k-Means Clustering:

# Import Library
from sklearn.cluster import KMeans

# Assumes you have X (attributes) for the training data set
# and x_test (attributes) for the test data set

# Create KMeans clustering object
model = KMeans(n_clusters=3, random_state=0)

# Fit the model to the training data (no target is needed)
model.fit(X)

#Predict Output 
predicted = model.predict(x_test) 

8️⃣Random Forest:

# Import Library
from sklearn.ensemble import RandomForestClassifier

# Assumes you have X (predictors) and y (target) for the
# training data set and x_test (predictors) for the test data set

# Create Random Forest object
model = RandomForestClassifier()

# Train the model using the training sets
model.fit(X, y)

# Predict Output 
predicted = model.predict(x_test) 

9️⃣Dimensionality Reduction Algorithms(e.g. PCA):

# Import Library 
from sklearn import decomposition

# Assumes you have training and test data sets named train and test

# Create PCA object
pca = decomposition.PCA(n_components=k)  # default value of k = min(n_samples, n_features)

# For factor analysis
fa = decomposition.FactorAnalysis()

# Reduce the dimension of the training dataset using PCA
train_reduced = pca.fit_transform(train)

# Reduce the dimension of the test dataset
test_reduced = pca.transform(test)

1️⃣0️⃣Gradient Boosting & AdaBoost(e.g. GBDT):

 
# Import Library 
from sklearn.ensemble import GradientBoostingClassifier

# Assumes you have X (predictors) and y (target) for the
# training data set and x_test (predictors) for the test data set

# Create Gradient Boosting Classifier object
model = GradientBoostingClassifier(n_estimators=100,
                                   learning_rate=1.0, max_depth=1, random_state=0)

# Train the model using the training sets
model.fit(X, y) 

# Predict Output 
predicted = model.predict(x_test) 
