This project forked from nihil21/custom-svm

Custom implementation of Support Vector Machines using Python and NumPy

License: GNU General Public License v3.0

custom-svm

Custom implementation of Support Vector Machines using Python and NumPy, as part of the Combinatorial Decision Making and Optimization university course (Master in Artificial Intelligence, Alma Mater Studiorum - University of Bologna).

Authors

Mattia Orlandi
Lorenzo Mario Amorosa

Requirements

This project requires the following libraries:

  • numpy for basic operations on matrices;
  • cvxopt for the quadratic convex optimization;
  • scikit-learn for generating and splitting the dataset, assessing accuracy, and comparing our implementation with SVC;
  • matplotlib for plotting graphs.

The complete list of packages used in the virtual environment is in requirements.txt; to install them all, run the command pip install -r requirements.txt (preferably inside an Anaconda environment).

Design and Implementation: Overview

The repository is structured in the following way:

We also provide a script version (custom-svm/svm_usecase.py) of the Jupyter notebook, which can be run either in a terminal or in Spyder (recommended). For clarity, we suggest reading at least the notebook comments.

Lagrangian Formulation of the SVM and Optimization

The Lagrangian problem for SVM is formulated as follows:

$$\min_{w, b, \xi} \; \frac{1}{2} \lVert w \rVert^2 + C \sum_{i=1}^{m} \xi_i$$

To integrate the soft margin into the formulation, a slack variable $\xi_i$ is introduced for each data point $x_i$; it represents the distance of $x_i$ from the corresponding class margin if $x_i$ lies on the wrong side of that margin, and is zero otherwise. In other words, $\xi_i$ is the penalty for the misclassified data point $x_i$, and $C$ controls the trade-off between the amount of misclassified samples and the size of the margin.
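
As a minimal sketch of the slack variables: for a given hyperplane, the smallest admissible slack of a point is $\max(0, 1 - y_i(\langle w, x_i \rangle + b))$, i.e. the amount by which the point falls short of its class margin (measured in units of the decision function). The helper name `slack_variables` and the toy data are illustrative assumptions, not part of the repository.

```python
def slack_variables(X, y, w, b):
    """Return the slack of every sample for the hyperplane (w, b)."""
    slacks = []
    for x_i, y_i in zip(X, y):
        # Decision value <w, x_i> + b
        score = sum(w_k * x_k for w_k, x_k in zip(w, x_i)) + b
        # Zero on the correct side of the margin, positive otherwise
        slacks.append(max(0.0, 1.0 - y_i * score))
    return slacks


# Toy data (illustrative values only): three positive samples
X = [[2.0, 0.0], [0.5, 0.0], [-1.0, 0.0]]
y = [1, 1, 1]
print(slack_variables(X, y, w=[1.0, 0.0], b=0.0))
```

The first sample lies beyond its margin (no penalty), the second lies inside the margin, and the third is misclassified and therefore receives the largest slack.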

Every point must satisfy the following constraint:

$$y_i \left( \langle w, x_i \rangle + b \right) \ge 1 - \xi_i, \qquad \xi_i \ge 0$$

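By integrating it into the Lagrangian, with multipliers $\lambda_i \ge 0$ and $\mu_i \ge 0$, the following is obtained:

$$L(w, b, \xi, \lambda, \mu) = \frac{1}{2} \lVert w \rVert^2 + C \sum_{i=1}^{m} \xi_i - \sum_{i=1}^{m} \lambda_i \left[ y_i \left( \langle w, x_i \rangle + b \right) - 1 + \xi_i \right] - \sum_{i=1}^{m} \mu_i \xi_i$$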

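Its dual problem is formulated as follows:

$$\max_{\lambda} \; \sum_{i=1}^{m} \lambda_i - \frac{1}{2} \sum_{i=1}^{m} \sum_{j=1}^{m} \lambda_i \lambda_j y_i y_j \langle x_i, x_j \rangle$$

subject to:

$$0 \le \lambda_i \le C, \quad i = 1, \dots, m$$

$$\sum_{i=1}^{m} \lambda_i y_i = 0$$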

It is a quadratic optimization problem that can be solved with the Python library cvxopt, so it is necessary to match the solver's API which, according to the documentation, is of the form:

$$\min_{x} \; \frac{1}{2} x^T P x + q^T x$$

subject to:

$$G x \preceq h$$

$$A x = b$$

Let $H$ be a matrix such that $H_{ij} = y_i y_j \langle x_i, x_j \rangle$, then the function to optimize becomes:

$$\max_{\lambda} \; \sum_{i=1}^{m} \lambda_i - \frac{1}{2} \lambda^T H \lambda$$

We then convert the sums into vector form and multiply both the objective and the constraint by −1, which turns this into a minimization problem and reverses the inequality in the constraints. The optimization problem can be written as:

$$\min_{\lambda} \; \frac{1}{2} \lambda^T H \lambda - \mathbf{1}^T \lambda$$

subject to:

$$-\lambda_i \le 0$$

$$\lambda_i \le C$$

$$y^T \lambda = 0$$
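
The equivalence between the sum form and the vector form can be checked numerically. The following is a small sketch under stated assumptions: the function names and the toy data are hypothetical, and a plain dot product is used as the inner product.

```python
import numpy as np


def dual_objective_sums(lambdas, X, y):
    """Dual objective (to be maximized), written with explicit sums."""
    m = len(lambdas)
    value = float(np.sum(lambdas))
    for i in range(m):
        for j in range(m):
            value -= 0.5 * lambdas[i] * lambdas[j] * y[i] * y[j] * np.dot(X[i], X[j])
    return value


def qp_objective_matrix(lambdas, X, y):
    """Negated objective (to be minimized), in the matrix form a QP solver expects."""
    H = np.outer(y, y) * (X @ X.T)  # H[i, j] = y_i * y_j * <x_i, x_j>
    return float(0.5 * lambdas @ H @ lambdas - np.sum(lambdas))


# Toy data (illustrative values only); the multipliers satisfy sum_i lambda_i * y_i = 0
X = np.array([[1.0, 2.0], [2.0, 0.5], [-1.0, -1.5]])
y = np.array([1.0, 1.0, -1.0])
lambdas = np.array([0.3, 0.1, 0.4])

# The two forms differ only by the sign introduced when multiplying by -1
print(np.isclose(dual_objective_sums(lambdas, X, y), -qp_objective_matrix(lambdas, X, y)))
```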

It is now necessary to convert the numpy arrays that express the optimization problem into the cvxopt format. With m the number of samples, and using the same notation as in the documentation, this gives:

  • $P := H$, a matrix of size m×m
  • $q := -\mathbf{1}$, a vector of size m×1
  • $G$, a matrix of size 2m×m, obtained by vertically concatenating a diagonal matrix of -1s of size m×m with a diagonal matrix of 1s of size m×m
  • $h$, a vector of size 2m×1, with zeros in the first m cells and C in the other m cells
  • $A := y^T$, the label vector of size m×1, passed to the solver as a row
  • $b := 0$, a scalar

Note that in the hard-margin case the upper-bound constraints on the Lagrange multipliers are not given, hence $G$ and $h$ are smaller in that case (m×m and m×1 respectively).
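
The sizes of the inequality constraints in the two cases can be checked with plain NumPy before anything is handed to cvxopt. This is only a sketch: `build_inequality_constraints` and `is_feasible` are hypothetical helpers, and hard margin is signalled here by `C=None`, which is an assumption of this example.

```python
import numpy as np


def build_inequality_constraints(m, C=None):
    """Return (G, h) such that G @ lambdas <= h encodes the bounds on the multipliers."""
    if C is None:
        # Hard margin: only -lambda_i <= 0
        return -np.eye(m), np.zeros(m)
    # Soft margin: -lambda_i <= 0 stacked on top of lambda_i <= C
    G = np.vstack((-np.eye(m), np.eye(m)))
    h = np.hstack((np.zeros(m), np.full(m, C)))
    return G, h


def is_feasible(lambdas, G, h, tol=1e-9):
    """Check the inequality constraints component-wise."""
    return bool(np.all(G @ np.asarray(lambdas) <= h + tol))


G_soft, h_soft = build_inequality_constraints(3, C=1.0)
G_hard, h_hard = build_inequality_constraints(3)
print(G_soft.shape, h_soft.shape)  # 2m x m and 2m
print(G_hard.shape, h_hard.shape)  # m x m and m
```

A multiplier larger than C violates the soft-margin constraints but is still admissible under hard margin, which is exactly the difference between the two branches.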

Kernel trick

Since the hyperplane is a linear function, the SVM model defined so far is suited only to linearly separable datasets, which is not very useful in real-world scenarios.
To enable correct classification in the non-linear case, the data to classify is mapped by a function $\phi$ into a new space in which it is linearly separable, and thus in which SVM can be applied.

However, computing the mapping $\phi(x_i)$ for every $x_i$ is computationally expensive; therefore, since only the product $\langle \phi(x_i), \phi(x_j) \rangle$ is relevant as far as fitting and classification are concerned, only the mapping of such product is considered (kernel trick):

$$K(x_i, x_j) = \langle \phi(x_i), \phi(x_j) \rangle$$

where $K$ is called kernel function, and it can be:

  • dot product (linear case);
  • polynomial;
  • radial basis function;
  • sigmoid.
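
A dependency-free sketch of the four kernels is given here for reference. The parameter names `gamma`, `deg` and `r` are the ones used when creating the model; the exact formulas follow the common scikit-learn-style conventions, which is an assumption about how this repository defines them.

```python
import math


def dot(x, z):
    """Plain dot product of two equally long sequences."""
    return sum(a * b for a, b in zip(x, z))


def linear_kernel(x, z):
    return dot(x, z)


def polynomial_kernel(x, z, gamma=1.0, r=0.0, deg=3):
    # (gamma * <x, z> + r) ** deg
    return (gamma * dot(x, z) + r) ** deg


def rbf_kernel(x, z, gamma=1.0):
    # exp(-gamma * ||x - z||^2)
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * squared_distance)


def sigmoid_kernel(x, z, gamma=1.0, r=0.0):
    # tanh(gamma * <x, z> + r)
    return math.tanh(gamma * dot(x, z) + r)
```

Each function takes two samples and returns a scalar, so any of them can play the role of the kernel function when the kernel matrix is filled entry by entry.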

In the Python code, the parameters needed by the solver are defined as follows, according to the guidelines given above:

       # Requires: import itertools, cvxopt, numpy as np
       # Kernel (Gram) matrix: K[i, j] = kernel_fn(x_i, x_j)
       K = np.zeros(shape=(n_samples, n_samples))
       for i, j in itertools.product(range(n_samples), range(n_samples)):
           K[i, j] = self.kernel_fn(X[i], X[j])
       # Objective: P[i, j] = y_i * y_j * K[i, j], q = vector of -1s
       P = cvxopt.matrix(np.outer(y, y) * K)
       q = cvxopt.matrix(-np.ones(n_samples))
       # Compute G and h matrix according to the type of margin used
       if self.C:
           # Soft margin: 0 <= lambda_i <= C
           G = cvxopt.matrix(np.vstack((-np.eye(n_samples),
                                        np.eye(n_samples))))
           h = cvxopt.matrix(np.hstack((np.zeros(n_samples),
                                        np.ones(n_samples) * self.C)))
       else:
           # Hard margin: only lambda_i >= 0
           G = cvxopt.matrix(-np.eye(n_samples))
           h = cvxopt.matrix(np.zeros(n_samples))
       # Equality constraint: sum_i lambda_i * y_i = 0 (cvxopt needs doubles)
       A = cvxopt.matrix(y.reshape(1, -1).astype(np.double))
       b = cvxopt.matrix(np.zeros(1))

       # Solve the QP; sol['x'] holds the optimal Lagrange multipliers
       sol = cvxopt.solvers.qp(P, q, G, h, A, b)

The support vectors are then extracted from the variable sol: they are the samples whose Lagrange multipliers are positive (in practice, above a small numerical threshold).

       lambdas = np.ravel(sol['x'])
       is_sv = lambdas > 1e-5
       self.sv_X = X[is_sv]
       self.sv_y = y[is_sv]
       self.lambdas = lambdas[is_sv]

Computation of the separating hyperplane

It is then possible to compute $w$ (if the kernel is linear) and $b$, which are the parameters of the "hyperplane" that separates the classes:

$$w = \sum_{i=1}^{m} \lambda_i y_i x_i$$

And, given $S$ as the set of the support vectors:

$$b = \frac{1}{|S|} \sum_{s \in S} \left( y_s - \sum_{t \in S} \lambda_t y_t K(x_s, x_t) \right)$$

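In the Python code the computation is made as follows:

        # w is meaningful only for a linear kernel
        self.w = np.zeros(n_features)
        for i in range(len(self.lambdas)):
            self.w += self.lambdas[i] * self.sv_X[i] * self.sv_y[i]
        # Indices of the support vectors within the training set
        sv_index = np.arange(n_samples)[is_sv]
        # b: average over the support vectors of y_s - sum_t lambda_t * y_t * K(x_s, x_t)
        self.b = 0
        for i in range(len(self.lambdas)):
            self.b += self.sv_y[i]
            self.b -= np.sum(self.lambdas * self.sv_y * K[sv_index[i], is_sv])
        self.b /= len(self.lambdas)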
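
The same two quantities can also be written without explicit loops. The sketch below is a vectorized equivalent under stated assumptions: `hyperplane_params` is a hypothetical standalone helper (not a method of the repository's class), and `K_sv` is assumed to be the kernel matrix restricted to the support vectors.

```python
import numpy as np


def hyperplane_params(lambdas, sv_X, sv_y, K_sv):
    """Vectorized w (linear kernel only) and b computed from the support vectors.

    K_sv[i, j] is the kernel value between support vectors i and j.
    """
    coef = lambdas * sv_y                    # lambda_s * y_s for each support vector
    w = coef @ sv_X                          # sum_s lambda_s * y_s * x_s
    b = float(np.mean(sv_y - K_sv @ coef))   # average of y_s - sum_t coef_t * K(x_s, x_t)
    return w, b


# Toy problem (illustrative values only): two points on the x-axis, linear kernel
sv_X = np.array([[1.0, 0.0], [-1.0, 0.0]])
sv_y = np.array([1.0, -1.0])
lambdas = np.array([0.5, 0.5])
w, b = hyperplane_params(lambdas, sv_X, sv_y, sv_X @ sv_X.T)
print(w, b)
```

For this symmetric toy problem the separating hyperplane is the vertical axis, so w points along the first coordinate and b vanishes.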

Prediction of the class label

Given the set $S$ of support vectors, an input $x$ is assigned to a class label $\hat{y}$ with the following formula. As a side note, in the case of a linear kernel simply taking the dot product between the input and the support vectors is enough.

$$\hat{y} = \operatorname{sign} \left( \sum_{s \in S} \lambda_s y_s K(x, x_s) + b \right)$$

In code:

        y_predict = 0
        for lamda, sv_X, sv_y in zip(self.lambdas, self.sv_X, self.sv_y):
            y_predict += lamda * sv_y * self.kernel_fn(X, sv_X)
        y_predict = np.sign(y_predict + self.b)
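
As a standalone illustration of the prediction rule, the sketch below classifies a small batch of inputs one sample at a time. The function names `decision_function` and `predict`, and the toy support vectors, are assumptions made for this example and do not mirror the repository's class.

```python
import numpy as np


def decision_function(X_new, lambdas, sv_X, sv_y, b, kernel_fn):
    """Raw score sum_s lambda_s * y_s * K(x, x_s) + b for every row x of X_new."""
    return np.array([
        sum(l * y_s * kernel_fn(x, x_s) for l, x_s, y_s in zip(lambdas, sv_X, sv_y)) + b
        for x in X_new
    ])


def predict(X_new, lambdas, sv_X, sv_y, b, kernel_fn):
    """Class labels in {-1, +1}; np.sign returns 0 for a point exactly on the hyperplane."""
    return np.sign(decision_function(X_new, lambdas, sv_X, sv_y, b, kernel_fn))


# Toy model (illustrative values only): two support vectors, linear kernel
sv_X = np.array([[1.0, 0.0], [-1.0, 0.0]])
sv_y = np.array([1.0, -1.0])
lambdas = np.array([0.5, 0.5])
X_new = np.array([[2.0, 0.0], [-3.0, 1.0]])

scores = decision_function(X_new, lambdas, sv_X, sv_y, 0.0, np.dot)
labels = predict(X_new, lambdas, sv_X, sv_y, 0.0, np.dot)
print(scores, labels)
```

Keeping the raw scores separate from the sign is convenient because the magnitude of the score can be reused, for example to break ties in multiclass voting.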

SVM for Multiclass Classification

The module multiclass_svm.py contains the implementation of a Support Vector Machine for multiclass classification, based on the one-vs-one strategy.
It fully supports kernel functions and soft margin; in fact, the signature of its __init__ method is the same as that of the binary SVM.
Given N different classes to classify, the algorithm builds $N(N-1)/2$ binary SVM classifiers from the module svm.py.
Each classifier is trained to distinguish 2 of the N given classes; its training uses only the dataset entries labelled with one of those 2 classes.
Given an unseen example, the class is predicted through a voting scheme among the binary SVM classifiers.
The voting process is based on the standard predict function of the binary SVM classifiers, so the tested entry is assigned to the class that wins the highest number of binary comparisons. In addition, a mechanism is available to break possible ties in the vote, based on the raw values predicted by the binary classifiers before the 'sign' function is applied.
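
The voting scheme can be sketched as follows. This is not the code of multiclass_svm.py: `one_vs_one_predict` and the `raw_score(a, b, x)` callback are hypothetical, and the tie-breaking rule (summing the raw scores each class collects) is only one plausible way to use the values obtained before applying the sign.

```python
from itertools import combinations


def one_vs_one_predict(x, classes, raw_score):
    """Predict the class of x by majority vote over all pairs of classes.

    raw_score(a, b, x) is assumed to return the raw value of the binary
    classifier trained on classes a and b: positive favours a, negative b.
    """
    votes = {c: 0 for c in classes}
    margins = {c: 0.0 for c in classes}
    for a, b in combinations(classes, 2):   # N * (N - 1) / 2 pairs
        score = raw_score(a, b, x)
        votes[a if score >= 0 else b] += 1
        margins[a] += score
        margins[b] -= score
    best = max(votes.values())
    tied = [c for c in classes if votes[c] == best]
    # Ties are broken by the accumulated raw scores
    return max(tied, key=lambda c: margins[c])


# Toy usage with fixed raw scores instead of trained classifiers
table = {(0, 1): 1.0, (0, 2): -2.0, (1, 2): 0.5}
print(one_vs_one_predict(None, [0, 1, 2], lambda a, b, x: table[(a, b)]))
```

In the toy usage every class wins exactly one comparison, so the result is decided by the accumulated raw scores rather than by the vote count.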

Workflow

  • The SVM model is initially created by specifying the type of kernel ('rbf'/'poly'/'sigmoid') and the value of the associated parameters ('gamma', 'deg' and 'r'); also, the parameter 'C' regulating the soft margin is specified.
  • When the 'fit' method is called (passing a supervised training set), the model learns the parameters of the hyperplane by optimizing the dual Lagrangian function discussed in the previous sections.
  • When the 'predict' method is called, new instances are classified according to the learnt parameters.

Credits

Tristan Fletcher, Support Vector Machines Explained
Humboldt-Universität zu Berlin, Lagrangian formulation of the SVM
