poisonedpawn / sophosmachinelearningbuildingblockstutorial

This project forked from maddy12/sophosmachinelearningbuildingblockstutorial

A tutorial on how to build an artificial neural network model based on URL data.

Python 100.00%

Sophos Machine Learning Building Blocks Tutorial

Getting Started

The first thing you will need is Python. If you have no experience with Python, a good start is to download Anaconda, which provides some starter packages and a handy IDE called Spyder: https://www.continuum.io/downloads. The code in this example was written for Python 2.7, so it is advisable to download the Python 2.7 version of Anaconda.

Prerequisites

There are several packages you will need to install before using the code. You need TensorFlow and Python 2.7, plus the following Python packages: numpy, pandas, baker, sklearn, mmh3, nltk, matplotlib, and keras.

Installing

Before installing the machine learning package Keras, you must install its dependencies. Instructions on how to install TensorFlow can be found here: https://www.tensorflow.org/install/. Instructions for installing it under Anaconda are on the Windows installation page: https://www.tensorflow.org/install/.

Once you have TensorFlow installed and set up, there are a few more dependencies to install. From the command line, use the 'pip install' command. This also applies when working in Spyder.

pip install numpy pandas baker scikit-learn mmh3 nltk matplotlib keras

From the Anaconda command line interface, use 'conda install':

conda install numpy pandas baker scikit-learn mmh3 nltk matplotlib keras

When using Spyder, either method works from a command prompt.
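
To confirm that the installation worked, a short script can try to import each prerequisite and report any that are missing. This is a minimal sketch and not part of the tutorial code: the helper name find_missing is hypothetical, and the list uses the import names from the prerequisites above (the scikit-learn package is imported as sklearn).

```python
import importlib

# Import names from the prerequisites list; tensorflow is included because
# keras depends on it. Adjust the list to match your setup.
REQUIRED = ["numpy", "pandas", "baker", "sklearn", "mmh3", "nltk",
            "matplotlib", "keras", "tensorflow"]


def find_missing(names):
    """Return the entries of `names` that cannot be imported (hypothetical helper)."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = find_missing(REQUIRED)
    if missing:
        print("Missing packages: " + ", ".join(missing))
    else:
        print("All prerequisites are importable.")
```

Any package reported as missing can be installed with one of the commands above.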

Running the Model

To run the code, call 'python' with the script name, then our function 'compare', followed by each parameter you want to change, prefixed with '--', and its value. The parameters here are filepath, the directory where your "clean.csv" and "dirty.csv" are stored, and n, the number of clean URLs and dirty URLs you would like to run on. If n is larger than the number of URLs available, all of them are used.

python urlmodel.py compare --filepath "<path where the data is stored>" --n <number of urls>

The results of the model will be stored in the filepath you passed.
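
The way n is capped at the number of available URLs can be illustrated as follows. This is only a sketch under assumptions, not the code in urlmodel.py: the helper names load_urls and load_dataset are hypothetical, each CSV is assumed to hold one URL in the first column of every row, and taking the first n rows (rather than sampling) is an assumption as well.

```python
import csv
import os


def load_urls(filepath, filename, n):
    """Read up to n URLs from filepath/filename (hypothetical helper).

    Assumes the URL is the first column of each row.
    """
    path = os.path.join(filepath, filename)
    with open(path) as handle:
        urls = [row[0] for row in csv.reader(handle) if row]
    # Asking for more rows than exist simply returns everything.
    return urls[:min(n, len(urls))]


def load_dataset(filepath, n):
    """Return (urls, labels), with label 0 for clean and 1 for dirty URLs."""
    clean = load_urls(filepath, "clean.csv", n)
    dirty = load_urls(filepath, "dirty.csv", n)
    urls = clean + dirty
    labels = [0] * len(clean) + [1] * len(dirty)
    return urls, labels
```

With this layout, passing an n larger than either file never raises an error; each file just contributes every row it has.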
