analysis-of-mushroom-species-dataset's Introduction

DSG_Recruitment

Mushroom Data Set

This is an analysis of the mushroom data set, in which the task was to classify mushrooms into two categories: edible or poisonous. The dataset has 25 attributes, including the target variable class. Two of them, radius and weight, are continuous. After importing the libraries and the dataset, I performed exploratory data analysis, first a univariate analysis and then a bivariate analysis of all the attributes. For the univariate analysis I started with the continuous variables: calling the describe() method on the dataset gave the mean, standard deviation and other summary statistics. I then did some visual analysis by plotting histograms, a jointplot and boxplots. From these plots we can identify the outliers and comment on the correlation between the two continuous variables.
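
The univariate step on the continuous variables could look roughly like the sketch below. It assumes pandas and NumPy are available and uses synthetic values for radius and weight, since the real data lives in the notebook; only the column names come from the description above.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the two continuous columns (assumption: the real
# values come from the mushroom dataset, which is not reproduced here).
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "radius": rng.normal(loc=10.0, scale=2.0, size=200),
    "weight": rng.normal(loc=50.0, scale=8.0, size=200),
})

# describe() reports count, mean, std, min, quartiles and max per column.
summary = df[["radius", "weight"]].describe()
print(summary)

# Pearson correlation between the two continuous variables.
corr = df["radius"].corr(df["weight"])
print(f"radius/weight correlation: {corr:.3f}")

# The plots need matplotlib, so they are left commented out in this sketch.
# df[["radius", "weight"]].hist(bins=20)
# df[["radius", "weight"]].plot(kind="box", subplots=True)
```

The numeric summary and the correlation give the same information the histograms, boxplots and jointplot show visually, which makes them a handy cross-check.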

Moving on to the categorical variables, I used factorplot to plot how each categorical variable varies with respect to the target variable class. The insights from these plots are noted in the Jupyter notebook.
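
As a numeric counterpart to those plots, a cross-tabulation shows how a categorical attribute is distributed across the two classes. This is only a sketch on toy rows: `cap_shape` is a hypothetical column name chosen for illustration, while `class` is the target named above.

```python
import pandas as pd

# Toy rows for illustration; "cap_shape" is a hypothetical attribute name.
df = pd.DataFrame({
    "cap_shape": ["bell", "bell", "flat", "flat", "flat", "convex"],
    "class": ["edible", "poisonous", "edible", "edible", "poisonous", "edible"],
})

# Counts of each class per category: the numbers a per-class count plot draws.
counts = pd.crosstab(df["cap_shape"], df["class"])
print(counts)

# Row-normalised shares make categories of different sizes comparable.
shares = pd.crosstab(df["cap_shape"], df["class"], normalize="index")
print(shares.round(2))
```

A category whose share is close to 0 or 1 for one class is a strong hint that the attribute will be useful to the classifier.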

After the visual analysis part, I performed outlier treatment on the radius attribute.
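
The exact treatment applied to radius is described in the notebook rather than here. One common option, shown purely as an assumption, is to clip values that fall outside 1.5 times the interquartile range; the helper name `clip_outliers_iqr` and the sample values are made up for this sketch.

```python
import pandas as pd


def clip_outliers_iqr(series: pd.Series, k: float = 1.5) -> pd.Series:
    """Clip values lying more than k * IQR outside the quartiles (hypothetical helper)."""
    q1 = series.quantile(0.25)
    q3 = series.quantile(0.75)
    iqr = q3 - q1
    return series.clip(lower=q1 - k * iqr, upper=q3 + k * iqr)


# Made-up radius values with one obvious outlier at the end.
radius = pd.Series([9.0, 10.0, 10.5, 11.0, 11.5, 12.0, 40.0], name="radius")
treated = clip_outliers_iqr(radius)
print(treated.tolist())
```

Clipping keeps every row and only caps the extreme value; dropping the affected rows instead would be an equally plausible choice.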

Next I preprocessed the variables so that the data could be fitted to the machine learning model. First I applied feature scaling to the continuous variables radius and weight so that their ranges are the same and the wider range of one variable doesn't distort the predictions. After that I label-encoded the categorical variables.
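
A minimal sketch of these two preprocessing steps with scikit-learn follows. It assumes `StandardScaler` as the scaler (the text does not say which one was used; `MinMaxScaler` would fit the description too) and one `LabelEncoder` per categorical column. The rows and the `cap_shape` column are hypothetical.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder, StandardScaler

# Toy frame; "cap_shape" is a hypothetical categorical attribute.
df = pd.DataFrame({
    "radius": [8.0, 10.0, 12.0, 14.0],
    "weight": [40.0, 50.0, 60.0, 70.0],
    "cap_shape": ["bell", "flat", "flat", "convex"],
    "class": ["edible", "poisonous", "edible", "poisonous"],
})

# Scale both continuous columns to zero mean and unit variance.
scaler = StandardScaler()
df[["radius", "weight"]] = scaler.fit_transform(df[["radius", "weight"]])

# Encode each categorical column as integers, keeping the fitted encoder
# so the original labels can be recovered with inverse_transform later.
encoders = {}
for col in ["cap_shape", "class"]:
    encoders[col] = LabelEncoder()
    df[col] = encoders[col].fit_transform(df[col])

print(df)
```

Tree-based models such as random forests are largely insensitive to feature scaling, so the scaling step matters more if other model families are tried on the same data.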

After the preprocessing I applied the random forest algorithm to the dataset. For parameter tuning I used the GridSearchCV method and then fitted the model to the training data set. I then used the train_test_split method to verify the accuracy of my approach. Finally, I predicted the target variable by applying the random forest model to the test dataset.
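
The modelling step can be sketched as below. This is not the notebook's code: the data is generated with `make_classification`, the parameter grid is an arbitrary small example, and the split is done before the grid search (a conventional order that keeps the test rows out of tuning), which may differ from the order used in the notebook.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic binary classification data standing in for the encoded mushroom data.
X, y = make_classification(
    n_samples=300, n_features=8, n_informative=4, random_state=0
)

# Hold out a test set that the tuning procedure never sees.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Small illustrative grid; the values are assumptions, not the notebook's grid.
param_grid = {"n_estimators": [25, 50], "max_depth": [3, None]}
search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid,
    cv=3,
    scoring="accuracy",
)
search.fit(X_train, y_train)

# GridSearchCV refits the best model on the full training set by default,
# so predict() uses that refitted estimator.
y_pred = search.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
print("best parameters:", search.best_params_)
print(f"test accuracy: {accuracy:.3f}")
```

Reporting accuracy on rows that took no part in the grid search gives a less optimistic estimate than the cross-validation score of the best parameter set.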

analysis-of-mushroom-species-dataset's People

Contributors

kunalmessi10
