datasets_practice_scienceacademy's Introduction




Practice Datasets -- Data Science and Machine Learning

Several useful public datasets are included in this repository to practice your Data Science and Machine Learning skills. These datasets are also used in the course on "Data Science and Machine Learning using Python - A Bootcamp".

The course is available on the following platforms:

For free content, please subscribe to our YouTube channel.

The repository was created to ensure that the datasets remain available without depending on any third party.

Datasets For Visualization (general purpose)

  • 2010 Alcohol Consumption by Country
  • 2011 US Agricultural Exports (modified)
  • 2012 US Election Data
  • 2014 World GDP
  • 2016 World Happiness Index
  • 2020 COVID-19 Geographic Distribution Worldwide Data (a more recent version can be downloaded from the original source)
  • Bees Data
  • Emergency Calls (911 Calls Data)
  • Stocks (multiple csv files):
    • BMO
    • CIBC
    • CNQ
    • Encana
    • RBC
    • Suncor
    • USO
    • WTI
  • bootstrapping (sample data from the StarCraft game, APM -- Actions Per Minute -- only)
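
The bootstrapping sample in the list above is meant for resampling exercises. As a minimal sketch of what such an exercise might look like, the snippet below computes a percentile bootstrap confidence interval for a mean using only the standard library. The function name `bootstrap_mean_ci` and the APM values are made up for illustration; they are not taken from the repository's file.

```python
import random
import statistics


def bootstrap_mean_ci(values, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean of `values`."""
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    n = len(values)
    means = []
    for _ in range(n_resamples):
        # Draw n observations with replacement and record their mean.
        resample = [values[rng.randrange(n)] for _ in range(n)]
        means.append(statistics.fmean(resample))
    means.sort()
    lo_idx = int((alpha / 2) * n_resamples)
    hi_idx = int((1 - alpha / 2) * n_resamples) - 1
    return means[lo_idx], means[hi_idx]


# Hypothetical APM values, for illustration only.
apm_sample = [95.0, 120.5, 88.2, 143.7, 110.0, 76.4, 131.9, 102.3, 99.8, 125.1]

low, high = bootstrap_mean_ci(apm_sample)
print(f"sample mean: {statistics.fmean(apm_sample):.1f}")
print(f"95% bootstrap CI for the mean: ({low:.1f}, {high:.1f})")
```

With the real file, `apm_sample` would be replaced by the APM column read from the CSV.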

Datasets For Binary Class Classification

  • Australian Credit Approval
  • Breast Cancer (Wisconsin)
  • Breast Cancer (Yugoslavia)
  • Bank Note Authentication
  • Coded Data (Synthetically Created)
  • Heart Disease Cleveland Data Clean
  • Horse Colic
  • Ionosphere
  • Loan
  • Mammographic Masses Data Clean
  • Pima Indians Diabetes
  • Sonar Returns
  • Titanic Data (multiple csv files)
  • BioAssay dataset (highly imbalanced data)
  • Chronic Kidney Disease
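
For a highly imbalanced dataset such as the BioAssay one, a common first step is to weight each class inversely to its frequency. The sketch below uses the heuristic `n_samples / (n_classes * class_count)`, which is the same formula scikit-learn applies for `class_weight="balanced"`. The helper name `balanced_class_weights` and the 95/5 label split are assumptions chosen for illustration, not properties of the actual data.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Return a weight per class, inversely proportional to class frequency."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Hypothetical label vector: 95 negatives and 5 positives.
labels = [0] * 95 + [1] * 5

weights = balanced_class_weights(labels)
print(weights)  # the rare class receives the larger weight
```

These weights can then be passed to any classifier that accepts per-class weights, so that errors on the rare class cost more during training.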

Datasets For Multi-class Classification

  • Abalone Age (or regression)
  • Glass Identification
  • Iris Flower Species
  • Seed Quality Data
  • Wheat Seeds
  • Wine Quality (or regression)
  • Wine Quality Merged (red & white => column "red_wine" 1/0)
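
The merged wine file combines the red and white datasets and marks the origin of each row in a `red_wine` column (1 for red, 0 for white). A minimal sketch of how such a merge could be reproduced is shown below. The in-memory CSV text, the `alcohol` and `quality` column names, and the helper `read_with_flag` are hypothetical stand-ins; only the `red_wine` column name comes from the list above.

```python
import csv
import io

# Tiny in-memory stand-ins for the two source files (hypothetical values).
RED_CSV = "alcohol,quality\n9.4,5\n9.8,6\n"
WHITE_CSV = "alcohol,quality\n8.8,6\n10.1,7\n11.0,6\n"


def read_with_flag(text, red_flag):
    """Parse CSV text and tag every row with red_wine = red_flag (1 or 0)."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        row["red_wine"] = red_flag
        rows.append(row)
    return rows


# Stack the red rows on top of the white rows.
merged = read_with_flag(RED_CSV, 1) + read_with_flag(WHITE_CSV, 0)

for row in merged:
    print(row)
```

In a pandas workflow the same idea is usually written by adding the column to each DataFrame and stacking them with `pandas.concat`.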

Datasets For Regression (simple/multiple)

  • Auto Insurance Total Claims
  • Big Mart Sales
  • Boston Housing
  • King County House Price
  • Longley Economic
  • StarCraft 1 dataset and description

Datasets For Univariate Time Series

  • Armed Robberies in Boston (Monthly)
  • Car Sales (Monthly)
  • Champagne Sales (Monthly)
  • Female Births in California (Daily)
  • International Airline Passengers (Monthly)
  • Shampoo Sales (Monthly)
  • Specialty Writing Paper Sales (Monthly)
  • Sunspots (Monthly)
  • Temperatures in Melbourne (Daily Minimum)
  • Temperatures in Melbourne (Daily Maximum)
  • Temperatures in Nottingham Castle (Mean Monthly)
  • Water Usage in Baltimore (Yearly)

Datasets For Multivariate Time Series

  • Historical Product Demand Dataset -- for forecasting the demand for thousands of different products
  • Pollution Levels in Beijing (Hourly)
  • Individual Household Electric Power Consumption (Minutely)
  • Human Activity Recognition Using Smartphones
  • Indoor Movement Prediction

Recommender Systems

  • Movies and Rating data (two separate csv files)
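
Because the movie and rating data come as two separate CSV files, most exercises start by joining them on a shared movie identifier. The sketch below shows one way to do that and to compute a mean rating per title. The column names (`movie_id`, `title`, `user_id`, `rating`) and the sample rows are assumptions for illustration; check the headers of the actual files before adapting it.

```python
import csv
import io
from collections import defaultdict

# Hypothetical stand-ins for the two CSV files.
MOVIES_CSV = "movie_id,title\n1,Movie A\n2,Movie B\n"
RATINGS_CSV = "user_id,movie_id,rating\n10,1,4\n11,1,5\n10,2,3\n"

# Build a lookup table from movie id to title.
titles = {
    row["movie_id"]: row["title"]
    for row in csv.DictReader(io.StringIO(MOVIES_CSV))
}

# Join each rating to its title and collect the ratings per title.
ratings_by_title = defaultdict(list)
for row in csv.DictReader(io.StringIO(RATINGS_CSV)):
    ratings_by_title[titles[row["movie_id"]]].append(float(row["rating"]))

mean_rating = {
    title: sum(values) / len(values)
    for title, values in ratings_by_title.items()
}
print(mean_rating)
```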

datasets_practice_scienceacademy's People

Contributors

junaidqazi
