The Probability Density Function - Lab

Introduction

In this lab, we will look at building visualizations known as density plots to estimate the probability density for a given set of data.

Objectives

You will be able to:

  • Plot and interpret density plots and comment on the shape of the plot
  • Estimate probabilities for continuous variables by using interpolation

Let's get started

Let's import the necessary libraries for this lab.

# Import required libraries
import numpy as np
import matplotlib.pyplot as plt
plt.style.use('ggplot')
import pandas as pd 

Import the data, and calculate the mean and the standard deviation

  • Import the dataset 'weight-height.csv' as a pandas dataframe.

  • Next, calculate the mean and standard deviation of weight and height for men and women separately. You can use the pandas .mean() and .std() methods to do so.

Hint: Use your pandas dataframe subsetting skills, such as .loc[], .iloc[], and .groupby()
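
As a rough sketch of the hint above, the per-group statistics could be computed either with boolean masks or with a single groupby. The column names 'Gender', 'Height', and 'Weight' are assumptions about 'weight-height.csv' (the lab does not list them), and the small dataframe below is made-up stand-in data so that the snippet runs on its own.

```python
import pandas as pd

# Made-up stand-in rows; in the lab you would load pd.read_csv('weight-height.csv') instead.
# The column names 'Gender', 'Height', and 'Weight' are assumed, not confirmed by the lab.
toy = pd.DataFrame({
    'Gender': ['Male', 'Male', 'Female', 'Female'],
    'Height': [70.0, 68.0, 64.0, 62.0],
    'Weight': [190.0, 180.0, 140.0, 130.0],
})

# Option 1: boolean masks with .loc give one dataframe per group
male_toy = toy.loc[toy['Gender'] == 'Male']
female_toy = toy.loc[toy['Gender'] == 'Female']
print('Male Height mean:', male_toy['Height'].mean())
print('Male Height sd:', male_toy['Height'].std())

# Option 2: groupby computes every statistic for every group in one call
summary = toy.groupby('Gender')[['Height', 'Weight']].agg(['mean', 'std'])
print(summary)
```

Note that pandas .std() returns the sample standard deviation (ddof=1) by default, whereas np.std() defaults to ddof=0, so the two can differ slightly on the same data.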

data = None
male_df =  None
female_df =  None

# Male Height mean: 69.02634590621737
# Male Height sd: 2.8633622286606517
# Male Weight mean: 187.0206206581929
# Male Weight sd: 19.781154516763813
# Female Height mean: 63.708773603424916
# Female Height sd: 2.696284015765056
# Female Weight mean: 135.8600930074687
# Female Weight sd: 19.022467805319007

Plot histograms (with densities on the y-axis) for male and female heights

  • Make sure to create overlapping plots
  • Use 10 bins (bins = 10) and set the alpha level so that the overlap is visible
# Your code here
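
One possible shape for this step is sketched below. The heights are synthetic stand-ins drawn from normal distributions whose means and standard deviations are rounded from the values printed above; the variable names are illustrative, and in the lab you would pass the height columns of your own dataframes instead.

```python
import matplotlib.pyplot as plt
import numpy as np

# Synthetic stand-in data (assumption): replace with the real male and female height columns
rng = np.random.default_rng(0)
male_heights = rng.normal(69.0, 2.9, 1000)
female_heights = rng.normal(63.7, 2.7, 1000)

fig, ax = plt.subplots(figsize=(8, 5))
# density=True rescales the bars so that the total area of each histogram is 1
m_counts, m_edges, _ = ax.hist(male_heights, bins=10, density=True, alpha=0.7, label='Male')
# alpha below 1 keeps the first histogram visible where the two overlap
f_counts, f_edges, _ = ax.hist(female_heights, bins=10, density=True, alpha=0.7, label='Female')
ax.set_xlabel('Height')
ax.set_ylabel('Density')
ax.legend()
```

With density=True the y-axis shows density rather than counts, so the bar heights multiplied by the bin widths sum to 1 for each group.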


# Record your observations - are these in line with your personal observations?

Create a density function using interpolation

  • Write a function density() that takes in a sample of a random variable and uses interpolation to estimate its density
  • Use np.histogram()
  • The function should return two lists holding the x and y coordinates for plotting the density function
def density(x):
    
    pass


# Generate test data and test the function - uncomment to run the test
# np.random.seed(5)
# mu, sigma = 0, 0.1 # mean and standard deviation
# s = np.random.normal(mu, sigma, 100)
# x,y = density(s)
# plt.plot(x,y, label = 'test')
# plt.legend()
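
The sketch below shows one way density() might be filled in, not necessarily the intended solution. It assumes 10 bins and places each normalized bar height at the center of its bin, so that joining the points with straight lines gives a piecewise-linear density curve. It returns NumPy arrays rather than plain lists (call .tolist() on them if lists are required). The helper prob_between() is hypothetical and only illustrates how such a curve could be used to estimate a probability as an area.

```python
import numpy as np

def density(x):
    """Return x and y coordinates of a piecewise-linear density estimate for the sample x."""
    # density=True normalizes the bar heights so that the histogram area is 1
    n, bins = np.histogram(x, bins=10, density=True)
    # Midpoint of each pair of neighboring bin edges
    pdfx = 0.5 * (bins[:-1] + bins[1:])
    pdfy = n
    return pdfx, pdfy

def prob_between(pdfx, pdfy, a, b, num=200):
    """Hypothetical helper: approximate P(a <= X <= b) as the area under the curve."""
    grid = np.linspace(a, b, num)
    # np.interp evaluates the straight-line segments through (pdfx, pdfy);
    # outside the range of pdfx it holds the end values constant
    heights = np.interp(grid, pdfx, pdfy)
    # Trapezoidal rule written out explicitly
    return float(np.sum(0.5 * (heights[:-1] + heights[1:]) * np.diff(grid)))

# Same test data as in the commented-out test above
np.random.seed(5)
mu, sigma = 0, 0.1  # mean and standard deviation
s = np.random.normal(mu, sigma, 100)
x, y = density(s)
p = prob_between(x, y, -0.1, 0.1)
print('Estimated P(-0.1 <= X <= 0.1):', p)
```

Because the curve is built from only 10 bins, the estimate is coarse; more data and more bins would bring it closer to the underlying distribution.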


Add overlapping density plots to the histograms plotted earlier

# Your code here 
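
A minimal sketch of the overlay, under the same assumptions as before: synthetic stand-in heights, 10 bins, and a hypothetical helper bin_center_density() that mirrors the density() exercise so the snippet is self-contained.

```python
import matplotlib.pyplot as plt
import numpy as np

def bin_center_density(x, bins=10):
    # Hypothetical helper mirroring the density() exercise:
    # normalized bar heights placed at the bin centers
    n, edges = np.histogram(x, bins=bins, density=True)
    return 0.5 * (edges[:-1] + edges[1:]), n

# Synthetic stand-in data (assumption): replace with the real height columns
rng = np.random.default_rng(0)
male_heights = rng.normal(69.0, 2.9, 1000)
female_heights = rng.normal(63.7, 2.7, 1000)

fig, ax = plt.subplots(figsize=(8, 5))
for values, name in [(male_heights, 'Male'), (female_heights, 'Female')]:
    # Histogram and curve use the same 10 bins, so the curve passes through the bar tops
    ax.hist(values, bins=10, density=True, alpha=0.7, label=f'{name} heights')
    cx, cy = bin_center_density(values)
    ax.plot(cx, cy, label=f'{name} density')
ax.set_xlabel('Height')
ax.set_ylabel('Density')
ax.legend()
```

The same loop could be reused for the weights by swapping in the weight columns.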


Repeat the above exercise for male and female weights

# Your code here 


Write your observations in the cell below

# Record your observations - are these in line with your personal observations?


# What is the takeaway when comparing male and female heights and weights?

Repeat the above experiments in seaborn and compare with your results

# Code for heights here


# Code for weights here


# Your comments on the two approaches here.
# Are they similar? If they differ, what makes them different?

Summary

In this lab, you learned how to build probability density curves for a given dataset and how to compare distributions visually by looking at their spread, center, and overlap. This is a useful EDA technique that can answer some initial questions before you embark on a more complex analysis.
