Statistical Distributions with Stem and Leaf Plots - Lab

Introduction

In this lab, you'll practice what you know about stem and leaf plots.

Objectives

You will be able to:

  • Create stem and leaf plots from given data in matplotlib
  • Compare the effectiveness of stem and leaf plots with that of histograms

Analyzing Students' Results

Below is a list of marks (out of 100) that students obtained in a certain project. You can clearly see that there is a huge spread in the data, with marks ranging from 10 to 95.

10,11,22,24,35,37,45,47,48,58,56,59,61,71,81,92,95

We would like to give grades to these students using a very naive set of criteria:

  • Anything below 30 is a Fail
  • 30 - 49 is a Referral for repeating the project
  • 50 - 59 is a Pass
  • 60 - 69 is a Merit
  • 70 - 79 is a Distinction
  • 80+ is a high distinction
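The criteria above can be sketched as a small function. This is a minimal illustration, not part of the lab template: the helper name `grade` and the variable `grade_counts` are hypothetical, and the sketch assumes the ranges do not overlap, so that a mark of exactly 50 would count as a Pass.

```python
from collections import Counter

# Marks copied from the list above.
marks = [10, 11, 22, 24, 35, 37, 45, 47, 48, 58, 56, 59, 61, 71, 81, 92, 95]


def grade(mark):
    """Map a mark out of 100 to a grade label (hypothetical helper)."""
    if mark < 30:
        return "Fail"
    if mark < 50:  # assumption: 50 itself is treated as a Pass
        return "Referral"
    if mark < 60:
        return "Pass"
    if mark < 70:
        return "Merit"
    if mark < 80:
        return "Distinction"
    return "High distinction"


# Count how many students fall into each grade.
grade_counts = Counter(grade(m) for m in marks)
for label, count in grade_counts.items():
    print(f"{label}: {count}")
```

Counting the grades this way gives a numeric check on whatever the plots later show visually.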

Once the criteria are established, we would like to see how many students fall in each of these classes/grades using a visual approach.

We shall go ahead and build a stem and leaf plot for this data. This plot will help us visualize the grading classes above and how many students fall in each class.

Let's get started

First, let's import the necessary libraries. We need numpy for processing data and matplotlib for visualizations.

import matplotlib.pyplot as plt
import numpy as np
plt.style.use('ggplot')

Next, we need to make a numpy array containing all of the values above.

marks = None
marks
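One possible way to fill in the placeholder is sketched below. It assumes a plain one-dimensional integer array is all that is needed, with the values typed in the same order as they are listed above.

```python
import numpy as np

# A 1-D integer array holding the 17 marks, in the order given in the lab.
marks = np.array([10, 11, 22, 24, 35, 37, 45, 47, 48,
                  58, 56, 59, 61, 71, 81, 92, 95])

# Quick sanity check: number of marks, lowest mark, highest mark.
print(marks.shape, marks.min(), marks.max())
```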

The pyplot.stem() method

The pyplot module in matplotlib comes packaged with a .stem() method, which we will use to visualize stem and leaf plots. Here's the general syntax for calling this method:

plt.stem(stems, leaves, linefmt=None, markerfmt=None, basefmt=None)

The official matplotlib documentation covers further customizations if you want to dig deeper. We shall simply pass the stems (grades) and leaves (marks) arrays to this function with some simple formatting to visualize the plot.

As you can see, in order to plot the stem and leaf plot, we first need to separate our data into stems and leaves. To do this, write a function or use a loop to separate each data point into its tens and ones digits. For example, 65 would get split into stem: 6 (the tens digit) and leaf: 5 (the ones digit). Preferably, use numerical methods on the integers themselves as opposed to converting the number to a string and using slicing.

# Create stems and leafs lists to store the tens and ones digits of every mark in the marks array, in the same order.
stems = []
leafs = []
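A minimal sketch of the numerical approach, assuming every mark is a non-negative two-digit integer: the built-in `divmod(mark, 10)` returns the tens digit and the ones digit in one step. The marks are repeated here as a plain list so the snippet runs on its own.

```python
# Marks copied from the list above, in their original order.
marks = [10, 11, 22, 24, 35, 37, 45, 47, 48, 58, 56, 59, 61, 71, 81, 92, 95]

stems = []
leafs = []
for mark in marks:
    # divmod(65, 10) gives (6, 5): stem 6 (tens digit), leaf 5 (ones digit).
    stem, leaf = divmod(int(mark), 10)
    stems.append(stem)
    leafs.append(leaf)

print(stems)
print(leafs)
```

With a numpy array, `marks // 10` and `marks % 10` would give the same digits without an explicit loop.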

Great! Now that you have your stems and leafs defined, use the pyplot.stem() method to create a stem and leaf plot!
Be sure to style your plot, including:

  • Use a figure size of 12 x 8
  • Set suitable limits for the x and y axes
  • Apply label and axes formatting

# Create a stem and leaf plot including the above styling
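One way the styled plot might look is sketched below. The format strings, axis limits, title, and labels are illustrative choices rather than anything the lab prescribes, and the stems and leaves are recomputed here so the snippet runs on its own.

```python
import matplotlib.pyplot as plt

# Marks copied from the list above; stems are tens digits, leafs are ones digits.
marks = [10, 11, 22, 24, 35, 37, 45, 47, 48, 58, 56, 59, 61, 71, 81, 92, 95]
stems = [m // 10 for m in marks]
leafs = [m % 10 for m in marks]

# Figure size of 12 x 8, as requested in the styling list.
fig, ax = plt.subplots(figsize=(12, 8))

# Draw one vertical line per mark: x position is the stem, height is the leaf.
ax.stem(stems, leafs, linefmt="C0-", markerfmt="C0o", basefmt="k-")

# Limits chosen so that every digit from 0 to 9 is visible on both axes.
ax.set_xlim(0, 10)
ax.set_ylim(0, 10)
ax.set_xticks(range(0, 11))

ax.set_title("Stem and leaf plot for student marks")
ax.set_xlabel("Stems (tens digit)")
ax.set_ylabel("Leaves (ones digit)")
```

In a notebook the figure renders when the cell finishes; in a script, a call to `plt.show()` at the end would display it.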

Analyzing the output

So there we have it, our stem and leaf plot. While all the underlying data is retrievable, the plot can be a little awkward to decipher. The number of markers above each stem shows how many data points fall in that bucket. The x-axis, or stems, represents the tens digit of each data point. Since most points (12 of the 17) have a stem of 5 or below, we can see that most students scored in the 50s or lower on this project.
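For comparison, the same stems and leaves can be printed in the traditional text layout, which makes the bucket sizes easy to read off. This is a sketch in plain Python rather than something matplotlib produces, and the name `leaves_by_stem` is a hypothetical choice for this example.

```python
from collections import defaultdict

# Marks copied from the list above.
marks = [10, 11, 22, 24, 35, 37, 45, 47, 48, 58, 56, 59, 61, 71, 81, 92, 95]

# Group the ones digits (leaves) under their tens digit (stem).
leaves_by_stem = defaultdict(list)
for mark in sorted(marks):
    stem, leaf = divmod(mark, 10)
    leaves_by_stem[stem].append(leaf)

# Print one row per stem, for example "4 | 5 7 8" for the marks 45, 47 and 48.
for stem in sorted(leaves_by_stem):
    row = " ".join(str(leaf) for leaf in leaves_by_stem[stem])
    print(f"{stem} | {row}")
```

The length of each row corresponds to the number of markers above that stem in the plot.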

Just to get a bit more intuition behind this, let's build a histogram and compare both plots.

# Create a histogram for marks
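A possible histogram is sketched below. The bin edges are an assumption made for this example: bins of width 10 starting at 10, so that each bar lines up with one stem from the stem and leaf plot.

```python
import matplotlib.pyplot as plt

# Marks copied from the list above.
marks = [10, 11, 22, 24, 35, 37, 45, 47, 48, 58, 56, 59, 61, 71, 81, 92, 95]

fig, ax = plt.subplots(figsize=(12, 8))

# Bin edges 10, 20, ..., 100 give one bar per tens digit.
bin_edges = list(range(10, 101, 10))
counts, edges, patches = ax.hist(marks, bins=bin_edges, edgecolor="black")

ax.set_title("Histogram of student marks")
ax.set_xlabel("Marks (out of 100)")
ax.set_ylabel("Number of students")
```

The returned `counts` array holds the height of each bar, which can be compared against the number of leaves per stem.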

With the histogram, we can't retrieve the original data points, but it is easier to see where the data lies. As before, we get an idea of how frequently marks fall in a certain class/grade, but there's no way to see individual values. For an in-depth analysis, choose the plotting style that gives the clearest understanding of the underlying data.

Summary

In this lab, we saw how to create a stem and leaf plot using matplotlib. We also reinforced the idea that these plots can be more insightful than histograms in some cases. In the upcoming labs, we shall talk about other statistical visualizations to dive deeper into distributions.
