Section Recap

Introduction

This short lesson summarizes the topics we covered in section 08 and why they'll be important to you as a data scientist.

Objectives

You will be able to:

Understand and explain what was covered in this section
Understand and explain why this section will help you become a data scientist

Key Takeaways

In this section, we dug into a number of foundational concepts, from NumPy to the basics of probability:

Under the hood, Pandas relies on NumPy for computationally efficient processing of large data sets
In addition to providing a base for Pandas, NumPy has many useful features built right in - including the ability to perform random sampling
A scalar is a quantity that can be fully described by a magnitude (a single number). A vector can only fully be described by multiple numbers - e.g. a magnitude and a direction
NumPy supports a range of powerful scalar and vector mathematical operations
Probability is "how likely" it is that an event will happen
Sets in Python are unordered collections of unique elements
The inclusion-exclusion principle is a counting technique for calculating the number of elements in a union of sets that share some elements, without counting the shared elements twice
The "sum rule" (also called the addition rule) of probability states that $P(A \cup B) = P(A) + P(B) - P(A \cap B)$
Factorials provide the basis for calculating permutations
The difference between permutations and combinations is that with combinations, order is not important
The Bernoulli distribution can be used to describe a single, binary event
The number of successes in n independent Bernoulli trials, each with the same probability of success, can be described by a binomial distribution
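
The counting ideas in the takeaways above (sets, inclusion-exclusion, the sum rule, permutations, and combinations) can be sketched in a few lines of Python. This is a minimal illustration, not code from the lessons: the student IDs, the course names, and the helper `prob` are made up for the example, and it assumes every outcome in the sample space is equally likely.

```python
from fractions import Fraction
from math import comb, factorial, perm  # comb and perm need Python 3.8+

# Hypothetical data: IDs of students enrolled in two courses (illustrative only)
python_students = {1, 2, 3, 4, 5, 6}
sql_students = {4, 5, 6, 7, 8}
all_students = set(range(1, 11))  # sample space of 10 students

# Inclusion-exclusion: |A u B| = |A| + |B| - |A n B|
overlap = python_students & sql_students
union_size = len(python_students) + len(sql_students) - len(overlap)
# The direct set union gives the same count
assert union_size == len(python_students | sql_students)


def prob(event, sample_space):
    """Probability of an event, assuming equally likely outcomes."""
    return Fraction(len(event), len(sample_space))


# Sum rule: P(A u B) = P(A) + P(B) - P(A n B)
p_union = (
    prob(python_students, all_students)
    + prob(sql_students, all_students)
    - prob(overlap, all_students)
)
assert p_union == prob(python_students | sql_students, all_students)

# Permutations (order matters) vs. combinations (order does not matter),
# choosing k = 3 items out of n = 5
n, k = 5, 3
n_permutations = factorial(n) // factorial(n - k)  # n! / (n - k)!
n_combinations = n_permutations // factorial(k)    # divide out the k! orderings
assert n_permutations == perm(n, k)
assert n_combinations == comb(n, k)

print(union_size, p_union, n_permutations, n_combinations)
```

Using `Fraction` instead of floats keeps the probabilities exact, so the sum rule can be checked with `==` rather than an approximate comparison.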

In this section, we introduced the binomial distribution. In the next section, we'll look at a number of other types of distributions and how they relate to data science.
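
As a quick refresher on the binomial distribution, the sketch below compares the probability mass function computed from combinations with an empirical estimate from NumPy's random sampling. The values `n = 10`, `p = 0.5`, the seed, the sample size, and the helper name `binomial_pmf` are arbitrary choices for illustration, not part of the lessons.

```python
from math import comb

import numpy as np


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent Bernoulli(p) trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


n, p = 10, 0.5  # illustrative parameters: 10 fair coin flips

# Theoretical probabilities for k = 0, 1, ..., n successes
pmf = [binomial_pmf(k, n, p) for k in range(n + 1)]

# Empirical estimate: draw many binomial samples and count how often each k occurs
rng = np.random.default_rng(seed=42)
samples = rng.binomial(n=n, p=p, size=100_000)
empirical = np.bincount(samples, minlength=n + 1) / samples.size

for k in range(n + 1):
    print(f"k={k:2d}  theoretical={pmf[k]:.4f}  empirical={empirical[k]:.4f}")
```

With this many samples, the empirical frequencies should land close to the theoretical values, which is one informal way to check that the formula and the sampler agree.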
