In this optional project, we will create decision functions that attempt to predict survival outcomes from the 1912 Titanic disaster based on each passenger’s features, such as sex and age. We will start with a simple algorithm and increase its complexity until we are able to accurately predict the outcomes for at least 80% of the passengers in the provided data.
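As a rough illustration of what a decision function might look like, here is a minimal sketch. The rule itself (predict survival for female passengers and for males under 10) is an assumption chosen for illustration, not the project's solution. The names `predict_survival`, `accuracy`, and the sample rows are hypothetical. The dictionary keys `sex` and `age` follow the attribute names listed in this README.

```python
def predict_survival(passenger):
    """Return 1 (survived) or 0 (did not survive) for one passenger dict."""
    # Assumed rule for illustration: predict survival for all female passengers.
    if passenger["sex"] == "female":
        return 1
    # Assumed refinement: also predict survival for young male passengers.
    age = passenger.get("age")
    if age is not None and age < 10:
        return 1
    return 0


def accuracy(passengers, outcomes):
    """Fraction of passengers whose predicted outcome matches the true one."""
    if len(passengers) != len(outcomes):
        raise ValueError("passengers and outcomes must have the same length")
    if not passengers:
        raise ValueError("need at least one passenger")
    correct = sum(
        1 for p, truth in zip(passengers, outcomes) if predict_survival(p) == truth
    )
    return float(correct) / len(passengers)


# Made-up rows for illustration only; these are not taken from titanic_data.csv.
sample_passengers = [
    {"sex": "female", "age": 30.0},
    {"sex": "male", "age": 8.0},
    {"sex": "male", "age": 40.0},
    {"sex": "male", "age": None},
]
sample_outcomes = [1, 1, 0, 1]

print(accuracy(sample_passengers, sample_outcomes))
```

Starting from a single condition and adding one feature at a time makes it easy to see whether each added rule actually improves accuracy on the provided data.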
This project requires Python 2.7 and the following Python libraries installed:
You will also need software installed to run and execute an iPython Notebook.
Udacity recommends our students install Anaconda, a pre-packaged Python distribution that contains all of the necessary libraries and software for this project.
Template code is provided in the titanic_survival_exploration.ipynb notebook file. Additional supporting code can be found in titanic_visualizations.py. While some code has already been implemented to get you started, you will need to implement additional functionality when requested to successfully complete the project.
In a terminal or command window, navigate to the top-level project directory titanic_survival_exploration/ (that contains this README) and run one of the following commands:
jupyter notebook titanic_survival_exploration.ipynb
or
ipython notebook titanic_survival_exploration.ipynb
This will open the iPython Notebook software and project file in your web browser.
The dataset used in this project is included as titanic_data.csv. This dataset is provided by Udacity and contains the following attributes:
survival: Survival (0 = No; 1 = Yes)
pclass: Passenger Class (1 = 1st; 2 = 2nd; 3 = 3rd)
name: Name
sex: Sex
age: Age
sibsp: Number of Siblings/Spouses Aboard
parch: Number of Parents/Children Aboard
ticket: Ticket Number
fare: Passenger Fare
cabin: Cabin
embarked: Port of Embarkation (C = Cherbourg; Q = Queenstown; S = Southampton)
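To show how these attributes might be separated into features and outcomes, here is a minimal sketch using only the standard library. It assumes the CSV header uses the attribute names exactly as listed above; the actual header in titanic_data.csv may differ in naming or capitalization, so adjust the keys if needed. The function name `parse_passengers` and the sample lines are hypothetical, and an empty age field is assumed to mean the age is unknown.

```python
import csv


def parse_passengers(lines):
    """Split CSV lines into (features, outcomes).

    `lines` is any iterable of text lines, such as an open file or a list of strings.
    """
    features, outcomes = [], []
    for row in csv.DictReader(lines):
        # The survival column is the prediction target, so remove it from the features.
        outcomes.append(int(row.pop("survival")))
        # Assumption: an empty age field means the age is unknown.
        row["age"] = float(row["age"]) if row["age"] else None
        row["pclass"] = int(row["pclass"])
        features.append(row)
    return features, outcomes


# Made-up lines for illustration only; these are not taken from titanic_data.csv.
sample_lines = [
    "survival,pclass,sex,age",
    "1,1,female,30",
    "0,3,male,",
]
features, outcomes = parse_passengers(sample_lines)
print(outcomes)
print(features[0]["sex"], features[0]["age"])
```

To try this on the real file, an open file object can be passed in place of `sample_lines`, for example `with open("titanic_data.csv") as f: features, outcomes = parse_passengers(f)`, provided the header names match.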