A little exploratory data analysis project using biodiversity data of species in four National Parks.
This project investigates biodiversity data from the National Park Service on endangered species in different National Parks. The data consists of two files, both provided by Codecademy: recorded sightings of different species at several national parks over the "past 7 days", and information about the species and their conservation status. (Note: The data for this project is inspired by real data, but is mostly fictional.)
The goal of this project is to prepare, analyze, and visualize the data, and then to explain the findings. This is done inside a Jupyter Notebook using several Python data analysis libraries (e.g., pandas, Matplotlib, seaborn, SciPy).
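As a rough illustration of the preparation step, the two tables could be combined along the following lines. This is only a sketch: the column names (`scientific_name`, `park_name`, `observations`, `category`, `conservation_status`), the placeholder rows, and the choice to treat a missing status as "No Intervention" are assumptions, not taken from the project's files.

```python
import pandas as pd

# Hypothetical stand-ins for the two data files; column names and rows are assumed.
species = pd.DataFrame({
    "category": ["Mammal", "Bird", "Vascular Plant"],
    "scientific_name": ["Species A", "Species B", "Species C"],
    "conservation_status": ["Endangered", "Endangered", None],
})
observations = pd.DataFrame({
    "scientific_name": ["Species A", "Species A", "Species B", "Species C"],
    "park_name": ["Park 1", "Park 2", "Park 1", "Park 2"],
    "observations": [60, 27, 57, 101],
})

# Assumption of this sketch: a missing status means no intervention is needed.
species["conservation_status"] = species["conservation_status"].fillna("No Intervention")

# Attach the species details to every sighting record.
merged = observations.merge(species, on="scientific_name", how="left")

# Total sightings per conservation status.
status_totals = merged.groupby("conservation_status")["observations"].sum()
print(status_totals)
```

A left merge keeps every sighting record even if a species were missing from the species table, which makes gaps in the data easier to spot.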
Some questions the project seeks to answer:
- What is the distribution of conservation status for the different species?
- Are certain types of species more likely to be endangered?
- Are the differences in conservation status between types of species statistically significant?
- Which species is the rarest and what is its distribution among the parks?
- Which species is the most prevalent and what is its distribution among the parks?
- Which species is spotted the most at each park?
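Two of the questions above lend themselves to a short sketch: testing whether protection status differs between species types, and finding the most-spotted species per park. The example below uses a chi-square test of independence via `scipy.stats.chi2_contingency` and a pandas group-by. All counts, species names, park names, and column names are hypothetical placeholders, and the chi-square test is one plausible choice rather than necessarily the method used in the notebook.

```python
import pandas as pd
from scipy.stats import chi2_contingency

# Hypothetical counts of protected vs. not protected species for two categories.
contingency = pd.DataFrame(
    {"protected": [10, 20], "not_protected": [90, 80]},
    index=["Mammal", "Bird"],
)

# Chi-square test of independence: a small p-value suggests that
# protection status is associated with the species category.
chi2, p_value, dof, expected = chi2_contingency(contingency)
print(f"chi2={chi2:.3f}, p={p_value:.3f}, dof={dof}")

# Hypothetical sightings table (column names are assumed).
sightings = pd.DataFrame({
    "park_name": ["Park 1", "Park 1", "Park 2", "Park 2"],
    "scientific_name": ["Species X", "Species Y", "Species X", "Species Y"],
    "observations": [120, 80, 40, 95],
})

# Sum sightings per park and species, then keep the top row for each park.
totals = sightings.groupby(
    ["park_name", "scientific_name"], as_index=False
)["observations"].sum()
top_per_park = totals.loc[totals.groupby("park_name")["observations"].idxmax()]
print(top_per_park)
```

Sorting `totals` by `observations` in ascending order instead would surface the rarest species in the same way.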
This project is created with:
- Python - 3.8.5
The project is based on the "Biodiversity" portfolio project from Codecademy.
GitHub @anlifi | LinkedIn Angelina Fischer