datafun-06-projects's Introduction

Data Analyst Project

In this module, you will:

1. Perform a guided exploratory data analysis project.
2. Conduct a unique data analysis exploration.

The objective is to craft a compelling narrative ("tell a story") using data, showcasing not only your analytical skills but also your professional and engaging communication abilities.

Chapters

This module requires the skills learned in previous chapters. The first project is a guided exploratory data project that focuses on diamonds.csv and is based on Exercise 9.16, beginning on page 352 of the text. The second is a project of your choice, related to your domain.

Decide what you would like your second project to focus on / showcase. Review the requirements for the project and make sure your topic lends itself to successfully completing all requirements.

Task 1 - Prepare Your Module Repository

  1. GitHub
    1. Create a new repository named datafun-06-projects on GitHub.
    2. Initialize it with the default README.md.
  2. Local Machine
    1. In VS Code, clone your new repo into your Documents folder.
  3. Repository Essentials
    1. Add a .gitignore file from a previous Python project.
    2. Add a requirements.txt file to hold external dependencies for Jupyter notebooks and others as you need them.
  4. Update README.md
    1. Modify the README.md to include your name, the link to your repo, and the focus of this project repository.
    2. Include instructions with the exact commands to:
      1. Create and activate your virtual environment.
      2. Install all required external dependencies.
      3. Execute your Python files.
  5. Create your local virtual environment (hint: use venv to create a .venv folder)
  6. Activate your local virtual environment (hint: call a command in the .venv subfolder)
  7. Install any external dependencies you need (hint: list the packages needed for Jupyter notebooks, pandas, etc. in requirements.txt and install from that file)
  8. Push to GitHub
    1. Add and commit all your changes with a commit message "Initialized repo"
    2. Push your changes to GitHub

Task 1 - Verify Repository

  1. Take a screenshot of your GitHub project repository after you've pushed these changes to GitHub.
  2. Display the screenshot as evidence of task completion.

Task 2 - Guided Diamonds Project

This first project is a guided exploration.

  1. Follow the instructions for Exercise 9.16 (starting pg. 350).
  2. Complete the exercise in a Jupyter notebook.
  3. Include the title of the notebook, your name, and date at the top.
  4. Include the following Markdown Section Headings in your notebook.
    1. Section 1-Load: Get the file, store it in your repo, and load it into a DataFrame.
    2. Section 2-View: Display the first 7 rows and the last 7 rows.
    3. Section 3-Describe: Use the DataFrame describe() function to calculate basic descriptive statistics for all numeric columns.
    4. Section 4-Series: Use the Series method describe() to calculate the descriptive stats for all category/text columns.
    5. Section 5-Unique: Use the Series method unique() to get unique category values.
    6. Section 6-Histograms: Use the DataFrame's hist() function to create a histogram for each numerical column.
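
The six sections above could be sketched roughly as follows. This is a minimal illustration, not the textbook's solution: the inline rows are made up and the column names (carat, cut, price) are assumptions, used only so the snippet runs without the real diamonds.csv. In your notebook you would load the actual file from your repo instead.

```python
import io

import pandas as pd

# Made-up stand-in rows with assumed column names. In the real notebook,
# load the file stored in your repo, e.g. pd.read_csv("diamonds.csv").
SAMPLE_CSV = """carat,cut,price
0.23,Ideal,326
0.21,Premium,326
0.29,Good,334
0.31,Ideal,335
"""

# Section 1-Load: read the CSV data into a DataFrame
df = pd.read_csv(io.StringIO(SAMPLE_CSV))

# Section 2-View: first 7 and last 7 rows (fewer if the data is shorter)
first_rows = df.head(7)
last_rows = df.tail(7)

# Section 3-Describe: by default, describe() summarizes numeric columns only
numeric_stats = df.describe()

# Section 4-Series: for a text column, describe() reports
# count, unique, top, and freq
cut_stats = df["cut"].describe()

# Section 5-Unique: distinct category values, in order of first appearance
cut_values = df["cut"].unique()

# Section 6-Histograms: requires matplotlib to be installed
# df.hist()
```

Each result can be displayed by placing its name on the last line of a notebook cell.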

Task 2 - Push to GitHub

  1. Execute the completed notebook
  2. Add, commit, and push your changes to GitHub. You can use incremental commits as you work - provide useful commit messages.
  3. At the end, use a commit message like "Task 2 complete".
  4. Verify your GitHub notebook appears complete and well-presented.

Task 2 - Verify

  1. Capture a screenshot of your completed notebook as viewed in GitHub at the conclusion of this task.
  2. Display the screenshot as evidence of task completion.

Task 3 - Custom Exploratory Data Project

Use everything you've learned to conduct a unique data exploration project using some information related to your domain. Create a new notebook that uses a dataset of your choice. The notebook name should make it clear this is your unique project.
Use this project to feature all of the key skills learned (see the section list in Task 2 above). Include challenging Python programming aspects - find a reason to use filter(), map(), and list comprehensions. Have fun and make it unique. Your second project must show the following Python skills and Markdown sections:

  1. Section 1-Load: Read from a data file into a pandas DataFrame.
  2. Section 2-View: Display the first 5 rows and the last 5 rows.
  3. Section 3-Describe: Use the DataFrame describe() function to calculate basic descriptive statistics for all numeric columns.
  4. Section 4-Series: Use the Series method describe() to calculate the descriptive stats for all category/text columns.
  5. Section 5-Unique: Use the Series method unique() to get unique category values.
  6. Section 6-Histograms: Use the DataFrame's hist() function to create a histogram for each numerical column.
  7. Section 7-List: Get some of your information into a list. Process each item in the list (use for or comprehensions as you like).
  8. Section 8-Filter: Use filter() to show only part of the information.
  9. Section 9-Map: Use map() to transform some of the data.

Include a title section with your name - this is your branding - make it professional and attractive. Use Markdown section headings to professionally present your work. Tell a story with data - lead us through your project and summarize your interesting results.
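
As a rough sketch of the List, Filter, and Map sections, assuming a list of prices pulled from a DataFrame column: the values, the 1000 cutoff, and the 8% markup below are all made up for illustration, and your own dataset will suggest more meaningful choices.

```python
# Hypothetical values; in a notebook these might come from
# something like df["price"].tolist()
prices = [326, 334, 2757, 18823, 950]

# Section 7-List: process each item with a list comprehension
# (the 8% markup is an invented example rate)
prices_with_markup = [round(p * 1.08, 2) for p in prices]

# Section 8-Filter: keep only the items that pass a test
# (the 1000 cutoff is arbitrary)
affordable = list(filter(lambda p: p < 1000, prices))

# Section 9-Map: transform every item, here into a display string
labels = list(map(lambda p: f"${p:,}", prices))

print(affordable)
print(labels)
```

Both filter() and map() return lazy iterators, so wrapping them in list() is what makes the results visible when the cell runs.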

Task 3 - Push to GitHub

  1. Execute the completed notebook
  2. Add, commit, and push your changes to GitHub. You can use incremental commits as you work - provide useful commit messages.
  3. At the end, use a commit message like "Task 3 complete".
  4. Verify your GitHub notebook appears complete and well-presented.

Task 3 - Verify

  1. Capture a screenshot of your completed notebook as viewed in GitHub at the conclusion of this task.
  2. Display the screenshot as evidence of task completion.

Optional Bonus Section

As part of your custom project, use a library or module we did not explore in class. Consider imageio, nltk, textatistic, textblob, wordcloud, or others. Look for something new that might interest you. Learn it on your own and apply it to your domain/project. Clearly label your Section Bonus using Markdown. In the bonus section, explain what you chose, why, how it went, and your results. Do you recommend it? This is a chance to show advanced, creative skills - make it valuable for others by describing it well.

Optional Bonus - Push to GitHub

  1. Execute the completed notebook
  2. Add, commit, and push your changes to GitHub. You can use incremental commits as you work - provide useful commit messages.
  3. At the end, use a commit message like "Bonus complete".
  4. Verify your GitHub notebook appears complete and well-presented.

Optional Bonus - Verify

  1. Capture a screenshot of your completed notebook as viewed in GitHub at the conclusion of this task.
  2. Display the screenshot as evidence of task completion.
