
nyudatabootcamp / book


Textbook to accompany class

License: Creative Commons Attribution 4.0 International

advice pdf python pandas pandas-tutorial

book's Introduction

Preface

This GitBook was written by David Backus, Sarah Beckett-Hile, Chase Coleman, and Spencer Lyon for a course at NYU's Stern School of Business. The idea is to give students experience with economic and financial data and introduce programming newbies to the benefits of moving beyond Excel. We use the Python programming language, specifically Python's data management and graphics tools. If that doesn't whet your appetite, we have a more elaborate sales pitch.

We designed the book to accompany a live class. We've tried to make it self-contained, but the written word is a poor substitute for the interaction you get in a classroom.

The book comes in multiple formats. You can read it online, or you can download (and print) a PDF file. The online version comes with links, which we think is a huge advantage, and can be updated quickly, but if you like paper, by all means try the PDF. All formats are available at

https://www.gitbook.com/book/nyudatabootcamp/data-bootcamp/details

Related course materials are available at

http://nyu.data-bootcamp.com/

We welcome suggestions. Send them to Chase Coleman or Spencer Lyon. Or, even better, post an issue on our GitHub repository.

Warning

This is a work in progress. We've written seven chapters so far; more are on the way.

Acknowledgements

This project was Glenn Okun's idea. He really should have done it himself, but we thank him for the idea and his ongoing support. Paul Backus, Hersh Iyer (MBA17), Matt McKay, Kim Ruhl, and Itamar Snir (MBA17) contributed technical support and applications. Ian Stewart provided his usual expert advice on teaching methods. You may also notice a family resemblance to Tom Sargent and John Stachurski's Quantitative Economics, a Python- and Julia-based course in dynamic macroeconomic theory. We thank them for their advice and encouragement.

License

This work is licensed under the Creative Commons Attribution 4.0 International License. The text of the license can be found here; for more information about what it means, visit the Creative Commons website.

book's People

Contributors

cc7768, danielcsaba, mwaugh0328, sglyon, szokeb87


book's Issues

List comprehension: false claim

In the list comprehension section, we have

It's another thing that doesn't work in Python 2, so make sure you have Python 3 installed.

This is false. We should remove it.
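
For reference, here is a minimal sketch of a list comprehension that, as far as we can tell, runs unchanged under both Python 2.7 and Python 3. The variable names are illustrative and are not taken from the book.

```python
# A list comprehension builds a new list from an iterable in one expression.
numbers = [1, 2, 3, 4, 5]

squares = [n ** 2 for n in numbers]                      # square every element
even_squares = [n ** 2 for n in numbers if n % 2 == 0]   # keep even numbers only

print(squares)         # [1, 4, 9, 16, 25]
print(even_squares)    # [4, 16]
```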

First function example should return something

Right now our first taste of functions (in pyfun2) is the following:

def hello(firstname):               # define the function
    print('Hello,', firstname)

hello('Chase')                      # use the function

Because we take the time to walk through all syntax using this example we should cover return values here.
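
One possible revision, sketched here as a suggestion rather than agreed wording for the chapter, keeps the same example but has the function return the greeting so the caller decides what to do with it:

```python
def hello(firstname):                   # define the function
    """Build a greeting and hand it back to the caller."""
    greeting = 'Hello, ' + firstname
    return greeting                     # send the result back

message = hello('Chase')                # use the function and keep its result
print(message)                          # Hello, Chase
```

With a return value in place, the same walkthrough can also show assigning the result to a variable, which the print-only version cannot.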

Question: should we introduce pandas_datareader in pandas-input.md?

In this section of the pandas-input.md chapter we go over how to use the API methods from pandas.io to read in data from FRED, IMF, and others.

That code has been deprecated in pandas for some time and users are told to use the pandas_datareader package.

Eventually the deprecated code will be deleted altogether from pandas and we'll be forced to use pandas_datareader to read in the data.

I think the obvious answer is that we should use the currently suggested method here instead of the deprecated routines, but this brings up the issue of using conda to install a package that doesn't come with Anaconda -- something we haven't talked about at this point in the book.

What do people think? Should we take the time to talk about pip and conda before this section (maybe at the start of this lecture in our discussion on packages) so we can use pandas_datareader?
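
To make the proposal concrete, here is a rough sketch of what the replacement code might look like. It assumes pandas_datareader has been installed separately (for example with `conda install pandas-datareader` or `pip install pandas-datareader`) and that the machine has network access. The helper name `read_fred_series` and the series code `GDP` are illustrative choices, not something taken from the chapter.

```python
import datetime as dt


def read_fred_series(code, start, end=None):
    """Fetch one FRED series as a DataFrame (hypothetical helper)."""
    # Imported inside the function so that defining it does not require
    # pandas_datareader to be installed.
    from pandas_datareader import data as web
    return web.DataReader(code, 'fred', start, end)


# Example call, commented out because it needs the package and the network:
# gdp = read_fred_series('GDP', dt.datetime(2010, 1, 1))
# print(gdp.head())
```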

typo in Matplotlib Fundamentals

Got an email a few days ago. Wanted to create a placeholder for this

Just wanted to point out a possible error that I found in the text of the book in the chapter Python graphics: Matplotlib fundamentals. On page 101, the PISA test score example has the text, fig, ax = subplots().

I think it should be fig, ax = plt.subplots(). I couldn't get it to work the other way, but maybe it's just my computer/Spyder?

Thanks!
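
A minimal sketch of the corrected call, assuming the chapter imports pyplot with the usual `plt` alias. The numbers are made up for illustration and are not the PISA data. The bare `subplots()` form would only work after something like `from matplotlib.pyplot import subplots`, which may explain the report.

```python
import matplotlib
matplotlib.use('Agg')             # non-interactive backend; not needed in Spyder or Jupyter
import matplotlib.pyplot as plt

# Made-up values, for illustration only.
x = [1, 2, 3]
y = [3, 5, 4]

fig, ax = plt.subplots()          # subplots lives in the pyplot module, hence the plt. prefix
ax.plot(x, y)
ax.set_xlabel('x')
ax.set_ylabel('y')
```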

Pandas chapters rework

We should think about reorganizing the pandas chapters. Two examples:

  • Split pandas-input.md into pandas-intro.md and pandas-input.md, where pandas-intro.md introduces the DataFrame and pandas-input.md covers only reading files from your computer or online.
  • The data cleaning chapters need to be rewritten and cleaned up.

Pre-term revisions (Fall 2016)

@cc7768 we need to do some revisions.

Here's a list of the chapters in the book:

  • README.md
  • intro.md
  • python.md
  • mentality.md
  • fun1.md
  • fun2.md
  • input.md
  • graphs1.md
  • pip.md
  • clean.md
  • shape.md
  • group.md
  • merge.md
  • pip.md
  • emerging.md
  • indicators.md
  • random.md
  • other.md
  • practice.md
  • glossary.md

Leave a comment to claim the chapters you want to work on.

Add link

In the data input chapter, we talk about a link to a file for downloading data, but we don't have a link.

Add this.

Jupyter Lab

Once JupyterLab is sufficiently stable, the book should be edited to use JupyterLab as students' first exposure to writing code (instead of Spyder and Jupyter Notebooks). JupyterLab will provide the benefits of both in a single program.

Package imports and conda/pip

Package importing is currently covered in pandas-input.md, but it might make sense to break that out since it isn't directly related to inputting data.

We might take the conda/pip material and the package material and make a new chapter between pyfun-2 and pandas-input. This would allow us to introduce students to these ideas, and to how to install and update packages, all at once.
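
As a sketch of what the package material in such a chapter might cover, the three common import forms can be shown with the standard library's `math` module, so nothing needs to be installed first. The choice of module is ours, for illustration.

```python
import math                  # import the whole module; use it as math.<name>
import math as m             # same module under a shorter alias
from math import sqrt        # import a single name directly

print(math.sqrt(16))         # 4.0
print(m.sqrt(16))            # 4.0
print(sqrt(16))              # 4.0
```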

Int v Float

Add a discussion of the different numerical types in Python. This should go in Python fundamentals 1, before Strings.
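
A minimal sketch of the kind of examples such a discussion might include, assuming Python 3. The variable names are illustrative.

```python
a = 2                        # an int (whole number)
b = 2.0                      # a float (number with a decimal point)

print(type(a))               # <class 'int'>
print(type(b))               # <class 'float'>

print(7 / 2)                 # 3.5  (true division gives a float in Python 3)
print(7 // 2)                # 3    (floor division of two ints gives an int)
print(a + b)                 # 4.0  (mixing int and float gives a float)

print(int(3.9))              # 3    (int() truncates toward zero)
print(0.1 + 0.2 == 0.3)      # False (floats are binary approximations)
```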
