
luanborelli / ipeadatapy


ipeadatapy is a Python package for extracting data and metadata from the Ipeadata database through its official API. In essence, it is an API wrapper.

License: MIT License


ipeadatapy: an API wrapper for Ipeadata

What is it?

The main purpose of the Ipeadatapy package is to extract data from Ipeadata through Python using Ipeadata's API. In this sense, Ipeadatapy is an API wrapper. The package does more than extract data, however: it also treats and cleans the data returned by the API, makes it easier to understand, and provides filtering and search mechanisms. In short, Ipeadatapy aims to make it easier to search and analyze time series data and metadata from the Ipeadata database using Python.

Main Features

Ipeadatapy allows you to extract processed data and metadata from Ipeadata's API in a more efficient and practical way, directly from your Python script, notebook and/or interactive shell. Here are some of the package's features:

  • Lists, as data frames, all of Ipeadata's available:
    • Time series names and codes;
    • Sources;
    • Countries;
    • Territories;
    • Themes.
  • Provides a basic time series search mechanism;
  • Filters data through function parameters;
  • Shows time series data and metadata;
  • Filters time series data by day, month and/or year;
  • Tracks the most recently updated time series.

Using pandas, one of the package's dependencies, you can also plot and extract data and metadata. For more details, check the documentation.
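
As a rough illustration of the search-then-fetch workflow described above, the sketch below looks up a series by keyword and requests its values for one year. It uses only calls that appear in the issue reports further down this page (list_series(keyword=...) and timeseries(series=..., year=...)). The helper name first_match and the choice to pass the module in as an argument are assumptions made for illustration, not part of the package.

```python
def first_match(ipea, keyword, year):
    """Return (code, data) for the first series whose name matches keyword.

    `ipea` is expected to be the imported ipeadatapy module. It is passed in
    explicitly (an assumption of this sketch) so the helper itself does not
    import anything or touch the network until it is called.
    """
    matches = ipea.list_series(keyword=keyword)   # table with NAME and CODE columns
    code = matches["CODE"][0]                     # code of the first match
    return code, ipea.timeseries(series=code, year=year)


# Typical use (requires ipeadatapy and network access):
#   import ipeadatapy as ipea
#   code, data = first_match(ipea, "Receita tributária - municipal", 2017)
```

Passing the module as a parameter also makes it easy to try the helper against a stand-in object when the API is unreachable.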

Where to get it

The source code is currently hosted on Ipeadatapy's GitHub.

Binary installers for the latest released version are available from the Python Package Index (PyPI).

pip install ipeadatapy

Documentation

The official documentation is hosted on the author's website: luanborelli.com/ipeadatapy/docs

Dependencies

The only dependencies are pandas and requests.

License

MIT

ipeadatapy's Issues

timeseries() DatetimeIndex

Convert the "DATE" column of the timeseries() output to a DatetimeIndex in the format YYYY-MM-DD. Keep a copy of the old "DATE" column, named "RAW DATE".

Before: [image]

After: [image]
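
The requested change can be sketched with plain pandas. The sample frame and its ISO-like date strings are assumptions for illustration (the exact format returned by the API is not shown in this issue), and the helper name with_datetime_index is hypothetical, not ipeadatapy's implementation.

```python
import pandas as pd


def with_datetime_index(df):
    """Index df by calendar date, keeping the original strings in "RAW DATE"."""
    out = df.copy()
    out["RAW DATE"] = out["DATE"]                  # preserve the unparsed value
    # Assumption: each string starts with YYYY-MM-DD; the rest is ignored.
    dates = pd.to_datetime(out["DATE"].str[:10], format="%Y-%m-%d")
    out.index = pd.DatetimeIndex(dates, name="DATE")
    return out.drop(columns="DATE")


# Hypothetical sample data.
sample = pd.DataFrame({
    "DATE": ["2017-01-01T00:00:00-02:00", "2017-02-01T00:00:00-02:00"],
    "VALUE": [1.0, 2.0],
})
result = with_datetime_index(sample)
print(result)
```

Slicing the first ten characters sidesteps time zone handling entirely, which is a simplification: it keeps the calendar date as written and discards the offset.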

KeyError: 'RAW DATE'

Describe the bug
When using timeseries() on COVID-19 datasets, I got the following error:


KeyError Traceback (most recent call last)
/usr/local/lib/python3.7/dist-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
2897 try:
-> 2898 return self._engine.get_loc(casted_key)
2899 except KeyError as err:

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

KeyError: 'RAW DATE'

The above exception was the direct cause of the following exception:

KeyError Traceback (most recent call last)
3 frames
/usr/local/lib/python3.7/dist-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
2898 return self._engine.get_loc(casted_key)
2899 except KeyError as err:
-> 2900 raise KeyError(key) from err
2901
2902 if tolerance is not None:

KeyError: 'RAW DATE'

Desktop (please complete the following information):
Google Colab
OS: Windows10
Browser Google Chrome
Version 90.0.4430.93 (Official build) 64-bit
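
The traceback shows pandas failing to find a column labelled 'RAW DATE'; the report does not say why the column is missing for these series. A minimal sketch of a defensive lookup, independent of ipeadatapy, is given here. The helper name column_or_none and the sample frame are hypothetical.

```python
import pandas as pd


def column_or_none(df, label):
    """Return df[label] if the column exists, otherwise None instead of raising."""
    return df[label] if label in df.columns else None


# Hypothetical frame that lacks the "RAW DATE" column.
frame = pd.DataFrame({"DATE": ["2021-01-01"], "VALUE": [10]})

try:
    frame["RAW DATE"]                  # same failure mode as in the traceback
except KeyError as err:
    print("missing column:", err)

print(column_or_none(frame, "RAW DATE"))
```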

get_nivel_region() function not defined

Describe the bug
When using the timeseries() function (timeseries.py), the "groupby" argument, which is supposed to group the values returned by the request, does not work, because the get_nivel_region() function it calls is not defined.

To Reproduce

  1. Example that triggers the error:

    import json
    import requests
    import ipeadatapy as ipea

    big_theme = 'Regional'
    theme_id = 6
    keyword = 'Receita tributária - municipal'

    # Take the name and code of the first series matching the keyword.
    name = ipea.list_series(keyword=keyword)['NAME'][0]
    seriesCode = ipea.list_series(keyword=keyword)['CODE'][0]

    myData = ipea.metadata(series=seriesCode, big_theme=big_theme, source=None, country=None, frequency=None, unit=None, measure=None, status=None, source_ext=None, source_url=None, last_update=None, code=None, comment=None, name=None, numerica=None, theme_id=theme_id)

    # Request the 2017 values grouped by territory name; this call fails.
    if len(myData) == 1:
        myValues = ipea.timeseries(series=seriesCode, year=2017, groupby='TERNOME')

Expected behaviour
I'm not sure how the groupby argument is meant to group the results; perhaps it would create separate data frames, one per value of the grouping criterion ('TERNOME', the territory name in this case).
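
One possible reading of that expectation, sketched with plain pandas rather than ipeadatapy itself, is to split a frame into one data frame per territory name. All values and the 'VALUE' column are made up for illustration, and split_by is a hypothetical helper.

```python
import pandas as pd


def split_by(df, column):
    """Return a dict mapping each distinct value of `column` to its rows."""
    return {key: group for key, group in df.groupby(column)}


# Hypothetical data; only the 'TERNOME' column name comes from the report.
data = pd.DataFrame({
    "TERNOME": ["A", "B", "A"],
    "VALUE": [1.0, 2.0, 3.0],
})
by_territory = split_by(data, "TERNOME")
print(sorted(by_territory))
```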

Desktop (please complete the following information):

  • OS: Windows10
  • Browser Microsoft Edge
  • Version 85.0.564.63

Cannot return a time series that lacks a well-defined unit of measure

Reported example:

idpy.metadata(big_theme="Social", code = "BPC_idos")
idpy.timeseries("BPC_idos", year=2017, month=8).to_csv("BPC.csv", sep=";")

Output:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-69-eb8760c3022a> in <module>
----> 1 idpy.timeseries("BPC_idos", year=2017, month=8).to_csv("BPC.csv", sep=";")

d:\code\abraji\testes\lib\site-packages\ipeadatapy\timeseries.py in timeseries(series, groupby, year, yearGreaterThan, yearSmallerThan, day, dayGreaterThan, daySmallerThan, month, monthGreaterThan, monthSmallerThan, code, date)
     48         ts_df = api_call(api).rename(index=str, columns={"SERCODIGO": "CODE", "VALDATA": "DATE", "VALVALOR": "VALUE ("+list(metadata_old(series)['MEASURE'])[0]+")"})
     49     else:
---> 50         ts_df = api_call(api)[['ANO','DIA','MES','SERCODIGO','VALDATA','VALVALOR']].rename(index=str, columns={"ANO": "YEAR", "DIA": "DAY", "MES": "MONTH", "SERCODIGO": "CODE", "VALDATA": "DATE", "VALVALOR": "VALUE ("+list(metadata_old(series)['MEASURE'])[0]+")"})
     51         #api_call(api).rename(index=str, columns={"SERCODIGO": "CODIGO", "VALDATA": "DATA", "VALVALOR": "VALOR ("+list(metadata_old(series)['UNINOME'])[0]+")"})
     52     if year is not None:

TypeError: can only concatenate str (not "NoneType") to str
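
The traceback points at the string concatenation that builds the "VALUE (...)" column label: when the series metadata has no measure, the looked-up value is None and the + operator fails. A minimal sketch of a tolerant label builder follows. The function name value_label and the fallback to a bare "VALUE" label are assumptions, not the package's actual fix.

```python
def value_label(measure):
    """Build the value column label, tolerating a missing unit of measure."""
    if measure is None:
        return "VALUE"                 # assumed fallback when no unit is defined
    return "VALUE (" + str(measure) + ")"


print(value_label("R$"))
print(value_label(None))
```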

Install requirements automatically

When the package is installed with pip install ipeadatapy, its requirements are not installed along with it.

Describe the solution you'd like
Create a setup.py file in the root folder that lists the required packages (check this link for more info). With this, users would not need to run pip install -r requirements after installing ipeadatapy.
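
Independently of how the packaging is fixed, a quick way to see whether the dependencies are present is to probe for them from Python. This is a minimal sketch using only the standard library; the helper name missing_modules is hypothetical, and the module names come from the Dependencies section above.

```python
from importlib.util import find_spec


def missing_modules(names):
    """Return the top-level module names from `names` that cannot be found."""
    return [name for name in names if find_spec(name) is None]


# An empty list means both dependencies are importable in this environment.
print(missing_modules(["pandas", "requests"]))
```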
