federicogarza / covidmx

Python API to get information about COVID-19 in México.

License: MIT License

Python 97.35% Dockerfile 0.99% Makefile 1.66%
covid-19 mexico python coronavirus api pip mexico-maps

covidmx's Introduction

covidmx

Python API to get information about COVID-19 in México.

Requirements

more-itertools>=6.0.0
pandas>=0.25.2
Unidecode>=1.1.1
requests==2.21.0
xlrd==1.2.0
mapsmx==0.0.3
matplotlib==3.0.3
mapclassify==2.2.0
descartes==1.1.0

How to install

pip install covidmx

How to use

Dirección General de Epidemiología

The Mexican Dirección General de Epidemiología has released open data about COVID-19 in México. This source contains individual-level information such as gender, municipality, and health status (smoking, obesity, etc.). covidmx now uses this source by default. Some variables are encoded as integers, and the source also includes a data dictionary with all the relevant information. With clean=True (the default), get_data returns the decoded data. You can also retrieve the catalogue with return_catalogo=True and the description of each variable with return_descripcion=True. When you use either of these parameters, the API returns a tuple.

from covidmx import CovidMX

covid_dge_data = CovidMX().get_data()
raw_dge_data = CovidMX(clean=False).get_data()
covid_dge_data, catalogo_data = CovidMX(return_catalogo=True).get_data()
covid_dge_data, descripcion_data = CovidMX(return_descripcion=True).get_data()
covid_dge_data, catalogo_data, descripcion_data = CovidMX(return_catalogo=True, return_descripcion=True).get_data()
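
As a rough illustration of what "decoded" means here, the sketch below maps integer codes to labels with a plain dictionary. It does not use covidmx internals: `decode_column` is a hypothetical helper, and the codes and labels are toy values, not taken from the DGE catalogue.

```python
# Toy illustration of decoding integer-coded values with a catalogue.
# All names and values here are hypothetical, not taken from the DGE data.

def decode_column(values, catalogue):
    """Replace each integer code with its label; keep unknown codes as-is."""
    return [catalogue.get(code, code) for code in values]

# Hypothetical catalogue for a yes/no variable.
yes_no_catalogue = {1: "SI", 2: "NO"}

raw_column = [1, 2, 2, 99]
decoded_column = decode_column(raw_column, yes_no_catalogue)
print(decoded_column)
```

Leaving unknown codes untouched is a design choice for this sketch, so that an incomplete catalogue does not silently drop values.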

To get historical data use:

covid_dge_data = CovidMX(date='12-04-2020').get_data()

The default date format is %d-%m-%Y, but you can pass a different one with date_format:

covid_dge_data = CovidMX(date='2020-04-12', date_format='%Y-%m-%d').get_data()
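
The format strings above look like Python's standard `strptime` directives. Assuming that is how they are interpreted (covidmx's own parsing is not shown here), this small sketch checks that the two date strings used in the examples refer to the same day.

```python
from datetime import datetime

# The default format stated above, and the alternative used in the example.
DEFAULT_FORMAT = "%d-%m-%Y"
ISO_FORMAT = "%Y-%m-%d"

default_style = datetime.strptime("12-04-2020", DEFAULT_FORMAT)
iso_style = datetime.strptime("2020-04-12", ISO_FORMAT)

# Both strings refer to 12 April 2020.
print(default_style.date(), iso_style.date())
print(default_style == iso_style)
```

Note that with the default day-first format, '12-04-2020' is 12 April, not 4 December.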

Plot module

As of version 0.3.0, covidmx includes a module for creating maps of the different COVID-19 statuses at the national and state levels, optionally including municipalities (using data from the Dirección General de Epidemiología).

from covidmx import CovidMX

dge_plot = CovidMX().get_plot()

You can check the available statuses and states using:

dge_plot.available_states

array(['MÉXICO', 'CIUDAD DE MÉXICO', 'TAMAULIPAS', 'BAJA CALIFORNIA',
       'YUCATÁN', 'GUERRERO', 'BAJA CALIFORNIA SUR', 'JALISCO',
       'NUEVO LEÓN', 'SONORA', 'VERACRUZ DE IGNACIO DE LA LLAVE',
       'PUEBLA', 'CAMPECHE', 'GUANAJUATO', 'SAN LUIS POTOSÍ',
       'MICHOACÁN DE OCAMPO', 'COAHUILA DE ZARAGOZA', 'QUERÉTARO',
       'AGUASCALIENTES', 'TABASCO', 'HIDALGO', 'ZACATECAS', 'DURANGO',
       'CHIHUAHUA', 'CHIAPAS', 'SINALOA', 'QUINTANA ROO', 'MORELOS',
       'TLAXCALA', 'NAYARIT', 'OAXACA', 'COLIMA'], dtype=object)
dge_plot.available_status

['confirmados', 'negativos', 'sospechosos', 'muertos']

To plot a national map just use:

dge_plot.plot_map(status='confirmados')

If you want to include municipalities use:

dge_plot.plot_map(status='confirmados', add_municipalities=True)

You can plot a particular state by setting the state argument to a valid name from the available_states attribute:

dge_plot.plot_map(status='confirmados', state='CIUDAD DE MÉXICO', add_municipalities=True)
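
The names in available_states are uppercase and accented. Assuming the state argument has to match one of them exactly, a small helper can map loosely typed input to the official spelling before calling plot_map. This is only a sketch: `normalize` and `match_state` are hypothetical helpers, not part of covidmx.

```python
import unicodedata

def normalize(name):
    """Uppercase a name and strip accents (e.g. 'México' -> 'MEXICO')."""
    decomposed = unicodedata.normalize("NFD", name.strip().upper())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

def match_state(user_input, available_states):
    """Return the entry of available_states matching user_input, or None."""
    lookup = {normalize(state): state for state in available_states}
    return lookup.get(normalize(user_input))

# A few names copied from the available_states output shown earlier.
states = ["MÉXICO", "CIUDAD DE MÉXICO", "NUEVO LEÓN", "JALISCO"]

print(match_state("ciudad de mexico", states))
print(match_state("Atlantis", states))
```

The value returned by `match_state` could then be passed as state, after checking that it is not None.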

Finally, you can plot any other variable of interest (from the available_status attribute):

dge_plot.plot_map(status='sospechosos', add_municipalities=True)

You can save your maps using save_file_name:

dge_plot.plot_map(status='sospechosos', add_municipalities=True, save_file_name='sospechosos-nacional.png')

Serendipia

Serendipia publishes the Mexican Secretaría de Salud's daily COVID-19 information in an open format (.csv). This API makes downloading that data easy, which is useful for task automation.

from covidmx import CovidMX

latest_published_data = CovidMX(source='Serendipia').get_data()

CovidMX then instantiates a Serendipia class, searches for the latest published data for both confirmed and suspected cases, and finally cleans the data. A more specific search can also be conducted (see the docs for details).

raw_data = CovidMX(source='Serendipia', clean=False).get_data()
confirmed = CovidMX(source='Serendipia', kind="confirmed").get_data()
suspects = CovidMX(source='Serendipia', kind="suspects").get_data()
particular_published_date = CovidMX(source='Serendipia', date='2020-04-10', date_format='%Y-%m-%d').get_data()

Cite as

Acknowledgments

Release information

0.3.1 (Current version)

  • 2020-06-01
  • Updated the URLs for the Serendipia source. (Thanks to Mario Jimenez.)

0.3.0

  • 2020-04-26.
  • Includes a plot module at state and municipality levels.
  • Includes a better handling of encodings. (Thanks to Mario Jimenez.)

0.2.5

0.2.4

  • 2020-04-16. The Dirección General de Epidemiología source renamed two columns:
    • HABLA_LENGUA_INDI -> HABLA_LENGUA_INDIG (column name and description are now homologated)
    • OTRA_CON -> OTRA_COM
    • Now the API can handle this change.
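
One way such a rename could be absorbed is to translate old keys to the new names before any lookup. The sketch below is only an illustration of that idea, not the package's actual fix. `harmonize_keys` and the sample dictionary are hypothetical; only the two column renames come from the notes above.

```python
# Old -> new column names, as listed in the 0.2.4 notes.
RENAMED_COLUMNS = {
    "HABLA_LENGUA_INDI": "HABLA_LENGUA_INDIG",
    "OTRA_CON": "OTRA_COM",
}

def harmonize_keys(mapping, renames=RENAMED_COLUMNS):
    """Return a copy of mapping whose old-style keys use the new names."""
    return {renames.get(key, key): value for key, value in mapping.items()}

# Hypothetical description dictionary still keyed by an old name.
old_descriptions = {"HABLA_LENGUA_INDI": "placeholder", "OTHER_COLUMN": "placeholder"}
descriptions = harmonize_keys(old_descriptions)

print(sorted(descriptions))
```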

0.2.3

  • Now works with python3.5+.
  • Using clean=True returns encoded data instead of decoded data without cleaning columns (as works in 0.2.0 and 0.2.1).

0.2.1

  • Minor changes to README.

0.2.0

0.1.1

  • Minor changes to README.

0.1.0

First release.

covidmx's People

Contributors

azulgarza, garciaguevara, isccarrasco

covidmx's Issues

Cannot read the data

I'm trying to run the line covid_dge_data = CovidMX().get_data(), but I get the runtime error Cannot read the data. I also get the following: URLError: <urlopen error [Errno 60] Operation timed out>.

I was unable to open a pull request against the dge.py file to change URL_DESCRIPTION and URL_HISTORICAL to the updated ones:

URL_DESCRIPTION = 'http://epidemiologia.salud.gob.mx/gobmx/salud/datos_abiertos/diccionario_datos_covid19.zip'
URL_HISTORICAL = 'http://epidemiologia.salud.gob.mx/gobmx/salud/datos_abiertos/datos_abiertos_covid19.zip'
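
When a source moves, one defensive pattern is to try several candidate URLs in order, with an explicit timeout, and report all failures together. The sketch below illustrates that pattern with the standard library only. `fetch_bytes` and `first_available` are hypothetical helpers and are not how covidmx itself downloads data.

```python
from urllib.error import URLError
from urllib.request import urlopen

def fetch_bytes(url, timeout=30):
    """Download url and return its body; raises URLError/OSError on failure."""
    with urlopen(url, timeout=timeout) as response:
        return response.read()

def first_available(urls, fetch=fetch_bytes):
    """Try each URL in order and return (url, content) for the first success."""
    errors = []
    for url in urls:
        try:
            return url, fetch(url)
        except (URLError, OSError) as exc:
            # Remember why this candidate failed and move on to the next one.
            errors.append(f"{url}: {exc}")
    raise RuntimeError("Cannot read the data. Tried:\n" + "\n".join(errors))
```

The `fetch` parameter exists so the download step can be swapped out, for example to test the fallback logic without network access.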

Fails when installing in windows

I was trying to update to the latest release to date, version 0.3.0, and I got this error when installing:

A GDAL API version must be specified. Provide a path to gdal-config using a GDAL_CONFIG environment variable or use a GDAL_VERSION environment variable.
----------------------------------------
ERROR: Command errored out with exit status 1: python setup.py egg_info Check the logs for full command output.

Error in dge.py file

When running Python to get information through the get_data method of the CovidMX object, it returns the following error:

Traceback (most recent call last):
  File "c:/Users/Alejandro Rebollo/Documents/Covid19MX.py", line 4, in <module>
    raw_dge_data = CovidMX(clean=False).get_data()
  File "C:\Users\Alejandro Rebollo\AppData\Local\Programs\Python\Python38-32\lib\site-packages\covidmx\dge.py", line 33, in get_data
    df, catalogo, descripcion = self.read_data()
  File "C:\Users\Alejandro Rebollo\AppData\Local\Programs\Python\Python38-32\lib\site-packages\covidmx\dge.py", line 59, in read_data
    raise RuntimeError('Cannot read data description.')
RuntimeError: Cannot read data description.

Error encoding='UTF-8'

covid_dge_data = CovidMX().get_data()

data = pd.read_csv(url_data, encoding='UTF-8')
73 except BaseException:
---> 74 raise RuntimeError('Cannot read the data.')
75
76 try:

RuntimeError: Cannot read the data.

KeyError: 'HABLA_LENGUA_INDIG'

Hi! I'm running the following command:
covid_dge_data, catalogo_data, descripcion_data = CovidMX(return_catalogo=True, return_descripcion=True).get_data()
And I am getting the following error:

KeyError Traceback (most recent call last)
in
3 #covid_dge_data, catalogo_data = CovidMX(return_catalogo=True).get_data()
4 #covid_dge_data, descripcion_data = CovidMX(return_descripcion=True).get_data()
----> 5 covid_dge_data, catalogo_data, descripcion_data = CovidMX(return_catalogo=True, return_descripcion=True).get_data()

~\AppData\Local\Continuum\anaconda3\lib\site-packages\covidmx\dge.py in get_data(self)
36 if self.clean:
37 print('Cleaning data')
---> 38 df = self.clean_data(df, catalogo, descripcion)
39
40 print('Ready!')

~\AppData\Local\Continuum\anaconda3\lib\site-packages\covidmx\dge.py in clean_data(self, df, catalogo, descripcion)
172 for col in df.columns:
173 df[col] = self.replace_values(
--> 174 df, col, desc_dict, catalogo_dict)
175
176

~\AppData\Local\Continuum\anaconda3\lib\site-packages\covidmx\dge.py in replace_values(self, data, col_name, desc_dict, catalogo_dict)
130 def replace_values(self, data, col_name, desc_dict, catalogo_dict):
131
--> 132 formato = desc_dict[col_name]
133 if 'FECHA' in col_name:
134 return pd.to_datetime(

KeyError: 'HABLA_LENGUA_INDIG'

unexpected keyword argument 'return_catalogo' / 'return_descripcion'

Hi,

When I run covid_dge_data, catalogo_data = CovidMX(return_catalogo=True).get_data() I get:

Traceback (most recent call last)
<ipython-input-8-9c05e955ccf6> in <module>()
      3 covid_dge_data = CovidMX().get_data()
      4 raw_dge_data = CovidMX(clean=False).get_data()
----> 5 covid_dge_data, catalogo_data = CovidMX(return_catalogo=True).get_data()
      6 covid_dge_data, descripcion_data = CovidMX(return_descripcion=True).get_data()
      7 covid_dge_data, catalogo_data, descripcion_data = CovidMX(return_catalogo=True, return_descripcion=True).get_data()

/usr/local/lib/python3.6/dist-packages/covidmx/covidmx.py in CovidMX(source, **kwargs)
     27
     28     if source == "Serendipia":
---> 29         return Serendipia(**kwargs)

TypeError: __init__() got an unexpected keyword argument 'return_catalogo'

And when I run covid_dge_data, descripcion_data = CovidMX(return_descripcion=True).get_data() I get:

Traceback (most recent call last)
<ipython-input-22-1b8abfd4e7ee> in <module>()
      4 #raw_dge_data = CovidMX(clean=False).get_data()
      5 #covid_dge_data, catalogo_data = CovidMX(return_catalogo=True).get_data()
----> 6 covid_dge_data, descripcion_data = CovidMX(return_descripcion=True).get_data()
      7 #covid_dge_data, catalogo_data, descripcion_data = CovidMX(return_catalogo=True, return_descripcion=True).get_data()

/usr/local/lib/python3.6/dist-packages/covidmx/covidmx.py in CovidMX(source, **kwargs)
     27
     28     if source == "Serendipia":
---> 29         return Serendipia(**kwargs)

TypeError: __init__() got an unexpected keyword argument 'return_descripcion'

By the way, the Unidecode package is required, so it could be useful to add this line to the install instructions:

pip install unidecode

Catalogos missing

When I call this line

covid_dge_data = CovidMX().get_data()

I get this error:

"There is no item named 'Catalogos_0412.xlsx' in the archive"

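The message above matches the KeyError that Python's `zipfile` raises when a member is requested by a name that is not in the archive. Assuming the catalogue file name changes with each publication date, one workaround is to search the archive's member list by pattern instead of relying on a fixed name. This is a sketch under that assumption: `find_member` is a hypothetical helper and the member name in the demo archive is made up.

```python
import io
import zipfile

def find_member(archive, prefix="Catalogos", suffix=".xlsx"):
    """Return the first member name matching prefix/suffix, or None."""
    for name in archive.namelist():
        base = name.rsplit("/", 1)[-1]  # ignore any folder part
        if base.startswith(prefix) and base.endswith(suffix):
            return name
    return None

# Build a small in-memory archive with a hypothetical member name.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("Catalogos_0413.xlsx", b"placeholder")

with zipfile.ZipFile(buffer) as archive:
    member = find_member(archive)
    print(member)
```

Returning None when nothing matches lets the caller raise a clearer error than the raw KeyError.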