khaeru / py-gdx

A Pythonic interface to GAMS GDX files

License: MIT License

py-gdx's Introduction

PyGDX

PyGDX is a Python package for accessing data stored in GAMS Data eXchange (GDX) files. GDX is a proprietary, binary file format used by the General Algebraic Modeling System (GAMS); PyGDX reads it through the Python bindings for the GDX API.

Originally inspired by the similar package, also named py-gdx, by Geoff Leyland, this version makes use of xarray to provide labelled data structures which can be easily manipulated with NumPy for calculations and plotting.

Documentation is available at https://pygdx.readthedocs.io, built automatically from the contents of the GitHub repository.

PyGDX is provided under the MIT License (see LICENSE).

Example

With the following GAMS program:

set  s  'Animals'  /
  a  Aardvark
  b  Blue whale
  c  Chicken
  d  Dingo
  e  Elephant
  f  Frog
  g  Grasshopper
  /;

set  t  'Colours'  /
  r  Red
  o  Orange
  y  Yellow
  g  Green
  b  Blue
  i  Indigo
  v  Violet
  /;

set  u  'Countries'  /
  CA  Canada
  US  United States
  CN  China
  JP  Japan
  /;

parameter p(s,t,u) 'Counts of nationalistic, colourful animals'
  / set.s.set.t.set.u 1 /;

execute_unload 'example.gdx';

The parameter p can be accessed via:

>>> import gdx
>>> f = gdx.File('example.gdx')
>>> f.p[:,'y','CA']
a    1
b    1
c    1
d    1
e    1
f    1
g    1
dtype: float64
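
To make the slice above concrete without GAMS or PyGDX installed, the following minimal sketch rebuilds the same data in plain Python. It assumes that the data statement / set.s.set.t.set.u 1 / assigns 1 to every combination of elements; the dictionary p and the helper slice_p are illustrative names and not part of PyGDX.

```python
from itertools import product

# Element labels copied from the GAMS program above
s = ["a", "b", "c", "d", "e", "f", "g"]
t = ["r", "o", "y", "g", "b", "i", "v"]
u = ["CA", "US", "CN", "JP"]

# Assumption: the data statement sets p(s,t,u) = 1 for every combination
p = {key: 1.0 for key in product(s, t, u)}


def slice_p(t_label, u_label):
    """Hypothetical helper: fix the t and u labels, keep every element of s."""
    return {s_label: p[(s_label, t_label, u_label)] for s_label in s}


# Plain-Python counterpart of f.p[:, 'y', 'CA']
yellow_canada = slice_p("y", "CA")
```

The result has one entry per animal, which corresponds to the seven rows printed by f.p[:,'y','CA'].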

py-gdx's Issues

Update Windows install documentation

@mmowers reports:

When running python setup.py install in py-gdx/, I had an error that said Setup script exited with error: Microsoft Visual Studio C++ 10.0 is required during the installation of pandas. I solved by just installing numpy and pandas directly with conda install numpy and conda install pandas, and then running python setup.py install again.

I had to make sure gams was added to PATH. I reinstalled and followed instructions here to make sure gams was added to PATH: http://www.gams.com/latest/docs/userguides/userguide/_u_g__w_i_n__i_n_s_t_a_l_l.html#CLI

Update the documentation to make these two points clear.
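
The PATH requirement can be checked before installing. The sketch below uses only the Python standard library; the helper names find_gams and describe_gams are hypothetical, and it is an assumption that the executable is named gams on the reader's system.

```python
import shutil


def find_gams(executable="gams"):
    """Return the full path of the executable, or None if it is not on PATH."""
    return shutil.which(executable)


def describe_gams(executable="gams"):
    """Build a one-line, human-readable report of the PATH lookup."""
    path = find_gams(executable)
    if path is None:
        return f"'{executable}' was not found on PATH; add the GAMS directory to PATH"
    return f"found '{executable}' at {path}"


print(describe_gams())
```

If the message reports that gams was not found, adding the GAMS installation directory to PATH (as described in the GAMS user guide linked above) and opening a new terminal should change the result.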

Convert to xarray 'decorator'

xarray 0.8 includes support for accessors, which are registered with decorators such as xr.register_dataset_accessor and are preferred over subclasses of xr.Dataset. Convert py-gdx from a subclass to an accessor.

This is a major change, and so will be backwards-incompatible.

Document gdxcc install on Windows

Installing the GAMS GDX Python API on Windows can be confusing; add instructions for it to the documentation.

Improve coveralls-reported build coverage

Coveralls.io currently reports 25% coverage for the py-gdx codebase and test suite: about 100% in gdx/test/test_gdx.py and zero elsewhere. Local testing, on the other hand, gives:

---------- coverage: platform linux, python 3.5.2-final-0 ----------
Name                Stmts   Miss  Cover
---------------------------------------
gdx/__init__          232     27    88%
gdx/api                45      0   100%
gdx/pycompat            7      0   100%
gdx/test/test_gdx     101      2    98%
---------------------------------------
TOTAL                 385     29    92%

Fix the configuration so that the coverage reported through Travis and Coveralls matches the local figure.
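
As a sanity check on the local report, the Cover column can be recomputed from the Stmts and Miss columns. The sketch below assumes that coverage is (Stmts - Miss) / Stmts, rounded to a whole percent; the names report and cover are illustrative.

```python
# (statements, missed) per module, copied from the local report above
report = {
    "gdx/__init__": (232, 27),
    "gdx/api": (45, 0),
    "gdx/pycompat": (7, 0),
    "gdx/test/test_gdx": (101, 2),
}


def cover(stmts, miss):
    """Percentage of statements executed, rounded to a whole percent."""
    return round(100 * (stmts - miss) / stmts)


per_module = {name: cover(*counts) for name, counts in report.items()}
total_stmts = sum(stmts for stmts, _ in report.values())
total_miss = sum(miss for _, miss in report.values())
total_cover = cover(total_stmts, total_miss)
```

The recomputed values agree with the table (88, 100, 100 and 98 percent per module; 385 statements, 29 missed and 92 percent overall), so the local report is internally consistent and the discrepancy lies in what Coveralls receives.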

Add gdx write capability

Writing GDX files directly from Python would be useful.

Document source code location & PyPI install

Thomas Brinsmead points out that the installation instructions lack a link to the source code on GitHub around "the directory containing pyGDX." A reader starting with these docs has difficulty locating and obtaining the source code.

The documentation also does not reflect that the package is available from PyPI.

Upload to PyPI

Following these instructions.

Construct "implicit sets" for *

For GAMS parameters declared using the universal set, like parameter foo(*,*,*), PyGDX tries to find an existing GAMS set for each dimension indexed by *. Where no existing set matches, the dimension is left as *; but when * has many elements, this results in high memory use when trying to load foo.

Add code and an option for the File constructor to create the implicit sets for each dimension of a loaded parameter. For instance, create sets named _foo_1, _foo_2, and _foo_3, so that the parameter is foo(_foo_1,_foo_2,_foo_3). This will avoid the memory use problem in most practical situations.
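
The idea can be sketched in plain Python. This is not PyGDX code: the function implicit_sets and the records dictionary (tuples of labels mapped to values) are hypothetical stand-ins for a parameter such as foo(*,*,*), and only the naming scheme _foo_1, _foo_2, _foo_3 comes from the proposal above.

```python
def implicit_sets(name, records):
    """Collect the labels actually used in each dimension of a parameter.

    `records` maps tuples of labels to values, standing in for the data of
    a parameter declared over the universal set.
    """
    if not records:
        return {}
    ndim = len(next(iter(records)))
    # One dict per dimension: keeps labels unique and in first-seen order
    seen = [dict() for _ in range(ndim)]
    for key in records:
        for dim, label in enumerate(key):
            seen[dim][label] = None
    return {f"_{name}_{dim + 1}": list(labels) for dim, labels in enumerate(seen)}


# Two records of a three-dimensional parameter foo(*,*,*)
foo = {("x", "a", "1"): 1.0, ("y", "a", "2"): 2.0}
sets = implicit_sets("foo", foo)
```

Here a dense array indexed by the implicit sets needs 2 x 1 x 2 cells, whereas indexing every dimension by * would need one cell for every combination of all elements in the file.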

File.extract() fails when multiple dimensions indexed by sub-sets of same set

Reported by @kmulvaney:

For a GAMS parameter like:

Parameter sectegy_fe(r,fe,g,t) sector energy use by type [mtce] /
…

…where fe is a subset of g, File.extract('sectegy_fe') raises an exception around the line:

                result = result.drop(drop, dim=p).swap_dims({p: c})

This is because both the second and third dimensions are interpreted as being indexed by g; since swap_dims identifies dimensions by name, it is not possible to make it swap only one.
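
The ambiguity can be reproduced without xarray. The sketch below assumes the inferred dimensions are (r, g, g, t), as the description above implies; rename_by_name and disambiguate are hypothetical helpers, and the position-suffix scheme is only one possible workaround, not the fix adopted in PyGDX.

```python
from collections import Counter

inferred = ["r", "g", "g", "t"]   # assumed parent sets for sectegy_fe(r,fe,g,t)
declared = ["r", "fe", "g", "t"]  # dimensions as declared in GAMS


def rename_by_name(dims, mapping):
    """Apply an {old: new} mapping to every dimension with a matching name."""
    return [mapping.get(d, d) for d in dims]


def disambiguate(dims):
    """Append the 1-based position to names that occur more than once."""
    counts = Counter(dims)
    return [f"{d}_{i}" if counts[d] > 1 else d for i, d in enumerate(dims, start=1)]


# A name-keyed mapping cannot tell the two 'g' dimensions apart
ambiguous = rename_by_name(inferred, {"g": "fe"})

# With unique temporary names, each position gets its own target
unique = disambiguate(inferred)
resolved = rename_by_name(unique, dict(zip(unique, declared)))
```

The name-keyed mapping turns both g dimensions into fe, while the version with temporary names recovers the declared dimensions.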

Improve Travis build script

.travis.yml currently includes:

cache:
  directories:
    - $HOME/gams
    - $HOME/virtualenv/python$TRAVIS_PYTHON_VERSION*/lib/python*/site-packages

Because the latter directory contains the version of PyGDX that's being tested, the set of files to be cached changes on every build.

- Exclude the installed PyGDX from the cached build files.
- Add coveralls.io configuration.
- Switch to using Anaconda for the build environment.

ValueError on extract()

Using mit-jp/crem@51d7d0189de31e54e18f164a0d004bae3a41670c to produce example.gdx, one parameter raises an error when using File.extract():

>>> import gdx
>>> f = gdx.File('example.gdx')
>>> f.extract('sectem')
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-8-bdee7af32ba8> in <module>()
----> 1 f.extract('sectem')

/home/khaeru/vc/py-gdx/gdx/__init__.py in extract(self, name)
    372                 # rename 'p' to 'c'
    373                 drop = set(self[p].values) - set(self[c].values) - set('')
--> 374                 result = result.drop(drop, dim=p).rename({p: c})
    375         return result
    376 

/home/khaeru/.local/lib/python3.5/site-packages/xarray/core/dataarray.py in rename(self, new_name_or_name_dict)
    733             name_dict = new_name_or_name_dict.copy()
    734             name = name_dict.pop(self.name, self.name)
--> 735             dataset = self._to_temp_dataset().rename(name_dict)
    736             return self._from_temp_dataset(dataset, name)
    737         else:

/home/khaeru/.local/lib/python3.5/site-packages/xarray/core/dataset.py in rename(self, name_dict, inplace)
   1245                                  "variable in this dataset" % k)
   1246             if v in self:
-> 1247                 raise ValueError('the new name %r already exists' % v)
   1248 
   1249         variables = OrderedDict()

ValueError: the new name 'r' already exists

sectem is defined over r, g and t; r is a subset of rs. Judging by the message, the error seems to happen when extract() renames the parent set rs to r, a name that already exists in the result.

This may be new after the update to xarray.

Reported by @ctli

Change name to "gdxarray"

It's cute, and PyGDX has some other (though not active) uses.

When reading gdx file, "TypeError: object of type 'NoneType' has no len()"

Thanks for this package! I'm looking forward to using it! First I have to figure out this issue though.

After installing py-gdx, on the first attempt to read a GDX file, I get a TypeError:

(py34) C:\Users\Owner\Python Libs\py-gdx speed test>python
Python 3.4.5 |Continuum Analytics, Inc.| (default, Jul  5 2016, 14:53:07) [MSC v.1600 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> from gdx import File
>>> f = File("LDC_static_inputs.gdx")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\Owner\Anaconda3\envs\py34\lib\site-packages\gdx-3-py3.4.egg\gdx\__init__.py", line 62, in __init__
  File "C:\Users\Owner\Anaconda3\envs\py34\lib\site-packages\gdx-3-py3.4.egg\gdx\api.py", line 80, in __init__
  File "C:\Users\Owner\Anaconda3\envs\py34\lib\site-packages\gdx-3-py3.4.egg\gdx\api.py", line 53, in _gams_dir
  File "C:\Users\Owner\Anaconda3\envs\py34\lib\ntpath.py", line 253, in dirname
    return split(p)[0]
  File "C:\Users\Owner\Anaconda3\envs\py34\lib\ntpath.py", line 217, in split
    d, p = splitdrive(p)
  File "C:\Users\Owner\Anaconda3\envs\py34\lib\ntpath.py", line 159, in splitdrive
    if len(p) > 1:
TypeError: object of type 'NoneType' has no len()
>>>

The file "LDC_static_inputs.gdx" is in C:\Users\Owner\Python Libs\py-gdx speed test.

I am using Windows 10 Home, Version 1607, Build 14393.693.

I wonder if this is related to an installation issue I had. When running python setup.py install in py-gdx/, I had an error that said Setup script exited with error: Microsoft Visual Studio C++ 10.0 is required during the installation of pandas. I solved by just installing numpy and pandas directly with conda install numpy and conda install pandas, and then running python setup.py install again.

Any help would be much appreciated.

Thanks!
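
The traceback above ends in ntpath.dirname receiving None from _gams_dir, which is consistent with a lookup of the GAMS location that found nothing. The sketch below reproduces that failure mode and shows a guard with a clearer message; dirname_or_fail is a hypothetical helper, and it is an assumption (not confirmed by the report) that the None comes from GAMS missing from PATH.

```python
import os.path


def dirname_or_fail(path, what="gams"):
    """Hypothetical guard: directory of `path`, or a descriptive error for None."""
    if path is None:
        raise FileNotFoundError(
            f"could not locate {what!r}; check that its directory is on PATH"
        )
    return os.path.dirname(path)


# Without a guard, a failed lookup (None) surfaces as an opaque TypeError
unguarded_error = None
try:
    os.path.dirname(None)
except TypeError as exc:
    unguarded_error = exc
```

With such a guard, the first call to File would point at the missing GAMS installation instead of failing inside the path-handling code.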