geostat-framework / pykrige

Kriging Toolkit for Python

Home Page: https://pykrige.readthedocs.io

License: BSD 3-Clause "New" or "Revised" License

Python 97.71% Cython 2.29%

kriging interpolation geostatistics spatial-analysis gaussian-processes spatial-statistics


pykrige's Issues

implement logging throughout

Use the logging module throughout, instead of print statements?
Make this part of next release?
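As a rough illustration of what this could look like, here is a minimal sketch using only the standard library. The helper name report_variogram and the messages are hypothetical, not taken from the PyKrige code base; the point is only that the library emits log records and the calling application decides whether to show them.

```python
import logging

# Module-level logger; a library should not configure handlers itself.
logger = logging.getLogger(__name__)


def report_variogram(model_name, sill, range_, nugget):
    """Hypothetical helper: log what a verbose print would have shown."""
    logger.info("Using '%s' Variogram Model", model_name)
    logger.debug("Sill: %s, Range: %s, Nugget: %s", sill, range_, nugget)


if __name__ == "__main__":
    # The application, not the library, decides how verbose to be.
    logging.basicConfig(level=logging.DEBUG)
    report_variogram("linear", 1.0, 2.0, 0.1)
```

With this layout, a verbose flag could simply map to a logging level instead of guarding individual print calls.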

New release on PyPI

It seems 1.3.1 was tagged this past December. It would be nice to have it on PyPI, where 1.3.0 is still the latest.

[Refactoring] N-dimensional Kriging

Just a few thoughts about possible code refactoring in PyKrige.

Currently PyKrige has separate code that implements 2D and 3D kriging (e.g. in ok.py and ok3d.py). This results in code duplication and makes maintaining the code and adding new features more difficult (features need to be added to every single file). Besides, 1D kriging is not implemented, and it would be nice to have some 1D examples as they are easier to visualize.

In addition, PR #24 adds a scikit-learn API to PyKrige on top of the existing UniversalKriging and OrdinaryKriging methods.

A possible solution to remove the current code duplication would be to refactor UniversalKriging and OrdinaryKriging to work in N dimensions. The simplest way of doing it would be to use something along the lines of the scikit-learn API, which would also remove the need for an additional wrapper. The general API could be something like,

class OrdinaryKriging(pykrige.compat.BaseEstimator):
    def __init__(self, **kriging_options):  # placeholder for the kriging options
        ...
    def fit(self, X, y):
        """Where X is an array [n_samples, n_dimensions] and y is an array [n_samples]"""
        ...
    def predict(self, X):
        """Where X is an array [n_samples, n_dimensions]
        The equivalent of the current execute(style="points", ...)
        """
        ...

The case of execute(style="masked", ...) could be done by supporting masked arrays for predict(X), while the case execute(style="grid", ...) can be done with a helper function,

import numpy as np

def points2grid(points):
    """points: a list of 1D arrays, e.g. [zpts, ypts, xpts]"""
    grids = np.meshgrid(*points, indexing='ij')
    # stack the flattened grids as columns -> [n_samples, n_dimensions]
    return np.concatenate([grid.flatten()[:, None] for grid in grids], axis=1)

which is mostly what is done internally by execute at present.

This would break backward compatibility though, so a major version change would be needed (PyKrige v2).

Update: the refactoring could follow the steps below,

in core.py merge adjust_for_anisotropy and adjust_for_anisotropy_3d into a single private function _adjust_for_anisotropy(X, center, scaling, angle), where X is a [n_samples, n_dim] array and all the other arguments are lists of floats; it returns the X_modified array. (PR #33)

in core.py merge the initialize_variogram_model and initialize_variogram_model_3d into a single private function (PR #47 )

_initialize_variogram_model(X, y,  variogram_model, variogram_model_parameters, variogram_function, nlags, weight)

in core.py merge the krige and krige_3d into a single private function _krige(X, y, coords, variogram_function, variogram_model_parameters) (PR #51 )
in core.py similarly merge find_statistics* (PR #51)
create a new ND kriging class for ordinary kriging that follows the scikit-learn compatible API. We need to decide on a name (OrdinaryKriging would be great but it's already taken). Maybe OrdinaryNDKriging?
rewrite the internals of OrdinaryKriging and OrdinaryKriging3D to use OrdinaryNDKriging internally.
rewrite specific tests for OrdinaryNDKriging and add deprecation warnings on OrdinaryKriging , OrdinaryKriging3D
do the same thing for other kriging methods
after a few versions remove the old style kriging interface.

What do you think?

Applying Regression Kriging to a gridx and gridy

Hi,
Firstly, massive thanks for this awesome library. I am still a novice in this field, so please excuse my ignorance. I just want to know if Regression Kriging can be applied to a user-specified grid.
For instance, ordinary kriging has: OK.execute('grid', gridx, gridy)
and universal kriging: UK.execute('grid', gridx, gridy)

Your example on Regression Kriging stops at displaying the scores:
(m_rk.score(p_test, x_test, target_test))
I see there are a couple of options like fit, krige, krige_residual and predict on the RegressionKriging object. Should I use one of those functions?
Thanks a lot

Preparing for v1.4 release

As discussed in #60, quite a few new features have been added since last year, and it would be good to make a new release (v1.4).

@bsmurphy @basaks What do you think is left to do to make this release happen? Could you please add / remove open issues to/from the v1.4 milestone if needed? Thanks!

Issue in universal kriging with external drift

While trying to do an external drift calculation I got an error saying that the dimensions do not match, although they do indeed match. I think the issue is on line 341 of uk.py, where the y dimension of external_drift is compared to the x dimension of external_drift_x.

            if external_drift.shape[0] != external_drift_y.shape[0] or \
               external_drift.shape[1] != external_drift_x.shape[0]: ### Here is the issue
                if external_drift.shape[0] == external_drift_x.shape[0] and \
                   external_drift.shape[1] == external_drift_y.shape[0]:
                    self.external_Z_drift = np.array(external_drift.T)
                else:
                    raise ValueError("External drift dimensions do not match provided "
                                     "x- and y-coordinate dimensions.")

Basically, I think this change should be made

               external_drift.shape[1] != external_drift_x.shape[1]:
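
For reference, a minimal sketch of the kind of orientation check quoted above, assuming external_drift_x and external_drift_y are 1-D coordinate arrays (an assumption; the issue does not state their shapes). Under that assumption only shape[0] is defined for the coordinate arrays, and a 2-D drift array of shape (len(y), len(x)) has shape[1] equal to len(x). The name orient_drift is hypothetical and not part of uk.py.

```python
import numpy as np


def orient_drift(drift, x, y):
    """Hypothetical helper: return drift with shape (len(y), len(x)).

    x and y are assumed to be 1-D coordinate arrays; drift is a 2-D array.
    """
    drift = np.asarray(drift)
    nx, ny = np.asarray(x).shape[0], np.asarray(y).shape[0]
    if drift.shape == (ny, nx):
        return drift
    if drift.shape == (nx, ny):
        return drift.T  # stored as (x, y): transpose to (y, x)
    raise ValueError("External drift dimensions do not match provided "
                     "x- and y-coordinate dimensions.")
```

Printing the three shapes right before the failing call would show which of the two comparisons actually trips.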

CircleCI

CircleCI seems to be having some issues running our setup.py script, since we check to ensure dependencies are installed before running the actual setup function. I added a circle.yml file to try to make sure that numpy/scipy/matplotlib are installed (via pip) before the setup script is run, but that seems to have led to other problems (I think maybe because doing pip install matplotlib is problematic).

@rth, @basaks, since you guys are better at this kind of stuff than I am, do you have any ideas?

bug?

I'm not sure if I'm misunderstanding something, or if I've discovered a bug, but I have a small code example to reproduce it:
We start with these data:

y/x	0	1	2
2	n	5	n
1	0	n	n
0	1	n	n

from pykrige.ok import OrdinaryKriging
import numpy as np
import pykrige.kriging_tools as kt
z = [[0,0,1],
     [0,1,0],
     [1,2,5]]
data = np.array(z)
gridx = np.arange(0.0, 2.1, 0.5)
gridy = np.arange(0.0, 2.1, 0.5)
OK = OrdinaryKriging(data[:, 0], data[:, 1], data[:, 2], variogram_model='linear',
                     verbose=False, enable_plotting=False)
z, _ = OK.execute('grid', gridx, gridy)
kt.write_asc_grid(gridx, gridy, z, filename="output.asc")

I would assume the output would be (preserving the three original values, with interpolated x values):

y/x	0	0.5	1	1.5	2
2	x	x	5	x	x
1.5	x	x	x	x	x
1	0	x	x	x	x
0.5	x	x	x	x	x
0	1	x	x	x	x

But the output file contains the following:

NCOLS          5         
NROWS          5         
XLLCENTER      0.00      
YLLCENTER      0.00      
DX             0.50      
DY             0.50      
NODATA_VALUE   -999.00   
2.67            5.00            2.67            2.53            2.44            
2.28            2.29            2.28            2.25            2.23            
1.86            -0.00           1.86            1.96            2.02            
1.62            1.54            1.62            1.74            1.83            
1.50            1.00            1.50            1.60            1.69

From my interpretation of this, the original values have all been placed at X=0.5

Variance prediction in regression kriging

It would be useful to return not only the point estimate in regression kriging, but also the variance, so that the user can then compute a confidence interval for each prediction.

I think it would not be too difficult:

the regression model based on scikit-learn doesn't return this, so let's not consider it for the moment
the kriging model does return it, so we could just use it directly in rk

It's not perfect, since we will not catch all the variance, but it's better than nothing...

What do you think?
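
To make the idea concrete, here is a small sketch of how a user could turn a prediction and a kriging variance into an interval. It assumes Gaussian prediction errors and, as noted above, uses only the variance of the kriged residuals; the function name confidence_interval and the default z_score are illustrative choices, not part of PyKrige.

```python
import numpy as np


def confidence_interval(prediction, variance, z_score=1.96):
    """Return (lower, upper) bounds, assuming Gaussian prediction errors.

    variance is the kriging variance of the residual model only, so the
    interval ignores the uncertainty of the regression step.
    """
    prediction = np.asarray(prediction, dtype=float)
    # Guard against tiny negative variances caused by round-off.
    std = np.sqrt(np.clip(np.asarray(variance, dtype=float), 0.0, None))
    return prediction - z_score * std, prediction + z_score * std
```

Because the regression uncertainty is left out, such an interval would tend to be too narrow rather than too wide.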

Working with missing data

I'm trying to interpolate data that contains missing values using PyKrige. Is this possible? So far, I have encountered this error while doing so:


import numpy as np
from pykrige.ok import OrdinaryKriging

data = np.array([[0.3, 1.2,np.nan],
                 [1.9, 0.6, np.nan],
                 [1.1, 3.2, np.nan],
                 [3.3, 4.4, 1.47],
                 [4.7, 3.8, 1.74]])

gridx = np.arange(0.0, 5.5, 0.5)
gridy = np.arange(0.0, 5.5, 0.5)

OK = OrdinaryKriging(data[:,0],data[:,1],data[:,2],variogram_model='linear',verbose=False)

 File "<ipython-input-40-17311a362b4a>", line 17, in <module>
    OK = OrdinaryKriging(data[:,0],data[:,1],data[:,2],variogram_model='linear',verbose=False)

  File "~/python3.6/site-packages/pykrige/ok.py", line 232, in __init__
    self.variogram_function, nlags, weight)

  File "~/python3.6/site-packages/pykrige/core.py", line 199, in initialize_variogram_model
    variogram_function, weight)

  File "~/python3.6/site-packages/pykrige/core.py", line 286, in calculate_variogram_model
    x0 = [(np.amax(semivariance) - np.amin(semivariance))/(np.amax(lags) - np.amin(lags)),

  File "~/python3.6/site-packages/numpy/core/fromnumeric.py", line 2252, in amax
    out=out, **kwargs)

  File "~/python3.6/site-packages/numpy/core/_methods.py", line 26, in _amax
    return umr_maximum(a, axis, None, out, keepdims)

ValueError: zero-size array to reduction operation maximum which has no identity

Is there a workaround for this?

Thanks!
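
One possible workaround, sketched here with NumPy only, is to drop the rows whose z value is NaN before building the kriging object. This assumes that discarding those observations is acceptable for the analysis; whether the two remaining points in this example are enough for a variogram fit is a separate question.

```python
import numpy as np

data = np.array([[0.3, 1.2, np.nan],
                 [1.9, 0.6, np.nan],
                 [1.1, 3.2, np.nan],
                 [3.3, 4.4, 1.47],
                 [4.7, 3.8, 1.74]])

# Keep only the rows whose z value (third column) is not NaN.
valid = ~np.isnan(data[:, 2])
clean = data[valid]

x, y, z = clean[:, 0], clean[:, 1], clean[:, 2]
# x, y and z can then be passed to OrdinaryKriging in place of the raw columns.
```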

Performance optimization with LU decomposition

Making this @bsmurphy comment a separate issue,

BTW, @rth (and anyone else who thinks about matrices, I don't think about them that much admittedly), I've been wondering if we could use LU decomposition on the kriging matrix to speed up the solution. Any thoughts? I haven't thought about this that much, but since the LHS matrix is the same (just the RHS vector that's changing), we might be able to leverage this for the looping backend...
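
A minimal sketch of the idea, using scipy.linalg.lu_factor and scipy.linalg.lu_solve: the matrix is factored once and the factorization is reused for every right-hand side. The matrix A below is a random stand-in, not an actual kriging matrix, and the sizes are arbitrary.

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

rng = np.random.default_rng(0)
n = 5
# Stand-in for the kriging matrix: diagonally dominant, hence non-singular.
A = rng.random((n, n)) + n * np.eye(n)

# Factor once: O(n^3).
lu_piv = lu_factor(A)

# Reuse the factorization for each right-hand side: O(n^2) per solve.
rhs_vectors = rng.random((3, n))
solutions = [lu_solve(lu_piv, b) for b in rhs_vectors]
```

Whether this beats a single batched solve with all right-hand sides stacked as columns would need benchmarking; the sketch only shows the reuse pattern for a looping backend.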

[RFC] Data processing pipelines

In line with the code reorganization in issue #33, I was wondering what your opinion on data pipelines is. I could be wrong, but it looks like what the current OrdinaryKriging etc. do could be separated into several independent steps,

(optional) anisotropy correction coordinate transformation
(optional) geographic coordinate transformation
actual kriging (not sure if drift could be separated in a separate step..)

One could then potentially create, for instance, AnisotropyTransformer, CoordinateTransformer, KrigingEstimator classes (or some other names) and get the results by constructing a pipeline using sklearn.pipeline.Pipeline or sklearn.pipeline.make_pipeline. The advantage of this being that the different transformers

could then be simpler (with few input parameters), and users would only use the elementary bricks they need
can be reused for different kriging types.
can be more easily customized by subclassing different steps
can be tested independently

I'm not sure if that would be useful. Among other things, this could depend on how many more options we might end up adding. For instance, the UniversalKriging class currently has 16 input parameters, which is already quite a lot. If we end up adding, say, a local variogram search radius, a kriging search radius and a few others, splitting the processing into several steps might be a way to simplify the interface for the users.

Just a thought... What do you think?
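
To illustrate what one such elementary brick might look like, here is a sketch of a 2D AnisotropyTransformer following the fit/transform convention. The class body, its parameter names and the rotate-then-stretch convention are assumptions for illustration and may not match what PyKrige does internally. The sketch does not import scikit-learn; for real use in sklearn.pipeline.Pipeline one would presumably subclass sklearn's BaseEstimator and TransformerMixin.

```python
import numpy as np


class AnisotropyTransformer:
    """Hypothetical 2D step: rotate about `center`, then stretch axis 1."""

    def __init__(self, center=(0.0, 0.0), scaling=1.0, angle=0.0):
        self.center = center
        self.scaling = scaling
        self.angle = angle  # degrees, counter-clockwise (assumed convention)

    def fit(self, X, y=None):
        return self  # stateless: nothing to learn from the data

    def transform(self, X):
        center = np.asarray(self.center, dtype=float)
        X = np.asarray(X, dtype=float) - center
        theta = np.deg2rad(self.angle)
        rot = np.array([[np.cos(theta), -np.sin(theta)],
                        [np.sin(theta), np.cos(theta)]])
        stretch = np.diag([1.0, self.scaling])
        # Row vectors: rotate each point, then scale the second coordinate.
        return X @ rot.T @ stretch + center
```

A step like this has three parameters instead of sixteen and can be unit-tested without running any kriging at all.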

Improve docstrings formatting

Now that the online documentation has been set up, some effort is needed to make sure all docstrings use correct numpy-style formatting.

For instance, most methods under "API reference" of the documentation, are currently not very well rendered.

Any help on this would be greatly appreciated. Instructions on how to build the documentation locally can be found in #48

Scaling to large datasets

This issue aims to discuss scaling of PyKrige to large datasets (which could impact, for instance, the optimization approaches in issue #35).

Here are approximate (and possibly inaccurate) time complexity estimates for different processing steps of the kriging process in 2D, according to these benchmarks (adapted from PR #36), applied to a 5k-10k dataset which only has 2 measurement points for each parameter,

Calculation (training) of the kriging model: ~O(N_train²)
Prediction from a trained model (no moving window):
- all backends: ~O(N_test*N_train^1.5)
Prediction from a trained model (with window):
- loop and C backends: ~O(N_test*N_nn^(1~2))

For information, the approximate time complexity of linear algebra operations that may limit the performance are,

O(N^3) for linear system inversions
O(N^3) for matrix multiplication
and O(N^3) for matrix inversions

(though the constant term would be quite different).

This may be of interest to @kvanlombeek and @basaks as discussed in issue #29. The training part indeed doesn't scale so well with the dataset size and also affects the prediction time. The total run time for the attached benchmarks is 48min wall time and 187min CPU time (on a 4 core CPU), so most of the critical operations do take advantage of a multi-threaded BLAS for linear algebra operations.

Any suggestions on how we could improve scaling (or general performance) are very welcome.

implement the nugget on the diagonal to prevent exact interpolation

As suggested already in the code, implement the option that the variance is taken into account when you krige.

invalid coords check - bug in core.py

Hi,
Thanks so much for your work.
I think on line 193 of core.py:
if np.any(x == coords[0]) and np.any(y == coords[1]):
the purpose is to check whether there is an [x, y] pair that equals coords; however, this line also evaluates to true in a case like:
x = [1, 2, 3]
y = [1, 2, 3]
coords = [1,2]
however there is no [x,y] such that coords == [1,2]

I have gotten around this by using:
bd = np.sqrt((x - coords[0])**2 + (y - coords[1])**2)
zero_value = np.any(bd <= 1e-8)

I am unfamiliar with the internals of the algorithm so please let me know if my suspicion is incorrect.
Thanks
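
The difference between the two checks can be reproduced with NumPy alone. The sketch below uses the values from this report; the variable names independent and paired are only illustrative.

```python
import numpy as np

x = np.array([1, 2, 3])
y = np.array([1, 2, 3])
coords = [1, 2]

# Check quoted in the issue: each axis is tested independently.
independent = bool(np.any(x == coords[0]) and np.any(y == coords[1]))

# Pairwise check: the same index must match on both axes.
paired = bool(np.any((x == coords[0]) & (y == coords[1])))

# Tolerance-based variant from the issue, robust to floating-point noise.
bd = np.sqrt((x - coords[0]) ** 2 + (y - coords[1]) ** 2)
zero_value = bool(np.any(bd <= 1e-8))
```

Here independent is True while paired and zero_value are both False, since no data point sits at (1, 2).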

set coordinate_types

When I used OrdinaryKriging with coordinate_types='geographic', an error occurred: __init__() got an unexpected keyword argument 'coordinate_types'. Is there something that needs to be imported?

Problem updating variogram_model_parameters with a dict

UniversalKriging

Updating variogram mode...
Using 'gaussian' Variogram Model
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-16-21c1fac50399> in <module>()
      4 #                               anisotropy_scaling=1., anisotropy_angle=0.)
      5 
----> 6 K.update_variogram_model('gaussian', variogram_parameters={'sill':2.25166068e04, 'range':3.84845608e00, 'nugget':2.90938889e02})
      7 K.variogram_model_parameters
      8 K.display_variogram_model()
~/anaconda3/lib/python3.6/site-packages/pykrige/uk.py in update_variogram_model(self, variogram_model, variogram_parameters, variogram_function, nlags, weight, anisotropy_scaling, anisotropy_angle)
    547             else:
    548                 print("Using '%s' Variogram Model" % self.variogram_model)
--> 549                 print("Sill:", self.variogram_model_parameters[0])
    550                 print("Range:", self.variogram_model_parameters[1])
    551                 print("Nugget:", self.variogram_model_parameters[2])
TypeError: 'set' object does not support indexing
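
The print statements in the traceback index variogram_model_parameters by position (0 for sill, 1 for range, 2 for nugget). Assuming this version expects a positional list rather than a mapping (an inference from that indexing, not something confirmed here), one possible workaround is to convert the dict before passing it:

```python
# Values and key names copied from the call in the traceback.
variogram_parameters = {'sill': 2.25166068e04,
                        'range': 3.84845608e00,
                        'nugget': 2.90938889e02}

# Order taken from the print statements in the traceback:
# index 0 = sill, 1 = range, 2 = nugget.
parameter_order = ('sill', 'range', 'nugget')
as_list = [variogram_parameters[name] for name in parameter_order]
```

The resulting as_list could then be tried as the variogram_parameters argument.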

option to downsample the data to calculate the variogram

When you krige with large datasets (100k data points, for example), the number of pairwise distances blows up. It would be nice if there were a parameter in the function, like max_n_for_variogram, that limits the number of data points used.

I have personally already implemented this in the ordinary kriging function as our dataset was too large.
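
A minimal sketch of what such an option could do, assuming a uniform random subsample is acceptable. The function name subsample_for_variogram and the seed argument are hypothetical; only max_n_for_variogram comes from the suggestion above.

```python
import numpy as np


def subsample_for_variogram(x, y, z, max_n_for_variogram, seed=None):
    """Return at most max_n_for_variogram randomly chosen (x, y, z) points."""
    x, y, z = np.asarray(x), np.asarray(y), np.asarray(z)
    n = x.shape[0]
    if n <= max_n_for_variogram:
        return x, y, z
    rng = np.random.default_rng(seed)
    # Sample without replacement so that no point is counted twice.
    idx = rng.choice(n, size=max_n_for_variogram, replace=False)
    return x[idx], y[idx], z[idx]
```

The number of pairs grows as n(n-1)/2, so capping n at 5,000 instead of 100,000 reduces the pair count by a factor of roughly 400.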

does it work well with big data?

I have a large dataset and it doesn't work well with ArcGIS. How well does this code handle it?

Statistics calculations... automatic, or switched by the user?

The policy on whether or not to automatically calculate the statistics (set in places with the enable_statistics boolean kwarg) is inconsistent across the different kriging classes and isn't actually even re-done in the update_variogram_model method in each class. (I think this is just my mistake from way back when I was putting this all together in the first place, and probably also a problem of having similar functionality have to be copied over so many classes.) I'm thinking about just making the statistics calculations automatic/default (i.e., get rid of the enable_statistics flag that appears inconsistently across classes). With the fixes in PRs #47 and #51, the statistics calculations should go smoothly now. Then the user can access them with the existing infrastructure, and if verbose (or someday logging) is enabled then they'll be spit out there. @rth, @basaks, @whdc, thoughts? Along with this change, I'm also thinking of adding a few other outputs to the statistics, like @whdc suggested, and also some more useful calculations from the Kitanidis text...

How about regression kriging?

I have extended the sklearn_cv.Krige class to do regression kriging.
https://en.wikipedia.org/wiki/Regression-Kriging

Would that be of interest?

Add a documentation page

It would be great to add a documentation page for PyKrige, built with Sphinx and hosted, for instance, at readthedocs.org.

The code itself is fairly well documented so it's just a matter of setting sphinx up and checking that the docstrings are correctly parsed. We could use, for instance, numpy style docstrings through sphinxcontrib-napoleon (some adaptation work might be necessary).

P.S.: @bsmurphy could you please grant me collaborator rights on this repo, so I can assign labels on issues etc.? Thanks.

Migrate from nose to pytest for unit tests

nose is not being developed anymore and we should probably migrate to pytest for unit tests (as most packages in the Python ecosystem have done or are in the process of doing).

This would greatly simplify the writing of unit tests, e.g. instead of

self.assertTrue(np.allclose(res, np.array([0.98, 1.05]), 0.01, 0.01))

one could simply write

assert np.allclose(res, np.array([0.98, 1.05]), 0.01, 0.01)

and pytest would handle everything. Actually in this case,

from numpy.testing import assert_allclose

assert_allclose(res, np.array([0.98, 1.05]), 0.01, 0.01)

would be even better.

Note: numpy.testing might still require to install nose (particularly for older numpy versions).

Refactoring of code: separation of concerns for plotting variograms

I want to display an experimental variogram even before deciding which kriging method I want to use. Currently what I can do is use pykrige.ok.OrdinaryKriging.display_variogram_model or pykrige.uk.UniversalKriging.display_variogram_model. Two issues come to mind:

The two methods have some code duplication, which could be solved with a programming pattern like delegation or inheritance
It forces me to assume a variogram model, which will be plotted as well. But I only want the experimental variogram.

My idea is to extract a visualizer as a class of its own that does the plotting. That visualizer is used by any call of display_variogram_model and keeps the matplotlib code in one place. If somebody does not have matplotlib installed, this only needs to be handled in the visualizer class. Further, it gives the freedom to plot only the experimental variogram as well.

Fix travis config

See this Travis CI job.

This is using python3.6 instead of the desired python3.4

0.34s$ wget http://repo.continuum.io/miniconda/Miniconda${TRAVIS_PYTHON_VERSION:0:1}-latest-Linux-x86_64.sh -O miniconda.sh
--2017-02-18 20:34:55-- http://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh
Resolving repo.continuum.io (repo.continuum.io)... 104.16.19.10, 104.16.18.10, 2400:cb00:2048:1::6810:120a, ...
Connecting to repo.continuum.io (repo.continuum.io)|104.16.19.10|:80... connected.
HTTP request sent, awaiting response... 301 Moved Permanently
Location: https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh [following]
--2017-02-18 20:34:55-- https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh
Connecting to repo.continuum.io (repo.continuum.io)|104.16.19.10|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 34663461 (33M) [application/x-sh]
Saving to: 'miniconda.sh'
100%[======================================>] 34,663,461 140M/s in 0.2s
2017-02-18 20:34:55 (140 MB/s) - 'miniconda.sh' saved [34663461/34663461]
before_install.2
0.00s$ chmod +x miniconda.sh
before_install.3
5.93s$ ./miniconda.sh -b
PREFIX=/home/travis/miniconda3
installing: python-3.6.0-0 ...
installing: cffi-1.9.1-py36_0 ...
installing: conda-env-2.6.0-0 ...
installing: cryptography-1.7.1-py36_0 ..

Local Variogram Model?

This probably deserves its own thread.

A useful technique that is already in use in the kriging community in many situations is the local variogram model. Can someone work on this instead?

This may have additional benefits in speeding up the variogram computations for very large problems (with lots of observations).

VESPER is a PC-Windows program developed by the Australian Centre for Precision Agriculture (ACPA) for spatial prediction that is capable of performing kriging with local variograms (Haas, 1990). Kriging with local variograms involves searching for the closest neighbourhood for each prediction site, estimating the variogram from the neighbourhood, fitting a variogram model to the data and predicting the value and its uncertainty. The local variogram is modelled in the program by fitting a variogram model automatically through the nonlinear least-squares method. Several variogram models are available, namely spherical, exponential, Gaussian and linear with sill. Punctual and block kriging is available as interpolation options. This program adapts itself spatially in the presence of distinct differences in local structure over the whole field.

Some context by @bsmurphy

@rth added the moving window function, which is similar to what you're suggesting @basaks except that it assumes a stationary variogram. Adding in the extra layer of re-estimating the variogram for local neighborhoods could certainly be done at some level, but it would require a calculation for each moving window location; therefore, not sure how much it would really boost speed... Anyways, both of these ideas would be nice to implement at some point: the local variogram estimation for max flexibility and the downsampling in global variogram estimation as a useful tool (could be put in the kriging tools module)...

Can't install manually with 'python setup.py install'

Hi,

I tried to clone this repository and then run 'python setup.py install', but it tells me that I don't have numpy installed. I double checked, and numpy is properly installed and working on my computer.

Does anyone know what the issue is?

By the way, I do this because I want to try the regression kriging, which does not seem to be implemented in the pip version.

Thanks a lot for your help

Python 3 support

In the long term perspective, it would probably be good to add Python 3.x support alongside 2.7.

The corresponding test environments, could then be uncommented in .travis.yml and appveyor.yml...

example err

$ python2 Three-Dimensional-Kriging-Example.py
Traceback (most recent call last):
  File "Three-Dimensional-Kriging-Example.py", line 35, in <module>
    k3d, ss3d = uk3d.execute('grid', gridx, gridy, gridz, specified_drift_arrays=[xg, yg])
  File "/usr/lib/python2.7/site-packages/pykrige/uk3d.py", line 752, in execute
    raise ValueError("Dimensions of drift values array do not match specified grid dimensions.")
ValueError: Dimensions of drift values array do not match specified grid dimensions.

why?

Exception in provided 3d example: Inconsistent number of specified drift terms supplied

I took your example provided at https://github.com/bsmurphy/PyKrige#three-dimensional-kriging-example and tried to run it:

>>> k3d, ss3d = uk3d.execute('grid', gridx, gridy, gridz, specified_drift_arrays=[xg, yg, zg])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\current-user\AppData\Roaming\Python\Python36\site-packages\pykrige\uk3d.py", line 775, in execute
    raise ValueError("Inconsistent number of specified drift terms supplied.")
ValueError: Inconsistent number of specified drift terms supplied.

That does not look like what I expected...

Cokriging

Hi,

I would like to know if cokriging is planned. It would be very useful...

Thanks again for this amazing project !

Alexis

Ability to specify cutoff

Hello,

Thanks for your work in the very useful Pykrige. One feature that I haven't found is the ability to specify the variogram cutoff, i.e. the distance up to which the variogram is calculated. Currently it appears that the variogram in Pykrige is calculated across the full distance of the data, whereas the typical cutoff is 1/3 of the diagonal distance of the data (i.e. gstat's default), and being able to specify this is important for many datasets. Apologies if this already exists and I've missed it, but otherwise this would be a useful addition.
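
To show what a cutoff would change, here is a small NumPy sketch of an experimental semivariogram that only uses pairs within a given distance. It is not PyKrige's implementation; the function name experimental_variogram and its arguments are hypothetical, and the binning is deliberately simple.

```python
import numpy as np


def experimental_variogram(x, y, z, nlags=6, cutoff=None):
    """Return (mean lag distance, semivariance) per non-empty bin."""
    x, y, z = (np.asarray(a, dtype=float) for a in (x, y, z))
    i, j = np.triu_indices(len(z), k=1)       # every pair exactly once
    dist = np.hypot(x[i] - x[j], y[i] - y[j])
    half_sq_diff = 0.5 * (z[i] - z[j]) ** 2
    if cutoff is None:
        cutoff = dist.max()                   # no cutoff: use the full extent
    edges = np.linspace(0.0, cutoff, nlags + 1)
    lags, gamma = [], []
    for lo, hi in zip(edges[:-1], edges[1:]):
        # Pairs beyond the cutoff (and coincident points) fall in no bin.
        in_bin = (dist > lo) & (dist <= hi)
        if in_bin.any():                      # skip empty bins
            lags.append(dist[in_bin].mean())
            gamma.append(half_sq_diff[in_bin].mean())
    return np.array(lags), np.array(gamma)
```

Following the gstat default mentioned above, a caller could pass one third of the diagonal of the data's bounding box as cutoff.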

Can't Install

Hello, thanks for sharing

I get an error when I try

pip install pykrige

add mpi example?

I have already implemented an MPI Krige pipeline in another package, with which I can krige large GeoTIFFs.

I have so far used Kriging on 20000x12000 pixels in 10 minutes using ordinary krige + linear + n_esitmators_points=50 using 32 cores.

Let me know if this will be desirable, and I can put together another PR with an MPI example.

where are the search parameters

Hey there, new user here. I'm trying to do some of the kriging we do in SURFER using an open source library like this. Where can the user define the search parameters?

Cannot set backend to "C"

I use the latest PyKrige (1.3.0) on Fedora 23, and everything works well, except that it does not work with backend="C".

Following is the error message:

Warning: failed to load Cython extensions.
See #8
Falling back to a pure python backend...

All values are the same

Good day
We are currently using PyKrige in our project. We use the gaussian variogram model because, with the linear model, all of the values in the .asc output are the same, and we are not sure why. With gaussian the result is fine, but we need the linear model.

here is the script with data:

from pykrige.ok import OrdinaryKriging
import numpy as np
import pykrige.kriging_tools as kt
import subprocess

data = [[7227227.598324121, 741736.5387751074, 10], [7227387.972113899, 742348.3581188999, 10], [7227504.409359513, 741794.7492702686, 50], [7227196.779826333, 742523.8574917192, 10], [7227173.741004699, 742722.5347341978, 0], [7226577.720167324, 742653.4784150455, 400], [7227081.667052395, 743517.2418175992, 40], [7226356.042043427, 742829.1261297092, 90], [7226554.678690424, 742852.1465022594, 50], [7226157.289099098, 742812.6361562246, 800], [7226929.018067002, 743096.8637661343, 50], [7226684.283700548, 743471.1798958695, 60], [7226883.030788386, 743494.2116793899, 60], [7225958.657863444, 742783.0844111233, 600], [7226975.104354201, 742699.5162851174, 60], [7226379.083423361, 742630.4609652269, 780], [7226998.143078854, 742500.8419647408, 10], [7226730.38157898, 743073.8398052861, 290], [7227127.765316596, 743119.8900339963, 20], [7226707.334112899, 743272.5100437128, 20], [7227326.401742509, 743142.9146601006, 20], [7227525.0381373875, 743165.9396187299, 10], [7227303.353986896, 743341.593667091, 10], [7227104.717657487, 743318.566118863, 20], [7226799.395507857, 742477.8248015478, 10], [7225935.724523432, 742981.7453153334, 780], [7225584.425343172, 742538.3925276443, 70], [7226952.062683605, 742898.1902188676, 0], [7226753.426098847, 742875.1691804474, 20], [7225736.977087256, 742958.7210444498, 70], [7226531.745059891, 743050.8161771429, 760], [7226776.467672564, 742676.4981690557, 70], [7227280.303285444, 743540.2722880202, 20], [7225490.149526373, 742349.2620251302, 60], [7225429.391564154, 741351.0645974253, 20], [7227062.957670536, 741852.6477720598, 30], [7225417.185908113, 741151.4868056511, 40], [7225441.5925421305, 741550.7439451264, 50], [7226485.647374582, 743448.1504222901, 60], [7225760.021123014, 742760.0650378156, 0], [7227021.178857623, 742302.1672575969, 10], [7227050.757397798, 741653.0451066317, 110], [7225628.983078791, 741338.9515235351, 20], [7225005.795711183, 740976.1451948371, 10], [7226877.565960731, 741659.1935542912, 80], 
[7225392.878499182, 740752.2342010476, 60], [7225889.449159239, 742324.9649234914, 20], [7225641.186254723, 741538.5328277033, 30], [7226838.962920578, 741465.558159485, 20], [7224996.859375702, 740776.6322658678, 40], [7225205.38657999, 740964.0277736945, 20], [7227038.554236196, 741453.4430858254, 60], [7225229.79817595, 741363.2778863846, 40], [7225453.681624319, 741750.3210617228, 60], [7226826.757280052, 741265.9596496419, 0], [7225193.28812708, 740764.3551863618, 470], [7225241.889738288, 741562.8514915913, 30], [7223792.999598428, 742133.7748068094, 40], [7223371.917262611, 741755.8843535043, 10], [7226905.970504334, 743295.5369270579, 30], [7225783.062211899, 742561.4086442813, 50], [7227548.083043832, 742967.257303668, 20], [7225405.089937136, 740951.8107056284, 0], [7225689.855679398, 742337.0643507005, 20], [7225616.889589428, 741139.2719109184, 30], [7224624.179692563, 742689.773329013, 10], [7223560.952906568, 741545.4710069869, 10], [7223404.271078165, 742355.0653203392, 10], [7223782.140346996, 741934.040743739, 20], [7224402.900267473, 742301.1648633606, 10], [7224203.108117493, 742311.945089181, 10], [7223184.121154054, 741765.6820959963, 10], [7223161.4835907, 741566.9496566014, 10], [7223814.487816765, 742533.2408105701, 20], [7223803.74515754, 742333.5075065546, 20], [7224003.426678358, 742322.7266269, 30], [7224370.538778005, 741702.0415918522, 20], [7223593.20759579, 742144.5559392121, 10], [7224014.1697301455, 742522.4628108565, 20], [7223415.123744923, 742554.7948350566, 20], [7223393.52630399, 742155.3383827255, 10], [7223382.778629602, 741955.6120492729, 20], [7223992.68072684, 742122.9910471919, 20], [7223361.161998633, 741556.2601255218, 10], [7224834.600497055, 742878.7444872069, 20], [7224213.962354492, 742511.6861246709, 20], [7223760.6329405755, 741534.6792663855, 10], [7223971.067542169, 741723.6206349385, 30], [7223771.387197874, 741734.4101528268, 20], [7223425.8645168785, 742754.4220858769, 10], [7225877.258927372, 
742125.2772584416, 10], [7226500.533824252, 742488.058595678, 70], [7228167.217470973, 742078.6910039685, 30], [7227620.821597124, 741241.2340276041, 20], [7227890.517958436, 742020.4753557814, 250], [7225840.814586731, 741526.4044635892, 10], [7223441.704057103, 742995.6797662751, 0], [7224258.535894949, 743116.5761307424, 0], [7223582.459529135, 741944.8267253814, 30], [7224589.876254661, 742293.0382504247, 10], [7224181.612530067, 741912.4687743032, 30], [7224170.858597059, 741712.8324249025, 10], [7224392.042738639, 742101.4215543908, 20], [7223604.063555035, 742344.2877281049, 20], [7223571.708562405, 741745.098115536, 0], [7224823.861671604, 742678.9955736481, 10], [7224381.293102894, 741901.6808193953, 10], [7223614.805821692, 742544.0181519834, 20], [7224613.327568116, 742490.0259277604, 0], [7223625.5469861785, 742743.648281642, 10], [7223981.931875573, 741923.2560715778, 40], [7224192.361773577, 742112.2066297057, 20], [7227336.89690737, 741904.0453019128, 0], [7227446.193639806, 742071.5546095923, 20], [7227395.109361633, 741627.2418545977, 10], [7226508.697690444, 743249.4834928864, 50], [7225852.978788622, 741726.0067433427, 20], [7225653.384751505, 741738.2156896066, 20], [7225465.87681756, 741950.0016979582, 0], [7225677.664066257, 742137.4804681872, 10], [7226076.962701651, 742113.1762338938, 0], [7225665.471353862, 741937.7963174966, 10], [7225478.070910113, 742149.5820675708, 20], [7225865.174804527, 741925.6931193519, 10], [7225266.283984977, 741962.1054698917, 120], [7225217.5947164595, 741163.6020529796, 50], [7225066.578481888, 741974.3074886939, 60], [7225254.0874126945, 741762.5286132753, 70], [7226488.343454836, 742288.4625977299, 40], [7226288.750236297, 742300.5641146363, 70], [7226101.233891608, 742512.3541465371, 20], [7226089.044345303, 742312.7638840869, 30], [7226300.82760811, 742500.2561939631, 280], [7225278.365082124, 742161.7819139991, 80], [7225078.76995909, 742173.9830298338, 30], [7227781.220247371, 741852.9614902339, 40], 
[7227664.781552965, 742406.5815673766, 0], [7227613.7075066045, 741962.2594666366, 0], [7227555.488521611, 742239.0666980959, 0], [7227832.29824914, 742297.2882012674, 0], [7227723.003803087, 742129.7724436527, 0], [7227999.703027056, 742187.9900356561, 70], [7227569.710108707, 740796.9214540131, 50], [7227453.316009933, 741350.4365766125, 30], [7227671.920685718, 741685.4504051579, 30], [7227730.128059231, 741408.6395133291, 50], [7228006.938174991, 741466.9451057224, 0], [7228225.432099374, 741801.8707178305, 280], [7227839.4308860935, 741576.1487067917, 50], [7227679.019901991, 740964.4231965214, 40], [7228116.131559499, 741634.355076432, 130], [7227948.621070396, 741743.6587180682, 40], [7228057.920195877, 741911.1734708956, 360], [7227562.619273299, 741517.9421006007, 110], [7227511.515069407, 741073.7303940638, 40], [7227897.633934952, 741299.4350224253, 160], [7227788.3278440125, 741131.9277193418, 30], [7224343.020621691, 740161.8996701327, 10], [7224425.608069318, 740361.7923243452, 20], [7224283.690284246, 740396.6345250956, 10], [7223463.037230799, 743194.7162524968, 20], [7223661.911908321, 743175.1816026324, 20], [7223860.786574427, 743155.6467022882, 50], [7224059.661240388, 743136.1115416633, 40], [7224215.869797494, 742718.5037778367, 0], [7224414.744507348, 742698.9682300765, 20], [7224634.95220642, 742878.4684002735, 30], [7224436.0775191095, 742898.004259161, 0], [7224237.202842622, 742917.5398685983, 0], [7224457.410549199, 743097.0404698143, 10], [7224656.285192069, 743077.5045587744, 10], [7224279.868954374, 743315.6125746248, 60], [7224478.74357572, 743296.0768518158, 40], [7224500.076620733, 743495.1134159163, 0], [7224301.202032242, 743514.6491906192, 30], [7224322.535106477, 743713.6859785933, 0], [7226133.83447627, 743004.8117909471, 500], [7226332.471057562, 743027.8347541334, 430], [7225824.052747822, 741291.0459670975, 0], [7226592.771177746, 741050.3176416397, 10], [7226393.178775877, 741062.5383295384, 0], [7226006.091973691, 
741286.460748286, 20], [7225993.88470266, 741086.8747398502, 10], [7226604.977896271, 741250.013172498, 10], [7226193.477278963, 741074.6554450922, 0], [7226405.276076367, 741262.1281139745, 10], [7226205.683175875, 741274.345237537, 10], [7225781.097517002, 740682.4678015739, 0], [7225980.6893187035, 740670.2431864912, 0], [7226180.390043719, 740658.1207460476, 0], [7226379.981671028, 740645.8947385595, 0], [7226392.19402713, 740845.586234625, 10], [7226192.492979417, 740857.7065003845, 10], [7225992.90085382, 740869.9289436647, 20], [7225793.310423685, 740882.049776317, 10], [7225611.271718425, 740886.639979107, 10], [7226309.423783402, 743226.4991375986, 10], [7226110.787298828, 743203.4732513556, 40], [7225811.845756525, 741091.4625891433, 10], [7225599.059091186, 740687.0606350523, 0]]

data = np.array(data)
grid_step = 10
gridx = np.arange(740079.981183, 743963.151806, grid_step)
gridy = np.arange(7223164.31876, 7228331.36228, grid_step)

# Create the ordinary kriging object. Required inputs are the X-coordinates of
# the data points, the Y-coordinates of the data points, and the Z-values of the
# data points. If no variogram model is specified, defaults to a linear variogram
# model. If no variogram model parameters are specified, then the code automatically
# calculates the parameters by fitting the variogram model to the binned
# experimental semivariogram. The verbose kwarg controls code talk-back, and
# the enable_plotting kwarg controls the display of the semivariogram.
OK = OrdinaryKriging(data[:, 0], data[:, 1], data[:, 2], variogram_model='linear')

# Creates the kriged grid and the variance grid. Allows for kriging on a rectangular
# grid of points, on a masked rectangular grid of points, or with arbitrary points.
# (See OrdinaryKriging.__doc__ for more information.)
z, ss = OK.execute('grid', gridx, gridy)

# Writes the kriged grid to an ASCII grid file.
kt.write_asc_grid(gridx, gridy, z, filename='krige-test.asc')

Is there a setting that we are missing?
Thank you

Working with a large grid size

Hello,

Thank you so much for working on this and making it available on GitHub!

I would like to use this module with a moderately large grid, on the order of nx=1000, ny=1000, with n=100 data points. UniversalKriging solves the linear system A X = b, where A is, I believe, an array of shape (nx, ny, n, n). Storing that matrix in 64-bit floats requires roughly nx*ny*n**2*8/1e9 = 80 GB of RAM, so I'm running out of memory on my laptop.

I was wondering if it would be possible to optimise this code so that it is more memory efficient? For instance, one option would be to port the execute method in uk.py to Cython, so that it uses fewer temporary numpy arrays for indexing, etc., which would save some memory. However, at the end of the day, it's still necessary to define that same A matrix (which is not sparse, as far as I can tell) for the linear system. I suppose it's not possible to do a domain decomposition on the grid (split it into regions and do the calculation region by region), right? Are there any other things that could be attempted? I can do some optimisation; I would just need some advice about where to start, since I don't know much about kriging.

Thanks,
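
A back-of-the-envelope check of the 80 GB figure, plus a rough sketch of the region-by-region idea. This is only an illustration: `kriging_matrix_bytes`, `iter_row_chunks` and `krige_in_chunks` are hypothetical helpers, not part of PyKrige, and `krige_chunk` stands in for whatever routine evaluates kriging on a sub-grid. It assumes the prediction at each grid point can be computed independently of the other grid points.

```python
import numpy as np


def kriging_matrix_bytes(nx, ny, n, itemsize=8):
    """Bytes needed for a dense (nx, ny, n, n) array (8 bytes per 64-bit float)."""
    return nx * ny * n * n * itemsize


def iter_row_chunks(gridy, rows_per_chunk):
    """Yield (start, stop) index pairs that split gridy into row blocks."""
    for start in range(0, len(gridy), rows_per_chunk):
        yield start, min(start + rows_per_chunk, len(gridy))


def krige_in_chunks(krige_chunk, gridx, gridy, rows_per_chunk):
    """Fill the output grid block by block.

    krige_chunk(gridx, gridy_block) is a hypothetical callable that returns
    an array of shape (len(gridy_block), len(gridx)).
    """
    out = np.empty((len(gridy), len(gridx)))
    for start, stop in iter_row_chunks(gridy, rows_per_chunk):
        out[start:stop] = krige_chunk(gridx, gridy[start:stop])
    return out


print(kriging_matrix_bytes(1000, 1000, 100) / 1e9)  # size in GB
```

Under the same (nx, ny, n, n) assumption, processing 10 rows at a time would need about 10*1000*100*100*8/1e9 = 0.8 GB per block instead of 80 GB for the whole grid.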

pykrige.test.TestRegressionKrige times out

Travis CI occasionally fails because pykrige.test.TestRegressionKrige takes more than 10 min to run (the maximum allowed time). This seems to happen randomly; on other occasions it completes within seconds. See e.g. #61, where one Travis CI job passed and the other failed.

I think the problem is that the test sometimes fails to download the test dataset with fetch_california_housing in https://github.com/bsmurphy/PyKrige/blob/master/pykrige/test.py#L1726 . In any case, it might be a good idea to replace this with a built-in dataset that doesn't need to be downloaded.

cc @basaks
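
One way to avoid the download would be to generate a small synthetic dataset inside the test. The sketch below is only an assumption about what such a replacement could look like: `make_synthetic_regression` is a hypothetical helper, and the data it produces (random features, random 2D coordinates, and a target with a spatial trend) is made up for illustration rather than taken from the existing test.

```python
import numpy as np


def make_synthetic_regression(n_samples=200, n_features=5, seed=0):
    """Return (features, coordinates, target) without any network access."""
    rng = np.random.RandomState(seed)  # fixed seed keeps the test reproducible
    features = rng.uniform(size=(n_samples, n_features))
    coords = rng.uniform(size=(n_samples, 2))
    weights = np.arange(1, n_features + 1, dtype=float)
    # Linear part (for the regression step) plus a spatial trend and noise
    # (for the kriging step applied to the residuals).
    target = (
        features.dot(weights)
        + np.sin(2 * np.pi * coords[:, 0])
        + 0.1 * rng.normal(size=n_samples)
    )
    return features, coords, target


X, xy, y = make_synthetic_regression()
print(X.shape, xy.shape, y.shape)
```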

No module named 'Cython'

Hi there, first of all thanks for the project!

Somehow I struggle to install it when using Windows + Miniconda.

(C:\path\to\miniconda) C:\current\directory>pip install pykrige
Collecting pykrige
  Downloading PyKrige-1.3.0.tar.gz (596kB)
    100% |████████████████████████████████| 604kB 1.5MB/s
    Complete output from command python setup.py egg_info:
        Traceback (most recent call last):
          File "<string>", line 1, in <module>
          File "C:\Users\current-user\AppData\Local\Temp\pip-build-phddipzi\pykrige\setup.py", line 14, in <module>
        from Cython.Distutils import build_ext
    ModuleNotFoundError: No module named 'Cython'

    ----------------------------------------
Command "python setup.py egg_info" failed with error code 1 in C:\Users\current-user\AppData\Local\Temp\pip-build-phddipzi\pykrige\
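
The traceback shows setup.py importing Cython unconditionally, so the build stops before anything is installed. Below is a minimal sketch of checking for the module first; `module_available` is a hypothetical helper, not something the PyKrige setup script is known to contain, and installing Cython beforehand with `pip install cython` is the workaround assumed here.

```python
import importlib.util


def module_available(name):
    """Return True if the top-level module `name` can be imported here."""
    return importlib.util.find_spec(name) is not None


if module_available("Cython"):
    build_hint = "Cython found: the extension modules can be compiled."
else:
    build_hint = "Cython missing: run 'pip install cython' first, then retry."

print(build_hint)
```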

Geographically weighted regression kriging

There is a proposal that I might work on this. There is a lot of material on GWR in R, but I could only find one such package in Python. Let me know if you come across other GWR Python packages. Depending on how pygwr fares, I might look at an R/Python hybrid, or simply an R-based solution.

Benchmark results

I tried a benchmark run against the gstat package in R, and found differences between the two outputs. Have you ever done similar experiments?

library(gstat)
a <- read.table("http://www.stat.ucla.edu/~nchristo/statistics_c173_c273/kriging_11.txt", header=TRUE)
x.range <- as.integer(range(a[,1]))
y.range <- as.integer(range(a[,2]))
grd <- expand.grid(x=seq(from=x.range[1], to=x.range[2], by=1),
                     y=seq(from=y.range[1], to=y.range[2], by=1))
m <- vgm(10, "Exp", 3.33, 0)
q1 <- krige(id="z", formula=z~1, data=a, newdata=data.frame(x=64, y=128), model = m,
            locations=~x+y)

from pykrige.ok import OrdinaryKriging
import numpy as np
import pykrige.kriging_tools as kt

data = np.loadtxt('./kriging_11.txt', skiprows=1)

# vgm(psill, model, range, nugget)
# m <- vgm(10, "Exp", 3.33, 0)
#spherical - [sill, range, nugget]
OK = OrdinaryKriging(data[:, 0], data[:, 1], data[:, 2], variogram_model='exponential',
                     variogram_parameters = [10, 3.33, 0], 
                     verbose=False, enable_plotting=False)
z, ss = OK.execute('points', np.array([64]), np.array([128]))
print(z)

For the prediction at (64, 128), I got 338.9828 from gstat and 363.59867058131033 from pykrige.
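
One thing that may be worth checking is whether both packages interpret the "range" of the exponential model in the same way. The sketch below compares two common conventions: range as the e-folding distance, and range as the effective range at which the model reaches about 95% of the sill. Which convention each package uses is an assumption to verify against its documentation, and the function names are hypothetical.

```python
import math


def exp_variogram_efolding(h, psill, range_, nugget=0.0):
    """Exponential model where `range_` is the e-folding distance."""
    return nugget + psill * (1.0 - math.exp(-h / range_))


def exp_variogram_effective(h, psill, range_, nugget=0.0):
    """Exponential model where `range_` is the effective (about 95%) range."""
    return nugget + psill * (1.0 - math.exp(-3.0 * h / range_))


# Same parameters as in the benchmark: psill=10, range=3.33, nugget=0.
for h in (1.0, 3.33, 10.0):
    print(h, exp_variogram_efolding(h, 10, 3.33), exp_variogram_effective(h, 10, 3.33))
```

If the two packages differ in this convention, passing the same three numbers would describe two different variograms, which could by itself change the prediction.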

Update changelog / contributors acknowledgement

It would be nice to update CHANGELOG.md with the latest changes, and also to reverse it, so that the latest changes are at the top (not the bottom).

It might also be a good moment to discuss how best to acknowledge different contributors. For new features or major updates, for instance, we could add the name of the contributor (with their agreement) and the PR number, as is done in the scikit-learn changelog?

Make matplotlib an optional dependency

Currently matplotlib is a hard dependency, imported as import matplotlib.pyplot as plt at the top of most source files. It might be better to make it an optional dependency, so that kriging would still work even if matplotlib is not installed, and plotting would fail with an explicit error message asking the user to install matplotlib.

This is particularly useful when computing kriging on a remote server, where plotting might not be suitable anyway.
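
A minimal sketch of the lazy-import pattern this would need. `optional_import` and `display_variogram` are hypothetical names used for illustration, not existing PyKrige functions; the idea is simply that matplotlib is imported only when a plot is actually requested.

```python
import importlib


def optional_import(module_name, purpose):
    """Import a module on demand, with an explicit message if it is missing."""
    try:
        return importlib.import_module(module_name)
    except ImportError:
        raise ImportError(
            "{} is required for {}; please install it.".format(module_name, purpose)
        )


def display_variogram(lags, semivariance):
    """Plot an experimental semivariogram; needs matplotlib only when called."""
    plt = optional_import("matplotlib.pyplot", "plotting")
    fig, ax = plt.subplots()
    ax.plot(lags, semivariance, "r*")
    plt.show()
```

With this layout, importing the kriging code never touches matplotlib, and only a call to the plotting function can raise the error.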

Can data with lon and lat be processed?

Hello, I would really like to know how to process data given as lon and lat. Do I have to transform it to a projected coordinate system? Thank you.
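
For a small region, one simple option is to convert lon/lat to local planar coordinates before kriging. The sketch below uses an equirectangular approximation on a spherical Earth; `lonlat_to_local_xy` is a hypothetical helper, and the approximation is an assumption that degrades for large areas or high latitudes, where a proper map projection would be the safer choice.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius, spherical approximation


def lonlat_to_local_xy(lon, lat, lon0, lat0):
    """Approximate offsets in metres of (lon, lat) from a reference (lon0, lat0)."""
    x = EARTH_RADIUS_M * math.radians(lon - lon0) * math.cos(math.radians(lat0))
    y = EARTH_RADIUS_M * math.radians(lat - lat0)
    return x, y


# One degree of latitude north of the reference point.
print(lonlat_to_local_xy(10.0, 51.0, 10.0, 50.0))
```

The resulting x and y arrays can then be used wherever planar coordinates are expected.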

3D universal kriging

Could you implement this? How much work would it take?

_get_kriging_matrix is very slow for large datasets

Hi again,

I just saw that the function '_get_kriging_matrix' is very slow (about 5-10 secs) when there are about 10,000 or 15,000 training points.

Do you know if a speed-up would be possible? For instance, since I specified 'n_closest_points' = 100 in my model, I was thinking that computing the kriging matrix over the 100 nearest points would be sufficient?
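
To illustrate the idea, here is a rough sketch of building the pairwise-distance matrix only over the k nearest training points of one target, so that it is k x k rather than N x N. `closest_point_indices` and `local_distance_matrix` are hypothetical helpers written for this example; they do not describe how PyKrige builds its kriging matrix internally.

```python
import numpy as np


def closest_point_indices(xy, target, k):
    """Indices of the k training points nearest to `target`, closest first."""
    d2 = np.sum((xy - target) ** 2, axis=1)
    k = min(k, len(xy))
    idx = np.argpartition(d2, k - 1)[:k]  # k smallest, in arbitrary order
    return idx[np.argsort(d2[idx])]


def local_distance_matrix(xy, target, k):
    """Pairwise distances among the k nearest points only (shape k x k)."""
    idx = closest_point_indices(xy, target, k)
    local = xy[idx]
    diff = local[:, None, :] - local[None, :, :]
    return idx, np.sqrt(np.sum(diff ** 2, axis=-1))


rng = np.random.RandomState(0)
points = rng.uniform(size=(10000, 2))
idx, dist = local_distance_matrix(points, np.array([0.5, 0.5]), 100)
print(dist.shape)
```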

Trying to run Regression kriging but no module named 'pykrige.rk'

Greetings,

Trying to run the regression kriging example found on the git page, here:

https://github.com/bsmurphy/PyKrige/blob/master/examples/regression_kriging2d.py

When I run the code in spyder I get the error "No module named 'pykrige.rk'"

I am using a pip-installed PyKrige, version 1.3.2.

Looking in the site-packages folder where PyKrige is installed, I don't see a file named rk.py, like the one I see in the master version that I downloaded from GitHub. Should I be using a different example for this older version of PyKrige?

If not, and I know this is an off-topic question, how can I install the new version on a Windows machine?

Thanks in advance,
Austin.

Getting mean in ordinary kriging

Hi
I'm doing 1D ordinary kriging and instead of the predictions, I need the mean of the kriged process. Is there any method that returns that or the estimated weights of the kriged process?

Thanks a lot!
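
For reference, under the ordinary kriging assumption of a constant unknown mean, the mean can be estimated by generalized least squares as m = (1' C^-1 z) / (1' C^-1 1), where C is the covariance matrix between the data points and z the observed values. The sketch below implements that formula directly; `ok_mean_estimate` is a hypothetical helper, not a PyKrige method, and it assumes you can build C yourself from a covariance model (for a bounded variogram, sill minus the variogram value).

```python
import numpy as np


def ok_mean_estimate(cov, values):
    """Generalized least squares estimate of the constant mean."""
    ones = np.ones(len(values))
    cinv_ones = np.linalg.solve(cov, ones)  # C^-1 1, without forming the inverse
    # C is symmetric, so (C^-1 1)' z equals 1' C^-1 z.
    return cinv_ones.dot(values) / cinv_ones.dot(ones)


# With uncorrelated data the estimate reduces to the arithmetic mean.
print(ok_mean_estimate(np.eye(3), np.array([1.0, 2.0, 6.0])))
```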
