geostat-framework / pykrige
Kriging Toolkit for Python
Home Page: https://pykrige.readthedocs.io
License: BSD 3-Clause "New" or "Revised" License
Use the logging module everywhere, instead of print statements?
Make this part of next release?
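A minimal sketch of what replacing print with the logging module could look like. The logger name and the report_variogram helper are made up for illustration and are not part of PyKrige.

```python
import logging

# Module-level logger; library code should not configure handlers itself.
logger = logging.getLogger("pykrige_logging_sketch")


def report_variogram(model_name, parameters):
    """Log what a print-based verbose mode would have printed."""
    logger.info("Using '%s' Variogram Model", model_name)
    for name, value in parameters.items():
        logger.debug("%s: %s", name, value)


if __name__ == "__main__":
    # The application, not the library, decides how much is shown.
    logging.basicConfig(level=logging.DEBUG)
    report_variogram("linear", {"slope": 0.5, "nugget": 0.1})
```

With this layout a user who wants silence does nothing, and a user who wants the old verbose output only has to configure logging once.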
Seems there is 1.3.1 tagged this past December. Would be nice to have it on PyPI where 1.3.0 is the latest.
Just a few thoughts about possible code refactoring in PyKrige.
Currently PyKrige has separate code that implements 2D and 3D kriging (e.g. in ok.py and ok3d.py). This results in code duplication and makes code maintenance and adding new features more difficult (they need to be added to every single file). Besides, 1D kriging is not implemented, and it would be nice to have some 1D examples as they are easier to visualize.
In addition, PR #24 adds a scikit-learn API to PyKrige on top of the existing UniversalKriging and OrdinaryKriging methods.
A possible solution to remove the current code duplication would be to refactor UniversalKriging and OrdinaryKriging to work in N dimensions. The simplest way of doing it would be to use something along the lines of the scikit-learn API, which would also remove the need for an additional wrapper. The general API could be something like,
class OrdinaryKriging(pykrige.compat.BaseEstimator):
    def __init__(self, <kriging options>):
        [...]

    def fit(self, X, y):
        """Where X is an array [n_samples, n_dimensions] and y is an array [n_samples]."""
        [...]

    def predict(self, X):
        """Where X is an array [n_samples, n_dimensions].

        The equivalent of the current execute(style="points", ...).
        """
        [...]
The case of execute(style="masked", ...) could be done by supporting masked arrays for predict(X), while the case execute(style="grid", ...) can be done with a helper function,
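import numpy as np

def points2grid(*points):
    """points: 1D coordinate arrays, e.g. zpts, ypts, xpts"""
    # One coordinate grid per dimension, in matrix ('ij') ordering.
    grids = np.meshgrid(*points, indexing='ij')
    # Stack the flattened grids as columns: [n_grid_nodes, n_dimensions].
    return np.concatenate([grid.flatten()[:, None] for grid in grids], axis=1)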
which is mostly what is done internally by execute at present.
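As a rough illustration of the grid case, the sketch below flattens a small 2D grid into an [n_samples, n_dimensions] array and reshapes per-point values back onto the grid. The stand-in for predict(X) is invented for the example. It is not the proposed estimator.

```python
import numpy as np


def points2grid(*points):
    """Return an [n_samples, n_dim] array of all grid node coordinates."""
    grids = np.meshgrid(*points, indexing="ij")
    return np.concatenate([grid.flatten()[:, None] for grid in grids], axis=1)


ypts = np.array([0.0, 1.0])
xpts = np.array([10.0, 20.0, 30.0])

X = points2grid(ypts, xpts)  # shape (2 * 3, 2), one row per grid node

# Stand-in for predict(X): any function returning one value per row of X.
values = X[:, 0] + X[:, 1]

# Reshape the flat predictions back onto the (ny, nx) grid.
grid_values = values.reshape(len(ypts), len(xpts))
print(grid_values)
```

Because the grids are built with indexing='ij', the reshape uses the same axis order as the arguments passed to the helper.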
This would break backward compatibility though, so a major version change would be needed (PyKrige v2).
Update: the refactoring could follow the steps below,
core.py: merge adjust_for_anisotropy and adjust_for_anisotropy_3d into a single private function _adjust_for_anisotropy(X, center, scaling, angle), where X is a [n_samples, n_dim] array and all the others are lists of floats; it returns the X_modified array. (PR #33)
core.py: merge initialize_variogram_model and initialize_variogram_model_3d into a single private function _initialize_variogram_model(X, y, variogram_model, variogram_model_parameters, variogram_function, nlags, weight). (PR #47)
core.py: merge krige and krige_3d into a single private function _krige(X, y, coords, variogram_function, variogram_model_parameters). (PR #51)
core.py: similarly merge find_statistics*. (PR #51)
[...] OrdinaryKriging would be great but it's already taken). Maybe OrdinaryNDKriging?
[...] OrdinaryKriging and OrdinaryKriging3D to use OrdinaryNDKriging internally.
[...] OrdinaryNDKriging and add deprecation warnings on OrdinaryKriging, OrdinaryKriging3D.
What do you think?
Hi,
Firstly, massive thanks for this awesome library. I am still a novice in this field, so please excuse my ignorance. I just want to know if regression kriging can be applied to a user-specified grid.
For instance, ordinary kriging has: OK.execute('grid', gridx, gridy)
and universal kriging: UK.execute('grid', gridx, gridy)
Your example on regression kriging stops at the display of the scores:
(m_rk.score(p_test, x_test, target_test))
I see there are a couple of options like fit, krige, krige_residual and predict on the RegressionKriging object. Should I use one of those functions?
Thanks a lot
As discussed in #60, quite a few new features have been added since last year, and it would be good to make a new release (v1.4).
@bsmurphy @basaks What do you think is left to do to make this release happen? Could you please add / remove open issues to/from the v1.4 milestone if needed? Thanks!
While trying to do an external drift calculation I hit an error saying that the dimensions do not match, although they do. I think the issue is on line 341 of uk.py, where the y dimension of external_drift is compared to the x dimension of external_drift_x.
if external_drift.shape[0] != external_drift_y.shape[0] or \
        external_drift.shape[1] != external_drift_x.shape[0]:  ### Here is the issue
    if external_drift.shape[0] == external_drift_x.shape[0] and \
            external_drift.shape[1] == external_drift_y.shape[0]:
        self.external_Z_drift = np.array(external_drift.T)
    else:
        raise ValueError("External drift dimensions do not match provided "
                         "x- and y-coordinate dimensions.")
Basically, I think this change should be made
external_drift.shape[1] != external_drift_x.shape[1]:
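To make the shape convention in the quoted check concrete, here is a standalone sketch of the same logic. It assumes external_drift_x and external_drift_y are 1D coordinate arrays. The function name orient_external_drift is hypothetical and is not taken from uk.py.

```python
import numpy as np


def orient_external_drift(external_drift, external_drift_x, external_drift_y):
    """Return the drift array with shape (len(y), len(x)), transposing if needed."""
    ny, nx = external_drift_y.shape[0], external_drift_x.shape[0]
    if external_drift.shape == (ny, nx):
        return np.array(external_drift)
    if external_drift.shape == (nx, ny):
        return np.array(external_drift.T)
    raise ValueError("External drift dimensions do not match provided "
                     "x- and y-coordinate dimensions.")


drift_x = np.linspace(0.0, 3.0, 4)   # 4 columns
drift_y = np.linspace(0.0, 2.0, 3)   # 3 rows
drift = np.zeros((3, 4))             # rows follow y, columns follow x

print(orient_external_drift(drift, drift_x, drift_y).shape)
print(orient_external_drift(drift.T, drift_x, drift_y).shape)
```

Under the 1D assumption, external_drift_x.shape[1] does not exist and indexing it raises an IndexError, so the proposed change would only apply if the coordinates are passed as 2D arrays.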
CircleCI seems to be having some issues running our setup.py script, since we check to ensure dependencies are installed before running the actual setup function. I added a circle.yml file to try to make sure that numpy/scipy/matplotlib are installed (via pip) before the setup script is run, but that seems to have led to other problems (I think maybe because doing pip install matplotlib is problematic).
@rth, @basaks, since you guys are better at this kind of stuff than I am, do you have any ideas?
I'm not sure if I'm misunderstanding something, or if I've discovered a bug. But I got a small code example to reproduce it:
We start with these data:
y/x | 0 | 1 | 2 |
---|---|---|---|
2 | n | 5 | n |
1 | 0 | n | n |
0 | 1 | n | n |
from pykrige.ok import OrdinaryKriging
import numpy as np
import pykrige.kriging_tools as kt
z = [[0,0,1],
[0,1,0],
[1,2,5]]
data = np.array(z)
gridx = np.arange(0.0, 2.1, 0.5)
gridy = np.arange(0.0, 2.1, 0.5)
OK = OrdinaryKriging(data[:, 0], data[:, 1], data[:, 2], variogram_model='linear',
verbose=False, enable_plotting=False)
z, _ = OK.execute('grid', gridx, gridy)
kt.write_asc_grid(gridx, gridy, z, filename="output.asc")
I would assume the output would be (preserving the three original values, with interpolated x values):
y/x | 0 | 0.5 | 1 | 1.5 | 2 |
---|---|---|---|---|---|
2 | x | x | 5 | x | x |
1.5 | x | x | x | x | x |
1 | 0 | x | x | x | x |
0.5 | x | x | x | x | x |
0 | 1 | x | x | x | x |
But the output file contains the following:
NCOLS 5
NROWS 5
XLLCENTER 0.00
YLLCENTER 0.00
DX 0.50
DY 0.50
NODATA_VALUE -999.00
2.67 5.00 2.67 2.53 2.44
2.28 2.29 2.28 2.25 2.23
1.86 -0.00 1.86 1.96 2.02
1.62 1.54 1.62 1.74 1.83
1.50 1.00 1.50 1.60 1.69
From my interpretation of this, the original values have all been placed at X=0.5
It would be useful to return not only the point estimate in regression kriging, but also the variance, so that the user can then compute a confidence interval for each prediction.
I think it would not be too difficult :
It's not perfect, since we will not catch all the variance, but it's better than nothing...
What do you think ?
I'm trying to interpolate data which contains missing values using PyKrige. Is this possible? So far, I have encountered this error while doing so:
import numpy as np
from pykrige.ok import OrdinaryKriging
data = np.array([[0.3, 1.2,np.nan],
[1.9, 0.6, np.nan],
[1.1, 3.2, np.nan],
[3.3, 4.4, 1.47],
[4.7, 3.8, 1.74]])
gridx = np.arange(0.0, 5.5, 0.5)
gridy = np.arange(0.0, 5.5, 0.5)
OK = OrdinaryKriging(data[:,0],data[:,1],data[:,2],variogram_model='linear',verbose=False)
File "<ipython-input-40-17311a362b4a>", line 17, in <module>
OK = OrdinaryKriging(data[:,0],data[:,1],data[:,2],variogram_model='linear',verbose=False)
File "~/python3.6/site-packages/pykrige/ok.py", line 232, in __init__
self.variogram_function, nlags, weight)
File "~/python3.6/site-packages/pykrige/core.py", line 199, in initialize_variogram_model
variogram_function, weight)
File "~/python3.6/site-packages/pykrige/core.py", line 286, in calculate_variogram_model
x0 = [(np.amax(semivariance) - np.amin(semivariance))/(np.amax(lags) - np.amin(lags)),
File "~/python3.6/site-packages/numpy/core/fromnumeric.py", line 2252, in amax
out=out, **kwargs)
File "~/python3.6/site-packages/numpy/core/_methods.py", line 26, in _amax
return umr_maximum(a, axis, None, out, keepdims)
ValueError: zero-size array to reduction operation maximum which has no identity
Is there a workaround for this?
Thanks!
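One possible workaround, sketched below, is to drop the rows with missing values before constructing the kriging object. This assumes the NaN rows are simply missing observations that can be discarded. It uses numpy only and does not call PyKrige.

```python
import numpy as np

data = np.array([[0.3, 1.2, np.nan],
                 [1.9, 0.6, np.nan],
                 [1.1, 3.2, np.nan],
                 [3.3, 4.4, 1.47],
                 [4.7, 3.8, 1.74]])

# Keep only rows where x, y and z are all finite.
valid = np.isfinite(data).all(axis=1)
clean = data[valid]

print(clean)
# The x, y, z columns to hand to the kriging class would then be
# clean[:, 0], clean[:, 1], clean[:, 2].
```

In this particular example only two observations remain after filtering, which may still be too few to fit a variogram.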
Making this @bsmurphy comment a separate issue,
BTW, @rth (and anyone else who thinks about matrices, I don't think about them that much admittedly), I've been wondering if we could use LU decomposition on the kriging matrix to speed up the solution. Any thoughts? I haven't thought about this that much, but since the LHS matrix is the same (just the RHS vector that's changing), we might be able to leverage this for the looping backend...
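A small numpy sketch of the idea of reusing one factorization for many right-hand sides. The matrix here is a random, well-conditioned stand-in rather than a real kriging matrix. numpy.linalg.solve factorizes the left-hand side once when given a 2D array of right-hand sides. scipy.linalg.lu_factor and lu_solve expose the same reuse explicitly.

```python
import numpy as np

rng = np.random.default_rng(0)
n, n_targets = 6, 4

# Stand-in for the kriging matrix: symmetric and well conditioned.
M = rng.random((n, n))
A = M @ M.T + n * np.eye(n)

# One right-hand side per prediction point, stored as columns.
B = rng.random((n, n_targets))

# Looping approach: A is factorized again for every right-hand side.
looped = np.column_stack(
    [np.linalg.solve(A, B[:, j]) for j in range(n_targets)]
)

# Batched approach: A is factorized once and reused for all columns of B.
batched = np.linalg.solve(A, B)

print(np.allclose(looped, batched))
```

This only helps when the left-hand side really is identical for all prediction points, which is not the case when a different set of nearest neighbours is used for each point.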
In line with the code reorganization in issue #33, I was wondering what your opinion is on data pipelines. I could be wrong, but it looks like what the current OrdinaryKriging etc. do could be separated into several independent steps,
One could then potentially create, for instance, AnisotropyTransformer, CoordinateTransformer, KrigingEstimator classes (or some other names) and get the results by constructing a pipeline using sklearn.pipeline.Pipeline or sklearn.pipeline.make_pipeline. The advantage of this being that the different transformers
I'm not sure if that would be useful. Among other things, this could depend on how many more options we might end up adding. For instance, the UniversalKriging class currently has 16 input parameters, which is already quite a lot. If we end up adding, say, a local variogram search radius, a kriging search radius and a few others, splitting the processing into several steps might be a way to simplify the interface for users.
Just a thought... What do you think?
Now that the online documentation is set up, some effort is needed to make sure all docstrings use correct numpy-style formatting.
For instance, most methods under "API reference" of the documentation, are currently not very well rendered.
Any help on this would be greatly appreciated. Instructions on how to build the documentation locally can be found in #48
This issue aims to discuss scaling of PyKrige to large datasets (which could impact, for instance, the optimization approaches in issue #35).
Here are approximate (and possibly inaccurate) time complexity estimates for different processing steps of the kriging process in 2D, according to these benchmarks (adapted from PR #36), applied to a 5k-10k dataset which only has 2 measurement points for each parameter,
~O(N_train²)
~O(N_test*N_train^1.5)
~O(N_test*N_nn^(1~2))
For information, the approximate time complexities of the linear algebra operations that may limit the performance are,
O(N^3) for linear system inversions
O(N^3) for matrix multiplication
O(N^3) for matrix inversions
(though the constant term would be quite different).
This may be of interest to @kvanlombeek and @basaks as discussed in issue #29. The training part indeed doesn't scale so well with the dataset size and also affects the prediction time. The total run time for the attached benchmarks is 48 min wall time and 187 min CPU time (on a 4-core CPU), so most of the critical operations do take advantage of a multi-threaded BLAS for linear algebra operations.
Any suggestions on how we could improve scaling (or general performance) are very welcome.
As suggested already in the code, implement the option that the variance is taken into account when you krige.
Hi,
Thanks so much for your work.
I think on line 193 of core.py:
if np.any(x == coords[0]) and np.any(y == coords[1]):
the purpose is to check whether there is a point [x, y] that equals coords; however, this line also evaluates to true in the case of:
x = [1, 2, 3]
y = [1, 2, 3]
coords = [1,2]
even though there is no point [x, y] equal to coords = [1, 2].
I have gotten around this by using:
bd = np.sqrt((x - coords[0])**2 + (y - coords[1])**2)
zero_value = np.any(bd <= 1e-8)
I am unfamiliar with the internals of the algorithm so please let me know if my suspicion is incorrect.
Thanks
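The behaviour described above can be reproduced with a few lines of numpy. The sketch compares the quoted check with an element-wise variant and with the distance-based workaround from the report. It is independent of core.py.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0])
y = np.array([1.0, 2.0, 3.0])
coords = np.array([1.0, 2.0])

# Original check: x and y are tested independently, so it is True
# even though the point (1, 2) is not among the data points.
independent_match = bool(np.any(x == coords[0]) and np.any(y == coords[1]))

# Element-wise check: the same data point must match in both coordinates.
paired_match = bool(np.any((x == coords[0]) & (y == coords[1])))

# Distance-based check from the report, tolerant to round-off.
bd = np.sqrt((x - coords[0]) ** 2 + (y - coords[1]) ** 2)
zero_value = bool(np.any(bd <= 1e-8))

print(independent_match, paired_match, zero_value)
```

The first flag comes out True while the other two are False, which matches the suspicion that the original condition can report a coincident point that does not exist.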
When I used OrdinaryKriging with coordinate_types='geographic', an error occurred: __init__() got an unexpected keyword argument 'coordinate_types'. Does something need to be imported first?
UniversalKriging
Updating variogram mode...
Using 'gaussian' Variogram Model
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-16-21c1fac50399> in <module>()
4 # anisotropy_scaling=1., anisotropy_angle=0.)
5
----> 6 K.update_variogram_model('gaussian', variogram_parameters={'sill':2.25166068e04, 'range':3.84845608e00, 'nugget':2.90938889e02})
7 K.variogram_model_parameters
8 K.display_variogram_model()
~/anaconda3/lib/python3.6/site-packages/pykrige/uk.py in update_variogram_model(self, variogram_model, variogram_parameters, variogram_function, nlags, weight, anisotropy_scaling, anisotropy_angle)
547 else:
548 print("Using '%s' Variogram Model" % self.variogram_model)
--> 549 print("Sill:", self.variogram_model_parameters[0])
550 print("Range:", self.variogram_model_parameters[1])
551 print("Nugget:", self.variogram_model_parameters[2])
TypeError: 'set' object does not support indexing
When you krige with large datasets (100k data points, for example), the number of pairwise distances blows up. It would be nice if there were a parameter in the function, like max_n_for_variogram, that limits the number of data points used.
I have personally already implemented this in the ordinary kriging function as our dataset was too large.
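A sketch of how such a limit could work, assuming a plain random subsample is acceptable for variogram estimation. The helper name subsample_for_variogram is invented for the example, and max_n_for_variogram is the parameter name suggested above, not an existing PyKrige option.

```python
import numpy as np


def subsample_for_variogram(x, y, z, max_n_for_variogram, seed=None):
    """Return at most max_n_for_variogram randomly chosen data points."""
    n = len(z)
    if n <= max_n_for_variogram:
        return x, y, z
    rng = np.random.default_rng(seed)
    idx = rng.choice(n, size=max_n_for_variogram, replace=False)
    return x[idx], y[idx], z[idx]


rng = np.random.default_rng(42)
x, y, z = rng.random(100_000), rng.random(100_000), rng.random(100_000)

xs, ys, zs = subsample_for_variogram(x, y, z, max_n_for_variogram=2_000, seed=0)

# Number of point pairs the variogram estimation has to process.
n_pairs_full = len(z) * (len(z) - 1) // 2
n_pairs_sub = len(zs) * (len(zs) - 1) // 2
print(n_pairs_full, n_pairs_sub)
```

The pair count grows quadratically with the number of points, so the subsample reduces the work by several orders of magnitude, at the cost of a noisier experimental variogram.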
I have large data and it doesn't work well with Arcgis, what about the code?
The policy on whether or not to automatically calculate the statistics (set in places with the enable_statistics boolean kwarg) is inconsistent across the different kriging classes and isn't actually even re-done in the update_variogram_model method in each class. (I think this is just my mistake from way back when I was putting this all together in the first place, and probably also a problem of having similar functionality copied over so many classes.) I'm thinking about just making the statistics calculations automatic/default (i.e., getting rid of the enable_statistics flag that appears inconsistently across classes). With the fixes in PRs #47 and #51, the statistics calculations should go smoothly now. Then the user can access them with the existing infrastructure, and if verbose (or someday logging) is enabled then they'll be spit out there. @rth, @basaks, @whdc, thoughts? Along with this change, I'm also thinking of adding a few other outputs to the statistics, like @whdc suggested, and also some more useful calculations from the Kitanidis text...
I have extended the sklearn_cv.Krige class to do regression kriging.
https://en.wikipedia.org/wiki/Regression-Kriging
Would that be of interest?
It would be great to add a documentation page for PyKrige with sphinx hosted, for instance at readthedocs.org.
The code itself is fairly well documented, so it's just a matter of setting Sphinx up and checking that the docstrings are correctly parsed. We could use, for instance, numpy-style docstrings through sphinxcontrib-napoleon (some adaptation work might be necessary).
P.S.: @bsmurphy could you please grant me collaborator rights on this repo, so I can assign labels on issues etc.? Thanks.
nose is not being developed anymore and we should probably migrate to pytest for unit tests (as most packages in the Python ecosystem have done or are in the process of doing).
This would greatly simplify writing unit tests, e.g. instead of
self.assertTrue(np.allclose(res, np.array([0.98, 1.05]), 0.01, 0.01))
one could simply write
assert np.allclose(res, np.array([0.98, 1.05]), 0.01, 0.01)
and pytest would handle everything. Actually in this case,
from numpy.testing import assert_allclose
assert_allclose(res, np.array([0.98, 1.05]), 0.01, 0.01)
would be even better.
Note: numpy.testing might still require nose to be installed (particularly for older numpy versions).
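For illustration, a pytest-style test using assert_allclose with keyword tolerances might look like the sketch below. The arrays are made-up stand-ins for a kriging result, and the test name is arbitrary.

```python
import numpy as np
from numpy.testing import assert_allclose


def test_prediction_close_to_reference():
    res = np.array([0.981, 1.049])  # stand-in for a kriging result
    # Keyword arguments make the relative/absolute tolerances explicit.
    assert_allclose(res, np.array([0.98, 1.05]), rtol=0.01, atol=0.01)


if __name__ == "__main__":
    # pytest would collect the function automatically; calling it
    # directly works as well.
    test_prediction_close_to_reference()
    print("ok")
```

Spelling out rtol and atol avoids relying on positional order, and on failure assert_allclose reports which elements differ instead of a bare "False is not true".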
I want to display an experimental variogram even before deciding which kriging method I want to use. Currently what I can do is use pykrige.ok.OrdinaryKriging.display_variogram_model or pykrige.uk.UniversalKriging.display_variogram_model. Two issues come to mind:
My idea is to extract a visualizer as a class of its own that does the plotting. That visualizer would be used by any call of display_variogram_model and would keep the matplotlib code in one place. If somebody does not have matplotlib installed, this only needs to be handled in the visualizer class. Further, it would give the freedom to plot only the experimental variogram as well.
This is using python3.6 instead of the desired python3.4
0.34s$ wget http://repo.continuum.io/miniconda/Miniconda${TRAVIS_PYTHON_VERSION:0:1}-latest-Linux-x86_64.sh -O miniconda.sh
--2017-02-18 20:34:55-- http://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh
Resolving repo.continuum.io (repo.continuum.io)... 104.16.19.10, 104.16.18.10, 2400:cb00:2048:1::6810:120a, ...
Connecting to repo.continuum.io (repo.continuum.io)|104.16.19.10|:80... connected.
HTTP request sent, awaiting response... 301 Moved Permanently
Location: https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh [following]
--2017-02-18 20:34:55-- https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh
Connecting to repo.continuum.io (repo.continuum.io)|104.16.19.10|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 34663461 (33M) [application/x-sh]
Saving to: 'miniconda.sh'
100%[======================================>] 34,663,461 140M/s in 0.2s
2017-02-18 20:34:55 (140 MB/s) - 'miniconda.sh' saved [34663461/34663461]
before_install.2
0.00s$ chmod +x miniconda.sh
before_install.3
5.93s$ ./miniconda.sh -b
PREFIX=/home/travis/miniconda3
installing: python-3.6.0-0 ...
installing: cffi-1.9.1-py36_0 ...
installing: conda-env-2.6.0-0 ...
installing: cryptography-1.7.1-py36_0 ..
This probably deserves its own thread.
A technique that is useful in many situations, and already used in the kriging community, is the local variogram model. Can someone work on this instead?
This may have additional benefits in speeding up the variogram computations for very large problems (with lots of observations).
VESPER is a PC-Windows program developed by the Australian Centre for Precision Agriculture (ACPA) for spatial prediction that is capable of performing kriging with local variograms (Haas, 1990). Kriging with local variograms involves searching for the closest neighbourhood for each prediction site, estimating the variogram from the neighbourhood, fitting a variogram model to the data and predicting the value and its uncertainty. The local variogram is modelled in the program by fitting a variogram model automatically through the nonlinear least-squares method. Several variogram models are available, namely spherical, exponential, Gaussian and linear with sill. Punctual and block kriging is available as interpolation options. This program adapts itself spatially in the presence of distinct differences in local structure over the whole field.
Some context by @bsmurphy
@rth added the moving window function, which is similar to what you're suggesting @basaks except that it assumes a stationary variogram. Adding in the extra layer of re-estimating the variogram for local neighborhoods could certainly be done at some level, but it would require a calculation for each moving window location; therefore, not sure how much it would really boost speed... Anyways, both of these ideas would be nice to implement at some point: the local variogram estimation for max flexibility and the downsampling in global variogram estimation as a useful tool (could be put in the kriging tools module)...
Hi,
I tried to clone this repository and then run 'python setup.py install', but it tells me that I don't have numpy installed. I double checked, and numpy is properly installed and working on my computer.
Does anyone know what the issue is?
By the way, I do this because I want to try the regression kriging, which does not seem to be implemented in the pip version.
Thanks a lot for your help
In the long term, it would probably be good to add Python 3.x support alongside 2.7.
The corresponding test environments could then be uncommented in .travis.yml and appveyor.yml ...
$ python2 Three-Dimensional-Kriging-Example.py
Traceback (most recent call last):
File "Three-Dimensional-Kriging-Example.py", line 35, in <module>
k3d, ss3d = uk3d.execute('grid', gridx, gridy, gridz, specified_drift_arrays=[xg, yg])
File "/usr/lib/python2.7/site-packages/pykrige/uk3d.py", line 752, in execute
raise ValueError("Dimensions of drift values array do not match specified grid dimensions.")
ValueError: Dimensions of drift values array do not match specified grid dimensions.
why?
I took your example provided at https://github.com/bsmurphy/PyKrige#three-dimensional-kriging-example and tried to run it:
>>> k3d, ss3d = uk3d.execute('grid', gridx, gridy, gridz, specified_drift_arrays=[xg, yg, zg])
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "C:\Users\current-user\AppData\Roaming\Python\Python36\site-packages\pykrige\uk3d.py", line 775, in execute
raise ValueError("Inconsistent number of specified drift terms supplied.")
ValueError: Inconsistent number of specified drift terms supplied.
That does not look like what I expected...
Hi,
I would like to know if cokriging is planned ? It would be very useful...
Thanks again for this amazing project !
Alexis
Hello,
Thanks for your work in the very useful Pykrige. One feature that I haven't found is the ability to specify the variogram cutoff, i.e. the distance up to which the variogram is calculated. Currently it appears that the variogram in Pykrige is calculated across the full distance of the data, whereas the typical cutoff is 1/3 of the diagonal distance of the data (i.e. gstat's default), and being able to specify this is important for many datasets. Apologies if this already exists and I've missed it, but otherwise this would be a useful addition.
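To illustrate what a cutoff changes, here is a numpy-only sketch of a binned experimental semivariogram restricted to pairs closer than a cutoff distance. The function experimental_variogram and its arguments are hypothetical and do not reflect how PyKrige computes its variogram. The default of one third of the bounding-box diagonal follows the gstat convention mentioned above.

```python
import numpy as np


def experimental_variogram(x, y, z, nlags=6, cutoff=None):
    """Binned semivariance for point pairs closer than `cutoff`."""
    # All unique point pairs (i < j).
    i, j = np.triu_indices(len(z), k=1)
    d = np.hypot(x[i] - x[j], y[i] - y[j])
    g = 0.5 * (z[i] - z[j]) ** 2

    if cutoff is None:
        # One third of the bounding-box diagonal.
        cutoff = np.hypot(np.ptp(x), np.ptp(y)) / 3.0

    keep = d <= cutoff
    d, g = d[keep], g[keep]

    # Equal-width lag bins between 0 and the cutoff.
    edges = np.linspace(0.0, cutoff, nlags + 1)
    which = np.clip(np.digitize(d, edges) - 1, 0, nlags - 1)
    lags = np.full(nlags, np.nan)
    semivariance = np.full(nlags, np.nan)
    for k in range(nlags):
        in_bin = which == k
        if in_bin.any():
            lags[k] = d[in_bin].mean()
            semivariance[k] = g[in_bin].mean()
    return lags, semivariance


rng = np.random.default_rng(1)
x, y = rng.random(200) * 100.0, rng.random(200) * 100.0
z = np.sin(x / 20.0) + 0.1 * rng.standard_normal(200)

lags, semivariance = experimental_variogram(x, y, z, nlags=6)
print(lags)
print(semivariance)
```

Bins that receive no pairs are left as NaN, so a caller fitting a model would need to drop them first.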
I have already implemented an MPI Krige pipeline in another package, with which I can krige large GeoTIFFs.
So far I have kriged 20000x12000 pixels in 10 minutes using ordinary kriging + linear + n_esitmators_points=50 on 32 cores.
Let me know if this will be desirable, and I can put together another PR with an MPI example.
Hey there, new user here. Trying to do some of the kriging we do in SURFER using an open source library like this. Where can the user define the search parameters?
I use the latest PyKrige (1.3.0) on Fedora 23, and everything works well. But it cannot work with backend="C".
Following is the error message:
Warning: failed to load Cython extensions.
See #8
Falling back to a pure python backend...
Good day
Currently we are using PyKrige on our project. We use the gaussian variogram model because, with the linear model, all of the values in the .asc file are the same and we are not sure why. With gaussian the result is fine, but we need the linear model.
Here is the script with data:
from pykrige.ok import OrdinaryKriging
import numpy as np
import pykrige.kriging_tools as kt
import subprocess
data = [[7227227.598324121, 741736.5387751074, 10], [7227387.972113899, 742348.3581188999, 10], [7227504.409359513, 741794.7492702686, 50], [7227196.779826333, 742523.8574917192, 10], [7227173.741004699, 742722.5347341978, 0], [7226577.720167324, 742653.4784150455, 400], [7227081.667052395, 743517.2418175992, 40], [7226356.042043427, 742829.1261297092, 90], [7226554.678690424, 742852.1465022594, 50], [7226157.289099098, 742812.6361562246, 800], [7226929.018067002, 743096.8637661343, 50], [7226684.283700548, 743471.1798958695, 60], [7226883.030788386, 743494.2116793899, 60], [7225958.657863444, 742783.0844111233, 600], [7226975.104354201, 742699.5162851174, 60], [7226379.083423361, 742630.4609652269, 780], [7226998.143078854, 742500.8419647408, 10], [7226730.38157898, 743073.8398052861, 290], [7227127.765316596, 743119.8900339963, 20], [7226707.334112899, 743272.5100437128, 20], [7227326.401742509, 743142.9146601006, 20], [7227525.0381373875, 743165.9396187299, 10], [7227303.353986896, 743341.593667091, 10], [7227104.717657487, 743318.566118863, 20], [7226799.395507857, 742477.8248015478, 10], [7225935.724523432, 742981.7453153334, 780], [7225584.425343172, 742538.3925276443, 70], [7226952.062683605, 742898.1902188676, 0], [7226753.426098847, 742875.1691804474, 20], [7225736.977087256, 742958.7210444498, 70], [7226531.745059891, 743050.8161771429, 760], [7226776.467672564, 742676.4981690557, 70], [7227280.303285444, 743540.2722880202, 20], [7225490.149526373, 742349.2620251302, 60], [7225429.391564154, 741351.0645974253, 20], [7227062.957670536, 741852.6477720598, 30], [7225417.185908113, 741151.4868056511, 40], [7225441.5925421305, 741550.7439451264, 50], [7226485.647374582, 743448.1504222901, 60], [7225760.021123014, 742760.0650378156, 0], [7227021.178857623, 742302.1672575969, 10], [7227050.757397798, 741653.0451066317, 110], [7225628.983078791, 741338.9515235351, 20], [7225005.795711183, 740976.1451948371, 10], [7226877.565960731, 741659.1935542912, 80], 
[7225392.878499182, 740752.2342010476, 60], [7225889.449159239, 742324.9649234914, 20], [7225641.186254723, 741538.5328277033, 30], [7226838.962920578, 741465.558159485, 20], [7224996.859375702, 740776.6322658678, 40], [7225205.38657999, 740964.0277736945, 20], [7227038.554236196, 741453.4430858254, 60], [7225229.79817595, 741363.2778863846, 40], [7225453.681624319, 741750.3210617228, 60], [7226826.757280052, 741265.9596496419, 0], [7225193.28812708, 740764.3551863618, 470], [7225241.889738288, 741562.8514915913, 30], [7223792.999598428, 742133.7748068094, 40], [7223371.917262611, 741755.8843535043, 10], [7226905.970504334, 743295.5369270579, 30], [7225783.062211899, 742561.4086442813, 50], [7227548.083043832, 742967.257303668, 20], [7225405.089937136, 740951.8107056284, 0], [7225689.855679398, 742337.0643507005, 20], [7225616.889589428, 741139.2719109184, 30], [7224624.179692563, 742689.773329013, 10], [7223560.952906568, 741545.4710069869, 10], [7223404.271078165, 742355.0653203392, 10], [7223782.140346996, 741934.040743739, 20], [7224402.900267473, 742301.1648633606, 10], [7224203.108117493, 742311.945089181, 10], [7223184.121154054, 741765.6820959963, 10], [7223161.4835907, 741566.9496566014, 10], [7223814.487816765, 742533.2408105701, 20], [7223803.74515754, 742333.5075065546, 20], [7224003.426678358, 742322.7266269, 30], [7224370.538778005, 741702.0415918522, 20], [7223593.20759579, 742144.5559392121, 10], [7224014.1697301455, 742522.4628108565, 20], [7223415.123744923, 742554.7948350566, 20], [7223393.52630399, 742155.3383827255, 10], [7223382.778629602, 741955.6120492729, 20], [7223992.68072684, 742122.9910471919, 20], [7223361.161998633, 741556.2601255218, 10], [7224834.600497055, 742878.7444872069, 20], [7224213.962354492, 742511.6861246709, 20], [7223760.6329405755, 741534.6792663855, 10], [7223971.067542169, 741723.6206349385, 30], [7223771.387197874, 741734.4101528268, 20], [7223425.8645168785, 742754.4220858769, 10], [7225877.258927372, 
742125.2772584416, 10], [7226500.533824252, 742488.058595678, 70], [7228167.217470973, 742078.6910039685, 30], [7227620.821597124, 741241.2340276041, 20], [7227890.517958436, 742020.4753557814, 250], [7225840.814586731, 741526.4044635892, 10], [7223441.704057103, 742995.6797662751, 0], [7224258.535894949, 743116.5761307424, 0], [7223582.459529135, 741944.8267253814, 30], [7224589.876254661, 742293.0382504247, 10], [7224181.612530067, 741912.4687743032, 30], [7224170.858597059, 741712.8324249025, 10], [7224392.042738639, 742101.4215543908, 20], [7223604.063555035, 742344.2877281049, 20], [7223571.708562405, 741745.098115536, 0], [7224823.861671604, 742678.9955736481, 10], [7224381.293102894, 741901.6808193953, 10], [7223614.805821692, 742544.0181519834, 20], [7224613.327568116, 742490.0259277604, 0], [7223625.5469861785, 742743.648281642, 10], [7223981.931875573, 741923.2560715778, 40], [7224192.361773577, 742112.2066297057, 20], [7227336.89690737, 741904.0453019128, 0], [7227446.193639806, 742071.5546095923, 20], [7227395.109361633, 741627.2418545977, 10], [7226508.697690444, 743249.4834928864, 50], [7225852.978788622, 741726.0067433427, 20], [7225653.384751505, 741738.2156896066, 20], [7225465.87681756, 741950.0016979582, 0], [7225677.664066257, 742137.4804681872, 10], [7226076.962701651, 742113.1762338938, 0], [7225665.471353862, 741937.7963174966, 10], [7225478.070910113, 742149.5820675708, 20], [7225865.174804527, 741925.6931193519, 10], [7225266.283984977, 741962.1054698917, 120], [7225217.5947164595, 741163.6020529796, 50], [7225066.578481888, 741974.3074886939, 60], [7225254.0874126945, 741762.5286132753, 70], [7226488.343454836, 742288.4625977299, 40], [7226288.750236297, 742300.5641146363, 70], [7226101.233891608, 742512.3541465371, 20], [7226089.044345303, 742312.7638840869, 30], [7226300.82760811, 742500.2561939631, 280], [7225278.365082124, 742161.7819139991, 80], [7225078.76995909, 742173.9830298338, 30], [7227781.220247371, 741852.9614902339, 40], 
[7227664.781552965, 742406.5815673766, 0], [7227613.7075066045, 741962.2594666366, 0], [7227555.488521611, 742239.0666980959, 0], [7227832.29824914, 742297.2882012674, 0], [7227723.003803087, 742129.7724436527, 0], [7227999.703027056, 742187.9900356561, 70], [7227569.710108707, 740796.9214540131, 50], [7227453.316009933, 741350.4365766125, 30], [7227671.920685718, 741685.4504051579, 30], [7227730.128059231, 741408.6395133291, 50], [7228006.938174991, 741466.9451057224, 0], [7228225.432099374, 741801.8707178305, 280], [7227839.4308860935, 741576.1487067917, 50], [7227679.019901991, 740964.4231965214, 40], [7228116.131559499, 741634.355076432, 130], [7227948.621070396, 741743.6587180682, 40], [7228057.920195877, 741911.1734708956, 360], [7227562.619273299, 741517.9421006007, 110], [7227511.515069407, 741073.7303940638, 40], [7227897.633934952, 741299.4350224253, 160], [7227788.3278440125, 741131.9277193418, 30], [7224343.020621691, 740161.8996701327, 10], [7224425.608069318, 740361.7923243452, 20], [7224283.690284246, 740396.6345250956, 10], [7223463.037230799, 743194.7162524968, 20], [7223661.911908321, 743175.1816026324, 20], [7223860.786574427, 743155.6467022882, 50], [7224059.661240388, 743136.1115416633, 40], [7224215.869797494, 742718.5037778367, 0], [7224414.744507348, 742698.9682300765, 20], [7224634.95220642, 742878.4684002735, 30], [7224436.0775191095, 742898.004259161, 0], [7224237.202842622, 742917.5398685983, 0], [7224457.410549199, 743097.0404698143, 10], [7224656.285192069, 743077.5045587744, 10], [7224279.868954374, 743315.6125746248, 60], [7224478.74357572, 743296.0768518158, 40], [7224500.076620733, 743495.1134159163, 0], [7224301.202032242, 743514.6491906192, 30], [7224322.535106477, 743713.6859785933, 0], [7226133.83447627, 743004.8117909471, 500], [7226332.471057562, 743027.8347541334, 430], [7225824.052747822, 741291.0459670975, 0], [7226592.771177746, 741050.3176416397, 10], [7226393.178775877, 741062.5383295384, 0], [7226006.091973691, 
741286.460748286, 20], [7225993.88470266, 741086.8747398502, 10], [7226604.977896271, 741250.013172498, 10], [7226193.477278963, 741074.6554450922, 0], [7226405.276076367, 741262.1281139745, 10], [7226205.683175875, 741274.345237537, 10], [7225781.097517002, 740682.4678015739, 0], [7225980.6893187035, 740670.2431864912, 0], [7226180.390043719, 740658.1207460476, 0], [7226379.981671028, 740645.8947385595, 0], [7226392.19402713, 740845.586234625, 10], [7226192.492979417, 740857.7065003845, 10], [7225992.90085382, 740869.9289436647, 20], [7225793.310423685, 740882.049776317, 10], [7225611.271718425, 740886.639979107, 10], [7226309.423783402, 743226.4991375986, 10], [7226110.787298828, 743203.4732513556, 40], [7225811.845756525, 741091.4625891433, 10], [7225599.059091186, 740687.0606350523, 0]]
data = np.array(data)
grid_step = 10
gridx = np.arange(740079.981183, 743963.151806, grid_step)
gridy = np.arange(7223164.31876, 7228331.36228, grid_step)
# Create the ordinary kriging object. Required inputs are the X-coordinates of
# the data points, the Y-coordinates of the data points, and the Z-values of the
# data points. If no variogram model is specified, defaults to a linear variogram
# model. If no variogram model parameters are specified, then the code automatically
# calculates the parameters by fitting the variogram model to the binned
# experimental semivariogram. The verbose kwarg controls code talk-back, and
# the enable_plotting kwarg controls the display of the semivariogram.
OK = OrdinaryKriging(data[:, 0], data[:, 1], data[:, 2], variogram_model='linear')
# Creates the kriged grid and the variance grid. Allows for kriging on a rectangular
# grid of points, on a masked rectangular grid of points, or with arbitrary points.
# (See OrdinaryKriging.__doc__ for more information.)
z, ss = OK.execute('grid', gridx, gridy)
# Writes the kriged grid to an ASCII grid file.
kt.write_asc_grid(gridx, gridy, z, filename='krige-test.asc')
Is there any setting that we are missing?
Thank you
Hello,
thank you so much for working on this and making it available on GitHub!
I would like to use this module with a moderately large grid, of the order of nx=1000, ny=1000, and with n=100 data points. Given that UniversalKriging solves the linear system A X = b, where A is, I believe, an array of shape (nx, ny, n, n), this currently requires roughly nx*ny*n**2*8/1e9 = 80 GB of RAM simply to define that matrix in 64-bit floats, and I subsequently run out of memory on my laptop.
I was wondering if it would be possible to optimise this code so that it is more memory efficient? For instance, one option would be to port the execute method in uk.py to Cython, so that it uses fewer temporary numpy arrays for indexing, etc., which would save some memory. However, at the end of the day, it's still necessary to define that same A matrix (which is not sparse as far as I can tell) for the linear system. I suppose it's not possible to do a domain decomposition on the grid (decompose it into regions and do the calculation region by region), right? Are there any other things that could be attempted? I can do some optimisation; I would just need some advice about where to start, since I don't know much about kriging.
Thanks,
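The memory estimate above, and the idea of working through the grid region by region, can be sketched as follows. This is a minimal illustration, not PyKrige code: dense_system_gigabytes and predict_in_chunks are hypothetical helpers, and predict stands for any callable that maps arrays of x and y coordinates to predicted values (for instance a thin wrapper around a point-style kriging call).

```python
import numpy as np


def dense_system_gigabytes(nx, ny, n, itemsize=8):
    """Size in GB of a dense (nx, ny, n, n) array with the given item size."""
    return nx * ny * n ** 2 * itemsize / 1e9


def predict_in_chunks(predict, xpoints, ypoints, chunk_size=10000):
    """Evaluate predict(x, y) on successive slices of the prediction points.

    Only one slice is passed to `predict` at a time, so any temporary arrays
    it allocates scale with chunk_size rather than with the full grid.
    """
    xpoints = np.asarray(xpoints, dtype=float)
    ypoints = np.asarray(ypoints, dtype=float)
    out = np.empty(xpoints.shape[0])
    for start in range(0, xpoints.shape[0], chunk_size):
        stop = start + chunk_size
        out[start:stop] = predict(xpoints[start:stop], ypoints[start:stop])
    return out


if __name__ == "__main__":
    print(dense_system_gigabytes(1000, 1000, 100), "GB for the full grid")
    # Flatten a small grid into point lists and process it slice by slice.
    gx, gy = np.meshgrid(np.arange(50.0), np.arange(40.0))
    z = predict_in_chunks(lambda x, y: x + y, gx.ravel(), gy.ravel(), chunk_size=300)
    print(z.reshape(gx.shape).shape)
```

Whether chunking alone is enough depends on how the underlying solver builds its matrices, so this only bounds the memory used per call.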
Travis CI occasionally fails because pykrige.test.TestRegressionKrige takes more than 10 min to run (the maximum allowed time). This seems to happen randomly; on other occasions it completes within seconds: see e.g. #61, where one Travis CI job passed and the other failed.
I think the problem is that the test sometimes fails to download the test dataset with fetch_california_housing in https://github.com/bsmurphy/PyKrige/blob/master/pykrige/test.py#L1726 . In any case, it might be a good idea to replace this with a built-in dataset that does not need to be downloaded.
cc @basaks
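One offline alternative is to generate a small synthetic dataset inside the test itself. The sketch below assumes the test only needs predictor variables, 2D coordinates and a target; make_synthetic_regression_data is a hypothetical helper, and the value ranges are arbitrary.

```python
import numpy as np


def make_synthetic_regression_data(n_samples=200, n_features=3, seed=0):
    """Return (p, x, target) arrays generated locally with a fixed seed.

    p      : predictor variables, shape (n_samples, n_features)
    x      : 2D coordinates, shape (n_samples, 2)
    target : linear trend in p plus a smooth spatial term and small noise
    """
    rng = np.random.RandomState(seed)
    p = rng.uniform(size=(n_samples, n_features))
    x = rng.uniform(0.0, 10.0, size=(n_samples, 2))
    coef = np.arange(1, n_features + 1, dtype=float)
    target = p.dot(coef) + np.sin(x[:, 0]) + 0.1 * rng.normal(size=n_samples)
    return p, x, target


if __name__ == "__main__":
    p, x, target = make_synthetic_regression_data()
    print(p.shape, x.shape, target.shape)
```

Because the seed is fixed, the data are identical on every run, which keeps the test deterministic and independent of network access.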
Hi there, first of all thanks for the project!
Somehow I struggle to install it when using Windows + Miniconda.
(C:\path\to\miniconda) C:\current\directory>pip install pykrige
Collecting pykrige
Downloading PyKrige-1.3.0.tar.gz (596kB)
100% |████████████████████████████████| 604kB 1.5MB/s
Complete output from command python setup.py egg_info:
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "C:\Users\current-user\AppData\Local\Temp\pip-build-phddipzi\pykrige\setup.py", line 14, in <module>
from Cython.Distutils import build_ext
ModuleNotFoundError: No module named 'Cython'
----------------------------------------
Command "python setup.py egg_info" failed with error code 1 in C:\Users\current-user\AppData\Local\Temp\pip-build-phddipzi\pykrige\
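The traceback shows setup.py importing Cython.Distutils before anything has been installed, so a plausible workaround is to install Cython first (for example with pip install cython) and then retry. A minimal sketch of a pre-flight check follows; module_available is a hypothetical helper, not part of PyKrige.

```python
import importlib.util


def module_available(name):
    """Return True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(name) is not None


if not module_available("Cython"):
    print("Cython not found: install it first, e.g. `pip install cython`, then retry.")
```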
There is a proposal, and I might be working on this. There is a lot of material out there on GWR using R, but I could find only one such package in Python. Let me know if you come across other GWR Python packages. Depending on how pygwr fares, I might look at an R/Python hybrid, or simply an R-based solution.
I tried a benchmark run against the gstat package in R and found differences between the two outputs. Have you ever done similar experiments?
library(gstat)
a <- read.table("http://www.stat.ucla.edu/~nchristo/statistics_c173_c273/kriging_11.txt", header=TRUE)
x.range <- as.integer(range(a[,1]))
y.range <- as.integer(range(a[,2]))
grd <- expand.grid(x=seq(from=x.range[1], to=x.range[2], by=1),
y=seq(from=y.range[1], to=y.range[2], by=1))
m <- vgm(10, "Exp", 3.33, 0)
q1 <- krige(id="z", formula=z~1, data=a, newdata=data.frame(x=64, y=128), model = m,
locations=~x+y)
from pykrige.ok import OrdinaryKriging
import numpy as np
import pykrige.kriging_tools as kt
data = np.loadtxt('./kriging_11.txt', skiprows=1)
# gstat signature: vgm(psill, model, range, nugget)
# m <- vgm(10, "Exp", 3.33, 0)
# exponential model parameters here: [sill, range, nugget]
OK = OrdinaryKriging(data[:, 0], data[:, 1], data[:, 2], variogram_model='exponential',
                     variogram_parameters=[10, 3.33, 0],
                     verbose=False, enable_plotting=False)
z, ss = OK.execute('points', np.array([64]), np.array([128]))
print(z)
For the prediction at (64, 128), I got 338.9828 from gstat and 363.59867058131033 from pykrige.
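One thing worth checking in a comparison like this is whether both packages define the range parameter of the exponential model in the same way. The sketch below rests on two assumptions that should be verified against each library's documentation: that PyKrige scales distance by range/3 (an "effective range" convention), and that gstat's "Exp" model uses exp(-h/range) directly. Both function names are hypothetical.

```python
import numpy as np


def exp_variogram_effective_range(d, psill, range_, nugget):
    """Exponential model with distance scaled by range/3 (assumed PyKrige form)."""
    return psill * (1.0 - np.exp(-d / (range_ / 3.0))) + nugget


def exp_variogram_plain_range(d, psill, range_, nugget):
    """Exponential model with distance scaled by range (assumed gstat form)."""
    return psill * (1.0 - np.exp(-d / range_)) + nugget


if __name__ == "__main__":
    d = np.linspace(0.0, 10.0, 5)
    # Same nominal parameters, different conventions: the curves differ.
    print(exp_variogram_effective_range(d, 10.0, 3.33, 0.0))
    print(exp_variogram_plain_range(d, 10.0, 3.33, 0.0))
    # Tripling the range in the first form reproduces the second.
    print(exp_variogram_effective_range(d, 10.0, 3 * 3.33, 0.0))
```

If the assumptions hold, passing a range of about 9.99 on the PyKrige side would correspond to gstat's 3.33; whether that accounts for the whole discrepancy would still need to be tested.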
It would be nice to update CHANGELOG.md with the latest changes, and also to reverse it, so that the latest changes are at the top (not the bottom).
It might also be a good moment to discuss how best to acknowledge different contributors. For instance, for new features or major updates, we could add the name of the contributor (with their agreement) and the PR number, as is done in the scikit-learn changelog.
Currently matplotlib is a hard dependency, imported as import matplotlib.pyplot as plt at the top of most source files. It might be better to make it an optional dependency, so that kriging would still work when matplotlib is not installed and plotting would fail with an explicit error message asking the user to install matplotlib.
This is particularly useful when computing kriging on a remote server, where plotting might not be appropriate anyway.
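A minimal sketch of the optional-dependency pattern described above: the import is deferred until plotting is actually requested, and a missing package produces an explicit message. import_optional is a hypothetical helper, not an existing PyKrige function.

```python
import importlib


def import_optional(module_name, feature):
    """Import a module on demand, raising a clear error if it is missing."""
    try:
        return importlib.import_module(module_name)
    except ImportError:
        package = module_name.split(".")[0]
        raise ImportError(
            "{0} is required for {1}; install it with `pip install {0}`".format(
                package, feature
            )
        )


def display_variogram_plot():
    """Example caller: matplotlib is only needed when this function runs."""
    plt = import_optional("matplotlib.pyplot", "plotting")
    fig = plt.figure()
    return fig
```

With this layout, importing the kriging module itself never touches matplotlib, so non-plotting code paths keep working on machines where it is absent.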
Hello, I would really like to know how to process data given in longitude and latitude. Do I have to convert it to a projected coordinate system? Thank you.
Could you implement this? How much work would it take?
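The question matters because a Euclidean distance computed directly on degrees is not a physical distance: one degree of longitude covers less ground at higher latitudes. The sketch below uses the haversine formula on a spherical Earth with an assumed mean radius of 6371 km; great_circle_km is a hypothetical helper and says nothing about what PyKrige does internally.

```python
import numpy as np

EARTH_RADIUS_KM = 6371.0  # assumed mean radius of a spherical Earth


def great_circle_km(lon1, lat1, lon2, lat2):
    """Haversine great-circle distance in km between points given in degrees."""
    lon1, lat1, lon2, lat2 = map(np.radians, (lon1, lat1, lon2, lat2))
    a = (
        np.sin((lat2 - lat1) / 2.0) ** 2
        + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2.0) ** 2
    )
    return 2.0 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(a))


if __name__ == "__main__":
    # The same 1-degree step in longitude at two different latitudes.
    print(great_circle_km(0.0, 0.0, 1.0, 0.0))
    print(great_circle_km(0.0, 60.0, 1.0, 60.0))
```

For data covering a small area, projecting to a local metric coordinate system before kriging is a common way to sidestep the issue.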
Hi again,
I just noticed that the function '_get_kriging_matrix' is very slow (about 5-10 seconds) when the number of training points is around 10,000 to 15,000.
Do you know if a speed-up would be possible? For instance, since I specified 'n_closest_points' = 100 in my model, I was thinking that computing the kriging matrix over only the 100 nearest points would be sufficient?
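The idea of restricting the system to the nearest points can be sketched as a neighbour-selection step performed before any matrix is built. This is a brute-force numpy illustration under that assumption; nearest_indices is a hypothetical helper, and for many prediction points a spatial index such as a k-d tree would be the usual choice.

```python
import numpy as np


def nearest_indices(points, target, k):
    """Indices of the k points closest to target, ordered by distance.

    points : array of shape (n, ndim)
    target : array of shape (ndim,)
    """
    points = np.asarray(points, dtype=float)
    d2 = np.sum((points - np.asarray(target, dtype=float)) ** 2, axis=1)
    k = min(k, points.shape[0])
    idx = np.argpartition(d2, k - 1)[:k]  # k smallest, in arbitrary order
    return idx[np.argsort(d2[idx])]       # sort only those k by distance


if __name__ == "__main__":
    rng = np.random.RandomState(0)
    pts = rng.uniform(size=(15000, 2))
    idx = nearest_indices(pts, [0.5, 0.5], 100)
    # A local system built from these points is 100 x 100 rather than
    # 15000 x 15000 (plus any constraint rows).
    print(idx.shape)
```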
Greetings,
I am trying to run the regression kriging example found on the git page, here:
https://github.com/bsmurphy/PyKrige/blob/master/examples/regression_kriging2d.py
When I run the code in Spyder I get the error "No module named 'pykrige.rk'"
I am using a pip-installed PyKrige module, version 1.3.2.
Looking in the site-packages folder where PyKrige is installed, I don't see a file named rk.py like the one in the master version that I downloaded from GitHub. Should I be using a different example for this older version of PyKrige?
If not, and I know this is an off-topic question, how can I install the new version on a Windows machine?
Thanks in advance,
Austin.
Hi
I'm doing 1D ordinary kriging and instead of the predictions, I need the mean of the kriged process. Is there any method that returns that or the estimated weights of the kriged process?
Thanks a lot!
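For reference, the weights of a 1D ordinary kriging predictor can be obtained by solving the standard ordinary kriging system directly. The sketch below is a standalone numpy illustration, not a PyKrige method; ok_weights_1d is a hypothetical helper, and the variogram is supplied by the caller as a function of lag distance.

```python
import numpy as np


def ok_weights_1d(x, x0, variogram):
    """Solve the ordinary kriging system for one prediction location x0.

    Returns (weights, lagrange_multiplier). The weights sum to 1 because of
    the unbiasedness constraint in the last row of the system.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    a = np.zeros((n + 1, n + 1))
    a[:n, :n] = variogram(np.abs(x[:, None] - x[None, :]))  # data-to-data lags
    a[:n, n] = 1.0
    a[n, :n] = 1.0
    b = np.ones(n + 1)
    b[:n] = variogram(np.abs(x - x0))  # data-to-target lags
    solution = np.linalg.solve(a, b)
    return solution[:n], solution[n]


if __name__ == "__main__":
    x = np.array([0.0, 1.0, 3.0])
    z = np.array([2.0, 4.0, 1.0])
    weights, mu = ok_weights_1d(x, 2.0, lambda h: h)  # linear variogram
    print(weights, weights.dot(z))
```

The prediction is the weighted sum of the observations, so the weights can be inspected or reused separately from the predicted value.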