simogeo / geostats
A tiny and standalone javascript library for classification and basic statistics
Home Page: https://www.intermezzo-coop.eu/mapping/geostats/
Just have a glance on this
For example:
If my data is 80,70,1 and I request Jenks breaks with 5 classes - it will throw:
Cannot read property '2' of undefined
at geostats.getClassJenks (geostats.js:1079)
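A possible caller-side guard, sketched under the assumption that the failure comes from requesting more classes than there are distinct values. `capClassCount` is a hypothetical helper, not part of geostats:

```javascript
// Hypothetical helper: cap the requested class count at the number of
// distinct values, so a classifier is never asked for more classes
// than the data can fill.
function capClassCount(values, nbClass) {
  const distinct = new Set(values).size;
  return Math.max(1, Math.min(nbClass, distinct));
}

// With the data from the report, 5 requested classes are reduced to 3.
const cappedClassCount = capClassCount([80, 70, 1], 5);
```

Whether a class count equal to the number of values is itself safe is not confirmed here, so a stricter cap may be needed.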
in legend, when using 'distinct' mode, display values with wanted precision even if bound is Int :
here is the code portion to adapt. See skaindl@07e1ec6#diff-b8f0546235651ebff498b7ebc1cbec4dL1086
if(mode == 'distinct' && i != 0) {
if(isInt(start_value)) {
start_value = parseInt(start_value) + 1;
} else {
start_value = parseFloat(start_value) + (1 / Math.pow(10,this.precision));
// strangely the formula above return sometimes long decimal values,
// the following instruction fix it
start_value = parseFloat(start_value).toFixed(this.precision);
}
start_value = parseFloat(start_value) + (1 / Math.pow(10,this.precision));
start_value = parseFloat(start_value).toFixed(this.precision);
}
I intend to use geostats
in a TypeScript file, but I got this error message: Could not find a declaration file for module 'geostats'...
when importing it.
This is the full message showing on my Webstorm:
TS7016: Could not find a declaration file for module 'geostats'. '.../node_modules/geostats/lib/geostats.min.js' implicitly has an 'any' type. Try
npm i --save-dev @types/geostats
if it exists or add a new declaration (.d.ts) file containing declare module 'geostats';
I've tried to run npm i --save-dev @types/geostats
, but @types/geostats
doesn't exist. Could anyone help me on this issue?
Changing the sample page to
Setting the serie7 to getEqInterval(5); around line 89 and displaying legend in discontinuous mode like so :
var content_legend3 = serie7.getHtmlLegend(null, 'Legend (real values, breaking continuous serie)', true, null, 'discontinuous');
will cause an error at line 825:
TypeError: this.inner_ranges is null
var tmp = this.inner_ranges[i].split(this.separator);
This isn't necessarily an issue with geostats, but I wanted to make you aware that the _ function in geostats conflicts with Underscore.js; a popular javascript library (http://documentcloud.github.com/underscore/). Underscore.js uses the _ function for accessing every method in its library. For my project, to fix this, I renamed the _ function in geostat to returnStr.
Otherwise, nice job with geostats. Thanks!
Hello!
In our project we have situations where geostats is used for data arrays with more than 30k items, and in such cases getClassQuantile gives wrong results. The other methods we use, getClassJenks and getClassEqInterval, work as expected.
In this example (GitHub doesn't allow uploading html files, so I uploaded a txt) quantile_bug.txt getClassQuantile returns [5, 5, 5, 20332]
See following fork commit ... and do something clean :
I have a set of data that when classified using Jenks creates a single element class for the highest data point, but the class is not used. The upper and lower bounds for the class are both set to the same value (67.5 in this case), but the data point is placed in the next class down which also has 67.5 set as its upper bound.
The way the code is setup, it seems that any value that falls directly on a boundary will be assigned to the class for which it is the upper bound, this will produce empty classes anytime a class contains only a single element.
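The behaviour described above can be illustrated with a minimal sketch. This is not the geostats source; `classIndexUpperInclusive` is a hypothetical lookup that assumes each class includes its upper bound, and the bounds other than 67.5 are made up for illustration:

```javascript
// Hypothetical lookup: a value belongs to the first class whose upper
// bound is greater than or equal to it.
function classIndexUpperInclusive(value, bounds) {
  for (let i = 0; i < bounds.length - 1; i++) {
    if (value <= bounds[i + 1]) return i;
  }
  return -1; // value is above the last bound
}

// Illustrative bounds: the last class is [67.5, 67.5].
const exampleBounds = [10, 40, 67.5, 67.5];

// 67.5 is claimed by class 1 (upper bound 67.5), so class 2 stays empty.
const assignedClass = classIndexUpperInclusive(67.5, exampleBounds);
```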
When the serie is updated with setSerie, the statistic fields (this.stat_*) are not reset, which causes other functions like getClassQuantile to break, since this.sorted() assumes that this.stat_sorted is still up to date.
This can be fixed very simply by adding a function like
this.resetStatistics = function() {
this.stat_sorted = null;
this.stat_mean = null;
this.stat_median = null;
this.stat_sum = null;
this.stat_max = null;
this.stat_min = null;
this.stat_pop = null;
this.stat_variance = null;
this.stat_stddev = null;
this.stat_cov = null;
}
that would be called after setting the new serie in setSerie.
i.e.
this.setSerie = function(a) {
this.log('Setting serie (' + a.length + ') : ' + a.join());
this.serie = Array() // init empty array to prevent bug when calling classification after another with less items (sample getQuantile(6) and getQuantile(4))
this.serie = a;
//reset statistics after changing serie
this.resetStatistics();
this.setPrecision();
}
Description:
I have noticed that the geostats
npm package size is surprisingly large, at about 63 MB. Upon investigating, it seems this is due to a directory called OpenLayers
that is not actually present in the GitHub repository. This unaccounted space usage could potentially create storage and deployment inefficiencies for users of the package.
Steps to Reproduce:
Install the geostats
package via npm with npm install geostats
Check the disk usage of the installed package with du -sh node_modules/geostats
(on Unix-like systems) or equivalent commands on other systems
Navigate into the installed geostats
directory in node_modules
and observe the OpenLayers
directory.
Expected Behavior:
The npm package size should be more or less similar to the repository size, considering only the necessary files for the library to function.
Actual Behavior:
The npm package size is much larger than the GitHub repository due to an extraneous OpenLayers
directory.
ability to exclude specific values from min / max methods
How is this project licensed? I'm interested in making use of this project but cannot do so unless it's under some kind of license like MIT or Apache.
This function doesn't work as expected in a case where size of the population is divisible by number of classes.
this.getQuantiles = function(nbClass) {
var tmp = this.sorted();
var quantiles = [];
var step = this.pop() / nbClass;
for (var i = 1; i < nbClass; i++) {
var qidx = Math.round(i*step+.5);
quantiles.push(tmp[qidx-1]); // zero-based
}
return quantiles;
};
For example if the population size is 10 and nbClass=5 --> step=2 and we get qidx: 3,5,7,9 instead of 2,4,6,8.
The problem is the round function, which in this case rounds up (for example 2.5 --> 3). A quick solution would be to add 0.49 instead of 0.5.
We had a case with a data set of 320 members. It was divided into 5 classes and the result was groups of 65, 64, 64, 64 and 63 members.
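As an alternative to the 0.49 offset, here is a standalone sketch that takes the one-based index as ceil(i * n / nbClass). This is a different technique from the one proposed above, and `quantileBreaks` is a hypothetical function, not the library's getQuantiles:

```javascript
// Sketch: quantile break values from an already sorted array.
// The index is computed from integers (i * n) before dividing, so an
// exactly divisible population does not pick up rounding noise.
function quantileBreaks(sortedValues, nbClass) {
  const n = sortedValues.length;
  const breaks = [];
  for (let i = 1; i < nbClass; i++) {
    const qidx = Math.ceil((i * n) / nbClass); // one-based position
    breaks.push(sortedValues[qidx - 1]); // zero-based access
  }
  return breaks;
}

// Population of 10 split into 5 classes: positions 2, 4, 6, 8.
const tenValueBreaks = quantileBreaks([1, 2, 3, 4, 5, 6, 7, 8, 9, 10], 5);
```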
implement geometric progression classification
When setting the bounds on the std deviation classes, sometimes it gets the wrong intervals as it doesn't check if the minimum value or maximum is already inside the range of the intervals set.
For instance, if the stddev is 10 and the mean is 5, but the minimum value on the data is 1, the intervals returned might be
[1, -5, 5, 15, maxValue]
When match bounds is set, after setting the stddev interval bounds, the code should check if min and max values are inside the interval and shift it accordingly. Regarding the last example, the stddev should be shifted from -1 stddev, 0 stddev, 1 stddev to 0, 1 ,2 so the minimum value is actually the minimum bound of the interval. This might create a problem when min and max value are inside the intervals, which can either be solved by ignoring the match bounds or by reducing the stddev step from 1 to 0.5 or 0.25.
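A simpler alternative to the shifting described above, sketched as an assumption rather than as the library's behaviour: discard candidate bounds that fall outside the data range, then use the minimum and maximum as the outer bounds. This yields fewer classes than requested whenever a candidate is dropped. `clampInteriorBounds` is a hypothetical helper:

```javascript
// Hypothetical helper: keep only candidate bounds strictly inside
// (min, max), then close the list with min and max themselves.
function clampInteriorBounds(candidates, min, max) {
  const inside = candidates
    .filter((b) => b > min && b < max)
    .sort((a, b) => a - b);
  return [min, ...inside, max];
}

// Candidates from the example (mean 5, stddev 10) with min 1; the max
// of 30 is made up for illustration. The -5 candidate is dropped.
const clampedBounds = clampInteriorBounds([-5, 5, 15], 1, 30);
```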
By doing some tests I saw that we get different results depending on the use of getClassJenks
or getClassJenks2
(the old version). Is this normal?
I've dropped two sets of values to reproduce my example: https://gist.github.com/mthh/bda5150a1389733c9709a1e1f5b4e756
With data from file a.json :
g = new geostats(values)
g.getClassJenks(7)
// Output:
// [0.0028109620325267315, 1.2488291063345969, 2.6667816913686693, 4.124441018793732, 5.528719199355692, 6.937670034822077, 8.41798685491085, 9.997982932254672]
g.getClassJenks2(7)
// Output:
// [0.0028109620325267315, 1.2488291063345969, 2.6667816913686693, 4.124441018793732, 5.528719199355692, 6.937670034822077, 8.431752701289952, 9.997982932254672]
With data from file b.json :
g = new geostats(values)
g.getClassJenks(7)
// Output:
// [0, 0, 10435, 11643, 12476, 13336, 16073, 27125]
g.getClassJenks2(7)
// Output:
// [0, 0, 10435, 11643, 12476, 13337, 16116, 27125]
In both cases it is the penultimate value which is different.
with colors attributes;
Currently the upper class limit is the same as the lower limit of the next class, so it is unclear which class a value belongs to if it lies on the border. You can see this in the example 9 legend:
http://www.empreinte-urbaine.eu/mapping/geostats/
It would be better if the class limits would show the actual values of the data (i.e. what is the maximum value in the data set belonging to that particular class and what is the minimum value in the data set belonging to the next class). This could be an option in the getHtmlLegend()-method, if both alternatives are desired.
Thanks for a nice lib!
Hi,
When I have an array of numbers with a great precision and try to initialize a geostats
object I get Uncaught RangeError: toFixed() digits argument must be between 0 and 20
when using chromium/chrome, although it works fine in Firefox.
Is it a misuse from me ? Or something to be fixed ?
> arr = [0.00000123456789123456789, 0.00000123456789123456789, 0.123, 0.156]
Array [ 0.0000012345678912345679, 0.0000012345678912345679, 0.123, 0.156 ]
> s = new geostats(arr)
geostats.js:240 Uncaught RangeError: toFixed() digits argument must be between 0 and 20
In geostats.js
using Chromium:
(line 211) var precision = (this.serie[i] + "").split(".")[1].length;
set this.precision
to 22, which is too high to be used (line 240) in b[i] = parseFloat(a[i]).toFixed(this.precision);
(it gives the same precision, 22, in Firefox but the .toFixed()
method won't complain before a precision of 101)
Maybe by truncating the precision to 20 when necessary ? Or in any case ? (it was gonna be truncated to 22) and according to Mozilla Developer Network, the arguments of toFixed()
...
... may be a value between 0 and 20, inclusive, and implementations may optionally support a larger range of values.
I don't know whether that should be handled by geostats
or whether there is a clean way to check for it before hitting the error, but it's still possible to reduce the precision at this point:
@@ -237,7 +237,15 @@ var geostats = function(a) {
for (var i = 0; i < a.length; i++) {
// check if the given value is a number
if (isNumber(a[i])) {
- b[i] = parseFloat(a[i]).toFixed(this.precision);
+ try {
+ b[i] = parseFloat(a[i]).toFixed(this.precision);
+ } catch (e) {
+ if(this.precision > 20) {
+ this.precision = 20;
+ b[i] = parseFloat(a[i]).toFixed(this.precision);
+ } else
+ throw e;
+ }
} else {
b[i] = a[i];
}
Cheers!
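Instead of catching the RangeError, the precision could be capped when it is computed. The following is a minimal sketch, not the geostats implementation; `safePrecision` and its `maxDigits` parameter are hypothetical, and the limit of 20 is the portable range quoted above:

```javascript
// Hypothetical helper: largest number of decimals found in the values,
// capped at maxDigits so that toFixed() never receives a larger argument.
function safePrecision(values, maxDigits = 20) {
  let precision = 0;
  for (const v of values) {
    // Values printed in exponential notation (e.g. "1e-7") have no "."
    // and are counted as 0 decimals here, a known limitation of this sketch.
    const parts = String(v).split(".");
    if (parts.length > 1) {
      precision = Math.max(precision, parts[1].length);
    }
  }
  return Math.min(precision, maxDigits);
}

// The array from the report above.
const arr = [0.00000123456789123456789, 0.00000123456789123456789, 0.123, 0.156];
const formatted = arr.map((v) => v.toFixed(safePrecision(arr)));
```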
useful to automatically set bounds/ranges and generate legend
to shorten down the code I suggest the following changes :
/** return min value */
this.min = function () {
if (this._nodata())
return;
this.stat_min = Math.min.apply(null, this.serie);
//if (this.stat_min == null) {
// this.stat_min = this.serie[0];
// for (i = 0; i < this.pop() ; i++) {
// if (this.serie[i] < this.stat_min) {
// this.stat_min = this.serie[i];
// }
// }
//}
return this.stat_min;
};
/** return max value */
this.max = function () {
if (this._nodata())
return;
this.stat_max = Math.max.apply(null, this.serie);
//if (this.stat_max == null) {
// this.stat_max = this.serie[0];
// for (i = 0; i < this.pop() ; i++) {
// if (this.serie[i] > this.stat_max) {
// this.stat_max = this.serie[i];
// }
// }
//}
return this.stat_max;
};
A better implementation of : skaindl@e70f21a
Hi and thanks very much for your very useful library.
I was getting an error along the lines of 'a[i].toFixed is not a function' and I found a typo in line 240 of geostats.js.
Currently it's
b[i] = parseFloat(a[i].toFixed(this.precision));
and changing that to
b[i] = parseFloat(a[i]).toFixed(this.precision);
got rid of the error.
Again Thanks!
Hello,
I downloaded the geostats library several days ago, it was the master branch. I found some little issues in the new implementation of the 3 functions min, max and decimalFormat.
In the min and max functions, there are undeclared variables respectively called min (line 277) and max (line 293), that I believe is a bug when you changed the implementation for those 2 functions. I think they should be this.stat_min and this.stat_max instead.
In the decimalFormat function, when you use toFixed() (line 245), values in the number array (a) will be converted to string and copied to the new array (b). Therefore, there will be a subtle type transformation of the serie, and will become a greater issue when we search for the min and max values of the serie (since 2 < 12 but "2" > "12"). I suggest to add parseFloat after doing toFixed() (b[i] = parseFloat(parseFloat(a[i]).toFixed(this.precision));).
Nonetheless, your library geostats is a great help for our works concerning statistics.
Best regards,
Dac Anh Minh LE
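The type change described in this report can be shown with a short sketch. `roundToPrecision` is a hypothetical name for the suggested parseFloat(parseFloat(...).toFixed(...)) pattern, not an existing geostats function:

```javascript
// toFixed() returns a string; wrapping it in parseFloat() keeps the
// rounded value as a number.
function roundToPrecision(value, precision) {
  return parseFloat(parseFloat(value).toFixed(precision));
}

// Strings compare character by character, numbers compare by value.
const asStrings = [2, 12].map((v) => v.toFixed(1)); // strings "2.0" and "12.0"
const asNumbers = [2, 12].map((v) => roundToPrecision(v, 1)); // numbers 2 and 12

const stringOrderIsNumeric = asStrings[0] < asStrings[1]; // false: "2.0" sorts after "12.0"
const numberOrderIsNumeric = asNumbers[0] < asNumbers[1]; // true
```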
Following standard JS conventions, class names should be CamelCase: Geostats
or GeoStats
There is some unexpected issue if you, like me, tend to name instances like this after the class which they represent:
var geostats = new geostats(values);
geostats
is redefined here, so you end up with a somewhat non-obvious JS error.
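The effect is easiest to see inside a function, where `var` hoisting makes the local name shadow the constructor before it is called. In this sketch `Stats` is a hypothetical stand-in for the geostats constructor:

```javascript
// Hypothetical stand-in for a lowercase constructor such as geostats.
function Stats(values) {
  this.values = values;
}

function buildShadowed() {
  // `var Stats` is hoisted to the top of this function as undefined,
  // so `new Stats(...)` refers to the local variable, not the constructor,
  // and throws a TypeError.
  var Stats = new Stats([1, 2, 3]);
  return Stats;
}

function buildRenamed() {
  // A differently named instance leaves the constructor reachable.
  var stats = new Stats([1, 2, 3]);
  return stats;
}
```

At global scope the first call may succeed and only later calls fail, because the instance replaces the constructor after the assignment.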
bug on counter when using uniqueValues
When I perform getClassJenks
on more than 10,000 values, it takes a long time and can even cause the page to freeze
the StDeviation() method doesn't fix the min and the max value correctly.
This method can compute bounds larger than the min/max values.
Here are my corrections :
// we finally set the first value
// a[0] = this.min();
a[0] = a[1]-this.stddev();
// we finally set the last value
// a[nbClass] = this.max();
a[nbClass] = a[nbClass-1]+this.stddev();
What I'm trying to do is import this package into an Angular 8 project with TypeScript. Unfortunately I'm getting the errors "i is not defined" or "round is not defined"; neither variable is declared anywhere in the source code.
Please fix it by adding the 'var' keyword before every 'i' in loops and before the round declaration.
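A minimal reproduction of this class of error, assuming the bundler evaluates the file as strict-mode code (which is the case for ES modules). The function names are hypothetical and the loops are not taken from geostats:

```javascript
function sumUndeclared(values) {
  "use strict";
  var total = 0;
  // `i` is never declared: in strict mode this assignment throws a
  // ReferenceError instead of silently creating a global variable.
  for (i = 0; i < values.length; i++) {
    total += values[i];
  }
  return total;
}

function sumDeclared(values) {
  "use strict";
  var total = 0;
  for (var i = 0; i < values.length; i++) {
    total += values[i];
  }
  return total;
}
```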
Thank you so much for your contribution to the mapping community. I have used geostats in my dissertation work. It works well and efficiently and does exactly what is described. Thank you again for this useful tool.
I am preparing an academic manuscript where I discuss my research and would like to include a proper citation to geostats. I am fine with adding a citation for the software itself and referencing the github page. However, I would like to first ask if geostats was developed for an academic research project, and is there an accompanying manuscript I can cite instead? If not, no problem as I am still happy to cite the github site as I mentioned.
Kind regards,
Scott
How ranges can be:
[ -137.000777441011,
-74.89077744101098,
-12.78077744101099,
49.32922255898901,
111.439222558989,
173.54922255898902,
235.65922255898903,
297.76922255898904 ]
if min: 0 max: 182
?.
Geostats on NPM is version 1.1.0. require('geostats')
seems to yield an object and not work at all.
The workaround, npm install simogeo/geostats
works, but is a workaround and others might fall into this trap.
The min and max methods use this:
this.stat_min = Math.min.apply(null, this.serie);
This apparently is recursive because I got this error:
My quick and dirty workaround was to replace your min and max functions using d3 min and max functions:
var geoStatsObject = new geostats(allData);
geoStatsObject.min = function () {
if (this._nodata())
return;
this.stat_min = d3.min(this.serie);
return this.stat_min;
};
geoStatsObject.max = function () {
if (this._nodata())
return;
this.stat_max = d3.max(this.serie);
return this.stat_max;
};
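The error output is not reproduced in the report. Assuming it stems from the engine's limit on how many arguments `apply` can pass for very large arrays (an assumption, not confirmed by the report), a plain loop avoids both that limit and the d3 dependency. `arrayMinMax` is a hypothetical helper:

```javascript
// Hypothetical helper: single pass over the array, no apply() or spread,
// so the array length is not bounded by the engine's argument limit.
function arrayMinMax(values) {
  if (values.length === 0) return undefined;
  let min = values[0];
  let max = values[0];
  for (let i = 1; i < values.length; i++) {
    if (values[i] < min) min = values[i];
    if (values[i] > max) max = values[i];
  }
  return { min, max };
}
```

It could be plugged into the same override pattern as above, for example by assigning `arrayMinMax(this.serie).min` to `this.stat_min`.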
On https://www.intermezzo-coop.eu/mapping/geostats/, the samples just show up as blank for me:
I've tested in Firefox, Chrome, and IE. The shapefile to geojson sample works, though.
last class value bound wrong caused by precision of floats:
this.getEqInterval = function (nbClass) {
if (this._nodata())
return;
this.method = _t('eq. intervals') + ' (' + nbClass + ' ' + _t('classes')
+ ')';
var tmpMax = this.max();
var a = Array();
var val = this.min();
var interval = (tmpMax - this.min()) / nbClass;
for (i = 0; i <= nbClass; i++) {
a[i] = val;
val += interval;
}
//-> Fix last bound to Max of values
a[nbClass] = tmpMax;
this.bounds = a;
this.setRanges();
return a;
};
Implement getStandardDeviation classification
Geostats does not currently support numeric separators in any form, does it? I was wondering for a way to get getHtmlLegend
to output large numbers with thousands separators but I'm guessing I'd have to rewrite the function to make that happen?
(Just discovered this, which suggests this'll be a feature in V8, but was wondering in general.)
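Outside of getHtmlLegend, bound values can be formatted with the standard `Intl.NumberFormat` API. This sketch does not change how geostats builds its legend; `formatBounds` is a hypothetical helper and the "en-US" locale is an assumption:

```javascript
// Formatter with thousands separators and at most two decimals.
const boundFormatter = new Intl.NumberFormat("en-US", {
  maximumFractionDigits: 2,
});

// Hypothetical helper: turn numeric bounds into display strings.
function formatBounds(bounds) {
  return bounds.map((b) => boundFormatter.format(b));
}

const labels = formatBounds([0, 10435, 27125]);
```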
For most functions, it seems I should be able to simply call (for example), geostats.min(values)
without creating a geostats
instance.
I haven't used the library enough yet to understand when/why I'd want my own instance of geostats
instead of just calling its functions statically.
Consider the following data set [2,16,20,23,10,29], calling getJenks(5) returns [2,10,16,20,29,29]. Notice the duplicate 29, is this the correct behavior? Also what is the purpose of the extractBounds method?
When using the getClassJenks function, I ended up finding out that if I pass in a string value as the nbClass param, it will result in a very slow execution time. After looking at your source code, I noticed that it was appending a 1 to the end of the value instead of the expected addition.
For example, if I called getClassJenks("5") on a series with ~ 4000 rows, it was taking a very long time to return results. ~ 10 seconds
Here is a debugging image from chrome:
It looks like in my example, it is looping 51 times instead of 6 times.
Maybe the user should be smart enough to make sure to pass in a number type when calling this function, but I think it would be helpful to try to parse the nbClass value if it is a string, or to log an error if the user passes in a string.
I would be happy to create a pull request to fix this issue if you want. I love this library, it has helped me out quite a bit in the last year.
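The concatenation and a possible normalization step can be sketched as follows. `normalizeClassCount` is a hypothetical helper, not an existing geostats function:

```javascript
// With a string argument, + concatenates instead of adding.
const concatenated = "5" + 1; // the string "51"

// Hypothetical helper: accept a number or a numeric string, reject the rest.
function normalizeClassCount(nbClass) {
  const n =
    typeof nbClass === "string" ? Number.parseInt(nbClass, 10) : nbClass;
  if (!Number.isInteger(n) || n < 1) {
    throw new TypeError("nbClass must be a positive integer, got: " + nbClass);
  }
  return n;
}

const added = normalizeClassCount("5") + 1; // the number 6
```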
geostats could be non blocking to use it on NodeJs.
Hi,
We get a classification issue when the number of classes (selected by the user) is equal to or greater than the number of values to be classified.
I can give you examples if it is not clear.
Regards,
Ghazal
var values = [43.96, 50.92, 45.96, 45.44, 44.95, 46.64, 44.16, 45.63, 40.4, 45.16, 44.21, 46.27, 47.93, 44.77, 45.1, 45.94, 45.53, 44.79, 44.84, 45.84, 44.22, 44.55, 45.35, 45.8, 43.67, 45.17, 47.88, 45.07, 47.95, 46.26, 44.07, 47.07, 43.73, 44.33, 46.88, 44.66, 45.16, 46.05, 44.43, 43.9, 43.33, 44.8, 47.24, 44.88, 45.43, 46.89, 46.37, 44.67, 46.62, 46.43, 45.8, 47.05];
var gs = new geostats(values);
var breaks = gs.getClassJenks(5);
Results:
[40.4, "43.33", "44.95", "46.27", "47.95", 50.92]
Expected:
[40.4, 43.33, 44.95, 46.27, 47.95, 50.92]
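Until the library returns numbers consistently, a caller-side workaround is to convert the result. `toNumericBreaks` is a hypothetical helper applied to the output shown above:

```javascript
// Hypothetical helper: convert every break to a number.
function toNumericBreaks(breaks) {
  return breaks.map(Number);
}

// The mixed result from the report.
const mixedBreaks = [40.4, "43.33", "44.95", "46.27", "47.95", 50.92];
const numericBreaks = toNumericBreaks(mixedBreaks);
```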
Hey,
I'm really interested in using this lib for the GeoStyler but the last release on NPM was 3 years ago.
Are there any plans to reactivate the NPM package, or do I have to download the lib from GitHub and include it manually?
BTW: The minified version is currently broken (0 bytes).