Comments (4)
Thanks @PragTob
I have based a lot of the behaviours in this package on the Python numpy and scipy libraries.
The scipy.stats.mode() function returns the lowest value when several values are tied for the highest frequency: https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.mode.html
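As an illustration of that tie-breaking rule, here is a minimal sketch in plain Python (standard library only). The name `smallest_mode` is hypothetical; it is not the scipy implementation or this package's API, only a small model of "pick the lowest value among those tied for the highest frequency".

```python
from collections import Counter


def smallest_mode(values):
    """Return the lowest of the values tied for the highest frequency.

    Hypothetical helper for illustration; raises ValueError on empty input
    because max() has nothing to compare.
    """
    counts = Counter(values)
    highest = max(counts.values())
    # Among all values that reach the highest count, keep the smallest one.
    return min(v for v, c in counts.items() if c == highest)


print(smallest_mode([3, 1, 3, 1, 2]))  # 1 and 3 both occur twice
print(smallest_mode([5, 4, 6]))        # every value occurs exactly once
```

Note that under this rule a data set with no repeated values still reports a mode: its minimum.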
Can you describe how you are using this function so I can understand your requirements?
from elixir-statistics.
Hi there,
Truth be told, I'm not really using it. I was looking at the library while evaluating whether it was worth releasing my own statistics library (Statistex), came across this, and thought I might as well report it, as I believe it to be a bug.
That said, I'm not the one to argue with numpy and scipy. It's just what I read up on when implementing mode (or, to be fully truthful, what someone else read up on and I questioned when reviewing it :) )
I think it's important to know when a data set has multiple modes, and to see them, as they might be very interesting. For instance, I use this mostly with benchmarking. Seeing that there are 2 modes that aren't directly "next" to each other is super interesting.
In the same vein, when no value occurs more than once, reporting just the smallest value seems highly unhelpful to me. In benchmarking terms, it would have me believe that the fastest run time is the one that occurs most frequently, which imo heavily skews the results.
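The alternative described here can be sketched as follows, in plain Python. This is only an assumption about what a multimodal-aware function might look like: the name `modes` and the choice to return an empty list when no value repeats are hypothetical, not the behaviour of Statistex, scipy, or this package.

```python
from collections import Counter


def modes(values):
    """Return every value tied for the highest frequency, sorted ascending.

    Hypothetical helper: returns [] for empty input and for input in which
    no value occurs more than once, so "no mode" is distinguishable.
    """
    counts = Counter(values)
    if not counts:
        return []
    highest = max(counts.values())
    if highest == 1:
        # Nothing repeats, so no value is "most frequent" in a useful sense.
        return []
    return sorted(v for v, c in counts.items() if c == highest)


# Made-up run times: two modes that are not adjacent to each other.
print(modes([10, 12, 12, 30, 30, 11]))
print(modes([1, 2, 3]))
```

With the made-up run times above, both 12 and 30 are reported, which makes the gap between the two clusters visible instead of hiding it behind the smaller value.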
Anyhow, that's just my perspective and what I remember from reading up on it back then. If numpy and scipy do it like this, I'm sure it's fine, so feel free to close this :)
Tobi
Thanks for the background @PragTob
To be honest, I probably didn't think too much about the mode implementation. It's not a statistic I use very often.
Benchmarking is probably better evaluated with percentiles anyway, especially if you have real (floating point) numbers.
There might be scope for another mode function with a different arity which can support multimodal datasets.
I find all of them useful, but yes, benchee also supports percentiles :) (Currently it shows the 99th percentile by default, which might be too hardcore; maybe the 95th would be a better default, not sure.) But yeah, love me some box plots :)