libmir / mir-random
Advanced Random Number Generators
Home Page: http://mir-random.libmir.org/
Hello, I was attempting to use normalVar when I ran into issues with the code not compiling.
template mir.random.variable.NormalVariable!double.NormalVariable.opCall cannot deduce function from argument types !()(MersenneTwisterEngine!(uint, 32LU, 624LU, 397LU, 31LU, 2567483615u, 11LU, 4294967295u, 7LU, 2636928640u, 15LU, 4022730752u, 18LU, 1812433253u)*), candidates are:
../../../.dub/packages/mir-random-2.1.5/mir-random/source/mir/random/variable.d(912,7): mir.random.variable.NormalVariable!double.NormalVariable.opCall(G)(ref scope G gen) if (isSaturatedRandomEngine!G)
../../../.dub/packages/mir-random-2.1.5/mir-random/source/mir/random/variable.d(941,7): mir.random.variable.NormalVariable!double.NormalVariable.opCall(G)(scope G* gen) if (isSaturatedRandomEngine!G)
/usr/bin/ldc2 failed with exit code 1.
These issues came when I tried using the following code.
import std.random;
import std.stdio;
import mir.random.variable;

void main()
{
    Random gen = Random(unpredictableSeed);
    double x = normalVar!double()(&gen);
    writeln(x);
}
I referred to the documentation to try to get the code working, but I was unable to understand the examples using the undefined rne and threadLocalPtr. Furthermore, when I tried running the examples from the documentation, they seemed to fail. In short, I was wondering what the correct way to get a random double from the NormalVariable struct is.
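For what it's worth, here is a minimal sketch of what I believe is the intended usage; the assumption is that the engine must come from mir.random rather than std.random, since std.random's MersenneTwisterEngine does not satisfy isSaturatedRandomEngine (which is what the two opCall candidates in the error message require):

```d
import std.stdio;
import mir.random;          // mir's own Random engine and unpredictableSeed
import mir.random.variable; // NormalVariable

void main()
{
    // mir.random's Random satisfies isSaturatedRandomEngine,
    // so the variable's opCall can deduce its argument:
    auto gen = Random(unpredictableSeed);
    auto rv = NormalVariable!double(0, 1); // mean 0, stddev 1
    double x = rv(gen);
    writeln(x);
}
```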
@n8sh Could you please tag the next v0.2.8 release, list the changes since v0.2.5, and announce it in the forum?
Since Linux 3.17 there is a getrandom syscall. It's pretty cool because it can block until enough entropy is available; see e.g.
https://lwn.net/Articles/606141/
or:
http://man7.org/linux/man-pages/man2/getrandom.2.html
Small proof-of-concept for Linux:
/*
* Flags for getrandom(2)
*
* GRND_NONBLOCK Don't block and return EAGAIN instead
* GRND_RANDOM Use the /dev/random pool instead of /dev/urandom
*/
enum GRND_NONBLOCK = 0x0001;
enum GRND_RANDOM = 0x0002;
enum GETRANDOM = 318;
// Minimal x86_64 syscall wrapper: `ident` goes in RAX and the three
// arguments in RDI, RSI and RDX, per the System V AMD64 convention.
size_t syscall(size_t ident, size_t n, size_t arg1, size_t arg2)
{
size_t ret;
synchronized asm
{
mov RAX, ident;
mov RDI, n[RBP];
mov RSI, arg1[RBP];
mov RDX, arg2[RBP];
syscall;
mov ret, RAX;
}
return ret;
}
void main() {
long buf;
// getrandom(&buf, buf.sizeof, 0): fill 8 bytes from the kernel CSPRNG
syscall(GETRANDOM, cast(size_t) &buf, buf.sizeof, 0);
import std.stdio;
writeln(buf);
}
syscall is taken from syscall-d.
Also should we provide a way for the user to directly receive random data via OS APIs?
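For comparison, a portable userland fallback that needs no syscall numbers at all is to read the kernel CSPRNG through the device file; a small sketch:

```d
import std.stdio;

void main()
{
    // Read 8 random bytes from the kernel's CSPRNG via /dev/urandom.
    // Unlike getrandom(2), this requires a file descriptor and does not
    // block waiting for the entropy pool (the trade-off noted in LWN 606141).
    ubyte[8] buf;
    auto f = File("/dev/urandom", "rb");
    f.rawRead(buf[]);
    writeln(buf);
}
```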
One of the tests in transformations.d has code like this:
assert(iv.ltx.approxEqual(-8.75651e-27));
assert(iv.lt1x.approxEqual(-1.02451e-24));
assert(iv.lt2x.approxEqual(-1.18581e-22));
Note that all the expected values are negative and small.
However, when I actually print out these values immediately before the test, they are all positive (but otherwise the same):
iv.ltx 8.756510893e-27
iv.lt1x 1.024511801e-24
iv.lt2x 1.185806751e-22
How is it that the test is passing when the expected values have the wrong sign? Well, these values are all very small, and approxEqual with default arguments will accept any two values whose absolute difference is less than 1e-5, so any two small values will always compare equal regardless of their signs.
Perhaps something like feqrel should be used instead: the values agree to at least 19 mantissa bits based on that function (if the sign is fixed). Alternatively, use a purely "relative" test rather than the dual relative + absolute test of approxEqual.
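To illustrate the pitfall, assuming std.math's approxEqual defaults (maxRelDiff = 1e-2, maxAbsDiff = 1e-5):

```d
import std.math : approxEqual, feqrel;

void main()
{
    // The absolute difference of these tiny values (~1.75e-26) is far below
    // the default maxAbsDiff of 1e-5, so approxEqual passes despite the sign flip:
    assert(approxEqual(-8.75651e-27, 8.75651e-27));

    // feqrel compares mantissa bits and reports zero bits of agreement
    // for values of opposite sign, so it catches the discrepancy:
    assert(feqrel(-8.75651e-27, 8.75651e-27) == 0);
}
```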
Hi. I'm getting a statistical artifact in a simulation of a quantile estimator. This artifact appears when I use mir-random, but not when I use Phobos.
At first I thought it was just an artifact (more or less expected) of an imperfect pseudo-random number generator. So I expected that if I changed the generator, either the artifact would go away or it would be replaced by different artifacts. But I tried two different mir generators (pcg32 and SplitMixEngine) and the artifact remains. Since this doesn't happen with Phobos, it suggests it might be an issue with mir-random.
Here's my simulation procedure:
1. Generate a random sample of size n (n=30 in my examples), using some distribution (at first I was using an ExponentialVariable, but I changed it to a UniformVariable, so I could easily compare it with Phobos).
2. Generate a random quantile between 0 and 1.
3. Estimate the quantile value using the sample. To do that, we sort the sample (to obtain the order statistics) and select the ordered sample value that best estimates the true quantile value (basically, the kth order statistic estimates q=k/(n+1)).
4. Since we know the actual distribution, obtain the true quantile corresponding to the estimated quantile value.
5. In one file (quantile.txt) output the pair (requested quantile, true estimated quantile).
6. In another file (quantile-diff.txt) output the pair (requested quantile, true estimated quantile - requested quantile).
7. Plot both files, using gnuplot.
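For reference, one iteration of the procedure above can be sketched roughly like this using Phobos only (the index clamping for "best estimates" is my assumption; the full simulation is in the gist linked below):

```d
import std.algorithm : sort;
import std.random;
import std.stdio;

void main()
{
    enum n = 30;
    auto gen = Random(unpredictableSeed);

    // 1. Random sample from Uniform(0, 1), sorted to get the order statistics.
    double[n] sample;
    foreach (ref s; sample)
        s = uniform01(gen);
    sort(sample[]);

    // 2. Random requested quantile q in (0, 1).
    double q = uniform01(gen);

    // 3. The k-th order statistic estimates q = k/(n+1), so pick k ~ q*(n+1),
    //    clamped to a valid index.
    size_t k = cast(size_t)(q * (n + 1));
    if (k < 1) k = 1;
    if (k > n) k = n;
    double est = sample[k - 1];

    // 4. For Uniform(0,1) the true quantile of a value x is x itself,
    //    so the two output pairs are simply:
    writefln("%s %s", q, est);     // -> quantile.txt
    writefln("%s %s", q, est - q); // -> quantile-diff.txt
}
```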
Here's an example of the kind of artifact I mean, a visible line among the expected dot dispersion. You might need to run the simulation more than once for the artifact to show, or for it to show in a different region.
Requested vs estimated quantile:
Quantile diff (same simulation run):
As part of my efforts to isolate this issue, one of the things I tried was to use a fixed order statistic (instead of the one that best approximates each quantile). Here's some example data I obtained using the order statistic x_{3:30} (D array index 2).
One trial without the artifact:
Another trial with the artifact. Notice both a different artifact strength, and a different estimate density (compare side-by-side with the previous example, for it to be easier to see):
Here's the gnuplot code:
set terminal png;
set output "quantile.png"
plot "quantile.txt" with dots notitle
set output "quantile-diff.png"
plot "quantile-diff.txt" with dots notitle
Here's the D simulation code: https://gist.github.com/luismarques/053e8f98bd0aba9657305541c8eabca9
Any ideas on what might be causing this?
I am working on software for mixed modelling. Currently I am using BLAS' gemm routine and would like to switch to mir-glas. However, the gsl library doesn't go well with mir-glas.
Here is a sample program that can be used to reproduce the errors: gemm_example.
dub.json => link
If I remove gsl from dependencies and libs in dub.json, the program compiles without any errors.
dub build --compiler=ldmd2 --parallel --force -v results in
https://gist.github.com/prasunanand/2bfd4b12e5fe43360bf0a0a90369af56
Other info
prasun@devUbuntu:~/dev/temp/gemm$ nm ~/.dub/packages/mir-cpuid-0.4.2/mir-cpuid/libmir-cpuid.a | grep cpuid_init
0000000000000000 T cpuid_init
prasun@devUbuntu:~/dev/temp/gemm$ nm ~/.dub/packages/mir-cpuid-0.4.2/mir-cpuid/libmir-cpuid.a | grep cpuid_dCache
0000000000000000 T cpuid_dCache
prasun@devUbuntu:~/dev/temp/gemm$ nm ~/.dub/packages/mir-cpuid-0.4.2/mir-cpuid/libmir-cpuid.a | grep cpuid_uCache
0000000000000000 T cpuid_uCache
prasun@devUbuntu:~/dev/temp/gemm$ ldmd2 -v
LDC - the LLVM D compiler (1.1.0):
based on DMD v2.071.2 and LLVM 3.7.1
built with LDC - the LLVM D compiler (0.17.1)
Default target: x86_64-unknown-linux-gnu
Host CPU: haswell
http://dlang.org - http://wiki.dlang.org/LDC
...
Maybe I'm doing something wrong (help would be appreciated), but I'm getting nonsensical results when using an ExponentialVariable (EV).
Take an EV with lambda 0.5. We can confirm that it has a median of around 1.38629: http://www.wolframalpha.com/input/?i=median+of+exponential+distribution+with+lambda+0.5.
Therefore, I would expect that an ExponentialVariable!double(0.5) range would return approximately half of its values < 1.38629 and half > 1.38629. That doesn't seem to be the case.
import std.stdio;
import mir.random.algorithm;
import mir.random.variable;
void main()
{
enum lambda = 0.5;
enum median = 1.38629;
enum n = 1000;
auto ev = ExponentialVariable!double(lambda).range;
int smaller;
int equal;
int larger;
foreach(i; 0 .. n)
{
auto v = ev.front();
ev.popFront();
if(v < median)
++smaller;
else if(v == median)
++equal;
else
++larger;
}
auto nf = double(n);
writefln("%s %s %s", smaller/nf, equal/nf, larger/nf);
}
(I used an explicit loop, instead of ranges, just to be sure I wasn't doing something unexpected).
Running this (https://run.dlang.io/is/r1p4vE) I get about 95% of values are smaller than the median.
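For comparison, a Phobos-only cross-check using inverse-CDF sampling (my own sketch, not mir code) does give roughly 50% of values below the median:

```d
import std.math : log;
import std.random;
import std.stdio;

void main()
{
    enum lambda = 0.5;
    immutable median = log(2.0) / lambda; // ~1.38629, matching Wolfram Alpha
    enum n = 100_000;

    auto gen = Random(unpredictableSeed);
    int smaller;
    foreach (i; 0 .. n)
    {
        // Inverse CDF of Exp(lambda): x = -ln(1 - u) / lambda, u ~ Uniform[0,1)
        immutable x = -log(1.0 - uniform01(gen)) / lambda;
        if (x < median)
            ++smaller;
    }
    writefln("fraction below median: %s", double(smaller) / n); // ~0.5
}
```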
Hi Nathan,
Could you please send me a link to your public profile, if any, or your email address?
Kind regards,
Ilya
ping @n8sh
unpredictableSeed uses the time and process id to generate a random seed (see here).
Q: Why doesn't it let the OS do the work? (e.g. /dev/random or getrandom)
(comes from Dlang's issue tracker: https://issues.dlang.org/show_bug.cgi?id=16493)
There used to be Vose's O(1) discrete sampler in mir:
libmir/mir#259
https://github.com/libmir/mir/blob/da76cf406d06957e472b9ba90b4c90b917480cb9/source/mir/random/discrete.d
Paper: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.398.3339&rep=rep1&type=pdf
It was pretty cool because it allowed fast sampling once the initial O(n) setup cost was paid.
@9il: I assume that you didn't port it because of its usage of the experimental Allocator?
It could be changed to use malloc and either @disable this(this)
or allow copying with ref-counting.
Q: Is there interest in porting this to mir-random and if so how would you like the allocation of the working data to be done?
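For context, the alias method can be sketched like this (my own illustrative D using Phobos, following the standard Vose construction rather than the removed mir implementation):

```d
import std.random;
import std.stdio;

// Vose's alias method: O(n) setup, O(1) per draw.
struct AliasTable
{
    double[] prob;
    size_t[] aliasIdx;

    this(const double[] weights)
    {
        immutable n = weights.length;
        prob = new double[n];
        aliasIdx = new size_t[n];

        double total = 0;
        foreach (w; weights) total += w;

        // Scale weights so the average bin height is exactly 1.
        auto scaled = new double[n];
        foreach (i, w; weights) scaled[i] = w * n / total;

        size_t[] small, large;
        foreach (i; 0 .. n)
        {
            if (scaled[i] < 1) small ~= i;
            else large ~= i;
        }

        // Pair each under-full bin with an over-full donor bin.
        while (small.length && large.length)
        {
            immutable s = small[$ - 1]; small = small[0 .. $ - 1];
            immutable l = large[$ - 1]; large = large[0 .. $ - 1];
            prob[s] = scaled[s];
            aliasIdx[s] = l;
            scaled[l] += scaled[s] - 1; // donor loses what it donated
            if (scaled[l] < 1) small ~= l;
            else large ~= l;
        }
        // Leftovers are exactly 1 up to rounding.
        foreach (i; small) prob[i] = 1;
        foreach (i; large) prob[i] = 1;
    }

    size_t opCall(ref Random gen)
    {
        immutable i = uniform(size_t(0), prob.length, gen);
        return uniform01(gen) < prob[i] ? i : aliasIdx[i];
    }
}

void main()
{
    auto gen = Random(unpredictableSeed);
    auto table = AliasTable([0.1, 0.2, 0.7]);
    size_t[3] counts;
    foreach (i; 0 .. 10_000)
        ++counts[table(gen)];
    writeln(counts); // roughly proportional to the weights
}
```

Using malloc plus @disable this(this) as suggested would mainly change the two array allocations in the constructor; the table logic itself is allocation-agnostic.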
In ndvariable, assert is used to signal wrong input. But assert is not very convenient because it's too fatal. Couldn't exceptions be used here?
I believe there is a typo in the template below:
template transform(string f0, string f1, string f2, string c)
{
import std.array : replace;
enum raw = `_f0 *= _c;
_f0 = copysign(exp(_f0), _c);
_f2 = _c * _f1 * _f1 + _f2;
_f2 *= _c * _f0;
_f1 *= c;
_f1 *= _f0;`;
enum transform = raw.replace("_f0", f0).replace("_f1", f1).replace("_f2", f2).replace("_c", c);
}
Note the line _f1 *= c; I believe it should be _f1 *= _c;. It happens to work because everywhere the template is instantiated the input variable happens to be named 'c', so the replace("_c", c) isn't strictly necessary in those cases.
Any plans to include a multinomial random variable?
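In the meantime, a multinomial variate can be sketched by counting categorical draws (a Phobos-only illustration on my part; a real implementation would likely use sequential conditional binomials instead, which is O(k) per variate rather than O(trials)):

```d
import std.random;
import std.stdio;

// One multinomial variate: count how many of `trials` categorical draws
// land in each category, given probabilities p that sum to 1.
size_t[] multinomial(ref Random gen, const double[] p, size_t trials)
{
    auto counts = new size_t[p.length];
    foreach (t; 0 .. trials)
    {
        immutable u = uniform01(gen);
        double acc = 0;
        foreach (i, pi; p)
        {
            acc += pi;
            if (u < acc || i + 1 == p.length) // last bin absorbs rounding error
            {
                ++counts[i];
                break;
            }
        }
    }
    return counts;
}

void main()
{
    auto gen = Random(unpredictableSeed);
    writeln(multinomial(gen, [0.2, 0.3, 0.5], 1000)); // roughly [200, 300, 500]
}
```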
The MersenneTwisterEngine template distinguishes between the UIntType used to store generator state and the word size w: for example, when UIntType is ulong (64-bit), the word size may still be only 32 or (say) 48 bits. In this case the returned variates should consist of only the lowest w bits of the generated UIntType value (which is not currently enforced), and the seeding process should also take this into account (currently only partially done: the first entry in the state array takes the lowest w bits of the provided seed, but the population of the other state array entries does not take this into account).
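The masking part is straightforward; a sketch of what enforcing the lowest w bits might look like (the helper name is mine, not the engine's):

```d
// Keep only the lowest w bits of a generated value (or of the seed)
// when the word size w is smaller than the storage type's bit width.
ulong maskToWordSize(ulong x, uint w)
{
    return w >= 64 ? x : x & ((1UL << w) - 1);
}

unittest
{
    assert(maskToWordSize(0xFFFF_FFFF_FFFF_FFFFUL, 48) == 0xFFFF_FFFF_FFFFUL);
    assert(maskToWordSize(0x1_0000_0001UL, 32) == 1);
    assert(maskToWordSize(42, 64) == 42);
}
```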
A comparison of the output of mir.random's MersenneTwisterEngine and C++11's mersenne_twister_engine can be made using the following D unittest:
unittest
{
    import std.meta : AliasSeq;
    import std.stdio : writeln;

    alias MT(UIntType, uint w) = MersenneTwisterEngine!(UIntType, w, 624, 397, 31,
                                                        0x9908b0df, 11, 0xffffffff, 7,
                                                        0x9d2c5680, 15,
                                                        0xefc60000, 18, 1812433253);

    foreach (R; AliasSeq!(MT!(uint, 32), MT!(ulong, 32), MT!(ulong, 48), MT!(ulong, 64)))
    {
        static if (R.wordSize == 48) static assert(R.max == 0xFFFFFFFFFFFF);
        auto a = R(R.defaultSeed);
        writeln(a());
        writeln(a());
        writeln();
    }
}
... and the following C++11 program:
#include <cinttypes>
#include <iostream>
#include <random>
#include <type_traits>
template<class UIntType, size_t w>
void mt_result ()
{
std::mersenne_twister_engine<
UIntType, w, 624, 397, 31,
0x9908b0df, 11, 0xffffffff, 7,
0x9d2c5680, 15,
0xefc60000, 18, 1812433253> gen;
gen.seed(std::mt19937::default_seed);
std::cout << gen() << std::endl;
//for (int i = 0; i < 599; ++i)
// gen();
std::cout << gen() << std::endl;
std::cout << std::endl;
}
int main ()
{
mt_result<uint32_t, 32>();
mt_result<uint64_t, 32>();
mt_result<uint64_t, 48>();
mt_result<uint64_t, 64>();
}
xorshift1024* / xorshift1024*φ is faster and has better statistical properties than Mt19937_64, while occupying considerably less memory (136 bytes vs 2512 bytes).
I might suggest that the default mir.random.engine.Random
should be something much more lightweight, but using xorshift1024*
will not harm any existing code that assumes that Random
is suitable for massive simulations. Rather, such code will be improved.
I have made a branch with the proposed change:
n8sh@7dc9438
With Mir v0.7.0 released with extMul, I think it may be time for a new mir-random release. Agree?
Running the code below on Windows 10 with --compiler=ldc2
import mir.ndslice;
import mir.random : threadLocalPtr, Random;
import mir.random.variable : normalVar;
void main() {
auto sample = threadLocalPtr!Random.randomSlice(normalVar, 15);
}
throws the following exception:
Error: no property randomSlice for type mir.random.engine.mersenne_twister.MersenneTwisterEngine!(ulong, 64LU, 312LU, 156LU, 31LU, 13043109905998158313LU, 29LU, 6148914691236517205LU, 17LU, 8202884508482404352LU, 37LU, 18444473444759240704LU, 43LU, 6364136223846793005LU)*
dub.json
"dependencies": {
"mir-algorithm": "~>3.7.18",
"mir-random": "~>2.2.11"
},
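A guess on my part: randomSlice is declared in mir.random.algorithm, and the UFCS call can't find it without that module being imported. If so, this variant may compile (an unverified assumption, not a confirmed fix):

```d
import mir.random : threadLocalPtr, Random;
import mir.random.algorithm : randomSlice; // assumption: randomSlice lives here
import mir.random.variable : normalVar;

void main()
{
    // Same call as above; only the extra import differs.
    auto sample = threadLocalPtr!Random.randomSlice(normalVar, 15);
}
```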
mir.random currently uses the name Mt19937 to mean the standard 32-bit generator when building for 32-bit targets and the MT19937-64 generator when building for 64-bit targets.
However, this is problematic terminology. MT19937 is explicitly defined to be the version of the generator that has a 32-bit unsigned integer as its data type, a 32-bit word size, a 624-word state, etc. etc.: see the opening paragraphs of https://en.wikipedia.org/wiki/Mersenne_Twister
The recommended course of action here is to delete the Mt19937_32 symbol and to use Mt19937 as the symbol only for the 32-bit generator. If this is acceptable I'll submit patches to this effect.
Note that the changes proposed here do not affect the decision of what the default RNG Random should be.
The forum has a question about simulating multiple values from a multivariate normal distribution. This is possible with a for loop, but it seems not currently possible with mir.random.algorithm's range.
void main()
{
import mir.random : Random, unpredictableSeed;
import mir.random.ndvariable : MultivariateNormalVariable;
import mir.random.algorithm : range;
import mir.ndslice.slice : sliced;
import std.range : take;
auto mu = [10.0, 0.0].sliced;
auto sigma = [2.0, -1.5, -1.5, 2.0].sliced(2,2);
auto rng = Random(unpredictableSeed);
auto sample = range!rng
(MultivariateNormalVariable!double(mu, sigma))
.take(10);
}
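The for-loop workaround mentioned above might look like this (a sketch on my part; the buffer handling is my assumption):

```d
import mir.random : Random, unpredictableSeed;
import mir.random.ndvariable : MultivariateNormalVariable;
import mir.ndslice.slice : sliced;
import std.stdio : writeln;

void main()
{
    auto mu = [10.0, 0.0].sliced;
    auto sigma = [2.0, -1.5, -1.5, 2.0].sliced(2, 2);
    auto rng = Random(unpredictableSeed);
    auto rv = MultivariateNormalVariable!double(mu, sigma);

    // Draw 10 points, one opCall per 2-D point.
    double[2][] sample = new double[2][](10);
    foreach (ref row; sample)
        rv(rng, row[]); // fills one point in place
    writeln(sample);
}
```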
Hi there,
I'm developing a time-reaction propagator for physical simulations based on the Gillespie algorithm.
See https://code.dlang.org/packages/gillespied.
I would like to provide an optional dependency on mir.random.
I managed to do this; however, while testing I found that for a particular case the test results don't fit the expected results.
See https://bitbucket.org/Sandman8/gillespied/issues/1/bug-with-real-type
The tests I run include std.random from Phobos and mir.random, both for the types float, double, real and size_t. The test fails only for the combination of mir.random & real. This is the reason why I'm writing here. However, I'm not sure where the bug really is. Maybe you could have a look at this.
original libmir/mir#390
We have version(D_betterC) now. This allows declaring thread-local default RNGs like the ones in Phobos.
This is an enhancement request concerning the Tiny Mersenne Twister algorithm: could it be implemented / added to mir?
Thanks,
Ezneh.
The _z field of the MersenneTwisterEngine is uninitialized on struct creation: the constructor does not touch it directly, but calls popFront(), where in this first call _z is used to set the value of z without _z itself having been initialized first.
Presumably the intention/expectation is that _z should be initialized to data[index] in the constructor?
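If that reading is right, the fix pattern can be illustrated with a toy engine (entirely my own stand-in, with made-up seeding and tempering; only the initialization order mirrors the real issue):

```d
// Toy illustration of the bug pattern: a cached field read by the first
// popFront() must be seeded in the constructor, or it starts as garbage.
struct TinyEngine
{
    uint[4] data;
    size_t index;
    uint _z;

    this(uint seed)
    {
        foreach (i, ref d; data)
            d = seed + cast(uint) i; // stand-in for the real seeding recurrence
        index = data.length - 1;
        _z = data[index]; // the proposed fix: without this line, _z is undefined
        popFront();
    }

    @property uint front() const { return _z; }

    void popFront()
    {
        auto z = _z;                 // the very first call reads _z here
        index = (index + 1) % data.length;
        _z = data[index] ^ (z >> 1); // stand-in for the real tempering
    }
}

unittest
{
    // Deterministic: two engines with the same seed agree from the start.
    auto a = TinyEngine(123);
    auto b = TinyEngine(123);
    assert(a.front == b.front);
}
```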
I struggle to find how to do a simple random integer generation.
import mir.random;
import mir.random.engine.mersenne_twister;
void main() {
auto rng = Mt19937(123);
auto n = rng.rand!int(10);
}
does not work.
So whenever I want to generate a random integer I'd have to cast. Is there a Mir alternative to np.random.randint(1, 10)?
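If I'm reading the API right, randIndex covers this; a hedged sketch of an np.random.randint(1, 10) equivalent, assuming randIndex!T(m) yields a uniform integer in [0, m):

```d
import std.stdio;
import mir.random;
import mir.random.engine.mersenne_twister;

void main()
{
    auto rng = Mt19937(123);
    // randIndex!uint(9) is uniform over [0, 9); shifting by 1 gives [1, 10),
    // matching np.random.randint(1, 10)'s half-open interval.
    auto n = 1 + rng.randIndex!uint(9);
    writeln(n);
}
```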
isRandomEngine makes use of the RandomEngine UDA in order to determine whether a given functor is a random engine. Aside from terminology (technically a random engine is a pseudo-random generator, and this check does not account for pseudo-random vs. 'true' random), there are a number of concerns with this approach:
Forced dependencies. The isUniformRNG check in Phobos' std.random, which looks for a compile-time boolean flag isUniformRandom, means that 3rd-party code can implement compatible generators without having any direct dependency on std.random. For example, the code implemented at https://github.com/WebDrake/dxorshift does not import std.random except in unittests, where it is used to verify compatibility. By contrast, any code that wishes to implement mir.random-compatible RNGs is forced to import the RandomEngine enum.
Forwarding of information via generic wrappers. The UDA will not be available if a generator is wrapped by functionality like (say) RefCounted or Unique. Currently enum constant fields will also not be forwarded (see: https://issues.dlang.org/show_bug.cgi?id=14830), but that at least seems possible in principle; it seems more dubious that UDAs would be reasonable to forward via a generic wrapper.
With this in mind it might be worth considering if a UDA is the right way to mark functors as random number generators.
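For comparison, the Phobos convention marks engines with a compile-time flag that needs no imports at all; a simplified sketch (MyEngine and the toy LCG step are mine, illustrative only):

```d
// A generator advertises itself purely structurally: any type with
// `enum bool isUniformRandom = true` plus the range API qualifies,
// so third-party engines need no import of the checking module.
struct MyEngine
{
    enum bool isUniformRandom = true;
    enum uint min = uint.min;
    enum uint max = uint.max;

    uint state = 1;
    @property uint front() const { return state; }
    void popFront() { state = state * 747796405u + 2891336453u; } // toy LCG step
    void seed(uint s) { state = s; }
}

// Duck-typed check (simplified version of std.random.isUniformRNG):
template isUniformRNG(R)
{
    static if (is(typeof(R.isUniformRandom) : bool))
        enum isUniformRNG = R.isUniformRandom;
    else
        enum isUniformRNG = false;
}

static assert(isUniformRNG!MyEngine);
static assert(!isUniformRNG!int);
```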
I can't find pcg.d inside zip for release 0.2.5...
// Compile script below with (set randomRange to RandomRange):
// $ dub run --single random.d
#!/usr/bin/env dub
/+ dub.sdl:
name "random"
dependency "mir" version="0.22.0"
dependency "mir-random" version="~>0.0.1"
dependency "mir-math" version="~>0.0.1"
+/
import std.range, std.stdio;
import mir.ndslice;
import mir.random;
import mir.random.variable: NormalVariable;
import mir.random.algorithm: RandomRange;
void main(){
auto rng = Random(unpredictableSeed); // Engines are allocated on stack or global
auto sample = rng // Engines are passed by reference to algorithms
.range(NormalVariable!double(0, 1))// Random variables are passed by value
.take(1000) // Fix sample length to 1000 elements (Input Range API)
.array; // Allocates memory and performs computation
writeln(sample);
}
//Got the error:
Performing "debug" build using dmd for x86_64.
mir-internal 0.0.1: target for configuration "library" is up to date.
mir-math 0.0.1: target for configuration "library" is up to date.
mir-random 0.0.1: target for configuration "library" is up to date.
random ~master: building configuration "application"...
random.d(20,6): Error: no property 'range' for type 'MersenneTwisterEngine!(ulong, 64LU, 312LU, 156LU, 31LU, 13043109905998158313LU, 29u, 6148914691236517205LU, 17u, 8202884508482404352LU, 37u, 18444473444759240704LU, 43u)'
dmd failed with exit code 1.
See #51 (comment)
Linux kernel headers define __NR_getrandom (and other syscall numbers). I have created an empty https://github.com/libmir/mir-linux-kernel for Linux headers. https://github.com/torvalds/linux can be used as reference.
We need the arch/<arch-name>/include/uapi/asm/unistd.h files and a few small others that are imported by some archs. The D file name structure can look like mir/linux/arch/<arch-name>/uapi/asm/unistd.d, plus a unified file mir/linux/asm/unistd.d that selects the arch version from predefined versions. D does not support all architectures, but many of them are supported by LDC and the D specification; see predefined-versions for details.
I am trying to generate a two-dimensional normally distributed point cloud using the covariance matrix:
[ 4 0
  0 4 ]
Then I calculate the variance for each dimension; for the second dimension the variance is around 1.0 instead of 4.0.
This source code is runnable on run.dlang.org:
/+dub.sdl:
dependency "mir-algorithm" version="~>3.7.8"
dependency "mir-random" version="~>2.2.8"
+/
import std;
void main()
{
import mir.ndslice.slice: sliced;
import mir.random;
import mir.random.ndvariable : multivariateNormalVar;
// given settings
enum DataSize = 10_000;
scope Random* gen = threadLocalPtr!Random;
auto sigma = [4.0f, -0.0f, -0.0f, 4.0f].sliced(2,2);
auto rv = multivariateNormalVar!float(sigma);
// generate samples
float[2][] samples;
samples.length = DataSize;
samples.each!((ref v) { rv(gen, v[]); });
//writeln(samples);
// calculate mean and variance
const mean_y = samples.map!"a[1]".mean;
writeln(mean_y);
float var_y = 0;
samples.each!(v=>var_y += (v[1]-mean_y)^^2);
var_y /= DataSize-1;
writeln(var_y); // <====== variance is expected to be around 4.0, but it always is around 1.0
}