Comments (17)

grahamgower commented on June 11, 2024

I found this because a small proportion of my simulations were running forever (an endless loop caused by calling sim.readFromPopulationFile() and having a repeating sequence of random numbers). This is the smallest code I could come up with to exhibit the problem. My practical code is much more complicated, but I'm happy to share it if necessary.

from slim.

grahamgower commented on June 11, 2024

Ok, thanks Ben. I am not even remotely an expert on RNG algorithms, so I won't comment on MT compared with other algorithms. I had no problems with 200,000 simulations using setSeed(rdunif(1,0,asInteger(2^62)-1)), whereas I had 80 or so stuck processes after ~25,000 simulations with setSeed(rdunif(1,0,asInteger(2^32)-1)). I don't need the functionality you describe above, so I'm going to change my code to avoid repeatedly calling setSeed().

grahamgower commented on June 11, 2024

One final comment from me, and then I'll leave you alone. :)

If we draw n=100,000 numbers from a 32-bit range, then the expected number of duplicates is just n*(n-1) / 2^32 = 2.33. Whereas for a 62-bit range, we have n*(n-1)/2^62 = 2.17e-9. So the MT is doing just fine here; it's rather the range of possible values that's the issue. It strongly suggests that reseeding a PRNG from a sequence of 32-bit random integers is probably a bad idea, even with a perfect source of entropy.
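The arithmetic above can be checked with a short calculation. The sketch below is an illustration only, not code from SLiM, and both function names are hypothetical. It counts each colliding pair once, so it returns half of the n*(n-1)/2^k figures quoted above (which count both members of every pair). It also gives the usual birthday-problem approximation sqrt(pi*N/2) for the number of draws expected before the first repeat.

```cpp
#include <cassert>
#include <cmath>

// Expected number of colliding pairs among n independent uniform draws
// from a range of 2^bits values: C(n, 2) / 2^bits.
double expected_colliding_pairs(double n, int bits) {
    assert(n >= 0.0 && bits > 0);
    return n * (n - 1.0) / 2.0 / std::ldexp(1.0, bits);
}

// Birthday-problem approximation for the expected number of draws until
// the first repeated value, for a range of N = 2^bits values: sqrt(pi * N / 2).
double expected_draws_to_first_repeat(int bits) {
    assert(bits > 0);
    const double pi = std::acos(-1.0);
    return std::sqrt(pi / 2.0 * std::ldexp(1.0, bits));
}
```

For n = 100,000, expected_colliding_pairs(1e5, 32) is about 1.16 and expected_colliding_pairs(1e5, 62) is about 1.08e-9, while expected_draws_to_first_repeat(32) is roughly 82,000.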

bhaller commented on June 11, 2024

Interesting. Well, the situation under the hood is rather complex; Eidos currently has two random number generators that it maintains in parallel, one 32-bit and one 64-bit. By re-seeding without generating any intervening values, you may be exercising the random number generator setup in a way that would be unlikely to occur in real simulations. Did you see an issue with this in real practice? But in any case, that recommendation was made back when Eidos had only a 32-bit generator, so trying to generate a number in [0, 2^62-1] would not have worked; but now it works (rdunif() draws from the 64-bit generator), so the recommendation in the manual should probably change as you suggest. Thanks.

grahamgower commented on June 11, 2024

I saw that you're using biased modulo arithmetic for rdunif(). I made the following change, but it didn't resolve my problem. You might be interested in making this change in the future anyway.

diff --git a/eidos/eidos_rng.h b/eidos/eidos_rng.h
index 3cf3e82a..605b8b5f 100644
--- a/eidos/eidos_rng.h
+++ b/eidos/eidos_rng.h
@@ -406,16 +406,17 @@ inline __attribute__((always_inline)) double Eidos_MT64_genrand64_real3(void)
 /* BCH: generates a random integer in [0, p_n - 1]; parallel to Eidos_rng_uniform_int() above */
 inline __attribute__((always_inline)) uint64_t Eidos_rng_uniform_int_MT64(uint64_t p_n)
 {
-	// OK, so.  The GSL's uniform int method, whose logic we replicate in Eidos_rng_uniform_int(), makes sure
-	// that the probability of each integer is exactly equal by figuring out a scaling, and then looping on
-	// generated draws, with that scaling applied, until it gets one that is in range.  Here we skip that extra
-	// work and just use modulo.  This technically means our draws will be biased toward the low end, unless
-	// p_n is an exact divisor of UINT64_MAX, I guess; but UINT64_MAX is so vastly large compared to the uses
-	// we will put this generator to that the bias should be utterly undetectable.  We are not drawing values
-	// in anywhere near the full range of the generator; we just need a couple of orders of magnitude more
-	// headroom than UINT32_MAX provides.  If we start to use this for a wider range of p_n (such as making it
-	// available in the Eidos APIs), this decision would need to be revisited.  BCH 12 May 2018
-	return Eidos_MT64_genrand64_int64() % p_n;
+	// Lemire's integer multiplication debiasing.
+	// http://www.pcg-random.org/posts/bounded-rands.html
+	uint64_t t = (-p_n) % p_n;
+	uint64_t l;
+	__uint128_t m;
+	do {
+		uint64_t x = Eidos_MT64_genrand64_int64();
+		m = __uint128_t(x) * __uint128_t(p_n);
+		l = uint64_t(m);
+	} while (l < t);
+	return m >> 64;
 }
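For readers who want to try the two approaches outside of Eidos, here is a minimal standalone sketch of the same idea. It assumes std::mt19937_64 as a stand-in for Eidos_MT64_genrand64_int64(); the function names bounded_modulo and bounded_lemire are hypothetical. Like the diff, it relies on __uint128_t, which is a GCC/Clang extension on 64-bit targets rather than standard C++.

```cpp
#include <cassert>
#include <cstdint>
#include <random>

// Biased version: plain modulo reduction, as in the original code path.
// Values near the low end of [0, n) are very slightly more likely unless
// n divides 2^64 exactly.
template <class Gen>
std::uint64_t bounded_modulo(Gen& gen, std::uint64_t n) {
    assert(n > 0);
    return gen() % n;
}

// Unbiased version: Lemire's multiply-and-reject method.
// gen() must return uniformly distributed 64-bit values.
template <class Gen>
std::uint64_t bounded_lemire(Gen& gen, std::uint64_t n) {
    assert(n > 0);
    // 2^64 mod n, computed in 64-bit arithmetic as (2^64 - n) mod n.
    const std::uint64_t threshold = (0 - n) % n;
    __uint128_t m;
    do {
        // 128-bit product: the high 64 bits are the candidate result,
        // the low 64 bits decide whether the draw must be rejected.
        m = static_cast<__uint128_t>(gen()) * n;
    } while (static_cast<std::uint64_t>(m) < threshold);
    return static_cast<std::uint64_t>(m >> 64);
}
```

Both functions return a value in [0, n - 1]; only the second gives every value exactly equal probability.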
 
 

bhaller commented on June 11, 2024

Yes; as the comment indicates, this design is deliberate. It's a concession to efficiency; we want RNG draws to be as fast as possible. I might put your change in as an option selectable with a compile-time switch (see also the "fast poisson" code in eidos_rng.h), but I haven't seen any evidence that this is an issue for real simulations.

grahamgower commented on June 11, 2024

FYI, the diff I provided is faster on my hardware.

Biased:

$ echo "for (i in 1:100) {rdunif(10000000, 0, 500+i);}" | eidos -time /dev/stdin 

// ********** CPU time used: 13.0755

Unbiased:

$ echo "for (i in 1:100) {rdunif(10000000, 0, 500+i);}" | ./eidos -time /dev/stdin 

// ********** CPU time used: 5.19165

bhaller commented on June 11, 2024

How is that possible?

grahamgower commented on June 11, 2024

I was wondering that myself... The function is inlined, so I guess the modulo operation in my diff gets moved out of the busy loop. I'm not even on a recent gcc (v5.5.0).

grahamgower commented on June 11, 2024

Actually, it can't be inlining/gcc doing anything, because that's determined at run time (/me slaps forehead). It must be the cpu doing out-of-order execution. The following code is marginally slower with my diff. So asking for more than 1 random number at a time makes my diff win.

echo "for (i in 1:100000000) {rdunif(1, 0, 500+i);}" | ./eidos -time /dev/stdin

bhaller commented on June 11, 2024

Yeah, I was about to ask you for that n=1 timing test. :-> Yeah, with >1 draw the C++ code is executing a loop over the number of draws, and the invariants in the inlined function can be moved outside of the loop, so it can be fast. (Still impressive that it is actually faster than a single modulo, wow!) But with 1 draw there is no such win, and the added complexity makes your code a bit slower. The difference is marginal when the timing test involves Eidos code like this, because when executing for (i in 1:100000000) {rdunif(1, 0, 500+i);} the bulk of the overhead is probably in the Eidos interpreter anyway. But for the case where the random number draws are being requested one at a time from SLiM's core engine, with no Eidos interpreter overhead, I would imagine the performance impact would actually be quite large; you could do a C++ loop instead of an Eidos loop to see, if you're curious (I'm curious :->). You'd have to prevent inlining from optimizing your timing loop, though. Random number generation does show up as a hotspot in many models, so it's worth worrying about the performance, although I'm not sure this particular code path is a typical hotspot. In any case, it would be nice to provide your diffs as an option, as I mentioned.
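A C++ timing loop of the kind suggested here could look roughly like the sketch below. It is an assumption-laden illustration, not SLiM code: time_single_draws and TimingResult are hypothetical names, and the draw function is passed in by the caller. The bound changes on every iteration and is derived from a volatile variable, so nothing that depends on the bound can be hoisted out of the loop, and the draws are summed into a checksum so they cannot be discarded.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <random>

struct TimingResult {
    double seconds;          // wall-clock time spent in the loop
    std::uint64_t checksum;  // sum of all draws, so the work stays observable
};

// Calls draw(bound) `count` times, one draw per call, with a different
// bound each time (500 + i, loosely mirroring the Eidos test above).
template <class DrawFn>
TimingResult time_single_draws(DrawFn draw, std::uint64_t count) {
    assert(count > 0);
    // volatile: the compiler must read this at run time and cannot treat
    // the bound as a compile-time constant.
    volatile std::uint64_t base = 500;
    std::uint64_t checksum = 0;
    const auto start = std::chrono::steady_clock::now();
    for (std::uint64_t i = 1; i <= count; ++i) {
        checksum += draw(base + i);
    }
    const auto stop = std::chrono::steady_clock::now();
    return {std::chrono::duration<double>(stop - start).count(), checksum};
}
```

The function would be called once per implementation, for example with a lambda that wraps the modulo version and another that wraps the debiased version. Printing the returned checksum helps ensure that the draws really are performed.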

Speaking of your diffs: they rely on __uint128_t, which I guess is a GCC extension? SLiM tries to compile on a pretty broad range of platforms, so that is not ideal. If it's in a conditionally compiled block anyway, then it's not the end of the world; people using a compiler that doesn't support __uint128_t just can't flip that particular compile-time switch. But if there's an alternate way of coding this that would be more platform-agnostic that would be preferable...?
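One platform-agnostic possibility, sketched here under the assumption that only standard 64-bit integers are available, is to build the 64x64 -> 128-bit product from four 32-bit partial products. The names Product128, multiply_64x64 and bounded_lemire_portable are hypothetical, and std::mt19937_64 (used in the test of this sketch) is just a stand-in generator; this has not been benchmarked against the __uint128_t version.

```cpp
#include <cassert>
#include <cstdint>
#include <random>

struct Product128 {
    std::uint64_t hi;  // upper 64 bits of the product
    std::uint64_t lo;  // lower 64 bits of the product
};

// Full 128-bit product of two 64-bit values using only 64-bit arithmetic.
Product128 multiply_64x64(std::uint64_t a, std::uint64_t b) {
    const std::uint64_t mask = 0xFFFFFFFFULL;
    const std::uint64_t a_lo = a & mask, a_hi = a >> 32;
    const std::uint64_t b_lo = b & mask, b_hi = b >> 32;

    const std::uint64_t p0 = a_lo * b_lo;  // contributes at bit 0
    const std::uint64_t p1 = a_lo * b_hi;  // contributes at bit 32
    const std::uint64_t p2 = a_hi * b_lo;  // contributes at bit 32
    const std::uint64_t p3 = a_hi * b_hi;  // contributes at bit 64

    // Sum of everything that lands in bits 32..63, including carries;
    // at most 3 * (2^32 - 1), so it cannot overflow 64 bits.
    const std::uint64_t mid = (p0 >> 32) + (p1 & mask) + (p2 & mask);

    Product128 result;
    result.lo = (mid << 32) | (p0 & mask);
    result.hi = p3 + (p1 >> 32) + (p2 >> 32) + (mid >> 32);
    return result;
}

// Lemire's multiply-and-reject method without any 128-bit integer type.
// gen() must return uniformly distributed 64-bit values.
template <class Gen>
std::uint64_t bounded_lemire_portable(Gen& gen, std::uint64_t n) {
    assert(n > 0);
    const std::uint64_t threshold = (0 - n) % n;  // 2^64 mod n
    Product128 m;
    do {
        m = multiply_64x64(gen(), n);
    } while (m.lo < threshold);
    return m.hi;
}
```

The extra multiplications make this slower than a native 128-bit multiply, so it would probably make most sense as a fallback for compilers that lack __uint128_t.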

grahamgower commented on June 11, 2024

Hmm... I'm not sure about __uint128_t on other compilers (maybe unsigned long long works for some?). To be honest, I just picked a simple debiasing from pcg-random.org that had reasonable performance. It might not be the best choice. And if you really care about tuning the performance, you should consider replacing the Mersenne Twister with a generator from the PCG family.

bhaller commented on June 11, 2024

Hmm. Well, that MT64 RNG had the best overall performance (in terms of both speed and good pseudorandomness properties) in some comparison article I found, not sure where. I'm unlikely to worry about it further unless it shows up as a bottleneck for a model I'm trying to optimize; SLiM's code base is too large to go looking for trouble like that. :->

grahamgower commented on June 11, 2024

After some more experimentation, I reckon this is directly caused by the MT. I downloaded the mt19937ar.c file and made some small changes (mt19937ar-grg.c.txt).

$ gcc -Wall mt19937ar-grg.c -o mt19937ar
$ for i in `seq 10`; do ./mt19937ar $i | awk '{if (a[$1] > 0) {print "seed='$i': repeat at", NR", same as", a[$1]", delta =", NR-a[$1]; exit} a[$1]=NR}'; done
seed=4: repeat at 92774, same as 41948, delta = 50826
seed=5: repeat at 37087, same as 22246, delta = 14841
seed=7: repeat at 43354, same as 42850, delta = 504
seed=8: repeat at 7481, same as 5537, delta = 1944
seed=10: repeat at 83321, same as 63121, delta = 20200

The numbers drawn can repeat with a short delta. I haven't done the maths to see if this is normal for a perfect RNG; intuitively it's not. In any event, the sequences do not repeat when this happens. However, if you reseed the RNG with this drawn number, then you'll get a repeating sequence.

What was the original reason to setSeed() in slim after calling sim.readFromPopulationFile()? Is this even necessary?
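The repeating-sequence effect described above can be reproduced in a few lines. The sketch below is a simplified model, not the SLiM code path: each "restart" is reduced to seeding a std::mt19937_64 and taking its first output, truncated to a given number of bits, as the next seed. The names CycleInfo and find_reseed_cycle are hypothetical. Because the next seed is a deterministic function of the current seed, the chain of seeds must eventually revisit a value and then loop forever.

```cpp
#include <cassert>
#include <cstdint>
#include <random>
#include <unordered_map>

struct CycleInfo {
    std::uint64_t first_repeat_step;  // number of reseeds before a seed recurs
    std::uint64_t cycle_length;       // length of the loop that is then entered
};

// Follows seed -> first RNG output (truncated to `bits` bits) -> next seed
// until some seed is seen a second time.
CycleInfo find_reseed_cycle(std::uint64_t start_seed, unsigned bits) {
    assert(bits >= 1 && bits <= 32);
    const std::uint64_t mask = (1ULL << bits) - 1;
    std::unordered_map<std::uint64_t, std::uint64_t> step_of_seed;
    std::uint64_t seed = start_seed & mask;
    for (std::uint64_t step = 0;; ++step) {
        const auto it = step_of_seed.find(seed);
        if (it != step_of_seed.end()) {
            return {step, step - it->second};
        }
        step_of_seed.emplace(seed, step);
        std::mt19937_64 rng(seed);  // the "restart": reseed the generator
        seed = rng() & mask;        // draw the next seed from a small range
    }
}
```

With bits = 16 the loop is found within at most 65,536 steps. For a range of N values the expected number of steps is on the order of sqrt(pi*N/2), which is roughly 82,000 restarts for a 32-bit range.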

bhaller commented on June 11, 2024

Interesting. I agree that it doesn't seem "perfect", but cryptographic-quality RNGs are much slower, and such momentary correlations between numbers produced at very long intervals seem unlikely to matter for simulations as long as they don't follow a regular pattern. But of course they bite in this particular use case. It may be that re-seeding an RNG with a number generated by the RNG itself is generally considered to be a no-no, so we might be violating the "rules" right there, I don't know. Overall, my impression is that the MT generator is considered to be pretty high quality given its speed.

As to why we do this: it is an option, not a requirement. The idea is to create a specific type of reproducibility: the ability to skip over intervening attempts in a conditional simulation. If you don't re-seed the generator when you return to the save point, then to recreate the final, successful run you have to do the entire run, including all restarts. If you want to do model runs that are conditional on a rare event, like fixation of a deleterious mutation, that might take a very long time. If, on the other hand, you re-seed the generator at every restart and print the new seed to your output stream, then once you have a successful run you can reproduce it without having to execute all of the intervening failed attempts: just load the save file, re-seed with the final, successful seed, and proceed forward with the successful run sequence. But given that this technique is running into problems, it might be best not to recommend it in the manual, but rather to just mention it as an optional add-on, with a discussion of the possible caveats.

On the other hand, your change to use asInteger(2^62)-1 fixes the issue, as I understand it, yes? So perhaps I should just adopt that as my new recommendation, with a bit more discussion of the fact that this is an option, not a requirement.
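The reproducibility pattern described above can be sketched outside of SLiM as follows. Everything here is a hypothetical stand-in: run_from_save_point plays the role of loading the save file, re-seeding, and running forward, and its "rare event" is an arbitrary toy condition. Unlike the recipe under discussion, this sketch draws the new seeds from a separate generator rather than from the generator being re-seeded; that is a design choice of the sketch, not something the thread prescribes. Seeds are drawn from [0, 2^62 - 1], matching the range suggested above.

```cpp
#include <cassert>
#include <cstdint>
#include <random>

struct AttemptResult {
    bool success;           // did the rare event occur in this attempt?
    std::uint64_t outcome;  // toy summary of the attempt's final state
};

// Toy stand-in for "load the save file, re-seed, and run forward".
// The result depends only on the seed, so any attempt can be replayed.
AttemptResult run_from_save_point(std::uint64_t seed) {
    std::mt19937_64 rng(seed);
    std::uint64_t outcome = 0;
    for (int i = 0; i < 100; ++i) {
        outcome ^= rng();
    }
    return {outcome % 50 == 0, outcome};  // succeeds in roughly 1 of 50 attempts
}

struct SuccessfulRun {
    std::uint64_t seed;     // the seed to record in the output stream
    std::uint64_t outcome;
    int attempts;
};

// Restarts from the save point with a fresh seed until an attempt succeeds.
SuccessfulRun run_until_success(std::uint64_t master_seed) {
    std::mt19937_64 seed_source(master_seed);
    std::uniform_int_distribution<std::uint64_t> seed_dist(0, (1ULL << 62) - 1);
    for (int attempts = 1;; ++attempts) {
        const std::uint64_t seed = seed_dist(seed_source);
        assert(seed < (1ULL << 62));  // seed stays within the 62-bit range
        const AttemptResult result = run_from_save_point(seed);
        if (result.success) {
            return {seed, result.outcome, attempts};
        }
    }
}
```

Once run_until_success has returned, calling run_from_save_point with the recorded seed reproduces the successful attempt directly, without repeating any of the failed ones.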

bhaller commented on June 11, 2024

Yes, that makes sense. Thanks very much for delving into this issue, this is very helpful. I won't do anything about it right now, since I'm busy with some other stuff; but I will definitely address it for the next release of SLiM, and the above discussion will be really useful for that. Thanks!

bhaller commented on June 11, 2024

Hi @grahamgower. I changed the recipes and added a discussion of this issue in the manual as well. It will come out in the next SLiM release. Thanks again.
