Comments (6)
from slim.
Hrm. I have no guesses, but I'll look into it when I have a sec.
from slim.
OK, I've put some thinking-out-loud here so you can see the process, but the upshot is, bug fixed. So with a seed of 2 instead of 123, some of the first-generation individuals live on to the end, and their locations are corrupted too; it's not only the individuals that are present in the table solely because they were remembered. I added a call to sim.treeSeqOutput("first_gen.treestxt", _binary=F); before the binary-file output call, and the positions in the text output are corrupted too, so it is not an issue specifically with binary output. I also wrote a little script to read the .trees file back in and then print positions, and the positions of first-generation individuals are corrupted there too:
initialize() {
    initializeSLiMModelType("nonWF");
    initializeSLiMOptions(dimensionality="xy");
    initializeTreeSeq();
    initializeMutationType("m1", 0.5, "g", 0.0, 2);
    initializeGenomicElementType("g1", m1, 1.0);
    initializeGenomicElement(g1, 0, 9);
    initializeMutationRate(0.0);
    initializeRecombinationRate(1e-8);
}
// no offspring are generated; the model only inspects the loaded state
reproduction() {
}
1 late() {
    // read the saved tree sequence back in, then print each individual's
    // pedigree ID and spatial position
    sim.readFromPopulationFile("first_gen.trees");
    for (ind in p1.individuals) {
        catn(c("late", ind.pedigreeID, ind.x, ind.y));
    }
    sim.simulationFinished();
}
So the corruption is also hosing SLiM's reading back in, unsurprisingly; this is just a confirmation of what we (think we) already know, that the data in the files is in fact bad (rather than this being a weird pyslim bug or something).
So, there is something about the way individuals in the first generation are handled that is different from other individuals, and that results in their location data being corrupted. Well. When addSubpop() is called, that results in the first-gen individuals being archived immediately, right? With whatever random data happens to be in their location properties. Then setSpatialPosition() gets called, but that does not update the archived bad data. Then in the late() event you call treeSeqRememberIndividuals(), but only on the individuals that survived through the first mortality phase. It should update the archived data for those individuals; but nobody ever updates the archive for the other first-gen individuals.
So we have two groups of first-gen individuals. The first group is those who died during the first mortality phase; in my test run with seed 2, that is ids 1 4 6 9. They get archived with garbage and never get updated, so it is not surprising that they have garbage. To fix them, I would think you would want to call treeSeqRememberIndividuals() in your 1 early() event after setting up their spatial positions.
But then the second group is those who lived through the first mortality phase; in my test run that is 0 2 3 5 7 8. They should have their archives updated by treeSeqRememberIndividuals(), it seems to me, and that apparently is not happening. SLiMSim::ExecuteMethod_treeSeqRememberIndividuals() calls SLiMSim::AddIndividualsToTable(), and the code there to update the location data... makes no sense. It was passing memcpy() &location, which is a pointer to a std::vector; and it was passing a size of location.size(), which is the number of entries in location, not the number of bytes. Changing that to location.data() and location.size() * sizeof(double) seems to have fixed the bug, as far as my tests indicate.
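To make the two mistakes in that memcpy() call concrete, here is a minimal standalone sketch. It is not SLiM's actual code: pack_location() and unpack_location() are hypothetical helpers, and only the memcpy() pattern itself is taken from the description above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical helper (not from SLiM): copy a vector of doubles into a raw
// byte buffer, the way one might fill a binary table column.
std::vector<char> pack_location(const std::vector<double> &location)
{
    // Buggy pattern described above:
    //     memcpy(dst, &location, location.size());
    // &location is the address of the std::vector object itself (its internal
    // bookkeeping), and location.size() counts elements, not bytes.

    // Corrected pattern: copy from the element storage, with a size in bytes.
    std::vector<char> buffer(location.size() * sizeof(double));
    if (!buffer.empty())
        std::memcpy(buffer.data(), location.data(), buffer.size());
    return buffer;
}

// Hypothetical helper (not from SLiM): read the doubles back out of the bytes.
std::vector<double> unpack_location(const std::vector<char> &buffer)
{
    assert(buffer.size() % sizeof(double) == 0);
    std::vector<double> location(buffer.size() / sizeof(double));
    if (!location.empty())
        std::memcpy(location.data(), buffer.data(), buffer.size());
    return location;
}
```

With the corrected pattern, a round trip through the byte buffer gives back the original coordinates; with the buggy pattern, the destination would receive a few bytes of the vector's internal pointers instead.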
And after fixing that bug, the first group of individuals (1 4 6 9) did indeed still have garbage location data, and adding a treeSeqRememberIndividuals() call at the end of the 1 early() event did indeed fix that. So that also seems to make sense, although it's sucky that one has to call treeSeqRememberIndividuals() for this to work; that's a bit counterintuitive and will probably hose other people too. Not sure how to fix it, though.
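The difference between the two groups, once the memcpy() bug is fixed, can be mimicked with a toy model. This is only a sketch of the behavior as described in this thread, not SLiM's implementation: Individual, Record, Archive, and run_first_generation() are all made-up names, NaN stands in for the uninitialized "garbage" values, and "odd ids die" is an arbitrary stand-in for the mortality phase.

```cpp
#include <cassert>
#include <cmath>
#include <limits>
#include <map>
#include <vector>

// Toy stand-ins; none of these types or names come from SLiM.
struct Individual { int id; double x; double y; bool alive; };
struct Record { double x; double y; };

struct Archive {
    std::map<int, Record> records;
    // Insert or overwrite the archived state with the individual's current state.
    void remember(const Individual &ind) { records[ind.id] = Record{ind.x, ind.y}; }
};

Archive run_first_generation(bool remember_in_early)
{
    const double unset = std::numeric_limits<double>::quiet_NaN();
    std::vector<Individual> pop;
    Archive archive;

    // "addSubpop()": individuals are archived at creation, before positions exist.
    for (int id = 0; id < 4; ++id) {
        pop.push_back(Individual{id, unset, unset, true});
        archive.remember(pop.back());
    }

    // "1 early()": positions are assigned; this alone does not touch the archive.
    for (Individual &ind : pop) { ind.x = ind.id; ind.y = 2.0 * ind.id; }
    if (remember_in_early)
        for (const Individual &ind : pop) archive.remember(ind);

    // Mortality: odd ids die in this toy run.
    for (Individual &ind : pop) ind.alive = (ind.id % 2 == 0);

    // "1 late()": only the survivors are remembered again.
    for (const Individual &ind : pop)
        if (ind.alive) archive.remember(ind);

    assert(archive.records.size() == pop.size());
    return archive;
}
```

Calling run_first_generation(false) leaves the records of the individuals that died holding the unset values, while run_first_generation(true) gives every first-generation record its assigned position.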
This really is one of those "how did this ever work??" bugs, since any call to treeSeqRememberIndividuals() that caused an update of an existing record ought to have corrupted the location data, as far as I can tell. So all of the individuals that lived for more than one generation in your model should have become corrupted, it seems to me; I'm not sure why they didn't. The mysteries of life; I'm not inclined to track that down. :->
If you think there are any loose ends here, let me know. Thanks for the catch, this is a good bug. :->
from slim.
Whew, good catch. I don't think that the bad info in non-surviving individuals is terrible; but perhaps there should be a bit in the treeSeqRememberIndividuals() documentation saying that (a) you can call it in early and/or late; and (b) maybe you'd want to call it in early, before mortality; and (c) a brief explanation of this gotcha.
I guess we didn't see this because we've got no tests that check whether first-gen locations are correctly recorded.
from slim.
@petrelharp The new doc:
– (void)treeSeqRememberIndividuals(object<Individual> individuals)
Permanently adds the individuals specified by individuals to the sample retained across tree sequence table simplification. This method may only be called if tree sequence recording has been turned on with initializeTreeSeq(). All currently living individuals are always retained across simplification; this method does not need to be called, and indeed should not be called, for that purpose. Instead, treeSeqRememberIndividuals() is for permanently adding particular individuals to the retained sample. Typically this would be used, for example, to retain particular individuals that you wanted to be able to trace ancestry back to in later analysis. However, this is not the typical usage pattern for tree sequence recording; most models will not need to call this method.
Calling treeSeqRememberIndividuals() on an individual that is already remembered will cause the archived information about the remembered individual to be updated to reflect the individual’s current state. A case where this is particularly important is for the spatial location of individuals in continuous-space models. SLiM automatically remembers the individuals that comprise the first generation of any new subpopulation created with addSubpop(), for easy recapitation and other analysis (see section 16.10). However, since these first-generation individuals are remembered at the moment they are created, their spatial locations have not yet been set up, and will contain garbage – and those garbage values will be archived in their remembered state. If you need correct spatial locations of first-generation individuals for your post-simulation analysis, you should call treeSeqRememberIndividuals() explicitly on the first generation, after setting spatial locations, to update the archived information with the correct spatial positions.
Does that seem good?
from slim.
Sure, good edit. This will not make it into the 3.2 docs (that train has just left the station :->), but it'll roll in whatever the next version after that is. :->
from slim.