messerlab / slim

SLiM is a genetically explicit forward simulation software package for population genetics and evolutionary biology. It is highly flexible, with a built-in scripting language, and has a cross-platform graphical modeling environment called SLiMgui.

Home Page: https://messerlab.org/slim/

License: GNU General Public License v3.0

Objective-C 0.83% Objective-C++ 6.29% C++ 58.50% C 17.85% CMake 0.18% HTML 6.57% Python 0.25% Rich Text Format 6.14% QMake 0.17% Slim 0.23% Makefile 1.38% Shell 0.30% M4 1.32%


slim's Issues

max(1,2) != max(2,1)

max() on integer arguments always returns the first argument.

diff --git a/eidos/eidos_functions.cpp b/eidos/eidos_functions.cpp
index 9bd3deee..28f02312 100644
--- a/eidos/eidos_functions.cpp
+++ b/eidos/eidos_functions.cpp
@@ -3615,7 +3615,7 @@ EidosValue_SP Eidos_ExecuteFunction_max(const EidosValue_SP *const p_arguments,
 			
 			if (arg_count == 1)
 			{
-				int64_t temp = arg_value->LogicalAtIndex(0, nullptr);
+				int64_t temp = arg_value->IntAtIndex(0, nullptr);
 				if (max < temp)
 					max = temp;
 			}
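
A small Python model of the loop may make the effect of this one-line change easier to see. It is only a sketch, not the Eidos source: `buggy_max` and `fixed_max` are hypothetical names, it assumes the running maximum starts from the first argument, and it assumes that reading an integer as a logical yields 1 for any nonzero value.

```python
def buggy_max(first, *rest):
    """Model of the pre-patch loop: later arguments are read as logicals."""
    current = first
    for value in rest:
        temp = int(bool(value))  # stands in for LogicalAtIndex(): nonzero -> 1
        if current < temp:
            current = temp
    return current


def fixed_max(first, *rest):
    """Model of the patched loop: later arguments are read as integers."""
    current = first
    for value in rest:
        temp = int(value)  # stands in for IntAtIndex()
        if current < temp:
            current = temp
    return current
```

Under these assumptions `buggy_max(1, 2)` gives 1 while `buggy_max(2, 1)` gives 2, which matches the title of this issue; `fixed_max` gives 2 for both orders.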

Error: condition for if statement has size() != 1 for partial sweeps

I'm trying to simulate a partial sweep, and I want the script to keep running as long as the partial sweep hasn't been established. The script runs for most of my simulations, but for certain simulations it returns the following error:

My script is as follows:

How can I get rid of this error?
Lastly, I know I can create a haplotype plot using SLiMgui, but I run all these simulations from my terminal and do not use SLiMgui. Is there a way to save these haplotypes from the terminal as well?

Include 'install' target in generated Makefile

When building from source and installing into system locations, the usual convention is to call make install. The current CMake generated makefiles don't support this. It should be straightforward to include, as CMake certainly supports it.

This is useful when building, e.g., a Docker image containing SLiM, saving the trouble of copying the built binary to /usr/bin or wherever.

`qnorm()` and similar functions, for truncation selection?

I'd like to implement truncation selection by providing a tail probability on the command line. To do so, we'd need a function like R's qnorm(), which, given a probability (and Gaussian parameters), returns the corresponding quantile. Is this something that you'd consider merging into SLiM?

GSL has the required functions, which could be wrapped much like rnorm() is. If you're open to the idea, I'd be happy to work on it and submit a pull request.
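
For reference, the quantile calculation being asked for can be sketched in Python with the standard library's `statistics.NormalDist`. This only illustrates the arithmetic and is not a proposed Eidos API; the function name `truncation_threshold` and its parameters are hypothetical.

```python
from statistics import NormalDist


def truncation_threshold(tail_prob, mean=0.0, sd=1.0):
    """Return the value above which a fraction `tail_prob` of a Gaussian lies."""
    if not 0.0 < tail_prob < 1.0:
        raise ValueError("tail_prob must be strictly between 0 and 1")
    # The upper tail of size p starts at the (1 - p) quantile.
    return NormalDist(mean, sd).inv_cdf(1.0 - tail_prob)
```

For truncation selection, individuals whose phenotype exceeds `truncation_threshold(p, mean, sd)` would be the selected fraction `p`, assuming phenotypes are normally distributed with those parameters.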

Segmentation fault when processing a large recombination map

I think I discovered a bug in SLiM's processing of recombination rates.

I'm trying to simulate the impact of purifying selection against non-synonymous deleterious variants on linked intergenic/intronic neutral sites. I've generated a recombination map based on real genome annotations, specifying recombination rates between adjacent exons and/or neutral sites.

I noticed that as the number of recombination regions gets larger (a little over 1 million individual recombination regions), SLiM crashes with a segmentation fault.

Investigating this issue more closely, I found that this is the offending line:
https://github.com/MesserLab/SLiM/blob/master/core/chromosome.cpp#L129

SLiM creates an array of doubles B to store recombination rates on the stack, which is limited in size. This can cause a stack overflow if the number of recombination rates gets too large, leading to a segmentation fault.

Whether this happens or not depends on the stack limit. For example, the stack limit on my system is ~8 MB. Since sizeof(double) is 8 bytes, 8 MB holds a bit over 1 million doubles, which leads to a stack overflow when more than 1 million recombination regions are specified by the user (my case exactly).

It was very easy to fix the stack overflow problem by using a std::vector (which allocates memory on the heap internally) instead of a plain C array. You can see the changes I made to fix this issue in this commit: https://github.com/bodkan/SLiM/commit/7073e7ffc37de882b3b6f2a5f854d63399056df3
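
The back-of-the-envelope arithmetic above can be written out as a short Python sketch. The helper names are hypothetical, and the calculation ignores everything else that lives on the stack, so the real threshold is somewhat lower.

```python
DOUBLE_BYTES = 8  # sizeof(double) on the platforms discussed here


def doubles_that_fit(stack_limit_bytes, element_bytes=DOUBLE_BYTES):
    """Upper bound on how many doubles a stack of the given size could hold."""
    return stack_limit_bytes // element_bytes


def array_exceeds_stack(n_regions, stack_limit_bytes, element_bytes=DOUBLE_BYTES):
    """True if an array of n_regions doubles alone is larger than the stack."""
    return n_regions * element_bytes > stack_limit_bytes


EIGHT_MIB = 8 * 1024 * 1024
print(doubles_that_fit(EIGHT_MIB))  # 1048576
```

An array with one entry per recombination region therefore cannot fit once the map has more than about a million regions, whereas a heap-allocated container is not bound by this limit.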

I've uploaded a testing recombination map file and a simple SLiM configuration script generated from that file, which should be enough to reproduce the bug.

Let me know if this makes any sense!

Segmentation fault with chromosome painting

Hi,

I have been running a script that uses mutations to track the hybrid index of two populations upon a secondary contact event. I have to track mutations since fitness is based on the hybrid index. However, during certain runs, segmentation faults (core dumped) occur.

I think that you should be able to reproduce the error with the following information.

// Initial random seed:
1889190714725

// RunInitializeCallbacks():
initializeMutationRate(0);
initializeRecombinationRate(1e-05);
initializeMutationType(2, 0.5, "f", 0);
initializeGenomicElementType(1, m2, 1);
initializeGenomicElement(g1, 0, 149999);

// Starting run at generation <start>:
1

Segmentation fault

Script:

initialize() {
	
	defineConstant('selection_strength', 0.015);
	defineConstant('migration', 0.001);
	defineConstant('seq_len', 1.5e5);
	defineConstant('recombination_rate', 1e-6);
	
	initializeMutationRate(0.0);
	initializeRecombinationRate(1e-5);

	// m2 mutation for pop2 = mutation labeling individuals from pop2
	//mutation have fixed effect size 0.0 == neutral
	initializeMutationType("m2", 0.5, "f", 0.0);
	
	//make sure fixed mutations are still counted with countMutationsOfType(m2)
	m2.convertToSubstitution = F;
	
	//genomic element g1 only contains mutations of type m2	
	initializeGenomicElementType("g1", m2, 1.0);
	//whole genome can potentially carry a mutation of type m2
	initializeGenomicElement(g1, 0, asInteger(seq_len)-1);
	
}

function (float)FitnessFunction(float HI, float S){
	
	return 1 - S*(4*HI*(1-HI));
}

1 {
	sim.addSubpop("p1", 2000);
	sim.addSubpop("p2", 2000);
	p1.setMigrationRates(p2, migration);
	p2.setMigrationRates(p1, migration);
}


1 late(){
	//label all genomes of individuals from pop2 with an m2 mutation
	p2.genomes.addNewMutation(m2, 0.0, 0:(asInteger(seq_len)-1));
}

1: late(){
	for (pop in c(p1, p2)){
	inds = pop.individuals;
	//phenotypes == hybriditity index
	phenotypes = inds.countOfMutationsOfType(m2)/(2*seq_len);
	inds.fitnessScaling =FitnessFunction(phenotypes, selection_strength);
	//save the hybriditiy index as in the by SLiM provided tagF variable slot for each individual		
	inds.tagF = phenotypes;}	
	}

1000 late(){sim.simulationFinished();}
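
As a quick sanity check on the fitness function in the script, here is the same formula in Python (a sketch for illustration only; `hybrid_fitness` is a hypothetical name). With S = 0.015, fitness is 1 for pure parental genotypes (HI of 0 or 1) and lowest, 1 - S, at HI = 0.5.

```python
def hybrid_fitness(hybrid_index, strength):
    """Fitness 1 - S * 4 * HI * (1 - HI), as in FitnessFunction() above."""
    if not 0.0 <= hybrid_index <= 1.0:
        raise ValueError("hybrid_index must lie in [0, 1]")
    return 1.0 - strength * (4.0 * hybrid_index * (1.0 - hybrid_index))


# Fitness across a few hybrid index values for the script's selection strength.
for hi in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(hi, hybrid_fitness(hi, 0.015))
```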

LTO-related compilation problems

Hi @molpopgen. The LTO stuff you contributed to SLiM is now out in the 3.2.1 release (thanks!), and a user sent me the following:

I had a little trouble compiling on my ubuntu 14.04 machine.

cmake seemed to work fine. But there were many compilation errors similar to:
/usr/bin/ranlib: pow_int.c.o: plugin needed to handle lto object

I added the following lines to CMakeLists.txt:

SET(CMAKE_AR "gcc-ar")
SET(CMAKE_C_ARCHIVE_CREATE "<CMAKE_AR> qcs <LINK_FLAGS> ")
SET(CMAKE_C_ARCHIVE_FINISH true)

SET(CMAKE_CXX_ARCHIVE_CREATE "<CMAKE_AR> qcs <LINK_FLAGS> ")
SET(CMAKE_CXX_ARCHIVE_FINISH true)

It then compiled, and all tests passed. Thought I would pass this on, in case it might be of use to others (though possibly it is an unusual issue, or easily solved by someone that compiles a lot of c++).

I have no idea what this all means; I'm hoping you do. :-> I'm not sure where in CMakeLists.txt to add that, much less what it means or what the problem is. If you grok this, could you possibly submit a pull request that fixes it? Thanks!

Ability to run command-line slim using stdin as input

Using the command-line utility, it would be useful to be able to pipe commands into slim, rather than have to rely on a file in the filesystem. E.g. I'd hope to be able to do

cat test.txt | slim

as well as

slim test.txt

This would allow me to e.g. generate an eidos script in python, and pipe it into slim, without having to use a temporary file as an intermediate step. You can do this with python scripts: the python man page says

... when called with a file name argument or with a file as standard input, it reads and executes a script from that file; ..... In non-interactive mode, the entire input is parsed before it is executed.
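
Until something like this is supported, the temporary-file step can at least be hidden behind a small helper. The following Python sketch is one way to do that; `run_slim_script` is a hypothetical name, and it assumes a `slim` executable on the PATH that is invoked as `slim <file>`, as in the example above.

```python
import os
import subprocess
import tempfile


def run_slim_script(script_text, slim_binary="slim"):
    """Write the script to a temporary file, run it, and return stdout."""
    with tempfile.NamedTemporaryFile("w", suffix=".slim", delete=False) as handle:
        handle.write(script_text)
        path = handle.name
    try:
        result = subprocess.run(
            [slim_binary, path],
            capture_output=True,
            text=True,
            check=True,  # raise CalledProcessError on a nonzero exit status
        )
        return result.stdout
    finally:
        os.remove(path)  # clean up even if the run fails
```

A script generated in Python can then be run with `run_slim_script(generated_text)`, with the temporary file created and deleted behind the scenes.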

avoid extra copy on load of tree sequences?

When loading a tree sequence, SLiM makes an extra copy of it due to a restriction that - I think - has been removed. According to this note, tsk_table_collection_load( ) now allocates columns by default, which is I think what we needed here?

This could be the cause of some early spikes in memory usage we've seen.

Add support for installation in Bioconda

Apparently Bioconda (https://bioconda.github.io) is a thing people often use to install software in the Linux and Mac worlds. A user has requested that we add SLiM to it. This would require more Linux experience, etc., than I really have, I think, but @petrelharp has expressed possible willingness to take it on. I'm not sure how this normally works, but I imagine we would only want tagged releases of SLiM to be available through Bioconda; perhaps each time we did a new tagged release we would need to submit a new Bioconda configuration file or something...?

passing NAN to rgb2color segfaults

The following code

rgb2color(c(0/0, 0, 0));

produces a segfault on both Linux and OS X.

undefined reference to `clock_gettime' in eidos_functions.cpp

Hi, I am trying to compile SLiM-v3.3.1 on our Linux server, but it returns errors:

Linking CXX executable slim
CMakeFiles/slim.dir/eidos/eidos_functions.cpp.o: In function `Eidos_ExecuteFunction_clock(Eidos_intrusive_ptr<EidosValue> const*, int, EidosInterpreter&)':
eidos_functions.cpp:(.text+0xb36a): undefined reference to `clock_gettime'
CMakeFiles/slim.dir/eidos/eidos_functions.cpp.o: In function `Eidos_ExecuteLambdaInternal(Eidos_intrusive_ptr<EidosValue> const*, EidosInterpreter&, bool)':
eidos_functions.cpp:(.text+0x268cb): undefined reference to `clock_gettime'
eidos_functions.cpp:(.text+0x26946): undefined reference to `clock_gettime'
collect2: error: ld returned 1 exit status
make[3]: *** [slim] Error 1
make[2]: *** [CMakeFiles/slim.dir/all] Error 2
make[1]: *** [CMakeFiles/slim.dir/rule] Error 2
make: *** [slim] Error 2

But I can successfully compile SLiM-v3.3 after fixing a build error on CentOS 7 under gcc 6.1.0.

CMake: add support for -lrt if needed?

In the initial conda-forge package (conda-forge/staged-recipes#10613) we hit problems with real-time clock stuff (see the PR for details). Perhaps a solution is to detect whether -lrt is needed in the CMake configuration and include it if so?

cc @brnorris03

add metadata recording to tree sequences

Over at #4 we currently record, in binary, as metadata:

mutations: mutation type ID, selection coefficient, subpopulation index, and time of origin
nodes: genome ID

We also have a method to get

individual: age, sex, x, y, and z

and can record it in node metadata as well... but we need to do a few things to make that happen:

combine genome ID with the individual information
get the individual information to RecordNewGenome somehow (some places it could be easily passed in, but less easily in DoCrossoverMutation... ?)
resolve: should we also add subpop ID to the individual info?

Write python tools to interface with tskit.

We'll need methods to:

read the stuff we store in metadata for individuals, sites, mutations, and nodes
write to the metadata to use when initializing SLiM with a .trees file
add info to the provenance table that SLiM expects when reading in a .trees file as "neutral burn-in"

... maybe that's it?

document tree sequence format more

It's come to my attention that we don't explain anywhere in the manual (that I could find) that the first-generation individuals are present in the .trees output file. This, and probably a few more things, should be in a section in "Output methods". I'm happy to write this.

Build failures on Linux

The command-line version from messerlab.org is failing to build on my Linux machine. The machine runs Ubuntu 15.10:

Distributor ID: Ubuntu
Description: Ubuntu 15.10
Release: 15.10
Codename: wily

I see errors with the following compilers:

g++ -v
Using built-in specs.
COLLECT_GCC=g++
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/5/lto-wrapper
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu 5.2.1-22ubuntu2' --with-bugurl=file:///usr/share/doc/gcc-5/README.Bugs --enable-languages=c,ada,c++,java,go,d,fortran,objc,obj-c++ --prefix=/usr --program-suffix=-5 --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new --enable-gnu-unique-object --disable-vtable-verify --enable-libmpx --enable-plugin --with-system-zlib --disable-browser-plugin --enable-java-awt=gtk --enable-gtk-cairo --with-java-home=/usr/lib/jvm/java-1.5.0-gcj-5-amd64/jre --enable-java-home --with-jvm-root-dir=/usr/lib/jvm/java-1.5.0-gcj-5-amd64 --with-jvm-jar-dir=/usr/lib/jvm-exports/java-1.5.0-gcj-5-amd64 --with-arch-directory=amd64 --with-ecj-jar=/usr/share/java/eclipse-ecj.jar --enable-objc-gc --enable-multiarch --disable-werror --with-arch-32=i686 --with-abi=m64 --with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu
Thread model: posix
gcc version 5.2.1 20151010 (Ubuntu 5.2.1-22ubuntu2)

clang++-3.7 (I edited Makefile to use this).

I get the same errors with both. This is the list of errors from clang++ 3.7:

./core/slim_sim.cpp:1311:23: error: invalid operands to binary expression ('const std::type_info' and
'const std::type_info')
if ((typeid(_sig) != typeid(_previous_sig)) ||
~~~~~~~~~~~~ ^ ~~~~~~~~~~~~~~~~~~~~~
1 error generated.
./eidos/eidos_functions.cpp:794:23: error: cannot initialize a parameter of type 'long long *' with an rvalue of type
'int64_t *' (aka 'long *')
bool overflow = Eidos_smulll_overflow(product, operand, &product);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
./eidos/eidos_global.h:154:76: note: expanded from macro 'Eidos_smulll_overflow'

#define Eidos_smulll_overflow(a, b, c) __builtin_smulll_overflow((a), (b), (c))

^~~

./eidos/eidos_functions.cpp:855:23: error: cannot initialize a parameter of type 'long long *' with an rvalue of type
'int64_t *' (aka 'long *')
bool overflow = Eidos_saddll_overflow(sum, operand, &sum);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
./eidos/eidos_global.h:152:76: note: expanded from macro 'Eidos_saddll_overflow'

#define Eidos_saddll_overflow(a, b, c) __builtin_saddll_overflow((a), (b), (c))

^~~

./eidos/eidos_functions.cpp:1284:23: error: cannot initialize a parameter of type 'long long *' with an rvalue of type
'int64_t *' (aka 'long *')
bool overflow = Eidos_smulll_overflow(old_product, temp, &product);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
./eidos/eidos_global.h:154:76: note: expanded from macro 'Eidos_smulll_overflow'

#define Eidos_smulll_overflow(a, b, c) __builtin_smulll_overflow((a), (b), (c))

^~~

./eidos/eidos_functions.cpp:1357:23: error: cannot initialize a parameter of type 'long long *' with an rvalue of type
'int64_t *' (aka 'long *')
bool overflow = Eidos_saddll_overflow(old_sum, temp, &sum);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
./eidos/eidos_global.h:152:76: note: expanded from macro 'Eidos_saddll_overflow'

#define Eidos_saddll_overflow(a, b, c) __builtin_saddll_overflow((a), (b), (c))

^~~

4 errors generated.
./eidos/eidos_interpreter.cpp:1449:22: error: cannot initialize a parameter of type 'long long *' with an rvalue of type
'int64_t *' (aka 'long *')
bool overflow = Eidos_saddll_overflow(first_operand, second_operand, &add_result);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
./eidos/eidos_global.h:152:76: note: expanded from macro 'Eidos_saddll_overflow'

#define Eidos_saddll_overflow(a, b, c) __builtin_saddll_overflow((a), (b), (c))

^~~

./eidos/eidos_interpreter.cpp:1471:23: error: cannot initialize a parameter of type 'long long *' with an rvalue of type
'int64_t *' (aka 'long *')
...bool overflow = Eidos_saddll_overflow(first_operand, second_operand, &add_result);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
./eidos/eidos_global.h:152:76: note: expanded from macro 'Eidos_saddll_overflow'

#define Eidos_saddll_overflow(a, b, c) __builtin_saddll_overflow((a), (b), (c))

^~~

./eidos/eidos_interpreter.cpp:1496:22: error: cannot initialize a parameter of type 'long long *' with an rvalue of type
'int64_t *' (aka 'long *')
bool overflow = Eidos_saddll_overflow(singleton_int, second_operand, &add_result);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
./eidos/eidos_global.h:152:76: note: expanded from macro 'Eidos_saddll_overflow'

#define Eidos_saddll_overflow(a, b, c) __builtin_saddll_overflow((a), (b), (c))

^~~

./eidos/eidos_interpreter.cpp:1520:22: error: cannot initialize a parameter of type 'long long *' with an rvalue of type
'int64_t *' (aka 'long *')
bool overflow = Eidos_saddll_overflow(first_operand, singleton_int, &add_result);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
./eidos/eidos_global.h:152:76: note: expanded from macro 'Eidos_saddll_overflow'

#define Eidos_saddll_overflow(a, b, c) __builtin_saddll_overflow((a), (b), (c))

^~~

./eidos/eidos_interpreter.cpp:1665:21: error: cannot initialize a parameter of type 'long long *' with an rvalue of type
'int64_t *' (aka 'long *')
bool overflow = Eidos_ssubll_overflow(0, operand, &subtract_result);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
./eidos/eidos_global.h:153:76: note: expanded from macro 'Eidos_ssubll_overflow'

#define Eidos_ssubll_overflow(a, b, c) __builtin_ssubll_overflow((a), (b), (c))

^~~

./eidos/eidos_interpreter.cpp:1685:22: error: cannot initialize a parameter of type 'long long *' with an rvalue of type
'int64_t *' (aka 'long *')
bool overflow = Eidos_ssubll_overflow(0, operand, &subtract_result);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
./eidos/eidos_global.h:153:76: note: expanded from macro 'Eidos_ssubll_overflow'

#define Eidos_ssubll_overflow(a, b, c) __builtin_ssubll_overflow((a), (b), (c))

^~~

./eidos/eidos_interpreter.cpp:1739:22: error: cannot initialize a parameter of type 'long long *' with an rvalue of type
'int64_t *' (aka 'long *')
...bool overflow = Eidos_ssubll_overflow(first_operand, second_operand, &subtract_result);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
./eidos/eidos_global.h:153:76: note: expanded from macro 'Eidos_ssubll_overflow'

#define Eidos_ssubll_overflow(a, b, c) __builtin_ssubll_overflow((a), (b), (c))

^~~

./eidos/eidos_interpreter.cpp:1761:23: error: cannot initialize a parameter of type 'long long *' with an rvalue of type
'int64_t *' (aka 'long *')
...bool overflow = Eidos_ssubll_overflow(first_operand, second_operand, &subtract_result);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
./eidos/eidos_global.h:153:76: note: expanded from macro 'Eidos_ssubll_overflow'

#define Eidos_ssubll_overflow(a, b, c) __builtin_ssubll_overflow((a), (b), (c))

^~~

./eidos/eidos_interpreter.cpp:1786:22: error: cannot initialize a parameter of type 'long long *' with an rvalue of type
'int64_t *' (aka 'long *')
...bool overflow = Eidos_ssubll_overflow(singleton_int, second_operand, &subtract_result);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
./eidos/eidos_global.h:153:76: note: expanded from macro 'Eidos_ssubll_overflow'

#define Eidos_ssubll_overflow(a, b, c) __builtin_ssubll_overflow((a), (b), (c))

^~~

./eidos/eidos_interpreter.cpp:1810:22: error: cannot initialize a parameter of type 'long long *' with an rvalue of type
'int64_t *' (aka 'long *')
...bool overflow = Eidos_ssubll_overflow(first_operand, singleton_int, &subtract_result);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
./eidos/eidos_global.h:153:76: note: expanded from macro 'Eidos_ssubll_overflow'

#define Eidos_ssubll_overflow(a, b, c) __builtin_ssubll_overflow((a), (b), (c))

^~~

./eidos/eidos_interpreter.cpp:2089:21: error: cannot initialize a parameter of type 'long long *' with an rvalue of type
'int64_t *' (aka 'long *')
bool overflow = Eidos_smulll_overflow(first_operand, second_operand, &multiply_result);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
./eidos/eidos_global.h:154:76: note: expanded from macro 'Eidos_smulll_overflow'

#define Eidos_smulll_overflow(a, b, c) __builtin_smulll_overflow((a), (b), (c))

^~~

./eidos/eidos_interpreter.cpp:2111:22: error: cannot initialize a parameter of type 'long long *' with an rvalue of type
'int64_t *' (aka 'long *')
...bool overflow = Eidos_smulll_overflow(first_operand, second_operand, &multiply_result);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
./eidos/eidos_global.h:154:76: note: expanded from macro 'Eidos_smulll_overflow'

#define Eidos_smulll_overflow(a, b, c) __builtin_smulll_overflow((a), (b), (c))

^~~

./eidos/eidos_interpreter.cpp:2199:21: error: cannot initialize a parameter of type 'long long *' with an rvalue of type
'int64_t *' (aka 'long *')
bool overflow = Eidos_smulll_overflow(first_operand, singleton_int, &multiply_result);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
./eidos/eidos_global.h:154:76: note: expanded from macro 'Eidos_smulll_overflow'

#define Eidos_smulll_overflow(a, b, c) __builtin_smulll_overflow((a), (b), (c))

^~~

./eidos/eidos_interpreter.cpp:2935:25: error: cannot initialize a parameter of type 'long long *' with an rvalue of type
'int64_t *' (aka 'long *')
...bool overflow = Eidos_saddll_overflow(operand1_value, operand2_value, &operand1_value);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
./eidos/eidos_global.h:152:76: note: expanded from macro 'Eidos_saddll_overflow'

#define Eidos_saddll_overflow(a, b, c) __builtin_saddll_overflow((a), (b), (c))

^~~

./eidos/eidos_interpreter.cpp:2944:25: error: cannot initialize a parameter of type 'long long *' with an rvalue of type
'int64_t *' (aka 'long *')
...bool overflow = Eidos_ssubll_overflow(operand1_value, operand2_value, &operand1_value);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
./eidos/eidos_global.h:153:76: note: expanded from macro 'Eidos_ssubll_overflow'

#define Eidos_ssubll_overflow(a, b, c) __builtin_ssubll_overflow((a), (b), (c))

^~~

./eidos/eidos_interpreter.cpp:2953:25: error: cannot initialize a parameter of type 'long long *' with an rvalue of type
'int64_t *' (aka 'long *')
...bool overflow = Eidos_smulll_overflow(operand1_value, operand2_value, &operand1_value);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
./eidos/eidos_global.h:154:76: note: expanded from macro 'Eidos_smulll_overflow'

#define Eidos_smulll_overflow(a, b, c) __builtin_smulll_overflow((a), (b), (c))

^~~

./eidos/eidos_interpreter.cpp:2974:26: error: cannot initialize a parameter of type 'long long *' with an rvalue of type
'int64_t *' (aka 'long *')
...bool overflow = Eidos_saddll_overflow(int_vec_value, operand2_value, &int_vec_value);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
./eidos/eidos_global.h:152:76: note: expanded from macro 'Eidos_saddll_overflow'

#define Eidos_saddll_overflow(a, b, c) __builtin_saddll_overflow((a), (b), (c))

^~~

./eidos/eidos_interpreter.cpp:2986:26: error: cannot initialize a parameter of type 'long long *' with an rvalue of type
'int64_t *' (aka 'long *')
...bool overflow = Eidos_ssubll_overflow(int_vec_value, operand2_value, &int_vec_value);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
./eidos/eidos_global.h:153:76: note: expanded from macro 'Eidos_ssubll_overflow'

#define Eidos_ssubll_overflow(a, b, c) __builtin_ssubll_overflow((a), (b), (c))

^~~

./eidos/eidos_interpreter.cpp:2998:26: error: cannot initialize a parameter of type 'long long *' with an rvalue of type
'int64_t *' (aka 'long *')
...bool overflow = Eidos_smulll_overflow(int_vec_value, operand2_value, &int_vec_value);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
./eidos/eidos_global.h:154:76: note: expanded from macro 'Eidos_smulll_overflow'

#define Eidos_smulll_overflow(a, b, c) __builtin_smulll_overflow((a), (b), (c))

blank lines in vcf files

Hi SLiM team,

I am running simulations in SLiM before analysing them in R with the package vcfR. My session kept crashing when I tried to read in specific vcf files. After inspecting the files, I discovered they aren't being read in as they have consecutive blank rows. This only seems to happen with simplifyNucleotides=T inside the function, specifically when back mutations are removed. I'm able to work round this in R but still thought it worth making you aware if you weren't already.

The attached script outputs two files: "broken.vcf", with consecutive blank lines before eg POS=346; and "fixed.vcf" which does not.

simnuc.slim.zip

Many thanks,
Tom

informative error message when command-line string variable declarations don't parse

Right now, running

> slim -d x=5 recipe.slim

works, as does

> slim -d "x=5" recipe.slim

but the following fails with an uninformative error:

> slim -d x='mu' recipe.slim
// Initial random seed:
1847778891683

terminate called after throwing an instance of 'std::runtime_error'
  what():  A runtime error occurred in Eidos
Aborted

and it can be hard to hit upon the right combination of quotes that makes it work (although this does appear in Section 17.2 of the manual):

> slim -d "x='mu'" recipe.slim

If this threw an error that said "Command-line definitions did not parse correctly." that would help new users.
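
When slim is launched from Python rather than typed into a shell, the quoting puzzle can be sidestepped by building the argument list directly. The sketch below is one way to do that; `slim_define` and `slim_command` are hypothetical helpers, and it assumes, as in the working example above, that a string value must reach slim wrapped in single quotes.

```python
import shlex


def slim_define(name, value):
    """Format one NAME=VALUE definition for `slim -d`."""
    if isinstance(value, bool):  # check bool first: bool is an int subclass
        return f"{name}={'T' if value else 'F'}"
    if isinstance(value, str):
        if "'" in value or "\\" in value:
            raise ValueError("this sketch does not handle quotes or backslashes")
        return f"{name}='{value}'"  # string values carry their own quotes
    return f"{name}={value}"


def slim_command(recipe, **defines):
    """Build an argv list such as ['slim', '-d', "x='mu'", 'recipe.slim']."""
    argv = ["slim"]
    for name, value in defines.items():
        argv += ["-d", slim_define(name, value)]
    argv.append(recipe)
    return argv
```

Passing `slim_command("recipe.slim", x="mu")` to `subprocess.run` hands `x='mu'` to slim as a single argument without involving the shell, and `shlex.join(...)` on the same list gives a string that can be pasted into a POSIX shell.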

Segmentation fault with complex mutation matrices

Hi everyone,

I've been experimenting with some DNA evolution simulations in SLiM 3.0 and I recently noticed an issue that arises when we try to use complex mutation matrices (i.e. F81) in a clonal evolution setting. I tried to get to the bottom of the problem using many different combinations of parameters, and in the end it seems like the problem consistently arises when:

We set the cloning rate > 0.0.
We use a mutation matrix that's different from K80 or JC69. Even symmetric mutation matrices result in segmentation fault errors.

How to replicate the issue:

This is a snippet of code that should reproduce the error that I'm seeing:

initialize() {
        defineConstant("L", 100);
        defineConstant("mode", "F81");
        defineConstant("mu", 1e-5);
        initializeSLiMOptions(nucleotideBased=T);
        initializeAncestralNucleotides(randomNucleotides(L));
        initializeMutationTypeNuc("m1", 0.5, "f", 0.0);

        if (mode == "JC69") {
                mut_matrix = mmJukesCantor(mu);
        }
        else if (mode == "K80") {
                mut_matrix = mmKimura(mu, 2.*mu);
        }
        else if (mode == "symmetric"){
                mut_matrix = matrix(c(0, 3.*mu, 4.*mu, 1.*mu,
                                      3.*mu, 0, 2.*mu, 3.*mu,
                                      4.*mu, 2.*mu, 0, 1.*mu,
                                      1.*mu, 3.*mu, 1.*mu, 0), ncol=4);
        }
        else {
                mut_matrix = matrix(c(0, 3.*mu, 3.*mu, 3.*mu,
                                      2.*mu, 0, 2.*mu, 2.*mu,
                                      4.*mu, 4.*mu, 0, 4.*mu,
                                      1.*mu, 1.*mu, 1.*mu, 0), ncol=4);
        }

        print(mut_matrix);

        initializeGenomicElementType("g1", m1, 1.0, mut_matrix);
        initializeGenomicElement(g1, 0, L-1);
        initializeRecombinationRate(0.0);
}

1 {
        sim.addSubpop("p1", 500);
        p1.setCloningRate(1.0);
}

2000 late() { sim.outputFixedMutations(); }

When I run this, I get the following error message:

// Initial random seed:
2888021158675

// RunInitializeCallbacks():
initializeSLiMOptions(nucleotideBased = T);
initializeAncestralNucleotides("TAACTACGAACGTCGGAAGG...");
initializeMutationTypeNuc(1, 0.5, "f", 0);
      [,0]  [,1]  [,2]  [,3]
[0,]     0 2e-05 4e-05 1e-05
[1,] 3e-05     0 4e-05 1e-05
[2,] 3e-05 2e-05     0 1e-05
[3,] 3e-05 2e-05 4e-05     0
initializeGenomicElementType(1, m1, 1);
initializeGenomicElement(g1, 0, 99);
initializeRecombinationRate(0);

// Starting run at generation <start>:
1 

Segmentation fault (core dumped)

Any help with this would be appreciated.
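
One thing visible in the output above is that the printed matrix is the transpose of the layout typed in the script, which is what one would expect if `matrix()` fills column by column, as R's does. The Python sketch below reproduces that fill order with plain lists (rates in units of mu); `fill_by_column` is a hypothetical helper, and nothing here is a diagnosis of the segmentation fault.

```python
def fill_by_column(values, ncol):
    """Arrange a flat list into rows, filling the matrix column by column."""
    nrow, remainder = divmod(len(values), ncol)
    if remainder:
        raise ValueError("number of values is not a multiple of ncol")
    return [[values[col * nrow + row] for col in range(ncol)]
            for row in range(nrow)]


# Values in the order they are typed in the final `else` branch of the script.
typed = [0, 3, 3, 3,
         2, 0, 2, 2,
         4, 4, 0, 4,
         1, 1, 1, 0]

for row in fill_by_column(typed, ncol=4):
    print(row)
```

The first printed row is `[0, 2, 4, 1]`, matching the `0 2e-05 4e-05 1e-05` row in the log, so each line typed in the script ends up as a column of the mutation matrix.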

Add "add mutations" option to treeSeqOutput( )

To make it easier on folks, we should add a simple argument (the neutral mutation rate) to the tree seq output function, that will (a) load the tables into a tree sequence, (b) add mutations, and (c) then write out the .trees file.

This depends on some other things: currently I don't think we have the ability in tskit to add mutations to an existing file (as opposed to making whole new sites/mutations tables).

working directory in qtslim

It looks like, from the error messages:

willExecuteScript: Unable to set the working directory to  /home/peter/Desktop  (error  2 )

that even if I start qtslim from the command line, it tries to set its working directory to ~/Desktop. It also produces the same message every time I hit the 'recycle' button. (It gives me an error message rather than just setting it because I don't have a ~/Desktop directory.)

I think that ~/Desktop is a reasonable default for the working directory if SLiM is started from the GUI, but if started from the command line then it should stick to the working directory it inherits from there. (So, if I run qtslim from the directory ~/my/slim/work, then SLiM's working directory should be ~/my/slim/work unless it is told otherwise. This is in fact true already on my system, since SLiM tries to reset it to the Desktop but fails.) It's also probably harmless that it tries to reset the working directory various times, but a little worrisome: it should just be able to fix it at the start if it's some weird GUI-related directory, then leave it alone.

Issue Installing SLiM

Hi, I am trying to install the software. When I enter the make command, the error I get is:


Scanning dependencies of target gsl
[  1%] Building C object CMakeFiles/gsl.dir/gsl/blas/blas.c.o
[  2%] Building C object CMakeFiles/gsl.dir/gsl/linalg/cholesky.c.o
[  2%] Building C object CMakeFiles/gsl.dir/gsl/complex/math.c.o
[  3%] Building C object CMakeFiles/gsl.dir/gsl/complex/inline.c.o
[  4%] Building C object CMakeFiles/gsl.dir/gsl/err/error.c.o
[  5%] Building C object CMakeFiles/gsl.dir/gsl/err/stream.c.o
[  6%] Building C object CMakeFiles/gsl.dir/gsl/err/message.c.o
[  6%] Building C object CMakeFiles/gsl.dir/gsl/cdf/gauss.c.o
[  7%] Building C object CMakeFiles/gsl.dir/gsl/cdf/tdist.c.o
[  8%] Building C object CMakeFiles/gsl.dir/gsl/cblas/dgemv.c.o
[  9%] Building C object CMakeFiles/gsl.dir/gsl/cblas/dtrmv.c.o
[ 10%] Building C object CMakeFiles/gsl.dir/gsl/cblas/ddot.c.o
[ 10%] Building C object CMakeFiles/gsl.dir/gsl/cblas/dtrsv.c.o
[ 11%] Building C object CMakeFiles/gsl.dir/gsl/cblas/xerbla.c.o
[ 12%] Building C object CMakeFiles/gsl.dir/gsl/sys/coerce.c.o
[ 13%] Building C object CMakeFiles/gsl.dir/gsl/sys/infnan.c.o
[ 13%] Building C object CMakeFiles/gsl.dir/gsl/sys/pow_int.c.o
[ 14%] Building C object CMakeFiles/gsl.dir/gsl/sys/minmax.c.o
[ 15%] Building C object CMakeFiles/gsl.dir/gsl/sys/fdiv.c.o
[ 16%] Building C object CMakeFiles/gsl.dir/gsl/randist/cauchy.c.o
[ 17%] Building C object CMakeFiles/gsl.dir/gsl/randist/gauss.c.o
[ 17%] Building C object CMakeFiles/gsl.dir/gsl/randist/gamma.c.o
[ 18%] Building C object CMakeFiles/gsl.dir/gsl/randist/geometric.c.o
[ 19%] Building C object CMakeFiles/gsl.dir/gsl/randist/mvgauss.c.o
[ 20%] Building C object CMakeFiles/gsl.dir/gsl/randist/shuffle.c.o
[ 21%] Building C object CMakeFiles/gsl.dir/gsl/randist/exponential.c.o
[ 21%] Building C object CMakeFiles/gsl.dir/gsl/randist/multinomial.c.o
[ 22%] Building C object CMakeFiles/gsl.dir/gsl/randist/discrete.c.o
[ 23%] Building C object CMakeFiles/gsl.dir/gsl/randist/weibull.c.o
[ 24%] Building C object CMakeFiles/gsl.dir/gsl/randist/gausszig.c.o
[ 25%] Building C object CMakeFiles/gsl.dir/gsl/randist/poisson.c.o
[ 25%] Building C object CMakeFiles/gsl.dir/gsl/randist/beta.c.o
[ 26%] Building C object CMakeFiles/gsl.dir/gsl/randist/binomial_tpe.c.o
[ 27%] Building C object CMakeFiles/gsl.dir/gsl/randist/lognormal.c.o
[ 28%] Building C object CMakeFiles/gsl.dir/gsl/vector/init.c.o
[ 29%] Building C object CMakeFiles/gsl.dir/gsl/vector/vector.c.o
[ 29%] Building C object CMakeFiles/gsl.dir/gsl/vector/oper.c.o
[ 30%] Building C object CMakeFiles/gsl.dir/gsl/matrix/init.c.o
[ 31%] Building C object CMakeFiles/gsl.dir/gsl/matrix/copy.c.o
[ 32%] Building C object CMakeFiles/gsl.dir/gsl/matrix/submatrix.c.o
[ 33%] Building C object CMakeFiles/gsl.dir/gsl/matrix/rowcol.c.o
[ 33%] Building C object CMakeFiles/gsl.dir/gsl/matrix/swap.c.o
[ 34%] Building C object CMakeFiles/gsl.dir/gsl/matrix/matrix.c.o
[ 35%] Building C object CMakeFiles/gsl.dir/gsl/specfunc/log.c.o
[ 36%] Building C object CMakeFiles/gsl.dir/gsl/specfunc/erfc.c.o
[ 37%] Building C object CMakeFiles/gsl.dir/gsl/specfunc/gamma.c.o
[ 37%] Building C object CMakeFiles/gsl.dir/gsl/specfunc/pow_int.c.o
[ 38%] Building C object CMakeFiles/gsl.dir/gsl/specfunc/gamma_inc.c.o
[ 39%] Building C object CMakeFiles/gsl.dir/gsl/specfunc/exp.c.o
[ 40%] Building C object CMakeFiles/gsl.dir/gsl/specfunc/elementary.c.o
[ 40%] Building C object CMakeFiles/gsl.dir/gsl/specfunc/expint.c.o
[ 41%] Building C object CMakeFiles/gsl.dir/gsl/specfunc/zeta.c.o
[ 42%] Building C object CMakeFiles/gsl.dir/gsl/specfunc/beta.c.o
[ 43%] Building C object CMakeFiles/gsl.dir/gsl/specfunc/psi.c.o
[ 44%] Building C object CMakeFiles/gsl.dir/gsl/specfunc/trig.c.o
[ 44%] Building C object CMakeFiles/gsl.dir/gsl/rng/mt.c.o
[ 45%] Building C object CMakeFiles/gsl.dir/gsl/rng/taus.c.o
[ 46%] Building C object CMakeFiles/gsl.dir/gsl/rng/inline.c.o
[ 47%] Building C object CMakeFiles/gsl.dir/gsl/rng/rng.c.o
[ 48%] Building C object CMakeFiles/gsl.dir/gsl/block/init.c.o
Linking C static library libgsl.a
[ 48%] Built target gsl
Scanning dependencies of target tables
[ 49%] Building C object CMakeFiles/tables.dir/treerec/tskit/avl.c.o
[ 50%] Building C object CMakeFiles/tables.dir/treerec/tskit/tables.c.o
[ 50%] Building C object CMakeFiles/tables.dir/treerec/tskit/util.c.o
[ 51%] Building C object CMakeFiles/tables.dir/treerec/tskit/hapgen.c.o
[ 52%] Building C object CMakeFiles/tables.dir/treerec/tskit/kastore/kastore.c.o
[ 53%] Building C object CMakeFiles/tables.dir/treerec/tskit/object_heap.c.o
[ 54%] Building C object CMakeFiles/tables.dir/treerec/tskit/vargen.c.o
[ 54%] Building C object CMakeFiles/tables.dir/treerec/tskit/uuid.c.o
[ 55%] Building C object CMakeFiles/tables.dir/treerec/tskit/text_input.c.o
[ 56%] Building C object CMakeFiles/tables.dir/treerec/tskit/newick.c.o
[ 57%] Building C object CMakeFiles/tables.dir/treerec/tskit/tree_sequence.c.o
[ 58%] Building C object CMakeFiles/tables.dir/treerec/tskit/fenwick.c.o
Linking C static library libtables.a
[ 58%] Built target tables
Scanning dependencies of target eidos
[ 58%] Building CXX object CMakeFiles/eidos.dir/eidos/eidos_rng.cpp.o
[ 59%] Building CXX object CMakeFiles/eidos.dir/eidos/eidos_ast_node.cpp.o
[ 60%] Building CXX object CMakeFiles/eidos.dir/eidos/eidos_value.cpp.o
[ 61%] Building CXX object CMakeFiles/eidos.dir/eidos/eidos_global.cpp.o
[ 61%] Building CXX object CMakeFiles/eidos.dir/eidos/eidos_test.cpp.o
[ 62%] Building CXX object CMakeFiles/eidos.dir/eidos/eidos_interpreter.cpp.o
[ 63%] Building CXX object CMakeFiles/eidos.dir/eidos/eidos_type_table.cpp.o
[ 64%] Building CXX object CMakeFiles/eidos.dir/eidos/eidos_token.cpp.o
[ 65%] Building CXX object CMakeFiles/eidos.dir/eidos/eidos_type_interpreter.cpp.o
[ 65%] Building CXX object CMakeFiles/eidos.dir/eidos/eidos_beep.cpp.o
[ 66%] Building CXX object CMakeFiles/eidos.dir/eidos/eidos_call_signature.cpp.o
[ 67%] Building CXX object CMakeFiles/eidos.dir/eidos/eidos_script.cpp.o
[ 68%] Building CXX object CMakeFiles/eidos.dir/eidos/eidos_symbol_table.cpp.o
[ 69%] Building CXX object CMakeFiles/eidos.dir/eidos/eidos_test_element.cpp.o
[ 69%] Building CXX object CMakeFiles/eidos.dir/eidos/eidos_property_signature.cpp.o
[ 70%] Building CXX object CMakeFiles/eidos.dir/eidos/eidos_functions.cpp.o
[ 71%] Building CXX object CMakeFiles/eidos.dir/eidostool/main.cpp.o
Linking CXX executable eidos
/usr/bin/ld: cannot open output file eidos: Is a directory
collect2: error: ld returned 1 exit status
make[2]: *** [eidos] Error 1
make[1]: *** [CMakeFiles/eidos.dir/all] Error 2
make: *** [all] Error 2

Self Test Issues

As far as I can tell, SLiM's self tests (slim -testEidos / slim -testSLiM) use hardcoded paths to generate files that are used for testing. These are not cleaned up afterwards.

If you use the self tests then:

1. Your /tmp will contain a large number of SLiM-related files, which may or may not be an issue.
2. More importantly, if another user has already run the self tests and you run them again, some of the files will have the same names, and the second user won't be able to overwrite them.

When this occurs, it seems to generate a segfault.

This causes the tests to fail and can lead you to think that you have wasted a fair bit of time trying to make a working binary.

You may even be unlucky enough to delete it all and try rebuilding several times again, only to find that the self tests would have worked if you had cleared /tmp before running them...
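
One common way to avoid this class of collision is for each test run to create its own private scratch directory and remove it afterwards. The sketch below illustrates that general technique in Python with the standard tempfile module. It is not SLiM's actual test code; the names run_self_test and example_test and the prefix slim_selftest_ are hypothetical.

```python
import os
import shutil
import tempfile


def run_self_test(test_body):
    """Run test_body(scratch_dir) in a private scratch directory, then clean up."""
    # mkdtemp creates a uniquely named directory accessible only to the
    # current user, so two users (or two runs) never share file names.
    scratch_dir = tempfile.mkdtemp(prefix="slim_selftest_")
    try:
        return test_body(scratch_dir)
    finally:
        # Remove everything the test wrote, even if the test raised.
        shutil.rmtree(scratch_dir, ignore_errors=True)


def example_test(scratch_dir):
    # Hypothetical test: write a file and read it back.
    path = os.path.join(scratch_dir, "output.txt")
    with open(path, "w") as handle:
        handle.write("ok")
    with open(path) as handle:
        return scratch_dir, handle.read()
```

With this pattern nothing is left behind in /tmp, and a second user never tries to overwrite files owned by the first.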

`self` undefined inside functions, except if used in the calling SLiMEidosBlock scope

initialize() {
    initializeMutationRate(0);
    initializeMutationType("m1", 0.5, "f", 0);
    initializeGenomicElementType("g1", m1, 1.0);
    initializeGenomicElement(g1, 0, 1000-1);
    initializeRecombinationRate(0);
}

1 {
	//self;
	myfunc();
}

function (integer$)myfunc(void) {
	return self.id;
}

The above code produces this error with SLiM 3.3:

ERROR (EidosSymbolTable::_GetValue_RAW): undefined identifier self.

Error on script line 2, character 8 (inside runtime script block):

   return self.id;

But strangely, when the //self; line is uncommented, the code runs fine. I could understand if self were never available in the function scope, or if it were always available there, but being dependent on usage in the script block scope seems wrong.

documentation clarification

I've heard the docs on two issues have led to confusion. On p.395:

The simplificationRatio parameter controls how often automatic simplification of the recorded
tree sequence occurs. SLiM will try to find an optimal generation interval for simplification such that
the ratio of the memory used by the tree sequence tables, (before:after) simplification, is close to the
requested ratio. The default of 10 thus requests that SLiM try to find a generation interval such that
automatic simplification reduces the memory usage by a factor of ten. INF may be supplied as a
special value indicating that automatic simplification should never occur; 0 may be supplied to
indicate that automatic simplification should be performed at the end of every generation.

The phrase "reduces the memory usage" can be read as meaning that the maximum memory usage is reduced, but that's not what is meant here. Instead, how about:

The simplificationRatio parameter controls how often automatic simplification of the recorded
tree sequence occurs. This is a speed-memory tradeoff: more frequent simplification (lower simplificationRatio) means the stored tree sequences will use less memory, but at a cost of somewhat longer run times.  Conversely, a larger simplificationRatio means that SLiM will wait longer between simplifications. SLiM will try to find an optimal generation interval for simplification such that
the ratio of the memory used by the tree sequence tables, (before:after) simplification, is close to the
requested ratio. The default of 10 thus requests that SLiM try to find a generation interval such that
the maximum size of the stored tree sequences is ten times the size after simplification. INF may be supplied as a special value indicating that automatic simplification should never occur; 0 may be supplied to indicate that automatic simplification should be performed at the end of every generation.
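
To make the tradeoff concrete, here is a toy feedback rule of the kind the paragraph describes: after each simplification, compare the observed before:after size ratio with the requested one and adjust the interval. This is only an illustrative sketch; SLiM's actual heuristic is not given in the text above, and the function name and the adjustment factor of 1.2 are assumptions.

```python
import math


def next_interval(interval, size_before, size_after, target_ratio):
    """Suggest the next simplification interval, in generations."""
    if target_ratio == 0:
        return 1                      # simplify at the end of every generation
    if math.isinf(target_ratio):
        return math.inf               # never simplify automatically
    observed_ratio = size_before / size_after
    if observed_ratio > target_ratio:
        # Tables grew more than requested: simplify more often.
        return max(1, math.floor(interval / 1.2))
    if observed_ratio < target_ratio:
        # Tables grew less than requested: wait longer between simplifications.
        return math.ceil(interval * 1.2)
    return interval
```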

And, Section 16.6, "Measuring the coalescence time of a model" says that coalescence implies equilibrium has been reached, which we now know is wrong. How about (edits marked in **)

A common problem in forward simulation is deciding how to conduct model burn-in. A burn-
in period is a period of simulation executed in order to arrive at an appropriate starting state for the
model one wishes to execute; often this starting state is for a neutral model at equilibrium, but not
necessarily so. But how long should the burn-in period be, to provide such an equilibrium state?
** Running until coalescence ensures that no genetic diversity in the population could date back to the
start of the simulation, so all polymorphic loci derive from after the start of the simulation. However,
to attain equilibrium takes longer: in a truly neutral population of constant size, a total of two or three times the time required to reach coalescence should suffice. ** But that just kicks the can down the
road; how long does it take to achieve coalescence?
...
Coalescence checking will work in all types of models (with the above caveats about timing and
granularity), regardless of selection, population structure, etc. However, coalescence is only a
**useful indicator of the time scale needed for equilibration** in fairly simple neutral models. If model dynamics during burn-in change over time, or are substantially non-neutral, coalescence may be a poor indicator of equilibration; indeed, such models may not even have an equilibrium state. What constitutes a proper burn-in for such models is a difficult question.

encode which individuals are being output in a VCF somehow

As discussed in #42, it ought to be possible to know from the VCF which individuals have been output. This only makes sense when individuals actually have unique identifiers, which is I think only if they have pedigreeIDs.

There, Ben said:

So assuming that works, that gets you identifiers on the .trees side. On the VCF side, it's a good idea for each genome in the VCF file to be annotated with the same identifiers, but that presently is not done. Where would such per-sample information typically be put, in a VCF file? I'm looking at the VCF 4 spec and not seeing an obvious spot for per-sample annotations, which perhaps is why I didn't already do this (but probably I'm just missing the obvious). :-> If you can suggest a good way for it to be encoded in the VCF, I could check that improvement in on GitHub within a day or two.

The last line in the header has the sample IDs, which you currently fill out like i0 i1 i2 .... These could be replaced with like i... if these are defined. This would be a good idea, I think.

That would still provide only the individual IDs, whereas it is likely the genome IDs that people want to get at.

Hm, I'm not sure about that. Most properties are individual-based (e.g., phenotype) and in both SLiM and in the tree sequence you can get the genome IDs from the individual IDs.
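
To make the suggestion concrete, here is a minimal sketch of how the last header line could be built. The nine fixed column names are those of the VCF format. The scheme of naming each sample after its pedigree ID, the fallback to i0 i1 i2 ..., and the function name are assumptions for illustration, not SLiM's current behavior.

```python
VCF_FIXED_COLUMNS = ["#CHROM", "POS", "ID", "REF", "ALT",
                     "QUAL", "FILTER", "INFO", "FORMAT"]


def vcf_sample_header(sample_count, pedigree_ids=None):
    """Build the final VCF header line for sample_count individuals."""
    if pedigree_ids is None:
        # Positional names: i0, i1, i2, ...
        names = ["i%d" % index for index in range(sample_count)]
    else:
        if len(pedigree_ids) != sample_count:
            raise ValueError("need exactly one pedigree ID per sample")
        # Hypothetical scheme: name each sample after its pedigree ID.
        names = ["i%d" % pid for pid in pedigree_ids]
    return "\t".join(VCF_FIXED_COLUMNS + names)
```

A reader of the file cannot tell the two naming schemes apart from the names alone, so a real implementation would probably also need to state in the header which one is in use.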

outputFull does not record (or, reload?) pedigree IDs

I don't know if outputFull does not record pedigree IDs, or if it doesn't reload them, but either way: saving out the population state with outputFull and then reloading it will end up with individuals with different pedigreeIDs.

Having them agree would be necessary for matching individuals across segments of a simulation that is run in pieces (e.g., simulate->outputFull->reload->simulate->...).

However, outputting and reloading from tree sequences does preserve pedigreeIDs, so this is not a serious problem - but perhaps a gotcha. I thought you should at least know about this.

Check that chromosome lengths agree in `readFromPopulationFile()`

@gtsambos points out that reading in from a tree sequence file gives an inscrutable error if the chromosome length in SLiM and in the tree sequence file don't match. Here's a MWE (note that the length is only off by one: the end position in SLiM should be 499):

import msprime, pyslim
ts = msprime.simulate(sample_size=100, length=500)
new_ts = pyslim.annotate_defaults(ts, model_type="WF", slim_generation=1)
new_ts.dump("input.trees")

and

initialize() {
	initializeSLiMModelType("WF");
	initializeTreeSeq();
	initializeMutationRate(0);
	initializeMutationType("m1", 0.5, "f", 0.0);
	initializeGenomicElementType("g1", m1, 1.0);
	initializeGenomicElement(g1, 0, 500);
}
1 late() { 
   sim.readFromPopulationFile("input.trees");	
}
20 late() {
	sim.simulationFinished();
}
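
A clearer error could come from an explicit check before loading. The sketch below assumes that a SLiM chromosome whose last position is last_position corresponds to a tree sequence of length last_position + 1, which is what the off-by-one in the example above reflects. The function name and the message text are hypothetical.

```python
def check_chromosome_length(slim_last_position, ts_sequence_length):
    """Raise a readable error if SLiM and the tree sequence disagree on length."""
    # Positions 0..last_position inclusive span last_position + 1 units.
    slim_length = slim_last_position + 1
    if slim_length != ts_sequence_length:
        raise ValueError(
            "chromosome length mismatch: SLiM covers positions 0..%d "
            "(length %d), but the tree sequence has length %g"
            % (slim_last_position, slim_length, ts_sequence_length)
        )
```

For the example above, check_chromosome_length(500, 500.0) raises, while check_chromosome_length(499, 500.0) passes silently.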

port SLIM to Windows (incl. optional external GSL in CMake)

It would simplify things considerably from the perspective of packaging SLiM in conda if there was an option to use an external GSL library in the CMake. Particularly for supporting Windows, where GSL has to be extensively patched to compile, it would make life a lot easier.

Unless GSL has been locally modified, I don't think this would affect SLiM itself? The default should stay as having GSL build locally so normal users aren't affected, but experts should have the option of linking to a system GSL.

Temporarily store individuals associated with tree-sequence nodes, culling as necessary during simplification

At the moment ~~the tree-sequence simplification process removes individuals from the tree sequence unless they are currently alive, or have been flagged by treeSeqRememberIndividuals()~~ individuals are only kept in the tree sequence if they are explicitly Remembered, or are present at the end of the simulation (see #10). But there is a use-case for retaining in the final tree sequence those individuals containing a coalescence point involving the current samples (i.e. all individuals that have an associated node in the tree sequence). There is discussion of this at
https://groups.google.com/d/msgid/slim-discuss/629168f4-4bb6-463a-b0d3-e3281787dceb%40googlegroups.com

Error: 'isfinite' was not declared in this scope

Hello, I am trying to build SLiM on a Linux (CentOS 7.x) machine, and I'm getting errors about variables not being defined. I've attached the standard output and standard error from running:

make slim

For example, this is the first error I encounter:

./core/subpopulation.cpp: In member function 'virtual EidosValue_SP Subpopulation::ExecuteInstanceMethod(EidosGlobalStringID, const EidosValue_SP*, int, EidosInterpreter&)':
./core/subpopulation.cpp:2815:28: error: 'isfinite' was not declared in this scope
     if (!isfinite(range_min) || !isfinite(range_max) || (range_min >= range_max))
                            ^
./core/subpopulation.cpp:2815:28: note: suggested alternative:
In file included from ./eidos/eidos_rng.h:56:0,
                 from ./core/subpopulation.h:34,
                 from ./core/subpopulation.cpp:21:
/software/gcc-6.2-el7-x86_64/include/c++/6.2.0/cmath:605:5: note:   'std::isfinite'
     isfinite(_Tp __x)
     ^~~~~~~~

Perhaps these are namespace issues.

I've downloaded the source from the Messer Lab website, which I believe is version 2.3 (build 1052; Eidos version 1.3) according to the VERSIONS file. I'm using gcc 6.2.0 which I believe is compliant with the C++11 standard.

Please let me know if there is anything missing in my setup (incorrect compilers, missing libraries, etc). I'm also happy to provide more details about my computing environment, as needed. If you prefer, I can post this issue to your "slim-discuss" mailing list.

make_slim.err.gz
make_slim.out.gz

Problem installing SLiM

From the documentation:

cmake -D CMAKE_BUILD_TYPE=Release -D -DCMAKE_INSTALL_PREFIX:PATH=/path/to/install ../SLiM
make
make install

Result:

CMake Warning:
  Manually-specified variables were not used by the project:

    -DCMAKE_INSTALL_PREFIX

make ignores the variable and attempts to install in "/usr/local/bin".

Invalid child values in large simulations

I'm basically moving tskit-dev/tskit#402 to here. I submitted number #61 to trigger the bug.

The code in #61 gives the output shown below when used with the script that follows after

terminate called after throwing an instance of 'std::runtime_error'
  what():  child index out of range: -704, 214748263

The two numbers are a child index and the length of the node table. The script is:

initialize() {
        // initializeSLiMModelType("nonWF");
        initializeTreeSeq(simplificationInterval=500);
        // defineConstant("Np", 50);
        // defineConstant("K", 1000000);        // total carrying capacity
        // defineConstant("Kp", asInteger(K / Np));

        initializeMutationType("m1", 0.5, "f", 0.0);
        // m1.convertToSubstitution = T;

        initializeGenomicElementType("g1", m1, 1.0);
        initializeGenomicElement(g1, 0, 600);
        initializeMutationRate(0);
        initializeRecombinationRate(0);
}
// reproduction() {
//      subpop.addCrossed(individual, subpop.sampleIndividuals(1));
// }
1 early() {
        print(time());
    sim.addSubpop("p1",1000000);
        // for (i in 1:Np)
        //      sim.addSubpop(i, Kp);
}
// early() {
        // for (subpop in sim.subpopulations)
    //     print(subpop.individualCount);
        // for (subpop in sim.subpopulations)
        //      subpop.fitnessScaling = Kp / subpop.individualCount;
// }
500 late() {
        print(time());
        sim.outputFixedMutations();
}

It seems that there are many fewer than expected nodes? Also, it really isn't clear how the negative child index is happening.

adding subpops in `early` leads to two generations in one, breaking tree sequence recording

Proposed fixes:

1. Check, when adding new individuals through an add-subpop call, whether we're in early; if so, subtract one from their birth time.
2. Disallow this.

We're tentatively going with (1), which will be passed in as an argument to recordRecombination() (or recordNewIndividual()?).
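
A minimal sketch of the birth-time adjustment in the first option, written in Python for illustration only. The function name and the stage strings are hypothetical, and the real change would live in SLiM's C++ recording code.

```python
def recorded_birth_time(generation, stage):
    """Birth time to record for an individual created by adding a subpop."""
    if stage == "early":
        # Individuals added in early() act as parents within the same
        # generation, so record them as born one generation earlier.
        return generation - 1
    return generation
```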

Allow for expressions in Eidos event times

I think a simple feature that could make Eidos a lot more extensible would be to allow expressions for the generations in which Eidos events are triggered. For example (modifying an example from the manual):

initialize() { 
defineConstant("N", 1000);
initializeMutationRate(1e-7); 
initializeMutationType("m1", 0.5, "f", 0.0); 
initializeGenomicElementType("g1", m1, 1.0); 
initializeGenomicElement(g1, 0, 99999); 
initializeRecombinationRate(1e-8); 
} 

1 { sim.addSubpop("p1", N); } 

// start outputting after burnin, for twenty generations
(10*N):(10*N+20) late() { sim.outputFull(); }
10*N + 100 { sim.simulationFinished();  }

Overall, this would allow for burn-ins that depend on N, which is what the manual recommends, and would allow N to be varied easily across runs. I can imagine that allowing arbitrary expressions might slow things down too much, so perhaps an expression of constants only, evaluated once at initialization, would suffice.
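
The constants-only variant could work roughly as follows. This Python sketch evaluates an event-time specification such as (10*N):(10*N+20) once, given the defined constants, and returns the resulting generation range. It illustrates the idea only: it is not Eidos code, the function names are hypothetical, and it supports just +, -, * and integer division on integer constants.

```python
import ast
import operator

_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.FloorDiv: operator.floordiv,
}


def _evaluate(node, constants):
    """Evaluate an integer arithmetic expression over named constants."""
    if isinstance(node, ast.Expression):
        return _evaluate(node.body, constants)
    if isinstance(node, ast.Constant) and type(node.value) is int:
        return node.value
    if isinstance(node, ast.Name) and node.id in constants:
        return constants[node.id]
    if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
        left = _evaluate(node.left, constants)
        right = _evaluate(node.right, constants)
        return _BINARY_OPS[type(node.op)](left, right)
    raise ValueError("unsupported expression in event time")


def resolve_event_time(spec, constants):
    """Return (start, end) generations for 'expr' or 'expr:expr'."""
    parts = spec.split(":")
    if len(parts) > 2:
        raise ValueError("expected 'expr' or 'expr:expr'")
    values = [_evaluate(ast.parse(part.strip(), mode="eval"), constants)
              for part in parts]
    return values[0], values[-1]
```

With N defined as 1000, resolve_event_time("(10*N):(10*N+20)", {"N": 1000}) gives the range 10000 to 10020. Because the expression is resolved once before the run starts, it adds no per-generation cost.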

Neutral-like behavior due to floating point error

We just ran into a problem with very large fitnesses and started to get inconsistent results due to floating point errors.

The problem appears to be that when we have two or more populations without migration, positive mutations that fix in one of them are not saved as substitutions, and thus always count toward fitness. If you have many of them, fitness goes to very large numbers. This becomes an issue because new mutations start behaving like neutral due to floating point error.

We think SLiM should at least throw an error when this happens.

Here I implemented a simple script that mimics this behavior:

// set up a simple simulation with beneficial mutations
initialize() {
	initializeMutationRate(1e-7);
	
	// m1 mutation type: beneficial
	initializeMutationType("m1", 0.5, "f", 1.0);
	
	// g1 genomic element type: uses m1 for all mutations
	initializeGenomicElementType("g1", m1, 1.0);
	
	// uniform chromosome of length 100 kb with uniform recombination
	initializeGenomicElement(g1, 0, 99999);
	initializeRecombinationRate(1e-8);
}

// create a population of 1000 individuals
1 {
	sim.addSubpop("p1", 1000);
}

100 {
	sim.addSubpopSplit(2, 1000, 1);
}

// print cached fitness
2000 { print(p1.cachedFitness(NULL)); }
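
The kind of failure described above can be reproduced outside SLiM. The sketch below assumes multiplicative fitness, where each fixed homozygous mutation with selection coefficient s multiplies fitness by 1 + s. With s = 1.0, a double-precision fitness overflows to infinity after 1024 such mutations, at which point carriers and non-carriers of a new mutation compare equal. Whether this is the exact mechanism inside SLiM is not established here, and check_fitness is a hypothetical illustration of the error the report asks for.

```python
import math


def multiplicative_fitness(mutation_count, s):
    """Fitness of an individual homozygous for mutation_count mutations."""
    fitness = 1.0
    for _ in range(mutation_count):
        fitness *= 1.0 + s
    return fitness


def check_fitness(fitness):
    """Raise instead of silently continuing with a non-finite fitness."""
    if not math.isfinite(fitness):
        raise OverflowError("fitness is not finite; selection can no longer act")
    return fitness


# 2**1024 does not fit in a double, so the product overflows to infinity.
background = multiplicative_fitness(1024, 1.0)
# A carrier of one further beneficial mutation is indistinguishable.
with_new_mutation = background * (1.0 + 1.0)
```

Here background == with_new_mutation is true, so the new mutation behaves as if it were neutral, and check_fitness(background) raises.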

Scripts report errors only for certain seeds?

Hi!

I'm simulating replicates using the attached Eidos script with the command
slim -d var_age=${y} -d var_s=1 -d var_h=1.33 -t Slim3_50kb_HCG_BalSel-trackBoth.txt
and have recorded the seed for each replicate after its completion. Two replicates errored out halfway (and produced incomplete output) with the error message

ERROR (EidosSymbolTable::_GetValue): undefined identifier p3. 
Error on script line 66, character 4:
    p3.outputMSSample(50);
    ^^

I traced back to the seeds for these two runs and was able to replicate the error message with the command
slim -d var_age=4501 -d var_s=1 -d var_h=1.33 -s 3515352887805 -t Slim3_50kb_HCG_BalSel-trackBoth.txt > temp.out
and
slim -d var_age=2501 -d var_s=1 -d var_h=1.33 -s 1990705088010 -t Slim3_50kb_HCG_BalSel-trackBoth.txt > temp.out

Meanwhile, if the seed isn't specified in these commands, or another seed is specified, the run is smooth (mostly, given these two exceptions). I haven't tested the commands on another platform yet, though.

I know I can resolve this issue by using other seeds, but it just made me curious why this is happening.

Slim3_50kb_HCG_BalSel-trackBoth.txt

first generation locations not being correctly Remembered

... strangely. Here's a MWE:

recipe.slim:

initialize() {
	setSeed(123);
	initializeSLiMModelType("nonWF");
	initializeSLiMOptions(dimensionality="xy");
	initializeTreeSeq();
	
	initializeMutationType("m1", 0.5, "g", 0.0, 2);
	initializeGenomicElementType("g1", m1, 1.0);
	initializeGenomicElement(g1, 0, 9);
	initializeMutationRate(0.0);
	initializeRecombinationRate(1e-8);
	
}

reproduction() {
	mate = p1.sampleIndividuals(1);
    for (i in seqLen(2)) {
        offspring = subpop.addCrossed(individual, mate);
        offspring.setSpatialPosition(p1.pointUniform());
	}
	return;
}

1 early() {
	sim.addSubpop("p1", 10);
	p1.setSpatialBounds(c(0, 0, 1.0, 1.0));
	// random initial positions
	for (ind in p1.individuals) {
		ind.setSpatialPosition(p1.pointUniform());
	}
}

early() {
	p1.individuals.fitnessScaling = 0.5;
}

1: early() {
    for (ind in p1.individuals) {
        catn(c("early", ind.pedigreeID, ind.x, ind.y));
    }
}

1: late() {
    sim.treeSeqRememberIndividuals(p1.individuals);
    for (ind in p1.individuals) {
        catn(c("late", ind.pedigreeID, ind.x, ind.y));
    }
}

3 late() {
	sim.treeSeqOutput("first_gen.trees");
	sim.simulationFinished();
}

After running that, run this ages.py:

import pyslim

ts = pyslim.load("first_gen.trees")

print("ped_id\tbirth\tage\tlocations\n")
for ind in ts.individuals():
    md = pyslim.decode_individual(ind.metadata)
    print("\t".join(map(str, [md.pedigree_id, ts.node(ind.nodes[0]).time, md.age] + list(ind.location))))

When I do that, SLiM tells me in output that, for instance, individual 1 has this position:

stage ped_id x y
early 1 0.2454 0.854505

but python says that:

ped_id	birth	age	locations
1	2.0	0	4.6390155e-317	0.0	0.0

I have no guesses why this is happening, currently. ?!?!
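
One observation that may help narrow this down: the reported x value, 4.6390155e-317, is a subnormal double, far below the smallest normal double (about 2.2e-308). Values like this often show up when bytes that were never written as a floating-point number are read back as one, although that is only a guess about the cause here. A small check along these lines (function names hypothetical) could flag such entries when inspecting the locations printed by ages.py:

```python
import sys


def is_subnormal(value):
    """True for nonzero doubles smaller in magnitude than the smallest normal."""
    return value != 0.0 and abs(value) < sys.float_info.min


def suspicious_locations(locations):
    """Return the coordinates that look like uninitialized or garbage values."""
    return [value for value in locations if is_subnormal(value)]
```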

QtSLiM Travis-CI builds fail on macOS

Hi @brnorris03. I've accepted your pull requests and it looks like QtSLiM builds on Travis-CI using your CMake stuff – but only for Linux, not for macOS. I think we discussed this a bit on the old pull request #52, but didn't really get anywhere. I guess macOS is complicated because Qt could be installed by various routes (homebrew, macports, maybe also conda or something?), leading to different paths that need to be used for headers and linkage. I've opened this issue since the pull request is now closed. This is not presently urgent, but will become more important as we get toward wanting to merge the qtslim branch into master – maybe in two months or so?

Have `treeSeqRememberIndividuals` actually remember individuals?

This would be a useful thing to have, because some things we might want to know about ancestors (e.g. spatial location) are stored in the individual table, not in the node table.

But, if we do this, we need to think some things through:

Currently, we said that the individual table was in 1-to-1 mapping with the currently alive individuals. We could change this by adding an "alive" flag (either in metadata or in the flags column).
It was unclear what was meant by some properties (e.g., age and position) for non-alive individuals. I guess what they would mean is "whatever they were when the individuals got Remembered", which seems fine and consistent. We'd need a policy for what happens if someone got remembered twice (keep their data from the first time, maybe?).

Currently, we make the individual table anew whenever outputting the tree sequence. If we remember individuals, it would be easiest if the remembered individuals were always the first ones: then we would never have to update the individual property of the remembered nodes; when remembering new individuals, we would just append them on the end of the remembered-individual-table.

Proposal:

- Keep a separate remembered_individual_table.
- Whenever individuals are remembered, we check if they already are remembered, and if not, append them to the remembered_individual_table and append their nodes to the list of remembered node IDs.
- Simplifying requires nothing new, because simplification does not (currently) touch the individual property of the node table, so retained nodes will still point to the right individuals. (need to check that simplify actually preserves the information, though)
- On writing out tables, an individual table is added to the table collection by first copying in the remembered_individual_table, then adding on all alive individuals who aren't already remembered, and setting the alive flag in all of these.

Note that we need an alive flag to cover the case where some individuals are both alive and remembered. (Well, if we were willing to rearrange the individual table, then we could guarantee that all alive individuals came at the end of the table, so we'd only need to store how many there were, but it seems cleaner to store the flag and not have to rearrange the table.)
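
The proposal can be sketched as a small data structure. This is a toy Python model under stated assumptions, not the tskit or SLiM implementation: individuals are identified by pedigree ID, rows are plain dictionaries, and the class and method names are hypothetical. It keeps the data from the first time an individual is remembered, as suggested above.

```python
class IndividualTables:
    """Toy model of the proposal: a persistent table of remembered individuals."""

    def __init__(self):
        self.remembered_rows = []    # rows, in order of first remembering
        self.remembered_index = {}   # pedigree ID -> row index
        self.remembered_nodes = []   # node IDs to keep through simplification

    def remember(self, pedigree_id, node_ids, data):
        """Append an individual unless it is already remembered."""
        if pedigree_id in self.remembered_index:
            # Already remembered: keep the data from the first time.
            return self.remembered_index[pedigree_id]
        row = len(self.remembered_rows)
        self.remembered_rows.append({"pedigree_id": pedigree_id, "data": data})
        self.remembered_index[pedigree_id] = row
        self.remembered_nodes.extend(node_ids)
        return row

    def build_output_table(self, alive):
        """Build the output table; alive maps pedigree ID -> current data."""
        table = []
        # Remembered individuals come first, so their row indices never change.
        for row in self.remembered_rows:
            entry = dict(row)
            entry["alive"] = row["pedigree_id"] in alive
            table.append(entry)
        # Then every living individual that is not already remembered.
        for pedigree_id, data in alive.items():
            if pedigree_id not in self.remembered_index:
                table.append({"pedigree_id": pedigree_id,
                              "data": data, "alive": True})
        return table
```

Because remembered rows are only ever appended, the individual column of already-remembered nodes never needs to be rewritten.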

system() waits for child to return, even when using &

The following script is accepted by both Rscript and eidos. The intended behaviour is that the sleep command is run in the background. This is what Rscript does, but the system() function in eidos always waits for the child process to complete. An alternative method to run a command in the background would be fine (e.g. a parameter to system()), but I have not found this functionality in the eidos manual.

system("sleep 5 &");
print("system() returned");
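
The requested parameter could look something like the following. This is a Python sketch of the interface only, using the standard subprocess module; the wait parameter is a hypothetical name and nothing here reflects how Eidos implements system().

```python
import subprocess
import sys


def system(command, wait=True):
    """Run command via the shell; with wait=False, return without waiting."""
    process = subprocess.Popen(command, shell=True)
    if wait:
        process.wait()
    return process


# Hypothetical stand-in for "sleep": a child process that sleeps for two seconds.
SLEEP_COMMAND = '"%s" -c "import time; time.sleep(2)"' % sys.executable
```

Calling system(SLEEP_COMMAND, wait=False) returns while the child is still running, and the caller can later call wait() on the returned object if it needs the exit status.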

can't find the cause to CrosscheckTreeSeqIntegrity

Hi,
I have been doing some simulations in which I stop and restart them from a tree sequence, and save a new one. This can go on for ~6 iterations.

In some cases, I have been getting a weird error, e.g.:
ERROR (SLiMSim::CrosscheckTreeSeqIntegrity): (internal error) genome/allele size mismatch at position 365: the treeseq has 1 mutation of mutid 12, SLiM has 0 segregating and 0 fixed mutation(s).

@petrelharp has looked into the tree sequences and said they seem fine. You can try and load those tree sequences to SLiM and it will throw that error.

I put them here: http://deep.uoregon.edu/~murillor/weird_trees/

Do you have any idea why that would be happening? I am happy to provide more info on the type of simulation, etc. @petrelharp tried to come up with a minimal example, but we weren't successful.

more than 2^26 individuals raises bad_alloc

This script:

initialize() {
    defineConstant("K", 67108865);  // 2^26 + 1
    initializeMutationType("m1", 0.5, "f", 0.0);
    initializeGenomicElementType("g1", m1, 1.0);
    initializeGenomicElement(g1, 0, 2);
    initializeMutationRate(0.0);
    initializeRecombinationRate(0.0);
}

1 early() {
    sim.addSubpop("p1", asInteger(K));
}

with SLiM built from HEAD, on Debian, produces this output:

// Starting run at generation <start>:
1 

terminate called after throwing an instance of 'std::bad_alloc'
  what():  std::bad_alloc
Aborted

With one fewer individual, it does not.

Linux recipe to open an external pdf viewer (re: manual section 14.8)

Hi Ben,

I've made two recipes for you to look at, which have small diffs to the current recipe. The first uses xdg-open, which opens the pdf in the application that is associated with the pdf MIME type. This must be done after the pdf is created. As you indicate in section 14.8 of the SLiM manual, many pdf viewers will not check for file changes to refresh the pdf canvas.

With mupdf (my preferred pdf viewer), I can hit r within the application (or send a HUP signal) to refresh the canvas from the pdf file. If I do this repeatedly while SLiM is running the recipe, then mupdf sometimes exits because the pdf is invalid (there is a race with the R script). So, I've included a second recipe, specific to mupdf, which sends the HUP signal to mupdf after the R script has updated the pdf. It should be easily adapted to other pdf viewers, assuming they also have a way to update the pdf canvas.

-Graham

14.8-gg1.txt
14.8-gg2.txt

setSeed(rdunif(1, 0, asInteger(2^32)-1)) has short repeat period

The SLiM manual, section 9.2, suggests resetting the seed after restoring a simulation from a saved population file. However, the recommended code can produce repeating sequences of seeds with a very short period.

for i in `seq 10`; do
  echo "setSeed($i); for (i in 1:1000000) { print(getSeed()); setSeed(rdunif(1, 0, asInteger(2^32)-1)); }" \
    | eidos /dev/stdin \
    | awk '{if (a[$1] > 1) {print "seed='$i': repeat at", NR", same as", a[$1]", delta =", NR-a[$1]; exit} a[$1]=NR}'
done

This produces the following output:

seed=1: repeat at 77328, same as 73543, delta = 3785
seed=2: repeat at 22486, same as 19722, delta = 2764
seed=3: repeat at 90743, same as 86958, delta = 3785
seed=4: repeat at 87473, same as 83688, delta = 3785
seed=5: repeat at 97762, same as 93977, delta = 3785
seed=6: repeat at 45142, same as 29758, delta = 15384
seed=7: repeat at 31713, same as 27274, delta = 4439
seed=8: repeat at 94206, same as 90421, delta = 3785
seed=9: repeat at 56386, same as 52601, delta = 3785
seed=10: repeat at 29464, same as 25025, delta = 4439

I've changed my code to use setSeed(rdunif(1, 0, asInteger(2^62)-1)), which seems ok for now. But this seems like a very short repeat period for a 32 bit seed, so it might be worth taking a closer look at why this happens.

-G
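
For comparison, here is what one would expect if the map from one seed to the next behaved like a random function on a space of size N. Standard results for random mappings give an expected sqrt(pi*N/2) iterations before the first repeat and an expected cycle length of sqrt(pi*N/8). For N = 2^32 that is roughly 82,000 iterations and a cycle of roughly 41,000, which is the same order of magnitude as the repeats reported above. The sketch below computes these values and finds the repeat empirically for a toy map; next_seed uses Python's random module as a stand-in and is not the generator Eidos uses.

```python
import math
import random


def expected_steps_to_repeat(space_size):
    """Expected iterations of a random map before a value repeats."""
    return math.sqrt(math.pi * space_size / 2)


def expected_cycle_length(space_size):
    """Expected length of the cycle that a random map eventually enters."""
    return math.sqrt(math.pi * space_size / 8)


def next_seed(seed, bits):
    # Stand-in for setSeed(rdunif(...)): seed a generator, draw the next seed.
    return random.Random(seed).randrange(2 ** bits)


def find_repeat(seed, bits):
    """Return (number of seeds seen before a repeat, length of the cycle)."""
    first_seen = {}
    step = 0
    while seed not in first_seen:
        first_seen[seed] = step
        seed = next_seed(seed, bits)
        step += 1
    return step, step - first_seen[seed]
```

Under the same assumption, a 62-bit seed space would give an expected first repeat after about 2.7 billion iterations, provided the full seed value is actually used by the generator.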

eidos no longer takes FIFOs

Until recently, one could run simple commands through eidos on the command line like so:

eidos <(echo "2 + 3;")
# 5

What's going on here is that the <( ... ) syntax in bash makes a named pipe (see the bit on "process substitution"), and the output of the command (here, the text "2 + 3;") is put in a special file-like object with a name like /dev/fd/63; this can be treated just like a file by many programs. Note that this works on macOS as well as Linux.

However, ever since 91fbb3d, this resulted in the error message

ERROR (main): input file /dev/fd/63 is not a regular file (it might be a directory or other special file).

Was there a problem this was avoiding? Being able to pipe scripts in was really handy for quickly debugging things without a GUI (e.g., remotely). It also made it possible to do things like

slim <(cat script.slim | sed -e 's/CONST/20/g')

(I know it'd be easier to do this example with a defineConstant( ) and a -d flag, but you get the idea.)
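
The distinction behind the error message can be reproduced outside of SLiM. The sketch below, in Python rather than the language eidos and slim are written in, classifies a file descriptor by its stat() mode. It only illustrates why a pipe fails an "is this a regular file" test; it is not the check that commit 91fbb3d added, which is not reproduced here, and the `describe` helper is hypothetical.

```python
import os
import stat
import tempfile


def describe(fd):
    """Classify an open file descriptor by the file type bits of its mode."""
    mode = os.fstat(fd).st_mode
    if stat.S_ISREG(mode):
        return "regular"
    if stat.S_ISFIFO(mode):
        return "fifo"
    if stat.S_ISDIR(mode):
        return "directory"
    return "other"


# A pipe, which is what the shell hands over for <( ... ).
read_fd, write_fd = os.pipe()
pipe_kind = describe(read_fd)
os.close(read_fd)
os.close(write_fd)

# An ordinary file on disk.
with tempfile.TemporaryFile() as handle:
    file_kind = describe(handle.fileno())

print("pipe:", pipe_kind)
print("temp file:", file_kind)
```

A check written as "reject directories" rather than "require a regular file" would let the pipe case through; whether that fits the problem the commit was addressing is for the maintainers to say.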

`addNewDrawnMutation( )` improperly vectorized

As pointed out over here, there seems to be an issue when addNewDrawnMutation() is called with a vector of positions. A haphazard collection of the mutations after the first one is recorded in the MutationTable without metadata. I haven't checked whether this is a problem only with tree sequence recording or whether the actual mutations are wrong in SLiM.

Here's the MWE:

initialize() {
    // problem only happens about half the time
    setSeed(42);
    initializeTreeSeq();
    initializeMutationRate(0);
    initializeMutationType("m1", 0.5, "f", 0.1);
    defineConstant("chr_len", 10);
    initializeGenomicElementType("g1", m1, 1.0);
    initializeGenomicElement(g1, 0, chr_len-1);
    initializeRecombinationRate(0);
}

1 {
    sim.addSubpop("p1", 2);
}

1 late() {
    target = sample(sim.subpopulations.genomes, 1);
    balancing_pos = sample(0:(chr_len-1), 2);
    // THIS does not work:
    target.addNewDrawnMutation(m1, asInteger(balancing_pos));

    // THIS works:
    // for (pos in balancing_pos) {
    //     target.addNewDrawnMutation(m1, pos);
    // }

    sim.treeSeqOutput("balancing_output.trees");
    sim.simulationFinished();
}

and the Python script:

import pyslim

ts = pyslim.load("balancing_output.trees")

print(ts.tables.mutations)

g = ts.genotype_matrix()

The genotype matrix call throws the error, but the problem is visible in the mutation table (as the second mutation has no derived state and no metadata); here is the output:

// RunInitializeCallbacks():
initializeTreeSeq();
initializeMutationRate(0);
initializeMutationType(1, 0.5, "f", 0.1);
initializeGenomicElementType(1, m1, 1);
initializeGenomicElement(g1, 0, 9);
initializeRecombinationRate(0);

// Starting run at generation <start>:
1 

id	site	node	derived_state	parent	metadata
0	0	4	1	-1	AQAAAM3MzD0BAAAAAQAAAP8=
1	1	4		-1	
Traceback (most recent call last):
  File "debug_mutate_to_vcf.py", line 7, in <module>
    g = ts.genotype_matrix()
  File "/home/peter/.local/lib/python3.6/site-packages/tskit-0.2.4.dev0-py3.6-linux-x86_64.egg/tskit/trees.py", line 3035, in genotype_matrix
    impute_missing_data=impute_missing_data)
_tskit.LibraryError: Inconsistent mutations: state already equal to derived state
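
The incomplete row can also be picked out mechanically from the printed table, without loading the tree sequence. The sketch below parses the two rows shown above and decodes the metadata of the intact one. The struct layout (int32 mutation type, float32 selection coefficient, int32 subpopulation, int32 origin time, int8 nucleotide) is an assumption about SLiM's per-mutation metadata record, and `parse_rows` and `decode_metadata` are hypothetical helpers, not part of pyslim or tskit.

```python
import base64
import struct

# Rows copied from the mutation table printed above (tab-separated).
TABLE = (
    "id\tsite\tnode\tderived_state\tparent\tmetadata\n"
    "0\t0\t4\t1\t-1\tAQAAAM3MzD0BAAAAAQAAAP8=\n"
    "1\t1\t4\t\t-1\t\n"
)


def parse_rows(text):
    """Split the printed table into one dict per mutation, keeping empty fields."""
    lines = text.strip("\n").split("\n")
    header = lines[0].split("\t")
    return [dict(zip(header, line.split("\t"))) for line in lines[1:]]


def decode_metadata(b64_text):
    """Unpack one metadata record under the assumed little-endian layout."""
    raw = base64.b64decode(b64_text)
    return struct.unpack("<ifiib", raw)


rows = parse_rows(TABLE)
# A mutation is flagged when either its derived state or its metadata is empty.
incomplete = [
    int(row["id"])
    for row in rows
    if row["derived_state"] == "" or row["metadata"] == ""
]
decoded = decode_metadata(rows[0]["metadata"])

print("mutations missing derived state or metadata:", incomplete)
print("metadata of mutation 0:", decoded)
```

Under that layout the first mutation decodes to mutation type 1 with a selection coefficient of about 0.1, matching the m1 definition in the script, while the second mutation has nothing to decode.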

Thanks to @josieparis for reporting this problem.

Adding fixed value onto selective coefficient of type "g" mutations

Hi Ben,

I'm trying to simulate overlapping functional regions, and got stuck defining the selective coefficients for the mutations in the overlapping part.

I want to simulate a functional element A of length > 1 with an arbitrary fixed selective coefficient s_a. For the gene elements (exons, introns, etc.) that it sits in, I want both the deleterious and neutral mutations (m1 and m2) in regions covered by A to have s_a added onto their initial s. I started by defining new mutation types, m1a and m2a, for the mutations in the overlapping part.

I was going to use a fitness() callback, but realized that
fitness(m1a){return relFitness + s_a; }
would add an s_a every generation instead of just once. So I attempted to use an indicator/boolean variable to show the addition has (or has not) been done:

initialize(){
...
addflag=0;
}   
fitness(m1a){
  if(addflag == 0){
    return relFitness + s_a ; 
  }else return relFitness
}

But every time I run it, it says

ERROR (EidosSymbolTable::_GetValue): undefined identifier addflag.
Error on script line 70, character 28:
    if(addflag==0){
       ^^^^^^^

I then tried replacing addflag=0; with defineConstant("addflag",0);, but then I get

ERROR (EidosSymbolTable::SetValueForSymbol): identifier 'addflag' cannot be redefined because it is a constant.

And now I'm confused and not sure what to do next... Is there a better way to do it?
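
Part of the confusion may be about what relFitness holds. Assuming relFitness is recomputed from the mutation's own selective coefficient each time the callback runs, and the returned value is used for that evaluation only rather than written back to the mutation, then returning relFitness + s_a would not accumulate over generations and no flag would be needed. The Python sketch below models that assumption with hypothetical numbers; it is not SLiM code, and both function names are made up for illustration.

```python
def default_rel_fitness(s, h, homozygous):
    """Default per-mutation effect: 1 + s if homozygous, 1 + h*s if heterozygous."""
    return 1.0 + s if homozygous else 1.0 + h * s


def fitness_callback(rel_fitness, s_a):
    """Mimics `return relFitness + s_a;` for a single evaluation."""
    return rel_fitness + s_a


S, H, S_A = -0.02, 0.5, 0.01  # hypothetical values for s, dominance, and s_a

effects = []
for generation in range(1, 6):
    # Recomputed from s every generation; nothing carries over from the last one.
    rel = default_rel_fitness(S, H, homozygous=True)
    effects.append(fitness_callback(rel, S_A))

print(effects)
```

Every generation yields the same value, 1 + s + s_a, because the callback's return value never feeds back into the next generation's relFitness.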

Possibly provide the "SLiM_build" directory in the GitHub repo

Is there any reason not to provide an empty SLiM_build directory (possibly containing a readme.md) in the GitHub repo, which would save one step in the installation process (mkdir SLiM_build)? It would make some auto-installation procedures a little smoother, I think (e.g. I'm trying to install on Google Colab, and have to make that directory on installation, or check if it exists).

I'm guessing there might be some development reason why not, though.
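
For scripted installs such as the Colab case, the create-or-check step can be collapsed into one idempotent call. A minimal Python sketch, assuming the build directory sits inside a source checkout whose path is passed in; the `ensure_build_dir` helper and the temporary directory standing in for the checkout are hypothetical:

```python
import os
import tempfile


def ensure_build_dir(source_root, name="SLiM_build"):
    """Create the build directory if needed; do nothing if it already exists."""
    path = os.path.join(source_root, name)
    os.makedirs(path, exist_ok=True)
    return path


checkout = tempfile.mkdtemp()  # stand-in for the cloned repository
first = ensure_build_dir(checkout)
second = ensure_build_dir(checkout)  # the second call is a no-op
print(first == second, os.path.isdir(first))
```

On the shell side, mkdir -p behaves the same way, so an install script does not need a separate existence check either way.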


