Comments (19)
Worked on this a bit today. As of now, using the guided merge pair proposal gives:
Iter 1 max ll -41894.4 ESS: 8.49452
Iter 2 max ll -40501.4 ESS: 3.56237
Iter 3 max ll -39115.4 ESS: 4.6236
Iter 4 max ll -37910.1 ESS: 2.69429
Iter 5 max ll -36668.2 ESS: 1
Iter 6 max ll -35306.1 ESS: 1
Iter 7 max ll -34157.2 ESS: 1.11207
Iter 8 max ll -33017 ESS: 5.60864
Iter 9 max ll -31912.5 ESS: 1.70446
Iter 10 max ll -30840.6 ESS: 4.44762
Iter 11 max ll -29781.8 ESS: 1.97459
Iter 12 max ll -28655.3 ESS: 1.00026
Iter 13 max ll -27495.1 ESS: 1.14868
Iter 14 max ll -26416.5 ESS: 1.02748
Iter 15 max ll -25527.5 ESS: 1.10675
Iter 16 max ll -24689.3 ESS: 1.85626
Iter 17 max ll -23912.5 ESS: 1.2003
Iter 18 max ll -23170.8 ESS: 3.9946
Iter 19 max ll -22328.7 ESS: 1.45438
Iter 20 max ll -21630.5 ESS: 3.80698
Iter 21 max ll -20991.3 ESS: 2.15786
Iter 22 max ll -20419.7 ESS: 5.05691
Iter 23 max ll -19891.8 ESS: 5.35881
Iter 24 max ll -19583.4 ESS: 4.08674
Iter 25 max ll -19240.8 ESS: 1.79467
Iter 26 max ll -18881.7 ESS: 4.1033
Iter 27 max ll -18718.4 ESS: 22.7285
Iter 28 max ll -18625.4 ESS: 13.8119
Iter 29 max ll -18625.4 ESS: -nan
And the uniform pair proposal is giving:
Iter 1 max ll -41894.5 ESS: 1
Iter 2 max ll -40501.9 ESS: 1.00174
Iter 3 max ll -39182.1 ESS: 1
Iter 4 max ll -37752.4 ESS: 1
Iter 5 max ll -36510.4 ESS: 1.05639
Iter 6 max ll -35375.7 ESS: 1
Iter 7 max ll -34236.1 ESS: 1.78139
Iter 8 max ll -33032.8 ESS: 1
Iter 9 max ll -31927.8 ESS: 1.07705
Iter 10 max ll -30856.2 ESS: 2.9304
Iter 11 max ll -29799.8 ESS: 1
Iter 12 max ll -28691.2 ESS: 1.99718
Iter 13 max ll -27533.4 ESS: 1.00028
Iter 14 max ll -26441.2 ESS: 2.73419
Iter 15 max ll -25556.7 ESS: 3.19573
Iter 16 max ll -24717.1 ESS: 1.15374
Iter 17 max ll -23941 ESS: 1
Iter 18 max ll -23197 ESS: 1.04459
Iter 19 max ll -22371.6 ESS: 1
Iter 20 max ll -21677.6 ESS: 2.93049
Iter 21 max ll -21051.2 ESS: 2.55918
Iter 22 max ll -20498.5 ESS: 3.27221
Iter 23 max ll -19970.6 ESS: 5.26381
Iter 24 max ll -19666.7 ESS: 1.79627
Iter 25 max ll -19327.7 ESS: 7.2991
Iter 26 max ll -18975 ESS: 3.33462
Iter 27 max ll -18810.4 ESS: 20.1285
Iter 28 max ll -18697.5 ESS: 13.7451
Iter 29 max ll -18697.5 ESS: -nan
This is with --bl-opt-steps 5 in both cases.
from sts.
Interesting.
I do think that topology (rather than distance) guided proposals have some
advantages, because they explicitly encode the sequence of events that have
to occur to get a winning topology.
I think the current formulation in smctex is correct, but will check it
again today and let you know. Then we can talk about if implementation is
practical.
2012/11/12 Aaron Darling [email protected]
Worked on this a bit today. As of now, using the guided merge pair
proposal gives:Iter 1 max ll -41894.4 ESS: 8.49452
Iter 2 max ll -40501.4 ESS: 3.56237
Iter 3 max ll -39115.4 ESS: 4.6236
Iter 4 max ll -37910.1 ESS: 2.69429
Iter 5 max ll -36668.2 ESS: 1
Iter 6 max ll -35306.1 ESS: 1
Iter 7 max ll -34157.2 ESS: 1.11207
Iter 8 max ll -33017 ESS: 5.60864
Iter 9 max ll -31912.5 ESS: 1.70446
Iter 10 max ll -30840.6 ESS: 4.44762
Iter 11 max ll -29781.8 ESS: 1.97459
Iter 12 max ll -28655.3 ESS: 1.00026
Iter 13 max ll -27495.1 ESS: 1.14868
Iter 14 max ll -26416.5 ESS: 1.02748
Iter 15 max ll -25527.5 ESS: 1.10675
Iter 16 max ll -24689.3 ESS: 1.85626
Iter 17 max ll -23912.5 ESS: 1.2003
Iter 18 max ll -23170.8 ESS: 3.9946
Iter 19 max ll -22328.7 ESS: 1.45438
Iter 20 max ll -21630.5 ESS: 3.80698
Iter 21 max ll -20991.3 ESS: 2.15786
Iter 22 max ll -20419.7 ESS: 5.05691
Iter 23 max ll -19891.8 ESS: 5.35881
Iter 24 max ll -19583.4 ESS: 4.08674
Iter 25 max ll -19240.8 ESS: 1.79467
Iter 26 max ll -18881.7 ESS: 4.1033
Iter 27 max ll -18718.4 ESS: 22.7285
Iter 28 max ll -18625.4 ESS: 13.8119
Iter 29 max ll -18625.4 ESS: -nanAnd the uniform pair proposal is giving:
Iter 1 max ll -41894.5 ESS: 1
Iter 2 max ll -40501.9 ESS: 1.00174
Iter 3 max ll -39182.1 ESS: 1
Iter 4 max ll -37752.4 ESS: 1
Iter 5 max ll -36510.4 ESS: 1.05639
Iter 6 max ll -35375.7 ESS: 1
Iter 7 max ll -34236.1 ESS: 1.78139
Iter 8 max ll -33032.8 ESS: 1
Iter 9 max ll -31927.8 ESS: 1.07705
Iter 10 max ll -30856.2 ESS: 2.9304
Iter 11 max ll -29799.8 ESS: 1
Iter 12 max ll -28691.2 ESS: 1.99718
Iter 13 max ll -27533.4 ESS: 1.00028
Iter 14 max ll -26441.2 ESS: 2.73419
Iter 15 max ll -25556.7 ESS: 3.19573
Iter 16 max ll -24717.1 ESS: 1.15374
Iter 17 max ll -23941 ESS: 1
Iter 18 max ll -23197 ESS: 1.04459
Iter 19 max ll -22371.6 ESS: 1
Iter 20 max ll -21677.6 ESS: 2.93049
Iter 21 max ll -21051.2 ESS: 2.55918
Iter 22 max ll -20498.5 ESS: 3.27221
Iter 23 max ll -19970.6 ESS: 5.26381
Iter 24 max ll -19666.7 ESS: 1.79627
Iter 25 max ll -19327.7 ESS: 7.2991
Iter 26 max ll -18975 ESS: 3.33462
Iter 27 max ll -18810.4 ESS: 20.1285
Iter 28 max ll -18697.5 ESS: 13.7451
Iter 29 max ll -18697.5 ESS: -nanThis is with --bl-opt-steps 5 in both cases.
—
Reply to this email directly or view it on GitHubhttps://github.com//issues/60#issuecomment-10317880.
Frederick "Erick" Matsen, Assistant Member
Fred Hutchinson Cancer Research Center
http://matsen.fhcrc.org/
from sts.
Well, I'm not sure the distance guide is implemented correctly just yet, and even if it is, it's not clear that its much better than uniform merging. 50LL units could be within the realm of normal run-to-run variation. One thing I'm noticing is that even though ESS is a little higher at the first few iters with the guided prop, that seems to get wiped out by iter 5. This could be that the bias distribution fails to adapt in the current implementation, or could be that distance bias is no longer useful after the first few joins, or a bit of both.
The topology bias would surely guide proposal of internal nodes and anywhere that short internal branches make choosing close pairs of nodes to merge based on distance ambiguous or misleading.
But I think for the first few iters under the natural forest extension, the topology guide only reduces our O(n^2) possible pairs problem to somewhere between O(n) and O(1) depending on topology while the distance guide may get us a little closer to O(1). Maybe the real underlying problem is the natural forest extension??
from sts.
Hm, interesting.
As I think you already note, the thing that scares me about a distance
guide are collections of closely related triplets, quadruplets, etc, where
if we want to inject some randomness we might kick these distances into the
wrong regime. This has to do with the fact that distances are only pairwise
rather than 3-wise.
I'll also review and flesh out the smctex section on simulations on the
natural forest extension.
On Tue, Nov 13, 2012 at 8:02 AM, Aaron Darling [email protected]:
Well, I'm not sure the distance guide is implemented correctly just yet,
and even if it is, it's not clear that its much better than uniform
merging. 50LL units could be within the realm of normal run-to-run
variation. One thing I'm noticing is that even though ESS is a little
higher at the first few iters with the guided prop, that seems to get wiped
out by iter 5. This could be that the bias distribution fails to adapt in
the current implementation, or could be that distance bias is no longer
useful after the first few joins, or a bit of both.The topology bias would surely guide proposal of internal nodes and
anywhere that short internal branches make choosing close pairs of nodes to
merge based on distance ambiguous or misleading.But I think for the first few iters under the natural forest extension,
the topology guide only reduces our O(n^2) possible pairs problem to
somewhere between O(n) and O(1) depending on topology while the distance
guide may get us a little closer to O(1). Maybe the real underlying problem
is the natural forest extension??—
Reply to this email directly or view it on GitHubhttps://github.com//issues/60#issuecomment-10331414.
Frederick "Erick" Matsen, Assistant Member
Fred Hutchinson Cancer Research Center
http://matsen.fhcrc.org/
from sts.
Do you think the distance approach is going to prevent the correct merge from being selected in such cases? Or will it just be less informative/efficient than a topology-guide?
from sts.
Less informative/efficient.
I just made a pass through the smctex manuscript and I'm happy with the
sections on topology-guided proposals and the local->global simulations.
The former may seem like a pipe dream, but I have some ideas about how to
make it more efficient that I'm working out now. But if you have an
alternative proposal let it rip!
On Tue, Nov 13, 2012 at 9:14 AM, Aaron Darling [email protected]:
Do you think the distance approach is going to prevent the correct merge
from being selected in such cases? Or will it just be less
informative/efficient than a topology-guide?—
Reply to this email directly or view it on GitHubhttps://github.com//issues/60#issuecomment-10334164.
Frederick "Erick" Matsen, Assistant Member
Fred Hutchinson Cancer Research Center
http://matsen.fhcrc.org/
from sts.
w00t, only 100 log units to go baby!
koadman@winnebago:~/git/sts$ ./_build/sts --bl-opt-steps 10 -t data/thirty.tree data/thirty.ma > bobochan
Iter 1 max ll -41894.4 ESS: 85.9111
Iter 2 max ll -40501.3 ESS: 45.0118
Iter 3 max ll -39115.4 ESS: 20.2245
Iter 4 max ll -37751.1 ESS: 21.7675
Iter 5 max ll -36509.1 ESS: 71.5553
Iter 6 max ll -35303.8 ESS: 49.1422
Iter 7 max ll -34153.5 ESS: 4.23615
Iter 8 max ll -33013.3 ESS: 57.4791
Iter 9 max ll -31909.7 ESS: 16.37
Iter 10 max ll -30837.9 ESS: 16.1731
Iter 11 max ll -29779 ESS: 10.7576
Iter 12 max ll -28652.2 ESS: 7.66589
Iter 13 max ll -27477.3 ESS: 12.9671
Iter 14 max ll -26342.3 ESS: 10.2697
Iter 15 max ll -25455 ESS: 18.2079
Iter 16 max ll -24616.6 ESS: 7.25548
Iter 17 max ll -23840.4 ESS: 4.17219
Iter 18 max ll -23105 ESS: 12.967
Iter 19 max ll -22260.5 ESS: 6.36874
Iter 20 max ll -21545.6 ESS: 30.6595
Iter 21 max ll -20940.2 ESS: 27.5663
Iter 22 max ll -20374.5 ESS: 6.36076
Iter 23 max ll -19846.6 ESS: 2.9915
Iter 24 max ll -19536.4 ESS: 9.97693
Iter 25 max ll -19208.6 ESS: 12.6592
Iter 26 max ll -18854.1 ESS: 9.56112
Iter 27 max ll -18690.2 ESS: 37.3598
Iter 28 max ll -18562.5 ESS: 14.7043
Iter 29 max ll -18562.5 ESS: 624.238
from sts.
... and check out those ESS's!
2012/12/9 Aaron Darling [email protected]
w00t, only 100 log units to go baby!
koadman@winnebago:~/git/sts$ ./_build/sts --bl-opt-steps 10 -t data/thirty.tree data/thirty.ma > bobochan
Iter 1 max ll -41894.4 ESS: 85.9111
Iter 2 max ll -40501.3 ESS: 45.0118
Iter 3 max ll -39115.4 ESS: 20.2245
Iter 4 max ll -37751.1 ESS: 21.7675
Iter 5 max ll -36509.1 ESS: 71.5553
Iter 6 max ll -35303.8 ESS: 49.1422
Iter 7 max ll -34153.5 ESS: 4.23615
Iter 8 max ll -33013.3 ESS: 57.4791
Iter 9 max ll -31909.7 ESS: 16.37
Iter 10 max ll -30837.9 ESS: 16.1731
Iter 11 max ll -29779 ESS: 10.7576
Iter 12 max ll -28652.2 ESS: 7.66589
Iter 13 max ll -27477.3 ESS: 12.9671
Iter 14 max ll -26342.3 ESS: 10.2697
Iter 15 max ll -25455 ESS: 18.2079
Iter 16 max ll -24616.6 ESS: 7.25548
Iter 17 max ll -23840.4 ESS: 4.17219
Iter 18 max ll -23105 ESS: 12.967
Iter 19 max ll -22260.5 ESS: 6.36874
Iter 20 max ll -21545.6 ESS: 30.6595
Iter 21 max ll -20940.2 ESS: 27.5663
Iter 22 max ll -20374.5 ESS: 6.36076
Iter 23 max ll -19846.6 ESS: 2.9915
Iter 24 max ll -19536.4 ESS: 9.97693
Iter 25 max ll -19208.6 ESS: 12.6592
Iter 26 max ll -18854.1 ESS: 9.56112
Iter 27 max ll -18690.2 ESS: 37.3598
Iter 28 max ll -18562.5 ESS: 14.7043
Iter 29 max ll -18562.5 ESS: 624.238—
Reply to this email directly or view it on GitHubhttps://github.com//issues/60#issuecomment-11176594.
Frederick "Erick" Matsen, Assistant Member
Fred Hutchinson Cancer Research Center
http://matsen.fhcrc.org/
from sts.
yeah, the ESS really did come up. i guess the proposals are almost all equally good (or equally bad) under the natural forest extension. :^)
from sts.
Eked out another 40 log units with c700db8 :
Iter 29 max ll -18519.7 ESS: 398.772
from sts.
@koadman -- AG has implemented a nice way of visualizing particle history in sts-log-graph branch. I'd be curious about what your runs look like.
from sts.
good idea! trying to do it but having some trouble with the grapher:
koadman@winnebago:~/git/sts$ python ./python/generate_graph.py bobo10.log bobograph
Traceback (most recent call last):
File "./python/generate_graph.py", line 29, in <module>
main()
File "./python/generate_graph.py", line 27, in main
graph.write_png(sys.argv[2])
File "/usr/lib/pymodules/python2.7/pydot.py", line 1602, in <lambda>
lambda path, f=frmt, prog=self.prog : self.write(path, format=f, prog=prog))
File "/usr/lib/pymodules/python2.7/pydot.py", line 1696, in write
dot_fd.write(self.create(prog, format))
File "/usr/lib/pymodules/python2.7/pydot.py", line 1740, in create
self.write(tmp_name)
File "/usr/lib/pymodules/python2.7/pydot.py", line 1694, in write
dot_fd.write(self.to_string())
File "/usr/lib/pymodules/python2.7/pydot.py", line 1452, in to_string
graph.append( node.to_string()+'\n' )
File "/usr/lib/pymodules/python2.7/pydot.py", line 722, in to_string
node_attr.append( attr + '=' + quote_if_necessary(value) )
TypeError: cannot concatenate 'str' and 'int' objects
I posted the log file in question here:
http://edhar.genomecenter.ucdavis.edu/~koadman/misc/bobo10.log.bz2
This is based on a fork of master from a while back, has the log format changed?
from sts.
Did you just grab that script?
You will need to merge in that branch in order to use it. There are some
minor changes in the format.
In fact, merging that branch into master then merging master into your
branch would seem sage.
On Tue, Dec 11, 2012 at 7:32 AM, Aaron Darling [email protected]:
good idea! trying to do it but having some trouble with the grapher:
koadman@winnebago:~/git/sts$ python ./python/generate_graph.py bobo10.log bobograph
Traceback (most recent call last):
File "./python/generate_graph.py", line 29, in
main()
File "./python/generate_graph.py", line 27, in main
graph.write_png(sys.argv[2])
File "/usr/lib/pymodules/python2.7/pydot.py", line 1602, in
lambda path, f=frmt, prog=self.prog : self.write(path, format=f, prog=prog))
File "/usr/lib/pymodules/python2.7/pydot.py", line 1696, in write
dot_fd.write(self.create(prog, format))
File "/usr/lib/pymodules/python2.7/pydot.py", line 1740, in create
self.write(tmp_name)
File "/usr/lib/pymodules/python2.7/pydot.py", line 1694, in write
dot_fd.write(self.to_string())
File "/usr/lib/pymodules/python2.7/pydot.py", line 1452, in to_string
graph.append( node.to_string()+'\n' )
File "/usr/lib/pymodules/python2.7/pydot.py", line 722, in to_string
node_attr.append( attr + '=' + quote_if_necessary(value) )
TypeError: cannot concatenate 'str' and 'int' objectsI posted the log file in question here:
http://edhar.genomecenter.ucdavis.edu/~koadman/misc/bobo10.log.bz2This is based on a fork of master from a while back, has the log format
changed?—
Reply to this email directly or view it on GitHubhttps://github.com//issues/60#issuecomment-11247912.
Frederick "Erick" Matsen, Assistant Member
Fred Hutchinson Cancer Research Center
http://matsen.fhcrc.org/
from sts.
I think Erick's right about needing more than the script...it works on our
setup:
http://cl.ly/image/322W47053h0Y/bobo10.png
On Tue, Dec 11, 2012 at 8:06 AM, Erick Matsen [email protected]:
Did you just grab that script?
You will need to merge in that branch in order to use it. There are some
minor changes in the format.In fact, merging that branch into master then merging master into your
branch would seem sage.On Tue, Dec 11, 2012 at 7:32 AM, Aaron Darling [email protected]:
good idea! trying to do it but having some trouble with the grapher:
koadman@winnebago:~/git/sts$ python ./python/generate_graph.py
bobo10.log bobograph
Traceback (most recent call last):
File "./python/generate_graph.py", line 29, in
main()
File "./python/generate_graph.py", line 27, in main
graph.write_png(sys.argv[2])
File "/usr/lib/pymodules/python2.7/pydot.py", line 1602, in
lambda path, f=frmt, prog=self.prog : self.write(path, format=f,
prog=prog))
File "/usr/lib/pymodules/python2.7/pydot.py", line 1696, in write
dot_fd.write(self.create(prog, format))
File "/usr/lib/pymodules/python2.7/pydot.py", line 1740, in create
self.write(tmp_name)
File "/usr/lib/pymodules/python2.7/pydot.py", line 1694, in write
dot_fd.write(self.to_string())
File "/usr/lib/pymodules/python2.7/pydot.py", line 1452, in to_string
graph.append( node.to_string()+'\n' )
File "/usr/lib/pymodules/python2.7/pydot.py", line 722, in to_string
node_attr.append( attr + '=' + quote_if_necessary(value) )
TypeError: cannot concatenate 'str' and 'int' objectsI posted the log file in question here:
http://edhar.genomecenter.ucdavis.edu/~koadman/misc/bobo10.log.bz2This is based on a fork of master from a while back, has the log format
changed?—
Reply to this email directly or view it on GitHub<
https://github.com/metatangle/sts/issues/60#issuecomment-11247912>.Frederick "Erick" Matsen, Assistant Member
Fred Hutchinson Cancer Research Center
http://matsen.fhcrc.org/—
Reply to this email directly or view it on GitHubhttps://github.com//issues/60#issuecomment-11249457.
from sts.
thanks for the graph link. any ideas on how it compares to others? i'll work on the merges...
from sts.
pretty sobering that all but 1 final generation particles in that graph have a common ancestor only 3 generations ago. dear great grandma: your particles need recombination!
from sts.
Seems a lot better than master: http://cl.ly/image/3d203R1D281e/thirty.master.png
from sts.
FYI, Aaron, http://cl.ly/image/3d203R1D281e/thirty.master.png was called
"the pushbroom" by @habnabit. We will probably use that terminology.
from sts.
I love it!
from sts.
Related Issues (20)
- Verify log likelihoods of trees sampled from sts HOT 4
- post-processing tool
- particles with incomplete trees HOT 1
- least-squares fitting of JC likelihood function HOT 1
- make terminology consistent
- simulations to help understand how good the "natural extension" is HOT 6
- check out performance of SMC on mixture distributions HOT 6
- inference ll != test ll HOT 3
- A general system to store particle metadata HOT 11
- Evaluate semantics of state modifications HOT 6
- MCMC move diagnostics/logging HOT 1
- Read model / rate distribution parameters from posterior
- Use lcfit for attachment branch length proposals? HOT 2
- Online: cache likelihoods
- BEAGLE has moved to version 2, whereas we are using hmsbeagle-1. HOT 1
- Propose attachment uniformly over tree length, rather than uniformly over edges HOT 1
- Track induced split frequencies
- consider using a two-parameter lcfit? HOT 1
- License HOT 1
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from sts.