probmods / probmods2
probmods 2: electric boogaloo
Home Page: https://probmods.org/v2
In the "Non-parametric models" chapter, in section "Properties of DP Memoized Procedures", the two examples with code
var memoizedGaussian = DPmem(1, gaussian);
viz(repeat(10000, function() {return memoizedGaussian(0,1)}));
var memoizedGaussian = DPmem(1, gaussian);
viz(repeat(10000, function() {return gaussian(0, 1)}));
viz(repeat(10000, function() {return memoizedGaussian(0,1)}));
do not work, since DPmem is undefined. Copying the code for DPmem from the previous examples doesn't seem to do the right thing: the
viz(repeat(10000, function() {return memoizedGaussian(0,1)}));
line produces a graph with a single bar assigning all probability to null.
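For anyone hitting this, here is a minimal stick-breaking sketch of stochastic memoization that makes the examples run. This is only an assumption about what the chapter intends (simplified to fixed arity rather than the variadic DPmem of the Church version; I believe the webppl-dp package also ships a DPmem):

var pickStick = function(sticks, j) {
  return flip(sticks(j)) ? j : pickStick(sticks, j + 1);
};
var makeSticks = function(alpha) {
  var sticks = mem(function(index) { return beta(1, alpha); });
  return function() { return pickStick(sticks, 1); };
};
// Fixed-arity variant: the memoized procedure takes exactly two arguments.
var DPmem = function(alpha, baseFn) {
  var getValue = mem(function(a, b, stickIndex) { return baseFn(a, b); });
  var getSticks = mem(function(a, b) { return makeSticks(alpha); });
  return function(a, b) { return getValue(a, b, getSticks(a, b)()); };
};

var memoizedGaussian = DPmem(1, gaussian);
viz(repeat(10000, function() { return memoizedGaussian(0, 1); }));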
In patterns of inference, on the example Medical Diagnosis halfway down the page this is stated:
"To illustrate, observe how the probabilities of cold and lungDisease change when we observe cough is true:"
followed by:
"Both cold and lung disease are now far more likely that their baseline probability: the probability of having a cold increases from 2% to around 50%; the probability of having lung disease also increases from 1 in a 1000 to around 50%."
This seemed wrong to me immediately, considering the code is:
var cough = (cold && flip(0.5)) || (lungDisease && flip(0.5)) || flip(0.001);
given the symmetric 0.5 cause strengths, lung disease and cold would need about the same prior probability for both posteriors to come out near 50%.
The actual code looks like this:
var smokes = flip(.2);
var lungDisease = flip(0.001) || (smokes && flip(0.1));
var cold = flip(0.02);
which means the prior probability of lung disease is about 2.1% when you know nothing about the patient; so, stated the book's way, the increase is from about 21 in 1000 to around 50%, not from 1 in 1000.
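You can check the 2.1% directly:

// P(lungDisease) = 1 - (1 - 0.001) * (1 - 0.2 * 0.1) ≈ 0.021
viz(Infer({method: 'enumerate'}, function() {
  var smokes = flip(0.2);
  var lungDisease = flip(0.001) || (smokes && flip(0.1));
  return lungDisease;
}));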
I think we updated the webppl version to use lodash internally, but some of the code in the BDA chapter still uses underscore-only functions (e.g. _.where, which lodash doesn't provide), so the code breaks.
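For the record, the usual fix: lodash's _.filter accepts the same object shorthand that underscore's _.where used (the data and condition names here are hypothetical placeholders):

// underscore (breaks under lodash): _.where(data, {condition: 'A'})
// lodash, via the `matches` iteratee shorthand:
var rows = _.filter(data, {condition: 'A'});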
Hi,
I have three questions.
As far as I can tell, this gives an error:
var latent = sample(latentPrior)
var thunk = function() {return observe(latent)}
var sequence = repeat(2,thunk)
In particular, the error I get is "sample() expected a distribution". I'm confused by this, because I was under the impression that Gaussian() was a distribution!
Interestingly, what is highlighted is the observe function, not sample. Not sure what that means.
In a number of places, we were forced to use a condition statement in Church, but a factor would be nicer, e.g.
var speaker = function(state, depth) {
  return Infer({method: 'enumerate'}, function() {
    var words = sample(sentencePrior)
    var listenerDist = listener(words, depth)
    condition(state == sample(listenerDist))
    return words
  })
};
could be rewritten to score the state directly under the listener distribution, marginalizing out the sampled listener response instead of rejecting on it:
var speaker = function(state, depth) {
  return Infer({method: 'enumerate'}, function() {
    var words = sample(sentencePrior)
    var listenerDist = listener(words, depth)
    factor(listenerDist.score(state))
    return words
  })
};
When visiting https://probmods.org/, Chrome shows a big scary warning that the security certificate is invalid.
Change the data so that model predictions are too extreme relative to the data (i.e., there is some random guessing in the data)
While grading the BDA exercises, it seems like a lot of the time the priors come out looking wonky. This is undesirable because this exercise is supposed to introduce the student to the beta family of priors. I suggest refactoring this exercise to have two Infers: one with method: "forward" to show the prior, and the other with MCMC or rejection for the posterior.
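Something like this sketch (the beta parameters and data below are placeholders, not the exercise's actual numbers):

// Show the prior on its own, with forward sampling:
var prior = Infer({method: 'forward', samples: 10000}, function() {
  return beta(2, 2);
});
viz(prior);

// Then the posterior, with a method that handles observe:
var posterior = Infer({method: 'MCMC', samples: 10000}, function() {
  var p = beta(2, 2);
  observe(Binomial({n: 20, p: p}), 15); // placeholder data
  return p;
});
viz(posterior);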
The current version in chapter 5 uses observe:
var latent = sample(latentPrior)
var thunk = function() {return observe(latent)}
var sequence = repeat(2,thunk)
which is confusing, because observe is now a primitive that means something else.
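Under the current primitive, where observe(distribution, value) scores a datum, the example presumably wants something along these lines (latentPrior, the noise model, and the data values are guesses for illustration):

var latentPrior = Gaussian({mu: 0, sigma: 1});
var post = Infer({method: 'MCMC', samples: 5000}, function() {
  var latent = sample(latentPrior);
  var obsFn = function(datum) {
    observe(Gaussian({mu: latent, sigma: 1}), datum);
  };
  map(obsFn, [0.8, 1.2]); // two observations
  return latent;
});
viz(post);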
Basic idea is to ask students how they might evaluate these inference models on data (i.e., building on the BDA chapter). Some notes:
I'm not sure if the URL should point to probmods.github.io/probmods2 or just to probmods.org?
I'm going to be teaching this chapter next week (Feb 15) and was hoping to try to fix the chapter before then :)
I tried to do a bit of debugging of the two currently broken vending machine inference examples. From what I can tell, the issue is that buttonsToOutcomeProbs(action) is called from within the inner inference, and the memoization of that call does not hold across samples during the rejection sampling of the inner inference. So, the rejection sampling process is drawing new probabilities each time it samples e.g., ['a'] from the prior (and these are in turn different from the ones returned from the outer inference and graphed). This actually makes some sense -- what if we wanted to make the inference about the probabilities within the inner rather than the outer Infer? The inner Infer doesn't know it's an embedded inference. You can see what I'm talking about in the attached debugging file, which eliminates the outer inference.
This isn't an issue in the earlier examples in the chapter, because those distributions are all parametric with a fixed number of actions known a priori, so the probabilities are drawn outside of the inner inference and then passed to it.
I've attached some code with a work-around, where I just implemented the rejection sampling by hand without using Infer, but obviously that doesn't solve the broader problem of cases where you want memoization in the outer inference to also hold in the inner inference. I also fixed a couple of other more minor bugs in the models (I think the main one was that the condition in the second model was incorrect).
Besides that, there is also the currently non-converted sketch of the planning section at the end of the chapter. Thanks!
inference_debugging.txt
inference_about_inference_model0.txt
inference_about_inference_model1.txt
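For reference, here is my guess at a minimal reproduction of the re-drawing behavior described above (a toy stand-in for buttonsToOutcomeProbs; the attached debugging file has the real case):

var check = Infer({method: 'forward', samples: 1}, function() {
  var outcomeProb = mem(function(action) { return uniform(0, 1); });
  var inner = Infer({method: 'rejection', samples: 5}, function() {
    return outcomeProb('a'); // reportedly re-drawn on each rejection run
  });
  return inner.support().length; // > 1 exposes the re-drawing
});
display(check);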
I'm trying to include some MDP/POMDP in a chapter for my fork of probmods2. And I've been banging my head against the code for a few hours, trying to figure out how agentmodels.org includes dependencies like webppl-agents.
Likely, this is because I don't really understand how GitHub Pages sites get compiled, what files and code are required by GitHub Pages, what is custom to probmods or agentmodels, etc. Moreover, different JavaScript packages are included through different methods, sometimes in the same file (e.g., the chapter.html template in probmods uses a script tag, which I understand, and a structure called custom_js, which I don't).
If there are any tutorials that I should look at, I'd love being pointed at them. Here's a specific question in the meantime:
I want to use code from webppl-agents in a chapter in my fork of probmods2. agentmodels contains a file called 'package.json' that contains the following:
"dependencies": {
"webppl": "github:probmods/webppl",
"webppl-agents": "github:agentmodels/webppl-agents",
"webppl-dp": "^0.1.1",
"webppl-editor": "github:probmods/webppl-editor",
"webppl-nn": "github:null-a/webppl-nn",
"webppl-timeit": "^0.3.0"
}
So it seems like maybe this is somehow involved in making webppl-agents available? But for the life of me, I can't figure out what reads 'package.json' or under what circumstances. More concerning, I noticed that much of the code from the webppl-agents package is actually included in agentmodels' version of webppl.min.js. So I'm not sure that 'package.json' does anything at all.
I'd really appreciate any help in figuring out how to do this!
//==============================================================================================
// Inference of the posterior latent threshold distribution of a widget tester
// WPPL-code modified PCM 2017/02/02
//----------------------------------------------------------------------------------------------
var starttime = Date.now()
var nReplications = 1000
var nWidgets = 3
var myMaxScore = 0
print("nReplications = " + nReplications)
var widgetMachine = Categorical({vs: [.2,.3,.4,.5,.6,.7,.8], ps: [.05,.1,.2,.3,.2,.1,.05]})
viz.hist(widgetMachine, {xLabel: 'SizeOfWidgetProduced'})
// this is the assumed prior latent threshold distribution of the widget tester:
var thresholdPrior = Categorical({vs: [.3,.4,.5,.6,.7], ps: [.1,.2,.4,.2,.1]})
viz.hist(thresholdPrior, {xLabel: 'LatentThresholdPriorOfWidgetTester'})
var observations = [ // two certified-good triples of widgets leaving the factory
  [.6,.7,.8], [.6,.6,.6]] // runtime with one observation: ca. 5 sec
print("observations (= certifiedGoodWidgetTriples):")
print(observations)
//----------------------------------------------------------------------
var makeGoodWidgetSeq = function(numWidgets, threshold) {
  if (numWidgets == 0) { return [] }
  else {
    var widget = sample(widgetMachine)  // produce one widget
    return (widget > threshold ?        // if the widget is "good", keep it ...
      [widget].concat(makeGoodWidgetSeq(numWidgets - 1, threshold)) :
      makeGoodWidgetSeq(numWidgets, threshold)) }}  // ... otherwise try again
//----------------------------------------------------------------------
var latentThresholdPosteriorOfTester =
  Infer({method: 'rejection', samples: nReplications, maxScore: myMaxScore}, function() {
    var threshold = sample(thresholdPrior)  // sample from the tester's latent threshold prior
    // each observed triple is a separate production run, so generate one
    // candidate sequence per observation (the original callback was also
    // missing its `return`, so `all` always saw undefined)
    var matchesObservation = function(observation) {
      var goodWidgetSeq = makeGoodWidgetSeq(nWidgets, threshold)
      return _.isEqual(observation, goodWidgetSeq)
    }
    condition(all(matchesObservation, observations))
    return [threshold].join("") })  // stringify so viz.hist shows discrete labels
//----------------------------------------------------------------------
viz.hist(latentThresholdPosteriorOfTester,
{xLabel: 'LatentThresholdPosteriorOfTester | certifiedGoodWidgetTriples'})
print('LatentThresholdPosteriorOfTester | certifiedGoodWidgetTriples')
display(latentThresholdPosteriorOfTester)
var stoptime = Date.now()
var runTimeSec = (stoptime - starttime)/1000
print("runTime in seconds = " + runTimeSec)
//----------------------------------------------------------------------------------------------
Hi,
there is the line "condition(_.isEqual([.6, .7, .8], goodWidgetSeq))". I don't understand why the sequence of good widgets should have to be in ascending order. Isn't it sufficient that all members of the good-widget sequence are above the sampled threshold of the widget tester?
Best, Claus
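(A side note on the question: if the intent really were order-insensitive, the condition could compare sorted copies, as below, but as written the model seems to condition on the exact production order.)

condition(_.isEqual([.6, .7, .8], _.sortBy(goodWidgetSeq)))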
In section "Thoughts on hierarchical models", the fair vs. unfair model selection link is broken.
After moving this chapter earlier in the book, some major revisions are in order (consolidating Noah's comments in the chapter markdown with other comments):
Github links, contributors, usage recommendations, support
With PR #66, a bunch of code blocks got marked with "~~~js", which I don't think is needed.
This is more customary for our scatter plots; see Chapter 2, Exercise 7 for an example.
flip(0.8) ? flip() : false
Looks like they're missing from old probmods
Footnotes, when clicked, should show up in a little callout right in the original text location
Hi,
I stumbled across the line "return T.get(bEffects, 1)})". I assume this fetches the second component of the posterior vector bEffects. But why "T.get"? I did not find proper documentation for "T.get".
Where can I find an explanation?
Best, Claus
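(For reference: T is webppl's tensor namespace, and as far as I know T.get(t, i) reads the scalar at zero-based index i, so T.get(bEffects, 1) is indeed the second component. A quick check:)

var v = Vector([3, 7])
display(T.get(v, 0))  // 3
display(T.get(v, 1))  // 7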
Chapter 8:
Cheng and colleagues [@cheng] have suggested assuming that C and background effects can both cause C
The Cheng reference is dangling.
Also, it should probably be "both cause E".
PR #66 has a lot of changes in the .css. I'm not sure whether these are inert or actually change the styling. Should we check on different browsers? @jkhartshorne?
A few of the more complex social inference models in Chap. 6 are giving trouble: you can't enumerate the inner inference because of the continuous Dirichlet samples, and it takes too long to rebuild it every time with MCMC or rejection.
You could cache the inner inference, but a couple of the arguments are functions, and even if you refactored it to construct the functions inside the inner inference, the relevant parameters are floats, so caching wouldn't help much. Is this a case where variational inference would save us, or am I missing some bad inefficiency?
https://probmods.org/v2/chapters/06-inference-about-inference.html#epistemic-states
Meta-note: this is maybe the most interesting disadvantage I've seen compared to Church's rejection-query implementation, since there you can just directly take samples from the inner distribution without needing to first construct a reasonably representative distribution object.
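For reference, the caching idea with a toy inner model (not the chapter's actual listener): cache keys on the serialized arguments, which is why function-valued arguments defeat it and continuous parameters rarely repeat.

var innerPosterior = cache(function(bias) {
  return Infer({method: 'enumerate'}, function() {
    return flip(bias);
  });
});
var outer = Infer({method: 'enumerate'}, function() {
  var bias = uniformDraw([0.3, 0.5, 0.7]); // discrete, so the cache actually hits
  condition(sample(innerPosterior(bias)));
  return bias;
});
viz(outer);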
I have a draft from the fall 2017 class... need to finish and add text.
Currently, the first code box uses the variable bagToPrototype to refer to the local prototype of the bag. Code box 2 changes this to colorProb (and drops the prototype label). In code box 3, prototype comes back, but now refers to the global prototype. From there on out, it looks like prototype refers to the global prototype.
I'm intending to do a major (and long overdue) re-organization shortly. My proposed new outline is:
Introduction — A brief introduction to the philosophy.
Basics:
Generative models — Representing working models with probabilistic programs.
Conditioning — Asking questions of models by conditional inference.
Causal and statistical dependence —
Conditional dependence — conditional dependence, explaining away, screening off, etc.
Bayesian data analysis — Making scientific inferences about data and models
Algorithms for inference — The landscape of inference methods, efficiency tradeoffs of different algorithms.
Rational process models — From competence to process.
[Models for sequences of observations — Generative models of the relations between data points. **Delete: move iid/exchangeability to the learning section; move HMM/PCFG elsewhere.**]
Learning:
Learning as conditional inference — How inferences change as data accumulate. (Include iid, exch, etc.)
Learning compositional hypotheses — RR, PLOT, etc (include PCFG here?)
Learning continuous functions — Deep probabilistic models. GPs?
Hierarchical models — The power of abstraction.
Occam's Razor — Penalizing extra model flexibility.
Mixture models — Models for inferring the kinds of things.
Non-parametric models — What to do when you don't know how many kinds there are. **Get rid of this chapter? [Add a small summary of non-parametrics (dirichlet-discrete to Dirichlet process via sequential sampling) to ch. 11; get rid of ch. 12.]**
Social reasoning:
Agents as probabilistic programs — One-shot decision problems, softmax choice
Sequential decisions — Markov Decision Processes and Partially Observable Markov Decision Processes
Inference about inference — social cognition, pragmatics
Appendix - JavaScript basics — A very brief primer on JavaScript.
What do a and b intuitively correspond to? Currently students are mostly waxing poetic about the exact shapes of the 3 things we asked them to plot rather than abstracting away to a view of the whole family. The Beta section still needs to be filled in. It would also be nice to have an intuitive image of the simplex like this for different settings of the alpha parameter (especially since the hierarchical models chapter splits out the relative numbers assigned to the categories, like [1,1,4], from the magnitude by which they're multiplied, which pushes them toward the extreme points or toward the center).
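One way to push toward the family view rather than individual shapes (a sketch; the pseudo-count reading is a = prior successes + 1, b = prior failures + 1):

map(function(ab) {
  print('a = ' + ab[0] + ', b = ' + ab[1]);
  viz(repeat(5000, function() { return beta(ab[0], ab[1]); }));
}, [[1, 1], [2, 2], [10, 10], [2, 8]]);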
Once probmods is running on webppl 0.9.6, the justSample option can be safely dropped (from here). This is because the samples property is now always available on the marginal returned by sampling-based algorithms, and the justSample option has no effect. See probmods/webppl#712.
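That is, after the upgrade this should just work:

var m = Infer({method: 'MCMC', samples: 100}, function() {
  return gaussian(0, 1);
});
display(m.samples.length); // raw samples, no justSample needed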