
Comments (5)

csy530216 commented on May 20, 2024

My personal opinion is that the reason we use the Dirichlet distribution as the prior is that it is the conjugate prior of the Multinomial distribution. However, ZhuSuan currently does not support detecting conjugacy and deriving the analytic update formulas for variational inference (this is under development, see here, so you may have to wait a while :) ). Therefore, instead of using the Dirichlet distribution, which is defined on the probability simplex, we first sample from a normal distribution (each dimension independent) and map the sample onto the probability simplex with softmax (Latent0 <- Normal, Latent1 = softmax(Latent0)), i.e., the logistic normal distribution. Now the latent variable lies in an unconstrained space, which is convenient. You can also see how to approximate a Dirichlet distribution with given parameters by a logistic normal distribution in this paper.
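
For intuition, the sampling scheme is just the following (a minimal NumPy sketch; the mean and standard deviation are arbitrary placeholders, not values from your model):

import numpy as np

mu = np.zeros(5)                  # placeholder mean
sigma = np.ones(5)                # placeholder standard deviation
z = np.random.normal(mu, sigma)   # Latent0: unconstrained Gaussian draw
p = np.exp(z - z.max())           # softmax (max subtracted for numerical stability)
p /= p.sum()                      # Latent1: nonnegative and sums to 1, so on the simplex
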
By the way, unlike standard LDA, I think the model in your ultimate goal does not have conjugacy, which makes the conjugacy between the Dirichlet and the Multinomial meaningless anyway. You can still use the black-box inference methods in ZhuSuan, such as VI and HMC. Since your model does not have conjugacy, I think there is basically no loss in changing the Dirichlet to a logistic normal.
Also, to learn more about the background and implementation of the logistic normal topic model in ZhuSuan, you can refer to this tutorial :)


keskinm commented on May 20, 2024

Hello,

Thank you a lot for your reply.
Indeed, putting a Dirichlet prior is not important; what I actually care about is retrieving the latents themselves, and the priors are just a way to do that, so replacing it with a logistic normal should not be a problem for me.

So I am trying to do this:

observations <- zs.Multinomial(toy_logits, n_experiments)

latent0 <- zs.Normal(latent0_mean, latent0_stdev, group_ndims=0)
latent1 <- tf.nn.softmax(latent0)
model_observations <- zs.Multinomial(latent1, n_experiments)

I have a piece of code that should be a good start:

import numpy as np
import tensorflow as tf
import zhusuan as zs

observations = zs.Multinomial(name='observations',logits=tf.log([1.,5.,10.,5.,1.]), n_experiments=100)

latent0_mean = tf.placeholder(tf.float32, shape=[5,], name='latent0_mean')
latent0_stdev = tf.placeholder(tf.float32, shape=[5,],name='latent0_stdev')

Latent0_mean = np.zeros(5, dtype=np.float32)
Latent0_stdev = np.zeros(5, dtype=np.float32)

def lognormalmult(observed, latent0_mean, latent0_stdev):
    with zs.BayesianNet(observed=observed) as model:
        latent0 = zs.Normal('latent0', mean=latent0_mean, logstd=latent0_stdev, group_ndims=0)
        latent1 = tf.nn.softmax(latent0)
        x = zs.Multinomial('x', tf.log(latent1),normalize_logits=False,dtype=tf.float32,n_experiments=100)
    return model

tf.set_random_seed(1)
kernel_width = 0.1
n_chains = 1
n_iters = 200
n_leapfrogs = 5

# Build the computation graph
def log_joint(observed):
    model = lognormalmult(observed, latent0_mean, latent0_stdev)
    log_platent0, log_px = model.local_log_prob(['latent0', 'x'])
    return log_platent0 + log_px

hmc = zs.HMC(step_size=1e-3, n_leapfrogs=n_leapfrogs,target_acceptance_rate=0.9)
x = tf.Variable(tf.zeros([n_chains, 5]), name='x')
latent0 = tf.Variable(tf.zeros([n_chains, 5]), name='latent0')
sample_op, hmc_info = hmc.sample(log_joint, observed={'x': x}, latent={'latent0': latent0})

# Run the inference
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    samples = []
    print('Sampling...')
    for i in range(n_iters):
        _, x_sample = sess.run([sample_op, hmc_info.samples['x']],
                               feed_dict={latent0_mean: Latent0_mean, latent0_stdev: Latent0_stdev})
        samples.append(x_sample)
    print('Finished.')
    samples = np.vstack(samples)
    print('latent0 mean:')
    print(np.mean(samples,0))
    print('latent0_stdev:')
    print(np.std(samples, 0))

In particular, I don't know where to use "observations", or whether I can do it this way.


csy530216 commented on May 20, 2024

In

observations = zs.Multinomial(name='observations',logits=tf.log([1.,5.,10.,5.,1.]), n_experiments=100)

zs.Multinomial is only supposed to be used in a BayesianNet context. You can use

observations = zs.distributions.Multinomial(name='observations',logits=tf.log([1.,5.,10.,5.,1.]), n_experiments=100).sample(size_of_dataset)

My understanding is that your code intends to infer the global latent variable latent0, which corresponds to the logits parameter of the assumed multinomial distribution of the observations. I modified your code accordingly, as follows, and it runs:

import numpy as np
import tensorflow as tf
import zhusuan as zs

N = 50
observations = zs.distributions.Multinomial(name='observations',logits=tf.log([1.,5.,10.,5.,1.]), n_experiments=100).sample(N)

latent0_mean = tf.placeholder(tf.float32, shape=[5,], name='latent0_mean')
latent0_logstdev = tf.placeholder(tf.float32, shape=[5,],name='latent0_logstdev')

Latent0_mean = np.zeros(5, dtype=np.float32)
Latent0_logstdev = np.zeros(5, dtype=np.float32)

def lognormalmult(observed, latent0_mean, latent0_logstdev):
    with zs.BayesianNet(observed=observed) as model:
        latent0 = zs.Normal('latent0', mean=latent0_mean, logstd=latent0_logstdev, group_ndims=1, n_samples=n_chains)
        x = zs.Multinomial('x', tf.expand_dims(latent0, 1),n_experiments=100)
    return model

tf.set_random_seed(1)
kernel_width = 0.1
n_chains = 2
n_iters = 200
n_leapfrogs = 5

# Build the computation graph
def log_joint(observed):
    model = lognormalmult(observed, latent0_mean, latent0_logstdev)
    log_platent0, log_px = model.local_log_prob(['latent0', 'x'])
    return log_platent0 + tf.reduce_sum(log_px, -1)

hmc = zs.HMC(step_size=1e-3, n_leapfrogs=n_leapfrogs,target_acceptance_rate=0.9,adapt_step_size=True)
latent0 = tf.Variable(tf.zeros([n_chains, 5]), name='latent0')
sample_op, hmc_info = hmc.sample(log_joint, observed={'x': observations}, latent={'latent0': latent0})

# Run the inference
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    samples = []
    print('Sampling...')
    for i in range(n_iters):
        _, latent0_sample = sess.run([sample_op, hmc_info.samples['latent0']],
                                     feed_dict={latent0_mean: Latent0_mean, latent0_logstdev: Latent0_logstdev})
        samples.append(latent0_sample)
    print('Finished.')
    samples = np.vstack(samples)
    print('the generative logits')
    print(np.log([1.,5.,10.,5.,1.]))
    print('latent0 mean (notice logits=[a,b] and logits=[a+k,b+k] are the same):')
    print(np.mean(samples,0))
    print('latent0_stdev:')
    print(np.std(samples,0))
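
By the way, the shift invariance mentioned in that print statement is easy to check with plain NumPy (a quick standalone illustration, independent of the model above):

import numpy as np

a = np.array([1., 2., 3.])                # some logits
k = 5.0                                   # an arbitrary shift
p1 = np.exp(a) / np.exp(a).sum()          # softmax of logits a
p2 = np.exp(a + k) / np.exp(a + k).sum()  # softmax of shifted logits a + k
print(np.allclose(p1, p2))                # True: logits are identified only up to a shift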


keskinm commented on May 20, 2024

Hello,
Thanks a lot, that works well. When I do a multinomial sampling with the obtained logits, I get data similar to the original "observations".
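
(The check I mean is roughly this sketch, run after the inference loop above; `samples` is the array collected there:

inferred_logits = tf.constant(np.mean(samples, 0), dtype=tf.float32)  # posterior mean of latent0
resampled = zs.distributions.Multinomial(logits=inferred_logits, n_experiments=100).sample(N)
# evaluating `resampled` in the session and comparing its per-category means
# with those of `observations` gives similar frequencies
)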

However, I run into the same kind of problem as in Edward (in Pyro it works):
Before attempting my "ultimate goal" (with a logistic normal instead of a Dirichlet in ZhuSuan, as you suggest), I want to try something simpler first, like:

Constant1 <- tf.constant([..])
Constant2 <- tf.constant([..])
Observations <- zs.distributions.Multinomial(Constant1 + Constant2, n_experiments)
Latent0 <- zs.distributions.Normal(params)
Observations_model <- zs.distributions.Multinomial(Latent0 + Constant2, n_experiments)

but that fails to retrieve the correct params, so we cannot obtain a Latent0 similar to Constant1 and thereby data similar to the original (see also https://discourse.edwardlib.org/t/a-little-example-on-dirichlet-multinomial-inference/802 if interested; it is the same thing with a Dirichlet instead of a Normal).

Code:

import numpy as np
import tensorflow as tf
import zhusuan as zs

N = 500
constant1=tf.constant([40.,5.,10.,20.,80])
constant2=tf.constant([10.,2.,5.,3.,50.])
observations = zs.distributions.Multinomial(name='observations',logits=tf.log(constant1+constant2), n_experiments=8000).sample(N)

latent0_mean = tf.placeholder(tf.float32, shape=[5,], name='latent0_mean')
latent0_logstdev = tf.placeholder(tf.float32, shape=[5,],name='latent0_logstdev')

Latent0_mean = np.zeros(5, dtype=np.float32)
Latent0_logstdev = np.zeros(5, dtype=np.float32)

def lognormalmult(observed, latent0_mean, latent0_logstdev):
    with zs.BayesianNet(observed=observed) as model:
        latent0 = zs.Normal('latent0', mean=latent0_mean, logstd=latent0_logstdev, group_ndims=1)

        constant2 = tf.constant([10.,2.,5.,3.,50.])
        
        x = zs.Multinomial('x', logits=(latent0+constant2), n_experiments=8000)
    return model

tf.set_random_seed(1)
kernel_width = 0.1
n_chains = 2
n_iters = 2000
n_leapfrogs = 5

# Build the computation graph
def log_joint(observed):
    model = lognormalmult(observed, latent0_mean, latent0_logstdev)
    log_platent0, log_px = model.local_log_prob(['latent0', 'x'])
    return log_platent0 + tf.reduce_sum(log_px, -1)

hmc = zs.HMC(step_size=1e-3, n_leapfrogs=n_leapfrogs,target_acceptance_rate=0.9,adapt_step_size=True)
latent0 = tf.Variable(tf.zeros([5,]), name='latent0')
sample_op, hmc_info = hmc.sample(log_joint, observed={'x': observations}, latent={'latent0': latent0})

# Run the inference (same as before)...

(I removed n_chains to make it simpler)


csy530216 commented on May 20, 2024

Sorry for the (very) late reply. The problem is in:

observations = zs.distributions.Multinomial(name='observations',logits=tf.log(constant1+constant2), n_experiments=8000).sample(N)

And

x = zs.Multinomial('x', logits=(latent0+constant2), n_experiments=8000)

You can see that we need latent0 + constant2 = tf.log(constant1 + constant2) (up to an additive constant), and this does not imply latent0 = tf.log(constant1).
Sorry that the code in my earlier reply was not clear enough: I should have written constant1 = np.log([1.,5.,10.,5.,1.]) and replaced the other log([1.,5.,10.,5.,1.]) with constant1. The log there is only to show the conversion between (unnormalized) probabilities and logits; here we only need to work in logits space, so there is no need to use log.
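
To make this concrete, here is a small NumPy check of the point above (just an illustration with the constants from your code, not part of the model):

import numpy as np

constant1 = np.array([40., 5., 10., 20., 80.])
constant2 = np.array([10., 2., 5., 3., 50.])
target = np.log(constant1 + constant2)  # logits that actually generated the data
latent0 = target - constant2            # one assignment that fits the data exactly
print(latent0 + constant2 - target)     # all zeros: reproduces the data-generating logits
print(latent0 - np.log(constant1))      # not a constant vector: latent0 != log(constant1), even up to a shift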

