
Comments (5)

csy530216 commented on May 20, 2024

My personal opinion is that the reason we use the Dirichlet distribution as the prior is that it is the conjugate prior of the Multinomial distribution. However, ZhuSuan currently does not support detecting conjugacy and deriving the analytic update formulas for variational inference (this is under development, see here, so you may have to wait a while :) ). Therefore, instead of using the Dirichlet distribution, which is defined on the probability simplex, we first sample from a normal distribution (each dimension independent) and map the sample onto the probability simplex with softmax (Latent0 <- Normal, Latent1 = softmax(Latent0)), i.e., the logistic normal distribution. Now the latent variable lies in an unconstrained space, which is convenient. You can also see how to approximate a Dirichlet distribution with given parameters by a logistic normal distribution in this paper.
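
For intuition, the sampling scheme is just the following (a minimal NumPy sketch; the mean and standard deviation are arbitrary placeholders, not values from your model):

import numpy as np

mu = np.zeros(5)                  # placeholder mean
sigma = np.ones(5)                # placeholder standard deviation
z = np.random.normal(mu, sigma)   # Latent0: unconstrained Gaussian draw
p = np.exp(z - z.max())           # softmax (max subtracted for numerical stability)
p /= p.sum()                      # Latent1: nonnegative and sums to 1, so on the simplex
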
By the way, unlike standard LDA, I think the model in your ultimate goal does not have conjugacy, which makes the conjugacy between the Dirichlet and the Multinomial meaningless anyway. You can still use the black-box inference methods in ZhuSuan, such as VI and HMC. Since your model does not have conjugacy, I think there is basically no loss in changing the Dirichlet to a logistic normal.
Also, to learn more about the background and implementation of the logistic normal topic model in ZhuSuan, you can refer to this tutorial :)


keskinm commented on May 20, 2024

Hello,

Thank you a lot for your reply.
Indeed, putting a Dirichlet prior is not important; what I actually care about is retrieving the latents themselves, and the priors are just a way to do that, so replacing it with a logistic normal should not be a problem for me.

So I am trying to do this:

observations <- zs.Multinomial(toy_logits, n_experiments)

latent0 <- zs.Normal(latent0_mean, latent0_stdev, group_ndims=0)
latent1 <- tf.nn.softmax(latent0)
model_observations <- zs.Multinomial(latent1, n_experiments)

I have a piece of code that should be a good start:

import numpy as np
import tensorflow as tf
import zhusuan as zs

observations = zs.Multinomial(name='observations',logits=tf.log([1.,5.,10.,5.,1.]), n_experiments=100)

latent0_mean = tf.placeholder(tf.float32, shape=[5,], name='latent0_mean')
latent0_stdev = tf.placeholder(tf.float32, shape=[5,],name='latent0_stdev')

Latent0_mean = np.zeros(5, dtype=np.float32)
Latent0_stdev = np.zeros(5, dtype=np.float32)

def lognormalmult(observed, latent0_mean, latent0_stdev):
    with zs.BayesianNet(observed=observed) as model:
        latent0 = zs.Normal('latent0', mean=latent0_mean, logstd=latent0_stdev, group_ndims=0)
        latent1 = tf.nn.softmax(latent0)
        x = zs.Multinomial('x', tf.log(latent1),normalize_logits=False,dtype=tf.float32,n_experiments=100)
    return model

tf.set_random_seed(1)
kernel_width = 0.1
n_chains = 1
n_iters = 200
n_leapfrogs = 5

# Build the computation graph
def log_joint(observed):
    model = lognormalmult(observed, latent0_mean, latent0_stdev)
    log_platent0, log_px = model.local_log_prob(['latent0', 'x'])
    return log_platent0 + log_px

hmc = zs.HMC(step_size=1e-3, n_leapfrogs=n_leapfrogs,target_acceptance_rate=0.9)
x = tf.Variable(tf.zeros([n_chains, 5]), name='x')
latent0 = tf.Variable(tf.zeros([n_chains, 5]), name='latent0')
sample_op, hmc_info = hmc.sample(log_joint, observed={'x': x}, latent={'latent0': latent0})

# Run the inference
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    samples = []
    print('Sampling...')
    for i in range(n_iters):
        _, x_sample = sess.run([sample_op, hmc_info.samples['x']],
                               feed_dict={latent0_mean: Latent0_mean, latent0_stdev: Latent0_stdev})
        samples.append(x_sample)
    print('Finished.')
    samples = np.vstack(samples)
    print('latent0 mean:')
    print(np.mean(samples,0))
    print('latent0_stdev:')
    print(np.std(samples, 0))

In particular, I don't know where to use "observations", or whether I can do it this way.


csy530216 commented on May 20, 2024

In

observations = zs.Multinomial(name='observations',logits=tf.log([1.,5.,10.,5.,1.]), n_experiments=100)

zs.Multinomial is only supposed to be used in a BayesianNet context. You can use

observations = zs.distributions.Multinomial(name='observations',logits=tf.log([1.,5.,10.,5.,1.]), n_experiments=100).sample(size_of_dataset)

My understanding is that your code intends to infer the global latent variable latent0, which corresponds to the logits parameter of the assumed multinomial distribution of the observations. I modified your code accordingly, as follows, and it runs:

import numpy as np
import tensorflow as tf
import zhusuan as zs

N = 50
observations = zs.distributions.Multinomial(name='observations',logits=tf.log([1.,5.,10.,5.,1.]), n_experiments=100).sample(N)

latent0_mean = tf.placeholder(tf.float32, shape=[5,], name='latent0_mean')
latent0_logstdev = tf.placeholder(tf.float32, shape=[5,],name='latent0_logstdev')

Latent0_mean = np.zeros(5, dtype=np.float32)
Latent0_logstdev = np.zeros(5, dtype=np.float32)

def lognormalmult(observed, latent0_mean, latent0_logstdev):
    with zs.BayesianNet(observed=observed) as model:
        latent0 = zs.Normal('latent0', mean=latent0_mean, logstd=latent0_logstdev, group_ndims=1, n_samples=n_chains)
        x = zs.Multinomial('x', tf.expand_dims(latent0, 1),n_experiments=100)
    return model

tf.set_random_seed(1)
kernel_width = 0.1
n_chains = 2
n_iters = 200
n_leapfrogs = 5

# Build the computation graph
def log_joint(observed):
    model = lognormalmult(observed, latent0_mean, latent0_logstdev)
    log_platent0, log_px = model.local_log_prob(['latent0', 'x'])
    return log_platent0 + tf.reduce_sum(log_px, -1)

hmc = zs.HMC(step_size=1e-3, n_leapfrogs=n_leapfrogs,target_acceptance_rate=0.9,adapt_step_size=True)
latent0 = tf.Variable(tf.zeros([n_chains, 5]), name='latent0')
sample_op, hmc_info = hmc.sample(log_joint, observed={'x': observations}, latent={'latent0': latent0})

# Run the inference
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    samples = []
    print('Sampling...')
    for i in range(n_iters):
        _, latent0_sample = sess.run([sample_op, hmc_info.samples['latent0']],
                                     feed_dict={latent0_mean: Latent0_mean, latent0_logstdev: Latent0_logstdev})
        samples.append(latent0_sample)
    print('Finished.')
    samples = np.vstack(samples)
    print('the generative logits')
    print(np.log([1.,5.,10.,5.,1.]))
    print('latent0 mean (notice logits=[a,b] and logits=[a+k,b+k] are the same):')
    print(np.mean(samples,0))
    print('latent0_stdev:')
    print(np.std(samples,0))
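
By the way, the shift invariance mentioned in that print statement is easy to check with plain NumPy (a quick standalone illustration, independent of the model above):

import numpy as np

a = np.array([1., 2., 3.])                # some logits
k = 5.0                                   # an arbitrary shift
p1 = np.exp(a) / np.exp(a).sum()          # softmax of logits a
p2 = np.exp(a + k) / np.exp(a + k).sum()  # softmax of shifted logits a + k
print(np.allclose(p1, p2))                # True: logits are identified only up to a shift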


keskinm commented on May 20, 2024

Hello,
Thanks a lot, that works well. When I do a multinomial sampling with the obtained logits, I get data similar to the original "observations".
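
(The check I mean is roughly this sketch, run after the inference loop above; `samples` is the array collected there:

inferred_logits = tf.constant(np.mean(samples, 0), dtype=tf.float32)  # posterior mean of latent0
resampled = zs.distributions.Multinomial(logits=inferred_logits, n_experiments=100).sample(N)
# evaluating `resampled` in the session and comparing its per-category means
# with those of `observations` gives similar frequencies
)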

However, I run into the same kind of problem as in Edward (in Pyro it works):
Before attempting my "ultimate goal" (with a logistic normal instead of a Dirichlet in ZhuSuan, as you suggest), I want to try something simpler first, like:

Constant1 <- tf.constant([..])
Constant2 <- tf.constant([..])
Observations <- zs.distributions.Multinomial(Constant1 + Constant2, n_experiments)
Latent0 <- zs.distributions.Normal(params)
Observations_model <- zs.distributions.Multinomial(Latent0 + Constant2, n_experiments)

but that fails to retrieve the correct params, so we cannot obtain a Latent0 similar to Constant1 and thereby data similar to the original (see also https://discourse.edwardlib.org/t/a-little-example-on-dirichlet-multinomial-inference/802 if interested; it is the same thing with a Dirichlet instead of a Normal).

Code:

import numpy as np
import tensorflow as tf
import zhusuan as zs

N = 500
constant1=tf.constant([40.,5.,10.,20.,80])
constant2=tf.constant([10.,2.,5.,3.,50.])
observations = zs.distributions.Multinomial(name='observations',logits=tf.log(constant1+constant2), n_experiments=8000).sample(N)

latent0_mean = tf.placeholder(tf.float32, shape=[5,], name='latent0_mean')
latent0_logstdev = tf.placeholder(tf.float32, shape=[5,],name='latent0_logstdev')

Latent0_mean = np.zeros(5, dtype=np.float32)
Latent0_logstdev = np.zeros(5, dtype=np.float32)

def lognormalmult(observed, latent0_mean, latent0_logstdev):
    with zs.BayesianNet(observed=observed) as model:
        latent0 = zs.Normal('latent0', mean=latent0_mean, logstd=latent0_logstdev, group_ndims=1)

        constant2 = tf.constant([10.,2.,5.,3.,50.])
        
        x = zs.Multinomial('x', logits=(latent0+constant2), n_experiments=8000)
    return model

tf.set_random_seed(1)
kernel_width = 0.1
n_chains = 2
n_iters = 2000
n_leapfrogs = 5

# Build the computation graph
def log_joint(observed):
    model = lognormalmult(observed, latent0_mean, latent0_logstdev)
    log_platent0, log_px = model.local_log_prob(['latent0', 'x'])
    return log_platent0 + tf.reduce_sum(log_px, -1)

hmc = zs.HMC(step_size=1e-3, n_leapfrogs=n_leapfrogs,target_acceptance_rate=0.9,adapt_step_size=True)
latent0 = tf.Variable(tf.zeros([5,]), name='latent0')
sample_op, hmc_info = hmc.sample(log_joint, observed={'x': observations}, latent={'latent0': latent0})

# Run the inference (same as before)...

(I removed n_chains to make it simpler)


csy530216 commented on May 20, 2024

Sorry for the (very) late reply. The problem is in:

observations = zs.distributions.Multinomial(name='observations',logits=tf.log(constant1+constant2), n_experiments=8000).sample(N)

And

x = zs.Multinomial('x', logits=(latent0+constant2), n_experiments=8000)

You can see that we need latent0 + constant2 = tf.log(constant1 + constant2) (up to an additive constant), and this does not imply latent0 = tf.log(constant1).
Sorry that the code in my earlier reply was not clear enough: I should have written constant1 = np.log([1.,5.,10.,5.,1.]) and replaced the other log([1.,5.,10.,5.,1.]) with constant1. The log there is only to show the conversion between (unnormalized) probabilities and logits; here we only need to work in logits space, so there is no need to use log.
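
To make this concrete, here is a small NumPy check of the point above (just an illustration with the constants from your code, not part of the model):

import numpy as np

constant1 = np.array([40., 5., 10., 20., 80.])
constant2 = np.array([10., 2., 5., 3., 50.])
target = np.log(constant1 + constant2)  # logits that actually generated the data
latent0 = target - constant2            # one assignment that fits the data exactly
print(latent0 + constant2 - target)     # all zeros: reproduces the data-generating logits
print(latent0 - np.log(constant1))      # not a constant vector: latent0 != log(constant1), even up to a shift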

