pronobis / libspn
Library for learning and inference with Sum-product Networks
License: Other
Reinvestigate the performance differences when using custom Ops and create PR when there is a clear benefit of using these. Otherwise focus on optional compilation of custom Ops so that it becomes easier for contributors to add their own custom Ops.
This issue tracks the updates that should happen to the website only for the release. This includes updating installation instructions, features, etc. This issue does not track any changes to demonstrations, tutorials, etc.
Hi guys, you are doing a great job with this repo.
I have a task with some special requirements.
Say an SPN specifies a joint distribution over the following set of variables:
x_1, x_2, x_3, x_4, where each is a continuous variable modelled using a univariate Gaussian leaf.
My first case is simple, some of the variables are corrupted during inference (where C indicates corrupted):
x_1, x_2, C, C.
For this case, the marginal likelihood is used for the leaves of the corrupted variable. The marginal likelihood is computed from -infinity to +infinity, giving a value of 1 for the corrupted leaves.
The second case is similar to the first, but with added complexity. As in the first case, some of the variables are corrupted. However, in this case we know that the true value of the variable lies somewhere between -infinity and the observed value. Therefore, during inference, I need the marginal likelihood to be computed over the bounds -infinity to the observed value of the variable. If this does not make sense, I have formulas available to make it clearer.
Would this be possible with LibSPN? I have made this work with other SPN libraries out there, but I want to use libSPN due to DPC-SPN and the heavy utilisation of GPU.
Thanks,
Aaron.
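The two cases described above can be sketched for a single univariate Gaussian leaf in plain Python. This is only an illustration of the arithmetic, not LibSPN code: the function names and the "state" labels are assumptions made up for this example. A fully corrupted variable integrates to 1, and a variable whose true value is known to lie below the observed value contributes the Gaussian CDF at that value.

```python
import math


def normal_pdf(x, loc, scale):
    """Density of a univariate Gaussian at x."""
    z = (x - loc) / scale
    return math.exp(-0.5 * z * z) / (scale * math.sqrt(2.0 * math.pi))


def normal_cdf(x, loc, scale):
    """Integral of the Gaussian density from -infinity to x."""
    return 0.5 * (1.0 + math.erf((x - loc) / (scale * math.sqrt(2.0))))


def leaf_value(x, loc, scale, state):
    """Value a Gaussian leaf passes upward (hypothetical helper).

    state == "observed":      the variable is clean, use the density
    state == "missing":       fully corrupted, integrate over the whole real line
    state == "upper_bounded": true value lies in (-infinity, x], use the CDF
    """
    if state == "observed":
        return normal_pdf(x, loc, scale)
    if state == "missing":
        return 1.0
    if state == "upper_bounded":
        return normal_cdf(x, loc, scale)
    raise ValueError("unknown state: %r" % (state,))


if __name__ == "__main__":
    for state in ("observed", "missing", "upper_bounded"):
        print(state, leaf_value(0.5, loc=0.0, scale=1.0, state=state))
```

Whether a leaf node in LibSPN can be given such a per-variable state is exactly the open question of this issue; the sketch only pins down what the leaf would have to compute.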
Hi,
I have modified tutorial 2 to use normal leaves and to use the evidence indicator feed. Was curious why the output for no evidence is 0.2, and not 1.0? It seems to change to whatever I change the first weight of the root to (0.2 at the moment). Is it something I am missing?
Here is the output from the script (see real_x_data and mask in the script, where mask is the evidence indicator)
[[0.07978846]
[0.03167224]
[0.2 ]
[0.2 ]]
[[0.03351115]
[0.01330234]
[0.08400001]
[0.08400001]]
Here is the modified script of tutorial 2:
import libspn as spn
import tensorflow as tf
num_vars = 2
num_leaf_components = 2
scale_init = 1.0
loc_init = 1.0
evidence_indicator_feed = tf.placeholder(tf.bool, shape=[None, num_vars], name='evidence_indicator_feed')
normal_x = spn.NormalLeaf(num_components=num_leaf_components, num_vars=num_vars,
trainable_scale=False, trainable_loc=True, scale_init=scale_init, loc_init=loc_init, evidence_indicator_feed=evidence_indicator_feed)
sum_11 = spn.Sum((normal_x, [0,1]), name="sum_11")
sum_11.generate_weights(initializer=tf.initializers.constant([0.4, 0.6]))
sum_12 = spn.Sum((normal_x, [0,1]), name="sum_12")
sum_12.generate_weights(initializer=tf.initializers.constant([0.1, 0.9]))
sum_21 = spn.Sum((normal_x, [2,3]), name="sum_21")
sum_21.generate_weights(initializer=tf.initializers.constant([0.7, 0.3]))
sum_22 = spn.Sum((normal_x, [2,3]), name="sum_22")
sum_22.generate_weights(initializer=tf.initializers.constant([0.8, 0.2]))
prod_1 = spn.Product(sum_11, sum_21, name="prod_1")
prod_2 = spn.Product(sum_11, sum_22, name="prod_2")
prod_3 = spn.Product(sum_12, sum_22, name="prod_3")
root = spn.Sum(prod_1, prod_2, prod_3, name="root")
root.generate_weights(initializer=tf.initializers.constant([0.2, 0.3, 0.5]))
indicator_y = root.generate_latent_indicators(name="indicator_y") # Can be added manually
print(root.get_num_nodes())
print(root.get_scope())
print(root.is_valid())
init_weights = spn.initialize_weights(root)
marginal_val = root.get_value(inference_type=spn.InferenceType.MARGINAL)
mpe_val = root.get_value(inference_type=spn.InferenceType.MPE)
mask = [
[True, False],
[True, True],
[False, False], # should output 1.0 for this?
[False, False], # should output 1.0 for this?
]
real_x_data = [
[1.0, 1.0],
[1.0, 1.1],
[1.0, 1.0],
[10.0, -1.0],
]
indicator_y_data = [[0], [0], [0], [0]]
with tf.Session() as sess:
    init_weights.run()
    marginal_val_arr = sess.run(marginal_val, feed_dict={normal_x: real_x_data, evidence_indicator_feed: mask, indicator_y: indicator_y_data})
    mpe_val_arr = sess.run(mpe_val, feed_dict={normal_x: real_x_data, evidence_indicator_feed: mask, indicator_y: indicator_y_data})
    print(marginal_val_arr)
    print(mpe_val_arr)
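The printed numbers can be reproduced by hand, which may help narrow down where the 0.2 comes from. The following is a plain-Python recomputation of the network in the script, not LibSPN code. It rests on one assumption: that feeding 0 for indicator_y acts as evidence selecting the first child of the root, so only the term with weight 0.2 survives. The helper names are made up for this sketch.

```python
import math


def normal_pdf(x, loc=1.0, scale=1.0):
    """Gaussian density with the loc_init and scale_init used in the script."""
    z = (x - loc) / scale
    return math.exp(-0.5 * z * z) / (scale * math.sqrt(2.0 * math.pi))


def root_value(x, mask, y=None, mpe=False):
    """Recompute the root value for one sample.

    x    : the two real values
    mask : True where the value counts as evidence, False where it is marginalized
    y    : index of the root child selected by the latent indicator, or None
           to sum (or maximize) over all children -- an assumption of this sketch
    mpe  : use max instead of sum at every sum node
    """
    combine = max if mpe else sum
    # A marginalized Gaussian leaf integrates to 1.
    leaf = [normal_pdf(v) if seen else 1.0 for v, seen in zip(x, mask)]

    def weighted(weights, value):
        # Both components of a variable share loc and scale in the script.
        return combine(w * value for w in weights)

    sum_11 = weighted([0.4, 0.6], leaf[0])
    sum_12 = weighted([0.1, 0.9], leaf[0])
    sum_21 = weighted([0.7, 0.3], leaf[1])
    sum_22 = weighted([0.8, 0.2], leaf[1])
    prods = [sum_11 * sum_21, sum_11 * sum_22, sum_12 * sum_22]
    terms = [w * p for w, p in zip([0.2, 0.3, 0.5], prods)]
    if y is not None:
        terms = [terms[y]]
    return combine(terms)


if __name__ == "__main__":
    no_evidence = ([1.0, 1.0], [False, False])
    print(root_value(*no_evidence, y=0))            # latent indicator fixed to 0
    print(root_value(*no_evidence, y=None))         # latent variable summed out
    print(root_value(*no_evidence, y=0, mpe=True))  # MPE value
```

Under that assumption the marginal value with no evidence on x is about 0.2 and the MPE value about 0.084, matching the last two rows of both printed arrays, while summing over the latent variable as well gives about 1.0. If LibSPN treats a particular indicator value as "no evidence", feeding that value for indicator_y would be the thing to try.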
There is a small bug in the Concat node:
class Concat(OpNode):
    def __init__(self, *inputs, name="Concat", axis=1):
        super().__init__(InferenceType.MARGINAL, name)
        ...
name is passed in as the second argument in the superclass init call.
Looking at the constructor for OpNode:
class OpNode(Node):
    def __init__(self, inference_type=InferenceType.MARGINAL, gradient_type=GradientType.SOFT,
                 name=None):
        ...
The gradient_type argument comes before name. This causes gradient_type to be assigned the value of name, a string.
Commits discovered by blame:
In concat.py: 40e2369
In node.py: ba537a4
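The effect of the positional argument, and one possible fix, can be shown with small stand-in classes. These are not the LibSPN classes: they only copy the constructor signatures quoted above, and the enum definitions are placeholders. Passing name as a keyword argument is one way to avoid the mix-up.

```python
from enum import Enum


class InferenceType(Enum):   # placeholder for the real enum
    MARGINAL = 0


class GradientType(Enum):    # placeholder for the real enum
    SOFT = 0


class OpNode:
    """Stand-in with the same constructor signature as quoted in the issue."""

    def __init__(self, inference_type=InferenceType.MARGINAL,
                 gradient_type=GradientType.SOFT, name=None):
        self.inference_type = inference_type
        self.gradient_type = gradient_type
        self.name = name


class BuggyConcat(OpNode):
    def __init__(self, *inputs, name="Concat", axis=1):
        # name lands in the gradient_type slot because it is passed positionally
        super().__init__(InferenceType.MARGINAL, name)


class FixedConcat(OpNode):
    def __init__(self, *inputs, name="Concat", axis=1):
        # passing name by keyword keeps gradient_type at its default
        super().__init__(InferenceType.MARGINAL, name=name)


if __name__ == "__main__":
    print(BuggyConcat().gradient_type, BuggyConcat().name)
    print(FixedConcat().gradient_type, FixedConcat().name)
```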
Issue associated with LIP3.
Issue associated with LIP13.
Hey guys,
first of all, thanks a lot for your work on SPNs and this nice library!
I'm currently checking out your library's features and found a missing import in the IPython notebook for tutorial 1c:
"import tensorflow as tf"
should do the trick.
Thanks for the fix,
Tobias
In the 'spn_saving_loading' ipynb example, a simple dense SPN is created and saved to files, once before initializing the weights and once after initializing them. Then the original network is evaluated in a session, followed by loading the two previously saved networks from the respective files and evaluating each of them in a session. The example states that the outputs of the original network and of the network loaded from the post-initialization file should be the same, but they are not.
Printing the weights of the root node of the two networks shows that the weights of the second network differ from those of the first, indicating a potential bug in the spn.initialize_weights() function.
Scattering of values is now achieved by several indirect Ops (padding + gathering). We can instead use the indices passed to scatter_values directly through tf.SparseTensor and tf.sparse.to_dense, which will potentially be faster.
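The difference between the two strategies can be illustrated without TensorFlow. The sketch below uses a simplified, assumed semantics (one value per row, placed at one column of a zero-filled row); the real scatter_values may differ, and both function names are made up here. The second function builds (row, column) index pairs and values, which is the same kind of input that tf.SparseTensor(indices, values, dense_shape) followed by tf.sparse.to_dense would take.

```python
def scatter_by_pad_and_gather(values, cols, num_out_cols):
    """Indirect route: pad each value with a fill element, then gather."""
    out = []
    for value, col in zip(values, cols):
        padded = [0.0, value]  # slot 0 holds the fill value, slot 1 the payload
        gather_idx = [1 if j == col else 0 for j in range(num_out_cols)]
        out.append([padded[i] for i in gather_idx])
    return out


def scatter_direct(values, cols, num_out_cols):
    """Direct route: write each value at its (row, col) index."""
    indices = [(row, col) for row, col in enumerate(cols)]
    dense = [[0.0] * num_out_cols for _ in values]
    for (row, col), value in zip(indices, values):
        dense[row][col] = value
    return dense


if __name__ == "__main__":
    vals, cols = [0.5, 0.25, 2.0], [2, 0, 1]
    print(scatter_by_pad_and_gather(vals, cols, 4))
    print(scatter_direct(vals, cols, 4))
```

Both functions produce the same dense result; the direct route skips the intermediate padded tensor and the per-column gather index, which is where the potential speed-up would come from.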
A lot of the pull requests are not valid anymore since most of the code was merged together into dev. Those old pull requests related to features that are already merged into dev should be closed.
This issue will be closed when all PRs are either:
libspn-future milestone if they contain code that is not yet ready for prime time
libspn-release project
Hello developers of LibSPN
I'm trying to save the convSPN created in your tutorial 7 with the code:
spn.JSONSaver('convSPN.spn', pretty=True).save(root)
after training. It is returning the error:
[50, 27, 26, ..., 20, 21, 61]],
[[38, 23, 29, ..., 8, 14, 16],
[23, 20, 10, ..., 33, 10, 50]]]) is not JSON serializable
from the file _encode_json(obj)
Is there something I can do to save it?
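The truncated message suggests that an array-like object (possibly a NumPy array) reached the JSON encoder. As a general workaround, independent of LibSPN, the standard json module can be taught to convert such objects to nested lists. The sketch below is an assumption-laden illustration: ArrayFriendlyEncoder and FakeArray are made-up names, and whether spn.JSONSaver allows a custom encoder to be plugged in is not known from this report.

```python
import json


class ArrayFriendlyEncoder(json.JSONEncoder):
    """Encode any object exposing tolist() (as NumPy arrays do) as a list."""

    def default(self, obj):
        to_list = getattr(obj, "tolist", None)
        if callable(to_list):
            return to_list()
        # Fall back to the standard behaviour, which raises TypeError
        return super().default(obj)


class FakeArray:
    """Tiny stand-in for an array type so the sketch runs without NumPy."""

    def __init__(self, rows):
        self._rows = rows

    def tolist(self):
        return [list(row) for row in self._rows]


if __name__ == "__main__":
    payload = {"indices": FakeArray([[38, 23, 29], [23, 20, 10]])}
    print(json.dumps(payload, cls=ArrayFriendlyEncoder))
```

If the saver cannot take a custom encoder, converting the offending attribute with tolist() before saving would be the equivalent manual step.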
The code already in dev should be reviewed by the original authors to make sure it is of sufficiently high quality for merging into master. Basic clean-up should happen for existing features, but no new significant features should be introduced (unless they are required as part of the cleanup).
When this issue is closed, the dev branch will be merged into master.
New features potentially targeted for the release should be sent as a PR against dev and added to the libspn-release "needs triage" column. These PRs will either be merged to dev before it is merged with master or, more likely, will be re-based against master after the merge.
Issue associated with LIP17.
The tutorial about Wicker SPNs is not working. On execution, I get:
Testing: 0%| | 0/313 [00:00<?, ?it/s]Traceback (most recent call last):
File "/home/steven/test/env/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1356, in _do_call
return fn(*args)
File "/home/steven/test/env/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1341, in _run_fn
options, feed_dict, fetch_list, target_list, run_metadata)
File "/home/steven/test/env/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1429, in _call_tf_sessionrun
run_metadata)
tensorflow.python.framework.errors_impl.UnimplementedError: Generic conv implementation only supports NHWC tensor format for now.
[[{{node LogValue_1/ConvProductsDepthwise/Conv2D}}]]
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "python/benchmarks/performance.py", line 417, in <module>
result = run()
File "python/benchmarks/performance.py", line 131, in run_wicker_spn
batch_matches = sess.run(match_op, fd(batch_x, batch_y))
File "/home/steven/test/env/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 950, in run
run_metadata_ptr)
File "/home/steven/test/env/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1173, in _run
feed_dict_tensor, options, run_metadata)
File "/home/steven/test/env/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1350, in _do_run
run_metadata)
File "/home/steven/test/env/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1370, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.UnimplementedError: Generic conv implementation only supports NHWC tensor format for now.
[[node LogValue_1/ConvProductsDepthwise/Conv2D (defined at /home/steven/test/env/lib/python3.6/site-packages/libspn/graph/op/conv_products_depthwise.py:64) ]]
Errors may have originated from an input operation.
Input Source operations connected to node LogValue_1/ConvProductsDepthwise/Conv2D:
LogValue_1/ConvProductsDepthwise/ones (defined at /home/steven/test/env/lib/python3.6/site-packages/libspn/graph/op/conv_products_depthwise.py:60)
LogValue_1/ConvProductsDepthwise/Reshape_1 (defined at /home/steven/test/env/lib/python3.6/site-packages/libspn/graph/op/conv_products_depthwise.py:72)
Original stack trace for 'LogValue_1/ConvProductsDepthwise/Conv2D':
File "python/benchmarks/performance.py", line 417, in <module>
result = run()
File "python/benchmarks/performance.py", line 113, in run_wicker_spn
predictions, = mpe_state.get_state(root_marginalized, marginalized_ivs)
File "/home/steven/test/env/lib/python3.6/site-packages/libspn/inference/mpe_state.py", line 45, in get_state
self._mpe_path.get_mpe_path(root)
File "/home/steven/test/env/lib/python3.6/site-packages/libspn/inference/mpe_path.py", line 72, in get_mpe_path
self._value.get_value(root)
File "/home/steven/test/env/lib/python3.6/site-packages/libspn/inference/value.py", line 65, in get_value
return compute_graph_up(root, val_fun=fun, all_values=self._values)
File "/home/steven/test/env/lib/python3.6/site-packages/libspn/graph/algorithms.py", line 61, in compute_graph_up
last_val = val_fun(next_node, *input_vals)
File "/home/steven/test/env/lib/python3.6/site-packages/libspn/inference/value.py", line 59, in fun
return node._compute_log_value(*args, **kwargs)
File "/home/steven/test/env/lib/python3.6/site-packages/libspn/utils/lrucache.py", line 71, in helper
memo[key] = f(*args, **kwargs)
File "/home/steven/test/env/lib/python3.6/site-packages/libspn/graph/op/conv_products_depthwise.py", line 64, in _compute_log_value
data_format='NCHW'
File "/home/steven/test/env/lib/python3.6/site-packages/tensorflow/python/ops/nn_ops.py", line 1953, in conv2d
name=name)
File "/home/steven/test/env/lib/python3.6/site-packages/tensorflow/python/ops/gen_nn_ops.py", line 1071, in conv2d
data_format=data_format, dilations=dilations, name=name)
File "/home/steven/test/env/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 788, in _apply_op_helper
op_def=op_def)
File "/home/steven/test/env/lib/python3.6/site-packages/tensorflow/python/util/deprecation.py", line 507, in new_func
return func(*args, **kwargs)
File "/home/steven/test/env/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3616, in create_op
op_def=op_def)
File "/home/steven/test/env/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 2005, in __init__
self._traceback = tf_stack.extract_stack()
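According to the trace, the failing op is a Conv2D created with data_format='NCHW', and the error says the generic implementation only supports NHWC, which commonly shows up when the op runs on a CPU. A usual workaround is to transpose the input to NHWC, convolve, and transpose the result back. The sketch below only demonstrates the two axis permutations on shape tuples; the helper name is made up, and applying this inside conv_products_depthwise.py is a suggestion, not a tested patch.

```python
# Axis permutations as they would be passed to a transpose call (perm argument)
NCHW_TO_NHWC = (0, 2, 3, 1)
NHWC_TO_NCHW = (0, 3, 1, 2)


def permute_shape(shape, perm):
    """Return the shape a tensor would have after transposing with perm."""
    return tuple(shape[axis] for axis in perm)


if __name__ == "__main__":
    nchw = (32, 3, 28, 28)                   # batch, channels, height, width
    nhwc = permute_shape(nchw, NCHW_TO_NHWC)
    print(nhwc)                              # batch, height, width, channels
    print(permute_shape(nhwc, NHWC_TO_NCHW))
```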
Hey,
I am in the process of working my way into libspn, but stumbled upon a very basic question:
Is there already support for learning SPNs with continuous variables? I have already seen the examples for (Hard) EM learning and they seem to work for me with discrete variables. I have also seen that there is a node called "ContVars", but when I tried changing the IVs to ContVars, the learning broke completely, giving: Avg likelihood (this batch data on previous weights): nan
Are ContVars supposed to work with EM and I am doing something wrong? In that case I would try to create a minimal example and dig deeper into the issue. Also: Are there any examples with continuous variables and learning?
Otherwise if there is no support for learning with ContVars right now: What would be the best practice to work around this problem? Simply discretizing the variables and using lots of indicator variables does not seem like a good idea to me.
I would very much appreciate any feedback!
Kind regards!
Implement leaf distributions based on tensorflow-probability. By using tensorflow-probability, we can easily support a wide range of pdfs.
I'm now working on an SPN model that uses discriminative GD learning. I would like to compute the loss on the training dataset as well as on the testing dataset, so as to see if the model overfits. I noticed that in gd.py there are functions cross_entropy_loss and mle_loss, which internally create an operation called value_gen. This object is exactly what would be helpful for my situation, and I wonder if you could make it accessible externally? Right now I have to modify the code in gd.py myself, which is not a good thing to do.
Issue associated with LIP5.
Issue associated with LIP8.
Issue associated with LIP10.
Issue associated with LIP14.
Hi,
Thank you for creating this library.
Initially, I had TensorFlow 2.8 installed.
I am trying to run the command "import libspn as spn" but it keeps giving me the error " No module named 'tensorflow.contrib'".
I tried to install TensorFlow 1.13.2, but then it gives a different error "This version of TensorFlow Probability requires TensorFlow version >= 2.8; Detected an installation of version 1.13.2. Please upgrade TensorFlow to proceed."
What should I do?
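A quick way to see which versions are actually installed, and whether the installed TensorFlow can still provide tensorflow.contrib, is sketched below. The helper names are made up for this example. The only rule it encodes is that contrib is absent from TensorFlow 2.x; another report on this page mentions running TensorFlow 1.14 together with tensorflow_probability 0.7.0, which may be a pairing worth trying, though that is not confirmed here.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def has_contrib(tf_version):
    """tensorflow.contrib exists only before TensorFlow 2.x."""
    return int(tf_version.split(".")[0]) < 2


if __name__ == "__main__":
    for package in ("tensorflow", "tensorflow-probability"):
        print(package, installed_version(package))
    tf_version = installed_version("tensorflow")
    if tf_version is not None:
        print("tensorflow.contrib available:", has_contrib(tf_version))
```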
Issue associated with LIP9.
Hi, I have been able to get tutorials 1a-1c, 2, and 6 to work fine. However, when trying to get tutorials 3 and 4 to work, the following error occurs:
[WARNING] [tensorflow:getattr] From tutorial_4.py:50: The name tf.reset_default_graph is deprecated. Please use tf.compat.v1.reset_default_graph instead.
[WARNING] [tensorflow:getattr] From /home/aaron/tf/lib/python3.6/site-packages/libspn/graph/node.py:40: The name tf.get_default_graph is deprecated. Please use tf.compat.v1.get_default_graph instead.
[WARNING] [tensorflow:getattr] From /home/aaron/tf/lib/python3.6/site-packages/libspn/graph/leaf/indicator.py:63: The name tf.placeholder is deprecated. Please use tf.compat.v1.placeholder instead.
[WARNING] [tensorflow:getattr] From /home/aaron/tf/lib/python3.6/site-packages/libspn/graph/leaf/indicator.py:91: The name tf.log is deprecated. Please use tf.math.log instead.
It seems to be related to the following lines:
hard_em_learning = spn.HardEMLearning(root=root)
update_op = hard_em_learning.accumulate_and_update_weights()
llh_op = tf.reduce_mean(root.get_log_value())
and something going on here:
File "/home/aaron/tf/lib/python3.6/site-packages/libspn/utils/math.py", line 186, in gather_cols_3d
pad_elem = np.array(pad_elem).astype(tf.DType(params.dtype).as_numpy_dtype)
File "/home/aaron/tf/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py", line 80, in init
type_enum = int(type_enum)
TypeError: int() argument must be a string, a bytes-like object or a number, not 'DType'
Is there a package that I don't have that would cause this?
I am running tensorflow 1.14 and tensorflow_probability 0.7.0.
I have not tried tutorial 5.
Issue associated with LIP6.
While experimenting with various SPN sizes, I ran into the problem that the runtime quickly becomes very slow.
For example, using the following SPN (24k nodes) on an i7 machine:
gen = spn.DenseSPNGenerator(num_decomps=2, num_subsets=2, num_mixtures=2)
iv_x = spn.IVs(num_vars=20, num_components=4)
root = gen.generate(iv_x)
class_roots = [gen.generate(iv_x) for _ in range(10)]
root = spn.Sum(*class_roots)
accumulate_updates = learning.accumulate_updates() takes more than 6 minutes
mpe_state_gen.get_state(root, iv_x, latent) takes nearly 3 minutes
After generating all required weights and ops, the process already takes about 17 GB of memory without having any training data loaded yet.
Those numbers seem very high to me, especially the runtime for accumulate_updates and the total memory usage.
Am I underestimating the workload or is there something wrong?
I attached the IPython notebook I used for the time measurements for reference.
I would be very glad if someone could take a look and tell me if I am doing anything wrong or if this is normal.
Thanks in advance!
Kind regards,
Simon
Issue associated with LIP2.
TensorFlow 1.14 is not compatible with the MacBook M1.
I get the error: "ModuleNotFoundError: No module named 'tensorflow.contrib'". This occurs because the contrib module was removed in TensorFlow 2.x. This module was a part of TensorFlow 1.x, but it was deprecated and is no longer available in TensorFlow 2.x. Unfortunately, only TensorFlow 2.x and above are compatible with the M1 chip.
Issue associated with LIP11.
Issue associated with LIP1.
Issue associated with LIP16.
Issue associated with LIP7.
The current SPN graph visualization is quite limited, hard to read, and unusable for larger graphs.
Some desired features:
Issue associated with LIP15.
Issue associated with LIP4.
TL;DR: how should I duplicate an SPN? I have a big IVs and I want the duplicated SPNs to take different subsets of indicators from this big IVs. How should I do that?
I am now on commit a649c62 (09/10/2018) "batch noise now shuffles instead of rolls".
In a prior version of libspn that I used (on the master branch), I was able to duplicate an SPN by using my own compute_graph_up function, pasted below. The key difference is that this function deals with the case where there are indicator variables specified in the format of a tuple (node, indices).
from collections import deque, defaultdict

def mod_compute_graph_up(root, val_fun, **kwargs):
    all_values = {}
    stack = deque()  # Stack of inputs to process
    stack.append((root, None))  # node and index
    while stack:
        next_input = stack[-1]
        # Was this node already processed?
        # This might happen if the node is referenced by several parents
        if next_input not in all_values:
            if next_input[0].is_op:
                # OpNode
                input_vals = []  # inputs to the node of 'next_input'
                all_input_vals = True
                # Gather input values for non-const val fun
                for inpt in next_input[0].inputs:
                    if inpt:  # Input is not empty
                        try:
                            # Check if input_node in all_vals
                            if inpt.indices is None:
                                input_vals.append(all_values[(inpt.node, None)])
                            else:
                                input_vals.append(all_values[(inpt.node, tuple(inpt.indices))])
                        except KeyError:
                            all_input_vals = False
                            if inpt.indices is None:
                                stack.append((inpt.node, None))
                            else:
                                stack.append((inpt.node, tuple(inpt.indices)))
                    else:
                        # This input was empty, use None as value
                        input_vals.append(None)
                # Got all inputs?
                if all_input_vals:
                    last_val = val_fun(next_input, *input_vals, **kwargs)
                    all_values[next_input] = last_val
                    stack.pop()
            else:
                # VarNode, ParamNode
                last_val = val_fun(next_input, **kwargs)
                all_values[next_input] = last_val
                stack.pop()
        else:
            stack.pop()
    return last_val
Then I wrote a function that uses mod_compute_graph_up to copy the entire structure and parameters of a given SPN while keeping the same inputs. This worked well before, and I have documented how it works. But now that I have switched to newer code (a649c62), my SPN duplication code does not work any more. I get an error:
TypeError: unsupported operand type(s) for +: 'int' and 'NoneType'
I think this is due to some internal changes to the SPN structure. So how should I duplicate an SPN? I have a big IVs and I want the duplicated SPNs to take different subsets of indicators from this big IVs. How should I do that?
(apologies for posting so many issues)
There are a ton of issues open at this point, which we must re-evaluate and either assign to the release project, close, or mark for the future.
This task will be closed once all issues are either closed or assigned to the project libspn-release or to the milestone libspn-future.
An issue should be reviewed by the person who created it in the first place, but everyone is welcome to comment on and suggest the fate of any issue.
Issue associated with LIP12.
Hi, for tutorial 7, I was wondering which node from the graph I should give to sess.run() to get the log probability of each class given an observation.
E.g. for 32 observations from the MNIST set:
the size of batch_x is [32, 784]. I want the output of sess.run() to be of size [32, 10], where the entries are the log probabilities of each of the 10 classes.
Thank you very much!
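Independently of which node provides them, per-class log probabilities given an observation can be obtained by normalizing per-class joint log values: log p(y=c | x) = log p(x, y=c) - log sum_k p(x, y=k). The sketch below shows that normalization in plain Python for one observation. It assumes you can already fetch one joint log value per class from the graph, which is the part this question is really asking about; the function name is made up.

```python
import math


def class_log_posteriors(log_joint):
    """Turn per-class joint log values log p(x, y=c) into log p(y=c | x).

    Uses the log-sum-exp trick so that very negative log values do not
    underflow when exponentiated.
    """
    top = max(log_joint)
    log_norm = top + math.log(sum(math.exp(v - top) for v in log_joint))
    return [v - log_norm for v in log_joint]


if __name__ == "__main__":
    # Made-up joint log values for a single observation and 3 classes
    log_joint = [-1050.2, -1047.9, -1052.4]
    log_post = class_log_posteriors(log_joint)
    print(log_post)
    print(sum(math.exp(v) for v in log_post))  # should be close to 1
```

For a batch of 32 observations and 10 classes, applying this row by row to a [32, 10] array of joint log values gives the [32, 10] output described above.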