idsia / crema

Crema: Credal Models Algorithms

Home Page: https://crema-toolbox.readthedocs.io/

License: GNU Lesser General Public License v3.0

Java 100.00% Stata 0.01%
crema credal-models bayesian-models inference credal bayesian probabilistic-graphical-models imprecise-probability probability

crema's People

Contributors

alessandroantonucci, cbonesana, davidhuber, degiorgig, dependabot[bot], giorgiaauroraadorni, rcabanasdepaz


crema's Issues

Random seed for ArraysUtil::shuffle

The ArraysUtil::shuffle methods do not use the global Random object, so their output cannot be replicated. The same happens with the methods derived from them.

The following code does not always produce the same result, though it should.

        RandomUtil.setRandomSeed(10);
        VertexFactor vfi = VertexFactorUtilities.random(Strides.as(0,3), 3);
        System.out.println(vfi);

        BayesianFactor bf = BayesianFactorUtilities.random(Strides.as(0,3), Strides.empty());
        System.out.println(bf);

        double[] v = RandomUtil.sampleNormalized(3);
        System.out.println(Arrays.toString(v));

        double[] v2 = new double[]{0,1,2,3,4};
        ArraysUtil.shuffle(v2);
        System.out.println(Arrays.toString(v2));
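
A possible direction for the fix is to make shuffle draw its indices from the globally seeded generator. A minimal sketch, assuming a RandomUtil.getRandom() accessor for the global Random (the actual accessor may differ):

        // requires java.util.Random
        // Hypothetical fix: a Fisher-Yates shuffle driven by the global Random,
        // so that RandomUtil.setRandomSeed(...) makes the result reproducible.
        public static void shuffle(double[] array) {
            Random random = RandomUtil.getRandom(); // assumed accessor for the global Random
            for (int i = array.length - 1; i > 0; i--) {
                int j = random.nextInt(i + 1);
                double tmp = array[i];
                array[i] = array[j];
                array[j] = tmp;
            }
        }

Overloads for the other array types, and the derived methods, would follow the same pattern.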

Merge convex hull functionality (crepo updating)

The first step for integrating crepo is to merge the new convex hull functionality from the isipta21benchmark branch into dev. In short, this is a set of convex hull (CH) methods defined in ConvexHull.Method that can be applied to a matrix of doubles, to a VertexFactor, or during inference (with VE) after each marginalization.

A summary of this functionality is:

	DAGModel model = (DAGModel) IO.readUAI("./models/pgm-vcredal.uai");

	VertexFactor vf = (VertexFactor) model.getFactor(0);

	// Apply convex hull to a matrix of doubles
	ConvexHull.as(ConvexHull.Method.QUICK_HULL).apply(vf.getData()[0]);
	ConvexHull.as(ConvexHull.Method.LP_CONVEX_HULL).apply(vf.getData()[0]);
	ConvexHull.as(ConvexHull.Method.REDUCED_HULL_2).apply(vf.getData()[0]);

	//Apply to a factor (functional method)
	VertexFactor r = vf.convexHull(ConvexHull.Method.DEFAULT);

	//Apply to a factor (inline method)
	vf.applyConvexHull(ConvexHull.Method.REDUCED_HULL_3);

	//Variable elimination with convex hull (after marginalization)
	CredalVariableElimination inf = new CredalVariableElimination(model);
	
	inf.setConvexHullMarg(ConvexHull.Method.REDUCED_HULL_3);
	inf.query(0);

	inf.setConvexHullMarg(ConvexHull.Method.REDUCED_HULL_2);
	inf.query(0);

Pipelines and algorithms

A long time ago we added the Algorithm interface for Belief Propagation.
With my last commit of today, with the help of @davidhuber, I also added a pipeline class called Pipe. The idea is to use Pipe to build the Belief Propagation algorithm as a pipeline that can produce a self-updatable JunctionTree.

While doing so, I thought we need some kind of interfaces similar to those in sklearn or Spark:

  • a Transformer interface that acts like the current Algorithm interface and transforms one input into an output;
  • an Estimator interface that takes some inputs but does not produce any kind of output.

(Names are from Spark)
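
A minimal sketch of what these two interfaces could look like (purely illustrative; the names follow Spark as noted, and nothing below exists in crema yet):

        // Illustrative only: a Transformer maps one input to an output,
        // much like the current Algorithm interface.
        public interface Transformer<I, O> {
            O apply(I input);
        }

        // Illustrative only: an Estimator consumes inputs (e.g. to update an
        // internal state such as a JunctionTree) without producing an output.
        public interface Estimator<I> {
            void fit(I input);
        }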

Will this be somewhat helpful?

ArrayIndexOutOfBoundsException with CredalApproxLP

The following code


        // define the structure
        SparseModel cnet = new SparseModel();
        int a = cnet.addVariable(2);
        int b = cnet.addVariable(3);
        cnet.addParent(a,b);

        // add credal set K(B)
        SeparateHalfspaceFactor fb = new SeparateHalfspaceFactor(cnet.getDomain(b), Strides.empty());
        fb.addConstraint(new double[]{1,0,0}, Relationship.GEQ, 0.2);
        fb.addConstraint(new double[]{1,0,0}, Relationship.LEQ, 0.3);
        fb.addConstraint(new double[]{0,1,0}, Relationship.GEQ, 0.4);
        fb.addConstraint(new double[]{0,1,0}, Relationship.LEQ, 0.5);
        fb.addConstraint(new double[]{0,0,1}, Relationship.GEQ, 0.2);
        fb.addConstraint(new double[]{0,0,1}, Relationship.LEQ, 0.3);
        cnet.setFactor(b,fb);

        // add credal set K(A|B)
        SeparateHalfspaceFactor fa = new SeparateHalfspaceFactor(cnet.getDomain(a), cnet.getDomain(b));
        fa.addConstraint(new double[]{1,0}, Relationship.GEQ, 0.5, 0);
        fa.addConstraint(new double[]{1,0}, Relationship.LEQ, 0.6, 0);
        fa.addConstraint(new double[]{0,1}, Relationship.GEQ, 0.4, 0);
        fa.addConstraint(new double[]{0,1}, Relationship.LEQ, 0.5, 0);
        fa.addConstraint(new double[]{1,0}, Relationship.GEQ, 0.3, 1);
        fa.addConstraint(new double[]{1,0}, Relationship.LEQ, 0.4, 1);
        fa.addConstraint(new double[]{0,1}, Relationship.GEQ, 0.6, 1);
        fa.addConstraint(new double[]{0,1}, Relationship.LEQ, 0.7, 1);
        fa.addConstraint(new double[]{1,0}, Relationship.GEQ, 0.1, 2);
        fa.addConstraint(new double[]{1,0}, Relationship.LEQ, 0.2, 2);
        fa.addConstraint(new double[]{0,1}, Relationship.GEQ, 0.8, 2);
        fa.addConstraint(new double[]{0,1}, Relationship.LEQ, 0.9, 2);
        cnet.setFactor(a,fa);

        // set up the inference and run the queries
        Inference inf = new CredalApproxLP(cnet);
        IntervalFactor res1 = (IntervalFactor) inf.query(b, ObservationBuilder.observe(a, 0));
        IntervalFactor res2 = (IntervalFactor) inf.query(b);

        System.out.println(res1);
        System.out.println(res2);

raises the exception

Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: Index 1 out of bounds for length 1

DAGModel converters

Currently, converting a full model from one kind of specification to another requires copying the structure and then converting each of the factors one by one.

  • Encapsulate this functionality.
  • Create a proper class structure for allowing all the possible conversion types (and those not implemented yet).

ApproxLP unstable results

ApproxLP occasionally produces NaN results for queries that are solvable.

Consider this code:

        double eps = 0.000000001; // eps > 0. If set to 0.0, the inference will not work.

        DAGModel model = new DAGModel();
        int x = model.addVariable(2);
        int u = model.addVariable(3);
        model.addParent(x,u);

        BayesianFactor ifx = new BayesianFactor(model.getDomain(x,u));
        ifx.setData(new double[] {
                1., 0.,
                1., 0.,
                0., 1.,
        });
        model.setFactor(x, ifx);

        IntervalFactor ifu = new IntervalFactor(model.getDomain(u), model.getDomain());
        ifu.set(new double[] { 0, 0, 0.8-eps}, new double[] { 0.2, 0.2, 0.8 });
        model.setFactor(u, ifu);

        for(int i=0;i<100;i++) {
            ApproxLP2 inference = new ApproxLP2();
            double[] upper = inference.query(model, x).getUpper();
            System.out.println(Arrays.toString(upper));
        }

The output would be:

[0.20000000099999993, 0.8]
[0.20000000099999993, 0.8]
[0.20000000099999993, 0.8]
[0.20000000099999993, 0.8]
[0.20000000099999993, 0.8]
[0.20000000099999993, 0.8]
[0.20000000099999993, 0.8]
[0.20000000099999993, 0.8]
[0.20000000099999993, 0.8]
[0.20000000099999993, 0.8]
[NaN, NaN]
[0.20000000099999993, 0.8]
[0.20000000099999993, 0.8]
[0.20000000099999993, 0.8]
[0.20000000099999993, 0.8]
[0.20000000099999993, 0.8]
[0.20000000099999993, 0.8]
[0.20000000099999993, 0.8]
[0.20000000099999993, 0.8]
[0.20000000099999993, 0.8]
...

Make credal parsers consistent with UAI endianness

In V-CREDAL and H-CREDAL:

UAI format seems to consider a big endian encoding (first variable is the most significant one):

[p(0,0), p(0,1), p(0,2), p(1,0), p(1,1), p(1,2)]

while Crema is the opposite (little endian):

[p(0,0), p(1,0), p(0,1), p(1,1), p(0,2), p(1,2)]

In the file, the values should be encoded as big endian following the UAI format, and then translated into the crema format when read. (The current implementation assumes that files are in little endian.)
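
A small re-indexing sketch (a hypothetical helper, not existing crema code) that translates a flattened table from the UAI big-endian order to the crema little-endian order, given the variable cardinalities in UAI order:

        static double[] uaiToCremaOrder(double[] uaiData, int[] card) {
            double[] cremaData = new double[uaiData.length];
            for (int big = 0; big < uaiData.length; big++) {
                // decode the big-endian index: the last variable varies fastest
                int[] state = new int[card.length];
                int rem = big;
                for (int v = card.length - 1; v >= 0; v--) {
                    state[v] = rem % card[v];
                    rem /= card[v];
                }
                // re-encode as little-endian: the first variable varies fastest
                int little = 0, stride = 1;
                for (int v = 0; v < card.length; v++) {
                    little += state[v] * stride;
                    stride *= card[v];
                }
                cremaData[little] = uaiData[big];
            }
            return cremaData;
        }

With card = {2, 3}, this maps [p(0,0), p(0,1), p(0,2), p(1,0), p(1,1), p(1,2)] to [p(0,0), p(1,0), p(0,1), p(1,1), p(0,2), p(1,2)], matching the two orderings above.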

Regarding the domain, it seems that the conditioned variable (i.e., the variable on the left) is always specified last. For V-CREDAL and H-CREDAL we should also change this to the UAI style.

NOTE: check the BAYES parser where all this has already been solved.

Fix all tests

XMLtest and StructuralCausalModel::merge tests have been disabled. Fix them.

Posterior query without non-negative constraints (ApproxLP)

When running a posterior query without non-negative constraints and without the epsilon perturbation of 0.0 values, the inference can fail. The reason is that the factor in the denominator contains zeros; this produces many NaN values, which cannot be handled by the optimiser.

One possible solution is to keep the non-negative constraints and perturb them by adding an epsilon value. However, this can make inference intractable in large networks. Find a solution that allows omitting the non-negative constraints.

Code example:

        double eps = 0.0000001;
        String prj_folder = ".";

        SparseModel model = (SparseModel) IO.read(prj_folder+"/models/chain3-nonmarkov.uai");
        for(int v : model.getVariables()) {
            SeparateHalfspaceFactor f = (SeparateHalfspaceFactor) model.getFactor(v);
            f = f.mergeCompatible();
            f = f.removeNormConstraints();
            //f = f.removeNonNegativeConstraints();   // Not working WITH this
            f = f.getPerturbedZeroConstraints(eps); // Not working WITHOUT this

            model.setFactor(v, f);
        }


        for(int v : model.getVariables()) {
            ((SeparateHalfspaceFactor) model.getFactor(v)).printLinearProblem();
        }


        CredalApproxLP inf = new CredalApproxLP(model);
        SparseModel infModel = (SparseModel) inf.getInferenceModel(1, ObservationBuilder.observe(2,0));

        System.out.println(
                inf.query(1, ObservationBuilder.observe(0,0))
        );

Read auxiliary uai files

These files can be:

  • Evidence file with extension .uai.evid (a reader sketch is shown after this list)
  • Intervention file with extension .uai.do
  • Query file with extension .uai.query
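
A reader sketch for the evidence file, assuming a UAI-style layout of a count followed by variable/value pairs (the actual layout of the .uai.do and .uai.query files would need analogous readers):

        // requires java.util.Scanner, java.io.File, java.io.IOException
        // and gnu.trove's TIntIntMap / TIntIntHashMap
        public static TIntIntMap readEvidence(String filename) throws IOException {
            try (Scanner sc = new Scanner(new File(filename))) {
                int n = sc.nextInt();                         // number of observed variables
                TIntIntMap evidence = new TIntIntHashMap();
                for (int i = 0; i < n; i++) {
                    evidence.put(sc.nextInt(), sc.nextInt()); // variable id, observed state
                }
                return evidence;
            }
        }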

Basic Usage Tutorials

Implement basic usage tutorials that describe:

  • Domains
  • Factors
  • Graphical Networks
  • Inference
  • IO

Remove ch.idsia.crema.model.graphica.Graph

We should remove ch.idsia.crema.model.graphica.Graph and use the JGraphT Graph instead. With this change we will also get rid of ch.idsia.crema.model.graphica.SparseList.

This WILL slightly impact credici, as the getVariables methods will no longer be available and will have to be replaced with calls to vertexSet and the other JGraphT methods.

Belief Propagation issues with Naive Bayes-like networks

When we have a Naive Bayes-like structure (one parent with many children), Belief Propagation behaves strangely: the cliques are created correctly (i.e. each clique contains one child and the parent), but when the messages pass from a clique to the root, all information is lost (the message being passed has no domain).

See 49f2b8f for details.
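
A minimal model of the kind that triggers this (a sketch assembled from the construction calls used elsewhere in these issues; the factor values are arbitrary):

        DAGModel model = new DAGModel();
        int parent = model.addVariable(2);

        // three children, each with the same parent (Naive Bayes-like structure)
        int[] children = new int[3];
        for (int i = 0; i < children.length; i++) {
            children[i] = model.addVariable(2);
            model.addParent(children[i], parent);

            // arbitrary CPT P(child | parent); the child index varies fastest
            BayesianFactor f = new BayesianFactor(model.getDomain(children[i], parent));
            f.setData(new double[]{0.9, 0.1, 0.2, 0.8});
            model.setFactor(children[i], f);
        }

        // arbitrary prior P(parent)
        BayesianFactor fp = new BayesianFactor(model.getDomain(parent));
        fp.setData(new double[]{0.3, 0.7});
        model.setFactor(parent, fp);

Running Belief Propagation on such a model should expose the empty-domain messages described above.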

Wrong data length after filtering a SeparateHalfspaceFactor

After using the filter operation with a conditioning variable on a SeparateHalfspaceFactor, the resulting data object in the factor has the wrong length: positions not consistent with the filter are simply set to an empty list, instead of being removed.

Consider the following code:


	int x = 0, y=1, z=2;

	// P(x|y,z)
	SeparateHalfspaceFactorFactory ff = SeparateHalfspaceFactorFactory
			.factory().domain(Strides.as(x,2), Strides.as(y,2,z,2));

	

	ff.constraint(new double[]{1,0}, Relationship.EQ, 1.0, 0);
	ff.constraint(new double[]{0,1}, Relationship.EQ, 0.0, 0);

	ff.constraint(new double[]{1,0}, Relationship.EQ, 0.2, 1);
	ff.constraint(new double[]{0,1}, Relationship.EQ, 0.8, 1);

	ff.constraint(new double[]{1,0}, Relationship.EQ, 0.9, 2);
	ff.constraint(new double[]{0,1}, Relationship.EQ, 0.1, 2);

	ff.constraint(new double[]{1,0}, Relationship.EQ, 0.5, 3);
	ff.constraint(new double[]{0,1}, Relationship.EQ, 0.5, 3);


	SeparateHalfspaceFactor f = ff.get();

	TIntObjectMap data = f.getData();
	for(int i=0; i<data.size(); i++)
		System.out.println(i+": "+data.get(i));

	System.out.println("");

	data = f.filter(y,0).getData();
	for(int i=0; i<data.size(); i++)
		System.out.println(i+": "+data.get(i));


	/* Output:
	
	0: [org.apache.commons.math3.optim.linear.LinearConstraint@de6ed1f6, org.apache.commons.math3.optim.linear.LinearConstraint@607ed1f6]
	1: [org.apache.commons.math3.optim.linear.LinearConstraint@47ced1f5, org.apache.commons.math3.optim.linear.LinearConstraint@c60ed1f5]
	2: [org.apache.commons.math3.optim.linear.LinearConstraint@12bed1f7, org.apache.commons.math3.optim.linear.LinearConstraint@c65ed1f5]
	3: [org.apache.commons.math3.optim.linear.LinearConstraint@de7ed1f6, org.apache.commons.math3.optim.linear.LinearConstraint@5f9ed1f6]
	
	0: [org.apache.commons.math3.optim.linear.LinearConstraint@de6ed1f6, org.apache.commons.math3.optim.linear.LinearConstraint@607ed1f6]
	1: []
	2: [org.apache.commons.math3.optim.linear.LinearConstraint@12bed1f7, org.apache.commons.math3.optim.linear.LinearConstraint@c65ed1f5]
	3: []

	 */

whereas the output for the filtered factor should be

	0: [org.apache.commons.math3.optim.linear.LinearConstraint@de6ed1f6, org.apache.commons.math3.optim.linear.LinearConstraint@607ed1f6]
	1: [org.apache.commons.math3.optim.linear.LinearConstraint@12bed1f7, org.apache.commons.math3.optim.linear.LinearConstraint@c65ed1f5]


Builder of observation list

Build a list of observations from a 2D array of data and a 1D array of variable IDs. Integrate it into ch.idsia.crema.model.ObservationBuilder?

Example:

        int[] X = {0, 1, 2};        // example variable IDs, one per column of dataX

        int[][] dataX = {
                {0, 0, 0},
                {1, 1, 1},
                {0, -1, 0},         // negative entries denote missing values
                {1, -1, 1}
        };

        TIntIntMap[] observations = new TIntIntMap[dataX.length];
        for(int i=0; i<observations.length; i++) {
            observations[i] = new TIntIntHashMap();
            for(int j=0; j<dataX[i].length; j++) {
                if(dataX[i][j]>=0)
                    observations[i].put(X[j], dataX[i][j]);
            }
        }
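
A sketch of the requested builder (the method name and its placement in ObservationBuilder are assumptions): it simply wraps the loop above, so the observation list can be produced directly from the data matrix and the variable IDs, treating negative entries as missing values.

        public static TIntIntMap[] observeAll(int[] vars, int[][] data) {
            TIntIntMap[] observations = new TIntIntMap[data.length];
            for (int i = 0; i < data.length; i++) {
                observations[i] = new TIntIntHashMap();
                for (int j = 0; j < data[i].length; j++) {
                    if (data[i][j] >= 0)                      // skip missing values
                        observations[i].put(vars[j], data[i][j]);
                }
            }
            return observations;
        }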

Crepo compatibility with crema 0.2.0

I have done a first review of the crepo Java code to make it compatible with crema 0.2.0 (https://github.com/IDSIA/crepo/tree/dev-crema-0.2.0). Here I detail some points that I do not know how to handle in the new version of crema, or that I would like to discuss:

  1. A random v-factor can be created with the following method, but how do you limit the number of decimals?
	VertexFactorUtilities.random(leftDomain, Strides.empty(), nVert); 
  2. In an existing H-factor, how can I specify the set of constraints for a given parent combination from a LinearConstraintSet? This was done with setLinearProblemAt, but this method is not available anymore.
    public static SeparateHalfspaceFactor vertexToHspace(VertexFactor factor) throws IOException, InterruptedException {


        SeparateHalfspaceFactor HF = SeparateHalfspaceFactorFactory.factory()
                .domain(factor.getDataDomain(), factor.getSeparatingDomain()).get();
        
        for(int i=0; i<factor.getSeparatingDomain().getCombinations(); i++) {
            VertexFactor VFi = new VertexDefaultFactor(factor.getDataDomain(), Strides.empty(), new double[][][]{factor.getData()[i]});
            SeparateHalfspaceDefaultFactor HFi = (SeparateHalfspaceDefaultFactor) Convert.margVertexToHspace(VFi);
            HF.setLinearProblemAt(i, HFi.getLinearProblemAt(0));
        }

        return HF;

    }
  3. Make the method SeparateHalfspaceDefaultFactor::setLinearProblemAt public.

HalfspaceToRandomBayesianFactor fails without non-negative constraints

When running ApproxLP, the method ch.idsia.crema.inference.approxlp1.Neighbourhood::random uses the converter HalfspaceToRandomBayesianFactor, which fails if the H-factor does not contain the non-negative constraints. This is not the case for SeparateLinearToRandomBayesian, which was the converter used in older crema versions.

Is there any advantage of HalfspaceToRandomBayesianFactor over SeparateLinearToRandomBayesian? If not, we might need to change the converter to allow H-factors without non-negative constraints, or at least add a flag to control which one is used.

Here is a simple example where one converter works and the other does not.


        double[][] coef = ArraysUtil.reshape2d(new double[]{1,0,0,1}, 2,2);
        double[] vals0 = new double[]{0,1};
        double[] vals1 = new double[]{1,0};


        SeparateHalfspaceFactorFactory ff = SeparateHalfspaceFactorFactory
                .factory()
                .domain(Strides.as(0,2), Strides.as(1,2));


        ff.constraint(coef[0], Relationship.EQ, vals0[0], 0);
        ff.constraint(coef[1], Relationship.EQ, vals0[1], 0);
        ff.constraint(coef[0], Relationship.EQ, vals1[0], 1);
        ff.constraint(coef[1], Relationship.EQ, vals1[1], 1);

        // Non-negativeness constraints (change to true to include them)
        if(false) {
            ff.constraint(new double[]{1, 0}, Relationship.GEQ, 0.0, 0);
            ff.constraint(new double[]{0, 1}, Relationship.GEQ, 0.0, 0);
            ff.constraint(new double[]{1, 0}, Relationship.GEQ, 0.0, 1);
            ff.constraint(new double[]{0, 1}, Relationship.GEQ, 0.0, 1);
        }

        SeparateHalfspaceDefaultFactor f = (SeparateHalfspaceDefaultFactor) ff.get();
        f.printLinearProblem();

        // This works without non-neg constraints
        BayesianFactor bf1 = new SeparateLinearToRandomBayesian().apply(f, 0);
        System.out.println(bf1);

        // This does not work (unfeasible solution)
        BayesianFactor bf2 = new HalfspaceToRandomBayesianFactor().apply(f,0);
        System.out.println(bf2);

@cbonesana, @davidhuber , what do you think?

Convex hull implementation

Provide different convex hull implementations, all exposed through a common interface. Credal VE could then be modified to use any of these implementations (or none).

Parser for BIF format

The Interchange Format for Bayesian Networks (BIF) is commonly used as an interchange format, like UAI.

We need an implementation of at least a parser for this format.

Entropy

We currently have the following entropy methods:

  • AbellanEntropy
  • MaximumEntropy
  • BayesianEntropy

The branch dev-feature-entropy was created for these features.

Uniform unit test suite

The tests require both junit:4.12 and junit-jupiter-params:5.4.2.

JUnit 4 is used for all tests, while Jupiter is used only for some parameterized tests. It would be nice to use only one version of JUnit. Given the requirement for parameterized tests, I suggest switching to JUnit 5.
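
For reference, a minimal JUnit 5 parameterized test (purely illustrative, not taken from the crema test suite) showing that Jupiter alone covers the parametric use case:

        import org.junit.jupiter.api.Assertions;
        import org.junit.jupiter.params.ParameterizedTest;
        import org.junit.jupiter.params.provider.CsvSource;

        class ExampleParameterizedTest {
            @ParameterizedTest
            @CsvSource({"2, 3, 6", "2, 2, 4", "3, 3, 9"})
            void productOfCardinalities(int c1, int c2, int expected) {
                Assertions.assertEquals(expected, c1 * c2);
            }
        }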

In addition, I personally prefer throwing IllegalArgumentException for sanity checks rather than using asserts from a test framework like JUnit (see for example UAIParser.java#L123 and its subclasses).

Random Credal factor generators

Implement functionality for generating random credal sets (V specification). Inputs should be the domains and some indicator of the complexity, e.g. the number of vertices.

Less generic parsers

In many file formats it is written which kind of Factor is stored. We use this information to read the file with the correct parser, but we delegate to the developer the task of guessing the type.

It would be useful to have a way to find the type of the factors without using instanceof or assuming that the final users are nice people who never make mistakes...
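
One possible direction (a sketch only; the enum and the IO method are hypothetical, not existing crema API, and it assumes the declared type appears on the first line of the file as in the UAI-style formats): expose the factor type declared in the header, so callers can dispatch on it instead of using instanceof.

        // requires java.io.BufferedReader, java.io.FileReader, java.io.IOException
        public enum DeclaredFactorType { BAYES, V_CREDAL, H_CREDAL }

        // Hypothetical addition to the IO facade: read only the header line.
        public static DeclaredFactorType peekFactorType(String filename) throws IOException {
            try (BufferedReader br = new BufferedReader(new FileReader(filename))) {
                String header = br.readLine().trim();
                switch (header) {
                    case "BAYES":    return DeclaredFactorType.BAYES;
                    case "V-CREDAL": return DeclaredFactorType.V_CREDAL;
                    case "H-CREDAL": return DeclaredFactorType.H_CREDAL;
                    default: throw new IllegalArgumentException("unknown factor type: " + header);
                }
            }
        }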
