Comments (5)

celikmustafa89 avatar celikmustafa89 commented on August 23, 2024

I think you need to set the instances header.
If you debug the code, you will find which part of it is null.
The instance's dataset is null, so try setting it.

from moa.

onofricamila avatar onofricamila commented on August 23, 2024

image

The instance header for every instance d is null, but not the data ... I did not mention that in my question because I did not think of it as the cause of the problem. I used the same data generator, and the same code for StreamKM, and there wasn't any problem with that.

This code works:

import com.yahoo.labs.samoa.instances.DenseInstance;
import moa.cluster.Clustering;
import moa.clusterers.streamkm.StreamKM;

public class TestingStreamKM {
    static DenseInstance randomInstance(int size) {
        DenseInstance instance = new DenseInstance(size);
        for (int idx = 0; idx < size; idx++) {
            instance.setValue(idx, Math.random());
        }
        return instance;
    }
    public static void main(String[] args) {
        StreamKM streamKM = new StreamKM();
        streamKM.numClustersOption.setValue(5); // default setting
        streamKM.resetLearningImpl();
        for (int i = 0; i < 1000; i++) {
            DenseInstance d = randomInstance(2);
            streamKM.trainOnInstanceImpl(d);
        }
        Clustering result = streamKM.getClusteringResult();
    }
}

image


Now, if the null instance header is the problem, where should I set it? It must be the same for the whole dataset ...

Thanks for answering so fast!


celikmustafa89 avatar celikmustafa89 commented on August 23, 2024

I have updated the code.
It works as I described: you have to assign a header to each instance.
Here is the Stack Overflow link: https://stackoverflow.com/questions/58869442/java-lang-nullpointerexception-when-trying-moa-stream-clustering-algorithm-denst/58910104#58910104

Here is the updated code:

import java.util.ArrayList;
import java.util.Random;

import com.yahoo.labs.samoa.instances.Attribute;
import com.yahoo.labs.samoa.instances.DenseInstance;
import com.yahoo.labs.samoa.instances.Instances;
import com.yahoo.labs.samoa.instances.InstancesHeader;
import moa.cluster.Clustering;
import moa.clusterers.denstream.WithDBSCAN;

// the class name is chosen here only so the snippet compiles as one file
public class TestingWithDBSCAN {
	static DenseInstance randomInstance(int size) {

		// generates the feature names that make up the InstancesHeader
		ArrayList<Attribute> attributes = new ArrayList<Attribute>();
		for (int i = 0; i < size; i++) {
			attributes.add(new Attribute("feature_" + i));
		}
		// creates the instance header from the generated feature names
		InstancesHeader streamHeader = new InstancesHeader(
				new Instances("Mustafa Çelik Instance", attributes, size));

		// generates random data, one value per attribute in the header
		// (the array length was hard-coded to 2 before, which did not match size)
		double[] data = new double[size];
		Random random = new Random();
		for (int i = 0; i < size; i++) {
			data[i] = random.nextDouble();
		}

		// creates an instance with weight 1.0 and assigns the data
		DenseInstance inst = new DenseInstance(1.0, data);

		// assigns the instance header (feature names)
		inst.setDataset(streamHeader);

		return inst;
	}

	public static void main(String[] args) {
		WithDBSCAN withDBSCAN = new WithDBSCAN();
		withDBSCAN.resetLearningImpl();
		withDBSCAN.initialDBScan();
		for (int i = 0; i < 1500; i++) {
			DenseInstance d = randomInstance(5);
			withDBSCAN.trainOnInstanceImpl(d);
		}
		Clustering clusteringResult = withDBSCAN.getClusteringResult();
		Clustering microClusteringResult = withDBSCAN.getMicroClusteringResult();

		System.out.println(clusteringResult);
	}
}

Here is a screenshot of the debugging session, showing the clustering result:

Screen Shot 2019-11-18 at 10 52 52 AM


celikmustafa89 avatar celikmustafa89 commented on August 23, 2024

The algorithms have different capabilities and differ in some respects. The StreamKM algorithm can work without a header being assigned. WithDBSCAN needs the header, so you must assign it. They use different data structures. They may inherit from the same classes, but they work differently.

Debug your code and try to fill the null parameters. It is a good way to find the gaps.
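
To make the difference concrete, here is a minimal toy sketch. It does not use MOA at all: `ToyHeader`, `ToyInstance`, and the three static methods are hypothetical stand-ins invented for illustration, and the thread does not show which call inside WithDBSCAN actually fails. The assumption is simply that one algorithm reads only the raw values while another dereferences the header. It also shows one way to build the header once and share it across all instances of a stream.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: these classes are hypothetical stand-ins, not MOA's real types.
class HeaderDemo {

    /** Stand-in for an instances header: just the attribute names. */
    static final class ToyHeader {
        final List<String> attributeNames;
        ToyHeader(List<String> attributeNames) { this.attributeNames = attributeNames; }
        int numAttributes() { return attributeNames.size(); }
    }

    /** Stand-in for an instance: raw values plus an optional header. */
    static final class ToyInstance {
        final double[] values;
        ToyHeader header; // stays null until setHeader is called
        ToyInstance(double[] values) { this.values = values; }
        void setHeader(ToyHeader header) { this.header = header; }
    }

    /** Reads only the raw values, so a missing header does not matter. */
    static double sumOfValues(ToyInstance inst) {
        double sum = 0.0;
        for (double v : inst.values) {
            sum += v;
        }
        return sum;
    }

    /** Dereferences the header, so a missing header throws NullPointerException. */
    static int dimensionFromHeader(ToyInstance inst) {
        return inst.header.numAttributes();
    }

    /** Builds the header once so every instance of the stream can share it. */
    static ToyHeader buildHeader(int size) {
        List<String> names = new ArrayList<>();
        for (int i = 0; i < size; i++) {
            names.add("feature_" + i);
        }
        return new ToyHeader(names);
    }

    public static void main(String[] args) {
        ToyHeader shared = buildHeader(2);
        ToyInstance inst = new ToyInstance(new double[] {0.25, 0.5});

        // Works even though no header has been assigned yet.
        System.out.println(sumOfValues(inst));

        try {
            dimensionFromHeader(inst);
        } catch (NullPointerException e) {
            System.out.println("header is null");
        }

        // After assigning the shared header, the same call succeeds.
        inst.setHeader(shared);
        System.out.println(dimensionFromHeader(inst));
    }
}
```

In the real code, the analogous step would be to create the `InstancesHeader` once and pass the same object to `setDataset` for every instance, instead of rebuilding it on each call.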


onofricamila avatar onofricamila commented on August 23, 2024

Hey, thanks! You really helped me out here :)

I have a few questions left to ask:

  1. Why is the weight field of the macro clusters 0 in the debugger?

image

If you open the nested micro clusters for a given macro cluster, you will see they have a weight defined.

image

  2. Furthermore, micro clusters have a null center and radius (which is weird because the MicroCluster class extends CFCluster, which extends SphereCluster). I am able to get those values using the getter methods, but it caught my attention.

image

  3. Also, the N value of a macro cluster is not the sum of the N values of the micro clusters it contains ...

image

It seems something strange is happening ...

Thanks again for the support.
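
One possible reading of the first two questions, offered as an assumption that has not been checked against the MOA source: a debugger displays stored fields, while a getter may compute its result from other state. If a subclass keeps summary statistics and overrides the getters, the inherited fields are never assigned and keep their defaults (`null`, `0.0`). The toy sketch below shows that pattern with hypothetical classes `StoredCluster` and `SummaryCluster`; they are not MOA's SphereCluster, CFCluster, or MicroCluster.

```java
// Toy model only: hypothetical classes, not MOA's cluster implementations.
class FieldVsGetterDemo {

    /** Base class that stores a center and a weight in fields. */
    static class StoredCluster {
        protected double[] center; // null unless someone assigns it
        protected double weight;   // 0.0 unless someone assigns it

        double[] getCenter() { return center; }
        double getWeight() { return weight; }
    }

    /** Subclass that keeps summary statistics and derives values on demand. */
    static class SummaryCluster extends StoredCluster {
        private double n;                 // number of points absorbed
        private final double[] linearSum; // per-dimension sum of the points

        SummaryCluster(int dims) {
            linearSum = new double[dims];
        }

        void insert(double[] point) {
            n += 1.0;
            for (int i = 0; i < point.length; i++) {
                linearSum[i] += point[i];
            }
        }

        /** Computes the center from the statistics; the inherited field is untouched. */
        @Override
        double[] getCenter() {
            double[] c = new double[linearSum.length];
            for (int i = 0; i < c.length; i++) {
                c[i] = linearSum[i] / n;
            }
            return c;
        }

        /** Reports the point count as the weight; the inherited field is untouched. */
        @Override
        double getWeight() { return n; }
    }

    public static void main(String[] args) {
        SummaryCluster c = new SummaryCluster(2);
        c.insert(new double[] {1.0, 3.0});
        c.insert(new double[] {3.0, 5.0});

        // What a debugger's field view would show for the inherited fields.
        System.out.println("center field is null: " + (c.center == null));
        System.out.println("weight field: " + c.weight);

        // What the getters report.
        System.out.println("getWeight(): " + c.getWeight());
        System.out.println("getCenter()[0]: " + c.getCenter()[0]);
    }
}
```

If MOA's classes follow a similar pattern, the values returned by the getter methods would be the ones to trust, and the default-valued fields in the debugger would be harmless. This sketch does not address the third question about N.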
