Comments (6)

vruusmann commented on July 20, 2024

I appreciate the reproducible example. It lets me see this mysterious ClassCastException myself:

WARNING: Attribute 'sklearn2pmml.pipeline.PMMLPipeline.target_fields' is not set. Assuming [y1, y2, y3, y4, y5] as the name of target fields
Exception in thread "main" java.lang.ClassCastException: java.lang.Integer cannot be cast to java.util.List
        at sklearn2pmml.pipeline.PMMLPipeline.initLabel(PMMLPipeline.java:581)
        at sklearn2pmml.pipeline.PMMLPipeline.encodePMML(PMMLPipeline.java:130)
        at com.sklearn2pmml.Main.run(Main.java:91)
        at com.sklearn2pmml.Main.main(Main.java:66)

from sklearn2pmml.

vruusmann commented on July 20, 2024

For an unknown reason, SkLearn2PMML has trouble figuring out the intent of the classification task. It currently thinks that it is dealing with a multioutput task (i.e. five independent labels). The correct interpretation would be a multiclass task (a single label with five mutually exclusive classes).

The problem can be fixed by converting y_train from a NumPy array to a pandas Series, and then calling PMMLPipeline.fit(X, y) with it (instead of MLPClassifier.fit(X, y)):

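from pandas import Series
from sklearn.neural_network import MLPClassifier
from sklearn2pmml import sklearn2pmml
from sklearn2pmml.pipeline import PMMLPipeline

# THIS! Wrap the NumPy label array in a named pandas Series
y_train = Series(y_train, name = "y")

model = MLPClassifier(hidden_layer_sizes=(10,5,2), max_iter=1000)
# Don't do MLPClassifier.fit(X, y)
#model.fit(X_train, y_train)

pmml_pipeline = PMMLPipeline([
    ("classifier", model)
])
# THIS! Fit the pipeline itself, not the bare model
pmml_pipeline.fit(X_train, y_train)

sklearn2pmml(pmml_pipeline, 'model01.pmml')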

These two changes together help SkLearn2PMML to understand that it's dealing with a multiclass classification problem (one label, five category levels).

algarnims commented on July 20, 2024

Thank you very much for your quick response. We hope to see this resolved in the future which will improve the package overall.

vruusmann commented on July 20, 2024

The SkLearn2PMML package is fine. It is your pipeline that needs upgrading.

Most importantly, you should always call PMMLPipeline.fit(X, y) (as opposed to calling Model.fit(X, y) and then wrapping the fitted model in a dummy PMMLPipeline object), because that sets the PMMLPipeline.active_fields and PMMLPipeline.target_fields attributes to meaningful values. This way, the SkLearn2PMML package does not need to make guesses about the nature of the classification problem.

Look at your own SkLearn2PMML console log that precedes the error. It clearly states that SkLearn2PMML is assuming that there are five labels present (i.e. multioutput).

vruusmann commented on July 20, 2024

The fix will probably be in the form of making the "has the PMMLPipeline object been fitted or not" check more strict.

So, instead of making a guess, it would raise a clear "the PMMLPipeline object is not fitted" error.
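
Until such a check exists in the library, something similar can be approximated on the user side before calling sklearn2pmml(). The following is only a rough sketch, not SkLearn2PMML code: the helper require_fitted and the error class PipelineNotFittedError are hypothetical names, and the sketch assumes that an unfitted PMMLPipeline either lacks the active_fields and target_fields attributes or has them set to None.

```python
class PipelineNotFittedError(ValueError):
    """Hypothetical error type; not part of sklearn2pmml."""


def require_fitted(pipeline):
    """Return the pipeline unchanged, or raise if it looks unfitted.

    Assumption: PMMLPipeline.fit(X, y) sets both 'active_fields' and
    'target_fields'; a missing or None value means fit() was skipped.
    """
    missing = [
        name
        for name in ("active_fields", "target_fields")
        if getattr(pipeline, name, None) is None
    ]
    if missing:
        raise PipelineNotFittedError(
            "The PMMLPipeline object is not fitted; missing attribute(s): "
            + ", ".join(missing)
        )
    return pipeline
```

With this in place, the export call could be written as sklearn2pmml(require_fitted(pmml_pipeline), 'model01.pmml'), so that a pipeline built around a pre-fitted model fails early on the Python side instead of surfacing as a ClassCastException in the Java converter.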
