Comments (6)
I appreciate your reproducible example; it lets me see this mysterious ClassCastException for myself:
WARNING: Attribute 'sklearn2pmml.pipeline.PMMLPipeline.target_fields' is not set. Assuming [y1, y2, y3, y4, y5] as the name of target fields
Exception in thread "main" java.lang.ClassCastException: java.lang.Integer cannot be cast to java.util.List
at sklearn2pmml.pipeline.PMMLPipeline.initLabel(PMMLPipeline.java:581)
at sklearn2pmml.pipeline.PMMLPipeline.encodePMML(PMMLPipeline.java:130)
at com.sklearn2pmml.Main.run(Main.java:91)
at com.sklearn2pmml.Main.main(Main.java:66)
from sklearn2pmml.
For an unknown reason, SkLearn2PMML has trouble figuring out the intent of the classification task. It currently thinks that it's dealing with a multioutput task (i.e. five independent labels). The correct interpretation would be a multiclass task (a single label with five mutually exclusive classes).
The problem can be fixed by converting y_train from a NumPy array to a pandas Series, and calling PMMLPipeline.fit(X, y) with it (instead of MLPClassifier.fit(X, y)):
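from pandas import Series
from sklearn.neural_network import MLPClassifier
from sklearn2pmml import sklearn2pmml
from sklearn2pmml.pipeline import PMMLPipeline

# THIS! Wrap the label array into a named Series
y_train = Series(y_train, name = "y")
model = MLPClassifier(hidden_layer_sizes=(10,5,2), max_iter=1000)
# Don't do MLPClassifier.fit(X, y)
#model.fit(X_train, y_train)
pmml_pipeline = PMMLPipeline([
    ("classifier", model)
])
# THIS! Fit the pipeline itself, not the bare model
pmml_pipeline.fit(X_train, y_train)
sklearn2pmml(pmml_pipeline, 'model01.pmml')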
These two changes together help SkLearn2PMML to understand that it's dealing with a multiclass classification problem (one label, five category levels).
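To make the terminology concrete, here is a minimal sketch of the difference between the two label layouts. The describe_target helper is hypothetical and only illustrates the shapes involved; it is not how SkLearn2PMML itself decides between the two cases.

```python
def describe_target(y):
    """Classify a label container by its layout (illustration only).

    Returns ("multioutput", n_columns) when every sample carries several
    labels, otherwise ("multiclass", n_distinct_classes).
    """
    rows = list(y)
    # A row that is itself a list/tuple means several labels per sample
    if rows and isinstance(rows[0], (list, tuple)):
        return ("multioutput", len(rows[0]))
    # A flat sequence means one label per sample; count its distinct classes
    return ("multiclass", len(set(rows)))


# One label per sample, five category levels
single_label = [0, 1, 2, 3, 4, 1]
# Two labels per sample
two_labels = [[0, 1], [1, 0]]

print(describe_target(single_label))
print(describe_target(two_labels))
```

The first call reports a multiclass target with five classes, which is the situation in this issue; the second reports a multioutput target with two label columns.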
from sklearn2pmml.
Thank you very much for your quick response. We hope to see this resolved in the future, which would improve the package overall.
from sklearn2pmml.
The SkLearn2PMML package is fine. It is your pipeline that needs upgrading.
Most importantly, you should always be calling PMMLPipeline.fit(X, y) (as opposed to calling Model.fit(X, y) and then wrapping the model into a dummy PMMLPipeline object), because that will set the PMMLPipeline.active_fields and PMMLPipeline.target_fields attributes to meaningful values. This way, the SkLearn2PMML package does not need to make guesses about the nature of the classification problem.
Look at your own SkLearn2PMML console log that precedes the error. It clearly states that SkLearn2PMML is assuming that there must be five labels present (i.e. multioutput).
from sklearn2pmml.
The fix will probably be in the form of making the "has the PMMLPipeline object been fitted or not" check more strict.
So, instead of making a guess, it would raise a clear "the PMMLPipeline object is not fitted" error.
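Until such a check exists, a similar guard can be sketched on the Python side. This is only an illustration under assumptions: require_fitted and NotFittedPipelineError are hypothetical names that are not part of SkLearn2PMML, and the guard assumes a supervised pipeline where both active_fields and target_fields are expected to be set by PMMLPipeline.fit(X, y).

```python
class NotFittedPipelineError(RuntimeError):
    """Hypothetical error type; not part of the SkLearn2PMML package."""


def require_fitted(pipeline):
    """Raise unless both field attributes are present and non-empty.

    Assumption: a pipeline fitted via PMMLPipeline.fit(X, y) carries
    non-empty 'active_fields' and 'target_fields' attributes.
    """
    for attr in ("active_fields", "target_fields"):
        value = getattr(pipeline, attr, None)
        if value is None or len(value) == 0:
            raise NotFittedPipelineError(
                f"The PMMLPipeline object is not fitted: '{attr}' is not set. "
                "Call PMMLPipeline.fit(X, y) before converting."
            )
    return pipeline
```

It would be called right before the conversion, for example sklearn2pmml(require_fitted(pmml_pipeline), 'model01.pmml'), so that an unfitted pipeline fails early with a readable message instead of a ClassCastException in the Java backend.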
from sklearn2pmml.