
Comments (8)

vruusmann commented on August 20, 2024

Looks like Scikit-Learn 1.4.0 has introduced a new routed_params parameter to the Pipeline._fit() method:
https://github.com/scikit-learn/scikit-learn/blob/1.4.0/sklearn/pipeline.py#L385-L421

And, as a result of this, the PMMLPipeline.fit() method fails already in the Python layer; the evaluation does not get to the Java layer, which would then raise this "unsupported Scikit-Learn version" error message.
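The failure mode described above can be reproduced without either library. The sketch below is only an illustration: `BasePipeline` and `LegacySubclass` are hypothetical stand-ins, not the actual Scikit-Learn or SkLearn2PMML classes, and the signatures are assumptions chosen to mirror the situation where a base class starts passing one more positional argument than an existing override accepts.

```python
class BasePipeline:
    """Hypothetical stand-in for a base class whose fit() now passes
    an extra positional argument to the private _fit() hook."""

    def fit(self, X, y=None, **params):
        routed_params = {"steps": params}
        # The newer base class passes routed_params positionally.
        self._fit(X, y, routed_params)
        return self

    def _fit(self, X, y=None, routed_params=None):
        pass


class LegacySubclass(BasePipeline):
    """Hypothetical override written against the older hook signature,
    which accepts at most X and y positionally."""

    def _fit(self, X, y=None, **fit_params):
        return super()._fit(X, y)


def try_fit(pipeline):
    """Return the TypeError message raised by fit(), or None on success."""
    try:
        pipeline.fit([[0.0], [1.0]], [0, 1])
    except TypeError as exc:
        return str(exc)
    return None


base_error = try_fit(BasePipeline())
legacy_error = try_fit(LegacySubclass())
print("base:", base_error)
print("legacy:", legacy_error)
```

The base class fits without error, while the legacy subclass fails with a TypeError about the number of positional arguments, before any later conversion step could run.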

from sklearn2pmml.

vruusmann commented on August 20, 2024

"I have a piece of code that used to work that no longer works"

This issue, coupled with your earlier issue (#408), leads me to believe that there's something wrong with your base SkLearn package installation. Perhaps you've installed a very old SkLearn package version (one that is some five years old)?

The PMMLPipeline._fit() method takes up to three positional arguments:
https://github.com/jpmml/sklearn2pmml/blob/0.101.0/sklearn2pmml/pipeline/__init__.py#L57-L71

In your code, I can't see the fourth argument that Python is complaining about. Could it be that one of X_train or y_train is actually a tuple?
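A quick way to rule out the tuple hypothesis is to inspect the argument types before calling fit. The sketch below is a minimal illustration; `find_tuple_args` is a hypothetical helper, and the sample data is made up to show how a stray trailing comma turns a list into a one-element tuple.

```python
def find_tuple_args(**named_args):
    """Return the names of the keyword arguments whose values are tuples."""
    return [name for name, value in named_args.items() if isinstance(value, tuple)]


X_train = [[5.1, 3.5], [6.2, 2.9]]
# The trailing comma below makes y_train a 1-tuple wrapping the list.
y_train = ["setosa", "virginica"],

suspects = find_tuple_args(X_train=X_train, y_train=y_train)
print(suspects)  # prints ['y_train']
```

An empty result means neither argument is a tuple, so the extra positional argument must come from somewhere else.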


mmarinaki commented on August 20, 2024

Hi @vruusmann

  • I am using version 1.4.1 of sklearn and 0.101.0 of sklearn2pmml, so these are not old versions. Until recently I was using sklearn2pmml 0.90.2, I believe.
  • I was aware of that previous issue that was posted some years ago, which is why I checked that neither argument is a tuple before posting, so I'm not sure what is going on.
  • Again, I have run this piece of code many times over the past months without any issue, so I believe something must have changed in the compatibility between the packages. I'll play around with different versions and see if anything changes.


mmarinaki commented on August 20, 2024

I asked my coworker to also rerun this ⏫ code, which he was running a couple of weeks ago (before Christmas), and he confirmed that he is facing the exact same issue, even though his code was working as expected in the same environment before.


mmarinaki commented on August 20, 2024

@vruusmann I even tried your example with the iris dataset and these updated packages, and I am running into the same error:

import pandas

# Load the iris dataset (expects Iris.csv in the working directory)
iris_df = pandas.read_csv("Iris.csv")

iris_X = iris_df[iris_df.columns.difference(["species"])]
iris_y = iris_df["species"]

from sklearn.tree import DecisionTreeClassifier
from sklearn2pmml.pipeline import PMMLPipeline

pipeline = PMMLPipeline([
	("classifier", DecisionTreeClassifier())
])
pipeline.fit(iris_X, iris_y)

from sklearn2pmml import sklearn2pmml

sklearn2pmml(pipeline, "DecisionTreeIris.pmml", with_repr = True)
[Screenshot, 2024-01-23 12:11:25 PM]


mmarinaki commented on August 20, 2024

This is my poetry file, by the way. The Python version I am using is 3.10.6.
[tool.poetry]
name = "xxx"
version = "1.0.0"
description = ""
authors = ["Marinaki Maria"]
readme = "README.md"
packages = [{include = "xxx"}]

[tool.poetry.dependencies]
python = "^3.9"
numpy = "1.23.2"
pandas = "1.5.2"
sklearn-pandas = "2.2.0"
sklearn2pmml = "0.101.0"
seaborn = "0.12.2"
pypmml = "0.9.17"
matplotlib = "3.6.3"
xgboost = "1.7.0"
s3fs = "2022.11.0"
fastparquet = "2022.12.0"
python-dotenv = "1.0.0"
psycopg2 = "2.9.6"
python_ml_common = {git = "https://github.com/turo/python-ml-common.git", rev = "v1.12.2"}
shap = "^0.41.0"
skl2onnx = "^1.14.1"
onnxmltools = "^1.11.2"
onnxruntime = "^1.15.1"
mlxtend = "^0.22.0"
logging = "^0.4.9.6"
selenium = "^4.11.2"
pillow = "^10.0.0"
statsmodels = "^0.14.0"
geopy = "^2.4.1"
nbformat = "^5.9.2"
scikit-learn = "^1.4.0"

[tool.poetry.group.dev.dependencies]
ipykernel = "^6.23.2"

[tool.poetry.group.test.dependencies]
pytest = "^7.2.0"
pytest-cov = "^4.0.0"
pytest-mock = "^3.10.0"
pytest-asyncio = "^0.20.3"
mypy = "^0.991"

[build-system]
requires = ["poetry-core"]
build-backend = "poetry.core.masonry.api"

python_ml_common = {path = "../python-ml-common", develop = true}


vruusmann commented on August 20, 2024

Downgrade your Scikit-Learn version to 1.3(.2), and see if this fixes the issue. Perhaps Scikit-Learn did change something in its Pipeline.fit(X, y) method implementation when transitioning from 1.3 to 1.4.

Also, it is mighty strange that the SkLearn2PMML package is willing to accept a Scikit-Learn 1.4 pickle file. It should be throwing an error (from its Java library component) stating that the latest supported Scikit-Learn is 1.3.2:
https://github.com/jpmml/jpmml-sklearn/blob/1.7.46/pmml-sklearn/src/main/java/sklearn/Step.java#L37-L46
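Until the packages agree again, a small guard can flag an unsupported combination before any model is fitted. This is a rough sketch under stated assumptions: the 1.3.2 ceiling is taken from the comment above and applies to this particular SkLearn2PMML release only, and `parse_release`, `is_supported` and `installed_version` are hypothetical helpers, not part of either library.

```python
from importlib.metadata import PackageNotFoundError, version

# Assumption: latest Scikit-Learn version supported by the converter in use.
MAX_SUPPORTED = (1, 3, 2)


def parse_release(text):
    """Parse the leading numeric release of a version string, e.g. '1.4.0rc1' -> (1, 4, 0)."""
    parts = []
    for piece in text.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def is_supported(installed, maximum=MAX_SUPPORTED):
    """Return True if the installed release is not newer than the maximum."""
    return parse_release(installed) <= maximum


def installed_version(name):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return version(name)
    except PackageNotFoundError:
        return None


found = installed_version("scikit-learn")
if found is None:
    print("scikit-learn is not installed")
elif not is_supported(found):
    print(f"scikit-learn {found} is newer than the supported {MAX_SUPPORTED}")
else:
    print(f"scikit-learn {found} is within the supported range")
```

With Poetry, note that a caret constraint such as "^1.4.0" allows any 1.x release from 1.4.0 upward, so the downgrade requires changing the constraint itself to an exact or upper-bounded version.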


mmarinaki commented on August 20, 2024

I was about to come back and tell you this :) I played around with older Scikit-Learn versions and it worked! Thanks for confirming and for finding the source of the issue in the code too! I can confirm that I am able to convert the pipeline now!

