mljar / mljar-api-python

A simple python wrapper over MLJAR API.

Home Page: https://docs.mljar.com/

License: Apache License 2.0

Python 100.00%
machine-learning machine-learning-api mljar-api-python prediction-algorithm predictive-analytics predictive-modeling python

mljar-api-python's People

Contributors

pplonski

mljar-api-python's Issues

`fit` on discrete features and binary target yields error

Here is what I'm doing

>>> model.fit(train_features, train_target.squeeze())
Ups, Something bad happend! There is no attributes usage defined for your dataset

When I look in mljar.com, I see that the attribute usage is not accepted.

A related point may be that this dataset is larger than the ones I have uploaded before. Previous datasets were 1 million rows and 100 columns, continuous, and around 500 MB. This one is 1 million rows and 300 columns, discrete, and also around 500 MB.

Edit 1: Looking in mljar.com, I also notice that the target is categorical with three unique values: True, False, and "target" (the string "target"). My target in Python was a pandas array with just True/False. Maybe this is what prevented the attribute usage from being accepted automatically.
This column is also marked "use it" rather than as the target. Accepting it without changing it to target gives an error along the lines of "error, should have target". Changing it to target gives an error like "binary classification target should have 2 values only."

Edit 2: target was pandas array and not numpy array .. fixed inline
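The three unique values reported in Edit 1 (True, False, and the string "target") can be checked for locally before calling fit. Below is a minimal sketch, assuming the target can be iterated as plain Python values (a pandas Series can). `check_binary_target` is a hypothetical helper written for illustration and is not part of the mljar package.

```python
def check_binary_target(values):
    """Return the sorted unique target values, or raise if there are not exactly two.

    Hypothetical helper for illustration; it is not part of the mljar package.
    """
    unique = set(values)
    if len(unique) != 2:
        raise ValueError(
            "binary target should have exactly 2 unique values, got %d: %r"
            % (len(unique), sorted(unique, key=repr))
        )
    return sorted(unique, key=repr)


# A clean boolean target passes the check.
clean = [True, False, True, True]
print(check_binary_target(clean))

# A header string mixed into the values reproduces the three-value situation.
polluted = ["target", True, False, True]
try:
    check_binary_target(polluted)
except ValueError as err:
    print(err)
```

If the check passes locally but mljar.com still shows a third value, the extra value was probably introduced during upload rather than in the Python data, though that is only a guess from the symptoms described here.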

mljar.fit returns predictions from results that do not belong to my current experiment

There are multiple experiments in my project.
When I ran mljar.fit(...) for my 2nd experiment,
I got results that looked like results from my 1st experiment.
Digging deeper, I found that ResultClient(project_id).get_results(experiment_id)
was not filtering the results by the experiment ID passed in.
For example, see the code below:


from mljar import Mljar
from mljar.client.result import ResultClient

clf_mlj = Mljar(
  project='some project name',
  experiment='some experiment name',
  ...
)

client = ResultClient(clf_mlj.project.hid)

# Request results for this experiment only
results = client.get_results(clf_mlj.experiment.hid)
len(results) # returns 75

# Pass no experiment ID at all
results = client.get_results(None)
len(results) # returns 75 as well
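Until the filtering is fixed in the client, one possible workaround is to filter the returned results on the caller's side. The sketch below is only illustrative: it assumes each result is a dict that carries its experiment ID under an `experiment` key, which is a guess and may not match the objects that `get_results` actually returns. `filter_results_by_experiment` is a hypothetical helper.

```python
def filter_results_by_experiment(results, experiment_hid, key="experiment"):
    """Keep only results whose `key` field equals experiment_hid.

    ASSUMPTION: each result is a dict carrying its experiment ID under `key`.
    The real objects returned by ResultClient.get_results may differ.
    """
    return [r for r in results if r.get(key) == experiment_hid]


# Stand-in data; the IDs and field names are made up for illustration.
all_results = [
    {"hid": "r1", "experiment": "exp-1"},
    {"hid": "r2", "experiment": "exp-2"},
    {"hid": "r3", "experiment": "exp-2"},
]
mine = filter_results_by_experiment(all_results, "exp-2")
print([r["hid"] for r in mine])
```

The `key` parameter is there so the same helper can be pointed at whatever field the real results use once that is known.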

columns get renamed when `predict` is called

Hi. I called the predict method on a pandas dataframe that has the same column names as the dataframe used in the fit method, but the columns seen by predict were renamed from 1 to attribute_1, 2 to attribute_2, etc. Is this because the column names are numeric?
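If numeric column names are indeed the cause, one thing to try is giving every column an explicit string name before both fit and predict, and keeping the mapping so the original labels can be restored afterwards. This is an untested sketch: `make_safe_names` and the `col_` prefix are hypothetical, and whether this avoids the renaming on the mljar side is an assumption.

```python
def make_safe_names(columns, prefix="col_"):
    """Map each column label to a string label that starts with a letter.

    Hypothetical helper; the prefix is arbitrary.
    """
    return {c: "%s%s" % (prefix, c) for c in columns}


original = [1, 2, 3]
mapping = make_safe_names(original)

# Inverse mapping, to restore the original labels on any returned frame.
inverse = {safe: orig for orig, safe in mapping.items()}

print(mapping)
print(inverse["col_2"])
```

With pandas, the same dict can be passed as `df.rename(columns=mapping)` on the training and prediction frames so that both use identical names.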

Use git tags to easily relate git commit to pypi version

Just my 2 cents:
At the moment there seem to be no git tags in the repository,
which makes it hard to figure out whether a function parameter,
e.g. fit(..., dataset_title),
is part of the 0.0.6 release on PyPI.

Perhaps for 0.0.7 and later versions, you could run git tag 0.0.7 && git push origin 0.0.7 on the commit that is published to PyPI.

problem with running predictions on windows machines

When computing predictions on a Windows machine, there is an error:

pred = Mljar.compute_prediction(df, model_id = 'xxxxxxx', project_id = 'xxxxxxxxxxxx',keep_dataset=True)

IOError                                   Traceback (most recent call last)
<ipython-input-8-e3ccbf396609> in <module>()
----> 1 pred = Mljar.compute_prediction(df, model_id = 'xxxxxxxx', project_id = 'xxxxxxxxxx',keep_dataset=True)

c:\python27\lib\site-packages\mljar\mljar.pyc in compute_prediction(X, model_id, project_id, keep_dataset)
    328 
    329         # chack if dataset exists in mljar if not upload dataset for prediction
--> 330         dataset = DatasetClient(project_id).add_dataset_if_not_exists(X, y = None)
    331 
    332         # check if prediction is available

c:\python27\lib\site-packages\mljar\client\dataset.pyc in add_dataset_if_not_exists(self, X, y, title_prefix)
    128         if len(dataset_details) == 0:
    129             # add new dataset
--> 130             dataset_details = self.add_new_dataset(data, y, title_prefix)
    131         else:
    132             dataset_details = dataset_details[0]

c:\python27\lib\site-packages\mljar\client\dataset.pyc in add_new_dataset(self, data, y, title_prefix)
    166         prediction_only = y is None
    167         # save to local storage
--> 168         data.to_csv(file_path, index=False)
    169         # compress
    170         file_path_zip = file_path + '.zip'

c:\python27\lib\site-packages\pandas\core\frame.pyc in to_csv(self, path_or_buf, sep, na_rep, float_format, columns, header, index, index_label, mode, encoding, compression, quoting, quotechar, line_terminator, chunksize, tupleize_cols, date_format, doublequote, escapechar, decimal)
   1401                                      doublequote=doublequote,
   1402                                      escapechar=escapechar, decimal=decimal)
-> 1403         formatter.save()
   1404 
   1405         if path_or_buf is None:

c:\python27\lib\site-packages\pandas\io\formats\format.pyc in save(self)
   1569             f, handles = _get_handle(self.path_or_buf, self.mode,
   1570                                      encoding=self.encoding,
-> 1571                                      compression=self.compression)
   1572             close = True
   1573 

c:\python27\lib\site-packages\pandas\io\common.pyc in _get_handle(path_or_buf, mode, encoding, compression, memory_map, is_text)
    377         if compat.PY2:
    378             # Python 2
--> 379             f = open(path_or_buf, mode)
    380         elif encoding:
    381             # Python 3 and encoding

IOError: [Errno 2] No such file or directory: '/tmp/dataset-f8f18b2e.csv'
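The failing path, '/tmp/dataset-f8f18b2e.csv', is a Unix-style location that normally does not exist on Windows. A portable alternative is to ask the standard library for the platform's temporary directory. The sketch below is not the library's actual code: `dataset_tmp_path` is a hypothetical helper that only mirrors the file naming visible in the traceback.

```python
import os
import tempfile
import uuid


def dataset_tmp_path(prefix="dataset-", suffix=".csv"):
    """Build a path for a temporary CSV file inside the platform's temp directory.

    Hypothetical helper; it mirrors the 'dataset-<id>.csv' naming from the traceback.
    """
    file_name = prefix + uuid.uuid4().hex[:8] + suffix
    return os.path.join(tempfile.gettempdir(), file_name)


path = dataset_tmp_path()

# Write and remove a tiny file to confirm the directory is usable.
with open(path, "w") as handle:
    handle.write("a,b\n1,2\n")
print(os.path.exists(path))
os.remove(path)
```

`tempfile.gettempdir()` resolves to a directory that exists on the current platform, so the same code runs on Windows, Linux, and macOS.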

python 3 support?

Hello. It seems that mljar is compatible with Python 2 but not with Python 3 (e.g. some print ... calls without parentheses, and the from mljar import Mljar call in mljar.__init__).
Are there any plans to release a Python 3 compatible version?
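Both points mentioned above have a form that runs on Python 2.7 and Python 3 alike. Here is a small sketch of the print change, with the import change shown as a comment because it only makes sense inside the package. The `report` function is a made-up example, and the submodule layout is an assumption based on the mljar\mljar.pyc path in the Windows traceback.

```python
from __future__ import print_function  # no effect on Python 3, enables print() on Python 2

import io
import sys


def report(message, stream=None):
    """Write a message with the print function, which works on Python 2.7 and 3."""
    if stream is None:
        stream = sys.stdout
    print(message, file=stream)


buffer = io.StringIO()
report(u"dataset uploaded", stream=buffer)
print(buffer.getvalue().strip())

# Inside mljar/__init__.py, an explicit relative import works on both versions:
#     from .mljar import Mljar
# (assumes the Mljar class lives in the mljar/mljar.py submodule)
```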
