carefree0910 / carefree-learn

Deep Learning ❤️ PyTorch

Home Page: https://carefree0910.me/carefree-learn-doc/

License: MIT License

Python 99.98% Dockerfile 0.02%
algorithm automl computer-vision data-science deep-learning ensemble machine-learning numpy python pytorch tabular-data tabular-datasets

carefree-learn's Issues

Add `metric_targets`

Once every metric reaches its corresponding target, we can safely stop the training process early.
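A minimal sketch of the stopping rule described here, not carefree-learn's implementation. The helper `all_targets_reached`, the `higher_is_better` mapping, and the metric names are hypothetical, introduced only for illustration.

```python
from typing import Dict


def all_targets_reached(
    metrics: Dict[str, float],
    targets: Dict[str, float],
    higher_is_better: Dict[str, bool],
) -> bool:
    """Return True only when every metric that has a target has reached it."""
    if not targets:
        return False  # no targets configured: never stop on this rule
    for name, target in targets.items():
        value = metrics.get(name)
        if value is None:
            return False  # a targeted metric was not reported: keep training
        if higher_is_better.get(name, True):
            if value < target:
                return False
        elif value > target:
            return False
    return True


# Hypothetical targets: accuracy should rise to 0.95, MAE should fall to 0.10.
targets = {"acc": 0.95, "mae": 0.10}
directions = {"acc": True, "mae": False}
history = [
    {"acc": 0.90, "mae": 0.20},
    {"acc": 0.96, "mae": 0.15},
    {"acc": 0.97, "mae": 0.08},
]

stop_epoch = None
for epoch, metrics in enumerate(history):
    if all_targets_reached(metrics, targets, directions):
        stop_epoch = epoch  # first epoch at which all targets are met
        break
```

Requiring all targets, rather than any one of them, matches the wording of the request: training continues while at least one metric is still short of its target.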

TypeError: estimate() got an unexpected keyword argument 'pipelines'

When running the tutorial code:

#%%
import cflearn
from cfdata.tabular import TabularData, TabularDataset

# prepare iris dataset
iris = TabularDataset.iris()
iris = TabularData.from_dataset(iris)
# split 10% of the data as validation data
split = iris.split(0.1)
train, valid = split.remained, split.split
x_tr, y_tr = train.processed.xy
x_cv, y_cv = valid.processed.xy
data = x_tr, y_tr, x_cv, y_cv

m = cflearn.make().fit(*data)
# Make label predictions
m.predict(x_cv)
# Make probability predictions
m.predict_prob(x_cv)
# Estimate performance
cflearn.estimate(x_cv, y_cv, pipelines=m)

We get:

Traceback (most recent call last):
  File "C:\ProgramData\miniconda\lib\site-packages\IPython\core\interactiveshell.py", line 3331, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-4-78d46f42bbd0>", line 24, in <module>
    cflearn.estimate(x_cv, y_cv, pipelines=m)
TypeError: estimate() got an unexpected keyword argument 'pipelines'
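An unexpected keyword argument usually means the installed function has a different signature from the one the tutorial was written against. The sketch below shows one way to check this before calling, using the standard `inspect` module. `estimate_stub` is a hypothetical stand-in, not the real `cflearn.estimate`; in practice you would pass `cflearn.estimate` itself to the helper.

```python
import inspect


def accepts_keyword(func, name: str) -> bool:
    """Return True if `func` can be called with `name=` as a keyword argument."""
    params = inspect.signature(func).parameters
    param = params.get(name)
    if param is not None and param.kind in (
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
        inspect.Parameter.KEYWORD_ONLY,
    ):
        return True
    # A **kwargs parameter accepts any keyword.
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())


# Hypothetical stand-in for a release whose estimate() has no `pipelines` parameter.
def estimate_stub(x, y, models=None):
    return len(x)


has_pipelines = accepts_keyword(estimate_stub, "pipelines")  # False for this stub
has_models = accepts_keyword(estimate_stub, "models")        # True for this stub
```

Printing `inspect.signature(cflearn.estimate)` in the failing environment would show which keyword names the installed version actually accepts.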

ValueError: too many values to unpack (expected 2)

Hi,

FYI, I got this error running the example:

import cflearn
from cfdata.tabular import TabularDataset

x, y = TabularDataset.iris().xy
m = cflearn.make().fit(x, y)

Traceback (most recent call last):
  File "D:\Anaconda3\envs\tipjar\lib\site-packages\IPython\core\interactiveshell.py", line 3417, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-2-7dea84d121c1>", line 5, in <module>
    m = cflearn.make().fit(x, y)
  File "D:\Anaconda3\envs\tipjar\lib\site-packages\cflearn\bases.py", line 860, in fit
    self._before_loop(x, y, x_cv, y_cv)
  File "D:\Anaconda3\envs\tipjar\lib\site-packages\cflearn\bases.py", line 829, in _before_loop
    self.cv_data, self.tr_data = self.tr_data.split(self._cv_split)
ValueError: too many values to unpack (expected 2)

Was able to fix by installing the latest:

pip install -U git+https://github.com/carefree0910/carefree-learn.git

Split?

Hi, I am new to carefree-learn and enjoying it so far. I am using cv_split=.2. My data is temporal and not IID, so I want to make sure the split doesn't shuffle or stratify. It appears from your code that it does not shuffle:

split = self.tr_data.split(self._cv_split)

Is this correct?
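Whether the internal split shuffles is for the maintainer to confirm. One way to avoid depending on the answer is to split chronologically by hand and pass the validation set explicitly, as the tutorial snippets in the other issues do with `fit(x_tr, y_tr, x_cv, y_cv)`. A minimal sketch, assuming the rows are already sorted by time; `chronological_split` is a hypothetical helper, not part of carefree-learn.

```python
import numpy as np


def chronological_split(x: np.ndarray, y: np.ndarray, cv_ratio: float):
    """Hold out the last `cv_ratio` fraction of rows, without shuffling."""
    n = len(x)
    n_cv = int(round(n * cv_ratio))
    if not 0 < n_cv < n:
        raise ValueError("cv_ratio must leave at least one row on each side")
    cut = n - n_cv
    # Earlier rows train the model; the most recent rows validate it.
    return x[:cut], y[:cut], x[cut:], y[cut:]


# Toy data whose row index doubles as a timestamp.
x = np.arange(10, dtype=float).reshape(-1, 1)
y = np.arange(10)
x_tr, y_tr, x_cv, y_cv = chronological_split(x, y, 0.2)

# The four arrays can then be passed positionally, e.g.:
# m = cflearn.make().fit(x_tr, y_tr, x_cv, y_cv)
```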

AttributeError: module 'cflearn' has no attribute 'make'

On my Ubuntu 18.04.4 server, when I run your quick start code, I get this error:

root@server1:~/newautoml# python3 cflearn.py
Traceback (most recent call last):
  File "cflearn.py", line 1, in <module>
    import cflearn
  File "/root/newautoml/cflearn.py", line 5, in <module>
    m = cflearn.make().fit(x, y)
AttributeError: module 'cflearn' has no attribute 'make'
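Both frames in the traceback point at the user's own file, /root/newautoml/cflearn.py, so `import cflearn` is picking up the script itself rather than the installed package. Renaming the script should avoid the clash. The sketch below shows two quick checks for this kind of shadowing; the helper names are hypothetical.

```python
import importlib.util
from pathlib import Path
from typing import Optional


def shadows_module(script_path: str, module_name: str) -> bool:
    """True if a script's file name (without .py) equals the module it imports."""
    return Path(script_path).stem == module_name


def resolved_origin(module_name: str) -> Optional[str]:
    """Return the file an import of `module_name` would load, or None if not found."""
    spec = importlib.util.find_spec(module_name)
    return None if spec is None else spec.origin


# The script from the traceback has the same name as the package it imports.
clash = shadows_module("/root/newautoml/cflearn.py", "cflearn")
```

Calling `resolved_origin("cflearn")` from the directory that contains the script would be expected to return the script's own path while the clash exists, and a site-packages path once the script is renamed.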

AttributeError: module 'cflearn' has no attribute 'Auto'

When running the tutorial code:

import cflearn

from cfdata.tabular import *

# prepare iris dataset
iris = TabularDataset.iris()
iris = TabularData.from_dataset(iris)
# split 10% of the data as validation data
split = iris.split(0.1)
train, valid = split.remained, split.split
x_tr, y_tr = train.processed.xy
x_cv, y_cv = valid.processed.xy
data = x_tr, y_tr, x_cv, y_cv

#%%
fcnn = cflearn.make().fit(*data)

# 'overfit' validation set
auto = cflearn.Auto(TaskTypes.CLASSIFICATION).fit(*data, num_jobs=2)

# estimate manually
predictions = auto.predict(x_cv)
print("accuracy:", (y_cv == predictions).mean())

# estimate with `cflearn`
cflearn.estimate(
    x_cv,
    y_cv,
    pipelines=fcnn,
    other_patterns={"auto": auto.pattern},
)

I get this error:

  File "C:\ProgramData\miniconda\lib\site-packages\IPython\core\interactiveshell.py", line 3331, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "", line 4, in <module>
    auto = cflearn.Auto(TaskTypes.CLASSIFICATION).fit(*data, num_jobs=2)
AttributeError: module 'cflearn' has no attribute 'Auto'

AutoML Mode Question

Maybe I am misunderstanding how it works, but when I run your latest version in AutoML mode (using https://github.com/carefree0910/carefree-learn/blob/dev/examples/titanic/automl.py), it runs the same methods it always did (fcnn_optuna, tree_dnn_optuna, etc.) and gets the same Kaggle score as before (0.78947). But in your latest version you wrote that you "Implemented more models (Linear, TreeLinear, Wide and Deep, RNN, Transformer, etc.)." Shouldn't those be in the AutoML part?

Integrate DeepSpeed

This is mainly for downstream usage, because in most cases neural networks that target tabular datasets do not need distributed training.

[CRITICAL] Regression is now broken due to legacy bugs

This bug was caused by the introduction of get_outputs. Previously, when binary_threshold_outputs was not None, we knew that the task was definitely binary classification. But now it is never None, even on regression tasks.
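One way to avoid this class of bug is to branch on an explicit task flag instead of inferring the task from whether a value happens to be None. The sketch below is illustrative only and is not the project's fix; `to_predictions`, the `task` strings, and the `binary_threshold` parameter are all hypothetical.

```python
import numpy as np


def to_predictions(outputs, task: str, binary_threshold=None) -> np.ndarray:
    """Turn raw model outputs into predictions, branching on an explicit task flag."""
    outputs = np.asarray(outputs, dtype=float)
    if task == "reg":
        # Regression outputs pass through untouched, even if a threshold is set.
        return outputs
    if task != "clf":
        raise ValueError(f"unknown task: {task!r}")
    if outputs.shape[1] == 2 and binary_threshold is not None:
        # Binary classification: threshold the positive-class probability.
        return (outputs[:, 1] >= binary_threshold).astype(int)
    # Multi-class (or binary without a threshold): pick the most probable class.
    return outputs.argmax(axis=1)


reg_preds = to_predictions([[0.3], [1.7]], "reg", binary_threshold=0.5)
clf_preds = to_predictions([[0.8, 0.2], [0.3, 0.7]], "clf", binary_threshold=0.6)
```

With this shape, a threshold that is always populated can no longer push a regression task down the classification path.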

Fix the design of `Environment`

Currently we cannot change the default behaviour of Environment. Fix it by refactoring _preset_config and _init_config.

Clean up APIs

  • Provide a better user-side experience.
  • Make development on carefree-learn carefree as well.

Keyring is skipped due to an exception: "WindowsPath" object has no attribute 'read_text'

On a Windows 10 Pro machine, installing via pip install carefree-learn fails in some Anaconda environments (but not in others, as I've confirmed).
I'm using the most up-to-date version of pip, 20.2.2.

Keyring is skipped due to an exception: "WindowsPath" object has no attribute 'read_text'

I figured this was due to pathlib2 interfering with pathlib so I uninstalled pathlib2, but still no joy.

I performed a conda clean --all -y to remove any lingering tarballs, but this too did not help.

I then had the idea to install the dependencies manually, so I started with pip install carefree-ml, which installed successfully.
I was then able to successfully run pip install carefree-learn.

Strange error. I wonder if it may be related to the new pip 20.2 dependency resolver.
Regardless, it's installed now. Hope this helps anyone else who encounters a similar install issue.

AttributeError: module 'cflearn' has no attribute 'Ensemble'

When I git clone your repo, pip3 install it (Ubuntu 18.04.5), and run test_titanic.py without any changes, I get this error:

root@ns544446:~/carefree-learn/examples/titanic# sudo python3 test_titanic.py
Traceback (most recent call last):
  File "test_titanic.py", line 64, in <module>
    test_adaboost()
  File "test_titanic.py", line 60, in test_adaboost
    _test("adaboost", _adaboost_core)
  File "test_titanic.py", line 44, in _test
    data, pattern = _core(train_file)
  File "test_titanic.py", line 36, in _adaboost_core
    ensemble = cflearn.Ensemble(TaskTypes.CLASSIFICATION, config)
AttributeError: module 'cflearn' has no attribute 'Ensemble'

Misc enhancements

  • Visualize pipe structures in ModelBase
  • Implement Factory class
  • Record best epoch & step
