carefree0910 / carefree-learn
Deep Learning ❤️ PyTorch
Home Page: https://carefree0910.me/carefree-learn-doc/
License: MIT License
So once all metrics reach the corresponding target, we can safely early stop the training process.
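A minimal sketch of that stopping rule, in plain Python rather than carefree-learn's own trainer: training stops at the first epoch where every tracked metric has reached its target. The function name, the metric names, and the numbers in `history` are all hypothetical and only illustrate the idea.

```python
from typing import Dict


def all_targets_reached(
    metrics: Dict[str, float],
    targets: Dict[str, float],
    higher_is_better: Dict[str, bool],
) -> bool:
    """Return True only when every tracked metric has reached its target."""
    for name, target in targets.items():
        value = metrics[name]
        if higher_is_better[name]:
            if value < target:
                return False
        elif value > target:
            return False
    return True


# Hypothetical validation history: one dict of metrics per epoch.
history = [
    {"acc": 0.80, "loss": 0.60},
    {"acc": 0.91, "loss": 0.45},
    {"acc": 0.95, "loss": 0.28},
    {"acc": 0.96, "loss": 0.25},
]
targets = {"acc": 0.93, "loss": 0.30}
directions = {"acc": True, "loss": False}  # True means higher is better

stopped_at = None
for epoch, metrics in enumerate(history):
    if all_targets_reached(metrics, targets, directions):
        stopped_at = epoch  # every target is met, so stop here
        break

print("stopped at epoch:", stopped_at)
```

Requiring all targets at once, rather than any single one, is what makes the stop "safe": no metric is left below its goal when training ends.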
This bug breaks customization code in v0.1.5, which makes it a broken release.
Otherwise the make_from API will always be buggy.
Never mind, I think I figured it out.
Finished at 5c6b08f
So we can utilize carefree-learn's APIs on other models (e.g. sklearn models).
The scores in your Titanic demo, with the new AutoML system, are not as good as they were before. I tried it now using https://github.com/carefree0910/carefree-learn/blob/dev/examples/titanic/test_titanic.py
and submitted to Kaggle and got:
Optuna: 0.77751
HPO: 0.75598
AdaBoost: 0.67703
Maybe they should be moved to a new repo
When running the tutorial code:
#%%
import cflearn
from cfdata.tabular import *
# prepare iris dataset
iris = TabularDataset.iris()
iris = TabularData.from_dataset(iris)
# split 10% of the data as validation data
split = iris.split(0.1)
train, valid = split.remained, split.split
x_tr, y_tr = train.processed.xy
x_cv, y_cv = valid.processed.xy
data = x_tr, y_tr, x_cv, y_cv
m = cflearn.make().fit(*data)
# Make label predictions
m.predict(x_cv)
# Make probability predictions
m.predict_prob(x_cv)
# Estimate performance
cflearn.estimate(x_cv, y_cv, pipelines=m)
We get:
Traceback (most recent call last):
  File "C:\ProgramData\miniconda\lib\site-packages\IPython\core\interactiveshell.py", line 3331, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-4-78d46f42bbd0>", line 24, in <module>
    cflearn.estimate(x_cv, y_cv, pipelines=m)
TypeError: estimate() got an unexpected keyword argument 'pipelines'
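An "unexpected keyword argument" error like this usually means the installed version's signature differs from the one the tutorial was written for. As a hedged sketch, the standard-library `inspect` module can check whether a function accepts a given keyword before calling it. The two `estimate` stand-ins below are hypothetical and are not carefree-learn's real signatures.

```python
import inspect
from typing import Callable


def accepts_keyword(fn: Callable, name: str) -> bool:
    """Return True if `fn` can be called with `name` as a keyword argument."""
    params = inspect.signature(fn).parameters
    keyword_kinds = (
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
        inspect.Parameter.KEYWORD_ONLY,
    )
    if name in params and params[name].kind in keyword_kinds:
        return True
    # A **kwargs parameter accepts any keyword.
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())


# Hypothetical stand-ins for an older and a newer release of the same function.
def estimate_old(x, y, models=None):
    return None


def estimate_new(x, y, pipelines=None, other_patterns=None):
    return None


print(accepts_keyword(estimate_old, "pipelines"))  # False
print(accepts_keyword(estimate_new, "pipelines"))  # True
```

With the real package installed, `accepts_keyword(cflearn.estimate, "pipelines")` would show whether the installed release matches the tutorial.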
Hi,
FYI, I got this error running the example:
import cflearn
from cfdata.tabular import TabularDataset
x, y = TabularDataset.iris().xy
m = cflearn.make().fit(x, y)
Traceback (most recent call last):
  File "D:\Anaconda3\envs\tipjar\lib\site-packages\IPython\core\interactiveshell.py", line 3417, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-2-7dea84d121c1>", line 5, in <module>
    m = cflearn.make().fit(x, y)
  File "D:\Anaconda3\envs\tipjar\lib\site-packages\cflearn\bases.py", line 860, in fit
    self._before_loop(x, y, x_cv, y_cv)
  File "D:\Anaconda3\envs\tipjar\lib\site-packages\cflearn\bases.py", line 829, in _before_loop
    self.cv_data, self.tr_data = self.tr_data.split(self._cv_split)
ValueError: too many values to unpack (expected 2)
Was able to fix by installing the latest:
pip install -U git+https://github.com/carefree0910/carefree-learn.git
Hi, I am new to carefree and enjoying it so far. I am using cv_split=.2. My data is temporal and not IID, so I want to make sure the split doesn't shuffle or stratify. It appears from your code that it does not shuffle:
split = self.tr_data.split(self._cv_split)
Is this correct?
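Whatever the internal split does, one way to be certain for temporal data is to do the split yourself and pass the validation set explicitly, as the tutorial code elsewhere on this page does with `fit(x_tr, y_tr, x_cv, y_cv)`. Below is a minimal sketch of a chronological hold-out in plain Python. The helper name `chronological_split` is hypothetical and is not part of carefree-learn, and it assumes the rows are already sorted by time.

```python
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def chronological_split(rows: Sequence[T], cv_ratio: float) -> Tuple[List[T], List[T]]:
    """Hold out the last `cv_ratio` share of rows, with no shuffling or stratifying."""
    if not 0.0 < cv_ratio < 1.0:
        raise ValueError("cv_ratio must be strictly between 0 and 1")
    n_cv = max(1, int(round(len(rows) * cv_ratio)))
    cut = len(rows) - n_cv
    return list(rows[:cut]), list(rows[cut:])


# Ten time-ordered samples; the newest 20% become the validation set.
rows = list(range(10))
train_rows, cv_rows = chronological_split(rows, 0.2)
print(train_rows)  # [0, 1, 2, 3, 4, 5, 6, 7]
print(cv_rows)     # [8, 9]
```

Because the validation rows are always the most recent ones, no future information leaks into training.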
On my Ubuntu 18.04.4 server, when I run your quick start code, I get this error:
root@server1:~/newautoml# python3 cflearn.py
Traceback (most recent call last):
  File "cflearn.py", line 1, in <module>
    import cflearn
  File "/root/newautoml/cflearn.py", line 5, in <module>
    m = cflearn.make().fit(x, y)
AttributeError: module 'cflearn' has no attribute 'make'
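The traceback shows `import cflearn` resolving to `/root/newautoml/cflearn.py`, which is the script itself, so the script appears to be shadowing the installed package. That would explain why `make` is missing. A small sketch for spotting this kind of name clash follows. The helper `find_shadowing_files` is hypothetical, and the temporary directory only simulates the reporter's working directory.

```python
import tempfile
from pathlib import Path
from typing import List


def find_shadowing_files(directory: Path, module_name: str) -> List[Path]:
    """List local files or folders that would be imported instead of `module_name`."""
    candidates = [directory / f"{module_name}.py", directory / module_name]
    return [path for path in candidates if path.exists()]


with tempfile.TemporaryDirectory() as tmp:
    workdir = Path(tmp)
    # Simulate a script that is named after the package it tries to import.
    (workdir / "cflearn.py").write_text("import cflearn\n")
    shadowing = [p.name for p in find_shadowing_files(workdir, "cflearn")]
    clean = find_shadowing_files(workdir, "some_other_module")

print(shadowing)  # ['cflearn.py']
print(clean)      # []
```

If a clash is found, renaming the script (for example to `run_cflearn.py`) should let `import cflearn` resolve to the installed package again.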
When running the tutorial code:
import cflearn
from cfdata.tabular import *
# prepare iris dataset
iris = TabularDataset.iris()
iris = TabularData.from_dataset(iris)
# split 10% of the data as validation data
split = iris.split(0.1)
train, valid = split.remained, split.split
x_tr, y_tr = train.processed.xy
x_cv, y_cv = valid.processed.xy
data = x_tr, y_tr, x_cv, y_cv
#%%
fcnn = cflearn.make().fit(*data)
# 'overfit' validation set
auto = cflearn.Auto(TaskTypes.CLASSIFICATION).fit(*data, num_jobs=2)
# estimate manually
predictions = auto.predict(x_cv)
print("accuracy:", (y_cv == predictions).mean())
# estimate with `cflearn`
cflearn.estimate(
x_cv,
y_cv,
pipelines=fcnn,
other_patterns={"auto": auto.pattern},
)
I get this error:
  File "C:\ProgramData\miniconda\lib\site-packages\IPython\core\interactiveshell.py", line 3331, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "", line 4, in <module>
    auto = cflearn.Auto(TaskTypes.CLASSIFICATION).fit(*data, num_jobs=2)
AttributeError: module 'cflearn' has no attribute 'Auto'
Here's the ongoing repo.
e.g. lower the batch size when the std is too small
So we can use other forms of data in carefree-learn.
Maybe I am misunderstanding how it works, but when I run your latest version in automl mode (using https://github.com/carefree0910/carefree-learn/blob/dev/examples/titanic/automl.py), it runs the same methods it always did (fcnn_optuna, tree_dnn_optuna, etc.) and gets the same Kaggle score as before (0.78947). But in your latest version you wrote that you "Implemented more models (Linear, TreeLinear, Wide and Deep, RNN, Transformer, etc.)". Shouldn't those be in the automl part?
This is mainly for downstream usage, because in most cases neural networks that target tabular datasets do not need distributed training.
Hi, is it possible, or are there plans, to provide k-fold or time series split cross validation? Thank you.
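For reference, here is a minimal sketch of what a time series split does, written in plain Python and independent of carefree-learn: an expanding training window is followed by the next block of samples as the test fold, so training indices always precede test indices. The function name `time_series_splits` is hypothetical.

```python
from typing import Iterator, List, Tuple


def time_series_splits(
    n_samples: int, n_splits: int
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_indices, test_indices) pairs with an expanding train window."""
    fold = n_samples // (n_splits + 1)
    if fold < 1:
        raise ValueError("not enough samples for the requested number of splits")
    for k in range(1, n_splits + 1):
        train_end = fold * k
        # The last fold absorbs any leftover samples.
        test_end = n_samples if k == n_splits else train_end + fold
        yield list(range(train_end)), list(range(train_end, test_end))


splits = list(time_series_splits(n_samples=10, n_splits=3))
for train_idx, test_idx in splits:
    print(train_idx, test_idx)
```

scikit-learn also ships `KFold` and `TimeSeriesSplit` in `sklearn.model_selection`, so one option is to generate the index pairs there and fit a model on each fold by passing the fold's validation data explicitly.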
This bug is caused by introducing get_outputs. Previously, when binary_threshold_outputs was not None, we knew that this was definitely a binary classification task. But now it will never be None, even on regression tasks.
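One way to avoid this class of bug is to stop inferring the task type from whether a field is None and to carry an explicit flag instead. The sketch below only illustrates that idea. The `Outputs` class, its `is_binary_classification` field, and `apply_threshold` are hypothetical and are not carefree-learn's actual fix.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Outputs:
    """Hypothetical container: the task type is stated, not inferred."""

    is_binary_classification: bool
    binary_threshold_outputs: Optional[List[float]] = None


def apply_threshold(outputs: Outputs, threshold: float = 0.5) -> Optional[List[int]]:
    """Threshold probabilities only when the task is explicitly binary."""
    if not outputs.is_binary_classification:
        return None
    if outputs.binary_threshold_outputs is None:
        return None
    return [int(p >= threshold) for p in outputs.binary_threshold_outputs]


# The field is populated in both cases, which is the situation described above.
regression = Outputs(is_binary_classification=False, binary_threshold_outputs=[0.2, 0.9])
binary = Outputs(is_binary_classification=True, binary_threshold_outputs=[0.2, 0.9])

print(apply_threshold(regression))  # None
print(apply_threshold(binary))      # [0, 1]
```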
Currently we cannot change the default behaviour of Environment. Fix it by refactoring _preset_config and _init_config.
carefree-learn carefree as well.
On a Windows 10 Pro machine, trying to install via pip install carefree-learn breaks on some anaconda environments (but not on others, as I've confirmed).
I'm using the most up-to-date version of pip, 20.2.2.
Keyring is skipped due to an exception: "WindowsPath" object has no attribute 'read_text'
I figured this was due to pathlib2 interfering with pathlib, so I uninstalled pathlib2, but still no joy.
I performed a conda clean --all -y to remove any lingering tarballs, but this too did not help.
I then had the idea to install the dependencies manually, so I started with pip install carefree-ml, and this successfully installed carefree-ml.
I was then able to successfully run pip install carefree-learn.
Strange error. I wonder if it may be related to the new pip 20.2 dependency resolver.
Regardless, it's installed now. Hope this helps anyone else who encounters a similar install issue.
When I git clone your repo, pip3 install it (Ubuntu 18.04.5), and run test_titanic.py without any changes, I get this error:
root@ns544446:~/carefree-learn/examples/titanic# sudo python3 test_titanic.py
Traceback (most recent call last):
  File "test_titanic.py", line 64, in <module>
    test_adaboost()
  File "test_titanic.py", line 60, in test_adaboost
    _test("adaboost", _adaboost_core)
  File "test_titanic.py", line 44, in _test
    data, pattern = _core(train_file)
  File "test_titanic.py", line 36, in _adaboost_core
    ensemble = cflearn.Ensemble(TaskTypes.CLASSIFICATION, config)
AttributeError: module 'cflearn' has no attribute 'Ensemble'
pipe structures in ModelBase
Factory class