pylightgbm's People

Contributors

alno, ardalanm, ebazarov, ihopethiswillfi, miguelgfierro, xujin1982

pylightgbm's Issues

is_unbalance

Hi Ardalan,

Thank you for making the changes for early stopping, max_bin and verbose. Would you mind adding the parameter is_unbalance as well?

Thanks again,

Chris

Cannot train for n_iteration greater than 2600?

Hi.
It seems that I cannot train the model when num_iterations is set above 3000. Setting it to 5000 throws this error:
[LightGBM] [Info] 209.081899 seconds elapsed, finished iteration 2600
Traceback (most recent call last):
  File "/home/lemma/miniconda2/lib/python2.7/site-packages/pylightgbm/models.py", line 143, in fit
    with open(self.param['output_model'], mode='r') as file:
IOError: [Errno 2] No such file or directory: '/tmp/tmpNEm79g/LightGBM_model.txt'
Command exited with non-zero status 1

Does this need fixing, or is it unnecessary because the result will be the same no matter how large num_iterations is?
Anything below 3000 works fine.

Error installing via IPython terminal

Hello,

Whenever I try installing it from the IPython terminal or the Anaconda command prompt, it throws the following error, saying that the procedure entry point SSL_COMP_free_compression_methods could not be located.

The screenshot is attached. Would you know how to fix this error? I have been unable to find anything by searching for these keywords on Google, so apologies if it's not entirely related. Any help is appreciated.

[screenshot: error]

Support init score

LightGBM can take an initial score (as an array) through an additional file (train.txt.init).

Can you support this as well, as an input to the fit() function?

It is very useful for regression tasks, where an initial score of zeros is a poor choice and the mean of the target is a better one.

thx
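A minimal sketch of producing such a file by hand, assuming the init file sits next to the training data, carries the same name plus `.init`, and holds one score per training row. The helper name `write_init_scores` is hypothetical and not part of pylightgbm.

```python
def write_init_scores(data_path, targets):
    """Write one initial score per training row to `<data_path>.init`.

    Every row gets the mean of the targets, the baseline suggested above
    for regression. Returns the path of the file that was written.
    Hypothetical helper, not part of pylightgbm.
    """
    targets = list(targets)
    if not targets:
        raise ValueError("targets must not be empty")
    mean = sum(targets) / float(len(targets))
    init_path = data_path + ".init"
    with open(init_path, "w") as f:
        for _ in targets:
            # One score per line, in the same order as the training rows.
            f.write("{!r}\n".format(mean))
    return init_path
```

Calling `write_init_scores("train.txt", y)` would create `train.txt.init` alongside the data file.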

max_bin and early_stopping_rounds

Thank you @ArdalanM for creating the wrapper which looks great!

Is it possible to:

  1. add max_bin to the parameters
  2. add a verbose/silent flag to control whether LightGBM's running messages are printed, which would also help to:
  3. extract the best round from the running messages if early stopping is used.

Thanks,

PermissionError: [WinError 32] The process cannot access "LightGBM_model.txt"

The model trains and then fails at the very end.

The model still outputs a prediction when asked to.

[LightGBM] [Info] 0.018901 seconds elapsed, finished iteration 99
[LightGBM] [Info] 0.019084 seconds elapsed, finished iteration 100
[LightGBM] [Info] Finished training

... Lots of whitespace ...

---------------------------------------------------------------------------
PermissionError                           Traceback (most recent call last)
<ipython-input-11-f6bccd7e15ae> in <module>()
     17 #x_train, x_test, y_train, y_test = model_selection.train_test_split(X, y, test_size=0.2)
     18 
---> 19 clf.fit(X, y)
     20 print("Mean Square Error: ", metrics.mean_squared_error(y, clf.predict(X)))

C:\Anaconda\envs\py35\lib\site-packages\pylightgbm\models.py in fit(self, X, y, test_data)
    110         with open(self.param['output_model'], mode='r') as file:
    111             self.model = file.read()
--> 112             shutil.rmtree(tmp_dir)
    113 
    114         if test_data and self.param['early_stopping_round'] > 0:

C:\Anaconda\envs\py35\lib\shutil.py in rmtree(path, ignore_errors, onerror)
    486             os.close(fd)
    487     else:
--> 488         return _rmtree_unsafe(path, onerror)
    489 
    490 # Allow introspection of whether or not the hardening against symlink

C:\Anaconda\envs\py35\lib\shutil.py in _rmtree_unsafe(path, onerror)
    381                 os.unlink(fullname)
    382             except OSError:
--> 383                 onerror(os.unlink, fullname, sys.exc_info())
    384     try:
    385         os.rmdir(path)

C:\Anaconda\envs\py35\lib\shutil.py in _rmtree_unsafe(path, onerror)
    379         else:
    380             try:
--> 381                 os.unlink(fullname)
    382             except OSError:
    383                 onerror(os.unlink, fullname, sys.exc_info())

PermissionError: [WinError 32] The process cannot access the file because it is being used by another process: 'C:\\Users\\admin\\AppData\\Local\\Temp\\tmpehya862g\\LightGBM_model.txt'

Model prediction call:

[LightGBM] [Info] Finished loading parameters
[LightGBM] [Info] Finished loading 100 models
[LightGBM] [Info] Finished initializing prediction
[LightGBM] [Info] Finished prediction

... Lots of whitespace ...

Mean Square Error:  668.323460051

Does this error affect the model, and what can be done to stop it from being thrown? If you need any other information, please let me know. Thanks.
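The traceback shows shutil.rmtree(tmp_dir) being called while the `with open(...)` block is still active. On Windows an open file cannot be deleted, which would explain WinError 32. A minimal sketch of the reordering, assuming that is indeed the cause; `read_model_then_cleanup` is a hypothetical helper, not the wrapper's actual code.

```python
import os
import shutil


def read_model_then_cleanup(tmp_dir, model_name="LightGBM_model.txt"):
    """Read the model text, close the file, then remove the temp directory.

    Hypothetical helper illustrating the ordering; not part of pylightgbm.
    """
    model_path = os.path.join(tmp_dir, model_name)
    with open(model_path, mode="r") as f:
        model = f.read()
    # The handle is closed once the with-block ends, so the directory
    # can now be deleted on Windows as well.
    shutil.rmtree(tmp_dir)
    return model
```

Since the model text is read before the cleanup fails, this would also be consistent with predictions still working after the error.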

'LIGHTGBM_EXEC' environment variable cannot be found

The examples show:

path_to_exec = "~/Documents/apps/LightGBM/lightgbm"

This path does not exist in the package:

ls /Users/me/LightGBM/
CMakeLists.txt	README.md	docs		include		python-package	tests
LICENSE		build		examples	pmml		src		windows

...and when I try to build a model I get an error

import pandas as pd
from pylightgbm.models import GBMClassifier

df = pd.read_csv('my_data.csv')

params = {'exec_path': path_to_exec,
      'num_iterations': 1000, 'learning_rate': 0.01,
      'min_data_in_leaf': 1, 'num_leaves': 5,
      'metric': 'binary_error', 'verbose': False,
      'early_stopping_round': 20}

GBMClassifier(params).fit(df['X_var'], df['y_var'])

pyLightGBM is looking for 'LIGHTGBM_EXEC' environment variable, cannot be found.
exec_path will be deprecated in favor of environment variable

/Users/me/anaconda/lib/python2.7/site-packages/pylightgbm/models.pyc in fit(self, X, y, test_data, init_scores)
    129 
    130             process = subprocess.Popen([self.exec_path, "config={}".format(conf_filepath)],
--> 131                                        stdout=subprocess.PIPE, bufsize=1)
    132 
    133         else:

/Users/me/anaconda/lib/python2.7/subprocess.pyc in __init__(self, args, bufsize, executable, stdin, stdout, stderr, preexec_fn, close_fds, shell, cwd, env, universal_newlines, startupinfo, creationflags)
    708                                 p2cread, p2cwrite,
    709                                 c2pread, c2pwrite,
--> 710                                 errread, errwrite)
    711         except Exception:
    712             # Preserve original exception in case os.close raises.

/Users/me/anaconda/lib/python2.7/subprocess.pyc in _execute_child(self, args, executable, preexec_fn, close_fds, cwd, env, universal_newlines, startupinfo, creationflags, shell, to_close, p2cread, p2cwrite, c2pread, c2pwrite, errread, errwrite)
   1333                         raise
   1334                 child_exception = pickle.loads(data)
-> 1335                 raise child_exception
   1336 
   1337 

OSError: [Errno 13] Permission denied
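[Errno 13] from subprocess usually means the path points at a directory or at a file without execute permission. Note also that subprocess does not expand `~`, and that the directory listing above shows no compiled `lightgbm` binary. A minimal sketch of checking the path before training, under those assumptions; `check_lightgbm_exec` is a hypothetical helper, not part of pylightgbm.

```python
import os


def check_lightgbm_exec(path):
    """Return the expanded path if it points to an executable file.

    Hypothetical helper: raises ValueError with a specific reason otherwise.
    """
    full = os.path.abspath(os.path.expanduser(path))
    if os.path.isdir(full):
        raise ValueError("{} is a directory, expected the compiled binary".format(full))
    if not os.path.isfile(full):
        raise ValueError("{} does not exist; has LightGBM been built?".format(full))
    if not os.access(full, os.X_OK):
        raise ValueError("{} is not executable".format(full))
    return full
```

The returned path can then be passed as exec_path, for example `check_lightgbm_exec("~/Documents/apps/LightGBM/lightgbm")`.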

How can I use this package?

Hello.
This is my first time using C++ source and header files with Python.

If I want to use this package, should I put LightGBM's src into LighGBM/lightgbm? After that, can I build a classifier?

P.S. Is the directory name wrong? LighGBM -> LightGBM?

FileNotFoundError after validation

I get an error after validation.
Here is the code:

import pandas as pd
from sklearn.datasets import load_diabetes
from sklearn.model_selection import train_test_split
from pylightgbm.models import GBMClassifier
from sklearn.metrics import roc_auc_score

diabetes = load_diabetes()

split_ind = 200

X = diabetes['data']
y = diabetes['target']

X = pd.DataFrame(X).add_prefix('c')
y = pd.Series(y)
y = (y>150)*1

X_train, X_test = X[:split_ind], X[split_ind:]
y_train, y_test = y[:split_ind], y[split_ind:]

# Renamed from `exec`, which shadows a Python built-in.
exec_path = "~/LightGBM/lightgbm"

clf = GBMClassifier(exec_path=exec_path,
                    num_iterations=3000,
                    metric='auc',
                    early_stopping_round=20)

clf.fit(X_train, y_train, 
        test_data=[(X_test, y_test)])
clf.param['num_iterations'] = clf.best_round # Also .set_params wouldn't work
clf.fit(X_train, y_train)

The second 'clf.fit' raises 'FileNotFoundError: [Errno 2] No such file or directory: '/var/folders/~'', apparently because in 'def fit' the 'with process.stdout:' block does not write LightGBM_model.

Set LightGBM path with an environment variable

At the moment you need to pass the LightGBM path to the constructor, which can be a bad idea since your code won't work in other environments.

One solution is to check for an environment variable, such as LIGHTGBM_PATH, which would contain the LightGBM path.
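A minimal sketch of that lookup. The function name `resolve_exec_path`, the default variable name, and the precedence (explicit argument first, then environment) are assumptions for illustration, not the wrapper's actual behaviour.

```python
import os


def resolve_exec_path(exec_path=None, env_var="LIGHTGBM_EXEC", environ=None):
    """Pick the LightGBM binary path: explicit argument first, then environment.

    Hypothetical helper; the precedence chosen here is an assumption.
    """
    environ = os.environ if environ is None else environ
    candidate = exec_path or environ.get(env_var)
    if not candidate:
        raise ValueError("set {} or pass exec_path".format(env_var))
    # Expand a leading ~ so the result can be handed to subprocess directly.
    return os.path.expanduser(candidate)
```

With this in place, code that calls `resolve_exec_path()` runs unchanged on any machine where the variable is exported.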

pyLightGBM is looking for LIGHTGBM_EXEC environment variable

Hello,

I have been trying to run the pylightgbm code, but it raises the exception:

" pyLightGBM is looking for 'LIGHTGBM_EXEC' environment variable, cannot be found.
exec_path will be deprecated in favor of environment variable "

I have tried specifying the path to the lightgbm package installed in the library, and also pointing the path at the lightgbm and pylightgbm package files downloaded from their respective GitHub sources, but none of them seems to provide the 'LIGHTGBM_EXEC' file/folder.

I think that is why the exception keeps being raised. Is there a workaround for this issue?

I have also tried following the link below, which I think could provide a workaround, but the last two lines are NOT clear when using the conda prompt. Which source does pip install -r requirements.txt point to? The requirements are certainly not present in the pep8 package that was downloaded. Why is setup.py used at the end? What package does it try to install, considering pip was already used to install packages?

https://github.com/ArdalanM/pyLightGBM/blob/master/.travis.yml

  • conda create -q -n test-environment python=$TRAVIS_PYTHON_VERSION
  • source activate test-environment
  • pip install pytest pytest-cov python-coveralls pytest-xdist coverage #we need this version of coverage for coveralls.io to work
  • pip install pep8 pytest-pep8
  • pip install -r requirements.txt
  • python setup.py install

Any suggestions or feedback are welcome.

Missing log output

Even with verbose=True I do not get all of LightGBM's log output.
Having it would help debug problems in the LightGBM config.
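One possible cause, which is an assumption, is that only stdout of the LightGBM process is read (a traceback elsewhere on this page shows Popen called with stdout=subprocess.PIPE only), so anything written to stderr would be lost. A minimal sketch that merges both streams and echoes every line; `stream_command` is a hypothetical helper, not part of pylightgbm.

```python
import subprocess
import sys


def stream_command(cmd):
    """Run cmd, echo each output line as it arrives, return (lines, returncode).

    stderr is redirected into stdout so that no message is dropped.
    """
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT,
                            universal_newlines=True)
    lines = []
    for line in proc.stdout:
        sys.stdout.write(line)
        lines.append(line.rstrip("\n"))
    proc.stdout.close()
    return lines, proc.wait()
```

For example, `stream_command([exec_path, "config=train.conf"])` would print the full log, where `exec_path` and `train.conf` stand for your own binary and config file.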

[Improvement] pickle support

The xgboost sklearn wrapper for python has pickle support. It would be great if lightgbm models could be serialized just as easily.

File format issues

Sometimes I get
"input format error, should be LibSVM" but I have no real clue how to debug it. Do you have any suggestions?

The strange thing is that the training worked just fine:

[LightGBM] [Info] 3.972398 seconds elapsed, finished 59 iteration
[LightGBM] [Info] 4.042168 seconds elapsed, finished 60 iteration
[LightGBM] [Info] Finish train

The problem occurs when I try to predict new values.
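One way to narrow this down is to check the prediction input file line by line against the LibSVM layout (`label index:value index:value ...`). The sketch below is a generic validator, assuming the file really is meant to be LibSVM; `find_bad_libsvm_lines` is a hypothetical helper and says nothing about how the wrapper writes its files.

```python
def find_bad_libsvm_lines(lines):
    """Return (line_number, reason) pairs for lines that are not `label idx:value ...`."""
    problems = []
    for number, line in enumerate(lines, start=1):
        tokens = line.split()
        if not tokens:
            problems.append((number, "empty line"))
            continue
        try:
            float(tokens[0])
        except ValueError:
            problems.append((number, "label is not a number"))
            continue
        for token in tokens[1:]:
            index, sep, value = token.partition(":")
            try:
                # float(value) == float(value) is False for NaN, which is rejected too.
                ok = sep == ":" and int(index) >= 0 and float(value) == float(value)
            except ValueError:
                ok = False
            if not ok:
                problems.append((number, "bad feature token {!r}".format(token)))
                break
    return problems
```

Running it as `find_bad_libsvm_lines(open(path))` on the file passed to predict lists the first offending token on each bad line.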

Result file not found

A clf.predict(X) call seems to cause a
FileNotFoundError: [Errno 2] No such file or directory: 'pathToLightGBM/lightgbm_models/32028_1476974477/LightGBM_predict_result_32028_1476974523.txt' for me.
I just downloaded the package from GitHub and created a GBMClassifier; I am not sure whether an install via pip ... would be required.

edit

I am using a Mac / OS X 10.11.6

but I can see the files in the Finder:

edit2

I noticed that no LightGBM_predict_result_32028_1476974809.txt result file was created, only 3 other files.

lightgbm_models

Python example error

clf.fit(x_train, y_train, test_data=[(x_test, y_test)])
Traceback (most recent call last):
  File "", line 1, in
  File "/home/apps/duane_tmp/anaconda2/lib/python2.7/site-packages/pylightgbm/models.py", line 131, in fit
    stdout=subprocess.PIPE, bufsize=1)
  File "/home/apps/duane_tmp/anaconda2/lib/python2.7/subprocess.py", line 711, in __init__
    errread, errwrite)
  File "/home/apps/duane_tmp/anaconda2/lib/python2.7/subprocess.py", line 1343, in _execute_child
    raise child_exception
OSError: [Errno 2] No such file or directory

Hello, when I try the Python example, the line clf.fit(x_train, y_train, test_data=[(x_test, y_test)]) gives me the error above. Could you help me fix that? Thanks!

IOError when calling fit()

OS = Windows 10 x64
Cloned and built it just a few hours ago.

est = GBMRegressor(exec_path="O:/Coding/LightGBM/", 
                   config='', 
                   application='regression', 
                   num_iterations=2500, 
                   learning_rate=0.1, 
                   num_leaves=127, 
                   tree_learner='serial', 
                   num_threads=4, 
                   min_data_in_leaf=100, 
                   metric='l2', 
                   feature_fraction=1.0, 
                   feature_fraction_seed=2, 
                   bagging_fraction=1.0, 
                   bagging_freq=0, 
                   bagging_seed=3, 
                   metric_freq=1, 
                   early_stopping_round=0)
est.fit(X, y, test_data=[(X_holdout, y_holdout)])

IOError                                   Traceback (most recent call last)
in ()
----> 1 est.fit(X, y, test_data=[(X_holdout, y_holdout)])

C:\Users\ihopethiswillfi\Anaconda2\lib\site-packages\pylightgbm-0.2-py2.7.egg\pylightgbm\models.pyc in fit(self, X, y, test_data)
     71         os.system("{} config={}".format(self.exec_path, self.config))
     72 
---> 73         with open(self.param['output_model'], mode='rb') as file:
     74             self.model = file.read()
     75 

IOError: [Errno 2] No such file or directory: 'c:\users\ihopethiswillfi\appdata\local\temp\tmpksw8jv\LightGBM_model.txt'

GBM predict with returned non-zero exit status 1 error

Hello ArdalanM, this Python wrapper makes GBM great and simple to use.

I am now running your notebook regression_example_kaggle_allstate.ipynb and get an error at the gbmr.predict line. The output is:

<ipython-input-10-0c40d9335a9b> in <module>()
     22 
     23 gbmr.fit(X_train, y_train, test_data=[(X_valid, y_valid)])
---> 24 print("Mean Square Error: ", metrics.mean_absolute_error(y_true=(np.exp(y_valid)-1), y_pred=(np.exp(gbmr.predict(X_valid))-1)))

/usr/local/lib/python2.7/dist-packages/pylightgbm/models.pyc in predict(self, X)
    122 
    123         process = subprocess.check_output([self.exec_path, "config={}".format(conf_filepath)],
--> 124                                           universal_newlines=True)
    125 
    126         if self.verbose:

/usr/lib/python2.7/subprocess.pyc in check_output(*popenargs, **kwargs)
    572         if cmd is None:
    573             cmd = popenargs[0]
--> 574         raise CalledProcessError(retcode, cmd, output=output)
    575     return output
    576 

CalledProcessError: Command '['/home/lyz/Workspace/Github/LightGBM/lightgbm', 'config=/tmp/tmphdAl6R/predict.conf']' returned non-zero exit status 1

Really appreciate your help.
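The CalledProcessError above reports only the exit status, not what LightGBM printed before failing. A minimal sketch of surfacing that output, using subprocess.check_output with stderr merged into stdout; `run_capturing_errors` is a hypothetical helper, not part of pylightgbm.

```python
import subprocess


def run_capturing_errors(cmd):
    """Return the command's combined output, or raise with that output attached."""
    try:
        return subprocess.check_output(cmd, stderr=subprocess.STDOUT,
                                       universal_newlines=True)
    except subprocess.CalledProcessError as exc:
        # exc.output holds everything the process printed before exiting.
        raise RuntimeError("command failed with status {}:\n{}".format(
            exc.returncode, exc.output))
```

Running the same command from the traceback through this helper, or directly in a shell, should show the message LightGBM printed before it exited.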
