indicodatasolutions / finetune
Scikit-learn style model finetuning for NLP
Home Page: https://finetune.indico.io
License: Mozilla Public License 2.0
Describe the bug
I can't get cached_predict() to work properly. After the first call to predict(), the inference time is lightning fast, but the last value from the first call is used for all inference from then on. (see attached screenshot)
Minimum Reproducible Example
from finetune import Classifier

model = Classifier().load("path_to_model_file")
with model.cached_predict():
    print(model.predict_proba(["some text"]))
    print(model.predict_proba(["some other text for different probs"]))
Expected behavior
The inference time is significantly faster starting with the second predict and it predicts for the given input.
Additional context
This seems to work on CPU, but not on GPU. This was run on several AWS instance types for CPU performance and a p2.xlarge for GPU testing.
I'm using the Deep Learning AMI (Ubuntu) Version 19.0 - ami-05bc59103c52af154
Then I am pulling the development branch of finetune.
I tried with and without trying to upgrade tensorflow with no difference in performance.
Screen Shot
Thanks for writing this library to wrap the OpenAI code!! I don't think the current codebase does this (I was looking in the base model's _train_loop and validation_hook) but I could be wrong.
Is your feature request related to a problem? Please describe.
When I am fitting the classifier, I would like to save the weights of the model with the best validation data loss. In particular -- I call:
model = finetune.Classifier(n_epochs=10,
test_size=.3,
verbose=True,
batch_size=8,
lm_loss_coef=0,
val_interval=25)
model.fit(train_data, train_labels)
I watch as the validation loss goes down, and then goes back up again...
...
Train loss: 0.8263243668572684 Validation loss: 0.9696971137060008
Train loss: 0.7893467535291913 Validation loss: 0.9315027973123593
Train loss: 0.7428553240288371 Validation loss: 0.8978088509162531
Train loss: 0.6532491042988868 Validation loss: 0.8657591236030956
Train loss: 0.5872814164979386 Validation loss: 0.8302381457714927
Train loss: 0.5188518893581162 Validation loss: 0.8055230206469525
Train loss: 0.46836010163226455 Validation loss: 0.7790362483260943
Train loss: 0.42655090732702466 Validation loss: 0.7963167028100855
Train loss: 0.37630249256415615 Validation loss: 0.7668756079191992
Train loss: 0.3127427928071437 Validation loss: 0.7535473302166583
Train loss: 0.26105406358681316 Validation loss: 0.752195528951159
Train loss: 0.2315318344735824 Validation loss: 0.74845844850349
Train loss: 0.20082835943056582 Validation loss: 0.7339546502212082
Train loss: 0.16591417155263366 Validation loss: 0.7224668118229861
Train loss: 0.16113676263676516 Validation loss: 0.71280302448007
Train loss: 0.1351519643111266 Validation loss: 0.7155087849984496
Train loss: 0.1122367973046417 Validation loss: 0.7097793683152129
Train loss: 0.08857986326043485 Validation loss: 0.7209393355041678
Train loss: 0.07383205768643403 Validation loss: 0.7290630419633354
Train loss: 0.05744698858779037 Validation loss: 0.737265006848874
Train loss: 0.04475240544373843 Validation loss: 0.7473114921713067
Train loss: 0.03484421883801572 Validation loss: 0.7718343255340118
...
Also -- the point at which the minimum is reached varies run-to-run, so I can't simply set the number of epochs a priori.
Describe the solution you'd like
It would be great to have an option to cache the temporary weights and then, at the end of the training loop, have the best weights according to validation loss reloaded automatically.
Describe alternatives you've considered
I could write my own training loop where I call fit (epochs=1) and save, but caching the weights and then reloading the best weights according to validation loss in a single call would be extremely convenient.
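To make the request concrete, here is a minimal sketch of the "keep the best weights" loop. It does not use finetune's API: train_one_epoch, validation_loss and get_weights are hypothetical callables standing in for whatever the model exposes, and the toy losses are a few values rounded from the log above.

```python
import copy
from typing import Any, Callable, Optional, Tuple


def fit_keep_best(
    train_one_epoch: Callable[[], Any],
    validation_loss: Callable[[], float],
    get_weights: Callable[[], Any],
    n_epochs: int,
) -> Tuple[Optional[Any], float]:
    """Train for n_epochs and return the weights with the lowest validation loss."""
    best_loss = float("inf")
    best_weights = None
    for _ in range(n_epochs):
        train_one_epoch()
        loss = validation_loss()
        if loss < best_loss:
            best_loss = loss
            # Deep-copy so later epochs cannot mutate the cached snapshot.
            best_weights = copy.deepcopy(get_weights())
    return best_weights, best_loss


# Toy stand-in for a model: the "weights" are just an epoch counter.
losses = iter([0.7128, 0.7155, 0.7098, 0.7209, 0.7291])
state = {"epoch": 0}


def _train():
    state["epoch"] += 1


best_weights, best_loss = fit_keep_best(
    train_one_epoch=_train,
    validation_loss=lambda: next(losses),
    get_weights=lambda: state,
    n_epochs=5,
)
print(best_weights, best_loss)  # {'epoch': 3} 0.7098
```

Training continues past the minimum here; the snapshot taken at the third epoch is what gets returned, which is the behavior the feature request asks for.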
Describe the bug
Tensorboard event files are 800MB+ per run
Minimum Reproducible Example
import finetune
x = ["foo", "bar", "baz"]
y = [1, 0, 0]
classifier = finetune.Classifier(tensorboard_folder="./tb_test")
classifier.fit(x, y)
I'm not sure what exactly is contained in the log files, but this is pretty annoying because it means that after ~30 runs, TensorBoard is taking 40+ GB of RAM for me. It looks like the graph definition is huge for some reason (I checked by reading the file with tf.train.summary_iterator('./events.out....')).
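For anyone wanting to check how much disk space their runs take, the following sketch sums event-file sizes using only the standard library. The function name event_file_sizes is made up for illustration, and it assumes event files follow the usual events.out.tfevents.* naming; it does not inspect what is inside the files.

```python
import os


def event_file_sizes(log_dir):
    """Return {path: size_in_bytes} for every TensorBoard event file under log_dir."""
    sizes = {}
    for root, _dirs, files in os.walk(log_dir):
        for name in files:
            # Event files are conventionally named events.out.tfevents.*
            if name.startswith("events.out.tfevents"):
                path = os.path.join(root, name)
                sizes[path] = os.path.getsize(path)
    return sizes


if __name__ == "__main__":
    # "./tb_test" is the tensorboard_folder used in the example above.
    for path, size in sorted(event_file_sizes("./tb_test").items()):
        print(f"{size / 1e6:10.1f} MB  {path}")
```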
Describe the bug
Hello,
When calling model.fit, I get an unexpected index out of bounds exception. The full trace follows:
len(df_train["text"])= 20 type= <class 'pandas.core.series.Series'>
len(df_train["label"])= 20 type= <class 'pandas.core.series.Series'>
IndexError Traceback (most recent call last)
<ipython-input-34-7fa88e2450d2> in <module>()
2 print("len(df_train[\"text\"])=", len(df_train["text"]), "type=", type(df_train["text"]))
3 print("len(df_train[\"label\"])=", len(df_train["label"]), "type=", type(df_train["label"]))
----> 4 model.fit(df_train['text'], df_train['label']) # Finetune base model on custom data
5 model.save("model_repairType.md5") # Serialize the model to disk
/home/nico/anaconda3/envs/py36/lib/python3.6/site-packages/finetune/base.py in fit(self, *args, **kwargs)
308 def fit(self, *args, **kwargs):
309 """ An alias for finetune. """
--> 310 return self.finetune(*args, **kwargs)
311
312 def _predict(self, Xs, max_length=None):
/home/nico/anaconda3/envs/py36/lib/python3.6/site-packages/finetune/classifier.py in finetune(self, X, Y, batch_size)
55 corresponds to the number of training examples provided to each GPU.
56 """
---> 57 return super().finetune(X, Y=Y, batch_size=batch_size)
58
59 def get_eval_fn(cls):
/home/nico/anaconda3/envs/py36/lib/python3.6/site-packages/finetune/base.py in finetune(self, Xs, Y, batch_size)
199 arr_encoded,
200 Y=Y,
--> 201 batch_size=batch_size,
202 )
203
/home/nico/anaconda3/envs/py36/lib/python3.6/site-packages/finetune/base.py in _training_loop(self, arr_encoded, Y, batch_size)
215 else:
216 Y = np.asarray(Y)
--> 217 train_Y = self.label_encoder.fit_transform(Y[train_idxs])
218 val_Y = self.label_encoder.transform(Y[val_idxs])
219 target_dim = self.label_encoder.target_dim
IndexError: index 26 is out of bounds for axis 1 with size 20
Minimum Reproducible Example
This is the code I run:
model = Classifier(n_epochs=2, tensorboard_folder='.tensorboard', chunk_long_sequences=True)
print("len(df_train[\"text\"])=", len(df_train["text"]), "type=", type(df_train["text"]))
print("len(df_train[\"label\"])=", len(df_train["label"]), "type=", type(df_train["label"]))
model.fit(df_train['text'], df_train['label'])
model.save("model_repairType.md5")
Expected behavior
classification of 'text' elements to 'label'
Additional context
no
Many thanks,
Nicolas
from finetune import Classifier
model = Classifier.load(model_fname)
fails with No such file or directory: encoder_bpe_40000.json
Fixed by calling this once:
import finetune.download
finetune.download.download_data_if_required()
Describe the bug
I get the following warning both in training and prediction:
WARNING:tensorflow:From ~/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/tensorflow/python/framework/function.py:986: calling Graph.create_op (from tensorflow.python.framework.ops) with compute_shapes is deprecated and will be removed in a future version.
Instructions for updating:
Shapes are always computed; don't use the compute_shapes as it has no effect.
Minimum Reproducible Example
[... train a classifier as in the examples and save it ...]
model = ft.Classifier.load(PATH)
model.predict(X_test)
Expected behavior
No warnings
Describe the bug
After I load a saved model, I'm unable to train it again. I've reproduced the bug on two different datasets.
ValueError: Operation name: "NoOp_1"
op: "NoOp"
is not an element of this graph.
Minimum Reproducible Example
from finetune import Classifier
from sklearn.datasets import fetch_20newsgroups
dataset = fetch_20newsgroups()
trainX, trainY = dataset.data[:100], dataset.target[:100]
model = Classifier(n_epochs=1)
model.fit(trainX, trainY)
model.save('repro')
model = Classifier.load('repro')
model.fit(trainX, trainY)
Describe the bug
return fn(*args)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1319, in _run_fn
options, feed_dict, fetch_list, target_list, run_metadata)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1407, in _call_tf_sessionrun
run_metadata)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Input to reshape is a tensor with 24576 values, but the requested shape has 0
[[{{node OptimizeLoss/gradients/model/target/Sum_1_grad/Reshape}} = Reshape[T=DT_FLOAT, Tshape=DT_INT32, _device="/job:localhost/replica:0/task:0/device:GPU:0"](OptimizeLoss/gradients/model/target/Abs_1_grad/mul, OptimizeLoss/gradients/model/target/Sum_1_grad/DynamicStitch/_2401)]]
[[{{node OptimizeLoss/control_dependency/_2705}} = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_15696_OptimizeLoss/control_dependency", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "finetune/datasets/quora_similarity.py", line 51, in <module>
model.fit(list(zip(trainX1, trainX2)), trainY)
File "/root/code/indico/finetune/finetune/base.py", line 362, in fit
return self.finetune(*args, **kwargs)
File "/root/code/indico/finetune/finetune/classifier.py", line 69, in finetune
return super().finetune(X, Y=Y, batch_size=batch_size)
File "/root/code/indico/finetune/finetune/base.py", line 236, in finetune
estimator.train(train_input_fn, hooks=train_hooks, steps=num_steps)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/estimator/estimator.py", line 354, in train
loss = self._train_model(input_fn, hooks, saving_listeners)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/estimator/estimator.py", line 1207, in _train_model
return self._train_model_default(input_fn, hooks, saving_listeners)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/estimator/estimator.py", line 1241, in _train_model_default
saving_listeners)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/estimator/estimator.py", line 1471, in _train_with_estimator_spec
_, loss = mon_sess.run([estimator_spec.train_op, estimator_spec.loss])
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/monitored_session.py", line 671, in run
run_metadata=run_metadata)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/monitored_session.py", line 1156, in run
run_metadata=run_metadata)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/monitored_session.py", line 1255, in run
raise six.reraise(*original_exc_info)
File "/usr/local/lib/python3.5/dist-packages/six.py", line 693, in reraise
raise value
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/monitored_session.py", line 1240, in run
return self._sess.run(*args, **kwargs)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/monitored_session.py", line 1312, in run
run_metadata=run_metadata)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/monitored_session.py", line 1076, in run
return self._sess.run(*args, **kwargs)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 929, in run
run_metadata_ptr)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1152, in _run
feed_dict_tensor, options, run_metadata)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1328, in _do_run
run_metadata)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1348, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Input to reshape is a tensor with 24576 values, but the requested shape has 0
[[node OptimizeLoss/gradients/model/target/Sum_1_grad/Reshape (defined at /usr/local/lib/python3.5/dist-packages/tensorflow/contrib/layers/python/layers/optimizers.py:239) = Reshape[T=DT_FLOAT, Tshape=DT_INT32, _device="/job:localhost/replica:0/task:0/device:GPU:0"](OptimizeLoss/gradients/model/target/Abs_1_grad/mul, OptimizeLoss/gradients/model/target/Sum_1_grad/DynamicStitch/_2401)]]
[[{{node OptimizeLoss/control_dependency/_2705}} = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_15696_OptimizeLoss/control_dependency", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]
Caused by op 'OptimizeLoss/gradients/model/target/Sum_1_grad/Reshape', defined at:
File "finetune/datasets/quora_similarity.py", line 51, in <module>
model.fit(list(zip(trainX1, trainX2)), trainY)
File "/root/code/indico/finetune/finetune/base.py", line 362, in fit
return self.finetune(*args, **kwargs)
File "/root/code/indico/finetune/finetune/classifier.py", line 69, in finetune
return super().finetune(X, Y=Y, batch_size=batch_size)
File "/root/code/indico/finetune/finetune/base.py", line 236, in finetune
estimator.train(train_input_fn, hooks=train_hooks, steps=num_steps)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/estimator/estimator.py", line 354, in train
loss = self._train_model(input_fn, hooks, saving_listeners)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/estimator/estimator.py", line 1207, in _train_model
return self._train_model_default(input_fn, hooks, saving_listeners)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/estimator/estimator.py", line 1237, in _train_model_default
features, labels, model_fn_lib.ModeKeys.TRAIN, self.config)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/estimator/estimator.py", line 1195, in _call_model_fn
model_fn_results = self._model_fn(features=features, **kwargs)
File "/root/code/indico/finetune/finetune/model.py", line 154, in _model_fn
summaries=summaries
File "/usr/local/lib/python3.5/dist-packages/tensorflow/contrib/layers/python/layers/optimizers.py", line 239, in optimize_loss
colocate_gradients_with_ops=colocate_gradients_with_ops)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/optimizer.py", line 519, in compute_gradients
colocate_gradients_with_ops=colocate_gradients_with_ops)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/gradients_impl.py", line 630, in gradients
gate_gradients, aggregation_method, stop_gradients)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/gradients_impl.py", line 814, in _GradientsHelper
lambda: grad_fn(op, *out_grads))
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/gradients_impl.py", line 408, in _MaybeCompile
return grad_fn() # Exit early
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/gradients_impl.py", line 814, in <lambda>
lambda: grad_fn(op, *out_grads))
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/math_grad.py", line 83, in _SumGrad
grad = array_ops.reshape(grad, output_shape_kept_dims)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/gen_array_ops.py", line 6482, in reshape
"Reshape", tensor=tensor, shape=shape, name=name)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
op_def=op_def)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/util/deprecation.py", line 488, in new_func
return func(*args, **kwargs)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/ops.py", line 3274, in create_op
op_def=op_def)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/ops.py", line 1770, in __init__
self._traceback = tf_stack.extract_stack()
...which was originally created as op 'model/target/Sum_1', defined at:
File "finetune/datasets/quora_similarity.py", line 51, in <module>
model.fit(list(zip(trainX1, trainX2)), trainY)
[elided 6 identical lines from previous traceback]
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/estimator/estimator.py", line 1195, in _call_model_fn
model_fn_results = self._model_fn(features=features, **kwargs)
File "/root/code/indico/finetune/finetune/model.py", line 100, in _model_fn
target_model_state = target_model_op(featurizer_state=featurizer_state, Y=Y, params=params, mode=mode)
File "/root/code/indico/finetune/finetune/model.py", line 73, in target_model_op
class_weights=weighted_tensor
File "/root/code/indico/finetune/finetune/comparison.py", line 55, in _target_model
featurizer_state["features"] = tf.abs(tf.reduce_sum(featurizer_state["features"], 1))
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/util/deprecation.py", line 488, in new_func
return func(*args, **kwargs)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/math_ops.py", line 1345, in reduce_sum
name=name))
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/gen_math_ops.py", line 8389, in _sum
name=name)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
op_def=op_def)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/util/deprecation.py", line 488, in new_func
return func(*args, **kwargs)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/ops.py", line 3274, in create_op
op_def=op_def)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/ops.py", line 1770, in __init__
self._traceback = tf_stack.extract_stack()
InvalidArgumentError (see above for traceback): Input to reshape is a tensor with 24576 values, but the requested shape has 0
[[node OptimizeLoss/gradients/model/target/Sum_1_grad/Reshape (defined at /usr/local/lib/python3.5/dist-packages/tensorflow/contrib/layers/python/layers/optimizers.py:239) = Reshape[T=DT_FLOAT, Tshape=DT_INT32, _device="/job:localhost/replica:0/task:0/device:GPU:0"](OptimizeLoss/gradients/model/target/Abs_1_grad/mul, OptimizeLoss/gradients/model/target/Sum_1_grad/DynamicStitch/_2401)]]
[[{{node OptimizeLoss/control_dependency/_2705}} = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_15696_OptimizeLoss/control_dependency", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]
Minimum Reproducible Example
Issue training the following model
model = Comparison(low_memory_mode=True, n_epochs=5, batch_size=32, early_stopping_steps=10000)
Additional context
I am unable to reproduce this exception, but I've logged it here so we can build a picture of what is going on if it happens again.
Describe the bug
I use the following code to run a demo on SNLI dataset.
It keeps outputting '0it [00:00, ?it/s]'
The output file looks like this:
FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
from ._conv import register_converters as _register_converters
0it [00:00, ?it/s]
0it [00:00, ?it/s]
0it [00:00, ?it/s]
0it [00:00, ?it/s]
Minimum Reproducible Example
from finetune import Entailment


def trim(string):
    # Keep at most the first 256 space-separated tokens and drop the trailing newline.
    try:
        return ' '.join(string.split(' ')[:256]).rstrip('\n')
    except Exception:
        raise ValueError(f'{string}')


def read_file(file):
    # Read a text file into a list of trimmed lines, one example per line.
    with open(file) as f:
        return [trim(line) for line in f]


if __name__ == "__main__":
    trainX1 = read_file('premise_snli_1.0_train.txt')
    trainX2 = read_file('hypothesis_snli_1.0_train.txt')
    trainY = read_file('label_snli_1.0_train.txt')
    testX1 = read_file('premise_snli_1.0_test.txt')
    testX2 = read_file('hypothesis_snli_1.0_test.txt')
    testY = read_file('label_snli_1.0_test.txt')
    model = Entailment(verbose=True)
    model.fit(trainX1, trainX2, trainY)
    model.save('./saved_snli_model')
    pred_result = model.predict(testX1, testX2)
premise_snli_1.0_train.txt is a file where each line is a sentence.
In the config.py file I set max_length to 258 and batch_size to 8.
Is it possible to build a siamese model for fine-tuning, specifically for the text inference task?
Describe the bug
In these lines:
finetune/finetune/classifier.py
Lines 13 to 16 in 9a62687
It seems to me that the return statement should either be
return [Xs[i[0]] for i in idxs], [Y[i[0]] for i in idxs]
(Y instead of Ys)
or
return [Xs[i[0]] for i in idxs], Ys
Since as is you're basically applying the originalIndex->sampledIndex mapping twice
I'm not sure if this actually causes wrong labels but it looks like it could.
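The referenced lines are not reproduced here, so the following is only a sketch of the pattern the report describes, with made-up data. resample_twice is a hypothetical reconstruction of the suspected bug (labels already resampled into Ys, then indexed again with the same idxs); resample_once is the first proposed fix.

```python
def resample_once(Xs, Y, idxs):
    """Apply the sampled indices to texts and labels exactly once."""
    return [Xs[i[0]] for i in idxs], [Y[i[0]] for i in idxs]


def resample_twice(Xs, Y, idxs):
    """Hypothetical buggy variant: labels are resampled, then indexed again."""
    Ys = [Y[i[0]] for i in idxs]  # first application of the mapping
    return [Xs[i[0]] for i in idxs], [Ys[i[0]] for i in idxs]  # second application


Xs = ["a", "b", "c", "d"]
Y = [0, 1, 2, 3]             # label k belongs to the k-th text
idxs = [[2], [0], [3], [3]]  # made-up sampler output, one-element index lists

print(resample_once(Xs, Y, idxs))   # (['c', 'a', 'd', 'd'], [2, 0, 3, 3])
print(resample_twice(Xs, Y, idxs))  # (['c', 'a', 'd', 'd'], [3, 2, 3, 3])
```

In this toy case the double mapping pairs text 'c' with label 3 instead of 2, so under the stated assumption the concern about wrong labels would be justified.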
I am trying to run training with nvidia-docker. I have CUDA 8 and get the following error when running the training file.
Here is the image of my terminal output: https://ibb.co/eRoj5p
Here is the image of nvidia-smi and nvidia-docker on terminal: https://ibb.co/jEtOy9
I am not able to figure out what the problem is. I have Docker and a compatible version of nvidia-docker, NVIDIA GPUs, and CUDA 8, and I have verified the setup by running the nvidia-smi test under nvidia-docker. But when I run the finetune Docker image and then run the training file, it gives the error shown in the screenshot.
I would really appreciate it if someone could point me in the right direction to solve this.
Hi! I'm trying to run the SST classification example in a Jupyter notebook, but the kernel keeps dying as soon as the TensorFlow variables are initialized.
I guess that maybe the model is too big to fit in memory. But I tried lowering the batch size to 1 and the max_length to 10, and the kernel still died.
I tried this on a machine with 16 GB of RAM and 8 CPUs, as well as on the same machine using 2 GeForce GTX 970 cards with 4 GB of memory each.
Do I need more memory to be able to use the classification model?
Describe the bug
Method Comparison.predict_proba returns the most likely class, but not the probabilities of the classes.
Minimum Reproducible Example
from finetune import Comparison
model = Comparison.load(model_path)
model.predict_proba("my sentence", "other sentence")
Expected behavior
{0: float, 1: float}
Additional context
[int]
Describe the bug
Problem deleting temp file while running in Windows 10.
Error in atexit._run_exitfuncs:
Traceback (most recent call last):
File "C:\Users\yang.liu\AppData\Local\conda\conda\envs\python36\lib\site-packages\finetune\base.py", line 537, in __del__
shutil.rmtree(file_or_folder)
File "C:\Users\yang.liu\AppData\Local\conda\conda\envs\python36\lib\shutil.py", line 494, in rmtree
return _rmtree_unsafe(path, onerror)
File "C:\Users\yang.liu\AppData\Local\conda\conda\envs\python36\lib\shutil.py", line 384, in _rmtree_unsafe
_rmtree_unsafe(fullname, onerror)
File "C:\Users\yang.liu\AppData\Local\conda\conda\envs\python36\lib\shutil.py", line 393, in _rmtree_unsafe
onerror(os.rmdir, path, sys.exc_info())
File "C:\Users\yang.liu\AppData\Local\conda\conda\envs\python36\lib\shutil.py", line 391, in _rmtree_unsafe
os.rmdir(path)
OSError: [WinError 145] The directory is not empty: 'C:\\Users\\yang.liu\\AppData\\Local\\Temp\\Finetunerxoy0m5o\\eval'
after running this code:
import os

from finetune import Classifier


def train(X_train, Y_train):
    model = Classifier()
    model.fit(X_train, Y_train)
    print('trained model')
    out_dir = os.path.join('models', 'finetune.model')
    model.save(out_dir)
    return model
Expected behavior
Expected to delete the temp file if run as administrator
Additional context
Please don't ask why I use Windows. I know Linux works.
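As a possible user-side workaround, and not the library's own cleanup code, a deletion helper can retry and then fall back to shutil.rmtree(..., ignore_errors=True) so that nothing is raised at interpreter exit. The helper name remove_tree and its retries and delay parameters are assumptions for illustration.

```python
import os
import shutil
import tempfile
import time


def remove_tree(path, retries=3, delay=0.1):
    """Try to delete a directory tree, retrying on OSError; return True on success."""
    for _ in range(retries):
        try:
            shutil.rmtree(path)
            return True
        except FileNotFoundError:
            return True  # already gone
        except OSError:
            # e.g. WinError 145 while another handle still holds a file open
            time.sleep(delay)
    # Last resort: delete what we can without raising.
    shutil.rmtree(path, ignore_errors=True)
    return not os.path.exists(path)


# Mimic the layout from the traceback: a temp folder containing an "eval" directory.
tmp = tempfile.mkdtemp(prefix="Finetune")
os.makedirs(os.path.join(tmp, "eval"))
print(remove_tree(tmp))
```

Whether retrying helps depends on why the directory is still in use; if a file handle stays open for the life of the process, only the ignore_errors fallback applies and the folder may be left behind.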
Hi,
Character based language modeling has its advantages over word level prediction, and I'm wondering if I'll be able to use this wrapper or not.
My plan is to train a model using Google's T2T as documented here. The model can be trained using subword (default), character, or word-level encodings. If I were to use any of these options, would the saved model work out of the box with finetune? Is there anything I should watch out for when training the model?
The repo looks like it is very well made, I hope this would be seamless. Does anyone know?
Hi, seems great work done by the team.
According to the documentation, I understand that every model uses a pre-trained language model.
Can I use it for the following scenario, if yes how?:
Describe the bug
File "/home/tingkai/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1322, in _do_call
return fn(*args)
File "/home/tingkai/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1307, in _run_fn
options, feed_dict, fetch_list, target_list, run_metadata)
File "/home/tingkai/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1409, in _call_tf_sessionrun
run_metadata)
tensorflow.python.framework.errors_impl.InvalidArgumentError: You must feed a value for placeholder tensor 'Placeholder_3' with dtype float
[[Node: Placeholder_3 = Placeholder[dtype=DT_FLOAT, shape=<unknown>, _device="/job:localhost/replica:0/task:0/device:GPU:0"]()]]
[[Node: Mean_3/_943 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_3032_Mean_3", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "test.py", line 18, in <module>
model.fit(train_X, train_y)
File "/home/tingkai/finetune/finetune/lm_base.py", line 169, in fit
return self.finetune(*args, **kwargs)
File "/home/tingkai/finetune/finetune/lm_classifier.py", line 47, in finetune
return self._finetune(X, Y=Y, batch_size=batch_size)
File "/home/tingkai/finetune/finetune/lm_base.py", line 130, in _finetune
summary = self.sess.run(self.summaries, {self.X: xmb, self.M: mmb, self.Y: ymb})
File "/home/tingkai/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 900, in run
run_metadata_ptr)
File "/home/tingkai/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1135, in _run
feed_dict_tensor, options, run_metadata)
File "/home/tingkai/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1316, in _do_run
run_metadata)
File "/home/tingkai/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1335, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: You must feed a value for placeholder tensor 'Placeholder_3' with dtype float
[[Node: Placeholder_3 = Placeholder[dtype=DT_FLOAT, shape=<unknown>, _device="/job:localhost/replica:0/task:0/device:GPU:0"]()]]
[[Node: Mean_3/_943 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_3032_Mean_3", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]
Caused by op 'Placeholder_3', defined at:
File "test.py", line 18, in <module>
model.fit(train_X, train_y)
File "/home/tingkai/finetune/finetune/lm_base.py", line 169, in fit
return self.finetune(*args, **kwargs)
File "/home/tingkai/finetune/finetune/lm_classifier.py", line 47, in finetune
return self._finetune(X, Y=Y, batch_size=batch_size)
File "/home/tingkai/finetune/finetune/lm_base.py", line 111, in _finetune
self._build_model(n_updates_total=n_updates_total, target_dim=self.target_dim)
File "/home/tingkai/finetune/finetune/lm_base.py", line 357, in _build_model
self._construct_graph(n_updates_total, target_dim, train=train)
File "/home/tingkai/finetune/finetune/lm_base.py", line 277, in _construct_graph
self._define_placeholders()
File "/home/tingkai/finetune/finetune/lm_base.py", line 386, in _define_placeholders
self.do_dropout = tf.placeholder(tf.float32) # 1 for do dropout and 0 to not do dropout
File "/home/tingkai/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/array_ops.py", line 1808, in placeholder
return gen_array_ops.placeholder(dtype=dtype, shape=shape, name=name)
File "/home/tingkai/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/gen_array_ops.py", line 4848, in placeholder
"Placeholder", dtype=dtype, shape=shape, name=name)
File "/home/tingkai/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
op_def=op_def)
File "/home/tingkai/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3392, in create_op
op_def=op_def)
File "/home/tingkai/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1718, in __init__
self._traceback = self._graph._extract_stack() # pylint: disable=protected-access
InvalidArgumentError (see above for traceback): You must feed a value for placeholder tensor 'Placeholder_3' with dtype float
[[Node: Placeholder_3 = Placeholder[dtype=DT_FLOAT, shape=<unknown>, _device="/job:localhost/replica:0/task:0/device:GPU:0"]()]]
[[Node: Mean_3/_943 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_3032_Mean_3", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]
Minimum Reproducible Example
I was running the example code with my own data.
train_X is a list of strings and train_y is a NumPy array of integer labels.
model = LanguageModelClassifier()
model.fit(train_X, train_y)
predictions = model.predict(test_X)
model.save('test_model/model_0')
This causes training to ignore the most frequent class.
Example to reproduce:
import numpy as np
import pandas as pd
from finetune.imbalance import compute_class_weights
np.random.seed(0)
y = pd.Series(np.random.choice(a=[0, 1, 2], size=1000, p=[0.3, 0.6, 0.1]))
print(y.value_counts(normalize=True))
print(compute_class_weights('log', y))
A way to pass in .tsv or .csv file for custom dataset for NLI and scitail dataset
Is your feature request related to a problem? Please describe.
In order to use the classifier on different languages / specific domains it would be useful to be able to pretrain the language model.
Describe the solution you'd like
Calling .fit on a corpus (i.e. with no labels) should train the language model.
model.fit(corpus)
Describe alternatives you've considered
Use the original repo which doesn't have a simple to use interface.
Describe the bug
When I attempt to pretrain a model using unlabeled data, I get an error if I have 50 or more examples:
ValueError: Cannot feed value of shape (2, 0) for Tensor 'Placeholder_3:0', which has shape '(?, 1)'
I'm guessing this is related to validation:
:param val_size: Validation set size as a percentage of all training data. Validation will not be run by default if n_examples < 50.
Minimum Reproducible Example
It seems to be dataset independent, however this does the trick:
from sklearn.datasets import fetch_20newsgroups
dataset = fetch_20newsgroups()
from finetune import Classifier
model = Classifier()
model.fit(dataset.data[:50])
Note that if I instead take the slice dataset.data[:49], training succeeds.
Hey! Thanks for the awesome work. I was wondering if I could use and update finetune to do the following:
Instead of using (Start, Text1, Delim, Text2, Extract) and (Start, Text2, Delim, Text1, Extract) as in the paper, can we pass (Start, Text1, Extract) and (Start, Text2, Extract) separately through the transformer?
This could be thought of as obtaining sentence/document embeddings for Text1 and Text2 separately. Upon doing that, I would like to compare their similarity using a distance metric such as cosine distance. (i.e. train the transformer as a siamese network.)
Would you suggest I build such a model on top of a fork of finetune?
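As a rough illustration of the comparison step described above, the sketch below computes cosine similarity between two embedding vectors with NumPy. The vectors are stand-ins. How the embeddings are produced (for example by a separate pass of each text through the transformer) is outside this sketch, and `cosine_similarity` is a hypothetical helper rather than part of finetune.

```python
import numpy as np

def cosine_similarity(a, b, eps=1e-8):
    """Cosine similarity between two 1-D vectors.

    eps guards against division by zero for all-zero vectors.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + eps))

# Stand-in vectors; in practice these would be the embeddings of Text1 and Text2.
emb_text1 = np.array([0.2, 0.7, 0.1])
emb_text2 = np.array([0.4, 1.4, 0.2])  # same direction as emb_text1
similarity = cosine_similarity(emb_text1, emb_text2)
distance = 1.0 - similarity  # cosine distance
print(similarity, distance)
```

In a siamese setup, both texts would go through the same weights, and a loss on this similarity (or distance) would be backpropagated through both passes.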
Looking at #107, one notebook reports on the order of 10k iterations per second on the GPU (see this notebook), but I'm getting <10 iterations per second (see this notebook).
Any ideas of what is going on? I tried 0.3.1 but that didn't speed anything up.
Sorry if this is a stupid question..
I'm curious if it's possible to have numerical features using this model? The documentation says that X should be an array of text.
Thanks for your time and really nice project
def get_ema_if_exists(v, gvs):
    name = v.name.split(':')[0]
    ema_name = name+'/ExponentialMovingAverage:0'
    ema_v = [v for v in gvs if v.name == ema_name]
    if len(ema_v) == 0:
        ema_v = [v]
    return ema_v[0]
def get_ema_vars(*vs):
    if tf.get_variable_scope().reuse:
        gvs = tf.global_variables()
        vs = [get_ema_if_exists(v, gvs) for v in vs]
    if len(vs) == 1:
        return vs[0]
    else:
        return vs
g, b = get_ema_vars(g, b)
I think g, b is the original tensor, not the EMA.
I was just wondering if you've considered adding BERT as an additional backend (as an alternative to the OpenAI GPT), which seems to improve on the performance of the GPT in most tasks.
Their TensorFlow code is open source here: https://github.com/google-research/bert https://arxiv.org/abs/1810.04805
Hi all,
Is it possible to finetune in an unsupervised way, i.e. train the language model first (or only the language model)?
I have a lot of unlabelled data and just a few hundred labelled examples, so I'd like to first finetune the LM in an unsupervised way and then finetune on a specific supervised task. Such a process was described in the ULMFiT paper.
It might also be pretty useful for getting better deep representations if you don't have labelled examples at all.
flake8 testing of https://github.com/IndicoDataSolutions/finetune on Python 3.7.0
$ flake8 . --count --select=E901,E999,F821,F822,F823 --show-source --statistics
./finetune/base.py:312:21: E999 SyntaxError: invalid syntax
)
^
1 E999 SyntaxError: invalid syntax
1
Describe the bug
After fitting the Regressor model, calling predict returns None instead of list or array of predictions.
Minimum Reproducible Example
import numpy as np
from finetune import Regressor
x_test = np.array(['the quick fox jumped over the lazy brown dog'] * 100)
y_test = np.random.random(100)
model_test = Regressor(n_epochs=1, val_interval=100/2/3)
model_test.fit(x_test, y_test)
model_test.predict(x_test) # Returns None
Additional context
Finetune was installed from source, 0.3.1 master branch. Environment is google colab python3 (gpu).
I can't load a model on a different computer, because the home directory is different, which causes this:
Lines 84 to 89 in 8d0ecc2
to fail with Permission denied.
I worked around it with this script:
import joblib
import sys
p = sys.argv[1]
a, b = joblib.load(p)
b.config.tensorboard_folder = None
joblib.dump((a,b), p+".export")
but I feel like this case should be handled by the library.
Describe the bug
When attempting to train a classifier on a small dataset of 8,000 documents, I get an out of memory error and the script stops running.
Minimum Reproducible Example
Version of finetune = 0.4.1
Version of tensorflow-gpu = 1.8.0
Version of cuda = release 9.0, V9.0.176
Windows 10 Pro
Load a dataset of documents (X_train) and labels (Y_train), where each document and label is simply a string.
model = finetune.Classifier(max_length = 256, batch_size = 1) #tried reducing the memory footprint
model.fit(X_train, Y_train)
Expected behavior
I expected the model to train, but it doesn't manage to start training.
Additional context
I get the following warnings in the jupyter notebook:
C:\Users...\Python35\site-packages\finetune\encoding.py:294: UserWarning: Some examples are longer than the max_length. Please trim documents or increase max_length. Fallback behaviour is to use the first 254 byte-pair encoded tokens
"Fallback behaviour is to use the first {} byte-pair encoded tokens".format(max_length - 2)
C:\Users...\Python35\site-packages\finetune\encoding.py:233: UserWarning: Document is longer than max length allowed, trimming document to 256 tokens.
max_length
C:\Users...\tensorflow\python\ops\gradients_impl.py: 100: UserWarning: Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.
"Converting sparse IndexedSlices to a dense Tensor of unknown shape. "
WARNING:tensorflow:From C:\Users...\tensorflow\python\util\tf_should_use.py:118: initialize_variables (from tensorflow.python.ops.variables) is deprecated and will be removed after 2017-03-02.
Instructions for updating:
Use tf.variables_initializer instead.
And then I get the following diagnostic info showing up in the command prompt:
2018-10-04 17:26:36.920118: I T:\src\github\tensorflow\tensorflow\core\platform\cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
2018-10-04 17:26:37.716883: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:1356] Found device 0 with properties:
name: Quadro M1200 major: 5 minor: 0 memoryClockRate(GHz): 1.148
pciBusID: 0000:01:00.0
totalMemory: 4.00GiB freeMemory: 3.35GiB
2018-10-04 17:26:37.725637: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:1435] Adding visible gpu devices: 0
2018-10-04 17:26:38.412484: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:923] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-10-04 17:26:38.417413: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:929] 0
2018-10-04 17:26:38.419392: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:942] 0: N
2018-10-04 17:26:38.421353: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:1053] Created TensorFlow device (/device:GPU:0 with 3083 MB memory) -> physical GPU (device: 0, name: Quadro M1200, pci bus id: 0000:01:00.0, compute capability: 5.0)
[I 17:28:26.081 NotebookApp] Saving file at /projects/language-models/Finetune Package.ipynb
2018-10-04 17:29:14.118663: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:1435] Adding visible gpu devices: 0
2018-10-04 17:29:14.123595: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:923] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-10-04 17:29:14.127649: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:929] 0
2018-10-04 17:29:14.135411: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:942] 0: N
2018-10-04 17:29:14.138698: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:1053] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 3083 MB memory) -> physical GPU (device: 0, name: Quadro M1200, pci bus id: 0000:01:00.0, compute capability: 5.0)
2018-10-04 17:30:06.881174: W T:\src\github\tensorflow\tensorflow\core\common_runtime\bfc_allocator.cc:275] Allocator (GPU_0_bfc) ran out of memory trying to allocate 9.00MiB. Current allocation summary follows.
2018-10-04 17:30:06.900550: I T:\src\github\tensorflow\tensorflow\core\common_runtime\bfc_allocator.cc:630] Bin (256):
Total Chunks: 60, Chunks in use: 60. 15.0KiB allocated for chunks. 15.0KiB in use in bin. 312B client-requested in use in bin.
2018-10-04 17:30:06.929551: I T:\src\github\tensorflow\tensorflow\core\common_runtime\bfc_allocator.cc:630] Bin (512):
Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2018-10-04 17:30:06.964647: I T:\src\github\tensorflow\tensorflow\core\common_runtime\bfc_allocator.cc:630] Bin (1024): Total Chunks: 2, Chunks in use: 2. 2.5KiB allocated for chunks. 2.5KiB in use in bin. 2.0KiB client-requested in use in bin.
2018-10-04 17:30:06.995394: I T:\src\github\tensorflow\tensorflow\core\common_runtime\bfc_allocator.cc:630] Bin (2048): Total Chunks: 532, Chunks in use: 532. 1.56MiB allocated for chunks. 1.56MiB in use in bin. 1.56MiB client-requested in use in bin.
2018-10-04 17:30:07.031613: I T:\src\github\tensorflow\tensorflow\core\common_runtime\bfc_allocator.cc:630] Bin (4096): Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2018-10-04 17:30:07.061013: I T:\src\github\tensorflow\tensorflow\core\common_runtime\bfc_allocator.cc:630] Bin (8192): Total Chunks: 137, Chunks in use: 137. 1.39MiB allocated for chunks. 1.39MiB in use in bin. 1.39MiB client-requested in use in bin.
2018-10-04 17:30:07.093603: I T:\src\github\tensorflow\tensorflow\core\common_runtime\bfc_allocator.cc:630] Bin (16384): Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2018-10-04 17:30:07.130530: I T:\src\github\tensorflow\tensorflow\core\common_runtime\bfc_allocator.cc:630] Bin (32768): Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2018-10-04 17:30:07.170321: I T:\src\github\tensorflow\tensorflow\core\common_runtime\bfc_allocator.cc:630] Bin (65536): Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2018-10-04 17:30:07.212730: I T:\src\github\tensorflow\tensorflow\core\common_runtime\bfc_allocator.cc:630] Bin (131072): Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2018-10-04 17:30:07.246329: I T:\src\github\tensorflow\tensorflow\core\common_runtime\bfc_allocator.cc:630] Bin (262144): Total Chunks: 2, Chunks in use: 2. 512.0KiB allocated for chunks. 512.0KiB in use in bin. 512.0KiB client-requested in use in bin.
2018-10-04 17:30:07.288640: I T:\src\github\tensorflow\tensorflow\core\common_runtime\bfc_allocator.cc:630] Bin (524288): Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2018-10-04 17:30:07.303248: I T:\src\github\tensorflow\tensorflow\core\common_runtime\bfc_allocator.cc:630] Bin (1048576): Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2018-10-04 17:30:07.332990: I T:\src\github\tensorflow\tensorflow\core\common_runtime\bfc_allocator.cc:630] Bin (2097152): Total Chunks: 71, Chunks in use: 71. 159.75MiB allocated for chunks. 159.75MiB in use in bin. 159.75MiB client-requested in use in bin.
2018-10-04 17:30:07.364897: I T:\src\github\tensorflow\tensorflow\core\common_runtime\bfc_allocator.cc:630] Bin (4194304): Total Chunks: 69, Chunks in use: 68. 466.99MiB allocated for chunks. 459.00MiB in use in bin. 459.00MiB client-requested in use in bin.
2018-10-04 17:30:07.396862: I T:\src\github\tensorflow\tensorflow\core\common_runtime\bfc_allocator.cc:630] Bin (8388608): Total Chunks: 140, Chunks in use: 140. 1.23GiB allocated for chunks. 1.23GiB in use in bin. 1.23GiB client-requested in use in bin.
2018-10-04 17:30:07.428029: I T:\src\github\tensorflow\tensorflow\core\common_runtime\bfc_allocator.cc:630] Bin (16777216): Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2018-10-04 17:30:07.464813: I T:\src\github\tensorflow\tensorflow\core\common_runtime\bfc_allocator.cc:630] Bin (33554432): Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2018-10-04 17:30:07.494067: I T:\src\github\tensorflow\tensorflow\core\common_runtime\bfc_allocator.cc:630] Bin (67108864): Total Chunks: 10, Chunks in use: 10. 1.17GiB allocated for chunks. 1.17GiB in use in bin. 1.17GiB client-requested in use in bin.
2018-10-04 17:30:07.524156: I T:\src\github\tensorflow\tensorflow\core\common_runtime\bfc_allocator.cc:630] Bin (134217728): Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2018-10-04 17:30:07.550345: I T:\src\github\tensorflow\tensorflow\core\common_runtime\bfc_allocator.cc:630] Bin (268435456): Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2018-10-04 17:30:07.578392: I T:\src\github\tensorflow\tensorflow\core\common_runtime\bfc_allocator.cc:646] Bin for 9.00MiB was 8.00MiB, Chunk State:
2018-10-04 17:30:07.600123: I T:\src\github\tensorflow\tensorflow\core\common_runtime\bfc_allocator.cc:665] Chunk at 0000000801980000 of size 1280
2018-10-04 17:30:07.629493: I T:\src\github\tensorflow\tensorflow\core\common_runtime\bfc_allocator.cc:665] Chunk at 0000000801980500 of size 1280
2018-10-04 17:30:07.649189: I T:\src\github\tensorflow\tensorflow\core\common_runtime\bfc_allocator.cc:665] Chunk at 0000000801980A00 of size 125144064
2018-10-04 17:30:07.676965: I T:\src\github\tensorflow\tensorflow\core\common_runtime\bfc_allocator.cc:665] Chunk at 00000008090D9600 of size 7077888
2018-10-04 17:30:07.699245: I T:\src\github\tensorflow\tensorflow\core\common_runtime\bfc_allocator.cc:665] Chunk at 0000000809799600 of size 3072
2018-10-04 17:30:07.718738: I T:\src\github\tensorflow\tensorflow\core\common_runtime\bfc_allocator.cc:665] Chunk at 000000080979A200 of size 3072
...and so on. This is, in my opinion, a pretty small dataset, and I've made the max characters pretty small, so I don't think this is a hardware limitation, but a bug.
Describe the bug
After saving a model on 5.10 using Classifier.save("my_model.bin") and upgrading to 5.11, loading using Classifier.load("my_model.bin") results in KeyError: 'base_model_path'
Is your feature request related to a problem? Please describe.
Serving a trained model in production.
Describe the solution you'd like
I'd like to understand how to interface with tensorflow.
Describe alternatives you've considered
I'm able to save and load a model, but I'm not sure how to restore and serve it using TF.
Thank you for your library, the supervised finetuning works very well. However, when I try to train on unlabelled data ( model.fit(unlabeledX) ), the training is much slower (9s/it) compared to supervised training (1.7s/it). This is on one K80 GPU. I am not sure why unsupervised training is slower, since supervised training tunes the language model as well, doesn't it?
After training a default classifier, then saving and loading it, model.predict("lorem ipsum") and model.predict_prob take on average 14 seconds, even on a hefty server such as an AWS p3.16xlarge.
Hello,
I've tried to use max_length more than 512 to featurize text:
model = finetune.Classifier()
trn_X_q_vecs = model.featurize(trn_X_q, max_length=1000)
But I got the following exception:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-7-d3d9e8b820e5> in <module>()
----> 1 trn_X_q_vecs = model.featurize(trn_X_q, max_length=1000)
/opt/conda/lib/python3.6/site-packages/finetune/classifier.py in featurize(self, X, max_length)
24 :returns: np.array of features of shape (n_examples, embedding_size).
25 """
---> 26 return super().featurize(X, max_length=max_length)
27
28 def predict(self, X, max_length=None):
/opt/conda/lib/python3.6/site-packages/finetune/base.py in featurize(self, *args, **kwargs)
386 These features are the same that are fed into the target_model.
387 """
--> 388 return self._featurize(*args, **kwargs)
389
390 @classmethod
/opt/conda/lib/python3.6/site-packages/finetune/base.py in _featurize(self, Xs, max_length)
371 warnings.filterwarnings("ignore")
372 max_length = max_length or self.config.max_length
--> 373 for xmb, mmb in self._infer_prep(Xs, max_length=max_length):
374 feature_batch = self.sess.run(self.features, {
375 self.X: xmb,
/opt/conda/lib/python3.6/site-packages/finetune/base.py in _infer_prep(self, Xs, max_length)
400 def _infer_prep(self, Xs, max_length=None):
401 max_length = max_length or self.config.max_length
--> 402 arr_encoded = self._text_to_ids(Xs, max_length=max_length)
403 n_batch_train = self.config.batch_size * max(len(self.config.visible_gpus), 1)
404 self._build_model(n_updates_total=0, target_dim=self.target_dim, train=False)
/opt/conda/lib/python3.6/site-packages/finetune/base.py in _text_to_ids(self, Xs, Y, max_length)
156 else:
157 encoder_out = self.encoder.encode_multi_input(Xs, Y=Y, max_length=max_length)
--> 158 return self._array_format(encoder_out)
159
160
/opt/conda/lib/python3.6/site-packages/finetune/base.py in _array_format(self, encoded_output)
421 for i, seq_length in enumerate(seq_lengths):
422 # BPE embedding
--> 423 x[i, :seq_length, 0] = encoded_output.token_ids[i]
424 # masking: value of 1 means "consider this in cross-entropy LM loss"
425 mask[i, 1:seq_length] = 1
ValueError: cannot copy sequence with size 667 to array axis with dimension 512
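The traceback shows a 667-token sequence being copied into an array whose axis has only 512 positions. One possible workaround, sketched here under assumptions, is to split a long token sequence into windows that each fit within the limit and featurize the windows separately. `split_into_windows` is a hypothetical helper, not part of finetune, and it works on plain lists of token ids rather than on finetune's encoder output.

```python
def split_into_windows(token_ids, max_length=512, stride=None):
    """Split a list of token ids into windows of at most max_length.

    stride is the step between window starts; it defaults to max_length
    (no overlap). A smaller stride produces overlapping windows.
    """
    if max_length <= 0:
        raise ValueError("max_length must be positive")
    stride = max_length if stride is None else stride
    if stride <= 0:
        raise ValueError("stride must be positive")

    windows = []
    for start in range(0, len(token_ids), stride):
        windows.append(token_ids[start:start + max_length])
        # Stop once a window has reached the end of the sequence.
        if start + max_length >= len(token_ids):
            break
    return windows

tokens = list(range(667))  # stand-in for a 667-token document
windows = split_into_windows(tokens, max_length=512)
print([len(w) for w in windows])

overlapping = split_into_windows(tokens, max_length=512, stride=256)
print([len(w) for w in overlapping])
```

The per-window features would then need to be combined, for example by averaging. Whether that is appropriate depends on the downstream task.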
Describe the bug
It doesn't seem like the default model is pretrained in the latest development branch.
In prior versions, the default model generated coherent text with generate_text() using a wide variety of seed words. With the current default model I haven't been able to generate any coherent text at all. This includes seeding with many different words.
I'm mostly trying to use this as a sanity check that things are working. I don't mean that the generated text would need to be the same as prior versions, but this is giving me the impression that either the model is no longer pretrained or something went wrong in loading the model. Is it still expected that the default model is pretrained?
Minimum Reproducible Example
>>> import finetune
>>> finetune.__version__
'0.5.9'
The current version outputs things along these lines, regardless of seed word:
>>> from finetune import Classifier
>>> model = Classifier()
>>> model.generate_text('potatoes')
`'_start_potatoes " \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n greyson \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n greyson \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n , \n \n \n \n \n greyson \n \n greyson greyson greyson greyson greyson greyson greyson greyson greyson \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n greyson \n \n \n \n \n \n \n \n '`
Expected behavior
>>> import finetune
>>> finetune.__version__
'0.4.1'
This older version would generate text along these lines with a wide variety of seed words:
>>> from finetune import Classifier
>>> model = Classifier()
>>> model.generate_text('potatoes')
`'_start_potatoes , " she said . \n " i do n\'t know what you mean . " \n " you \'re the one who said you wanted to be a chef . " \n " i did ? " \n " yes . " \n " i do n\'t know what you \'re talking about . " \n " you do n\'t have to . " \n " i do n\'t ? " \n " no . " \n " i do n\'t ? " \n " no . " \n " why not ? " \n " because i do n\'t want to . " \n " why not ? " \n " because i do n\'t want to . " \n " why not ? " \n " because i do n\'t want to . " \n " why not ? " \n " because i do n\'t want to . " \n " why not ? " \n " because i do n\'t want to . " \n " why not ? " \n " because i do n\'t want to . " \n " why not ? " \n " because i do n\'t want to . " \n " why not ? " \n " because i do n\'t want to . " \n " why not ? " \n " because i do n\'t want to . " \n " why not ? " \n " because i do n\'t want to . " \n " why not ? " \n " because i do n\'t want to . " \n " why not ? " \n " because i do n\'t want to . " \n " why not ? " \n " because i do n\'t want to . " \n " why not ? " \n " because i do n\'t want to . " \n " why not ? " \n " because i do n\'t want to . " \n " why not ? " \n " because i do n\'t want to . " \n " why not ? " \n " because i do n\'t want to . " \n " why not ? " \n " because i do n\'t want to . " \n " why not ? " \n " because i do n\'t want to . " \n " why not ? " \n " because i do n\'t want to . " \n " why not ? " \n " because i do n\'t want to . " \n " why not ? " \n " because i do n\'t want to . " \n " why not ? " \n " because i do n\'t want to . " \n " why not ? " \n " because i do n\'t want to . " \n " why not ? " \n " because i do n\'t want to . " \n " why not ? " \n " because i do n\'t want to . " \n " why not '`
I ran quora_similarity.py without modifying any code and got 0.77 accuracy with a class balance of 0.4. Any ideas why? Thanks.
I am hoping to implement gradual unfreezing while finetuning for an article classification task. I see the config setting called num_layers_trained. I thought I could change this setting after each epoch of finetuning, but it seems like the setting is only used during initialization. Is there a recommended way to accomplish this? Thanks!
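To make the idea concrete, here is a minimal sketch of a gradual-unfreezing schedule: the number of top layers to train grows with the epoch until all layers are trainable. The function name, the default of 12 layers, and the step sizes are assumptions for illustration. Whether finetune can apply a changed num_layers_trained between epochs is exactly the open question above, and this sketch does not answer it.

```python
def layers_trained_for_epoch(epoch, total_layers=12,
                             layers_per_epoch=1, initial_layers=1):
    """Number of top layers to unfreeze at a given zero-based epoch.

    Starts with initial_layers and adds layers_per_epoch each epoch,
    capped at total_layers. All defaults are hypothetical.
    """
    return min(total_layers, initial_layers + epoch * layers_per_epoch)

# Example: unfreeze 4 more layers per epoch over 4 epochs.
schedule = [
    layers_trained_for_epoch(e, total_layers=12, layers_per_epoch=4)
    for e in range(4)
]
print(schedule)
```

Each value would be the setting to use for the corresponding epoch if the model could be rebuilt or reconfigured between epochs.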