mahmoudwahdan / dialog-nlu
Tensorflow and Keras implementation of state-of-the-art research in Dialog System NLU
License: Apache License 2.0
The code is stuck at:
read data ...
vectorize data ...
Any solution?
It would be great if the model could be trained incrementally from a previously saved model.
Hello,
is there an easier method to convert the trained .h5 to a checkpoint? Or is it faster to rebuild the model in TensorFlow and save it to a checkpoint file?
Thanks in advance; I am looking forward to your reply.
Noticed something interesting, though it is probably not related to your implementation: when you train the model, it seems to be VERY case-sensitive. So, training the model with:
i love you -> LoveIntent
I hate you -> HateIntent
and then trying to predict (notice the lower-case 'i'):
i hate you -> yields prediction: LoveIntent
I suspect that this is inherent in the BERT and ALBERT models, but I thought I would bring it to your attention.
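For what it's worth, here is a minimal way to check this hypothesis (a sketch assuming the huggingface transformers tokenizer; the model names below are only illustrative, not necessarily the checkpoints this repo loads):

from transformers import BertTokenizer

uncased = BertTokenizer.from_pretrained("bert-base-uncased")
cased = BertTokenizer.from_pretrained("bert-base-cased")

# An uncased tokenizer lowercases input, so "i" and "I" map to the same token:
print(uncased.tokenize("I hate you"))  # ['i', 'hate', 'you']
# A cased tokenizer preserves case, so "I" and "i" are different tokens:
print(cased.tokenize("I hate you"))    # ['I', 'hate', 'you']

If that holds, a cased checkpoint would be sensitive to 'i' vs 'I' while an uncased one would not.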
Add more examples in Jupyter notebook form.
Try to illustrate the concepts of model optimization and the trade-off between an optimized model's accuracy and its size/performance.
Support layer pruning for better training and serving time and performance in production; a rough sketch follows the paper reference below.
[Paper] Poor Man’s BERT: Smaller and Faster Transformer Models
https://arxiv.org/abs/2004.03844
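As a rough illustration of the paper's top-layer-dropping strategy (a hedged sketch only; the attribute names assume huggingface's TFBertModel and may differ across transformers versions):

from transformers import TFBertModel

model = TFBertModel.from_pretrained("bert-base-uncased")
# "Poor Man's BERT" reports that dropping the top encoder layers often costs
# little accuracy; here we keep only the bottom 6 of the 12 layers.
model.bert.encoder.layer = model.bert.encoder.layer[:6]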
The idea is to add an Arabic NLU dataset and examples, opening the opportunity for more examples in non-English languages and supporting Arabic users.
Add support for Joint XLNet from huggingface transformers
Hello there,
I have trained a trans model. When I try to run the bert_nlu_basic_api.py file, it does not work. I then realized that the trans model classes are not imported in the file, so I imported the JointTransBertModel class and loaded the model with JointTransBertModel.load(load_folder_path) in bert_nlu_basic_api.py. In the end, I got the following error:
ValueError: Unknown layer: Custom>TFBertMainLayer
Traceback (most recent call last):
File "bert_nlu_basic_api.py", line 105, in <module>
initialize()
File "bert_nlu_basic_api.py", line 46, in initialize
model = JointTransBertModel.load(load_folder_path)
File "/home/hakan/Documents/HB/dialog-nlu/models/joint_trans_bert.py", line 42, in load
return BaseJointTransformerModel.load_model_by_class(BaseJointTransformerModel, load_folder_path, 'joint_bert_model.h5')
File "/home/hakan/Documents/HB/dialog-nlu/models/base_joint_trans.py", line 121, in load_model_by_class
new_model.model = tf.keras.models.load_model(os.path.join(load_folder_path, trans_model_name))
File "/home/hakan/.local/share/virtualenvs/dialog-nlu-TCo_F89F/lib/python3.6/site-packages/tensorflow/python/keras/saving/save.py", line 182, in load_model
return hdf5_format.load_model_from_hdf5(filepath, custom_objects, compile)
File "/home/hakan/.local/share/virtualenvs/dialog-nlu-TCo_F89F/lib/python3.6/site-packages/tensorflow/python/keras/saving/hdf5_format.py", line 178, in load_model_from_hdf5
custom_objects=custom_objects)
File "/home/hakan/.local/share/virtualenvs/dialog-nlu-TCo_F89F/lib/python3.6/site-packages/tensorflow/python/keras/saving/model_config.py", line 55, in model_from_config
return deserialize(config, custom_objects=custom_objects)
File "/home/hakan/.local/share/virtualenvs/dialog-nlu-TCo_F89F/lib/python3.6/site-packages/tensorflow/python/keras/layers/serialization.py", line 175, in deserialize
printable_module_name='layer')
File "/home/hakan/.local/share/virtualenvs/dialog-nlu-TCo_F89F/lib/python3.6/site-packages/tensorflow/python/keras/utils/generic_utils.py", line 358, in deserialize_keras_object
list(custom_objects.items())))
File "/home/hakan/.local/share/virtualenvs/dialog-nlu-TCo_F89F/lib/python3.6/site-packages/tensorflow/python/keras/engine/functional.py", line 617, in from_config
config, custom_objects)
File "/home/hakan/.local/share/virtualenvs/dialog-nlu-TCo_F89F/lib/python3.6/site-packages/tensorflow/python/keras/engine/functional.py", line 1204, in reconstruct_from_config
process_layer(layer_data)
File "/home/hakan/.local/share/virtualenvs/dialog-nlu-TCo_F89F/lib/python3.6/site-packages/tensorflow/python/keras/engine/functional.py", line 1186, in process_layer
layer = deserialize_layer(layer_data, custom_objects=custom_objects)
File "/home/hakan/.local/share/virtualenvs/dialog-nlu-TCo_F89F/lib/python3.6/site-packages/tensorflow/python/keras/layers/serialization.py", line 175, in deserialize
printable_module_name='layer')
File "/home/hakan/.local/share/virtualenvs/dialog-nlu-TCo_F89F/lib/python3.6/site-packages/tensorflow/python/keras/utils/generic_utils.py", line 347, in deserialize_keras_object
config, module_objects, custom_objects, printable_module_name)
File "/home/hakan/.local/share/virtualenvs/dialog-nlu-TCo_F89F/lib/python3.6/site-packages/tensorflow/python/keras/utils/generic_utils.py", line 296, in class_and_config_for_serialized_keras_object
raise ValueError('Unknown ' + printable_module_name + ': ' + class_name)
ValueError: Unknown layer: Custom>TFBertMainLayer
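A workaround worth trying (a sketch, not a confirmed fix for this repo): pass the transformers layer class as a custom object when loading the .h5 file, so Keras can deserialize the Custom>TFBertMainLayer entry:

import tensorflow as tf
# The import path varies by transformers version; this matches the 3.x series.
from transformers.modeling_tf_bert import TFBertMainLayer

model = tf.keras.models.load_model(
    "saved_models/joint_trans_model/joint_bert_model.h5",  # illustrative path
    # Some TF versions look the class up with the "Custom>" prefix, others without:
    custom_objects={"TFBertMainLayer": TFBertMainLayer,
                    "Custom>TFBertMainLayer": TFBertMainLayer})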
I'm getting this error when I try to run the training script after building the library:
ModuleNotFoundError: No module named 'dialognlu.models'
I checked the contents of the installed dialognlu library and found just these files:
Is there a problem while building the library, or am I doing something wrong here?
Please help.
I realized that TensorFlow Lite does not support inference on an Nvidia GPU. I have an Nvidia Jetson Xavier device, and my current inference runs the unoptimized transformers model on the GPU; it is faster than inference with the TF Lite model on the CPU.
After some research, I found two model-optimization options: TensorRT and TF-TRT. I made several attempts to convert the fine-tuned transformers model to TensorRT but could not get it to work. It would be better if dialog-nlu supported TensorRT conversion and serving.
This would lead to a substantial performance improvement.
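For reference, the TF-TRT path I have been experimenting with looks roughly like this (a sketch; it assumes the model has already been exported as a SavedModel, the paths are illustrative, and the exact converter arguments vary across TF versions):

from tensorflow.python.compiler.tensorrt import trt_convert as trt

converter = trt.TrtGraphConverterV2(
    input_saved_model_dir="saved_models/joint_trans_savedmodel",  # illustrative
    precision_mode=trt.TrtPrecisionMode.FP16)
converter.convert()                              # builds TRT-optimized segments
converter.save("saved_models/joint_trans_trt")   # illustrative output path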
I've decided to add spaces before and after every quote and question mark, as follows:
For example: "Can you book me a table for 4 people at Core?" becomes "Can you book me a table for 4 people at Core ? "
This yields the following error:
output[i][j] = data[i][idx]
IndexError: index 5 is out of bounds for axis 0 with size 5
Trying to train using https://huggingface.co/deepset/bert-large-uncased-whole-word-masking-squad2
python3 train_joint_trans.py --train=data/snips/train --val=data/snips/valid --save=saved_models/joint_trans_model --epochs=3 --batch=64 --cache_dir=transformers_cache_dir --trans=deepset/bert-large-uncased-whole-word-masking-squad2 --from_pt=false
but I get this error:
Traceback (most recent call last):
File "train_joint_trans.py", line 100, in
model = create_joint_trans_model(config)
File "/Users/aa/dev/project/hoverpin/ai/tensorflow/BERT-Concierge/dialog-nlu/models/trans_auto_model.py", line 65, in create_joint_trans_model
model, trans_type = get_transformer_model(pretrained_model_name_or_path, cache_dir, from_pt, layer_pruning)
File "/Users/aa/dev/project/hoverpin/ai/tensorflow/BERT-Concierge/dialog-nlu/models/trans_auto_model.py", line 42, in get_transformer_model
layer_pruning=layer_pruning)
File "/Users/aa/dev/project/hoverpin/ai/tensorflow/BERT-Concierge/dialog-nlu/compression/commons.py", line 384, in from_pretrained
return from_pretrained_detailed(model_class, pretrained_model_name_or_path, *model_args, config=config, **kwargs)
File "/Users/aa/dev/project/hoverpin/ai/tensorflow/BERT-Concierge/dialog-nlu/compression/commons.py", line 185, in from_pretrained_detailed
raise EnvironmentError(msg)
OSError: Can't load weights for 'deepset/bert-large-uncased-whole-word-masking-squad2'. Make sure that:
- 'deepset/bert-large-uncased-whole-word-masking-squad2' is a correct model identifier listed on 'https://huggingface.co/models'
- or 'deepset/bert-large-uncased-whole-word-masking-squad2' is the correct path to a directory containing a file named one of tf_model.h5, pytorch_model.bin.
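A detail that may matter here (an assumption based on the error text, not a confirmed diagnosis): this Hub repo appears to ship pytorch_model.bin but no tf_model.h5, and the command above passes --from_pt=false. Loading PyTorch weights into a TF model normally needs the from_pt flag, e.g. in plain transformers:

from transformers import TFBertModel

# from_pt=True converts the PyTorch checkpoint on the fly (requires torch installed).
model = TFBertModel.from_pretrained(
    "deepset/bert-large-uncased-whole-word-masking-squad2", from_pt=True)

so --from_pt=true may be worth trying.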
Assalam o Alaikum (peace be upon you), brother. I need an Islamic questions-and-answers dataset; can you provide me with one?
Support tflite conversion and serving
Add support for Joint RoBERTa from huggingface transformers
Hi,
after I import BertCrfNLU, I can't train with this module.
Would you please fix the problem?
nlu_model = TransformerNLU.load(save_path, quantized=True, num_process=1)
utterance = "add sabrina salerno to the grime instrumentals playlist"
result = nlu_model.predict(utterance)
print(result)
for item in result['slots']:
    print(item['value'])
Loading quantized model in 1 processes
Model Loaded, process id: 300
After loading the tflite model, it shows "processing" only and never returns a prediction for the utterance.
The idea is to redesign the code as a library.
There is a typo on line 30:
if type_ == 'albert':
It should be
if type_ == 'bert':
Creating a PR for such a small error would not make sense, hence this issue.
Thanks
Add support for the Joint trans model in bert_nlu_basic_api.py
Any plans to upgrade this to TF2?
And perhaps implement some sort of Estimator and/or serving support.
Many thanks.
Hi,
recently I have also been focusing on research into jointly training slot filling and intent classification with BERT. May I ask why you save the trained model as 'xx.h5'? What does that mean, and how does it differ from a normal checkpoint?
Thanks in advance!
Ye
Add support for Joint ALBERT from huggingface transformers
Is there any way to figure out the confidence level of an intent prediction? Often the system returns an intent that is inaccurate (for various reasons, of course), so it would be nice to get a confidence score for each prediction.
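For context, a sketch of what I have in mind (the names here are hypothetical, not this repo's API): if the intent head ends in a softmax, the max probability can serve as a rough confidence score:

import numpy as np

# `intent_probs` stands for the softmax output of the intent classifier head;
# the output ordering of model.predict is hypothetical.
intent_probs = model.predict(x)[1]
best = int(np.argmax(intent_probs[0]))
print(intent_labels[best], float(intent_probs[0][best]))  # intent + confidence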
I tried a few ways to convert the joint-albert Keras (.h5) model in the output directory to TFLite.
model = tf.keras.models.load_model('/content/drive/My Drive/joint_albert_model/joint_bert_model.h5',custom_objects={'KerasLayer':hub.KerasLayer})
converter = tf.lite.TFLiteConverter.from_keras_model(model)
tflite_model = converter.convert()
Error log:
/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/training.py:2292: UserWarning: `Model.state_updates` will be removed in a future version. This property should not be used in TensorFlow 2.0, as `updates` are applied automatically.
warnings.warn('`Model.state_updates` will be removed in a future version. '
/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/base_layer.py:1377: UserWarning: `layer.updates` will be removed in a future version. This property should not be used in TensorFlow 2.0, as `updates` are applied automatically.
warnings.warn('`layer.updates` will be removed in a future version. '
INFO:tensorflow:Assets written to: /tmp/tmp7g7xnu4t/assets
INFO:tensorflow:Assets written to: /tmp/tmp7g7xnu4t/assets
---------------------------------------------------------------------------
InvalidArgumentError Traceback (most recent call last)
/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/importer.py in _import_graph_def_internal(graph_def, input_map, return_elements, validate_colocation_constraints, name, producer_op_list)
496 results = c_api.TF_GraphImportGraphDefWithResults(
--> 497 graph._c_graph, serialized, options) # pylint: disable=protected-access
498 results = c_api_util.ScopedTFImportGraphDefResults(results)
InvalidArgumentError: Input 2 of node model/AlbertLayer/StatefulPartitionedCall/StatefulPartitionedCall/StatefulPartitionedCall/albert_transformer_encoder/StatefulPartitionedCall/transformer/StatefulPartitionedCall was passed float from Func/model/AlbertLayer/StatefulPartitionedCall/StatefulPartitionedCall/StatefulPartitionedCall/albert_transformer_encoder/StatefulPartitionedCall/input/_106:0 incompatible with expected resource.
During handling of the above exception, another exception occurred:
ValueError Traceback (most recent call last)
12 frames
/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/importer.py in _import_graph_def_internal(graph_def, input_map, return_elements, validate_colocation_constraints, name, producer_op_list)
499 except errors.InvalidArgumentError as e:
500 # Convert to ValueError for backwards compatibility.
--> 501 raise ValueError(str(e))
502
503 # Create _DefinedFunctions for any imported functions.
ValueError: Input 2 of node model/AlbertLayer/StatefulPartitionedCall/StatefulPartitionedCall/StatefulPartitionedCall/albert_transformer_encoder/StatefulPartitionedCall/transformer/StatefulPartitionedCall was passed float from Func/model/AlbertLayer/StatefulPartitionedCall/StatefulPartitionedCall/StatefulPartitionedCall/albert_transformer_encoder/StatefulPartitionedCall/input/_106:0 incompatible with expected resource.
model = tf.keras.models.load_model('/content/drive/My Drive/joint_albert_model/joint_bert_model.h5',custom_objects={'KerasLayer':hub.KerasLayer})
tf.saved_model.save(model, 'joint_albert_savedmodel')
converter = tf.lite.TFLiteConverter.from_saved_model('joint_albert_savedmodel')
tflite_model = converter.convert()
with tf.io.gfile.GFile(os.path.join("./", 'joint_albert.tflite'), 'wb') as f:
    f.write(tflite_model)
The above conversion worked and I got the tflite model, but when I tried inference I noticed that the conversion is messed up: the tflite model's input_details and output_details are wrong.
Code for inference with the tflite model obtained above:
with tf.io.gfile.GFile("joint_albert.tflite", 'rb') as f:
model_content = f.read()
interpreter = tf.lite.Interpreter(model_content=model_content)
interpreter.allocate_tensors()
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()
print(input_details)
print(output_details)
Output :
[{'name': 'serving_default_input_type_ids:0', 'index': 0, 'shape': array([1, 1], dtype=int32), 'shape_signature': array([-1, -1], dtype=int32), 'dtype': <class 'numpy.int32'>, 'quantization': (0.0, 0), 'quantization_parameters': {'scales': array([], dtype=float32), 'zero_points': array([], dtype=int32), 'quantized_dimension': 0}, 'sparsity_parameters': {}}, {'name': 'serving_default_input_mask:0', 'index': 1, 'shape': array([1, 1], dtype=int32), 'shape_signature': array([-1, -1], dtype=int32), 'dtype': <class 'numpy.int32'>, 'quantization': (0.0, 0), 'quantization_parameters': {'scales': array([], dtype=float32), 'zero_points': array([], dtype=int32), 'quantized_dimension': 0}, 'sparsity_parameters': {}}, {'name': 'serving_default_valid_positions:0', 'index': 2, 'shape': array([ 1, 1, 440], dtype=int32), 'shape_signature': array([ -1, -1, 440], dtype=int32), 'dtype': <class 'numpy.float32'>, 'quantization': (0.0, 0), 'quantization_parameters': {'scales': array([], dtype=float32), 'zero_points': array([], dtype=int32), 'quantized_dimension': 0}, 'sparsity_parameters': {}}, {'name': 'serving_default_input_word_ids:0', 'index': 3, 'shape': array([1, 1], dtype=int32), 'shape_signature': array([-1, -1], dtype=int32), 'dtype': <class 'numpy.int32'>, 'quantization': (0.0, 0), 'quantization_parameters': {'scales': array([], dtype=float32), 'zero_points': array([], dtype=int32), 'quantized_dimension': 0}, 'sparsity_parameters': {}}]
[{'name': 'StatefulPartitionedCall:1', 'index': 945, 'shape': array([ 1, 1, 440], dtype=int32), 'shape_signature': array([ -1, -1, 440], dtype=int32), 'dtype': <class 'numpy.float32'>, 'quantization': (0.0, 0), 'quantization_parameters': {'scales': array([], dtype=float32), 'zero_points': array([], dtype=int32), 'quantized_dimension': 0}, 'sparsity_parameters': {}}, {'name': 'StatefulPartitionedCall:0', 'index': 941, 'shape': array([ 1, 53], dtype=int32), 'shape_signature': array([-1, 53], dtype=int32), 'dtype': <class 'numpy.float32'>, 'quantization': (0.0, 0), 'quantization_parameters': {'scales': array([], dtype=float32), 'zero_points': array([], dtype=int32), 'quantized_dimension': 0}, 'sparsity_parameters': {}}]
These input and output details for the tflite model are wrong; as you can see, 'shape': array([1, 1], dtype=int32) for input indices 0 and 1, and so on.
Is there a way to convert the joint-albert model to tflite and run inference on it? Or is this model not supported yet?
Please help.
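One workaround I would like to try (a sketch using the standard TFLite interpreter API, with an illustrative sequence length): TFLite collapses dynamic (-1) dimensions to 1, so the inputs can be resized to the real shapes before allocation:

seq_len = 64  # illustrative; use the model's actual max sequence length
for detail in input_details:
    # Replace each -1 in the shape signature: batch dim -> 1, sequence dim -> seq_len.
    shape = [dim if dim > 0 else (1 if i == 0 else seq_len)
             for i, dim in enumerate(detail['shape_signature'])]
    interpreter.resize_tensor_input(detail['index'], shape)
interpreter.allocate_tensors()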
Hello there,
I realized that while I am training a joint transformer model, the GPU is not used. How can I make it active?
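For debugging, I am checking GPU visibility with the standard TF calls (an empty device list usually means a CPU-only tensorflow install or a CUDA/driver version mismatch):

import tensorflow as tf

print(tf.config.list_physical_devices('GPU'))  # should list the GPU(s)
print(tf.test.is_built_with_cuda())            # False on CPU-only builds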
Provide documentation.
Add unit tests.
/usr/local/lib/python3.6/dist-packages/dialognlu/compression/commons.py in from_pretrained_detailed(model_class, pretrained_model_name_or_path, *model_args, **kwargs)
160 pretrained_model_name_or_path,
161 filename=(WEIGHTS_NAME if from_pt else TF2_WEIGHTS_NAME),
--> 162 use_cdn=use_cdn,
163 )
164
TypeError: hf_bucket_url() got an unexpected keyword argument 'use_cdn'
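A plausible cause (an assumption, not verified against this repo): hf_bucket_url() dropped its use_cdn argument in the transformers 4.x series, so the installed transformers is newer than the version this code was written against. Pinning an older release may avoid the error:

pip install "transformers<4.0"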
Hey,
Thanks for the great repo! I am currently trying to train an NLU model and tested your using_bert_crf_nlu example notebook with your provided data.
I have tested both BERT and ALBERT embeddings. However, I always get only 14% accuracy (i.e., random classification), while you report ~98% accuracy. I am currently using Python 3.11; the only major compatibility issue I had was with sentencepiece (a dependency of transformers), where I now use the newest version instead of your specified version range.
I can't imagine this being the reason the model no longer works. I will continue investigating, but I wanted to ask whether you or someone else has already hit this issue and has ideas!
Thanks!
I trained a model from scratch using:
python3 train_joint_trans.py --train=data/custom/train --val=data/custom/valid --save=saved_models/joint_trans_model --epochs=3 --batch=64 --cache_dir=transformers_cache_dir --trans=deepset/bert-large-uncased-whole-word-masking-squad2 --from_pt=true
and then attempted to incrementally train the model using:
#incremental training
python3 train_joint_trans.py --train=data/hp-custom/train --val=data/hp-custom/valid --save=saved_models/joint_trans_model2 --epochs=3 --batch=64 --cache_dir=transformers_cache_dir --trans=deepset/bert-large-uncased-whole-word-masking-squad2 --from_pt=true --model=saved_models/joint_trans_model
This resulted in the following error:
File "train_joint_trans.py", line 107, in
epochs=epochs, batch_size=batch_size, id2label=id2label)
File "/Users/jv/dev/project/test/ai/tensorflow/BERT-Concierge/dialog-nlu/models/base_joint_trans.py", line 79, in fit
callbacks=callbacks)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/tensorflow/python/keras/engine/training.py", line 108, in _method_wrapper
return method(self, *args, **kwargs)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/tensorflow/python/keras/engine/training.py", line 1098, in fit
tmp_logs = train_function(iterator)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/tensorflow/python/eager/def_function.py", line 780, in call
result = self._call(*args, **kwds)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/tensorflow/python/eager/def_function.py", line 846, in _call
return self._concrete_stateful_fn._filtered_call(canon_args, canon_kwds) # pylint: disable=protected-access
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/tensorflow/python/eager/function.py", line 1848, in _filtered_call
cancellation_manager=cancellation_manager)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/tensorflow/python/eager/function.py", line 1924, in _call_flat
ctx, args, cancellation_manager=cancellation_manager))
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/tensorflow/python/eager/function.py", line 550, in call
ctx=ctx)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/tensorflow/python/eager/execute.py", line 60, in quick_execute
inputs, attrs, num_outputs)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Received a label value of 40 which is outside the valid range of [0, 39).
Not sure why this is happening.
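A plausible reading of the error (an assumption, not confirmed): the saved model's classification head was sized for the label set of data/custom, i.e. 39 classes indexed 0-38, and data/hp-custom introduces at least one label (value 40 here) that the old head has no output unit for. A quick sanity check, with illustrative file names:

# Compare label sets between the original and the incremental training data
# (paths are illustrative; adjust to the repo's data layout).
def read_labels(path):
    with open(path) as f:
        return set(f.read().split())

old_labels = read_labels("data/custom/train/label")
new_labels = read_labels("data/hp-custom/train/label")
print(new_labels - old_labels)  # labels the saved model cannot represent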