mahmoudwahdan / dialog-nlu

TensorFlow and Keras implementations of state-of-the-art research in Dialog System NLU

License: Apache License 2.0

Python 27.42% Jupyter Notebook 72.58%
tensorflow keras nlu nlu-engine nlp dialogue-systems bert joint-models transfer-learning natural-language-processing

dialog-nlu's Introduction

Dialog System NLU

Dialog NLU is a library that contains TensorFlow and Keras implementations of state-of-the-art research in Dialog System NLU.

It is built on TensorFlow 2 and the Huggingface Transformers library for broader model coverage and support for other languages.

Implemented Papers

NLU Papers

Model Compression Papers

BERT / ALBERT for Joint Intent Classification and Slot Filling

[Figure: Joint BERT]

Poor Man’s BERT: Smaller and Faster Transformer Models

[Figure: Layer-dropping Strategies]

Supported data format:

  • Data format as in the paper Slot-Gated Modeling for Joint Slot Filling and Intent Prediction (Goo et al):
    • Consists of 3 files:
      • seq.in file contains text samples (utterances)
      • seq.out file contains tags corresponding to samples from seq.in
      • label file contains intent labels corresponding to samples from seq.in
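The three files line up line by line: line N of seq.in, seq.out, and label all describe the same sample. A minimal sketch of reading them, assuming one sample per line and whitespace-separated tokens and tags. The helper name `read_goo_format` and the returned tuple layout are illustrative and are not the library's Reader API.

```python
import os


def read_goo_format(folder):
    """Read seq.in, seq.out and label from `folder` (hypothetical helper)."""
    def read_lines(name):
        with open(os.path.join(folder, name), encoding="utf-8") as f:
            return [line.strip() for line in f.read().splitlines()]

    texts = read_lines("seq.in")
    tags = read_lines("seq.out")
    intents = read_lines("label")
    if not (len(texts) == len(tags) == len(intents)):
        raise ValueError("seq.in, seq.out and label must have the same number of lines")

    samples = []
    for i, (text, tag_line, intent) in enumerate(zip(texts, tags, intents)):
        tokens, slot_tags = text.split(), tag_line.split()
        # Every token needs exactly one slot tag.
        if len(tokens) != len(slot_tags):
            raise ValueError(f"line {i + 1}: {len(tokens)} tokens but {len(slot_tags)} tags")
        samples.append((tokens, slot_tags, intent))
    return samples
```

Checking the token and tag counts up front tends to surface data problems earlier than a shape error during training would.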

Datasets included in the repo:

  • Snips Dataset (Snips voice platform: an embedded spoken language understanding system for private-by-design voice interfaces) (Coucke et al., 2018), collected from the Snips personal voice assistant.
    • The training, development and test sets contain 13,084, 700 and 700 utterances, respectively.
    • There are 72 slot labels and 7 intent types for the training set.

Integration with Huggingface Transformers library

Huggingface Transformers provides many transformer-based models. The idea behind the integration is to support more architectures as well as more languages.

Supported Model Architectures:

| Model | Pretrained Model Example | Layer Pruning Support |
|---|---|---|
| TFBertModel | bert-base-uncased | Yes |
| TFDistilBertModel | distilbert-base-uncased | Yes |
| TFAlbertModel | albert-base-v1 or albert-base-v2 | Not yet |
| TFRobertaModel | roberta-base or distilroberta-base | Yes |
| TFXLNetModel | xlnet-base-cased | No |

More model integrations to come.

Installation

You may want to create a Python virtual environment before installation.

git clone https://github.com/MahmoudWahdan/dialog-nlu.git
cd dialog-nlu
pip install .

Examples:

We provide examples of how to use the library. You can find Jupyter notebooks under notebooks, along with Python script examples.

Training, Evaluation, and simple API script:

We provide scripts for training, incremental training, and a simple Flask API.

Quick tour

# imports
from dialognlu import TransformerNLU, AutoNLU
from dialognlu.readers.goo_format_reader import Reader

# reading datasets
train_path = "data/snips/train"
val_path = "data/snips/valid"
train_dataset = Reader.read(train_path)
val_dataset = Reader.read(val_path)

# configurations of the model
config = {
    "pretrained_model_name_or_path": "distilbert-base-uncased",
    "from_pt": False,
}
# create a joint NLU model from configurations
nlu_model = TransformerNLU.from_config(config)

# training the model
nlu_model.train(train_dataset, val_dataset, epochs=3, batch_size=64)

# saving model
save_path = "saved_models/joint_distilbert_model"
nlu_model.save(save_path)

# loading the model and do incremental training

# loading model
nlu_model = AutoNLU.load(save_path)

# Continue training
nlu_model.train(train_dataset, val_dataset, epochs=1, batch_size=64)

# evaluate the model
test_path = "data/snips/test"
test_dataset = Reader.read(test_path)
token_f1_score, tag_f1_score, report, acc = nlu_model.evaluate(test_dataset)
print('Slot Classification Report:', report)
print('Slot token f1_score = %f' % token_f1_score)
print('Slot tag f1_score = %f' % tag_f1_score)
print('Intent accuracy = %f' % acc)

# do prediction
utterance = "add sabrina salerno to the grime instrumentals playlist"
result = nlu_model.predict(utterance)

Use Layer Pruning with NLU model

Layer pruning is supported only in transformer-based NLU models.

# imports
from dialognlu import TransformerNLU, AutoNLU
from dialognlu.readers.goo_format_reader import Reader

# reading datasets
train_path = "data/snips/train"
val_path = "data/snips/valid"
train_dataset = Reader.read(train_path)
val_dataset = Reader.read(val_path)

# configurations of the model
config = {
    "pretrained_model_name_or_path": "distilbert-base-uncased",
    "from_pt": False,
    "layer_pruning": {
        "strategy": "top",
        "k": 2
    }
}
# create a joint NLU model from configurations
nlu_model = TransformerNLU.from_config(config)

# training the model
nlu_model.train(train_dataset, val_dataset, epochs=3, batch_size=64)
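To make the "top" strategy with "k": 2 concrete: in the Poor Man's BERT paper, top-layer dropping removes the k encoder layers nearest the output and keeps the rest. The sketch below only computes which layer indices would remain. The helper `kept_layer_indices` is hypothetical, and the "bottom" branch comes from the paper's list of strategies; the configuration above shows only "top", so treat other strategy names as assumptions.

```python
def kept_layer_indices(num_layers, strategy, k):
    """Return indices of encoder layers that remain after dropping k layers.

    Hypothetical helper for illustration; layer 0 is closest to the embeddings.
    """
    if not 0 <= k < num_layers:
        raise ValueError("k must be in [0, num_layers)")
    if strategy == "top":
        # Drop the last k layers (those nearest the output).
        return list(range(num_layers - k))
    if strategy == "bottom":
        # Drop the first k layers (those nearest the embeddings).
        return list(range(k, num_layers))
    raise ValueError(f"unsupported strategy: {strategy}")


# distilbert-base-uncased has 6 transformer layers.
print(kept_layer_indices(6, "top", 2))  # [0, 1, 2, 3]
```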

dialog-nlu's People

Contributors

mahmoudwahdan, matherialist, redrussianarmy

dialog-nlu's Issues

Determining confidence level for intent prediction

Is there any way to figure out the confidence level for an intent prediction? Often the system returns an inaccurate intent (for different reasons, of course), so it would be nice to get a confidence score for each prediction.
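One common way to get such a score is to take the highest softmax probability of the intent classifier's output. The sketch below assumes you can obtain the raw intent scores (logits) for an utterance; this thread does not confirm whether the library's predict call exposes them, and the names `softmax` and `top_intent` are illustrative.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_intent(logits, labels):
    """Return (label, confidence) for the highest-probability intent."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]


# Hypothetical logits for three intents, for illustration only.
label, confidence = top_intent(
    [2.0, 0.5, -1.0], ["PlayMusic", "GetWeather", "BookRestaurant"]
)
print(label, round(confidence, 3))
```

A threshold on this value can be used to reject low-confidence predictions, though softmax probabilities are often overconfident and may need calibration.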

Add examples notebooks

Add more examples in Jupyter notebook form.
Try to illustrate the concepts of optimizing the models and the trade-off between optimized model accuracy and size/performance.

Incremental training not working on Huggingface Trans model

I trained a model from scratch using:

python3 train_joint_trans.py --train=data/custom/train --val=data/custom/valid --save=saved_models/joint_trans_model --epochs=3 --batch=64 --cache_dir=transformers_cache_dir  --trans=deepset/bert-large-uncased-whole-word-masking-squad2 --from_pt=true

and then attempted to incrementally train the model using:

#incremental training
python3 train_joint_trans.py --train=data/hp-custom/train --val=data/hp-custom/valid --save=saved_models/joint_trans_model2 --epochs=3 --batch=64 --cache_dir=transformers_cache_dir  --trans=deepset/bert-large-uncased-whole-word-masking-squad2 --from_pt=true --model=saved_models/joint_trans_model

This resulted in the following error:

File "train_joint_trans.py", line 107, in
epochs=epochs, batch_size=batch_size, id2label=id2label)
File "/Users/jv/dev/project/test/ai/tensorflow/BERT-Concierge/dialog-nlu/models/base_joint_trans.py", line 79, in fit
callbacks=callbacks)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/tensorflow/python/keras/engine/training.py", line 108, in _method_wrapper
return method(self, *args, **kwargs)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/tensorflow/python/keras/engine/training.py", line 1098, in fit
tmp_logs = train_function(iterator)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/tensorflow/python/eager/def_function.py", line 780, in call
result = self._call(*args, **kwds)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/tensorflow/python/eager/def_function.py", line 846, in _call
return self._concrete_stateful_fn._filtered_call(canon_args, canon_kwds) # pylint: disable=protected-access
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/tensorflow/python/eager/function.py", line 1848, in _filtered_call
cancellation_manager=cancellation_manager)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/tensorflow/python/eager/function.py", line 1924, in _call_flat
ctx, args, cancellation_manager=cancellation_manager))
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/tensorflow/python/eager/function.py", line 550, in call
ctx=ctx)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/tensorflow/python/eager/execute.py", line 60, in quick_execute
inputs, attrs, num_outputs)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Received a label value of 40 which is outside the valid range of [0, 39).

Not sure why this is happening.
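The error says a label id of 40 was produced while the output layer only covers ids 0 through 38. One plausible cause (an assumption, not confirmed in this thread) is that data/hp-custom contains slot tags or intents that were absent from data/custom, so the saved model has no output unit for them. A minimal check, with a hypothetical helper name and made-up label sets:

```python
def unseen_labels(trained_labels, new_labels):
    """Return labels in the new data that the saved model was never trained on."""
    known = set(trained_labels)
    return sorted(set(new_labels) - known)


# Hypothetical label sets, for illustration only.
trained = ["O", "B-city", "I-city"]
incremental = ["O", "B-city", "B-date"]
missing = unseen_labels(trained, incremental)
if missing:
    print("Labels not covered by the saved model:", missing)
```

If the list is non-empty, incremental training on the saved model cannot represent those labels without rebuilding the output layers.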

Islamic dataset

Assalam o Alaikum brother, I need an Islamic questions-and-answers dataset. Can you provide one?

how to convert .h5 to checkpoint

Hello,

Is there an easier way to convert the trained .h5 file to a checkpoint? Or is it faster to rebuild the model in TensorFlow and save it to a checkpoint file?

Thanks in advance; I am looking forward to your reply.

Training is extremely case sensitive

Noticed something interesting but probably not related to your implementation. When you train the model, it seems to be VERY case-sensitive. So training the model with:
i love you -> LoveIntent
I hate you -> HateIntent

and then trying to predict (notice the lower-case 'i'):
i hate you -> yields prediction: LoveIntent

I suspect that this is inherent in the BERT and ALBERT models, but thought I would bring it to your attention.
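If the tokenizer in use is case-sensitive, one workaround is to apply the same case normalization to the training data and to utterances at prediction time. This is a sketch under that assumption; whether it resolves the behavior reported here is untested, and `normalize_case` is an illustrative name.

```python
def normalize_case(utterance):
    """Lowercase an utterance so 'I hate you' and 'i hate you' are identical."""
    return utterance.lower()


train_samples = [("i love you", "LoveIntent"), ("I hate you", "HateIntent")]
normalized = [(normalize_case(text), intent) for text, intent in train_samples]
print(normalized)
```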

unexpected keyword argument 'use_cdn' error

/usr/local/lib/python3.6/dist-packages/dialognlu/compression/commons.py in from_pretrained_detailed(model_class, pretrained_model_name_or_path, *model_args, **kwargs)
160 pretrained_model_name_or_path,
161 filename=(WEIGHTS_NAME if from_pt else TF2_WEIGHTS_NAME),
--> 162 use_cdn=use_cdn,
163 )
164

TypeError: hf_bucket_url() got an unexpected keyword argument 'use_cdn'

Add Arabic NLU Dataset and examples

The idea is to add an Arabic NLU dataset and examples, opening the door to more examples in non-English languages and supporting Arabic users.

Out of bounds error when adding spaces around ' or ?

I've decided to add spaces before and after any quote and question mark as follows:

For example: "Can you book me a table for 4 people at Core?" becomes "Can you book me a table for 4 people at Core ? "

This will yield the following error:

output[i][j] = data[i][idx]

IndexError: index 5 is out of bounds for axis 0 with size 5
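One plausible cause (an assumption; the library's tokenization is not shown in this thread) is that the trailing or doubled spaces produce empty tokens when the text is split on single spaces, so the token count no longer matches what downstream code expects. The difference between the two ways of splitting:

```python
def split_naive(text):
    """Split on single spaces; keeps empty strings for extra or trailing spaces."""
    return text.split(" ")


def split_clean(text):
    """Split on any run of whitespace; never yields empty tokens."""
    return text.split()


utterance = "Can you book me a table for 4 people at Core ? "
print(len(split_naive(utterance)))  # 13: the trailing space adds an empty token
print(len(split_clean(utterance)))  # 12
```

Normalizing whitespace with `" ".join(text.split())` before writing the data files and before prediction avoids the stray empty token.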

Prediction function for tflite model not working

from dialognlu import TransformerNLU

# save_path points to the folder of the previously saved model
nlu_model = TransformerNLU.load(save_path, quantized=True, num_process=1)
utterance = "add sabrina salerno to the grime instrumentals playlist"
result = nlu_model.predict(utterance)
print(result)
for item in result['slots']:
    print(item['value'])

Loading quantized model in 1 processes
Model Loaded, process id: 300

After loading the tflite model, it appears to keep processing and never returns the prediction for the utterance.

the saved model

Hi,
Recently I have also been focusing on research into joint training of slot filling and intent classification with BERT. May I ask why you save the trained model as 'xx.h5'? What does that mean, and how does it differ from a normal checkpoint?

Thanks in advance!
Ye

Way lower accuracy than example notebooks

Hey,
Thanks for the great repo! I am currently trying to train an NLU model and tested your using_bert_crf_nlu example notebook with the provided data.
I have tested both BERT and ALBERT embeddings. However, I only ever get 14% accuracy (so random classification), while you get ~98% accuracy. I am currently using Python 3.11; the only major compatibility issue I had was with sentencepiece (a dependency of transformers), where I now use the newest version instead of your specified version range.
I can't imagine this being the reason the model no longer works. I will continue investigating this, but wanted to ask whether you or someone else has already had this issue and has ideas!

Thanks!

Support TensorRT conversion and serving feature

I realized that TensorFlow Lite does not support inference on an Nvidia GPU. I have an Nvidia Jetson Xavier device. My current inference runs an unoptimized transformers model on the GPU, which is faster than inference with the TF Lite model on the CPU.

After some research, I found two model optimization options: TensorRT and TF-TRT. I tried several times to convert the fine-tuned transformers model to TensorRT but did not succeed. It would be great if dialog-nlu supported TensorRT conversion and serving.

bert_nlu_basic_api.py does not work with Trans Model

Hello there,
I have trained a trans model. When I try to run the bert_nlu_basic_api.py file, it does not work. I then realized that the Trans model classes are not imported in the file. I imported the JointTransBertModel class and loaded the model with JointTransBertModel.load(load_folder_path) in bert_nlu_basic_api.py. In the end, I got the following error:

ValueError: Unknown layer: Custom>TFBertMainLayer

Traceback (most recent call last):

  File "bert_nlu_basic_api.py", line 105, in <module>
    initialize()
  File "bert_nlu_basic_api.py", line 46, in initialize
    model = JointTransBertModel.load(load_folder_path)
  File "/home/hakan/Documents/HB/dialog-nlu/models/joint_trans_bert.py", line 42, in load
    return BaseJointTransformerModel.load_model_by_class(BaseJointTransformerModel, load_folder_path, 'joint_bert_model.h5')
  File "/home/hakan/Documents/HB/dialog-nlu/models/base_joint_trans.py", line 121, in load_model_by_class
    new_model.model = tf.keras.models.load_model(os.path.join(load_folder_path, trans_model_name))
  File "/home/hakan/.local/share/virtualenvs/dialog-nlu-TCo_F89F/lib/python3.6/site-packages/tensorflow/python/keras/saving/save.py", line 182, in load_model
    return hdf5_format.load_model_from_hdf5(filepath, custom_objects, compile)
  File "/home/hakan/.local/share/virtualenvs/dialog-nlu-TCo_F89F/lib/python3.6/site-packages/tensorflow/python/keras/saving/hdf5_format.py", line 178, in load_model_from_hdf5
    custom_objects=custom_objects)
  File "/home/hakan/.local/share/virtualenvs/dialog-nlu-TCo_F89F/lib/python3.6/site-packages/tensorflow/python/keras/saving/model_config.py", line 55, in model_from_config
    return deserialize(config, custom_objects=custom_objects)
  File "/home/hakan/.local/share/virtualenvs/dialog-nlu-TCo_F89F/lib/python3.6/site-packages/tensorflow/python/keras/layers/serialization.py", line 175, in deserialize
    printable_module_name='layer')
  File "/home/hakan/.local/share/virtualenvs/dialog-nlu-TCo_F89F/lib/python3.6/site-packages/tensorflow/python/keras/utils/generic_utils.py", line 358, in deserialize_keras_object
    list(custom_objects.items())))
  File "/home/hakan/.local/share/virtualenvs/dialog-nlu-TCo_F89F/lib/python3.6/site-packages/tensorflow/python/keras/engine/functional.py", line 617, in from_config
    config, custom_objects)
  File "/home/hakan/.local/share/virtualenvs/dialog-nlu-TCo_F89F/lib/python3.6/site-packages/tensorflow/python/keras/engine/functional.py", line 1204, in reconstruct_from_config
    process_layer(layer_data)
  File "/home/hakan/.local/share/virtualenvs/dialog-nlu-TCo_F89F/lib/python3.6/site-packages/tensorflow/python/keras/engine/functional.py", line 1186, in process_layer
    layer = deserialize_layer(layer_data, custom_objects=custom_objects)
  File "/home/hakan/.local/share/virtualenvs/dialog-nlu-TCo_F89F/lib/python3.6/site-packages/tensorflow/python/keras/layers/serialization.py", line 175, in deserialize
    printable_module_name='layer')
  File "/home/hakan/.local/share/virtualenvs/dialog-nlu-TCo_F89F/lib/python3.6/site-packages/tensorflow/python/keras/utils/generic_utils.py", line 347, in deserialize_keras_object
    config, module_objects, custom_objects, printable_module_name)
  File "/home/hakan/.local/share/virtualenvs/dialog-nlu-TCo_F89F/lib/python3.6/site-packages/tensorflow/python/keras/utils/generic_utils.py", line 296, in class_and_config_for_serialized_keras_object
    raise ValueError('Unknown ' + printable_module_name + ': ' + class_name)
ValueError: Unknown layer: Custom>TFBertMainLayer

Difficulty in TFLite Conversion

I tried a few ways to convert the joint-albert Keras (.h5) model in the output directory to TFLite.

  1. TF version : Stable (2.3.0) , tf-nightly(2.4.0-dev20201005) [Both produced same error logs]
    Code :
import tensorflow as tf
import tensorflow_hub as hub

model = tf.keras.models.load_model('/content/drive/My Drive/joint_albert_model/joint_bert_model.h5',custom_objects={'KerasLayer':hub.KerasLayer})
converter = tf.lite.TFLiteConverter.from_keras_model(model)
tflite_model = converter.convert()

Error Log :

/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/training.py:2292: UserWarning: `Model.state_updates` will be removed in a future version. This property should not be used in TensorFlow 2.0, as `updates` are applied automatically.
  warnings.warn('`Model.state_updates` will be removed in a future version. '
/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/base_layer.py:1377: UserWarning: `layer.updates` will be removed in a future version. This property should not be used in TensorFlow 2.0, as `updates` are applied automatically.
  warnings.warn('`layer.updates` will be removed in a future version. '
INFO:tensorflow:Assets written to: /tmp/tmp7g7xnu4t/assets
INFO:tensorflow:Assets written to: /tmp/tmp7g7xnu4t/assets
---------------------------------------------------------------------------
InvalidArgumentError                      Traceback (most recent call last)
/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/importer.py in _import_graph_def_internal(graph_def, input_map, return_elements, validate_colocation_constraints, name, producer_op_list)
    496         results = c_api.TF_GraphImportGraphDefWithResults(
--> 497             graph._c_graph, serialized, options)  # pylint: disable=protected-access
    498         results = c_api_util.ScopedTFImportGraphDefResults(results)

InvalidArgumentError: Input 2 of node model/AlbertLayer/StatefulPartitionedCall/StatefulPartitionedCall/StatefulPartitionedCall/albert_transformer_encoder/StatefulPartitionedCall/transformer/StatefulPartitionedCall was passed float from Func/model/AlbertLayer/StatefulPartitionedCall/StatefulPartitionedCall/StatefulPartitionedCall/albert_transformer_encoder/StatefulPartitionedCall/input/_106:0 incompatible with expected resource.

During handling of the above exception, another exception occurred:

ValueError                                Traceback (most recent call last)
12 frames
/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/importer.py in _import_graph_def_internal(graph_def, input_map, return_elements, validate_colocation_constraints, name, producer_op_list)
    499       except errors.InvalidArgumentError as e:
    500         # Convert to ValueError for backwards compatibility.
--> 501         raise ValueError(str(e))
    502 
    503     # Create _DefinedFunctions for any imported functions.

ValueError: Input 2 of node model/AlbertLayer/StatefulPartitionedCall/StatefulPartitionedCall/StatefulPartitionedCall/albert_transformer_encoder/StatefulPartitionedCall/transformer/StatefulPartitionedCall was passed float from Func/model/AlbertLayer/StatefulPartitionedCall/StatefulPartitionedCall/StatefulPartitionedCall/albert_transformer_encoder/StatefulPartitionedCall/input/_106:0 incompatible with expected resource.
  2. TF version : tf-nightly(2.4.0-dev20201005)
    Code :
import os

import tensorflow as tf
import tensorflow_hub as hub

model = tf.keras.models.load_model('/content/drive/My Drive/joint_albert_model/joint_bert_model.h5',custom_objects={'KerasLayer':hub.KerasLayer})
tf.saved_model.save(model, 'joint_albert_savedmodel')
converter = tf.lite.TFLiteConverter.from_saved_model('joint_albert_savedmodel')
tflite_model = converter.convert()
with tf.io.gfile.GFile(os.path.join("./", 'joint_albert.tflite'), 'wb') as f:
    f.write(tflite_model)

The above conversion worked and I got the tflite model, but when I tried inference I noticed that the conversion is broken: the tflite model's input_details and output_details are wrong.

Code for inference for the tflite model got above :

with tf.io.gfile.GFile("joint_albert.tflite", 'rb') as f:
    model_content = f.read()

interpreter = tf.lite.Interpreter(model_content=model_content)
interpreter.allocate_tensors()
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

print(input_details)
print(output_details)

Output :

[{'name': 'serving_default_input_type_ids:0', 'index': 0, 'shape': array([1, 1], dtype=int32), 'shape_signature': array([-1, -1], dtype=int32), 'dtype': <class 'numpy.int32'>, 'quantization': (0.0, 0), 'quantization_parameters': {'scales': array([], dtype=float32), 'zero_points': array([], dtype=int32), 'quantized_dimension': 0}, 'sparsity_parameters': {}}, {'name': 'serving_default_input_mask:0', 'index': 1, 'shape': array([1, 1], dtype=int32), 'shape_signature': array([-1, -1], dtype=int32), 'dtype': <class 'numpy.int32'>, 'quantization': (0.0, 0), 'quantization_parameters': {'scales': array([], dtype=float32), 'zero_points': array([], dtype=int32), 'quantized_dimension': 0}, 'sparsity_parameters': {}}, {'name': 'serving_default_valid_positions:0', 'index': 2, 'shape': array([  1,   1, 440], dtype=int32), 'shape_signature': array([ -1,  -1, 440], dtype=int32), 'dtype': <class 'numpy.float32'>, 'quantization': (0.0, 0), 'quantization_parameters': {'scales': array([], dtype=float32), 'zero_points': array([], dtype=int32), 'quantized_dimension': 0}, 'sparsity_parameters': {}}, {'name': 'serving_default_input_word_ids:0', 'index': 3, 'shape': array([1, 1], dtype=int32), 'shape_signature': array([-1, -1], dtype=int32), 'dtype': <class 'numpy.int32'>, 'quantization': (0.0, 0), 'quantization_parameters': {'scales': array([], dtype=float32), 'zero_points': array([], dtype=int32), 'quantized_dimension': 0}, 'sparsity_parameters': {}}]

[{'name': 'StatefulPartitionedCall:1', 'index': 945, 'shape': array([  1,   1, 440], dtype=int32), 'shape_signature': array([ -1,  -1, 440], dtype=int32), 'dtype': <class 'numpy.float32'>, 'quantization': (0.0, 0), 'quantization_parameters': {'scales': array([], dtype=float32), 'zero_points': array([], dtype=int32), 'quantized_dimension': 0}, 'sparsity_parameters': {}}, {'name': 'StatefulPartitionedCall:0', 'index': 941, 'shape': array([ 1, 53], dtype=int32), 'shape_signature': array([-1, 53], dtype=int32), 'dtype': <class 'numpy.float32'>, 'quantization': (0.0, 0), 'quantization_parameters': {'scales': array([], dtype=float32), 'zero_points': array([], dtype=int32), 'quantized_dimension': 0}, 'sparsity_parameters': {}}]

These input and output details for the tflite model are wrong; for example, the inputs at index 0 and 1 report "shape: array([1, 1], dtype=int32)", among others.

Is there a way to convert the joint-albert model to tflite and run inference on it? Or is this model not supported yet?

Please help.

ModuleNotFoundError

I'm getting this error when I tried to run the script for training after building the library :
ModuleNotFoundError: No module named 'dialognlu.models'

Checked the contents of dialognlu library that got installed and found just these files :

  1. __init__.py
  2. __pycache__
  3. auto_nlu.py
  4. compression
  5. nlu_components.py
  6. utils

Is there a problem with building the library, or am I doing something wrong here?
Please help.
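A common reason for a subpackage to be absent after `pip install .` is that its directory is not discovered as a package, for example because it lacks an `__init__.py` or is not listed in the setup script. Whether that applies here is an assumption. A stdlib-only sketch that lists source directories containing Python files but no `__init__.py` (the helper name is hypothetical):

```python
import os


def dirs_missing_init(package_root):
    """List subdirectories with .py files but no __init__.py (hypothetical check)."""
    missing = []
    for dirpath, dirnames, filenames in os.walk(package_root):
        # Skip bytecode caches; they never need an __init__.py.
        dirnames[:] = [d for d in dirnames if d != "__pycache__"]
        has_python = any(name.endswith(".py") for name in filenames)
        if has_python and "__init__.py" not in filenames:
            missing.append(os.path.relpath(dirpath, package_root))
    return sorted(missing)
```

Running `dirs_missing_init("dialognlu")` from the repository root would show whether a directory such as models is affected.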

Upgrade to TF 2

Any plans to upgrade this to TF2?
And perhaps implement some sort of Estimator and/or serving support.
Many thanks.

Typo error in bert_nlu_basic_api.py

There is a typo on line 30.
if type_ == 'albert':
It should be
if type_ == 'bert':

Creating a PR for such a small error would not make sense, hence this issue.

Thanks

Cannot load huggingface model

Trying to train using https://huggingface.co/deepset/bert-large-uncased-whole-word-masking-squad2

python3 train_joint_trans.py --train=data/snips/train --val=data/snips/valid --save=saved_models/joint_trans_model --epochs=3 --batch=64 --cache_dir=transformers_cache_dir --trans=deepset/bert-large-uncased-whole-word-masking-squad2 --from_pt=false

but I get error:

Traceback (most recent call last):
File "train_joint_trans.py", line 100, in
model = create_joint_trans_model(config)
File "/Users/aa/dev/project/hoverpin/ai/tensorflow/BERT-Concierge/dialog-nlu/models/trans_auto_model.py", line 65, in create_joint_trans_model
model, trans_type = get_transformer_model(pretrained_model_name_or_path, cache_dir, from_pt, layer_pruning)
File "/Users/aa/dev/project/hoverpin/ai/tensorflow/BERT-Concierge/dialog-nlu/models/trans_auto_model.py", line 42, in get_transformer_model
layer_pruning=layer_pruning)
File "/Users/aa/dev/project/hoverpin/ai/tensorflow/BERT-Concierge/dialog-nlu/compression/commons.py", line 384, in from_pretrained
return from_pretrained_detailed(model_class, pretrained_model_name_or_path, *model_args, config=config, **kwargs)
File "/Users/aa/dev/project/hoverpin/ai/tensorflow/BERT-Concierge/dialog-nlu/compression/commons.py", line 185, in from_pretrained_detailed
raise EnvironmentError(msg)
OSError: Can't load weights for 'deepset/bert-large-uncased-whole-word-masking-squad2'. Make sure that:

  • 'deepset/bert-large-uncased-whole-word-masking-squad2' is a correct model identifier listed on 'https://huggingface.co/models'

  • or 'deepset/bert-large-uncased-whole-word-masking-squad2' is the correct path to a directory containing a file named one of tf_model.h5, pytorch_model.bin.
