
Comments (4)

MahmoudWahdan commented on May 25, 2024

Converting NLU models based on the Transformers library to TFLite is now available in version 0.1.0 of the library.
An example can be found here.
We will provide built-in support for serving TFLite models.

from dialog-nlu.

redrussianarmy commented on May 25, 2024

First of all, this is a great feature. I converted my Transformers Joint BERT model to a TFLite model, but I get the following error when running inference. How can I solve it? And do you have any code for inference with a TFLite model?

P.S.: I did not install tf-nightly and its estimator because they require CUDA 11.0, which I don't have. My CUDA version is 10.1, and I used Tensorflow==2.3.1 to convert the model.

Code starts with these lines:

import tensorflow as tf

# Load the TFLite model and allocate tensors.
interpreter = tf.lite.Interpreter(model_path="relevant_path")
interpreter.allocate_tensors()

And the error:

Traceback (most recent call last):
  File "/home/hb/Documents/dialog-nlu/examples/infer_model.py", line 6, in <module>
    interpreter.allocate_tensors()
  File "/home/hb/.local/share/virtualenvs/dialog-nlu-9wkHWo7b/lib/python3.6/site-packages/tensorflow/lite/python/interpreter.py", line 243, in allocate_tensors
    return self._interpreter.AllocateTensors()
RuntimeError: tensorflow/lite/kernels/reshape.cc:55 stretch_dim != -1 (0 != -1)Node number 0 (RESHAPE) failed to prepare. 


MahmoudWahdan commented on May 25, 2024

@redrussianarmy Thanks for your good encouraging words!
Please note the following:
1- Based on your environment, select TF ops may be available in your TensorFlow version. Please check https://www.tensorflow.org/lite/guide/ops_select#python
Select TF ops support is mandatory for serving in Python.

Note: TensorFlow Lite with select TensorFlow ops is available in the TensorFlow pip package since version 2.3 for Linux and version 2.4 for other environments.
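For reference, enabling select TF ops happens at conversion time via the converter's `target_spec`. A minimal sketch, using a tiny stand-in Keras model in place of the real trained NLU model:

```python
import tensorflow as tf

# A tiny stand-in Keras model; in practice this would be the trained NLU model.
model = tf.keras.Sequential([tf.keras.layers.Dense(4, input_shape=(8,))])

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.target_spec.supported_ops = [
    tf.lite.OpsSet.TFLITE_BUILTINS,  # native TFLite builtin ops
    tf.lite.OpsSet.SELECT_TF_OPS,    # fall back to TensorFlow ops when needed
]
tflite_model = converter.convert()

with open("model.tflite", "wb") as f:
    f.write(tflite_model)
```

Serving a model converted this way then requires a TensorFlow build with the select ops available, as the linked guide explains.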

2- Serving with fp16_quantization, for example, is not supported on a PC GPU. You need to disable the GPU for TensorFlow at the beginning of your serving script:

import os

os.environ["CUDA_VISIBLE_DEVICES"] = "-1"

3- Shapes in the model should not use None; however, I forgot to handle this in the previous release. These are the offending inputs:

in_id = Input(shape=(None,), name='input_word_ids', dtype=tf.int32)
in_mask = Input(shape=(None,), name='input_mask', dtype=tf.int32)
in_segment = Input(shape=(None,), name='input_type_ids', dtype=tf.int32)
in_valid_positions = Input(shape=(None, self.slots_num), name='valid_positions')
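The same four inputs with fixed shapes might look like the sketch below. Here `max_length` and `slots_num` are illustrative values standing in for the model's real sequence length and slot count (`self.slots_num` in the library):

```python
import tensorflow as tf
from tensorflow.keras.layers import Input

max_length = 64   # illustrative fixed sequence length
slots_num = 10    # illustrative slot count (self.slots_num in the library)

# Fixed-length shapes let TFLite pre-allocate tensors at load time.
in_id = Input(shape=(max_length,), name='input_word_ids', dtype=tf.int32)
in_mask = Input(shape=(max_length,), name='input_mask', dtype=tf.int32)
in_segment = Input(shape=(max_length,), name='input_type_ids', dtype=tf.int32)
in_valid_positions = Input(shape=(max_length, slots_num), name='valid_positions')
```

With fixed shapes the RESHAPE node can resolve its dimensions, which is what the `stretch_dim != -1` error above is complaining about.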

4- One last important note: as per the documentation here for version 2.3.0,

It is possible to use this interpreter in a multithreaded Python environment, but you must be sure to call functions of a particular instance from only one thread at a time.

So that's why I left the issue open. Shortly, I'll provide a production-ready TFLite serving feature that is thread safe and utilizes Python's multiprocessing. Please stay tuned!
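Until that lands, one common workaround is to guard every call into the interpreter with a lock so only one thread touches it at a time. A minimal sketch, with a hypothetical `predict_fn` standing in for the code that sets inputs, calls `interpreter.invoke()`, and reads outputs:

```python
import threading

class LockedModel:
    """Serializes access to a single non-thread-safe model object."""

    def __init__(self, predict_fn):
        self._predict_fn = predict_fn   # e.g. a closure around interpreter.invoke()
        self._lock = threading.Lock()

    def predict(self, inputs):
        # Only one thread may call into the underlying model at a time.
        with self._lock:
            return self._predict_fn(inputs)
```

This trades throughput for safety; running one interpreter per worker process, as planned above, avoids the contention entirely.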


MahmoudWahdan commented on May 25, 2024

Hi @redrussianarmy
Good news!
I just released the new features that support TFLite conversion and serving: 676c690
Release: https://github.com/MahmoudWahdan/dialog-nlu/releases/tag/v0.2.0

You may refer to the examples for concrete demonstrations of how to save a TFLite model and use it for prediction and evaluation.

nlu = TransformerNLU.from_config(config)
nlu.train(train_dataset, val_dataset, epochs, batch_size)
# save with TFLite conversion enabled
nlu.save(save_path, save_tflite=True, conversion_mode="hybrid_quantization")
# load the quantized model with 4 worker processes
nlu = TransformerNLU.load(model_path, quantized=True, num_process=4)
# prediction
result = nlu.predict(utterance)
# evaluation
token_f1_score, tag_f1_score, report, acc = nlu.evaluate(test_dataset)

