tensorflow / hub

A library for transfer learning by reusing parts of TensorFlow models.

Home Page: https://tensorflow.org/hub

License: Apache License 2.0

Python 93.88% Shell 0.75% Starlark 5.37%

tensorflow machine-learning transfer-learning embeddings image-classification python ml

hub's Introduction

TensorFlow Hub has moved to Kaggle Models

Starting November 15th 2023, links to tfhub.dev redirect to their counterparts on Kaggle Models. tensorflow_hub will continue to support downloading models that were initially uploaded to tfhub.dev via e.g. hub.load("https://tfhub.dev/<publisher>/<model>/<version>"). Although no migration or code rewrites are explicitly required, we recommend replacing tfhub.dev links with their Kaggle Models counterparts to improve code health and debuggability. See FAQs here.

As of March 18, 2024, unmigrated model assets (see list below) were deleted and retrieval is no longer possible. These unmigrated model assets include:

tensorflow_hub

This GitHub repository hosts the tensorflow_hub Python library to download and reuse SavedModels in your TensorFlow program with a minimum amount of code, as well as other associated code and documentation.

Getting Started

- Introduction
- The asset types of tfhub.dev
  - SavedModels for TensorFlow 2 and the Reusable SavedModel interface.
  - Deprecated: Models in TF1 Hub format and their Common Signatures collection.
- Using the library
- Tutorials

Contributing

If you'd like to contribute to TensorFlow Hub, be sure to review the contribution guidelines. To contribute code to the library itself (not examples), you will probably need to build from source.

This project adheres to TensorFlow's code of conduct. By participating, you are expected to uphold this code.

We use GitHub issues for tracking requests and bugs.

License

Apache License 2.0

hub's Issues

downloading universal sentence encoder pre-trained model

I am loading the Universal Sentence Encoder pre-trained model with the command below:
embed = hub.Module("https://tfhub.dev/google/universal-sentence-encoder/1")

Why do I need to fetch the model from the cloud every time?

I tried wget https://tfhub.dev/google/universal-sentence-encoder/1
I also tried opening the link above in a browser, but it redirects me to https://www.tensorflow.org/hub/modules/google/universal-sentence-encoder/1
How can I download the pre-trained model?
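
For reference, the hosting protocol in docs/hosting.md serves a module as a tar.gz archive when the query parameter `tf-hub-format=compressed` is appended to the handle. The sketch below fetches and unpacks such an archive with the Python standard library only. The helper names `compressed_url` and `download_module` are made up for illustration, and the download only works while the server still serves the handle in question.

```python
import tarfile
import urllib.request
from pathlib import Path


def compressed_url(handle):
    """Return the URL that serves the module as a tar.gz archive."""
    separator = "&" if "?" in handle else "?"
    return f"{handle}{separator}tf-hub-format=compressed"


def download_module(handle, target_dir):
    """Download the archive for `handle` and unpack it into `target_dir`."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    with urllib.request.urlopen(compressed_url(handle)) as response:
        # Stream mode ("r|gz") reads the HTTP response without seeking.
        with tarfile.open(fileobj=response, mode="r|gz") as archive:
            # Only unpack archives from a source you trust.
            archive.extractall(target)
    return target
```

The unpacked directory can then be passed to hub.Module in place of the URL, for example hub.Module("/path/to/universal-sentence-encoder").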

Difficulty in reusing model

I am trying to create a new module instance under a reused variable scope. A minimal example looks like this:

import tensorflow as tf
import tensorflow_hub as hub

with tf.variable_scope('abc'):
    elmo_train = hub.Module("https://tfhub.dev/google/elmo/1", trainable=True)

with tf.variable_scope('abc', reuse=True):
    elmo_test = hub.Module("https://tfhub.dev/google/elmo/1", trainable=True)

This gives me a RuntimeError.

Is this the correct way to re-use TF Hub modules? What's a good work-around?

Questions about TF hub

Hello!
I read the docs and source code of TF Hub and have some questions:

- The https://github.com/tensorflow/hub/blob/master/docs/hosting.md protocol is very small. Is that intended, or do you plan to extend it? I would like to see an API to get module specs without downloading the tarball, and some module discovery mechanism.
- Do you have plans for a versioning mechanism? It looks like the version is just part of a module URL, and it could be missing if a module is deployed differently.
- Is there a Docker image, or instructions on how to do a local deployment of TF Hub?

Thank you in advance.
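
To make the versioning point concrete: with handles of the form https://tfhub.dev/<publisher>/<model>/<version>, the version is only the last path segment, so a client can recover it only by convention. A minimal sketch of that convention follows. `ModuleHandle` and `parse_handle` are hypothetical names, and treating a trailing all-digit segment as the version is an assumption, not something the protocol guarantees.

```python
from typing import NamedTuple, Optional
from urllib.parse import urlparse


class ModuleHandle(NamedTuple):
    publisher: str
    model: str
    version: Optional[int]  # None when the URL carries no version segment


def parse_handle(url):
    """Split a module URL into publisher, model path and optional version."""
    parts = [p for p in urlparse(url).path.split("/") if p]
    version = None
    if parts and parts[-1].isdigit():
        version = int(parts[-1])
        parts = parts[:-1]
    if len(parts) < 2:
        raise ValueError(f"not a module handle: {url!r}")
    return ModuleHandle(parts[0], "/".join(parts[1:]), version)
```

A module hosted elsewhere, for example at https://example.com/acme/encoder, parses with version None, which is the situation the question describes.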

Allow custom weight_decay in TF-Hub Image modules

Summary: Currently there is no way to provide weight decay to Image modules of TF-Hub - it is hardcoded inside the model. This curtails the customisability of the model as weight decay is one of the core hyperparameters in any model.

Reference: Stackoverflow discussion : link

Details:
The current implementation of the image modules uses TF-Slim, and the weight decay is hardcoded (e.g. for Inception V3 it is 0.00004). I'm currently implementing transfer learning for a geology dataset with Inception, and would really appreciate the ability to tune weight decay.

p.s: TF-Hub is an awesome idea - kudos to the TF Hub developers. Special thanks to @arnoegw for swift clarifications!

Keras integration

Hi,
Any news on whether I can integrate hub models with Keras ones?
Thanks

Failed to download model

A couple of hours ago, I tried to download a new model for classification, but it gave me this message:

INFO:tensorflow:Using /tmp/tfhub_modules to cache modules.
INFO:tensorflow:Downloading TF-Hub Module 'https://tfhub.dev/google/imagenet/mobilenet_v2_140_224/classification/1'.
Traceback (most recent call last):
  File "retrain.py", line 1333, in <module>
    tf.app.run(main=main, argv=[sys.argv[0]] + unparsed)
  File "/home/mido/.local/lib/python2.7/site-packages/tensorflow/python/platform/app.py", line 126, in run
    _sys.exit(main(argv))
  File "retrain.py", line 1017, in main
    module_spec = hub.load_module_spec(FLAGS.tfhub_module)
  File "/home/mido/.local/lib/python2.7/site-packages/tensorflow_hub/native_module.py", line 99, in load_module_spec
    path = compressed_module_resolver.get_default().get_module_path(path)
  File "/home/mido/.local/lib/python2.7/site-packages/tensorflow_hub/resolver.py", line 385, in get_module_path
    return self._get_module_path(handle)
  File "/home/mido/.local/lib/python2.7/site-packages/tensorflow_hub/resolver.py", line 467, in _get_module_path
    return resolver.get_module_path(handle)
  File "/home/mido/.local/lib/python2.7/site-packages/tensorflow_hub/resolver.py", line 385, in get_module_path
    return self._get_module_path(handle)
  File "/home/mido/.local/lib/python2.7/site-packages/tensorflow_hub/compressed_module_resolver.py", line 105, in _get_module_path
    self._lock_file_timeout_sec())
  File "/home/mido/.local/lib/python2.7/site-packages/tensorflow_hub/resolver.py", line 313, in atomic_download
    download_fn(handle, tmp_dir)
  File "/home/mido/.local/lib/python2.7/site-packages/tensorflow_hub/compressed_module_resolver.py", line 101, in download
    response = url_opener.open(request)
  File "/usr/lib/python2.7/urllib2.py", line 429, in open
    response = self._open(req, data)
  File "/usr/lib/python2.7/urllib2.py", line 447, in _open
    '_open', req)
  File "/usr/lib/python2.7/urllib2.py", line 407, in _call_chain
    result = func(*args)
  File "/usr/lib/python2.7/urllib2.py", line 1241, in https_open
    context=self._context)
  File "/usr/lib/python2.7/urllib2.py", line 1198, in do_open
    raise URLError(err)
urllib2.URLError: <urlopen error [Errno 101] Network is unreachable>

I tried to download an existing model to test the problem, but it gives me the same error!
Any help?
Note: I tried the image module links on another PC and Internet connection, and the links are not responding!
Could anyone test the module download process as done by the retrain.py code?

Notebook companion of universal-sentence-encoder does not run

In the link provided by the universal sentence encoder module, the notebook seems to be unable to use hub.Module correctly. This particular line

embed = hub.Module("https://tfhub.dev/google/universal-sentence-encoder/1")

yields

KeyError: "The name 'global_step:0' refers to a Tensor which does not exist. The operation, 'global_step', does not exist in the graph."

Actually, the whole graph is not imported. I hope this helps.

Progan-128 implementation

I find the TF Hub project commendable and I’m looking forward to seeing more modules being available.

One remark that I have now is that I can’t seem to find a module’s implementation. Are there any plans to release these? I’m especially interested in the progressive GAN source code.

Transfer Learning for Neural Network

Hi Experts,

I have a model consisting of a neural network, an LSTM and a memory network. However, it takes almost six hours to complete training, and whenever I want to add data for a new use case, it needs to retrain on all the data.

Can transfer learning help reduce the training time for the new use case data?
If yes, please guide me. If not, is there any other way to achieve this?

Thanks,
Sachin B. Ichake

Module download

Why does it download the module every time I restart the system? Can't I save that file on my local system?
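
For context, the logs in other reports on this page show modules being cached under /tmp/tfhub_modules, a location that many systems clear on reboot. The library reads the `TFHUB_CACHE_DIR` environment variable to choose a different cache directory. A minimal sketch, in which the helper name `use_persistent_hub_cache` is hypothetical and any path passed to it is just an example:

```python
import os
from pathlib import Path


def use_persistent_hub_cache(cache_dir):
    """Point tensorflow_hub at a cache directory that survives reboots."""
    path = Path(cache_dir).expanduser().resolve()
    path.mkdir(parents=True, exist_ok=True)
    # Set this before the first hub.Module / hub.load call resolves a URL.
    os.environ["TFHUB_CACHE_DIR"] = str(path)
    return path
```

Calling use_persistent_hub_cache("~/tfhub_modules") at the top of a script should make later runs reuse the downloaded copy. Exporting the variable in the shell has the same effect.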

Request: Module Download Progress

Would it be possible to provide a bit more information on module download progress? I think this would be helpful for larger modules like the USE, because otherwise it seems like they're stalling out. I'd be happy to do the implementation myself and submit a pull request if the authors can point me to an example they'd be happy with.

Thanks!

Universal sentence encoder training

I want to use the Universal Sentence Encoder for clustering, but my data comes from a different domain. Is it possible to train on my data to generate the sentence vectors?

Enable 'serve' tag-set in create_module_spec function call

It would be useful to export modules in a format that can be consumed by TensorFlow Serving. Servables would increase code reuse and further enable distributed workloads.

Example Usage (note passing "serve" instead of "train"):

hub.create_module_spec(
    module_fn,
    tags_and_args=[({"serve"}, {"is_training":False})],
    drop_collections=None)

Retrain image classification networks with images of different size

I think it would be useful to allow a pre-trained image classification network to be used as a feature generator for images with other shapes (greater width/height). At the moment this is forbidden because the images then do not have the shape that get_expected_image_size() returns, which results in an exception.

S3D (I3D light)

Can we have the lighter separable convolution version with fewer parameters?
See also tensorflow/tensorflow#7278

/cc @andresusanopinto

Universal Sentence Embedding Inference Speed

More of a general question, but is there a way to speed up multiple calls to the Universal Sentence embedding module when trying to run inference on batches of 1?

For example:
After creating a session and initializing global variables-

%%time 
r2 = session.run(embed(['and this is another test'], signature='default'))
CPU times: user 10.9 s, sys: 8 ms, total: 10.9 s
Wall time: 10.9 s

And again with new data

%%time 
r2 = session.run(embed(['test it a second time'], signature='default'))
CPU times: user 10.6 s, sys: 8 ms, total: 10.6 s
Wall time: 10.6s

Is something being reloaded at every call?
Outside of running on a GPU, is it possible to speed up inference for subsequent calls?

When profiling the tensorflow code with Timeline and viewing the trace with chrome trace, the total execution time of all the ops is less than 1 second. Any thoughts?

Can not get reproducible training of modules

I'm training my own module and that works great, but when I rerun my experiment I get different results, even though I set the tf_random_seed parameter in the RunConfig of the estimator that trains the module. Even if I explicitly set the seed within the module, the training is not reproducible:

def module_fn(self):
    (seed, _) = tf.get_seed(None)
    tf.set_random_seed(seed)
    ...

But I would expect that the default behavior would be to use the seed of the estimator. Any ideas on how to fix this?

Thanks

Jonas

Unable to use an already downloaded model

import os

import tensorflow as tf
import tensorflow_hub as hub


def make_estimator(model_dir):
    # Text embedding column backed by the NNLM module from tfhub.dev.
    embedded_text_feature_column = hub.text_embedding_column(
        key="text", module_spec="https://tfhub.dev/google/nnlm-en-dim128/1")

    return tf.estimator.DNNClassifier(
        n_classes=2,
        feature_columns=[embedded_text_feature_column],
        hidden_units=[500, 100],
        model_dir=model_dir)


if __name__ == "__main__":
    MODEL_DIR = os.path.join(os.getcwd(), "tmp")
    estimator_from_file = make_estimator(MODEL_DIR)

INFO:tensorflow:Using /var/folders/99/p2j71d856w9gk0rzfwyms3nj56r_4w/T/tfhub_modules to cache modules.
INFO:tensorflow:Module 'https://tfhub.dev/google/nnlm-en-dim128/1' already being downloaded by 'RAPP-YSun.local.25445.f8e7b9b2dd764ca1b1b13e8cbe92e2b1'. Waiting.
INFO:tensorflow:Module 'https://tfhub.dev/google/nnlm-en-dim128/1' already being downloaded by 'RAPP-YSun.local.25445.f8e7b9b2dd764ca1b1b13e8cbe92e2b1'. Waiting.
INFO:tensorflow:Module 'https://tfhub.dev/google/nnlm-en-dim128/1' already being downloaded by 'RAPP-YSun.local.25445.f8e7b9b2dd764ca1b1b13e8cbe92e2b1'. Waiting.
INFO:tensorflow:Module 'https://tfhub.dev/google/nnlm-en-dim128/1' already being downloaded by 'RAPP-YSun.local.25445.f8e7b9b2dd764ca1b1b13e8cbe92e2b1'. Waiting.
INFO:tensorflow:Module 'https://tfhub.dev/google/nnlm-en-dim128/1' already being downloaded by 'RAPP-YSun.local.25445.f8e7b9b2dd764ca1b1b13e8cbe92e2b1'. Waiting.

and it just keeps showing the Waiting message.
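
The repeated "already being downloaded by ... Waiting." message suggests the library found a lock left behind by another process, possibly one that has already died. The sketch below lists candidate lock files in the cache directory. It assumes that lock files sit next to the module directories and end in `.lock`, which is an assumption about library internals rather than documented API, and `find_hub_locks` is a hypothetical helper.

```python
import os
import tempfile
from pathlib import Path


def default_hub_cache_dir():
    """TFHUB_CACHE_DIR if set, else <tempdir>/tfhub_modules as seen in the logs."""
    fallback = os.path.join(tempfile.gettempdir(), "tfhub_modules")
    return Path(os.environ.get("TFHUB_CACHE_DIR", fallback))


def find_hub_locks(cache_dir=None):
    """Return the *.lock files in the cache directory, sorted by name."""
    root = Path(cache_dir) if cache_dir is not None else default_hub_cache_dir()
    if not root.is_dir():
        return []
    return sorted(root.glob("*.lock"))
```

If a listed lock belongs to a process that no longer exists, removing it (together with any partial download next to it) should let the next run start a fresh download.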

Feature idea - int Tensors with space delimiter

Hi,
It would be very nice to have a text embedding model that accepts a batch of int Tensors as inputs, instead of plain strings, with each value representing a character. The only restriction would be that the ints must correspond to the same character indices in the alphabet used to train the model (e.g. <space>=0, a=1, b=2, ..., <zero_pad_value> = 42), so that the words can be recognised and segmented properly.

Unless this is too soon 🐤
Thanks a lot for this project.
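
To illustrate the requested input format, here is a small sketch of the character-to-int encoding described above. Only the values <space>=0, a=1, b=2 and the padding value 42 come from the request. The rest of the alphabet (lowercase letters only) and the helper names are assumptions made for the example.

```python
import string

# Hypothetical alphabet: index 0 is the space, 1..26 are 'a'..'z'.
ALPHABET = " " + string.ascii_lowercase
PAD_VALUE = 42  # padding id taken from the example in the request


def encode_chars(text, length, alphabet=ALPHABET, pad_value=PAD_VALUE):
    """Map each character to its alphabet index and pad to `length`."""
    index = {ch: i for i, ch in enumerate(alphabet)}
    ids = [index[ch] for ch in text.lower()]  # KeyError on unknown characters
    if len(ids) > length:
        raise ValueError("text is longer than the requested length")
    return ids + [pad_value] * (length - len(ids))


def encode_batch(texts):
    """Encode several strings, padded to the longest one in the batch."""
    length = max(len(t) for t in texts)
    return [encode_chars(t, length) for t in texts]
```

A batch such as encode_batch(["hi", "a b"]) yields equal-length rows of ints that could be fed to a model as a single int tensor.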

Fail to convert retrained MobileNet with fake quantization to TF Lite

Hi,
I am trying to generate a quantized MobileNet for TF Lite using the flowers demo.

Retraining seems to be OK, but I can't convert the result to quantized TF Lite with toco.

retraining with fake quantization:

python retrain.py --how_many_training_steps 200 --image_dir ~/flower_photos/ --tfhub_module https://tfhub.dev/google/imagenet/mobilenet_v1_100_224/quantops/feature_vector/1

conversion with toco:

cat to_tflite_quant_flowers_demo2.sh
./bazel-bin/tensorflow/contrib/lite/toco/toco \
  --input_file=/tmp/output_graph.pb \
  --output_file=/tmp/output_graph.lite \
  --input_format=TENSORFLOW_GRAPHDEF \
  --output_format=TFLITE \
  --input_format=TENSORFLOW_GRAPHDEF --output_format=TFLITE \
  --inference_input_type=QUANTIZED_UINT8 \
  --inference_type=QUANTIZED_UINT8 \
  --input_shapes="1,224, 224,3" \
  --input_arrays=input \
  --output_arrays=final_result \
  --mean_values=128 \
  --std_values=128 \
  --default_ranges_min=0 \
  --default_ranges_max=6

Which gives me the following error.

2018-05-11 11:45:18.264214: F tensorflow/contrib/lite/toco/graph_transformations/quantize.cc:555] Check failed: is_rnn_state_array
to_tflite_quant_flowers_demo2.sh: line 16: 19971 Abort trap: 6           ./bazel-bin/tensorflow/contrib/lite/toco/toco --input_file=/tmp/output_graph.pb --output_file=/tmp/output_graph.lite --input_format=TENSORFLOW_GRAPHDEF --output_format=TFLITE --input_format=TENSORFLOW_GRAPHDEF --output_format=TFLITE --inference_input_type=QUANTIZED_UINT8 --inference_type=QUANTIZED_UINT8 --input_shapes="1,224, 224,3" --input_arrays=input --output_arrays=final_result --mean_values=128 --std_values=128 --default_ranges_min=0 --default_ranges_max=6

When I try to generate a FLOAT tflite without fake quantization, it works.

My stack:

tensorflow '1.8.0'
tensorflow_hub '0.1.0'

View model as graph

Hello, are there any pointers on how to view the tf.hub models as a graph? The usual boilerplate I use results in colaboratory notebooks dropping the data since it is so massive. If it matters the model I'm trying to view is the generative images module. Been trying to access the signatures and keys to figure out how to parse what I don't need for the graph, but any pointers before sinking more time into it would be great. Thank you

This is the usual boilerplate I use:

import numpy as np
import tensorflow as tf
from IPython.display import clear_output, Image, display, HTML

def strip_consts(graph_def, max_const_size=32):
    """Strip large constant values from graph_def."""
    strip_def = tf.GraphDef()
    for n0 in graph_def.node:
        n = strip_def.node.add() 
        n.MergeFrom(n0)
        if n.op == 'Const':
            tensor = n.attr['value'].tensor
            size = len(tensor.tensor_content)
            if size > max_const_size:
                # tensor_content is a bytes field, so convert the placeholder text.
                tensor.tensor_content = tf.compat.as_bytes("<stripped %d bytes>" % size)
    return strip_def

def show_graph(graph_def, max_const_size=32):
    """Visualize TensorFlow graph."""
    if hasattr(graph_def, 'as_graph_def'):
        graph_def = graph_def.as_graph_def()
    strip_def = strip_consts(graph_def, max_const_size=max_const_size)
    code = """
        <script src="//cdnjs.cloudflare.com/ajax/libs/polymer/0.3.3/platform.js"></script>
        <script>
          function load() {{
            document.getElementById("{id}").pbtxt = {data};
          }}
        </script>
        <link rel="import" href="https://tensorboard.appspot.com/tf-graph-basic.build.html" onload=load()>
        <div style="height:600px">
          <tf-graph-basic id="{id}"></tf-graph-basic>
        </div>
    """.format(data=repr(str(strip_def)), id='graph'+str(np.random.rand()))

    iframe = """
        <iframe seamless style="width:1200px;height:620px;border:0" srcdoc="{}"></iframe>
    """.format(code.replace('"', '&quot;'))
    display(HTML(iframe))

How to fine-tune the Universal Sentence Encoder model

Hi, I am wondering how to fine-tune the Universal Sentence Encoder model. I have set trainable to True and printed all the trainable_variables, but no variable from the USE model seems to appear.

Best regards,
Hao

create module from existing graph

I'm trying to create a module from an existing graph. According to the module creation documentation, I need to construct the entire graph myself.

Is there a way to load an existing graph from a file and convert it to a module?

My use case: I have retrained an existing model and stored the new retrained model/graph on disk. I would now like to create a module from it and publish it for use with TensorFlow Hub.

incorporating non default signature into tf.estimator

A few questions around combining tensorflow_hub and tf.estimator. I am using a non default signature to embed words with ELMO. Is this the best practice for doing so?

elmo is instantiated in the model function:

def model_fn(features, labels, mode, params):
    # tf.logging.set_verbosity(tf.logging.ERROR)
    elmo = hub.Module("https://tfhub.dev/google/elmo/1", trainable=False)

    with tf.variable_scope("elmo" + str(np.random.rand())):
        # ELMo requires padded raw text.
        words = tf.string_split(features['text'])
        dense_words = tf.sparse_to_dense(sparse_indices=words.indices,
                                         sparse_values=words.values,
                                         default_value="",
                                         output_shape=[params.batch_size,
                                                       params.max_seq_len])
        # (B x W_t, 1024)
        elmo_embeddings = elmo(
            inputs={
                "tokens": dense_words,
                "sequence_len": tf.cast(tf.reshape(features['word_pad'],
                                                   shape=[BATCH_SIZE,]), tf.int32)
            },
            signature="tokens",
            as_dict=True)["elmo"]

This does not use hub.text_embedding_column.
This also breaks viewing the graph in tensorboard.

Is this the best way to use hub in estimator?
Is there a way to view the graph in tensorboard?

How do I get the vocabulary used in universal-sentence-encoder?

I would like to check if a few words are included in the vocabulary. Thanks.

Universal sentence encoder speed

Hi,

I am trying the sample code provided with TensorFlow Hub. I want to use the sentence embedding vectors at runtime. It consistently takes about 4 seconds on CPU for one sentence. Is there something I can do to speed it up?

import tensorflow as tf
import tensorflow_hub as hub
import time

# Import the Universal Sentence Encoder's TF Hub module
embed = hub.Module("https://tfhub.dev/google/universal-sentence-encoder/1")

sentence = "I am a sentence for which I would like to get its embedding."
messages = [sentence]

with tf.Session() as session:
  session.run([tf.global_variables_initializer(), tf.tables_initializer()])
  t1 = time.time()
  message_embeddings = session.run(embed(messages))
  print(time.time() - t1)

DataLossError: Checksum does not match:

Hi, I'm having some problems running TensorFlow Hub models. I am getting a DataLossError when calling sess.run. I am using Ubuntu 16.04 and Python 3.6 with tensorflow==1.8.0 and tensorflow_hub==0.1.0.

However, when I run it on a macOS machine, the script below works without problems.

Thanks

In [1]: import tensorflow as tf
   ...: import tensorflow_hub as hub
   ...: 

In [2]: from tensorflow.python.client import device_lib
   ...: print(device_lib.list_local_devices())
   ...: 
2018-04-30 13:53:30.576282: I tensorflow/core/platform/cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
[name: "/device:CPU:0"
device_type: "CPU"
memory_limit: 268435456
locality {
}
incarnation: 9094741255132568752
]

In [3]: print("TensorFlow version: {}".format(tf.VERSION))
   ...: print("Eager execution: {}".format(tf.executing_eagerly()))
   ...: 
TensorFlow version: 1.8.0
Eager execution: False

In [4]: ENGLISH_WORD2VEC = 'https://tfhub.dev/google/nnlm-en-dim128/1'
   ...: 

In [5]: # switch here
   ...: embedding_module = ENGLISH_WORD2VEC
   ...: embed = hub.Module(embedding_module, trainable=False)
   ...: 
INFO:tensorflow:Using /tmp/tfhub_modules to cache modules.
INFO:tensorflow:Initialize variable module/embeddings/part_0:0 from checkpoint b'/tmp/tfhub_modules/32f2b2259e1cc8ca58c876921748361283e73997/variables/variables' with embeddings

In [6]: with tf.Session() as sess:
   ...:     sess.run(tf.global_variables_initializer())
   ...:     sess.run(tf.local_variables_initializer())
   ...:     sess.run(tf.tables_initializer())
   ...: 
   ...:     print(sess.run(embeddings))
   ...:     


DataLossError: Checksum does not match: stored 3328590665 vs. calculated on the restored bytes 94957053
         [[Node: checkpoint_initializer = RestoreV2[dtypes=[DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"](checkpoint_initializer/prefix, checkpoint_initializer/tensor_names, checkpoint_initializer/shape_and_slices)]]

Caused by op 'checkpoint_initializer', defined at:
  File "~/venv/bin/ipython", line 11, in <module>
    sys.exit(start_ipython())
  File "~/venv/lib/python3.6/site-packages/IPython/__init__.py", line 119, in start_ipython
    return launch_new_instance(argv=argv, **kwargs)
  File "~/venv/lib/python3.6/site-packages/traitlets/config/application.py", line 658, in launch_instance
    app.start()
  File "~/venv/lib/python3.6/site-packages/IPython/terminal/ipapp.py", line 355, in start
    self.shell.mainloop()
  File "~/venv/lib/python3.6/site-packages/IPython/terminal/interactiveshell.py", line 493, in mainloop
    self.interact()
  File "~/venv/lib/python3.6/site-packages/IPython/terminal/interactiveshell.py", line 484, in interact
    self.run_cell(code, store_history=True)
  File "~/venv/lib/python3.6/site-packages/IPython/core/interactiveshell.py", line 2718, in run_cell
    interactivity=interactivity, compiler=compiler, result=result)
  File "~/lib/python3.6/site-packages/IPython/core/interactiveshell.py", line 2822, in run_ast_nodes
    if self.run_code(code, result):
  File "~/lib/python3.6/site-packages/IPython/core/interactiveshell.py", line 2882, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-5-68aeaedb27fc>", line 4, in <module>
    embed = hub.Module(embedding_module, trainable=False)
  File "~/venv/lib/python3.6/site-packages/tensorflow_hub/module.py", line 126, in __init__
    tags=self._tags)
  File "~/venv/lib/python3.6/site-packages/tensorflow_hub/native_module.py", line 282, in _create_impl
    name=name)
  File "~/venv/lib/python3.6/site-packages/tensorflow_hub/native_module.py", line 338, in __init__
    tf.train.init_from_checkpoint(self._checkpoint_path, self._variable_map)
  File "~/lib/python3.6/site-packages/tensorflow/python/training/checkpoint_utils.py", line 221, in init_from_checkpoint
    _set_variable_or_list_initializer(var, ckpt_file, tensor_name_in_ckpt)
  File "~/venv/lib/python3.6/site-packages/tensorflow/python/training/checkpoint_utils.py", line 335, in _set_variable_or_list_initializer
    _set_checkpoint_initializer(v, ckpt_file, tensor_name, slice_info.spec)
  File "~/venv/lib/python3.6/site-packages/tensorflow/python/training/checkpoint_utils.py", line 299, in _set_checkpoint_initializer
    ckpt_file, [tensor_name], [slice_spec], [base_type], name=name)[0]
  File "~/venv/lib/python3.6/site-packages/tensorflow/python/ops/gen_io_ops.py", line 1463, in restore_v2
    shape_and_slices=shape_and_slices, dtypes=dtypes, name=name)
  File "~/venv/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
    op_def=op_def)
  File "~/venv/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3392, in create_op
    op_def=op_def)
  File "~/venv/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1718, in __init__
    self._traceback = self._graph._extract_stack()  # pylint: disable=protected-access

DataLossError (see above for traceback): Checksum does not match: stored 3328590665 vs. calculated on the restored bytes 94957053
         [[Node: checkpoint_initializer = RestoreV2[dtypes=[DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"](checkpoint_initializer/prefix, checkpoint_initializer/tensor_names, checkpoint_initializer/shape_and_slices)]]
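
A checksum mismatch on restore can indicate a corrupted or partially written file in the module cache, so one thing to try is deleting the cached copy and letting the library download it again. The sketch below assumes that the cache subdirectory is named after the SHA-1 hex digest of the module handle. That is an assumption about library internals, although the 40-character directory name in the log above is consistent with it. `cached_module_dir` and `purge_cached_module` are hypothetical helpers.

```python
import hashlib
import shutil
from pathlib import Path


def cached_module_dir(handle, cache_dir="/tmp/tfhub_modules"):
    """Assumed cache location: <cache_dir>/<sha1 hex digest of the handle>."""
    digest = hashlib.sha1(handle.encode("utf8")).hexdigest()
    return Path(cache_dir) / digest


def purge_cached_module(handle, cache_dir="/tmp/tfhub_modules"):
    """Delete the cached copy of `handle`; return True if one was removed."""
    module_dir = cached_module_dir(handle, cache_dir)
    if module_dir.is_dir():
        shutil.rmtree(module_dir)
        return True
    return False
```

After purge_cached_module("https://tfhub.dev/google/nnlm-en-dim128/1") returns True, the next hub.Module call for that handle should trigger a fresh download.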

Multi-label retraining

Hi,
Would you be so kind as to add an example of multi-label image classification retraining for the community?
Thanks in advance :)

Serving Hub Module with Django

I'm trying to deploy an application using Django and a text module.
I have a view with the logic that uses the module, but I need to keep the module in memory to avoid reloading it on every request.

When I try to reuse my session, I get:

  File "/home/venv/local/lib/python2.7/site-packages/tensorflow_hub/module.py", line 191, in __call__
    "Module must be applied in the graph it was instantiated for.")
RuntimeError: Module must be applied in the graph it was instantiated for.

Any way to solve this problem?

Retraining locally saved model in Tensorflow hub

I'm using Windows and TensorFlow 1.7.

I'm retraining a pretrained MobileNet model on my own data as follows:

python retrain.py --image_dir training_imgs --tfhub_module https://tfhub.dev/google/imagenet/mobilenet_v2_100_224/feature_vector/1 --bottleneck_dir C:\\tmp\\bottleneck --saved_model_dir model_save --final_tensor_name final_tensor

In the command above, the model is fetched from the URL "https://tfhub.dev/google/imagenet/mobilenet_v2_100_224/feature_vector/1". Is there a way to provide a local model here instead of fetching it from the web?

After retraining, the model is saved into the directory 'model_save', which I pass as an argument to the command.

This step completed successfully, and now I want to retrain the locally saved model from the previous step on some more data. To do that, I pass the locally saved model as the 'tfhub_module' argument:
python retrain.py --image_dir C: ...\\code\\cnn_time_series\\ford\\crop\\test --tfhub_module C: ...Desktop\\code\\saved_mode\\model_save--bottleneck_dir C:\\tmp\\bottleneck --saved_model_dir model_save2
I get the following error:
tensorflow.python.framework.errors_impl.NotFoundError: NewRandomAccessFile failed to Create/Open: C: ....model_save\tfhub_module.pb : The system cannot find the file specified.

Can anyone please suggest what I'm doing wrong?
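
The NotFoundError above shows the loader looking for a file named tfhub_module.pb inside the directory given to --tfhub_module. A directory written through --saved_model_dir apparently lacks that file, which would explain the failure. The sketch below checks a directory before it is passed to the script. `looks_like_hub_module` and `describe_model_dir` are hypothetical helpers, and the classification is a heuristic based only on marker file names.

```python
from pathlib import Path


def looks_like_hub_module(path):
    """True if `path` contains the tfhub_module.pb marker file."""
    return (Path(path) / "tfhub_module.pb").is_file()


def describe_model_dir(path):
    """Classify a directory as 'hub module', 'saved model', 'unknown' or 'missing'."""
    root = Path(path)
    if not root.is_dir():
        return "missing"
    if looks_like_hub_module(root):
        return "hub module"
    if (root / "saved_model.pb").is_file():
        return "saved model"
    return "unknown"
```

If describe_model_dir reports "saved model" for the retrained output, that directory is an export for serving rather than a reusable hub module, so it cannot be used as the --tfhub_module argument as-is.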

Learning Rate fine-tuning

Hello,

I would like to experiment with TF Hub to retrain an image classifier.
The retrain example is a good starting point for that purpose.
In the example, you can set a learning rate for the final layer.
Using hub.Module(..., trainable=True), you can also let the pre-trained weights be updated.
My question is: which learning rate is applied in that case (is it inherited from the one specified for the final layer?), and is it possible to change it so that it differs from the final layer's?

Thanks in advance!

How to use TF Hub on a distributed setting?

I want to use the ResNet-101-v2 feature vectors to do some transfer learning. I am training with the Estimator API on GCP, and I call the hub Module at the beginning of the model_fn.

module_url = "https://tfhub.dev/google/imagenet/resnet_v2_101/feature_vector/1"
module = hub.Module(module_url)
height, width = hub.get_expected_image_size(module)
images = tf.image.resize_images(input_tensor, [height, width])
feature_vectors = module(images)

When I run in a single node ("basic-gpu") all is well, however, when I run the same code in distributed mode ("standard-1") I get this error:

The replica master 0 exited with a non-zero status of 1. Termination reason: Error.
Traceback (most recent call last):
  [...]
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/checkpoint_utils.py", line 337, in _set_variable_or_list_initializer
    _set_checkpoint_initializer(variable_or_list, ckpt_file, tensor_name, "")
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/checkpoint_utils.py", line 299, in _set_checkpoint_initializer
    ckpt_file, [tensor_name], [slice_spec], [base_type], name=name)[0]
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/gen_io_ops.py", line 1458, in restore_v2
    shape_and_slices=shape_and_slices, dtypes=dtypes, name=name)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
    op_def=op_def)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/ops.py", line 3290, in create_op
    op_def=op_def)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/ops.py", line 1654, in __init__
    self._traceback = self._graph._extract_stack() # pylint: disable=protected-access
InvalidArgumentError (see above for traceback): Unsuccessful TensorSliceReader constructor: Failed to get matching files on /tmp/tfhub_modules/e0c607f95a3d67bc8928a5c20d09d1915322cfcb/variables/variables: Not found: /tmp/tfhub_modules/e0c607f95a3d67bc8928a5c20d09d1915322cfcb/variables; No such file or directory
  [[Node: checkpoint_initializer_537 = RestoreV2[dtypes=[DT_FLOAT], _device="/job:ps/replica:0/task:1/device:CPU:0"](checkpoint_initializer_537/prefix, checkpoint_initializer_537/tensor_names, checkpoint_initializer_537/shape_and_slices)]]
  [[Node: init/NoOp_3_S22 = _Recvclient_terminated=false, recv_device="/job:master/replica:0/task:0/device:CPU:0", send_device="/job:ps/replica:0/task:1/device:CPU:0", send_device_incarnation=-7983147897712139617, tensor_name="edge_3296_init/NoOp_3", tensor_type=DT_FLOAT, _device="/job:master/replica:0/task:0/device:CPU:0"]]
To find out more about why your job exited please check the logs: ....

How should I structure my code for TF Hub to work with the Estimator API for distributed training?
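One detail in the traceback is that the parameter server task looks for the module under /tmp/tfhub_modules, which is a path local to each machine. A minimal sketch, assuming the failure comes from that node-local cache and that pointing the cache at storage every replica can read would help; the directory below is hypothetical, and whether a remote filesystem works as a cache location may depend on the tensorflow_hub version:

```python
import os

# Hypothetical shared location: replace with storage that every worker,
# parameter server and master in the cluster can read.
SHARED_CACHE_DIR = "/mnt/shared/tfhub_modules"


def configure_hub_cache(cache_dir=SHARED_CACHE_DIR):
    """Point tensorflow_hub at a module cache shared by all replicas.

    TFHUB_CACHE_DIR is the environment variable that selects where
    downloaded modules are stored. Set it in every process of the job
    before the first hub.Module(...) call.
    """
    os.environ["TFHUB_CACHE_DIR"] = cache_dir
    return cache_dir


configure_hub_cache()
# Import tensorflow_hub and build the Estimator model_fn after this point.
```

The same effect can be had by exporting the variable in the job's environment instead of setting it from Python.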

AttributeError

I'm trying to follow a TensorFlow tutorial on image classification. I've reached the step where you retrain the model. I'm trying to run the retraining script from a terminal, but I'm getting the following error:

Tutorial: https://www.tensorflow.org/tutorial
Retraining script: https://github.com/tensorflow/hub/blob/master/examples/image_retraining/retrain.py
Terminal: https://github.com/llSourcell/tensorflow_image_classifier/blob/master/src/train.sh

Universal Sentence Encoder not using GPU

I've been trying to figure out why the https://tfhub.dev/google/universal-sentence-encoder/1 module fills up the GPU RAM without actually using the GPU for compute.

On a local machine with CUDA 9.0 and Ubuntu 16.04, it does not use any GPU compute when I run:

import tensorflow as tf
import tensorflow_hub as hub

model_name_dan = 'https://tfhub.dev/google/universal-sentence-encoder/1'
embed = hub.Module(model_name_dan)

with tf.Session() as session:
    session.run([tf.global_variables_initializer(), tf.tables_initializer()])

On nvidia-smi:

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 387.26                 Driver Version: 387.26                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 1080    On   | 00000000:02:00.0 Off |                  N/A |
| 27%   37C    P2    37W / 180W |   7754MiB /  8114MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  GeForce GTX 1080    On   | 00000000:03:00.0 Off |                  N/A |
| 27%   34C    P2    37W / 180W |   7722MiB /  8114MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   2  GeForce GTX 1080    On   | 00000000:81:00.0 Off |                  N/A |
| 27%   34C    P2    40W / 180W |   7722MiB /  8114MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   3  GeForce GTX 1080    On   | 00000000:82:00.0 Off |                  N/A |
| 27%   33C    P2    36W / 180W |   7722MiB /  8114MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

Watching htop, it looks like the CPU is being used instead of the GPU, yet the GPU memory is filled to the max as soon as the session starts.

Is this the expected behavior? Has anyone else had the same experience?

Clarify module licensing

I can't find any information about the license for the models. Is it perhaps Apache 2.0?

Set part of the model retrainable.

Hi,
I'm looking at the ELMo model on TensorFlow Hub (https://www.tensorflow.org/hub/modules/google/elmo/1),
using the example code:

elmo = hub.Module("https://tfhub.dev/google/elmo/1", trainable=True)
tokens_input = [["the", "cat", "is", "on", "the", "mat"],
                ["dogs", "are", "in", "the", "fog", ""]]
tokens_length = [6, 5]
embeddings = elmo(
    inputs={
        "tokens": tokens_input,
        "sequence_len": tokens_length
    },
    signature="tokens",
    as_dict=True)["elmo"]

The documentation mentions that trainable=True is set when creating the module so that the 4 scalar weights (as described in the paper) can be trained. In this setting, the module still keeps all other parameters fixed.

However, when I print the set of trainable parameters from tf.trainable_variables(), I see the full list of parameters (i.e. 75M parameters). I would expect only the 4 scalar weights to be trainable. Could you please explain this?

"Illegal instruction" when importing

When importing tensorflow_hub, Python crashes with "Illegal instruction". No stack trace is given.
I installed both tensorflow-gpu and tensorflow-hub via pip.

python version: 3.5.2
tensorflow version: tensorflow-gpu 1.7.0
tensorflow_hub version: 0.1.0
cuda version: 9.0
cudnn version: 7.0.5
OS: Ubuntu 16.04 "xenial" x86_64
CPU: Intel i5 750

Cannot download model

Hello,

I'm not able to download any pre-trained model, neither from https://github.com/tensorflow/hub/blob/master/docs/modules/image.md nor using the retrain.py script.
Is there some network issue?

Image retrain - classification module gets better accuracy than feature_vector

I was retraining MobileNet_v1_1.0_224 for image classification using the retrain script in this repo, using the same dataset (LSVRC 2012). I noticed that the final test accuracy of the feature_vector module was around 55%, while the classification module reached 70%.

I used the same parameters for both, as below:

--learning_rate=0.005
--testing_percentage=5
--validation_percentage=20
--flip_left_right True
--random_scale=30
--random_brightness=30
--how_many_training_steps=10000

I wonder why there is such a huge difference between the two modules. From the related documents I assumed we should use the feature_vector one, and the examples (in comments) in the retrain script also use the feature_vector module by default.

Can someone explain the reason for this difference? Is it a problem with the bottlenecks, since I distorted the images?

Image normalization in retrain.py script with Mobilenet_v1_1.0_224 module

Can anyone please help me understand how the retrain.py script normalizes the image input data in TensorFlow Hub?

In the previous version of retrain.py (before TensorFlow Hub), it subtracted 127.5 from each pixel and then divided by 127.5. For a pixel in the range [0, 255], this normalizes the input to the range [-1, 1], which seems to be correct.

In the current retrain.py script, the image is resized to 224x224 but the pixels are kept in the range [0, 255]. I know we have a module_apply_default/hub_input node in the graph to preprocess the input; however, it seems that it only multiplies values by 2 and subtracts 1.

I bet that we also divide the inputs by 255.0 somewhere. If so, the input is in [0, 1] before being fed into the module, and multiplying by 2 and subtracting 1 then gives the range [-1, 1].

My question is: I didn't see where we divide the inputs by 255.0, so can anyone please clarify this?
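The two preprocessing schemes described above can be compared numerically. A small sketch in plain Python (the function names are made up for illustration and are not part of retrain.py): if the input is first scaled to [0, 1], multiplying by 2 and subtracting 1 gives the same [-1, 1] range as the older subtract-and-divide-by-127.5 step.

```python
def normalize_old(pixel):
    """Older scheme: (pixel - 127.5) / 127.5 maps [0, 255] to [-1, 1]."""
    return (pixel - 127.5) / 127.5


def normalize_via_unit_range(pixel):
    """Divide by 255 to reach [0, 1], then multiply by 2 and subtract 1."""
    unit = pixel / 255.0
    return unit * 2.0 - 1.0


for p in (0, 51, 255):
    a = normalize_old(p)
    b = normalize_via_unit_range(p)
    # Both schemes agree up to floating-point rounding.
    assert abs(a - b) < 1e-9
    print(p, round(a, 4), round(b, 4))
```

Without the division by 255, the multiply-by-2-and-subtract-1 step alone would map 255 to 509, which is why the missing division matters.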

Thanks!

`hub.Module()` fails in python3 with tensorflow>=1.8.0rc0

This is due to "prepend_name_scope" failing silently when given a "bytes" instead of "string".

It fails with an error like:

WARNING:tensorflow:cannot use a string pattern on a bytes-like object
KeyError: "The name 'global_step:0' refers to a Tensor which does not exist. The operation, 'global_step', does not exist in the graph."
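The warning text matches the message Python 3 raises when a str regular-expression pattern is applied to a bytes value. A minimal reproduction in plain Python, independent of TensorFlow (the regex and the helper name are illustrative and are not the library's actual code):

```python
import re


def has_scope_prefix(name):
    """Return True if `name` looks like 'scope/op'. Accepts str or bytes."""
    if isinstance(name, bytes):
        # Decode first: a str pattern cannot be matched against bytes.
        name = name.decode("utf-8")
    return re.match(r"^[^/]+/", name) is not None


try:
    re.match(r"^[^/]+/", b"module/global_step")
except TypeError as err:
    print(err)  # cannot use a string pattern on a bytes-like object

print(has_scope_prefix(b"module/global_step"))  # True
print(has_scope_prefix("global_step"))          # False
```

If such a TypeError is caught and only logged as a warning, the name is left unprefixed, which would be consistent with the later KeyError about a tensor that does not exist.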

Tensorflow Hub: Support multi-GPU training in Keras or Estimator

In my project I use TF-Hub with estimators. However, when I try to use multiple GPUs (single machine) via tf.contrib.estimator.replicate_model_fn, I get the following error:

variable_scope was unused but the corresponding ". "name_scope was already taken.

Probably it is from this source line : link

Any help is much appreciated - received with thanks.

CC: @arnoegw

Module download freezes if TFHUB_CACHE_DIR is a GS Bucket

Hi! As discussed in #48, I tried to set TFHUB_CACHE_DIR to a GCP bucket. While there is no error, the code freezes here while downloading the module. For example, if you have this setup:

bash

export TFHUB_CACHE_DIR="gs://<SOME_BUCKET>/tfhub_cache_dir"
python main.py

main.py

import tensorflow_hub as hub
#resnet_v2_101 for example
hub.Module("https://tfhub.dev/google/imagenet/resnet_v2_101/feature_vector/1")

the code gets stuck after this log line:

Using gs://<SOME_BUCKET>/tfhub_cache_dir to cache modules.

colab notebooks are not working

Error when running retrain

I get the following output when I run the command:
python retrain.py --image_dir hub-master\flower_photos

output:

Traceback (most recent call last):
  File "retrain.py", line 1341, in <module>
    tf.app.run(main=main, argv=[sys.argv[0]] + unparsed)
  File "C:\Users\Johar\AppData\Roaming\Python\Python35\site-packages\tensorflow\python\platform\app.py", line 126, in run
    _sys.exit(main(argv))
  File "retrain.py", line 1025, in main
    module_spec = hub.load_module_spec(FLAGS.tfhub_module)
  File "C:\Users\Johar\Anaconda2\envs\TF\lib\site-packages\tensorflow_hub\native_module.py", line 103, in load_module_spec
    module_def_proto.ParseFromString(f.read())
  File "C:\Users\Johar\AppData\Roaming\Python\Python35\site-packages\tensorflow\python\lib\io\file_io.py", line 120, in read
    self._preread_check()
  File "C:\Users\Johar\AppData\Roaming\Python\Python35\site-packages\tensorflow\python\lib\io\file_io.py", line 80, in _preread_check
    compat.as_bytes(self.__name), 1024 * 512, status)
  File "C:\Users\Johar\AppData\Roaming\Python\Python35\site-packages\tensorflow\python\framework\errors_impl.py", line 519, in __exit__
    c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.NotFoundError: NewRandomAccessFile failed to Create/Open: C:\Users\Johar\AppData\Local\Temp\tfhub_modules\11d9faf945d073033780fd924b2b09ff42155763\tfhub_module.pb : The system cannot find the file specified.
; No such file or directory

I'm running Windows 10, Python 3.5.5, TF 1.8.0.

About an hour ago it was working fine; then I tried using profiling and it stopped working.
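The path in the error points at a module cache directory that lacks tfhub_module.pb, which can happen when a download is interrupted. A hedged sketch for locating and checking that directory: it assumes the directory name is the SHA-1 hex digest of the module handle and that the default cache root is a tfhub_modules folder under the system temp directory. Both are assumptions about the library's internals, and the helper names are hypothetical.

```python
import hashlib
import os
import tempfile


def module_cache_dir(handle, cache_root=None):
    """Guess where a module handle is cached on disk (assumed layout)."""
    if cache_root is None:
        cache_root = os.environ.get("TFHUB_CACHE_DIR") or os.path.join(
            tempfile.gettempdir(), "tfhub_modules")
    # Assumption: the folder name is the SHA-1 hex digest of the handle.
    digest = hashlib.sha1(handle.encode("utf8")).hexdigest()
    return os.path.join(cache_root, digest)


def is_cache_complete(module_dir):
    """A usable cache entry should at least contain tfhub_module.pb."""
    return os.path.isfile(os.path.join(module_dir, "tfhub_module.pb"))


handle = "https://tfhub.dev/google/imagenet/resnet_v2_101/feature_vector/1"
path = module_cache_dir(handle)
print(path, is_cache_complete(path))
```

If the directory exists but the check fails, deleting it so that the module is downloaded again is one thing to try.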

ImportError: cannot import name 'module_def_pb2'

Stacktrace is below. Honestly, this was working last week and I didn't update anything. Any ideas?

I've looked for the referenced import in the TF Hub project, but I can't actually find it, otherwise I would have tried to fix it and submit a pull request.

---------------------------------------------------------------------------
ImportError                               Traceback (most recent call last)
in <module>()
      1 import tensorflow as tf
----> 2 import tensorflow_hub as hub
      3 import matplotlib.pyplot as plt
      4 import numpy as np
      5 import os

~\PycharmProjects\hub-master\tensorflow_hub\__init__.py in <module>()
     59 from tensorflow_hub.estimator import LatestModuleExporter
     60 from tensorflow_hub.estimator import register_module_for_export
---> 61 from tensorflow_hub.feature_column import image_embedding_column
     62 from tensorflow_hub.feature_column import text_embedding_column
     63 from tensorflow_hub.image_util import get_expected_image_size

~\PycharmProjects\hub-master\tensorflow_hub\feature_column.py in <module>()
     23 import tensorflow as tf
     24 from tensorflow_hub import image_util
---> 25 from tensorflow_hub import module
     26
     27 # TODO(b/73987364): It is not possible to extend feature columns without

~\PycharmProjects\hub-master\tensorflow_hub\module.py in <module>()
     21 import tensorflow as tf
     22 from tensorflow_hub import module_spec
---> 23 from tensorflow_hub import native_module
     24 from tensorflow_hub import tensor_info
     25

~\PycharmProjects\hub-master\tensorflow_hub\native_module.py in <module>()
     25
     26 from tensorflow_hub import compressed_module_resolver
---> 27 from tensorflow_hub import module_def_pb2
     28 from tensorflow_hub import module_impl
     29 from tensorflow_hub import module_spec

ImportError: cannot import name 'module_def_pb2'

Is there a module for Chinese text classifier ?

Hello everyone,
I am working on a Chinese text classifier model. Since TF-Hub is similar to the Caffe model zoo, I am wondering whether there is a Chinese text classifier model for TensorFlow. fastText, for example, provides its Chinese Word2Vec results as a directory.
Hoping for a response.

Log/Print download progress for tf hub modules

I'm just beginning to use tensorflow_hub modules and am slightly annoyed that download progress is not displayed. For the larger modules, I keep checking my Activity Monitor to see whether the download is actually progressing.

I would love to see an implementation that displays the download progress of modules, whether via logging or simply by printing to the console. Could we see this feature anytime soon?

How to feed tf.hub output to LSTM

Hello there,

The tf.hub output has shape [?, 512], but an LSTM needs [Batch_Size, Time_Frame, 512]. How do we feed the tf.hub output to an LSTM?
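One way to read the shape mismatch: each row of the [?, 512] output is one embedding, so a sequence input can be built by grouping consecutive rows into time steps. A plain-Python sketch of that grouping, assuming every sequence has the same number of steps (the helper name is hypothetical; in TensorFlow the equivalent would be a tf.reshape to [-1, time_steps, 512]):

```python
def group_into_sequences(embeddings, time_steps):
    """Group a flat list of vectors into [batch, time_steps, dim] order."""
    if time_steps <= 0:
        raise ValueError("time_steps must be positive")
    if len(embeddings) % time_steps != 0:
        raise ValueError("number of embeddings must be a multiple of time_steps")
    return [embeddings[i:i + time_steps]
            for i in range(0, len(embeddings), time_steps)]


# 6 embeddings of width 512 become 2 sequences of 3 time steps each.
flat = [[float(i)] * 512 for i in range(6)]
batch = group_into_sequences(flat, time_steps=3)
print(len(batch), len(batch[0]), len(batch[0][0]))  # 2 3 512
```

Sequences of different lengths would need padding to a common number of steps before this grouping applies.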