
BigML Sense/Net

Sense/Net is a BigML interface to TensorFlow. It takes a network specification as a dictionary (read from BigML's JSON model format) and instantiates a TensorFlow compute graph based on that specification.

Entry Points

In general, the library takes a BigML model specification as a JSON document, plus an optional map of settings, and returns a lightweight wrapper around a tf.keras.Model built from those arguments. The wrapper creation function is sensenet.models.wrappers.create_model.

Pretrained Networks

Often, BigML-trained deepnets will use networks pretrained on ImageNet, either as a starting point for fine-tuning or as the base layers under a custom set of readout layers. The weights for these networks are stored in a public S3 bucket and downloaded as needed for training or inference (see the sensenet.pretrained module). If the pretrained weights are never needed, no downloading occurs.

By default, these are downloaded to and read from the directory ~/.bigml_sensenet (which is created if it is not present). To change the location of this directory, clients can set the environment variable BIGML_SENSENET_CACHE_PATH.
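For example, the cache location can be redirected from Python (the path below is illustrative) before any pretrained weights are requested:

import os

# Set this before sensenet needs to download or read any weights, so the
# cache is read from and written to this directory instead of ~/.bigml_sensenet.
os.environ["BIGML_SENSENET_CACHE_PATH"] = "/data/sensenet_cache"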

Model Instantiation

To instantiate a model, pass the model specification and an optional dict of additional settings to models.wrappers.create_model. For example:

from sensenet.models import wrappers

model = wrappers.create_model(a_dict, settings={'image_path_prefix': 'images/path/'})

Again, a_dict is typically a downloaded BigML model, read into a Python dictionary via json.load or similar. You may also pass the path to a file containing such a model:

model = wrappers.create_model('model.json', settings=None)

A similar function, models.wrappers.create_image_feature_extractor, allows clients to create a model object that instead returns the outputs of the final global pooling or flattening layer of the image model, given an image as input:

from sensenet.models.wrappers import create_image_feature_extractor

extractor = create_image_feature_extractor("resnet18", None)
extractor("path/to/image.jpeg").shape  # (1, 512)

Note that this only works for networks with at least one image input, and does not work for bounding box models, as there is no global pooling or flattening step in those models.

For both create_image_feature_extractor and create_model, settings can either be None (the default) or a dict of optional settings which may contain any of the settings arguments listed below.

Settings Arguments

These arguments can be passed to models.wrappers.create_image_feature_extractor or models.wrappers.create_model to change the input or output behavior of the model. Note that the settings specific to bounding box models are ignored if the model is not of the bounding box type.

  • bounding_box_threshold: For object detection models only, the minimal score that an object can have and still be surfaced to the user as part of the output. The default is 0.5; lowering the threshold has the effect of surfacing more (possibly spurious) boxes in each input image.

  • color_space: A string which is one of ['rgb', 'rgba', 'bgr', 'bgra']. The first three letters give the order of the color channels (red, green, and blue) in the input tensors that will be passed to the model, and the presence or absence of a final 'a' indicates whether an alpha channel will be present (the alpha channel, if present, is ignored). This can be useful to match the color space of the created model to that provided by another library, such as OpenCV. Note that TensorFlow uses RGB ordering by default, and all files read by TensorFlow are automatically decoded as RGB. This argument is generally only necessary if input_image_format is 'pixel_values', and may break predictions if specified when the input is a file.

  • iou_threshold: A threshold on the intersection-over-union overlap that boxes predicting the same class may have before they are considered to be bounding the same object. The default is 0.5; lower values have the effect of eliminating boxes which would otherwise have been surfaced to the user.

  • max_objects: The maximum number of bounding boxes to return for each image in bounding box models. The default is 32.

  • rescale_type: A string which is one of ['warp', 'pad', 'crop']. If 'warp', input images are scaled to the input dimensions specified in the network, and their aspect ratios are not preserved. If 'pad', the image is resized (preserving its aspect ratio) to the largest size that fits within the input dimensions of the network, then padded with constant pixels below or to the right to create an appropriately sized image. For example, if the input dimensions of the network are 100 x 100 and we attempt to classify a 300 x 600 image, the image is first rescaled to 50 x 100 and then padded on the right to create a 100 x 100 image. If 'crop', the image is resized (again preserving its aspect ratio) to the smallest size that entirely covers the input dimensions of the network, then centrally cropped to the specified size. Using the sizes from the previous example, the image would be rescaled to 100 x 200 and then cropped by 50 pixels on the top and bottom to create a 100 x 100 image.

While these are not the only possible settings, they are the ones most likely to be useful to clients; other settings are typically only useful for very specific client applications.
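As an illustration, several of the options above can be combined in a single settings map and passed at creation time (the file name and values here are arbitrary):

settings = {
    'bounding_box_threshold': 0.4,
    'iou_threshold': 0.5,
    'max_objects': 16,
    'rescale_type': 'pad',
}

# Bounding-box settings are simply ignored if the model is not
# an object detection model.
model = wrappers.create_model('detection_model.json', settings=settings)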

Model Formats and Conversion

The canonical format for sensenet models is the JSON format downloadable from BigML. However, as the JSON is fairly heavyweight, time-consuming to parse, and not consumable from certain locations, Sense/Net offers a conversion utility, sensenet.models.wrappers.convert, which takes the JSON format as input and can output the following formats:

  • tflite exports the model in the TensorFlow Lite format, which allows lightweight prediction on mobile devices.

  • tfjs exports the model to the format read by TensorFlow.js, to do predictions in the browser or server-side in Node.js. The library needed to do this export, tensorflowjs, is not available on all architectures, so this feature may not always work.

  • smbundle exports the model to a (proprietary) lightweight wrapper around the TensorFlow SavedModel format. The generated file is a concatenation of the files in the SavedModel directory, with some additional information written to the assets sub-directory. If this file is passed to create_model, the bundle is extracted to a temporary directory, the model instantiated, and the temporary files deleted. To extract the bundle without instantiating the model, see the functions in sensenet.models.bundle.

  • h5 exports the model weights only, in the Keras h5 format (i.e., via the TensorFlow function tf.keras.Model.save_weights). To use these weights, you'd instantiate the model from JSON and load them separately using the corresponding TensorFlow load_weights function.
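As a sketch, a conversion call might look like the following; the argument order shown for sensenet.models.wrappers.convert is an assumption, so check the function itself before relying on it:

from sensenet.models.wrappers import convert

# Assumed argument order: input model, settings, output path, output format.
convert('model.json', None, 'model.smbundle', 'smbundle')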

Usage

Once instantiated, you can make predictions by calling the returned model as a function, like so:

prediction = model([1.0, 2.0, 3.0])

The input point or points must be a list (or nested list) containing the input data for each point, in the order implied by model._preprocessors. Categorical and image variables should be passed as strings, where the image is either a path to the image on disk, or the raw compressed image bytes.

For classification or regression models, the function returns a numpy array where each row is the model's prediction for each input point. For classification models, there will be a probability for each class in each row. For regression models, each row will contain only a single entry.
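For instance, here is a minimal sketch of a single-row prediction; the column names, values, and order below are illustrative, and the real order must match model._preprocessors:

# One input row: a numeric field, a categorical field, and an image field.
rows = [[42.0, 'some_category', 'path/to/image.png']]
prediction = model(rows)
# Classification: prediction has shape (1, number_of_classes);
# regression: shape (1, 1).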

For object detection models, the input should always be a single image (again, either a file path, a compressed byte string, or an array of pixel values, depending on the settings map), and the result will be a list of detected boxes, each one represented as a dictionary. For example:

In [5]: model('pizza_people.jpg')
Out[5]:
[{'box': [16, 317, 283, 414], 'label': 'pizza', 'score': 0.9726969599723816},
 {'box': [323, 274, 414, 332], 'label': 'pizza', 'score': 0.7364346981048584},
 {'box': [158, 29, 400, 327], 'label': 'person', 'score': 0.6204285025596619},
 {'box': [15, 34, 283, 336], 'label': 'person', 'score': 0.5346986055374146},
 {'box': [311, 23, 416, 255], 'label': 'person', 'score': 0.41961848735809326}]

The box array contains the coordinates of the detected box, as x1, y1, x2, y2, where those coordinates represent the upper-left and lower-right corners of each bounding box, in a coordinate system with (0, 0) at the upper-left of the input image. The score is the rough probability that the object has been correctly identified, and the label is the detected class of the object.
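Because each detection is a plain dictionary, post-processing is straightforward; for example, to print only the boxes above a chosen score (the threshold here is illustrative):

detections = model('pizza_people.jpg')

for detection in detections:
    if detection['score'] >= 0.6:
        x1, y1, x2, y2 = detection['box']
        print(f"{detection['label']}: ({x1}, {y1}) to ({x2}, {y2})")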


Issues

Tensorflow and tensorflow-gpu dependencies

Although tensorflow and tensorflow-gpu are the same package, pip has no way to know that. We require tensorflow, so if the user has installed tensorflow-gpu (for instance, the GPU TensorFlow Dockerfiles come with tensorflow-gpu preinstalled), pip will try to install the tensorflow package too.

This doesn't seem to have an easy solution, but at least we could consider some of the solutions
proposed here: tensorflow/tensorflow#7166

Tensorflow 2.9 Upgrade

Starting with tensorflow 2.9.x, they've started setting the compiler flag _GLIBCXX_USE_CXX11_ABI by default, which is causing linker errors on the CI builds on GitHub, though things work for me locally on a Mac and on the wintermute linux build server. Specifically, on the GitHub CI builds, everything appears to build fine, but when we run pytest -sv tests/test_tree.py we get the following error:

  ==================================== ERRORS ====================================
  _____________________ ERROR collecting tests/test_tree.py ______________________
  tests/test_tree.py:9: in <module>
      import sensenet.importers
  sensenet/importers.py:42: in <module>
      bigml_tf_module = tensorflow.load_op_library(treelib[0])
  /tmp/tmp.8RK8W1x4vs/venv/lib/python3.8/site-packages/tensorflow/python/framework/load_library.py:54: in load_op_library
      lib_handle = py_tf.TF_LoadLibrary(library_filename)
  E   tensorflow.python.framework.errors_impl.NotFoundError: /tmp/tmp.8RK8W1x4vs/venv/lib/python3.8/site-packages/bigml_tf_tree.cpython-38-x86_64-linux-gnu.so: undefined symbol: _ZNK10tensorflow8OpKernel11TraceStringERKNS_15OpKernelContextEb

The problem here is that the custom tensorflow extension that deals with the internal trees sometimes generated by deepnets has been built with "old ABI" compatibility, whereas TF 2.9.x uses the "new ABI" (see https://gcc.gnu.org/onlinedocs/libstdc++/manual/using_dual_abi.html).

It's odd, because I verified that the correct flag gets passed to the compile step here (when using TF 2.9.x):

https://github.com/charleslparker/sensenet/blob/master/setup.py#L54

and purposely overriding it (by replacing the =1 with =0 for the flag in compile_args) causes the same test to break with other linker errors on my local machine and the linux server. So something strange is going on with the compile step on GitHub specifically. Maybe the Docker images used by cibuildwheel on GitHub have an old version of libstdc++?

The exact linker error we get is documented here:
https://pgaleone.eu/tensorflow/bazel/abi/c++/2021/04/01/tensorflow-custom-ops-bazel-abi-compatibility/

where they say you have to rebuild tensorflow to fix it. I refuse to believe this!

Popping up the stack a bit: this op is only used when deepnets generate these internal trees (e.g., when "tree embedding" is set to true when you train a deepnet, or when you use "Automatic structure search"). This extension has been such a pain so many times that maybe we should remove it.

Prediction Difference Between Sensenet and bigml.com

Sensenet produces predictions for image models that can differ from what you'd get remotely from bigml.com. This is due to a variety of differences between the production and local settings, enumerated in rough order of importance below.

1.) The BigML production environment resizes the image to a maximum of 512 x 512 before storing it, using bicubic interpolation. If clients do not do this resizing, or do it using something other than bicubic interpolation, the image will be different.

2.) JPEG compression is applied (quality 90%) to the source when it is stored. When used to make a prediction, the source is decompressed. Because JPEG compression is lossy, the values are bound to be different.

3.) The JPEG standard is underspecified, so the same image decompressed by two different software packages, or even two different versions of libjpeg, might have small differences (https://photo.stackexchange.com/questions/83745/is-a-jpg-guaranteed-to-produce-the-same-pixels#:~:text=https%3A//photo.stackexchange.com/a/83892). The version used by TensorFlow, for example, does not by default match the output of the version used by Java, and requires special options to be set (https://stackoverflow.com/questions/44514897/difference-between-the-return-values-of-pil-image-open-and-tf-image-decode-jpeg/60079341#60079341). Pillow's output is also different. So even apart from the rescaling/recompression issues, input JPEGs are unlikely to decompress to exactly the same pixel values; I've done tests, and the difference is enough to shift the results of a classification model by 1%.

4.) TensorFlow running on different hardware can give different results (tensorflow/tensorflow#19200 (comment)). This is not just a CPU vs. GPU problem, but can also occur with different builds of TensorFlow. The central problem is that there are so many operations in a deep neural network that even errors in the least significant bit accrue to something significant, especially because we're only using 32 bits for the math. Our test suites have examples where the same test does not give the same output on the same TF version on Mac and Linux.

Whether or not these things merit "fixing" is beyond the scope of the issue, but clearly at least some of them could be mitigated with additional compute time if desired.

Missing dependency importlib_resources

As importlib.resources is only available in Python 3.7+, you are doing this in pretrained.py:

try:
    import importlib.resources as pkg_resources
except ImportError:
    import importlib_resources as pkg_resources

However, setup.py does not include the importlib_resources dependency, which is not part of Python's standard library.
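One possible fix (a sketch, not the project's actual setup.py) would be to declare the backport with an environment marker, so that it is only installed on Pythons that lack importlib.resources:

# Illustrative setup.py fragment; the real file has more metadata.
from setuptools import setup

setup(
    name='bigml-sensenet',
    install_requires=['importlib_resources; python_version < "3.7"'],
)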

Error during installation

This line is giving me an error during sensenet installation; it's the first time I've seen a from without an import. Is it correct?

from sensenet.settings

File "/Users/unmonoqueteclea/.virtualenvs/slpr/lib/python3.8/site-packages/bigml_sensenet-0.2.4-py3.8-macosx-10.15-x86_64.egg/sensenet/image.py", line 3 from sensenet.settings ^ SyntaxError: invalid syntax
