
As AI/ML matures, the community keeps adding tools that make it easier to start using AI. Using the right tool can make all the difference.


TensorFlow Hub is a great collection of models. Any developer looking for ready-to-use TensorFlow models is already familiar with the hub; if not, you are missing out on this gem of a collection. The models are organized by domain and version, making it very easy to find the model you are looking for. But before we get into how Tiyaro simplifies the problem of actually using these models, let's go through the normal workflow of a developer trying to use a model from TensorFlow Hub.

What do you need to run these models?

  1. Find the model and download the saved model
  2. Run the model
  3. Find out the 'Function Signature' of the model so you can use it in your application.

Find the model and download the saved model

Just head over to model search on TensorFlow Hub and search for the model you need.

[Screenshot: model search on TensorFlow Hub]

After you find the model of your choice, you download its saved model. For example, if you are looking for an image classification model such as imagenet/efficientnet, you can download it here.
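If you'd rather skip the manual download, the tensorflow_hub Python library can fetch and load the model for you. Below is a minimal sketch; the exact model URL is illustrative (copy the real one from the model's page on tfhub.dev), and calling the loaded object directly only works for models that export a default callable.

import tensorflow as tf
import tensorflow_hub as hub

# Illustrative URL -- take the real one from the model page on tfhub.dev.
MODEL_URL = "https://tfhub.dev/tensorflow/efficientnet/b0/classification/1"

# hub.load() downloads and caches the SavedModel, then loads it.
model = hub.load(MODEL_URL)

# Many TF Hub image classifiers expect a float32 batch of images;
# a dummy input confirms the model is callable end to end.
dummy = tf.zeros([1, 224, 224, 3], dtype=tf.float32)
logits = model(dummy)
print(logits.shape)  # e.g. (1, 1000) for an ImageNet classifier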

Run the model

Now that you have a saved model downloaded, you have a couple of options for running it:

  1. Run the model locally for test and dev use
  2. Run the model for production use

Here is an excellent writeup on running models locally for test and dev.

TensorFlow Serving is an excellent, robust solution for running models in production. It integrates really well with the SavedModel format, and, like every other TensorFlow feature, you will find tons of tutorials on running TensorFlow Serving.
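Once TensorFlow Serving is up, clients talk to it over HTTP. Here is a hedged sketch of calling its REST predict endpoint from Python: port 8501 is the TensorFlow Serving default REST port, while the model name "efficientnet" and the 224x224 input size are assumptions for illustration.

import numpy as np
import requests

# Default REST port is 8501; the model name is whatever you served it as.
URL = "http://localhost:8501/v1/models/efficientnet:predict"

# A single all-zero 224x224 RGB image stands in for real input.
image = np.zeros((224, 224, 3), dtype=np.float32)

# TF Serving's "row" format: one entry per example under "instances".
payload = {"instances": [image.tolist()]}

resp = requests.post(URL, json=payload)
resp.raise_for_status()
predictions = resp.json()["predictions"]
print(len(predictions[0]))  # e.g. 1000 class scores per example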

Running models on GPU

There are many use cases where the additional performance and latency benefits of running inference on a GPU are not just nice to have, they are a must have. This is a nice tutorial on enabling GPU support for TensorFlow Serving; it covers downloading the CUDA libraries, recompiling TensorFlow Serving with NVIDIA GPU support, and running and testing the result.
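Before paying for a GPU-enabled setup, it is worth a quick sanity check that TensorFlow on the host can actually see the device. This uses only the standard TensorFlow 2.x API, with no assumptions beyond having TensorFlow installed:

import tensorflow as tf

# Lists the GPUs TensorFlow has registered; empty means CPU-only.
gpus = tf.config.list_physical_devices("GPU")
print("GPUs visible to TensorFlow:", gpus)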

Real Issues serving a model

  1. Infrastructure and DevOps. The real issue for developers is not the application that serves the model, whether it is TensorFlow Serving or some other solution. The DevOps work required to serve the model in production and manage the infrastructure is the big burden. Without a dedicated team to handle this, a developer has to spend their precious time and resources running and debugging the serving infrastructure.
  2. Steep learning curve for the REST API support of TensorFlow Serving. TensorFlow Serving supports both gRPC and REST interfaces to the models being served, but the REST API endpoints and the payloads honored by a model are somewhat opaque. You need to decipher a lot of documentation to understand the serving function signatures, and in many cases resort to trial and error to figure out which payloads actually work (see the payload sketch after this list).
  3. Endpoint URLs are messy. Not a huge issue, but even the various 'predict', 'classify', and 'regress' APIs require the user to learn and understand the specifics of TensorFlow Serving semantics and the model signatures.
  4. Request and response documentation is non-standard. While most REST APIs are documented using an open standard such as the OpenAPI (Swagger) spec, there is no standard documentation of the TensorFlow Serving request and response formats.
  5. Adding GPU support is an extra step that incurs additional cost and time.
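To make point 2 concrete, here is a sketch of the two JSON payload shapes that TensorFlow Serving's REST predict API accepts. The tensor name "input_1" is taken from the signature shown later in this post; your model's names will differ.

# "Row" format: a list of examples; the key is always "instances".
# Each example omits the batch dimension. (For single-input models,
# each instance may also be a bare value instead of a dict.)
row_payload = {
    "instances": [
        {"input_1": [[[0.0, 0.0, 0.0]]]},  # one example: a 1x1 RGB image
    ]
}

# "Columnar" format: named inputs batched together; the key is "inputs".
# Here the batch dimension is included.
columnar_payload = {
    "inputs": {
        "input_1": [[[[0.0, 0.0, 0.0]]]],  # batch of one 1x1 RGB image
    }
}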

Find out the 'Function Signature' of the model, so you can use it in your application.

This one is easy for developers to understand. If you need to invoke a model, you need to know what input(s) the model takes, the format of those inputs, and the output the model produces. The TensorFlow toolchain has done a great job of providing some of the basic tooling required for this. For instance, you can use the 'saved_model_cli' to see the default signature supported by the model. E.g., the imagenet efficientnet model has the following signature:

$ saved_model_cli show --dir . --tag_set serve --signature_def serving_default

The given SavedModel SignatureDef contains the following input(s):
   inputs['input_1'] tensor_info:
      dtype: DT_FLOAT
      shape: (-1, -1, -1, 3)
      name: serving_default_input_1:0

The given SavedModel SignatureDef contains the following output(s):
   outputs['output_1'] tensor_info:
      dtype: DT_FLOAT
      shape: (-1, 1000)
      name: StatefulPartitionedCall:0

Method name is: tensorflow/serving/predict

You can use this information in conjunction with the TensorFlow Serving documentation to figure out the inputs required by this model and the output it generates.
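As a worked example, here is a minimal sketch (assuming Python and TensorFlow) that turns the signature above into an actual call: a JPEG is decoded, resized, scaled to float32, and passed to the serving_default signature. The file name and the 224x224 size are assumptions; the signature's shape of (-1, -1, -1, 3) accepts any height and width.

import tensorflow as tf

# Load from the same directory that saved_model_cli was pointed at.
loaded = tf.saved_model.load(".")
infer = loaded.signatures["serving_default"]

raw = tf.io.read_file("cat.jpg")                 # hypothetical input image
img = tf.image.decode_jpeg(raw, channels=3)      # matches the trailing 3 channels
img = tf.image.resize(img, (224, 224)) / 255.0   # float32 in [0, 1]
batch = tf.expand_dims(img, axis=0)              # add the batch dimension

result = infer(input_1=batch)                    # input name from the signature
print(result["output_1"].shape)                  # (1, 1000) per the signature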

Real Issues with function signature

  1. Steep learning curve to understand the spec. As seen above, the documentation of the function signature is very TensorFlow-specific.
  2. A lot of models take 'tensors' as input, while most developers are dealing with inputs like images, audio files, and text. There is a learning curve to mapping those inputs to the tensors expected by these models.
  3. Lastly, not all models implement the metadata needed for saved_model_cli to show you the serving signature.

2 Easy Steps - TensorFlow Hub models to API with Tiyaro

Tiyaro lets you rephrase the question that developers ask. Instead of asking "What do you need to run these models?", developers should be asking:

What do you need to use these models? With Tiyaro, you just need the following 2 steps:

  1. Find the model
  2. Use the model

Let's use the same model (imagenet efficientnet) from above as an example.

Find the model

Simply search for the model in the Tiyaro console.

[Screenshot: model search in the Tiyaro console]

Click on the search result to see the model card.

[Screenshot: the model card]

Use the Model

The model card includes

  • API endpoint for this model

    [Screenshot: the API endpoint on the model card]
  • The OpenAPI spec for the model

    So you know exactly what the inputs and outputs of this model are.

    [Screenshot: the OpenAPI spec on the model card]
  • Sample code to try the model

    [Screenshot: sample code on the model card]
  • GPU support is built-in

    Running a model on a GPU is simply a matter of selecting the FlexGPU service tier in the model card and using the gpuflex URL for that API, as shown below.

    [Screenshot: selecting the FlexGPU service tier] [Screenshot: the gpuflex URL]

That's it! Within minutes, if not seconds, you will go from a TensorFlow Hub model to using it in your app. Give it a shot. Get started!
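For a taste of what the model card's sample code might look like, here is a purely illustrative sketch of calling a Tiyaro endpoint from Python. The URL, auth header, and response shape below are assumptions; the model card's OpenAPI spec and sample code are the source of truth for the real values.

import requests

API_URL = "https://api.tiyaro.ai/v1/ent/tfhub/1/imagenet-efficientnet"  # hypothetical URL
API_KEY = "YOUR_TIYARO_API_KEY"                                         # hypothetical key

# Post an image and let the API handle the tensor conversion.
with open("cat.jpg", "rb") as f:
    resp = requests.post(
        API_URL,
        headers={"Authorization": f"Bearer {API_KEY}"},
        data=f.read(),
    )
resp.raise_for_status()
print(resp.json())  # response shape is defined by the model's OpenAPI spec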

Start Today.
