We evaluate the baseline performance of various text recognition models in MMOCR using test images from the IIIT 5k-word dataset.

Introduction

Optical Character Recognition (OCR) is the task of detecting and recognizing text in images, faxes, and scanned paper documents and converting it to a digital format for further processing. OCR has a wide range of use cases across industries such as banking, insurance, legal, and retail. Whatever the use case, it is important to identify the right pre-trained OCR model for the solution. Finding and evaluating a model can take weeks or even months; with Tiyaro, that effort is reduced to a matter of minutes.

To get started, search for 'mmocr' models in Tiyaro Explore, or click Try on Tiyaro from the MMOCR GitHub project.

Accelerating model evaluation with Tiyaro Experiments

Tiyaro Experiments gives us a quicker way to compare different pre-trained machine learning models, along with SaaS vendor APIs, which accelerates the evaluation process. To be compared, all models must be of the same model type, in this case optical-character-recognition. For an example, see the public MMOCR IIIT 5k dataset experiment.

Once we have finished searching for models and trying their demos, we can create an experiment. There are several ways to do this; here we simply go to the experiments tab and start a new experiment.

Next, we select the experiment's model type, which in our case is Optical Character Recognition.

We then select the models to run the experiment on. The model selection screen provides various filters.

After selecting the MMOCR models, you can upload your custom dataset. Tiyaro accepts OCR datasets for experiments in several formats; the easiest is a zipped CSV file with a single column containing the inputs. The sample IIIT 5k-word dataset used in this experiment can be downloaded from the Data section of the experiment config.

After the experiment runs, the results tab appears. It shows the latency of each model along with the experiment results; for our use case, it shows a table of our inputs and the respective model predictions. The results can also be downloaded as a zip file.

The results table has the input image in the first column and the OCR output of each model in the subsequent columns.
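As a rough illustration of the dataset-preparation step described earlier (a zipped CSV with a single input column), here is a minimal Python sketch using only the standard library. The column name `input`, the file names, the presence of a header row, and the example image URLs are all assumptions made for illustration; check the Tiyaro docs for the exact format the upload expects.

```python
import csv
import tempfile
import zipfile
from pathlib import Path


def build_dataset_zip(inputs, out_dir, column="input"):
    """Write a one-column CSV of inputs and zip it.

    `column` is a hypothetical header name, not a confirmed Tiyaro requirement.
    Returns the path of the zip file that was created.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    csv_path = out_dir / "dataset.csv"
    zip_path = out_dir / "dataset.zip"

    # One input per row, under a single header.
    with csv_path.open("w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow([column])
        for item in inputs:
            writer.writerow([item])

    # Store the CSV at the top level of the archive.
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.write(csv_path, arcname=csv_path.name)
    return zip_path


if __name__ == "__main__":
    # Placeholder inputs; replace with references to your own test images.
    images = [
        "https://example.com/images/word_001.png",
        "https://example.com/images/word_002.png",
    ]
    with tempfile.TemporaryDirectory() as tmp:
        path = build_dataset_zip(images, tmp)
        with zipfile.ZipFile(path) as zf:
            print(zf.namelist())
```

The resulting `dataset.zip` is the kind of file that would be uploaded in the dataset step; whether each row should hold a URL, a file name, or something else depends on the upload format you choose.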
As the results show, model output varies with each model's training data and with the test dataset, so we have to evaluate the results and choose the model that best suits the use case.
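One way to make that comparison concrete is to score each model's predictions against ground-truth labels. The sketch below computes case-insensitive word accuracy, a metric commonly used for word-level recognition, and ranks the models by it. It assumes you already have the ground-truth words for your test images and have loaded each model's predictions from the downloaded results into plain Python lists; the model names `model_a` and `model_b` and the sample words are made up for illustration and are not results from the experiment.

```python
def normalize(text, ignore_case=True):
    """Trim surrounding whitespace and optionally fold case before comparing."""
    text = text.strip()
    return text.casefold() if ignore_case else text


def word_accuracy(predictions, labels, ignore_case=True):
    """Fraction of predictions that exactly match their label after normalization."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        return 0.0
    correct = sum(
        normalize(p, ignore_case) == normalize(t, ignore_case)
        for p, t in zip(predictions, labels)
    )
    return correct / len(labels)


def rank_models(results, labels, ignore_case=True):
    """Return (model_name, accuracy) pairs sorted from best to worst."""
    scores = [
        (name, word_accuracy(preds, labels, ignore_case))
        for name, preds in results.items()
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Toy data only: hypothetical labels and hypothetical model outputs.
    labels = ["PRIVATE", "PARKING", "SALE"]
    results = {
        "model_a": ["private", "parking", "sale"],
        "model_b": ["private", "parkin", "sale"],
    }
    for name, acc in rank_models(results, labels):
        print(f"{name}: {acc:.3f}")
```

Accuracy is only one input to the decision. The latency shown in the results tab, and how closely your test images resemble real production inputs, matter as well.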

A note on MMOCR models:

From the MMOCR docs, the Text Recognition, Text Detection + Text Recognition, and KIE models are all of model type optical-character-recognition in Tiyaro. You can find these end-to-end OCR models on Tiyaro as well. From the Tiyaro docs, the OCR response signature is:

  • response.text returns the text recognized by the model.
  • response.raw_response returns the complete response from the model. Its structure varies considerably by model provider.

While we try our best to keep Tiyaro models up to date with MMOCR releases, there may be some delay. If you don't find a particular MMOCR model on Tiyaro, or any other model for that matter, simply raise a request in Tiyaro EasyServe.

You can share the experiment with your coworkers and on social media. You can also make a copy of the given experiment to enhance or modify it. Wish to create one? Head on over to Tiyaro!

© 2023 Tiyaro, Inc.