Edge Optimization - Quantization

Quantization is the process of converting a model's floating-point values to fixed-point integer representations.
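As a concrete illustration (a sketch of the common affine int8 scheme, not Studio's internal code), a real value x is mapped to an integer q through a scale and a zero point:

```python
import numpy as np

def quantize_int8(x, scale, zero_point):
    """Affine quantization: q = round(x / scale) + zero_point, clamped to int8."""
    q = np.round(x / scale) + zero_point
    return np.clip(q, -128, 127).astype(np.int8)

def dequantize_int8(q, scale, zero_point):
    """Recover an approximation of the original float: x ~ (q - zero_point) * scale."""
    return (q.astype(np.float32) - zero_point) * scale

# Derive scale/zero-point from an observed value range [-1.0, 2.0]
x_min, x_max = -1.0, 2.0
scale = (x_max - x_min) / 255.0
zero_point = int(round(-128 - x_min / scale))

x = np.array([-1.0, 0.0, 0.5, 2.0], dtype=np.float32)
q = quantize_int8(x, scale, zero_point)
x_hat = dequantize_int8(q, scale, zero_point)
```

The round trip loses at most half a quantization step per value, which is why the value range used to derive the scale matters.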

IMAGIMOB Studio allows you to quantize your model with a simple click. The quantization in Studio is a hybrid quantization method, meaning that only the neural network is quantized, not the preprocessor. A list of all the supported TensorFlow layers that can be quantized in IMAGIMOB Studio can be found at Supported layers and functions.

If you are interested in delving into the mathematics of quantization, the theory behind it, and its implementation in IMAGIMOB Studio, refer to the quantization white paper.

Building an Edge model with quantization

  1. Double-click the .h5 model file.


  2. Click the Code Gen tab on the left pane and configure the following parameters:

    • In Architecture, select the target architecture for optimization. Select Any (ANSI C99) to compile for any platform.

    • In Output name, enter a name for the generated output.

    • In Output directory, browse to the directory where you want to save the optimized output. By default, the Gen folder is selected.

    • In Timestamps API, enable the checkbox to track the corresponding input time for each output prediction. To learn more, refer to the Edge API documentation.

    • In C Prefix, enter the prefix for the function names in the generated C API.

    • In Build, select Preprocessor and Quantized Network as the build type.

    • Click the Quantization Options button. The Quantization Options window appears.


    • Configure the following parameters:

      Under Representative data set,

      For model quantization, you need training and/or validation data to serve as the representative data. This should be a data set large enough to cover typical input values; it is recommended that you use all of the training data.

      • In Use Project file (.improj), enable the checkbox if you want to use all the data from the Imagimob project file for quantization.

      • In Project file, browse to select the project file.
        OR

      • In Recursive directory search, enable the checkbox if you want to use the data stored in a selected directory.

      • In Root directory, browse to select the folder where the training data is saved.

      • In File filter, enter the correct file filter to fetch the right data files. Enable the CSV file has header row and CSV file has timestamps (first column) checkboxes if the data files are in CSV format and have a header row and timestamps in the first column.

      Under Check Introduced Quantization Error,

      • Enable the Check argmax errors checkbox and set the maximum allowed argmax error as a percentage.

      • Enable the Check max mean error checkbox and set the maximum allowed difference.

      • Enable the Check max absolute error checkbox and set the maximum allowed difference.

  3. Click OK.

  4. Click Generate Code.
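The representative data set matters because the quantization ranges are calibrated from it. Studio performs this calibration internally; the effect of an unrepresentative set can be sketched with a minimal min/max calibration in NumPy (illustrative only, not Studio's actual algorithm):

```python
import numpy as np

def int8_params(calibration_data):
    """Derive an int8 scale and zero point from the observed min/max
    of a calibration (representative) data set."""
    lo, hi = float(calibration_data.min()), float(calibration_data.max())
    scale = (hi - lo) / 255.0
    zero_point = int(round(-128 - lo / scale))
    return scale, zero_point

def roundtrip_error(data, scale, zero_point):
    """Max absolute error after quantizing and dequantizing `data`."""
    q = np.clip(np.round(data / scale) + zero_point, -128, 127)
    return float(np.abs((q - zero_point) * scale - data).max())

rng = np.random.default_rng(1)
full_data = rng.normal(0.0, 1.0, 100_000).astype(np.float32)

# Calibrating on a tiny subset misses rare extreme values, so those
# values are clipped at inference time and the error grows.
errors = {
    "all_data": roundtrip_error(full_data, *int8_params(full_data)),
    "tiny_subset": roundtrip_error(full_data, *int8_params(full_data[:100])),
}
```

This is why the documentation recommends using all of the training data: a small sample tends to underestimate the true value range.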

Since the quantized model might behave differently from the original model, you need to make sure that the predictions are similar. You can check the absolute error, the mean error, and the argmax error of the quantized model's predictions compared to the original model's predictions, using either the default thresholds or customized values. The choice of error metric depends on the application: a regression model is more sensitive to the absolute or mean error, while for a classifier the argmax error is more important.
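As an illustration of what these checks measure (the function and thresholds below are hypothetical, not Studio's code), the three metrics can be computed directly from the two models' predictions:

```python
import numpy as np

def quantization_error_checks(y_float, y_quant):
    """Compare quantized predictions against the original float predictions.

    Returns the percentage of rows whose argmax changed (relevant for
    classifiers), plus the mean error and the maximum absolute error
    (relevant for regression models)."""
    y_float = np.asarray(y_float, dtype=np.float32)
    y_quant = np.asarray(y_quant, dtype=np.float32)
    diff = y_quant - y_float
    return {
        "argmax_error_pct": 100.0 * np.mean(
            np.argmax(y_float, axis=1) != np.argmax(y_quant, axis=1)),
        "mean_error": float(np.mean(diff)),
        "max_abs_error": float(np.abs(diff).max()),
    }

# Hypothetical predictions from the original and quantized model;
# the last row's predicted class flips after quantization.
y_float = [[0.10, 0.90], [0.80, 0.20], [0.45, 0.55]]
y_quant = [[0.12, 0.88], [0.79, 0.21], [0.55, 0.45]]
checks = quantization_error_checks(y_float, y_quant)
```

For this toy classifier, one of three argmax results changed even though every individual output differs by at most 0.1, which is exactly the situation the argmax check is designed to catch.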