I want to use a generator to quantize an LSTM model.
I start with the question as this is quite a long post.
I actually want to know if you have managed to quantize (int8) an LSTM model with post-training quantization.
I tried different TF versions but always bumped into an error. Below are some of my tries. Maybe you see a mistake I made or have a suggestion.
The input is expected as (batch,1,45). Running inference with the un-quantized model runs fine. The model and csv can be found here:
csv file: https://mega.nz/file/5FciFDaR#Ev33Ij124vUmOF02jWLu0azxZs-Yahyp6PPGOqr8tok
import pathlib as path

import numpy as np
import pandas as pd
import tensorflow as tf


def reshape_for_Lstm(data):
    timesteps = 1
    samples = int(np.floor(data.shape[0] / timesteps))
    data = data.reshape((samples, timesteps, data.shape[1]))  # samples, timesteps, sensors
    return data


if __name__ == '__main__':
    # GET DATA
    data = pd.read_csv('./test_x_data_OOP3.csv', index_col=0)
    data = np.array(data)
    data = reshape_for_Lstm(data)

    # LOAD MODEL
    saved_model_dir = path.Path.cwd() / 'model' / 'singnature_model_tf_2.7.0-dev20210914'
    model = tf.keras.models.load_model(saved_model_dir)

    # INFERENCE
    [yhat, yclass] = model.predict(data)
    Yclass = [np.argmax(yclass[i], 0) for i in range(len(yclass))]  # get final class
    print('all good')
The shapes and dtypes of the variables look as expected.
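The reshape step above can be sanity-checked with stand-in data of the csv's dimensions (the 45-sensor width is from the post; the 1000-row count is an assumption):

```python
import numpy as np

# stand-in for the csv contents: 1000 rows x 45 sensor columns (row count assumed)
data = np.random.rand(1000, 45)

timesteps = 1
samples = int(np.floor(data.shape[0] / timesteps))
data = data.reshape((samples, timesteps, data.shape[1]))  # samples, timesteps, sensors

print(data.shape)  # (1000, 1, 45)
```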
Where it goes wrong
Now I want to quantize the model. But depending on the TensorFlow version I run into different errors.
The converter options I use are merged as follows:
converter = tf.lite.TFLiteConverter.from_saved_model('./model/singnature_model_tf_2.7.0-dev20210914')
converter.representative_dataset = batch_generator
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.experimental_new_converter = False
#converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8,
                                       tf.lite.OpsSet.TFLITE_BUILTINS]
#converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS,
#                                       tf.lite.OpsSet.SELECT_TF_OPS]
#converter._experimental_lower_tensor_list_ops = False
converter.target_spec.supported_types = [tf.int8]

quantized_tflite_model = converter.convert()
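The snippet assigns batch_generator to representative_dataset without showing it; a minimal sketch of what it could look like (my assumption, using random stand-in data of the model's (1, 1, 45) input shape; in practice it would iterate over the reshaped csv samples):

```python
import numpy as np

def batch_generator():
    # stand-in calibration data; in practice, iterate over the reshaped csv samples
    calibration_data = np.random.rand(100, 1, 45).astype(np.float32)
    for sample in calibration_data:
        # yield a list with one (1, 1, 45) array per model input
        yield [sample[np.newaxis, ...]]
```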
TensorFlow 2.2
Using TF 2.2, as often suggested in GitHub issues, I run into unsupported operators from tflite. I used a model created with TF 2.2 to ensure version support. Here, only TOCO conversion is supported.
Some of the operators in the model are not supported by the standard
TensorFlow Lite runtime and are not recognized by TensorFlow.
The error does not depend on the
converter.target_spec.supported_ops options, so I could not find a solution there. Setting allow_custom_ops only shifts the problem.
There are quite a few GitHub issues on this out there (just some examples), but none of the suggested options worked for me.
One is to try the new MLIR converter; however, in 2.2 the integer-only conversion for MLIR was not implemented yet.
So let's try a newer version.
Then I tried a well-vetted version. Here, no matter the
converter.target_spec.supported_ops settings, I run into the following error in calibrator.py when using the MLIR conversion:
ValueError: Failed to parse the model: pybind11::init(): factory
function returned nullptr.
The solution suggested on GitHub is to go back to TF 2.2.0.
With TOCO conversion, I get the following error:
tensorflow/lite/toco/allocate_transient_arrays.cc:181] An array,
still does not have a known data type after all graph transformations
have run. Fatal Python error: Aborted
I did not find anything on this error.
TensorFlow 2.6
Maybe it is solved in 2.6.
Here, no matter which
converter.target_spec.supported_ops I use, I run into the following error:
ValueError: Failed to parse the model: Only models with a single
subgraph are supported, model had 5 subgraphs.
The model is a five-layer model, so it seems that each layer is seen as a subgraph. I did not find an answer on how to merge them into one subgraph. The issue is apparently present in 2.6.0 and solved in 2.7, so let's try the nightly build.
TensorFlow 2.7-nightly (tried 2.7.0-dev20210914 and 2.7.0-dev20210921)
Here we have to use Python 3.7, as 3.6 is no longer supported.
Here we have to use
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS, tf.lite.OpsSet.SELECT_TF_OPS]
However, even though it is stated that
converter._experimental_lower_tensor_list_ops = False
should be set, it does not seem to be necessary.
The problem here is that, to my knowledge,
tf.lite.OpsSet.SELECT_TF_OPS calls into
calibrator.py, where the
representative_dataset is expected to provide specific generator data. From line 93 onwards, in the
_feed_tensor() function, the generator wants either a dict, list, or tuple.
In the tf.lite.RepresentativeDataset function description, and the tflite class description, it states that the dataset should look the same as the input of the model. In my case (and most cases) that is just a numpy array with the correct dimensions.
Here I could try to convert my data into a tuple; however, this does not seem right.
Or is that actually the way to go?
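For what it's worth, yielding a one-element tuple (or list) per sample does match the shapes that a dict/list/tuple-expecting generator would accept, so the wrapping itself is not necessarily a hack; a sketch under the same stand-in-data assumption as above:

```python
import numpy as np

# stand-in for the reshaped csv data; real code would use the loaded array
data = np.random.rand(1000, 1, 45).astype(np.float32)

def representative_dataset():
    for sample in data:
        # one-element tuple, containing a (1, 1, 45) array for the single model input
        yield (sample[np.newaxis, ...],)
```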
Thanks so much for reading all this. If I find an answer, I will of course update the post.
Source: Python Questions