I want to use a generator to quantize an LSTM model.

**Questions**

I start with the question, as this is quite a long post.

What I actually want to know is whether you have managed to quantize (int8) an LSTM model with post-training quantization.

I tried different TF versions but always ran into an error. Below are some of my attempts. Maybe you can spot a mistake I made, or have a suggestion.

Thanks

**Working Part**

The input is expected as (batch, 1, 45). **Running inference with the unquantized model works fine**. The model and CSV can be found here:

csv file: https://mega.nz/file/5FciFDaR#Ev33Ij124vUmOF02jWLu0azxZs-Yahyp6PPGOqr8tok

modelfile: https://mega.nz/file/UAMgUBQA#oK-E0LjZ2YfShPlhHN3uKg8t7bALc2VAONpFirwbmys

```
import numpy as np
import pandas as pd
import pathlib as path
import tensorflow as tf

def reshape_for_Lstm(data):
    timesteps = 1
    samples = int(np.floor(data.shape[0] / timesteps))
    data = data.reshape((samples, timesteps, data.shape[1]))  # samples, timesteps, sensors
    return data

if __name__ == '__main__':
    # GET DATA
    data = pd.read_csv('./test_x_data_OOP3.csv', index_col=[0])
    data = np.array(data)
    data = reshape_for_Lstm(data)

    # LOAD MODEL
    saved_model_dir = path.Path.cwd() / 'model' / 'singnature_model_tf_2.7.0-dev20210914'
    model = tf.keras.models.load_model(saved_model_dir)

    # INFERENCE
    [yhat, yclass] = model.predict(data)
    yclass = [np.argmax(yclass[i], 0) for i in range(len(yclass))]  # get final class
    print('all good')
```

The shape and dtype of the variable `data` are `(20000, 1, 45)` and `float64`.
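One detail worth noting: the calibration step feeds these samples straight into the model, and the TFLite calibrator typically expects `float32`, so the `float64` array has to be cast first. A minimal check, using a random stand-in array for the CSV data:

```python
import numpy as np

# Random stand-in for the CSV data; the real array is (20000, 1, 45) float64
data = np.random.rand(20000, 1, 45)
print(data.shape, data.dtype)   # (20000, 1, 45) float64

# TFLite calibration generally expects float32 samples
data32 = data.astype(np.float32)
print(data32.dtype)             # float32
```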

**Where it goes wrong**

Now I want to quantize the model. But depending on the TensorFlow version I run into different errors.

The converter options I tried are merged below; the commented lines show the alternatives I switched between:

```
converter=tf.lite.TFLiteConverter.from_saved_model('./model/singnature_model_tf_2.7.0-dev20210914')
converter.representative_dataset = batch_generator
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.experimental_new_converter = False
#converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8, tf.lite.OpsSet.TFLITE_BUILTINS]
#converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS, tf.lite.OpsSet.SELECT_TF_OPS]
#converter._experimental_lower_tensor_list_ops = False
converter.target_spec.supported_types = [tf.int8]
quantized_tflite_model = converter.convert()
```
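The `batch_generator` used above is not shown; for completeness, here is a minimal sketch of what it looks like, assuming `data` is the `(20000, 1, 45)` array from the inference script, cast to `float32` and subsampled (a few hundred samples are usually enough for calibration):

```python
import numpy as np

# Random stand-in for the real CSV-derived array
data = np.random.rand(20000, 1, 45).astype(np.float32)

def batch_generator():
    # Yield one sample per step, keeping the batch dimension: (1, 1, 45).
    # Each yielded item is a list with one array per model input;
    # this model has a single input, hence a single-element list.
    for i in range(100):
        yield [data[i:i + 1]]
```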

**TensorFlow 2.2**

Using TF 2.2, as often suggested on GitHub, I ran into unsupported operators from tflite. I used a model created with TF 2.2 to ensure version support. In this version, only TOCO conversion is supported.

Some of the operators in the model are not supported by the standard TensorFlow Lite runtime and are not recognized by TensorFlow.

The error does not depend on the `converter.target_spec.supported_ops` options, so I could not find a solution there. Setting `allow_custom_ops` only shifts the problem.

There are quite a few GitHub issues on this (just some examples), but none of the suggested options worked.

One suggestion is to try the new MLIR converter; however, in 2.2 the integer-only conversion for MLIR was not implemented yet.

So let's try a newer version.

**TensorFlow 2.5.0**

Then I tried a well-vetted version. Here, no matter which `converter.target_spec.supported_ops` I choose, I run into the following error when using the MLIR conversion:

In `calibrator.py`:

ValueError: Failed to parse the model: pybind11::init(): factory function returned nullptr.

The suggested solution on GitHub is to use TF 2.2.0.

With TOCO conversion, I get the following error:

tensorflow/lite/toco/allocate_transient_arrays.cc:181] An array, StatefulPartitionedCall/StatefulPartitionedCall/model/lstm/TensorArrayUnstack/TensorListFromTensor, still does not have a known data type after all graph transformations have run. Fatal Python error: Aborted

I did not find anything on this error.

Maybe it is solved in 2.6.

**TensorFlow 2.6.0**

Here, no matter which `converter.target_spec.supported_ops` I use, I run into the following error:

ValueError: Failed to parse the model: Only models with a single subgraph are supported, model had 5 subgraphs.

The model is a five-layer model, so it seems that each layer is seen as a subgraph. I did not find an answer on how to merge them into one subgraph. The issue apparently exists in 2.6.0 and is solved in 2.7. So, let's try the nightly build.

**TensorFlow 2.7-nightly** (tried 2.7.0-dev20210914 and 2.7.0-dev20210921)

Here we have to use Python 3.7, as 3.6 is no longer supported.

We also have to use

```
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS, tf.lite.OpsSet.SELECT_TF_OPS]
```

However, even though it is stated that

```
converter._experimental_lower_tensor_list_ops = False
```

should be set, it does not seem necessary.

The problem here is that, to my knowledge, `tf.lite.OpsSet.SELECT_TF_OPS` calls `calibrator.py`, and in `calibrator.py` the `representative_dataset` is expected to provide generator data of a specific form. From line 93 onwards, in the `_feed_tensor()` function, the generator wants either a dict, list, or tuple.

In the `tf.lite.RepresentativeDataset` function description, or the tflite class description, it states that the dataset should look the same as the input to the model, which in my case (and most cases) is just a NumPy array with the correct dimensions.

Here I could try to wrap my data into a tuple; however, this does not seem right.

Or is that actually the way to go?
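For what it's worth, this is the wrapping I would try based on the `_feed_tensor()` check: yield each sample inside a single-element list (or tuple), one entry per model input. A sketch with a random stand-in for the real data — I have not confirmed that this fixes the conversion:

```python
import numpy as np

# Random stand-in for the CSV data
data = np.random.rand(20000, 1, 45).astype(np.float32)

def representative_dataset():
    for sample in data[:200]:
        # sample has shape (1, 45); restore the batch dimension -> (1, 1, 45)
        yield [np.expand_dims(sample, axis=0)]
```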

**Thanks so much for reading all this. If I find an answer, I will of course update the post**
