How to get access to the tokenizer after loading a custom pre-trained BERT model?

  bert-language-model, keras, python, tensorflow, tokenize

I am working on an intent classification problem and need your help.

I fine-tuned a BERT model for text classification, then trained and evaluated it on a small dataset for detecting five intents. I followed the code from the tutorial Intent Recognition with BERT using Keras and TensorFlow 2, and it works fine!
I have saved the model so that I can use it later without retraining it:

# Save the entire model as a SavedModel.
!mkdir -p saved_model
model.save('saved_model/intentclassifiermodel')

Then I zipped it and downloaded it so I can use it separately:

!zip -r saved_model.zip saved_model/
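
From what I understand, model.save stores only the network graph and weights, not the bert-for-tf2 tokenizer, so I suspect I should also have kept the BERT vocabulary file next to the SavedModel. A minimal sketch of what I mean (the checkpoint folder name is the one the tutorial downloads, so treat both paths as assumptions):

import shutil

# Sketch: keep the BERT vocab file next to the SavedModel so the
# tokenizer can be rebuilt later; both paths are assumptions here.
shutil.copy('uncased_L-12_H-768_A-12/vocab.txt',
            'saved_model/intentclassifiermodel/vocab.txt')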

Now I am trying to use this model to predict intents. For that, I created another Google Colab notebook and loaded the model:

from google.colab import drive
drive.mount('/content/gdrive')

!pip install tensorflow==2.2

!pip install bert-for-tf2 >> /dev/null

import bert

from tensorflow import keras
model = keras.models.load_model('/content/gdrive/MyDrive/NLPMODELS/saved_model/intentclassifiermodel')

model.summary()
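
To double-check that the load really worked, I suppose I could run a dummy batch through the model, something like this sketch (128 is a placeholder for my training sequence length):

import numpy as np

# Sanity check: a single all-zero sequence; 128 is a placeholder for
# the max_seq_len used during training.
dummy = np.zeros((1, 128), dtype=np.int32)
print(model.predict(dummy).shape)  # expecting (1, 5) for five intents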


The model is loaded successfully, and now I want to predict. For that I am using the following code snippet (the same code as in the original tutorial):

sentences = [
  "are you a bot?",
  "how to create a bot"
]

import numpy as np

# Tokenize each sentence and wrap it in BERT's special markers
pred_tokens = map(tokenizer.tokenize, sentences)
pred_tokens = map(lambda tok: ["[CLS]"] + tok + ["[SEP]"], pred_tokens)
pred_token_ids = list(map(tokenizer.convert_tokens_to_ids, pred_tokens))

# Zero-pad every sequence up to the training sequence length
pred_token_ids = map(lambda tids: tids + [0] * (data.max_seq_len - len(tids)), pred_token_ids)
pred_token_ids = np.array(list(pred_token_ids))

# Pick the most probable intent for each sentence
predictions = model.predict(pred_token_ids).argmax(axis=-1)

for text, label in zip(sentences, predictions):
  print("text:", text, "\nintent:", classes[label])
  print()
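
Apart from tokenizer, this snippet also references data.max_seq_len and classes from the training notebook, so I assume I need to carry those over too. A minimal sketch with placeholder values (they have to match whatever was used during training):

max_seq_len = 128   # placeholder: the value data.max_seq_len had during training
classes = ['intent_0', 'intent_1', 'intent_2',
           'intent_3', 'intent_4']   # placeholder labels for my five intents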

**However, this code fails because I am not sure how to access the tokenizer here.**

Can you please help me figure out how to get the tokenizer?
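
My current best guess is that I have to recreate it from the original BERT vocabulary file, the same way the tutorial built it, roughly like this (the vocab path is a placeholder for wherever vocab.txt lives on my Drive):

from bert.tokenization.bert_tokenization import FullTokenizer

# Sketch: rebuild the tokenizer from the original vocab file; the path
# is a placeholder, and do_lower_case=True matches the uncased BERT
# checkpoint from the tutorial.
tokenizer = FullTokenizer(
    vocab_file='/content/gdrive/MyDrive/NLPMODELS/vocab.txt',
    do_lower_case=True)

Is that the intended approach, or can the tokenizer be recovered from the loaded model itself?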

Thanks and Regards,
Rohit Dhamija
