Improve the prediction time of Hugging Face transformer models without a GPU

I am using Hugging Face transformers models for quite a few tasks. They work well, but the only problem is the response time: it takes around 6-7 seconds to generate a result, and sometimes even 15-20 seconds. I tried Google Colab with a GPU, and there the performance is very fast; it processes the result within about a second. Since my current server has no GPU available, is there any way to reduce the models' response time using the CPU only?
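For reference, one common CPU-only speedup is dynamic int8 quantization of the model's linear layers. Below is a minimal sketch on a small stand-in network (so it runs without downloading anything); the same `quantize_dynamic` call can be applied to a loaded Hugging Face model object. The speedup and any accuracy impact depend on the model, so this is an illustration, not a guaranteed fix.

```python
import torch

# Stand-in for a transformer model; in practice you would pass the
# model returned by transformers (e.g. AutoModelForSeq2SeqLM) instead.
model = torch.nn.Sequential(
    torch.nn.Linear(128, 256),
    torch.nn.ReLU(),
    torch.nn.Linear(256, 64),
)
model.eval()

# Replace nn.Linear layers with dynamically quantized int8 versions,
# which usually run faster on CPU.
quantized = torch.quantization.quantize_dynamic(
    model, {torch.nn.Linear}, dtype=torch.qint8
)

x = torch.randn(1, 128)
with torch.no_grad():
    out = quantized(x)
print(out.shape)  # torch.Size([1, 64])
```

The quantized model is a drop-in replacement for inference: you call it (or its `generate` method, for seq2seq models) exactly as before.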

Currently I am doing text summarization using the Google Pegasus model,

and Parrot paraphrasing, which internally uses a BERT model from transformers.
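One thing worth checking first: make sure the model/pipeline is loaded once and reused across requests, since loading alone can cost several seconds per call. A minimal sketch of that pattern, using a cheap stand-in loader in place of the real `transformers.pipeline(...)` call (the checkpoint name is illustrative):

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def get_model(name: str):
    # Stand-in for an expensive load such as
    # transformers.pipeline("summarization", model="google/pegasus-xsum").
    # lru_cache ensures the load happens only on the first call.
    return {"name": name}

a = get_model("google/pegasus-xsum")
b = get_model("google/pegasus-xsum")
print(a is b)  # True -- the second call reuses the cached object
```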

Even a slight improvement would help!
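Two cheap knobs that often yield such an improvement on CPU are pinning the thread count to the machine's physical cores and making generation less expensive. The values below (4 threads, greedy decoding, a 64-token budget) are assumptions to tune, not recommendations from any benchmark:

```python
import torch

# Cap intra-op parallelism; on shared servers, oversubscribed threads
# often make latency worse rather than better. 4 is an example value.
torch.set_num_threads(4)

# Illustrative kwargs for model.generate(): greedy decoding (num_beams=1)
# and a tighter output-length budget are typically much faster than the
# defaults, at some cost in output quality.
generation_kwargs = dict(num_beams=1, max_length=64)

print(torch.get_num_threads())  # 4
```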

Source: Python Questions