Should data fed into the Universal Sentence Encoder be normalized?

  artificial-intelligence, nlp, python, tensorflow

I am currently working with TensorFlow’s Universal Sentence Encoder (USE, https://arxiv.org/pdf/1803.11175.pdf) for my B.Sc. thesis, in which I study extractive summarisation techniques.
In the vast majority of techniques for this task (like https://www.aaai.org/ocs/index.php/IJCAI/IJCAI15/paper/view/11225/10855), the sentences are first normalized (lowercasing, stop word removal, lemmatisation), but I couldn’t find any hint as to whether sentences fed into the USE should be normalized first. Is that the case? Does it matter?
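
One way to probe this empirically is to embed each sentence both raw and normalized and compare the two vectors. The following is only a minimal sketch under stated assumptions: `normalize`, `cosine`, `compare_raw_vs_normalized` and the tiny `STOP_WORDS` set are hypothetical helpers written for illustration, lemmatisation is left out to keep the example dependency-free, and `embed` stands for any callable that maps a list of strings to a list of vectors.

```python
import math
import re

# Tiny illustrative stop-word list (an assumption; real pipelines use a fuller list).
STOP_WORDS = {"a", "an", "the", "is", "are", "of", "to", "in", "and"}


def normalize(sentence):
    """Lowercase, keep alphanumeric tokens, and drop stop words (no lemmatisation)."""
    tokens = re.findall(r"[a-z0-9']+", sentence.lower())
    return " ".join(t for t in tokens if t not in STOP_WORDS)


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def compare_raw_vs_normalized(embed, sentences):
    """Return (raw, normalized, similarity) for each sentence.

    `embed` is assumed to map a list of strings to a list of vectors.
    """
    normalized = [normalize(s) for s in sentences]
    raw_vecs = embed(sentences)
    norm_vecs = embed(normalized)
    return [
        (raw, norm, cosine(list(r), list(n)))
        for raw, norm, r, n in zip(sentences, normalized, raw_vecs, norm_vecs)
    ]
```

With TensorFlow Hub installed, `embed` could plausibly be built from the published model, for example `model = hub.load("https://tfhub.dev/google/universal-sentence-encoder/4")` wrapped as `lambda xs: model(xs).numpy()`. A similarity close to 1 for most sentences would suggest that normalization changes little for this encoder, while lower values would suggest it does matter for the data at hand.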

Source: Python Questions
