I am currently working with TensorFlow’s Universal Sentence Encoder (USE, https://arxiv.org/pdf/1803.11175.pdf) for my B.Sc. thesis, in which I study extractive summarisation techniques.
In the vast majority of techniques for this task (like https://www.aaai.org/ocs/index.php/IJCAI/IJCAI15/paper/view/11225/10855), the sentences are first normalized (lowercasing, stop word removal, lemmatisation), but I couldn’t find any indication of whether sentences fed into the USE should be normalized first. Is that the case? Does it matter?
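For concreteness, this is roughly the kind of normalization I mean. It is only a minimal sketch using the standard library: the `normalize` helper and the tiny stop word list are my own placeholders, and lemmatisation is left out because it would need an extra library such as NLTK or spaCy.

```python
import re

# Hypothetical, deliberately tiny stop word list, for illustration only.
STOP_WORDS = {"a", "an", "the", "is", "are", "of", "to", "in", "and"}


def normalize(sentence: str) -> str:
    """Lowercase, drop punctuation, and remove stop words (no lemmatisation)."""
    # Lowercase first, then keep only runs of letters, digits and apostrophes.
    tokens = re.findall(r"[a-z0-9']+", sentence.lower())
    # Filter out stop words and rebuild a whitespace-separated string.
    return " ".join(t for t in tokens if t not in STOP_WORDS)


raw = "The cats are sitting in the garden."
print(raw)             # the sentence as it appears in the document
print(normalize(raw))  # the normalized variant
```

So the question is whether the encoder should receive `raw` or the output of `normalize(raw)`.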