Category : stemming

I have a tweet dataset (taken from NLTK) that is currently in a pandas DataFrame, and I need to stem it. I have tried several approaches and hit different errors, such as AttributeError: 'Series' object has no attribute 'lower' and KeyError: 'text'. I don't understand the KeyError, as the column is definitely called 'text', ..
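
A minimal sketch of one way to approach this, not necessarily what the asker ended up with. The AttributeError typically appears when a whole Series is handed to `stemmer.stem`, which expects a single string, so the sketch stems row by row with `Series.apply`. The sample rows, the `stem_text` helper, the whitespace tokenization, and the `stemmed` column are assumptions for illustration; the header-stripping line covers only one possible cause of the KeyError (a stray space in the column name).

```python
import pandas as pd
from nltk.stem.snowball import SnowballStemmer

stemmer = SnowballStemmer("english")

# Hypothetical stand-in for the tweet data; only the 'text' column name
# comes from the question.
df = pd.DataFrame({"text": ["Waiting for trains again", "Walking to the shops"]})

# One possible cause of KeyError: 'text' is stray whitespace in the header,
# so normalize the column names before indexing.
df.columns = df.columns.str.strip()


def stem_text(text):
    """Stem one string: split on whitespace, stem each token, rejoin."""
    return " ".join(stemmer.stem(token) for token in text.split())


# stemmer.stem() works on a single string, so apply it per row rather than
# passing the whole Series.
df["stemmed"] = df["text"].apply(stem_text)
print(df)
```

Printing `df.columns.tolist()` before indexing is a quick way to check what the column is actually called.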


I'm trying to build the TF-IDF matrix for the following trivial example, using the code:

    from sklearn.feature_extraction.text import TfidfVectorizer
    import nltk
    import re
    from nltk.stem.snowball import SnowballStemmer

    stemmer = SnowballStemmer('english')
    corpus = [
        "I work in a consulting agency and I am an IT consultant.",
        "They're certainly consultants.",
    ]
    vectorizer = TfidfVectorizer(stop_words=nltk.corpus.stopwords.words('english'), tokenizer=tokenize_and_stem)
    X = ..
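
The excerpt does not show `tokenize_and_stem`, so the following is only a guess at what such a helper might look like. Assumptions: a simple regex tokenizer, scikit-learn's built-in `ENGLISH_STOP_WORDS` standing in for the NLTK stop-word list (to avoid a corpus download), and stop words passed through the same stemmer so they match the stemmed tokens.

```python
import re

from nltk.stem.snowball import SnowballStemmer
from sklearn.feature_extraction.text import ENGLISH_STOP_WORDS, TfidfVectorizer

stemmer = SnowballStemmer("english")


def tokenize_and_stem(text):
    """Hypothetical helper: lowercase, keep alphabetic runs, stem each token."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [stemmer.stem(token) for token in tokens]


# Stop words are filtered after tokenization, so they are stemmed the same
# way as the documents. ENGLISH_STOP_WORDS is a stand-in for the NLTK list.
stemmed_stop_words = sorted(
    {tok for word in ENGLISH_STOP_WORDS for tok in tokenize_and_stem(word)}
)

corpus = [
    "I work in a consulting agency and I am an IT consultant.",
    "They're certainly consultants.",
]

vectorizer = TfidfVectorizer(
    stop_words=stemmed_stop_words,
    tokenizer=tokenize_and_stem,
    token_pattern=None,  # unused when a custom tokenizer is supplied
)
X = vectorizer.fit_transform(corpus)

print(sorted(vectorizer.vocabulary_))
print(X.shape)
```

scikit-learn may still warn that the stop words are inconsistent with the tokenizer, because stemming an already stemmed word can change it again; the warning does not stop the fit.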


I am trying to stem the values of a DataFrame column and, based on each stem, add a value to a new column, data['category']. I am getting this error: ValueError: Length of values does not match length of index. Kindly help me resolve it.

    words = ['wedding', 'property', 'house', 'university', 'education', 'car']
    for word in words:
        print(english_stemmer.stem(word))
    from ..
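
This ValueError usually means a list of a different length than the DataFrame was assigned to the new column. A rough sketch of one way around it is to compute the category per row with `Series.apply`, which always returns one value per row. The `purpose` column, the sample rows, and the `find_category` helper are hypothetical; only `words`, `english_stemmer`, and `data['category']` come from the question.

```python
import pandas as pd
from nltk.stem.snowball import SnowballStemmer

english_stemmer = SnowballStemmer("english")

words = ["wedding", "property", "house", "university", "education", "car"]
# Map each stem back to the category word it came from.
stem_to_word = {english_stemmer.stem(word): word for word in words}

# Hypothetical data; the real column name is not visible in the question.
data = pd.DataFrame({"purpose": ["buying a car", "buying a house", "holiday"]})


def find_category(text):
    """Return the first category whose stem matches a token, else None."""
    for token in text.split():
        stem = english_stemmer.stem(token)
        if stem in stem_to_word:
            return stem_to_word[stem]
    return None


# apply() yields exactly one value per row, so the lengths always match.
data["category"] = data["purpose"].apply(find_category)
print(data)
```

Rows with no matching stem end up with a missing value in `category`, which can be filled afterwards with `fillna` if a default label is wanted.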


I have a dictionary like this: random = {'1': {'A': ['Its raining here', 'Hello']}, '2': {'B': ['are you fine', 'I was in a hurry']}} I want the values under the inner keys to be stemmed. Example: 'A': ['Its raining here', 'Hello'] -> 'A': ['it rain here', 'hello']. I used the code below but couldn't ..
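
As a sketch only, since the asker's own code is cut off: a nested dictionary comprehension can rebuild the same structure while stemming each sentence. The choice of `SnowballStemmer`, the whitespace tokenization, and the `stem_sentence` helper are assumptions, not taken from the question.

```python
from nltk.stem.snowball import SnowballStemmer

stemmer = SnowballStemmer("english")

random = {
    "1": {"A": ["Its raining here", "Hello"]},
    "2": {"B": ["are you fine", "I was in a hurry"]},
}


def stem_sentence(sentence):
    """Stem every whitespace-separated token and rejoin into one string."""
    return " ".join(stemmer.stem(token) for token in sentence.split())


# Rebuild the structure: outer key -> inner key -> list of stemmed sentences.
stemmed = {
    outer_key: {
        inner_key: [stem_sentence(sentence) for sentence in sentences]
        for inner_key, sentences in inner.items()
    }
    for outer_key, inner in random.items()
}

print(stemmed)
```

Building a new dictionary leaves `random` unchanged; assigning the result back to `random` replaces it if in-place behaviour is preferred.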
