Category: nltk

I’m relatively new to Python and am trying to clean and summarize a .txt file. After trying to create a similarity matrix, I keep getting this warning:

/usr/local/lib/python3.6/dist-packages/ipykernel_launcher.py:4: RuntimeWarning: invalid value encountered in double_scalars after removing the cwd from sys.path.

This is the similarity matrix code:

similarity_matrix = np.zeros((len(cleaned_sentences), len(cleaned_sentences)))
for i in range(0, len(cleaned_sentences)):
    for ..
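
One common cause of this warning is a 0/0 division when a sentence vector has zero norm, for example a sentence that is empty after cleaning. That is only an assumption here, since the excerpt is cut off before the inner loop. A minimal sketch in plain Python (not the asker's NumPy code), with the hypothetical names `cosine_similarity` and `build_similarity_matrix`, that guards against that case:

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors.

    Returns 0.0 when either vector has zero norm, instead of dividing 0 by 0.
    """
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(x * x for x in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    dot = sum(x * y for x, y in zip(u, v))
    return dot / (norm_u * norm_v)


def build_similarity_matrix(vectors):
    """Pairwise similarity matrix; the diagonal is left at 0.0."""
    n = len(vectors)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                matrix[i][j] = cosine_similarity(vectors[i], vectors[j])
    return matrix
```

The same zero-norm check could be placed before the division in a NumPy-based loop. Whether it applies depends on how the sentence vectors are actually built, which the excerpt does not show.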

I’m trying to remove stopwords from each row of my dataframe and store the result in a new dataframe column, S. I’ve tried the code below, but it doesn’t seem to work:

from nltk.corpus import stopwords
stopwords = stopwords.words('english')
df['S'] = df.apply(lambda row: (word for word in row['remarks_tokenized'] if word.lower() not in stopwords), axis=1)

Source: Python..
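
The parentheses inside the lambda create a generator expression, so each cell would hold a generator object rather than a list of words; square brackets build the list instead. A minimal sketch without pandas or the NLTK corpus, assuming a small stand-in stopword set (`STOPWORDS`) and a hypothetical helper `remove_stopwords`:

```python
# Stand-in for stopwords.words('english'); an assumption for illustration only.
STOPWORDS = {"the", "is", "a", "of"}


def remove_stopwords(tokens, stopword_set=STOPWORDS):
    """Return a list (not a generator) of the tokens that are not stopwords."""
    return [word for word in tokens if word.lower() not in stopword_set]


# Each inner list plays the role of one row's 'remarks_tokenized' value.
rows = [["The", "pump", "is", "leaking"], ["A", "loss", "of", "pressure"]]
cleaned = [remove_stopwords(tokens) for tokens in rows]
print(cleaned)
```

With a DataFrame, the same helper could be applied as `df['S'] = df['remarks_tokenized'].apply(remove_stopwords)`, assuming that column already holds lists of tokens. Converting the stopword list to a set also makes each membership check faster.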

I have a list of sentences in a CSV file. I need to lemmatize these sentences and extract the ones that contain certain keywords.

import wordnet, nltk
from nltk.stem import WordNetLemmatizer
from nltk.corpus import wordnet
from nltk import word_tokenize
import pandas as pd
import csv
# define lemmatizer
lemmatizer = WordNetLemmatizer()
result = []
# define ..

I have the following chatbot, which reads a text file, processes it with NLTK, and outputs text from that file accordingly. I want to use an Excel file with multiple columns and hundreds of rows instead. Each row should contain a question, its answer, and the source of the answer. How can I implement Excel into this ..
