I have a personal Python project where I am trying to tokenize tweets. I am using NLTK's TweetTokenizer to break up these tweets. I am running into an issue where contractions incorrectly get broken up, e.g. "can’t" -> ["can", "’", "t"]. I am struggling to find any documentation on this error. I have pasted relevant ..
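One likely cause (an assumption, since the question is truncated) is that the tweet contains the typographic apostrophe U+2019 rather than the ASCII apostrophe that the tokenizer's contraction patterns expect. A minimal sketch that normalizes apostrophes before tokenizing:

```python
def normalize_apostrophes(text):
    # Map common typographic apostrophes to the ASCII "'" so that
    # contraction-aware tokenizers see "can't" instead of "can’t"
    for ch in ("\u2019", "\u2018", "\u02bc"):
        text = text.replace(ch, "'")
    return text

print(normalize_apostrophes("can\u2019t"))  # -> can't
```

After this normalization, passing the text to `TweetTokenizer().tokenize(...)` should keep the contraction together as a single token.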
I downloaded the nltk package via pip, but when I run my code I get a message: "Resource stopwords not found. Please use the NLTK Downloader to obtain the resource ….". The code is: from nltk.corpus import stopwords; stop_words = set(stopwords.words('english')). Do I need any other packages installed? My understanding is that stopwords is included in NLTK ..
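pip installs only the NLTK code, not its data files; the stopwords corpus is a separate one-time download. A setup sketch (run in the same environment that runs the code):

```shell
# Download the stopwords corpus into the default nltk_data directory
python -m nltk.downloader stopwords
```

Equivalently, from inside Python: `import nltk; nltk.download('stopwords')`.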
My problem has two parts; here is the first. I have a list of lists like this: big_list = [['i have an apple', 'i have a pear', 'i am a monkey', 'tell me about apples'], ['tell me about cars', 'tell me about trucks', 'tell me about planes']]. I also have a list of words like this: words = ['i', 'have', 'monkey', 'tell', 'me', 'about'] ..
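Since the question is cut off, a common first step with this shape of data is finding, for each sentence, which of the target words it contains. A minimal pure-Python sketch using the names from the question:

```python
big_list = [['i have an apple', 'i have a pear', 'i am a monkey', 'tell me about apples'],
            ['tell me about cars', 'tell me about trucks', 'tell me about planes']]
words = ['i', 'have', 'monkey', 'tell', 'me', 'about']

word_set = set(words)  # set membership tests are O(1)

def matches(sentence):
    # Split on whitespace and keep only tokens that appear in the word list
    return [tok for tok in sentence.split() if tok in word_set]

hits = [[matches(s) for s in sub] for sub in big_list]
print(hits[0][0])  # -> ['i', 'have']
```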
I want to extract information from different sentences, so I'm using NLTK to divide each sentence into words. I'm using this code: words = []; for i in range(len(sentences)): words.append(nltk.word_tokenize(sentences[i])). It works pretty well, but I want something a little bit different .. for example, I have this sentence: ['Jan 31 19:28:14 nginx: 10.0.0.0 – – ..
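For structured lines like the nginx log example, a regular expression often gives more control than a general-purpose tokenizer, e.g. keeping timestamps and IP addresses together. A sketch (the exact log format is assumed from the truncated example, with an ASCII dash):

```python
import re

line = 'Jan 31 19:28:14 nginx: 10.0.0.0 - -'

# First alternative keeps dotted/colon-joined number runs (IPs, timestamps)
# as single tokens; then plain words; then any other non-space character.
tokens = re.findall(r"\d+(?:[.:]\d+)+|\w+|\S", line)
print(tokens)  # -> ['Jan', '31', '19:28:14', 'nginx', ':', '10.0.0.0', '-', '-']
```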
I am trying to tokenize text in a single column of a dataframe, but the code does not run to completion or output an error message. The hourglass icon in Jupyter Notebook is still present after more than an hour. The CSV file loaded into Jupyter Notebook is about 54 MB. I have tried other word ..
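One common cause of this kind of hang (an assumption, since no code is shown) is applying a heavy tokenizer row by row over a large column. A compiled regex tokenizer is far cheaper for a first pass; sketched here over a plain list standing in for the dataframe column:

```python
import re

# Matches runs of letters/apostrophes; drops digits and punctuation
TOKEN_RE = re.compile(r"[A-Za-z']+")

def tokenize(text):
    return TOKEN_RE.findall(text.lower())

rows = ["Tokenizing row by row can be slow", "with heavy tokenizers"]
tokenized = [tokenize(r) for r in rows]
print(tokenized[0])
```

With pandas, the same function can be applied via `df['col'].apply(tokenize)`; if even that is slow, tokenizing in chunks helps isolate where the time goes.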
I have a dataframe with text in one of its columns. I have listed some predefined keywords which I need for analysis, along with words associated with them (to later make a wordcloud and a counter of occurrences), to understand the topics/context associated with such keywords. Use case: df.text_column(); keywordlist = [coca, food, soft, aerated, soda] ..
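The counter half of this can be sketched with `collections.Counter`; the texts below are illustrative stand-ins for the dataframe column, and the keywords are taken from the question:

```python
from collections import Counter

keywordlist = ['coca', 'food', 'soft', 'aerated', 'soda']
keyword_set = set(keywordlist)

# Stand-in for df.text_column values
texts = ['soda and soft drinks', 'coca farming', 'soft aerated soda']

counts = Counter()
for text in texts:
    tokens = text.lower().split()
    counts.update(t for t in tokens if t in keyword_set)

print(counts)  # e.g. Counter({'soda': 2, 'soft': 2, 'coca': 1, 'aerated': 1})
```

The resulting Counter feeds directly into most wordcloud libraries, which typically accept a word-to-frequency mapping.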
I have just started learning NLP, and for that purpose I installed the nltk package using pip install nltk in the cmd terminal of VS Code. After I installed it, I tried importing it on the command line itself and was successful, but in the main window where we write the code, I tried from ..
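When an import works in the terminal but not in the editor, the two are usually running different Python interpreters. A quick diagnostic is to run the same snippet in both places and compare:

```python
import sys

# If these differ between the terminal and VS Code, select the matching
# interpreter in VS Code, or pip install nltk into the one shown here
print(sys.executable)
print(sys.version)
```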
When upgrading an IIS Flask app from Python 3.8.6 to 3.9.5, I am suddenly unable to get the services to work, since they are unable to locate the nltk_data directory. I have deleted and re-downloaded the corpora I needed for the app with Python 3.9, with no luck. I have checked to make sure it is ..
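NLTK builds its data search path from the NLTK_DATA environment variable plus a set of default locations, and a service account under IIS often has a different home directory than the interactive user who ran the download. A sketch of pointing NLTK at an explicit directory (the path shown is a placeholder):

```python
import os

# Placeholder path -- use the directory where the corpora were actually
# downloaded. Must be set before nltk is imported for it to take effect.
os.environ["NLTK_DATA"] = r"C:\nltk_data"

# With nltk available, an alternative is appending to its search path:
#   import nltk
#   nltk.data.path.append(r"C:\nltk_data")
print(os.environ["NLTK_DATA"])
```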
I am trying to create an application that identifies the emotion of the face in an image using deep learning. Later I want to generate a response to that emotion and display it using Tkinter and NLP in Python. Like when we ask a chatbot something and it gives a response, I want to apply it ..
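The response-generation step can start much simpler than a full chatbot: a lookup from the detected emotion label to a canned reply, which the Tkinter UI then displays. A toy sketch (the labels and replies are purely illustrative, not from any specific model):

```python
# Hypothetical emotion labels from the image model, mapped to canned replies
RESPONSES = {
    "happy": "You look happy! Keep it up.",
    "sad": "Sorry you're feeling down. Want to talk about it?",
    "angry": "Take a deep breath. What's bothering you?",
}

def respond(emotion):
    # Fall back to a neutral reply for labels we don't recognize
    return RESPONSES.get(emotion, "I'm not sure how you're feeling.")

print(respond("happy"))
```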
OK, I'm working on a project (OpenSubtitles). I have the English subs folder, which contains 446,612 files. The question is: I want to read the data, and I want to clean the data as well (tokenize + remove punctuation + remove stopwords + remove whitespace). I want to apply the same algorithm on ..
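The cleaning steps listed (tokenize, strip punctuation, drop stopwords, collapse whitespace) compose into a single function that can then be mapped over every file. A sketch using a tiny illustrative stopword set in place of NLTK's downloadable list:

```python
import string

STOPWORDS = {"the", "a", "an", "and", "is", "to", "of"}  # toy set; NLTK's list is larger

def clean(text):
    text = text.lower()
    # Remove punctuation, then split on any run of whitespace
    text = text.translate(str.maketrans("", "", string.punctuation))
    tokens = text.split()
    return [t for t in tokens if t not in STOPWORDS]

print(clean("  The quick, brown fox -- jumps over the lazy dog! "))
```

For 446,612 files, iterating with `os.scandir` and writing results incrementally avoids holding the whole corpus in memory.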