Category : pandas

I’m trying to identify all the possible languages in the dataframe. Here is a sample of my dataframe:

import pandas as pd
import pycld2 as cld2

dataload = [['AB1', "Machine learning isn't difficult"], ['AB2', 'O aprendiz ado de máquina não é tão difíci كما يظن الناس']]
dfTest = pd.DataFrame(dataload, columns=['UID', 'TXT'])

UID  TXT
AB1  Machine learning isn’t difficult
AB2  O aprendiz ..
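As a rough first pass that does not need pycld2, one could check which Unicode scripts occur in each TXT value, since the second sample row mixes Latin and Arabic text. Note that this is a swapped-in technique, not language detection: it cannot tell Portuguese from English. The helper name `scripts_in` is hypothetical, and the rows are copied from the sample in the question.

```python
import unicodedata


def scripts_in(text):
    """Return the script names found in text.

    The script is approximated by the first word of each letter's
    Unicode character name (e.g. 'LATIN', 'ARABIC').
    """
    scripts = set()
    for ch in text:
        if ch.isalpha():
            name = unicodedata.name(ch, "")
            if name:
                scripts.add(name.split()[0])
    return scripts


# Rows taken from the question's sample data.
dataload = [
    ["AB1", "Machine learning isn't difficult"],
    ["AB2", "O aprendiz ado de máquina não é tão difíci كما يظن الناس"],
]

scripts_by_uid = {uid: scripts_in(txt) for uid, txt in dataload}

# Rows containing more than one script are candidates for mixed-language text.
mixed = [uid for uid, scripts in scripts_by_uid.items() if len(scripts) > 1]
```

Rows flagged in `mixed` could then be handed to a real language detector such as pycld2 for a per-language breakdown.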


I have a fixed-width flat file. To make matters worse, each line can be either a new record or a subrecord of the line above, identified by the first three characters on each line: fwf file: 6 00604000000000227960010000000000000 040620211544157004410298500771004100336KK 05454220545401 1 6 106040000000002279600002000000300000000030000020000000000 6306040000000002279600010406202115441504010206570000000704106813674700010000001900000000190001800000000000290Y 6306040000000002279600020406202115441504010202180000000869078491299000010000001100000000110001800000000000168Y 6 00604000000000228357010020000000000 040620211549287004410298500771004100337KK 05454220545401 1 6 106040000000002283570002000000260000000026000010000000000 6306040000000002283570010406202115492804010202470000000406470043530800010000001500000000150001800000000000229Y 6306040000000002283570020406202115492804010202180000000869078491299000010000001100000000110001800000000000168Y I’d ..
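One way to approach this kind of file is to group the lines first and slice the fixed-width fields afterwards. The sketch below is only an illustration under stated assumptions: it guesses that a `6 0` prefix starts a new record and that every other line belongs to the record above it. The function name `group_records` and the shortened sample lines are hypothetical; the real prefixes and column layout are not fully visible in the excerpt.

```python
def group_records(lines, parent_prefix="6 0", prefix_len=3):
    """Group lines into parent records with their trailing subrecords.

    parent_prefix is an assumption: the first three characters that mark
    a new record. Any other non-blank line is attached to the most
    recent parent.
    """
    records = []
    for raw in lines:
        line = raw.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines
        if line[:prefix_len] == parent_prefix:
            records.append({"parent": line, "children": []})
        elif records:
            records[-1]["children"].append(line)
        else:
            raise ValueError("subrecord found before any parent record")
    return records


# Shortened, made-up lines that only mimic the three prefixes in the sample.
sample = ["6 0AAAA", "6 1BBBB", "630CCCC", "630DDDD", "6 0EEEE", "630FFFF"]
grouped = group_records(sample)
```

Once the lines are grouped, each record type can be cut into fields with its own list of column positions, since parents and subrecords presumably have different layouts.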


The following link shows how to add multiple EntityRuler with spaCy. The code to do that is below:

import spacy
import pandas as pd
from spacy.pipeline import EntityRuler

nlp = spacy.load('en_core_web_sm', disable=['ner'])
ruler = nlp.add_pipe("entity_ruler")

flowers = ["rose", "tulip", "african daisy"]
for f in flowers:
    ruler.add_patterns([{"label": "flower", "pattern": f}])

animals = ["cat", "dog", ..


I have the following code and I get "TypeError: 'tuple' object is not callable" (in new_time), but I don't understand why. I wrote it based on this tutorial https://jalammar.github.io/a-visual-guide-to-using-bert-for-the-first-time/ and https://github.com/getalp/Flaubert My code:

#torch == 1.8.1
#numpy == 1.20.2
#pandas == 1.0.3
#transformers == 4.6.1
from transformers import logging
logging.set_verbosity_warning()
import numpy as np
import ..


I would like to multiply a column (or create a new one with the multiplied values) based on two conditions. So I tried:

c1 = df['Mean'] == 'SEPA' and df['Engagement'] == 'M'
c2 = df['Mean'] != 'SEPA' and df['Engagement'] == 'M'
df.loc[c1, ['Amount Eq Euro']] *= 62
df.loc[c2, ['Amount Eq Euro']] *= 18

Here is the dataframe: Mean ..
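The attempt above combines two boolean Series with Python's `and`, which pandas rejects because the truth value of a whole Series is ambiguous. A minimal sketch of the usual alternative is the element-wise `&` operator with each comparison in parentheses. The column names come from the question; the row values (including "CARD" and "Y") are made up for illustration, since the real dataframe is cut off.

```python
import pandas as pd

# Hypothetical rows; only the column names come from the question.
df = pd.DataFrame({
    "Mean": ["SEPA", "CARD", "SEPA", "CARD"],
    "Engagement": ["M", "M", "Y", "Y"],
    "Amount Eq Euro": [1.0, 2.0, 3.0, 4.0],
})

# Element-wise AND needs `&`, and each comparison needs its own
# parentheses because `&` binds more tightly than `==` and `!=`.
is_m = df["Engagement"] == "M"
c1 = (df["Mean"] == "SEPA") & is_m
c2 = (df["Mean"] != "SEPA") & is_m

# Both masks are computed before any value changes, so the two
# updates cannot affect each other.
df.loc[c1, "Amount Eq Euro"] *= 62
df.loc[c2, "Amount Eq Euro"] *= 18
```

To keep the original amounts, the same masks could be used to fill a new column instead of updating "Amount Eq Euro" in place.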


I have a dataframe in the format below.

Input: df.head(3)

groupId  Gourpname  totalItemslocations
7494732  A  {'code': 'DEHAM', 'position': {'lat': 53.551085, 'lon': 9.993682}}
7494733  B  {'code': 'DEHAM', 'position': {'lat': 53.551086, 'lon': 9.993687}}
7494734  A  {'code': 'DEHAM', 'position': {'lat': 53.552084, 'lon': 9.993682}}

Expected output:

groupId  Gourpname  totalItemslocations.code  totalItemslocations.position.lat  totalItemslocations.position.lon
7494732  A  DEHAM  53.551085  9.993682
7494733  B  DEHAM ..
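A minimal sketch of one way to get the expected layout, assuming the `totalItemslocations` column holds real Python dicts (if it holds strings, they would need parsing first): `pd.json_normalize` flattens the nested dicts into dotted column names, and the result is joined back onto the other columns. The rows are the three shown in the input.

```python
import pandas as pd

df = pd.DataFrame({
    "groupId": [7494732, 7494733, 7494734],
    "Gourpname": ["A", "B", "A"],
    "totalItemslocations": [
        {"code": "DEHAM", "position": {"lat": 53.551085, "lon": 9.993682}},
        {"code": "DEHAM", "position": {"lat": 53.551086, "lon": 9.993687}},
        {"code": "DEHAM", "position": {"lat": 53.552084, "lon": 9.993682}},
    ],
})

# Flatten the nested dicts; nested keys become 'position.lat', 'position.lon'.
flat = pd.json_normalize(df["totalItemslocations"].tolist())
flat = flat.add_prefix("totalItemslocations.")

# Align on the original index before concatenating column-wise.
flat.index = df.index
result = pd.concat([df.drop(columns="totalItemslocations"), flat], axis=1)
```

Resetting `flat.index` matters when `df` does not have a default 0..n-1 index; otherwise the concat would align rows incorrectly.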
