I am trying to find all the Nobel Prize winners who won more than once between 1901 and 2016. I tried the pandas
duplicated() method, but it returns each repeat winner once, except for one entity that shows up twice. I am looking for duplicates based on the
full_name column of the DataFrame. I have tried different combinations of parameters but got the same result. I know I can remove that one row manually, but what is going wrong here? My code is:
lucky_winners = df[df.duplicated(['full_name'])]
lucky_winners = df[df.duplicated(['full_name'], keep='first')]
lucky_winners = df[df.duplicated(['full_name'], keep='last')]
lucky_winners.full_name

62     Marie Curie, née Sklodowska
215    Comité international de la Croix Rouge (Intern...
340    Linus Carl Pauling
348    Comité international de la Croix Rouge (Intern...
424    John Bardeen
505    Frederick Sanger
523    Office of the United Nations High Commissioner...
The entity that appears twice is
Comité international de la Croix Rouge (International Committee of the Red Cross). I even compared the two entries for equality and got
True. I checked it using
lucky_winners.iloc[1].full_name == lucky_winners.iloc[3].full_name
I can’t figure out where the actual problem is.
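For reference, here is a minimal sketch that reproduces this behaviour on made-up data. The names and years are placeholders, not the real Nobel dataset. It suggests that duplicated(keep='first') flags every occurrence after the first, so a name that appears three times would come back twice. That would be consistent with the output above if the Red Cross has three rows in the data, though I have not confirmed this against the actual DataFrame.

```python
import pandas as pd

# Toy data (hypothetical, for illustration only):
# "Winner C" appears three times, "Winner A" twice, "Winner B" once.
df = pd.DataFrame({
    "year": [1903, 1911, 1917, 1944, 1954, 1963],
    "full_name": ["Winner A", "Winner A", "Winner C",
                  "Winner C", "Winner B", "Winner C"],
})

# keep='first' marks every occurrence after the first as a duplicate,
# so a name with three rows is flagged twice.
flagged = df[df.duplicated(["full_name"], keep="first")]
print(flagged["full_name"].tolist())

# One entry per repeat winner: take the unique names among the flagged rows.
repeat_names = flagged["full_name"].unique()
print(list(repeat_names))

# keep=False marks every row whose name occurs more than once,
# which is handy for inspecting all prizes of the repeat winners.
all_rows = df[df.duplicated(["full_name"], keep=False)]
print(all_rows)
```

If only the list of names is needed, calling unique() (or drop_duplicates('full_name')) on the flagged rows avoids removing anything by hand.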
Source: Python Questions