Highlight rows in Python

  excel, highlight, python

I have data in an excel file and I need a way to identify if there are some rows that are identicals.
My data are more than 1000 rows and I want to highlight the double rows in the excel document.

This is the code that I use to identify the double lines:


file = pd.read_excel('Path'.xlsx)

for i in range(1112):
    for y in range(1113):
        if file.iloc[i].equals(file.iloc[y]) & (i != y) & (i<y) :
            print(file.iloc[i])
            print(file.iloc[y])

file.to_excel(r'Path2', index = False)

The problem is that I also need to highligth the double row and I am not managing to do that.

This is a small example that I found on-line and I tried to change some parts in order to adapt the script to my needs. It does not work, but I do not know how to fix it.


import pandas as pd

def highlight_max(x):
    
    for i in range(3):
        for y in range(4):
            if ((x.iloc[i]) == (x.iloc[y])) & (i != y) & (i<y) :
                return ['background-color: yellow' if x.iloc[i] == x.iloc[y] else '' for a in x]


cars = {'Brand': ['Honda Civic','Toyota Corolla','Ford Focus','Audi A4','Toyota Corolla'],
        'Price': [22000,25000,27000,35000,25000], 'Year': [2000,2003,2005,2007,2003]}

df = pd.DataFrame(cars, columns = ['Brand', 'Price','Year'])

df.style.apply(highlight_max)

        
file.to_excel(r'path', index = False)

Source: Python Questions

LEAVE A COMMENT