python merge and unmatched records also need to exists

  dataframe, merge, pandas, python, python-2.7

I have two files

input.csv

11/13/2020 07:41:09 TREE count1: id1 green001
11/13/2020 07:43:09 TREE count1: id1 black001
11/13/2020 07:45:09 TREE count1: id2 black001
11/13/2020 07:45:09 PLAN count1: id3 green002
11/13/2020 07:45:09 PLAN count1: id4 green004

lookup.csv

ID,item,message
id1,item1,message 1
id2,item2,message 2
id3,item3,message 3

I am trying to merge both files and expected below output
Expected output:

Time,Type,counts,id,item,message,colour
11/13/2020 07:41:09,TREE,count1,id1,item1,message 1,green001
11/13/2020 07:43:09,TREE,count1,id1,item1,message 1,black001
11/13/2020 07:45:09,TREE,count1,id2,item2,message 2,black001
11/13/2020 07:45:09,PLAN,count1,id3,item3,message 3,green002
11/13/2020 07:45:19,PLAN,count1,id4,     ,         ,green004

I am able to achieve that able to merge when ID value present in lookup file.
Code:

import pandas as pd

# read input and remove spurious : at end of count
input = pd.read_csv("input.csv", sep=' ',
         names=["date","time", "tree","count","ID", "info"])
input["count"] = input["count"].apply(lambda s:s[:-1])

# read lookup and merge
lookup = pd.read_csv("lookup.csv")
merged = input.merge(lookup, on="ID")

# collapse time and date to single column
merged["time"] = merged["date"] + " " + merged["time"]
del merged["date"]

# output
print(merged)
merged.to_csv("testme.csv", index=False)

Code is working fine if all ID values in input.csv exists in lookup.csv file, but fails when ID value not exists in lookup.csv file

Any suggestions will be helpful.

Source: Python Questions

LEAVE A COMMENT