I wrote a program that processes CSV files in batches, extracts the 'Volume' value from each file, and then runs a calculation on those values. When I run it, I get a "ZeroDivisionError: division by zero" error.
I checked the program and found that the data is only added to the dataframe after the 'for name in files' loop has finished. The dataframe is empty before the 'Inday_Vol = Inday_Vol.append' line, so its Volume sum is 0 and the '/ Inday_Vol.Volume.sum() *100' part divides by zero. The dataframe should already contain data by the time the last statement is executed, but Python apparently doesn't see it that way.
I could initialize the dataframe with an arbitrary non-zero value at the start, which would not affect the accuracy of the result, but I don't think that is the best solution. Is there a better way to handle this kind of case?
Here is part of the code from my program:
import pandas as pd
import os

def Extract(filename):
    ...

Inday_List = []
Inday_Vol = pd.DataFrame(columns=('Code', 'Volume'))
path = r"i:datas"
for root, dirs, files in os.walk(path):
    for name in files:
        # Process all files and extract the volume data
        Volume = Extract(name)
        # Collect rows in a list instead of appending to the dataframe each time, to improve speed
        Inday_List.append([name[0:6], Volume])
Inday_Vol = Inday_Vol.append(pd.DataFrame(Inday_List, columns=['Code', 'Volume']))
Crowding_Factor = (Inday_Vol.sort_values(by="Volume", ascending=False).Volume
                   .iloc[0:int(len(Inday_Vol)/20)].sum() / Inday_Vol.Volume.sum() * 100)
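One possible alternative to seeding the dataframe with a dummy value is to build it once from the collected list and check the divisor explicitly before dividing. The sketch below is only an illustration under assumptions: the function name crowding_factor and its choice to return None for an empty or all-zero total are hypothetical, not part of the program above. It also builds the dataframe directly with pd.DataFrame instead of DataFrame.append, since append was removed in pandas 2.0. It does not explain why the dataframe ends up empty; it only makes that case visible instead of raising.

```python
import pandas as pd


def crowding_factor(rows):
    """Share (in percent) of total volume held by the top 1/20 of codes.

    `rows` is a list of [code, volume] pairs, as collected in the loop.
    Returns None when there is nothing to divide by (hypothetical choice).
    """
    # Build the dataframe once from the collected rows.
    frame = pd.DataFrame(rows, columns=["Code", "Volume"])

    # Guard the divisor: no rows, or every volume is zero.
    if frame.empty:
        return None
    total = frame["Volume"].sum()
    if total == 0:
        return None

    # Same selection as the original expression: the top len/20 rows by volume.
    top_n = len(frame) // 20
    top_sum = frame["Volume"].sort_values(ascending=False).iloc[:top_n].sum()
    return float(top_sum / total * 100)
```

A None result then signals that no volume data was collected (for example, because no files were found under the path), which the caller can report or skip. Note that with fewer than 20 rows the top slice is empty, so the result is 0.0.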
Source: Python Questions