Correct multiple values in CSV with Python

  csv, json, pandas, python

I have a script that loads several CSVs into different SQL Server tables depending on a string that is located in the filename.

As part of this process, I want to replace certain values that appear in different columns, within different CSVs since there are 100s. I’ve got this working with the use of a JSON file that acts as a config. I read the files in a For loop, and then use regex to match the string from the filename to an entry in JSON.

Here are my scripts:

# Open the most recent JSON file
with open('C:UsersMEDesktopPythonlookup.json') as f:
    translation_table = json.load(f)

for filename in gl.glob("Concat_*.csv"):
    df = pd.read_csv(filename, dtype=object)

    filekey = re.match(r"^Concat_([A-Z0-9-]+_[0-9A-z]+)_[0-9]{8}.csv$", filename).group(1)

    # Use the translation tables to apply any error fixes
    if filekey in translation_table["error_lookup"]:
        tablename = translation_table["error_lookup"][filekey]
        df[tablename[0]] = df[tablename[0]].replace({tablename[1]: tablename[2]})

    # Save New file
    outfile = ('Fixed_{}_' + todaysdate + '.csv').format(filekey)
    df.to_csv(outfile, index=False)
    print('Saved new file: ' + outfile)


    "error_lookup": {
        "A450_22": ["NAME", "XXX", -5555],
        "B123_01": ["CODE", "700'", -5555],
        "D600_09": ["COSTPRICE", "--", -5555],

My question is what is the most pythonic way to correct multiple values for different columns in each CSV? Currently I’m limited at just fixing one value, per column with the above code although I’d need to have more coverage, otherwise my CSVs won’t load.

I did try the following in my JSON file, although I receieved the below error:

"D600_09": [["COSTPRICE","RUNPRICE","TOTALPRICE"], "--", -5555]

Error encountered:

df[tablename[0]] = df[tablename[0]].replace({tablename[1]: tablename[2]})
TypeError: unhashable type: 'list'

Source: Python Questions