How can I split a histogram into multiple plots based on two columns?

  dataframe, histogram, pandas, plot, python

I am having trouble with subplots in pandas. My goal is to create a number of histograms from some accuracy data I have collected. I would like to plot these histograms on separate axes based on their ‘id’ and ‘barcode’. I have 2 barcodes and 5 id variants in total, so I would like to plot 10 histograms. The data looks something like this, except there are around a million rows.

import pandas as pd
df = pd.DataFrame({'id': ['a', 'b', 'c'], 'barcode': ['2', '3', '4'], 'accuracy': ['99', '98', '99']})

I tried:

fig, ax = plt.subplots(5, 2, sharey='none', sharex='none', figsize=(20, 20))
df.groupby('id')['accuracy'].hist(by=df['barcode'])

This only splits the histograms by barcode so I tried:

df['accuracy'].hist(by=df['id', 'barcode'])

This raised a KeyError for ‘[‘id’, ‘barcode’]’.

Perhaps there is a better module for this than pandas?

Any help would be greatly appreciated.
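
One possible approach, offered as a minimal sketch rather than a definitive answer: group by both columns at once with `groupby(['id', 'barcode'])` and draw each group on its own matplotlib axis. The data below is synthetic and only assumes the shape described in the question (5 ids, 2 barcodes, numeric accuracy); the bin count and the title format are arbitrary choices.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic stand-in for the real data (assumption): 5 ids x 2 barcodes,
# with accuracy stored as numbers rather than strings.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "id": np.tile(list("abcde"), 200),
    "barcode": np.repeat(["2", "3"], 500),
    "accuracy": rng.uniform(90, 100, size=1000),
})

# One axis per (id, barcode) pair: 5 rows of ids, 2 columns of barcodes.
fig, axes = plt.subplots(5, 2, figsize=(20, 20))

# Grouping by a list of columns yields one group per combination,
# with the group key returned as an (id, barcode) tuple.
groups = df.groupby(["id", "barcode"])["accuracy"]
for ax, ((id_, barcode), values) in zip(axes.ravel(), groups):
    ax.hist(values, bins=50)
    ax.set_title(f"id={id_}, barcode={barcode}")

fig.tight_layout()
# Call plt.show() or fig.savefig(...) to view the result.
```

Because the group keys come back sorted, each row of the grid holds one id and each column one barcode. If accuracy is stored as strings, as in the sample above, it would likely need converting first, for example with `pd.to_numeric`. pandas' own `hist` passes `by` on to a groupby, so `by=['id', 'barcode']` (a list of column names, not `df['id', 'barcode']`) may also work, though I have not checked that against every pandas version.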

Source: Python Questions
