pyspark matplotlib integration with Zeppelin

  apache-zeppelin, matplotlib, pyspark, python

I'm trying to draw a histogram using PySpark in a Zeppelin notebook. Here is what I have tried so far:

%pyspark

import matplotlib.pyplot as plt
import pandas
...
pdf = dateDF.toPandas()  # convert once instead of once per column
x = pdf["year(CAST(_c0 AS DATE))"].values.tolist()
y = pdf["count(year(CAST(_c0 AS DATE)))"].values.tolist()
plt.plot(x,y)
plt.show()

This code runs without errors, but it does not display the expected plot. So I searched and found this documentation:
[image: screenshot of the documentation]

Following it, I tried to enable the angular flag as follows:

pdf = dateDF.toPandas()  # convert once instead of once per column
x = pdf["year(CAST(_c0 AS DATE))"].values.tolist()
y = pdf["count(year(CAST(_c0 AS DATE)))"].values.tolist()
plt.close()
z.configure_mpl(angular=True,close=False)
plt.plot(x,y)
plt.show()

But now I get the error No module named 'mpl_config', and I have no idea how to enable the angular flag without it. Any suggestions on how to resolve this would be greatly appreciated.
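
One possible workaround, sketched under assumptions and not confirmed for this setup, is to skip `z.configure_mpl` entirely: render the figure to a PNG in memory and print it through Zeppelin's `%html` display output. The helper name `figure_to_html_img` and the `dpi` default are hypothetical, and the sketch assumes the interpreter can import Matplotlib with the headless `Agg` backend.

```python
import base64
import io


def figure_to_html_img(fig, dpi=100):
    """Render a figure to an inline HTML <img> tag (hypothetical helper).

    `fig` is any object with a savefig(file_obj, format=..., dpi=...) method,
    such as the Matplotlib figure returned by plt.gcf().
    """
    buf = io.BytesIO()
    fig.savefig(buf, format="png", dpi=dpi)  # write PNG bytes into memory
    encoded = base64.b64encode(buf.getvalue()).decode("ascii")
    return '<img src="data:image/png;base64,{}"/>'.format(encoded)


# Assumed use inside a %pyspark paragraph (x and y as in the code above):
#   import matplotlib
#   matplotlib.use("Agg")            # headless backend, set before importing pyplot
#   import matplotlib.pyplot as plt
#   plt.plot(x, y)
#   print("%html " + figure_to_html_img(plt.gcf()))
#   plt.close()
```

This does not enable angular mode; it only sidesteps the missing mpl_config module by embedding a static image in the paragraph output.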

Source: Python Questions
