I have a large pandas DataFrame with roughly 100 columns and want to plot all of the time series in one graph. Is there an easy way to do this without specifying every y-axis column manually?
Here is a simple example with three time series: 02K W, 03K W, and 04K W:
import pandas as pd
import matplotlib.pyplot as plt
df1 = pd.DataFrame({
'Date':['2021-01-01', '2021-01-02', '2021-01-03', '2021-01-04', '2021-01-05'],
'index':[0, 1, 2, 3, 4],
'02K W':[3.5, 0.1, 3, 'nan', 0.2],
'03K W':[4.2, 5.2, 2.5, 3.0, 0.6],
'04K W':[1.5, 2.6, 8.2, 4.2, 5.3]})
df1['Date'] = pd.to_datetime(df1['Date'])
df1 = df1.set_index('index')
So far, I specify each y-axis column manually to plot the individual time series:
plt.plot(df1['Date'], df1['02K W'])
plt.plot(df1['Date'], df1['03K W'])
plt.plot(df1['Date'], df1['04K W'])
Is there a more elegant way to specify the relevant columns for the plot? Thank you very much for your suggestions :)
Is there a more elegant way to specify the relevant columns for the plot?
Use DataFrame.plot with Date as the index and filter by the desired columns:
columns = ['02K W', '03K W', '04K W']
df1.set_index('Date')[columns].plot()
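With around 100 columns, typing out the list by hand is impractical. The following is a minimal sketch of two ways to avoid it, assuming every column other than Date is a numeric series you want plotted. The regex in the second option is only a guess at how the real column names look, and the sample frame is shortened and cleaned for illustration.

```python
import pandas as pd

# Shortened version of the question's frame, with numeric values only.
df1 = pd.DataFrame({
    'Date': pd.to_datetime(['2021-01-01', '2021-01-02', '2021-01-03']),
    'index': [0, 1, 2],
    '02K W': [3.5, 0.1, 3.0],
    '03K W': [4.2, 5.2, 2.5],
    '04K W': [1.5, 2.6, 8.2],
}).set_index('index')

# Option 1: with Date as the index, plot() draws one line per remaining column.
ax = df1.set_index('Date').plot()

# Option 2: select columns by a name pattern instead of listing them.
# The pattern is an assumption about the naming scheme; adjust as needed.
columns = df1.filter(regex=r'^\d+K W$').columns
ax2 = df1.set_index('Date')[columns].plot()
```

For many series, passing legend=False to plot() may keep the figure readable.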
Note that you have a string 'nan' in your sample data. If this is true in your real data, you should convert it to a real np.nan, e.g., with pd.to_numeric or DataFrame.replace.
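As a rough sketch of that conversion, the two options could look like this. The frame below reuses two columns from the question; the variable names converted and replaced are only illustrative.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({
    '02K W': [3.5, 0.1, 3, 'nan', 0.2],
    '03K W': [4.2, 5.2, 2.5, 3.0, 0.6],
})

# The string 'nan' forces the whole column to dtype object.
print(df1['02K W'].dtype)  # object

# Option 1: coerce the column to numeric; values that cannot be parsed become NaN.
converted = pd.to_numeric(df1['02K W'], errors='coerce')

# Option 2: replace the placeholder string with a real NaN, then cast to float.
replaced = df1.replace('nan', np.nan).astype(float)
```

To apply the first option to every column at once, df1.apply(pd.to_numeric, errors='coerce') should work, assuming all columns are meant to be numeric.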
import pandas as pd
import matplotlib.pyplot as plt
df1 = pd.DataFrame({
'Date':['2021-01-01', '2021-01-02', '2021-01-03', '2021-01-04', '2021-01-05'],
'index':[0, 1, 2, 3, 4],
'02K W':[3.5, 0.1, 3, 'nan', 0.2],
'03K W':[4.2, 5.2, 2.5, 3.0, 0.6],
'04K W':[1.5, 2.6, 8.2, 4.2, 5.3]})
df1['Date'] = pd.to_datetime(df1['Date'])
df1 = df1.set_index('index')
# Loop over every column except the first one ('Date') and plot it against Date
for col in df1.columns[1:]:
    plt.plot(df1['Date'], df1[col])