Pandas DataFrame: converting Month to datetime and iteratively selecting data from multiple columns to plot

Question

Say I have a pandas DataFrame with the format:

     Month Thing1 Thing2       Tot
0   Jan-12      A      Z  0.005880
1   Jan-12      A      Z  0.024500
...
20  Jan-12      B      Y  0.001533
21  Jan-12      C      X  0.003892
22  Jan-12      C      X  0.001680
23  Jan-12      C      X  0.001680
24  Jan-12      C      X  0.001680
25  Jan-12      C      X  0.001680
26  Jan-12      A      W  0.001680
27  Jan-12      D      V  0.013440
28  Jan-12      E      U  0.001680
...

The Month column goes until Apr-14. I am trying to plot line graphs of the monthly totals for each item in Thing1 and Thing2.

I am attempting this using groupby:

import pandas as pd

a = pd.read_csv('all2.csv')
sums = a.groupby([u'Month', u'Thing1', u'Thing2']).sum()

which gives me:

Apr-12 A      W         6.427773
              Z         4.347471
       B      T         7.062425
              Y        17.183562
       C      X        14.583337
       D      V         0.114450
       E      U         0.008050
       F      Q         0.000490
              R         0.004468
       G      P         0.010932
       ...

However, the months come up in alphabetical order. My questions are:

How can I get pandas to treat the Month column as datetime objects?

How can I iterate through the Thing1 column and plot a time series of monthly totals for each item in Thing2?

I imagine there is a way to reorganise the DataFrame such that a simple call to plot() will do the job?

Answer 1

This is because your 'Month' column does not have the right dtype. You can get the intended result by first converting the Month column to datetime:

df['Month'] = pd.to_datetime(df.Month)

before calling df.groupby([u'Month', u'Thing1', u'Thing2']).sum().

But be careful: pandas doesn't know whether Jan-12 means 2014-01-12 or 2012-01, and by default it converts your data to the former. To get the latter, pass the format='%b-%y' argument to pd.to_datetime.
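
As a minimal sketch of that conversion, the snippet below uses a few made-up rows in place of all2.csv (the values are hypothetical, not taken from the question's data). Note that %b matches abbreviated month names in the current locale, so this assumes an English locale.

```python
import pandas as pd

# Hypothetical sample rows standing in for all2.csv
df = pd.DataFrame({
    'Month':  ['Jan-12', 'Apr-12', 'Feb-12', 'Jan-12'],
    'Thing1': ['A', 'A', 'A', 'B'],
    'Thing2': ['Z', 'Z', 'Z', 'Y'],
    'Tot':    [0.5, 1.0, 2.0, 4.0],
})

# %b = abbreviated month name, %y = two-digit year, so 'Jan-12' -> 2012-01-01
df['Month'] = pd.to_datetime(df['Month'], format='%b-%y')

# With a datetime key, groupby sorts the months chronologically
sums = df.groupby(['Month', 'Thing1', 'Thing2']).sum()
months = list(sums.index.get_level_values('Month').unique())
```

Here months comes out in calendar order (January, February, April) rather than alphabetical order.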

For your second question, you can get the values of the Thing1 level with dfgb.index.get_level_values(1), where dfgb is the DataFrame returned by groupby. Then you can plot the time series with:

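# .unique() avoids plotting the same Thing1 item once per row
for item in dfgb.index.get_level_values(1).unique():
    # rows for one Thing1 item; the remaining index is (Month, Thing2)
    dfgb.xs(item, level=1).plot(kind='bar')  # for a bar graph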
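
On the question of reorganising the DataFrame so that a plain plot() call works, one possible approach (a sketch, not part of the original answer) is to unstack the Thing2 level so each Thing2 item becomes its own column. The sample rows and the tables dictionary below are hypothetical.

```python
import pandas as pd

# Hypothetical sample rows standing in for all2.csv
df = pd.DataFrame({
    'Month':  ['Jan-12', 'Jan-12', 'Jan-12', 'Feb-12', 'Feb-12'],
    'Thing1': ['A', 'A', 'A', 'A', 'B'],
    'Thing2': ['Z', 'Z', 'W', 'Z', 'Y'],
    'Tot':    [1.0, 2.0, 0.5, 4.0, 8.0],
})
df['Month'] = pd.to_datetime(df['Month'], format='%b-%y')
dfgb = df.groupby(['Month', 'Thing1', 'Thing2']).sum()

# One table per Thing1 item: rows are months, columns are Thing2 items
tables = {}
for item in dfgb.index.get_level_values('Thing1').unique():
    tables[item] = dfgb.xs(item, level='Thing1')['Tot'].unstack('Thing2')
```

With matplotlib installed, tables['A'].plot() should then draw one line per Thing2 item against the months; combinations with no data show up as NaN gaps.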

Answer 1, 2014-04-17 14:50:17
