Pandas: Edit index values and re-groupby, according to new values

I have my index set to 'ShiftId', which looks like this: 201912240 (the date followed by a 0 or 1 that indicates day or night shift). My df is grouped by the index values and returns, as expected, something like this:

           col1 col2
201912240  NaN  23
201912241  44   75
201912250  12   NaN
201912251  46   91

I want to regroup this dataframe to take the mean of each day (ignoring NaN values), so that it looks like this:

           col1 col2
20191224   44   49
20191225   29   91 

But I can't get the current index values to be grouped. I have tried:

    days_frame.index = days_frame.index.map(lambda x: str(x)[:-1])
    days_frame.groupby(days_frame.index).mean()

But this doesn't seem to change anything in the df.

Please help.

Your solution works for me. groupby(...).mean() returns a new DataFrame instead of modifying days_frame in place, so maybe you forgot to assign the output to a variable, like df here:

days_frame.index = days_frame.index.map(lambda x: str(x)[:-1])
df = days_frame.groupby(days_frame.index).mean()
print (df)
          col1  col2
20191224  44.0  49.0
20191225  29.0  91.0
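For readers who want to reproduce this, here is a minimal, self-contained sketch. It assumes the frame from the question can be rebuilt by hand as below; the names day_key and daily are only illustrative and do not come from the original post. It groups by a truncated key without touching the original index, and shows that the result has to be captured in a new variable.

```python
import numpy as np
import pandas as pd

# Rebuild the example frame from the question (values copied from the post).
days_frame = pd.DataFrame(
    {"col1": [np.nan, 44, 12, 46], "col2": [23, 75, np.nan, 91]},
    index=pd.Index([201912240, 201912241, 201912250, 201912251], name="ShiftId"),
)

# Drop the trailing shift digit to get one key per day (as strings).
day_key = days_frame.index.map(lambda x: str(x)[:-1])

# groupby(...).mean() returns a NEW frame; NaN values are skipped by default.
daily = days_frame.groupby(day_key).mean()
print(daily)

# The original frame is unchanged unless the result is assigned back.
print(len(days_frame))
```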

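Another solution is to rename the index first and then take the mean per index value:

# On pandas < 2.0 this could also be written as .mean(level=0);
# that argument was removed in pandas 2.0.
df = days_frame.rename(lambda x: str(x)[:-1]).groupby(level=0).mean()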
print (df)
          col1  col2
20191224  44.0  49.0
20191225  29.0  91.0

Or convert the index to strings, remove the last character and pass the result to groupby with the mean aggregation:

df = days_frame.groupby(days_frame.index.astype(str).str[:-1]).mean()
print (df)
          col1  col2
20191224  44.0  49.0
20191225  29.0  91.0
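If the index holds integers, as it appears to in the question, the shift digit can also be dropped arithmetically. This is an alternative that is not part of the original answer: a sketch assuming an integer ShiftId index, with daily and daily_dt as illustrative names. Integer division by 10 keeps the keys numeric, and they can optionally be parsed into real dates afterwards.

```python
import numpy as np
import pandas as pd

# Same example frame as in the question, with an integer index.
days_frame = pd.DataFrame(
    {"col1": [np.nan, 44, 12, 46], "col2": [23, 75, np.nan, 91]},
    index=pd.Index([201912240, 201912241, 201912250, 201912251], name="ShiftId"),
)

# 201912240 // 10 == 20191224, so both shifts of a day share one key.
daily = days_frame.groupby(days_frame.index // 10).mean()
print(daily)

# Optional: turn the YYYYMMDD keys into a DatetimeIndex.
daily_dt = daily.copy()
daily_dt.index = pd.to_datetime(daily_dt.index.astype(str), format="%Y%m%d")
print(daily_dt)
```

Keeping the keys numeric or converting them to dates makes later sorting and date-based selection simpler than with string keys.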

EDIT:

If you want to leave column A untouched and format the numeric values in all other columns without decimals, use this solution before writing to a file:

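import pandas as pd

df = pd.DataFrame({'A':[.41,1.5,.2,2,.3],
                   'B':['a'] * 5,
                   'C':[3,4,5,4,5],
                   'D':[1.0,3,4,5,6]})

# all columns except A
cols = df.columns.difference(['A'])

# format numbers as strings without decimals, leave other values as they are
# (applymap is deprecated since pandas 2.1 in favor of DataFrame.map)
df[cols] = df[cols].applymap(lambda x: '%.0f' % x if isinstance(x, (float, int)) else x)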
print (df)

      A  B  C  D
0  0.41  a  3  1
1  1.50  a  4  3
2  0.20  a  5  4
3  2.00  a  4  5
4  0.30  a  5  6
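Since the point of the formatting is the written file, here is a rough sketch of that last step. It is not from the original answer: it swaps applymap for a per-column Series.map so it does not depend on the pandas version, uses numbers.Real for the numeric check, and writes to an in-memory buffer instead of a real path. The names fmt, buf and csv_text are hypothetical.

```python
import io
import numbers

import pandas as pd

df = pd.DataFrame({'A': [.41, 1.5, .2, 2, .3],
                   'B': ['a'] * 5,
                   'C': [3, 4, 5, 4, 5],
                   'D': [1.0, 3, 4, 5, 6]})

def fmt(x):
    # Numbers become strings without decimals; everything else is kept.
    return '%.0f' % x if isinstance(x, numbers.Real) else x

# Format every column except A, one column at a time.
for col in df.columns.difference(['A']):
    df[col] = df[col].map(fmt)

# Write to an in-memory buffer; a file path would work the same way.
buf = io.StringIO()
df.to_csv(buf, index=False)
csv_text = buf.getvalue()
print(csv_text)
```

Note that the formatted columns now hold strings, so this step belongs right before export rather than before further numeric work.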
