pandas group and count
I need help assigning a per-ID date index (DateCount) and adding an alternate naming column (Alias). I have something like this:
df
ID Date Name
111 1/1/17 Abc
111 1/3/17 xyz
111 1/2/17 ADC
222 1/5/17 ABC
222 1/6/17 XYZ
333 1/10/17 ijk
The ideal result would be:
ID Date DateCount Name Alias
111 1/1/17 1 Abc Adam
111 1/3/17 3 xyz X
111 1/2/17 2 ADC Adam
222 1/5/17 1 ABC Adam
222 1/6/17 2 XYZ X
333 1/10/17 1 ijk Others
For the DateCount column, I know I have to group by ID and sort by date, but I'm not sure how to assign the index. As for the Alias column, I'm wondering whether there's a way to assign values by grouping.
Thanks!
IIUC:
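d = {'X': 'X', 'A': 'Adam'}  # upper-cased first letter of Name -> Alias
df['Date'] = pd.to_datetime(df['Date'])  # parse the strings so the sort is chronological, not lexical
df['Datecount'] = df.sort_values('Date').groupby('ID').cumcount().add(1)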
df
Out[324]:
ID Date Name Datecount
0 111 2017-01-01 Abc 1
1 111 2017-01-03 xyz 3
2 111 2017-01-02 ADC 2
3 222 2017-01-05 ABC 1
4 222 2017-01-06 XYZ 2
5 333 2017-01-10 ijk 1
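In case it helps to see why the one-liner above lands on the right rows: `cumcount()` returns a Series that keeps the original row labels, so assigning it back aligns by index even though it was computed on the sorted frame. Below is a minimal, self-contained sketch on the question's data. The explicit `format` string and the `DateRank` column are my own additions for illustration, not part of the answer.

```python
import pandas as pd

# Sample frame copied from the question.
df = pd.DataFrame({
    "ID": [111, 111, 111, 222, 222, 333],
    "Date": ["1/1/17", "1/3/17", "1/2/17", "1/5/17", "1/6/17", "1/10/17"],
    "Name": ["Abc", "xyz", "ADC", "ABC", "XYZ", "ijk"],
})

# Assumption: the dates are month/day/two-digit-year strings.
df["Date"] = pd.to_datetime(df["Date"], format="%m/%d/%y")

# Number the rows within each ID in date order. The result is indexed by
# the original row labels, so it aligns back onto the unsorted frame.
df["DateCount"] = df.sort_values("Date").groupby("ID").cumcount().add(1)

# Alternative for comparison (hypothetical column name): rank the dates
# within each ID. Unlike cumcount, equal dates would share a rank.
df["DateRank"] = df.groupby("ID")["Date"].rank(method="dense").astype(int)

print(df)
```

On this data both columns agree because no ID has a repeated date; which one you want depends on how ties should be numbered.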
df['Alias']=df.Name.str[0].str.upper().map(d).fillna('Other')
df
Out[329]:
ID Date Name Datecount Alias
0 111 2017-01-01 Abc 1 Adam
1 111 2017-01-03 xyz 3 X
2 111 2017-01-02 ADC 2 Adam
3 222 2017-01-05 ABC 1 Adam
4 222 2017-01-06 XYZ 2 X
5 333 2017-01-10 ijk 1 Other
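The Alias step above rests on a guess: that the alias is decided by the first letter of Name, ignoring case. The question never states the rule, so treat the lookup table as an assumption. A small sketch of the same chain, split up so the intermediate NaN step is visible (the variable names here are mine):

```python
import pandas as pd

names = pd.Series(["Abc", "xyz", "ADC", "ABC", "XYZ", "ijk"], name="Name")

# Assumed rule, inferred from the desired output in the question:
# upper-cased first letter -> alias.
first_letter_to_alias = {"A": "Adam", "X": "X"}

first_letters = names.str[0].str.upper()          # 'A', 'X', 'A', 'A', 'X', 'I'
mapped = first_letters.map(first_letter_to_alias)  # letters not in the dict become NaN
alias = mapped.fillna("Others")                    # fallback label as spelled in the question

print(alias.tolist())
```

If the real rule depends on the whole name rather than the first letter, the same `map` + `fillna` pattern should still work with a dictionary keyed on `names.str.upper()` instead.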
df = pd.DataFrame({'ID': [111,111,111], 'Date': ['2007-01-01', '2017-01-03', '2007-01-02'],'Name':['Abc','xyz','rst']})
df
Date ID Name
0 2007-01-01 111 Abc
1 2017-01-03 111 xyz
2 2007-01-02 111 rst
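idx = 1                                # position for the DateCount column
cols = [1, 1, 1]                       # hard-coded placeholder values
idx2 = 4                               # position for the Alias column
colAlias = ['Adam', 'x', 'Adam']       # hard-coded placeholder values
df.insert(loc=idx, column='DateCount', value=cols)
df.insert(loc=idx2, column='Alias', value=colAlias)
df
Date DateCount ID Name Alias
0 2007-01-01 1 111 Abc Adam
1 2017-01-03 1 111 xyz x
2 2007-01-02 1 111 rst Adam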
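Note that `insert` only controls where a column goes; the values above are typed in by hand. If the column position matters, one option is to compute the values first and then insert them. A rough sketch, assuming the first three rows of the question's data and the same first-letter rule for Alias (both are my assumptions, not something this answer states):

```python
import pandas as pd

df = pd.DataFrame({
    "ID": [111, 111, 111],
    "Date": pd.to_datetime(["2017-01-01", "2017-01-03", "2017-01-02"]),
    "Name": ["Abc", "xyz", "ADC"],
})

# Compute the values instead of hard-coding them.
date_count = df.sort_values("Date").groupby("ID").cumcount().add(1)
alias = df["Name"].str[0].str.upper().map({"A": "Adam", "X": "X"}).fillna("Others")

# insert() modifies df in place and aligns a Series value on the index;
# loc must satisfy 0 <= loc <= len(df.columns).
df.insert(loc=2, column="DateCount", value=date_count)
df.insert(loc=len(df.columns), column="Alias", value=alias)

print(df.columns.tolist())
```

This gives the column order from the question (ID, Date, DateCount, Name, Alias) without a separate reordering step.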