Python: In a DataFrame, how do I find the year in which strings from one column appear in another column?
I have a DataFrame and want to loop through all the strings in column c2. For each string, I want to print the string, the year in which it appears in c2, and the first year in which it appears in column c1, if it appears in c1 at all. I then want to put the difference between the two years in another column. There are NaN values in c2.
Example df:
id  year  c1              c2
0   1999  luke skywalker  han solo
1   2000  leia organa     r2d2
2   2001  han solo        finn
3   2002  r2d2            NaN
4   2004  finn            c3po
5   2002  finn            NaN
6   2005  c3po            NaN
Example printed result:
c2        year in c2  year in c1  delta
han solo  1999        2001        2
r2d2      2000        2002        2
finn      2001        2004        3
c3po      2004        2005        1
I'm using Jupyter Notebook with Python and pandas. Thanks!
You can do it in steps like this:
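# Keep only the rows that have a value in c2
df1 = df[df.c2.notnull()].copy()
# For each c1 value, take the year of its first row (in row order)
s = df.groupby('c1')['year'].first()
# Look up each c2 string in that c1 -> year mapping
df1['year in c1'] = df1.c2.map(s)
df1 = df1.rename(columns={'year':'year in c2'})
df1['delta'] = df1['year in c1'] - df1['year in c2']
print(df1[['c2','year in c2','year in c1', 'delta']])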
Output:
         c2  year in c2  year in c1  delta
0  han solo        1999        2001      2
1      r2d2        2000        2002      2
2      finn        2001        2004      3
4      c3po        2004        2005      1
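To try the steps in this answer outside the original notebook, here is a minimal self-contained sketch. It rebuilds the example DataFrame from the question by hand and wraps the same steps in a function; the function name `first_appearance_delta` is a hypothetical helper introduced only for illustration, not part of the original answer.

```python
import numpy as np
import pandas as pd

# Example data copied from the question; NaN marks rows without a c2 value.
df = pd.DataFrame({
    "id": [0, 1, 2, 3, 4, 5, 6],
    "year": [1999, 2000, 2001, 2002, 2004, 2002, 2005],
    "c1": ["luke skywalker", "leia organa", "han solo", "r2d2",
           "finn", "finn", "c3po"],
    "c2": ["han solo", "r2d2", "finn", np.nan, "c3po", np.nan, np.nan],
})


def first_appearance_delta(frame):
    """Hypothetical helper that applies the steps from the answer above."""
    # Keep only the rows that have a value in c2.
    out = frame[frame["c2"].notnull()].copy()
    # Year of the first row (in row order) for each c1 value.
    first_year = frame.groupby("c1")["year"].first()
    # Look up each c2 string in the c1 -> year mapping.
    out["year in c1"] = out["c2"].map(first_year)
    out = out.rename(columns={"year": "year in c2"})
    out["delta"] = out["year in c1"] - out["year in c2"]
    return out[["c2", "year in c2", "year in c1", "delta"]]


result = first_appearance_delta(df)
print(result)
```

With this data every c2 string also occurs in c1, so no lookup comes back empty. If a c2 string never appears in c1, `map` leaves NaN in "year in c1" and in "delta" for that row.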
Here is one way.
# Map each c2 string to the year of its first row in c1 (0 if it never appears)
df['year_c1'] = df['c2'].map(df.groupby('c1')['year'].agg('first'))\
                        .fillna(0).astype(int)
df = df.rename(columns={'year': 'year_c2'})
df['delta'] = df['year_c1'] - df['year_c2']
# Drop the rows where c2 is null and put the columns in order
df = df.loc[df['c2'].notnull(), ['id', 'year_c2', 'year_c1', 'delta']]
#    id  year_c2  year_c1  delta
# 0   0     1999     2001      2
# 1   1     2000     2002      2
# 2   2     2001     2004      3
# 4   4     2004     2005      1
Explanation
Create a mapping from c1 to year, aggregating by "first".
Use this mapping on c2 to calculate year_c1.
Calculate delta as the difference between year_c2 and year_c1.
Remove rows with null in c2 and order columns.
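The mapping step can be looked at on its own. The sketch below uses a small made-up frame (not the question's data) to show two things that follow from how pandas behaves: "first" takes the first row in row order, which is not necessarily the earliest year, and `map` returns NaN for strings that have no match. Casting to the nullable "Int64" dtype is shown as one possible alternative to `fillna(0)`; it is an assumption of this sketch, not something the answer uses.

```python
import numpy as np
import pandas as pd

# Small illustrative frame (made up for this sketch).
df = pd.DataFrame({
    "year": [2001, 2004, 2002, 2005],
    "c1": ["han solo", "finn", "finn", "c3po"],
    "c2": ["finn", "c3po", np.nan, "luke skywalker"],
})

# Step 1: mapping from each c1 value to a year.
# "first" follows row order, so finn maps to 2004 here, not 2002.
first_seen = df.groupby("c1")["year"].agg("first")
# If the earliest year is wanted regardless of row order, min() gives that.
earliest = df.groupby("c1")["year"].min()

# Step 2: look up each c2 string; NaN and unmatched strings come back as NaN.
year_c1 = df["c2"].map(first_seen)

# Alternative to fillna(0): a nullable integer dtype keeps missing values
# as <NA> instead of replacing them with a sentinel year of 0.
year_c1_nullable = year_c1.astype("Int64")

print(first_seen)
print(earliest)
print(year_c1_nullable)
```

In the question's expected output, finn maps to 2004 even though a later row has finn in 2002, so row-order "first" matches what was asked. Sorting by year first, or using `min`, would change that result.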