
pandas dataframe column based on row and multiple columns
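I have the following DataFrame and want to add a new column named open_next_year.
The column is filled by matching on two columns: fiscalYear + 1 and ticker. The value is then taken from the open column of the matching row.
Original DataFrame: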
fiscalYear ticker open
2017 FINL 17.4880
2017 AAPL 17.4880
...
2016 FINL 16.4880
2016 AAPL 16.4880
2015 FINL 15.4880
2015 AAPL 15.4880
Desired DataFrame:
fiscalYear ticker open open_next_year
2017 FINL 17.4880
2017 AAPL 17.4880
2016 FINL 16.4880 17.4880
2016 AAPL 16.4880 17.4880
2015 FINL 15.4880 16.4880
2015 AAPL 15.4880 16.4880
Is there a way to do this in pandas?
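I believe you need to shift all values within each group using DataFrameGroupBy.shift. This works here because the rows of each ticker are ordered by fiscalYear descending with no missing years, so the next year's row sits directly above the current one: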
df['open_next_year'] = df.groupby('ticker')['open'].shift()
print (df)
fiscalYear ticker open open_next_year
0 2017 FINL 17.488 NaN
1 2017 AAPL 17.488 NaN
2 2016 FINL 16.488 17.488
3 2016 AAPL 16.488 17.488
4 2015 FINL 15.488 16.488
5 2015 AAPL 15.488 16.488
The sample changed so that the open values are unique:
print (df)
fiscalYear ticker open
0 2017 FINL 17.4881
1 2017 AAPL 17.4882
2 2016 FINL 16.4883
3 2016 AAPL 16.4884
4 2015 FINL 15.4885
5 2015 AAPL 15.4886
df['open_next_year'] = df.groupby('ticker')['open'].shift()
print (df)
fiscalYear ticker open open_next_year
0 2017 FINL 17.4881 NaN
1 2017 AAPL 17.4882 NaN
2 2016 FINL 16.4883 17.4881
3 2016 AAPL 16.4884 17.4882
4 2015 FINL 15.4885 16.4883
5 2015 AAPL 15.4886 16.4884
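Since shift() only looks at row position, the result depends on how the rows are ordered. If the input might not be sorted, a minimal sketch (assuming each ticker has exactly one row per year and no gaps between years) is to sort before shifting and let index alignment put the values back. The shuffled sample below and the name `ordered` are made up for illustration:

```python
import pandas as pd

# Same sample values as above, deliberately shuffled
df = pd.DataFrame({
    "fiscalYear": [2015, 2017, 2016, 2016, 2015, 2017],
    "ticker": ["FINL", "AAPL", "FINL", "AAPL", "AAPL", "FINL"],
    "open": [15.4885, 17.4882, 16.4883, 16.4884, 15.4886, 17.4881],
})

# Within each ticker, put the latest year first so that the row
# directly above any row belongs to the following fiscal year.
ordered = df.sort_values(["ticker", "fiscalYear"], ascending=[True, False])

# shift() moves each group's values down by one row. Assigning the
# result back aligns on the index, so df keeps its original row order.
df["open_next_year"] = ordered.groupby("ticker")["open"].shift()
print(df)
```

If a ticker can skip a year, shift() would silently pick up the open value of a later year, so a key-based lookup is the safer choice in that case.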
Here is another approach that builds a lookup map first.
m = dict(zip(tuple(zip(df.fiscalYear - 1, df.ticker)),df.open))
df['open_next_year'] = df[['fiscalYear','ticker']].apply(tuple, 1).map(m)
The map/dictionary looks like this. It is built by zipping together fiscalYear - 1, ticker, and open:
{(2014, 'AAPL'): 15.488,
(2014, 'FINL'): 15.488,
(2015, 'AAPL'): 16.488,
(2015, 'FINL'): 16.488,
(2016, 'AAPL'): 17.488,
(2016, 'FINL'): 17.488}
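To see why keying on fiscalYear - 1 gives the next year's value, here is a small plain-Python sketch of the same idea. The `rows` list and the name `lookup` are just for illustration:

```python
# (fiscalYear, ticker, open) for a single ticker
rows = [
    (2017, "FINL", 17.4881),
    (2016, "FINL", 16.4883),
    (2015, "FINL", 15.4885),
]

# Store each open price under the PREVIOUS year's key.
lookup = {(year - 1, ticker): open_price for year, ticker, open_price in rows}

# A row then looks itself up with its own (fiscalYear, ticker) and
# finds the value that came from the following year.
next_open_2016 = lookup.get((2016, "FINL"))  # stored by the 2017 row
next_open_2017 = lookup.get((2017, "FINL"))  # no 2018 row, so None
print(next_open_2016, next_open_2017)
```

In the pandas version, .map(m) plays the role of lookup.get and yields NaN where there is no matching key.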
Full example:
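from io import StringIO

import pandas as pd

data = '''\
fiscalYear ticker open
2017 FINL 17.488
2017 AAPL 17.488
2016 FINL 16.488
2016 AAPL 16.488
2015 FINL 15.488
2015 AAPL 15.488'''
# pd.compat.StringIO is no longer available in current pandas; use io.StringIO
fileobj = StringIO(data)
df = pd.read_csv(fileobj, sep=r'\s+')
# Key each open price by (fiscalYear - 1, ticker)
m = dict(zip(tuple(zip(df.fiscalYear - 1, df.ticker)),df.open))
# Look up each row's own (fiscalYear, ticker); unmatched rows become NaN
df['open_next_year'] = df[['fiscalYear','ticker']].apply(tuple, 1).map(m)
print(df)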
Output:
fiscalYear ticker open open_next_year
0 2017 FINL 17.488 NaN
1 2017 AAPL 17.488 NaN
2 2016 FINL 16.488 17.488
3 2016 AAPL 16.488 17.488
4 2015 FINL 15.488 16.488
5 2015 AAPL 15.488 16.488
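The same key-based lookup can also be written as a self-merge, which avoids building tuples row by row. This is only a sketch of an alternative and not part of the answers above. It assumes each (fiscalYear, ticker) pair appears at most once, and the names `next_year` and `result` are made up for illustration:

```python
import pandas as pd

df = pd.DataFrame({
    "fiscalYear": [2017, 2017, 2016, 2016, 2015, 2015],
    "ticker": ["FINL", "AAPL", "FINL", "AAPL", "FINL", "AAPL"],
    "open": [17.4881, 17.4882, 16.4883, 16.4884, 15.4885, 15.4886],
})

# Lookup table: relabel every row with the previous year so that it
# lines up with the row that needs its open value.
next_year = (
    df.assign(fiscalYear=df["fiscalYear"] - 1)
      .rename(columns={"open": "open_next_year"})
)

# A left merge keeps every original row; rows without a following
# year get NaN. validate raises if the lookup keys are not unique.
result = df.merge(
    next_year, on=["fiscalYear", "ticker"], how="left", validate="many_to_one"
)
print(result)
```

Unlike shift(), this does not depend on row order, and a missing year gives NaN instead of a value from the wrong year.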