DataFrame: update one column value based on a dynamic column determined by the value in a third column
Consider the following dataframe created from a dictionary:
import pandas as pd

d = {
    'p_symbol': ['A', 'B', 'C', 'D', 'E'],
    'p_volume': [0, 0, 0, 0, 0],
    'p_exchange': ['IEXG', 'ASE', 'PSE', 'NAS', 'NYS'],
    'p_volume_rh': [1000, 1000, 1000, 1000, 1000],
    'p_volume.1': [2000, 2000, 2000, 2000, 2000],
    'p_volume.2': [3000, 3000, 3000, 3000, 3000],
    'p_volume.3': [4000, 4000, 4000, 4000, 4000],
    'p_volume.4': [5000, 5000, 5000, 5000, 5000],
}
snapshot = pd.DataFrame(d)
I need to set the value in p_volume to be the value in one of the last 5 p_volume* columns based on the value in p_exchange. I need to do it this way due to the way data is being returned from a third party vendor API over which I have no control.
I have tried setting up a dictionary that, given the value in p_exchange, gives me the column name. This is the code I tried:
us_primary_exchange_map = {
"NYS": "xp_volume_rh"
, "NAS": "xp_volume.1"
, "PSE": "xp_volume.2"
, "ASE": "xp_volume.3"
, "IEXG": "xp_volume.4"
}
snapshot["p_volume"] = snapshot[us_primary_exchange_map[snapshot["p_exchange"]]]
But this does not work...
Can someone help me out here? Is there a straightforward way to do this without having to iterate over the rows?
I hope I've understood your question correctly (and that xp_volume_* is a typo and should be p_volume_*, without the x):
snapshot['p_volume'] = snapshot.lookup(snapshot.index, snapshot['p_exchange'].map(us_primary_exchange_map))
print(snapshot)
Prints:
  p_symbol  p_volume p_exchange  ...  p_volume.2  p_volume.3  p_volume.4
0        A      5000       IEXG  ...        3000        4000        5000
1        B      4000        ASE  ...        3000        4000        5000
2        C      3000        PSE  ...        3000        4000        5000
3        D      2000        NAS  ...        3000        4000        5000
4        E      1000        NYS  ...        3000        4000        5000

[5 rows x 8 columns]
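Note that DataFrame.lookup was deprecated in pandas 1.2.0 and removed in pandas 2.0, so the one-liner above raises an AttributeError on recent versions. Below is a minimal sketch of the same row-wise lookup done with NumPy integer indexing instead. It assumes the map values are corrected to the real column names (no x prefix); the helper names vol_cols, source_cols, col_pos and row_pos are only illustrative.

```python
import numpy as np
import pandas as pd

snapshot = pd.DataFrame({
    "p_symbol": ["A", "B", "C", "D", "E"],
    "p_volume": [0, 0, 0, 0, 0],
    "p_exchange": ["IEXG", "ASE", "PSE", "NAS", "NYS"],
    "p_volume_rh": [1000] * 5,
    "p_volume.1": [2000] * 5,
    "p_volume.2": [3000] * 5,
    "p_volume.3": [4000] * 5,
    "p_volume.4": [5000] * 5,
})

# Assumption: the map from the question with the "x" prefix removed.
us_primary_exchange_map = {
    "NYS": "p_volume_rh",
    "NAS": "p_volume.1",
    "PSE": "p_volume.2",
    "ASE": "p_volume.3",
    "IEXG": "p_volume.4",
}

# Unique candidate source columns, in a fixed order.
vol_cols = list(dict.fromkeys(us_primary_exchange_map.values()))

# Name of the column each row should read from.
source_cols = snapshot["p_exchange"].map(us_primary_exchange_map)

# Position of that column within vol_cols (-1 if the exchange is not mapped).
col_pos = pd.Index(vol_cols).get_indexer(source_cols)
if (col_pos < 0).any():
    raise ValueError("p_exchange contains a value missing from the map")

# Pick one value per row: element (i, col_pos[i]) of the numeric block.
row_pos = np.arange(len(snapshot))
snapshot["p_volume"] = snapshot[vol_cols].to_numpy()[row_pos, col_pos]

print(snapshot[["p_symbol", "p_exchange", "p_volume"]])
```

Restricting the array to the volume columns keeps it numeric, so p_volume stays an integer column rather than becoming object dtype.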
You can use pandas.DataFrame.apply with the argument axis=1 to apply a function to each row of the dataframe:
snapshot['p_volume'] = snapshot.apply(lambda row: snapshot.loc[row.name,
us_primary_exchange_map[row['p_exchange']]], axis=1)
And your dataframe will look like:
  p_symbol  p_volume p_exchange  p_volume_rh  p_volume.1  p_volume.2  \
0        A      5000       IEXG         1000        2000        3000
1        B      4000        ASE         1000        2000        3000
2        C      3000        PSE         1000        2000        3000
3        D      2000        NAS         1000        2000        3000
4        E      1000        NYS         1000        2000        3000

   p_volume.3  p_volume.4
0        4000        5000
1        4000        5000
2        4000        5000
3        4000        5000
4        4000        5000
I'm not sure it's more efficient than iterating over the rows, but I think it's prettier.
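If the row-wise apply turns out to be slow on a large frame, one possible alternative is to loop over the handful of exchanges instead of over the rows, copying values with a boolean mask on each pass. This is only a sketch of a different technique, not a benchmarked recommendation, and it again assumes the map values use the real column names without the x prefix.

```python
import pandas as pd

snapshot = pd.DataFrame({
    "p_symbol": ["A", "B", "C", "D", "E"],
    "p_volume": [0, 0, 0, 0, 0],
    "p_exchange": ["IEXG", "ASE", "PSE", "NAS", "NYS"],
    "p_volume_rh": [1000] * 5,
    "p_volume.1": [2000] * 5,
    "p_volume.2": [3000] * 5,
    "p_volume.3": [4000] * 5,
    "p_volume.4": [5000] * 5,
})

# Assumption: the map from the question with the "x" prefix removed.
us_primary_exchange_map = {
    "NYS": "p_volume_rh",
    "NAS": "p_volume.1",
    "PSE": "p_volume.2",
    "ASE": "p_volume.3",
    "IEXG": "p_volume.4",
}

# One vectorized assignment per exchange; rows whose exchange is not
# in the map keep their existing p_volume value.
for exchange, source_col in us_primary_exchange_map.items():
    mask = snapshot["p_exchange"] == exchange
    snapshot.loc[mask, "p_volume"] = snapshot.loc[mask, source_col]

print(snapshot[["p_symbol", "p_exchange", "p_volume"]])
```

The loop runs once per map entry (five times here) regardless of how many rows the frame has.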