DataFrame: update one column's value from a dynamically chosen column determined by the value in a third column

Question

Consider the following dataframe created from a dictionary:

import pandas as pd

d = { 'p_symbol': ['A','B','C','D','E']
     , 'p_volume': [0,0,0,0,0]
     , 'p_exchange': ['IEXG', 'ASE', 'PSE', 'NAS', 'NYS']
     , 'p_volume_rh': [1000,1000,1000,1000,1000]
     , 'p_volume.1': [2000,2000,2000,2000,2000]
     , 'p_volume.2': [3000,3000,3000,3000,3000]
     , 'p_volume.3': [4000,4000,4000,4000,4000]
     , 'p_volume.4': [5000,5000,5000,5000,5000]
     }

snapshot = pd.DataFrame(d)

I need to set the value in p_volume to the value in one of the last 5 p_volume* columns, chosen by the value in p_exchange. I need to do it this way because of how the data is returned from a third-party vendor API over which I have no control.

I have tried setting up a dictionary that, given the value in p_exchange, gives me the column name, and then tried the following code:

us_primary_exchange_map = {
    "NYS": "xp_volume_rh"
    , "NAS": "xp_volume.1"
    , "PSE": "xp_volume.2"
    , "ASE": "xp_volume.3"
    , "IEXG": "xp_volume.4"
    }

snapshot["p_volume"] = snapshot[us_primary_exchange_map[snapshot["p_exchange"]]])

But this does not work...

Can someone help me out here? Is there a straightforward way to do this without having to iterate over the rows?

Answer 1

I hope I've understood your question correctly (and that xp_volume_* is a typo and should be p_volume_*, without the x):

snapshot['p_volume'] = snapshot.lookup(snapshot.index, snapshot['p_exchange'].map(us_primary_exchange_map))
print(snapshot)

Prints:

  p_symbol  p_volume p_exchange  ...  p_volume.2  p_volume.3  p_volume.4
0        A      5000       IEXG  ...        3000        4000        5000
1        B      4000        ASE  ...        3000        4000        5000
2        C      3000        PSE  ...        3000        4000        5000
3        D      2000        NAS  ...        3000        4000        5000
4        E      1000        NYS  ...        3000        4000        5000

[5 rows x 8 columns]
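
Note that DataFrame.lookup was deprecated in pandas 1.2 and removed in pandas 2.0, so the call above fails on current versions. A minimal sketch of the same label-based lookup using NumPy integer indexing follows; it assumes the mapping has the x prefix removed, and the names exchange_to_column, target_columns, and col_positions are illustrative only, not from the original post.

```python
import numpy as np
import pandas as pd

snapshot = pd.DataFrame({
    "p_symbol": ["A", "B", "C", "D", "E"],
    "p_volume": [0, 0, 0, 0, 0],
    "p_exchange": ["IEXG", "ASE", "PSE", "NAS", "NYS"],
    "p_volume_rh": [1000] * 5,
    "p_volume.1": [2000] * 5,
    "p_volume.2": [3000] * 5,
    "p_volume.3": [4000] * 5,
    "p_volume.4": [5000] * 5,
})

# Assumption: the asker's mapping with the "x" prefix removed.
exchange_to_column = {
    "NYS": "p_volume_rh",
    "NAS": "p_volume.1",
    "PSE": "p_volume.2",
    "ASE": "p_volume.3",
    "IEXG": "p_volume.4",
}

# Column label to read from, one per row.
target_columns = snapshot["p_exchange"].map(exchange_to_column)
if target_columns.isna().any():
    # get_indexer would return -1 here, which silently picks the last column.
    raise ValueError("p_exchange contains a code missing from the mapping")

# Restrict to the numeric source columns so the result keeps an integer dtype.
volume_columns = list(exchange_to_column.values())
values = snapshot[volume_columns].to_numpy()

# Translate labels to positions, then pick one cell per row.
col_positions = pd.Index(volume_columns).get_indexer(target_columns)
row_positions = np.arange(len(snapshot))
snapshot["p_volume"] = values[row_positions, col_positions]

print(snapshot[["p_symbol", "p_exchange", "p_volume"]])
```

The explicit check for unmapped exchange codes is a design choice in this sketch: without it, a missing code would be read from the wrong column instead of raising an error.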

Answer 2

You can use pandas.DataFrame.apply with the argument axis=1 to apply a function to each row of the dataframe:

snapshot['p_volume'] = snapshot.apply(
    lambda row: snapshot.loc[row.name, us_primary_exchange_map[row['p_exchange']]],
    axis=1)

And your dataframe will look like:

  p_symbol  p_volume p_exchange  p_volume_rh  p_volume.1  p_volume.2  \
0        A      5000       IEXG         1000        2000        3000   
1        B      4000        ASE         1000        2000        3000   
2        C      3000        PSE         1000        2000        3000   
3        D      2000        NAS         1000        2000        3000   
4        E      1000        NYS         1000        2000        3000   

   p_volume.3  p_volume.4  
0        4000        5000  
1        4000        5000  
2        4000        5000  
3        4000        5000  
4        4000        5000

I'm not sure it's more efficient than iterating over the rows, but I think it's prettier.
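
If the per-row cost of apply is a concern, one alternative is to loop over the mapping entries rather than the rows and assign through a boolean mask. This is only a sketch under the same assumption as the answer (mapping without the x prefix), and exchange_to_column is an illustrative name, not from the original post.

```python
import pandas as pd

snapshot = pd.DataFrame({
    "p_symbol": ["A", "B", "C", "D", "E"],
    "p_volume": [0, 0, 0, 0, 0],
    "p_exchange": ["IEXG", "ASE", "PSE", "NAS", "NYS"],
    "p_volume_rh": [1000] * 5,
    "p_volume.1": [2000] * 5,
    "p_volume.2": [3000] * 5,
    "p_volume.3": [4000] * 5,
    "p_volume.4": [5000] * 5,
})

# Assumption: the asker's mapping with the "x" prefix removed.
exchange_to_column = {
    "NYS": "p_volume_rh",
    "NAS": "p_volume.1",
    "PSE": "p_volume.2",
    "ASE": "p_volume.3",
    "IEXG": "p_volume.4",
}

# One vectorized assignment per exchange code instead of one call per row.
for exchange, column in exchange_to_column.items():
    mask = snapshot["p_exchange"] == exchange
    snapshot.loc[mask, "p_volume"] = snapshot.loc[mask, column]

print(snapshot[["p_symbol", "p_exchange", "p_volume"]])
```

Rows whose p_exchange value is not in the mapping keep their existing p_volume, so the loop count depends on the number of exchanges, not the number of rows.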

Answer 1: 2020-10-22 16:52:36
Answer 2: 2020-10-22 17:15:25