Copying specific columns from one row of a Pandas DataFrame to another using .to_numpy()

Question

I have a DataFrame like this:

     UniqueID  CST  WEIGHT  VOLUME  PRODUCTIVITY
0  413-20012    3     123      12          1113
1  413-45365    1     889      75          6748
2  413-21165    8     554      13          4536
3  413-24354    1     387      35          7649
4  413-34658    2     121      88          2468
5  413-36889    4     105      76          3336
6  413-23457    5     355      42          7894
7  413-30089    5     146      10          9112
8  413-41158    5     453      91          4545
9  413-51015    9     654      66          2232

And I have a dictionary of parent:child mappings for the UniqueIDs:

parent_child_dict = {
    '413-51015': '413-41158',
    '413-21165': '413-23457',
    '413-45365': '413-41158',
    '413-20012': '413-23457'
}

What I need to do is loop through the DataFrame and replace the WEIGHT, VOLUME, and PRODUCTIVITY values of each 'child' UniqueID row with the values from its 'parent' UniqueID row, so that the resulting DataFrame looks like this:

     UniqueID  CST  WEIGHT  VOLUME  PRODUCTIVITY
0  413-20012    3     355      42          7894
1  413-45365    1     453      91          4545
2  413-21165    8     355      42          7894
3  413-24354    1     387      35          7649
4  413-34658    2     121      88          2468
5  413-36889    4     105      76          3336
6  413-23457    5     355      42          7894
7  413-30089    5     146      10          9112
8  413-41158    5     453      91          4545
9  413-51015    9     453      91          4545

I've experimented with several possible solutions, and the trouble I'm having is limiting the copy so that the UniqueID and CST values of the 'child' row are preserved while the other values are copied over.

The closest I've gotten is a loop through the dictionary where each pairing gets fed into this:

df.loc[df['UniqueID'] == '413-51015'] = df.loc[df['UniqueID'] == '413-41158'].to_numpy()

This seems to nicely replace all values in one row with those of another, including the UniqueID and CST values that I want to keep.

Any help on the exceptions, or a better solution overall, would be extremely helpful. Thank you.

EDIT

I've looped the first solution over the columns that I want changed in the dataset, like this:

# Keep every column except the two that must be preserved
columns = [col for col in df.columns if col not in ('UniqueID', 'CST')]
print(columns)

OUTPUT

['WEIGHT', 'VOLUME', 'PRODUCTIVITY']

Then

for col in columns:
    s = df[['UniqueID', col]].set_index('UniqueID')
    df[col] = s.loc[df['UniqueID'].replace(parent_child_dict)].to_numpy()

This has resulted in the desired dataset.
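For comparison, the original row-by-row attempt can also be restricted to the value columns by passing a column list as the second argument to .loc. The following is only a sketch under stated assumptions: the names value_cols, target_id, source_id, target_rows and source_rows are made up for illustration, and each dictionary key is treated as the row to overwrite and each value as the row to copy from, which is what the expected output above implies.

```python
import pandas as pd

df = pd.DataFrame({
    'UniqueID': ['413-20012', '413-45365', '413-21165', '413-24354', '413-34658',
                 '413-36889', '413-23457', '413-30089', '413-41158', '413-51015'],
    'CST': [3, 1, 8, 1, 2, 4, 5, 5, 5, 9],
    'WEIGHT': [123, 889, 554, 387, 121, 105, 355, 146, 453, 654],
    'VOLUME': [12, 75, 13, 35, 88, 76, 42, 10, 91, 66],
    'PRODUCTIVITY': [1113, 6748, 4536, 7649, 2468, 3336, 7894, 9112, 4545, 2232],
})

parent_child_dict = {
    '413-51015': '413-41158',
    '413-21165': '413-23457',
    '413-45365': '413-41158',
    '413-20012': '413-23457',
}

# Hypothetical name: the columns that should be copied between rows
value_cols = ['WEIGHT', 'VOLUME', 'PRODUCTIVITY']

# Assumption: key = row to overwrite, value = row to copy from.
# This simple loop also assumes no source row is itself overwritten earlier.
for target_id, source_id in parent_child_dict.items():
    target_rows = df['UniqueID'] == target_id
    source_rows = df['UniqueID'] == source_id
    # Selecting value_cols on both sides leaves UniqueID and CST untouched;
    # to_numpy() drops the index so the values are assigned by position.
    df.loc[target_rows, value_cols] = df.loc[source_rows, value_cols].to_numpy()

print(df)
```

This keeps the shape of the first attempt but runs one boolean comparison per mapping, so the vectorised approach in the EDIT is likely to scale better on large frames.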

Answer 1

Use replace and loc access:

s = df[['UniqueID','PRODUCTIVITY']].set_index('UniqueID')

# using to_numpy here :-)
df['PRODUCTIVITY'] = s.loc[df['UniqueID'].replace(parent_child_dict)].to_numpy()

Output:

    UniqueID  CST  WEIGHT  VOLUME  PRODUCTIVITY
0  413-20012    3     123      12          7894
1  413-45365    1     889      75          4545
2  413-21165    8     554      13          7894
3  413-24354    1     387      35          7649
4  413-34658    2     121      88          2468
5  413-36889    4     105      76          3336
6  413-23457    5     355      42          7894
7  413-30089    5     146      10          9112
8  413-41158    5     453      91          4545
9  413-51015    9     654      66          4545
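The snippet above only updates PRODUCTIVITY, while the question asks for WEIGHT and VOLUME as well. A minimal sketch of the same replace and loc idea applied to all three columns at once follows; the names cols, lookup and source_ids are assumptions introduced for illustration, and it relies on UniqueID being unique, as it is in the sample data.

```python
import pandas as pd

df = pd.DataFrame({
    'UniqueID': ['413-20012', '413-45365', '413-21165', '413-24354', '413-34658',
                 '413-36889', '413-23457', '413-30089', '413-41158', '413-51015'],
    'CST': [3, 1, 8, 1, 2, 4, 5, 5, 5, 9],
    'WEIGHT': [123, 889, 554, 387, 121, 105, 355, 146, 453, 654],
    'VOLUME': [12, 75, 13, 35, 88, 76, 42, 10, 91, 66],
    'PRODUCTIVITY': [1113, 6748, 4536, 7649, 2468, 3336, 7894, 9112, 4545, 2232],
})

parent_child_dict = {
    '413-51015': '413-41158',
    '413-21165': '413-23457',
    '413-45365': '413-41158',
    '413-20012': '413-23457',
}

# Hypothetical name: the columns to copy; UniqueID and CST are left out on purpose
cols = ['WEIGHT', 'VOLUME', 'PRODUCTIVITY']

# Lookup table keyed by UniqueID (assumes UniqueID values are unique)
lookup = df.set_index('UniqueID')[cols]

# For every row, the ID whose values it should end up with:
# mapped IDs are swapped for their source ID, all others stay as they are
source_ids = df['UniqueID'].replace(parent_child_dict)

# One label lookup per row, then positional assignment via to_numpy()
df[cols] = lookup.loc[source_ids].to_numpy()

print(df)
```

Selecting several columns before the lookup avoids the per-column loop, at the cost of assuming the selected columns can share one array.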

Answer 2

First, create a mapping from UniqueID to PRODUCTIVITY.

Then use parent_child_dict to map your IDs:

mapping = df.set_index('UniqueID')['PRODUCTIVITY'].to_dict()
df['PRODUCTIVITY'] = (
    df['UniqueID'].map(parent_child_dict).map(mapping).fillna(df['PRODUCTIVITY']).astype(int)
)
print(df)

Output:

    UniqueID  CST  WEIGHT  VOLUME  PRODUCTIVITY
0  413-20012    3     123      12          7894
1  413-45365    1     889      75          4545
2  413-21165    8     554      13          7894
3  413-24354    1     387      35          7649
4  413-34658    2     121      88          2468
5  413-36889    4     105      76          3336
6  413-23457    5     355      42          7894
7  413-30089    5     146      10          9112
8  413-41158    5     453      91          4545
9  413-51015    9     654      66          4545
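As written, this answer also covers only PRODUCTIVITY. The sketch below repeats the same map and fillna steps for each of the three columns from the question; value_cols, source_ids and lookup are hypothetical names, and a Series indexed by UniqueID is used as the mapper in place of the to_dict() call, which is a substitution on my part and not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({
    'UniqueID': ['413-20012', '413-45365', '413-21165', '413-24354', '413-34658',
                 '413-36889', '413-23457', '413-30089', '413-41158', '413-51015'],
    'CST': [3, 1, 8, 1, 2, 4, 5, 5, 5, 9],
    'WEIGHT': [123, 889, 554, 387, 121, 105, 355, 146, 453, 654],
    'VOLUME': [12, 75, 13, 35, 88, 76, 42, 10, 91, 66],
    'PRODUCTIVITY': [1113, 6748, 4536, 7649, 2468, 3336, 7894, 9112, 4545, 2232],
})

parent_child_dict = {
    '413-51015': '413-41158',
    '413-21165': '413-23457',
    '413-45365': '413-41158',
    '413-20012': '413-23457',
}

# Hypothetical name: the columns to copy
value_cols = ['WEIGHT', 'VOLUME', 'PRODUCTIVITY']

# ID to copy from for each row; missing (NaN) where the row has no mapping
source_ids = df['UniqueID'].map(parent_child_dict)

# Snapshot of the original values, keyed by UniqueID
lookup = df.set_index('UniqueID')

for col in value_cols:
    # Look up the source row's value, fall back to the row's own value
    # where there is no mapping, then restore the integer dtype
    df[col] = source_ids.map(lookup[col]).fillna(df[col]).astype(int)

print(df)
```

The astype(int) step is needed because the intermediate result contains missing values and is therefore held as floats until fillna has run.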
