Using .to_numpy() to copy specific columns from one row of a Pandas DataFrame to another

I have a DataFrame like this:

     UniqueID  CST  WEIGHT  VOLUME  PRODUCTIVITY
0  413-20012    3     123      12          1113
1  413-45365    1     889      75          6748
2  413-21165    8     554      13          4536
3  413-24354    1     387      35          7649
4  413-34658    2     121      88          2468
5  413-36889    4     105      76          3336
6  413-23457    5     355      42          7894
7  413-30089    5     146      10          9112
8  413-41158    5     453      91          4545
9  413-51015    9     654      66          2232

And I have a dictionary of parent:child mappings for the UniqueIDs:

parent_child_dict = {
    '413-51015': '413-41158',
    '413-21165': '413-23457',
    '413-45365': '413-41158',
    '413-20012': '413-23457'
}

What I need to do is loop through the DataFrame and replace the WEIGHT, VOLUME, and PRODUCTIVITY values of the 'child' UniqueID row with the values from the 'parent' UniqueID row, so that the resulting DataFrame looks like this:

     UniqueID  CST  WEIGHT  VOLUME  PRODUCTIVITY
0  413-20012    3     355      42          7894
1  413-45365    1     453      91          4545
2  413-21165    8     355      42          7894
3  413-24354    1     387      35          7649
4  413-34658    2     121      88          2468
5  413-36889    4     105      76          3336
6  413-23457    5     355      42          7894
7  413-30089    5     146      10          9112
8  413-41158    5     453      91          4545
9  413-51015    9     453      91          4545

I've experimented with several possible solutions. The trouble I'm having is limiting the copy so that the UniqueID and CST values of the 'child' row are preserved while the other values are copied over.

The closest I've gotten is a loop through the dictionary where each pairing gets fed into this:

df.loc[df['UniqueID'] == '413-51015'] = df.loc[df['UniqueID'] == '413-41158'].to_numpy()

This neatly copies every value from one row to the other, including the UniqueID and CST values that I need to preserve.
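One way to restrict that assignment, sketched here under the assumption that UniqueID values are unique, is to pass a column list as the second argument to `.loc` on both sides. The names `cols`, `target`, and `source` are illustrative and not from the original post. Each dictionary key is treated as the row to overwrite and its value as the row to copy from, which is the direction the expected output above implies.

```python
import pandas as pd

df = pd.DataFrame({
    "UniqueID": ["413-20012", "413-45365", "413-21165", "413-24354", "413-34658",
                 "413-36889", "413-23457", "413-30089", "413-41158", "413-51015"],
    "CST": [3, 1, 8, 1, 2, 4, 5, 5, 5, 9],
    "WEIGHT": [123, 889, 554, 387, 121, 105, 355, 146, 453, 654],
    "VOLUME": [12, 75, 13, 35, 88, 76, 42, 10, 91, 66],
    "PRODUCTIVITY": [1113, 6748, 4536, 7649, 2468, 3336, 7894, 9112, 4545, 2232],
})
parent_child_dict = {
    "413-51015": "413-41158",
    "413-21165": "413-23457",
    "413-45365": "413-41158",
    "413-20012": "413-23457",
}

# Only these columns are copied; UniqueID and CST are left untouched.
cols = ["WEIGHT", "VOLUME", "PRODUCTIVITY"]

for target_id, source_id in parent_child_dict.items():
    target = df["UniqueID"] == target_id   # row to overwrite (dictionary key)
    source = df["UniqueID"] == source_id   # row to copy from (dictionary value)
    # .to_numpy() drops the index/column labels so pandas does not try to align them.
    df.loc[target, cols] = df.loc[source, cols].to_numpy()

print(df)
```

Because the loop updates rows one pair at a time, the result would depend on iteration order if some ID appeared both as a key and as a value. That does not happen in this data.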

Any help with these exceptions, or a better overall solution, would be greatly appreciated. Thank you.


EDIT

I've looped the first solution over the columns that I want changed in the dataset, like this:

# Keep every column except the two that must be preserved.
# (list.remove() returns None, so assigning its result is unnecessary.)
columns = [col for col in df.columns if col not in ('UniqueID', 'CST')]
print(columns)

Output:

['WEIGHT', 'VOLUME', 'PRODUCTIVITY']

Then:

for col in columns:
    s = df[['UniqueID', col]].set_index('UniqueID')
    df[col] = s.loc[df['UniqueID'].replace(parent_child_dict)].to_numpy()

This has resulted in the desired dataset.
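The per-column loop can also be collapsed into a single assignment over all three columns. This is a minimal sketch rather than the answerer's code: it assumes UniqueID values are unique (otherwise the `.loc` lookup would return extra rows), and `lookup` and `source_ids` are names introduced here for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "UniqueID": ["413-20012", "413-45365", "413-21165", "413-24354", "413-34658",
                 "413-36889", "413-23457", "413-30089", "413-41158", "413-51015"],
    "CST": [3, 1, 8, 1, 2, 4, 5, 5, 5, 9],
    "WEIGHT": [123, 889, 554, 387, 121, 105, 355, 146, 453, 654],
    "VOLUME": [12, 75, 13, 35, 88, 76, 42, 10, 91, 66],
    "PRODUCTIVITY": [1113, 6748, 4536, 7649, 2468, 3336, 7894, 9112, 4545, 2232],
})
parent_child_dict = {
    "413-51015": "413-41158",
    "413-21165": "413-23457",
    "413-45365": "413-41158",
    "413-20012": "413-23457",
}

cols = [col for col in df.columns if col not in ("UniqueID", "CST")]

# For each row, the ID whose values it should end up with:
# dictionary keys are swapped for their values, all other IDs stay as they are.
source_ids = df["UniqueID"].replace(parent_child_dict)

# Snapshot of the original values, indexed by UniqueID for label-based lookup.
lookup = df.set_index("UniqueID")[cols]

# One row per entry of source_ids, in the original row order; .to_numpy()
# discards the UniqueID index so the assignment is purely positional.
df[cols] = lookup.loc[source_ids].to_numpy()

print(df)
```

Since `lookup` is built before anything is assigned, every row reads from the original values, so the result does not depend on the order of the dictionary entries.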

Use replace and loc access:

s = df[['UniqueID','PRODUCTIVITY']].set_index('UniqueID')

# using to_numpy here :-)
df['PRODUCTIVITY'] = s.loc[df['UniqueID'].replace(parent_child_dict)].to_numpy()

Output:

    UniqueID  CST  WEIGHT  VOLUME  PRODUCTIVITY
0  413-20012    3     123      12          7894
1  413-45365    1     889      75          4545
2  413-21165    8     554      13          7894
3  413-24354    1     387      35          7649
4  413-34658    2     121      88          2468
5  413-36889    4     105      76          3336
6  413-23457    5     355      42          7894
7  413-30089    5     146      10          9112
8  413-41158    5     453      91          4545
9  413-51015    9     654      66          4545

First create a mapping from UniqueID to PRODUCTIVITY.

Then use parent_child_dict to map your IDs:

mapping = df.set_index('UniqueID')['PRODUCTIVITY'].to_dict()
df['PRODUCTIVITY'] = (
    df['UniqueID'].map(parent_child_dict).map(mapping).fillna(df['PRODUCTIVITY']).astype(int)
)
print(df)
    UniqueID  CST  WEIGHT  VOLUME  PRODUCTIVITY
0  413-20012    3     123      12          7894
1  413-45365    1     889      75          4545
2  413-21165    8     554      13          7894
3  413-24354    1     387      35          7649
4  413-34658    2     121      88          2468
5  413-36889    4     105      76          3336
6  413-23457    5     355      42          7894
7  413-30089    5     146      10          9112
8  413-41158    5     453      91          4545
9  413-51015    9     654      66          4545
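To see why the chain above ends with `fillna` and `astype(int)`, it may help to split it into steps. The sketch below uses the same data; the intermediate names `step1`, `step2`, and `result` are introduced here for illustration only.

```python
import pandas as pd

df = pd.DataFrame({
    "UniqueID": ["413-20012", "413-45365", "413-21165", "413-24354", "413-34658",
                 "413-36889", "413-23457", "413-30089", "413-41158", "413-51015"],
    "PRODUCTIVITY": [1113, 6748, 4536, 7649, 2468, 3336, 7894, 9112, 4545, 2232],
})
parent_child_dict = {
    "413-51015": "413-41158",
    "413-21165": "413-23457",
    "413-45365": "413-41158",
    "413-20012": "413-23457",
}
mapping = df.set_index("UniqueID")["PRODUCTIVITY"].to_dict()

# Step 1: look up each ID in the dictionary. IDs that are not keys become NaN.
step1 = df["UniqueID"].map(parent_child_dict)

# Step 2: translate the looked-up IDs into PRODUCTIVITY values.
# NaN stays NaN, which forces the column to a float dtype.
step2 = step1.map(mapping)

# Step 3: rows without a dictionary entry keep their own value,
# then the column is cast back to integers.
result = step2.fillna(df["PRODUCTIVITY"]).astype(int)

print(pd.DataFrame({"step1": step1, "step2": step2, "result": result}))
```

Unlike `replace`, `map` does not pass unmatched values through, which is why the explicit `fillna` is needed in this variant.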
