
Assign values in a large pandas DataFrame based on indexes stored in two other DataFrames

I am working with three dataframes. The first is my general dataframe, which holds the information I am interested in and which I want to complete. The second holds the coordinates column that I want to add to the general dataframe. The third stores, for each match, the indexes of the corresponding rows in the first two dataframes.

It is a little confusing, so here is an example that shows it better:

Dataframe 1:

index  n_tree
247    1
248    2

Dataframe 2:

index  coords
1400   (20,47)
1401   (30,85)

Dataframe 3:

index  index_dataframe_1  index_dataframe_2
0      247                1401

My goal is for the general dataframe to contain the correct coordinates column, as follows:

index  n_tree  coords
247    1       (30,85)

I have tried to assign it with .iloc, .loc and .at, but a loop like the following raises an error:

for idx, rw in dataframe_3.iterrows():
    coords = dataframe_1.loc[rw.index_dataframe_2, "coords"]
    dataframe_2.loc[int(rw.index_dataframe_1), "coords"] = coords

ValueError: Must have equal len keys and value when setting with an iterable.
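A likely cause of this error is that .loc treats a tuple such as (30, 85) as an iterable of values to spread over the selection, not as one value for one cell. The sketch below is one possible workaround, not necessarily what the asker ended up using: it assumes the frames are named df1, df2 and df3, that the index labels are unique, and that the target column is created up front with object dtype so that .at can store a whole tuple in a single cell.

```python
import pandas as pd

# Example frames rebuilt from the question (names df1/df2/df3 are assumptions)
df1 = pd.DataFrame({"n_tree": [1, 2]}, index=pd.Index([247, 248], name="index"))
df2 = pd.DataFrame({"coords": [(20, 47), (30, 85)]},
                   index=pd.Index([1400, 1401], name="index"))
df3 = pd.DataFrame({"index_dataframe_1": [247], "index_dataframe_2": [1401]})

# Create the target column first; assigning None gives it object dtype
df1["coords"] = None

# .at addresses exactly one cell, so the tuple is stored as a single value
for row in df3.itertuples(index=False):
    df1.at[row.index_dataframe_1, "coords"] = df2.at[row.index_dataframe_2, "coords"]

print(df1)
```

A row-by-row loop like this is slow on a large dataframe, which is why the merge-based answers are usually preferable.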

You can perform two merges:

(df3.merge(df1, left_on='index_dataframe_1', right_index=True)
    .merge(df2, left_on='index_dataframe_2', right_index=True)
    [['n_tree', 'coords']]
)

output:

       n_tree   coords
index                 
0           1  (30,85)

inputs:

>>> df1
       n_tree
index        
247         1
248         2

>>> df2
        coords
index         
1400   (20,47)
1401   (30,85)

>>> df3
       index_dataframe_1  index_dataframe_2
index                                      
0                    247               1401
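The merged result above is labelled with df3's row label (0). If it should carry df1's own labels (247) instead, as in the desired output from the question, the same two merges can be followed by set_index. This is a minimal sketch, assuming the three frames are constructed as printed in the inputs; the variable name out is only illustrative.

```python
import pandas as pd

# Inputs rebuilt to match the frames printed above
df1 = pd.DataFrame({"n_tree": [1, 2]}, index=pd.Index([247, 248], name="index"))
df2 = pd.DataFrame({"coords": [(20, 47), (30, 85)]},
                   index=pd.Index([1400, 1401], name="index"))
df3 = pd.DataFrame({"index_dataframe_1": [247], "index_dataframe_2": [1401]},
                   index=pd.Index([0], name="index"))

out = (
    df3.merge(df1, left_on="index_dataframe_1", right_index=True)
       .merge(df2, left_on="index_dataframe_2", right_index=True)
       .set_index("index_dataframe_1")[["n_tree", "coords"]]  # use df1's labels
       .rename_axis("index")
)

print(out)
```

Because both merges are inner joins, rows of df1 that have no entry in df3 (here 248) do not appear in the result.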

Use two inner joins with .merge():

(Assuming that "index" in your dataframes is a data column rather than the row index):

df_out = (df1.merge(df3, left_on='index', right_on='index_dataframe_1', suffixes=('', '_y'))
             .merge(df2, left_on='index_dataframe_2', right_on='index', suffixes=('', '_z'))
          )

df_out = df_out[['index', 'n_tree', 'coords']]

Result:

print(df_out)


   index  n_tree   coords
0    247       1  (30,85)
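The inner joins above drop every row of df1 that has no match in df3. If the general dataframe should keep all of its rows, one option is to build the lookup first and then left-join it onto df1. The following is a sketch under the same assumption that "index" is an ordinary column; the names lookup, "_df3" and "_df2" are introduced only for illustration.

```python
import pandas as pd

# Frames with "index" as a regular data column (same assumption as above)
df1 = pd.DataFrame({"index": [247, 248], "n_tree": [1, 2]})
df2 = pd.DataFrame({"index": [1400, 1401], "coords": [(20, 47), (30, 85)]})
df3 = pd.DataFrame({"index": [0], "index_dataframe_1": [247],
                    "index_dataframe_2": [1401]})

# Step 1: resolve df3's pointers into df2 (inner join on integer keys)
lookup = df3.merge(df2, left_on="index_dataframe_2", right_on="index",
                   suffixes=("_df3", "_df2"))[["index_dataframe_1", "coords"]]

# Step 2: left-join onto df1 so unmatched rows stay, with NaN coordinates
df_out = df1.merge(lookup, left_on="index", right_on="index_dataframe_1",
                   how="left")[["index", "n_tree", "coords"]]

print(df_out)
```

Doing the inner join first keeps the join keys as integers in both merges, so no key column is converted to float before it is used for matching.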

I think this could work for you:

import pandas as pd
import numpy as np

df1 = pd.DataFrame({'index': [247, 248], 'n_tree': [1, 2]}).set_index('index')
df2 = pd.DataFrame({'index': [1400, 1401], 'coords': [(20,47), (30,85)]}).set_index('index')
df3 = pd.DataFrame({'index': [0], 'index_dataframe_1': [247], 'index_dataframe_2': [1401]}).set_index('index')

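# Map each df1 label to the label of the matching row in df2
mapping = dict(zip(df3.index_dataframe_1, df3.index_dataframe_2))

# Look up the coordinates for every row of df1; NaN where there is no match
coords = []
for i in df1.index:
    m = mapping.get(i)
    if m is not None:
        coords.append(df2.at[m, 'coords'])
    else:
        coords.append(np.nan)
df1['coords'] = coords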

print(df1)

Result:

       n_tree    coords
index                  
247         1  (30, 85)
248         2       NaN
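The same lookup can probably be written without an explicit loop by relying on index alignment. This is a sketch, not part of the original answer: it assumes each df1 label appears at most once in df3, and the name link is hypothetical.

```python
import pandas as pd

df1 = pd.DataFrame({"n_tree": [1, 2]}, index=pd.Index([247, 248], name="index"))
df2 = pd.DataFrame({"coords": [(20, 47), (30, 85)]},
                   index=pd.Index([1400, 1401], name="index"))
df3 = pd.DataFrame({"index_dataframe_1": [247], "index_dataframe_2": [1401]})

# Series indexed by df1 labels whose values are df2 labels
link = df3.set_index("index_dataframe_1")["index_dataframe_2"]

# Series.map with a Series argument looks each value up in that Series' index;
# assigning the result aligns it to df1 by label and leaves NaN elsewhere
df1["coords"] = link.map(df2["coords"])

print(df1)
```

Rows of df1 without an entry in df3 end up with NaN, matching the result of the loop above.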
