Assign values to columns in a Pandas DataFrame based on data from another DataFrame

Question

I have two dataframes containing concentration data and coordinates:

Concentration Data (conc):

    Sample  analParam                 Conc  Units
0   CW7-1   1,1,1-Trichloroethane     0     UG/L
1   CW7-1   1,1,2,2-Tetrachloroethane 0     UG/L
2   CW7-1   1,1,2-Trichloroethane     0     UG/L
3   CW7-1   1,1-Dichloroethane        0     UG/L
4   CW7-1   1,1-Dichloroethylene      0     UG/L
5   CW7-1   1,1-Dichloropropene       0     UG/L
6   CW7-1   1,2,3-Trichlorobenzene    0     UG/L
... ... ... ... ...
50311   VOA2-2  Tetrachloroethylene  1.8    MG/KG
50312   VOA2-2  Toluene              1.2    MG/KG
50313   VOA2-2  Trichloroethylene    1.8    MG/KG
50314   VOA2-2  Vinyl Chloride       1.8    MG/KG

Coordinate Data (coord):

    Sample  x            y
0   CW7-1   320800.000  396500.000
1   CW7-2   320800.000  396500.000
2   CW7-3   320800.000  396500.000
3   FB06-17 0.000       0.000
4   FB06-18 0.000       0.000
5   FB06-19 0.000       0.000
6   FB07-08 0.000       0.000
... ... ... ...
453 TP21-1  318807.281  398547.485
454 TP21-2  318807.281  398547.485
455 TP24-1  318489.248  398544.797
456 VOA1-1  318500.582  398573.558
457 VOA1-2  318500.582  398573.558
458 VOA2-1  318536.337  398589.805
459 VOA2-2  318536.337  398589.805

I want to add two columns to my concentration dataframe that contain the coordinates of the respective sample ID for each concentration. For example, the first six rows of the concentration data would have x = 320800 and y = 396500, since they all have the Sample ID CW7-1:

    Sample  analParam                 Conc  Units   x       y
0   CW7-1   1,1,1-Trichloroethane     0     UG/L    320800  396500   
1   CW7-1   1,1,2,2-Tetrachloroethane 0     UG/L    320800  396500
2   CW7-1   1,1,2-Trichloroethane     0     UG/L    320800  396500  
3   CW7-1   1,1-Dichloroethane        0     UG/L    320800  396500  
4   CW7-1   1,1-Dichloroethylene      0     UG/L    320800  396500  
5   CW7-1   1,1-Dichloropropene       0     UG/L    320800  396500

I've tried using nested for loops, but this is far too slow because I have so many data points:

for index, row in conc.iterrows():
    for cindex, crow in coord.iterrows():
        if conc.iloc[index,0] == coord.iloc[cindex,0]:
            conc.at[index,4] = coord.iloc[cindex,1]
            conc.at[index,5] = coord.iloc[cindex,2]

I've also tried using the apply function, but I keep getting errors. For this version, I got TypeError: __call__() takes from 1 to 2 positional arguments but 3 were given.

def xcoord (i):
    for index, row in coord.iterrows():
        if i == coord.iloc[index,0] :
            return coord.iloc(index,4)
conc['Sample'].apply(xcoord)
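
The TypeError quoted above is consistent with `coord.iloc(index,4)` being written with parentheses: `iloc` is indexed with square brackets, not called like a function. Separately, `coord` has only three columns, so position 4 would be out of range even with brackets; x sits at position 1. A minimal sketch on a two-row stand-in for coord (values copied from the table above; the names `raised` and `x_first` are illustrative only):

```python
import pandas as pd

# Two-row stand-in for the coord dataframe shown above.
coord = pd.DataFrame({
    "Sample": ["CW7-1", "CW7-2"],
    "x": [320800.0, 320800.0],
    "y": [396500.0, 396500.0],
})

# Parentheses call the indexer object instead of indexing it,
# which raises a TypeError when two positions are passed.
try:
    coord.iloc(0, 1)
    raised = False
except TypeError:
    raised = True

# Square brackets perform the positional lookup: row 0, column 1 (x).
x_first = coord.iloc[0, 1]
print(raised, x_first)
```

Even with the brackets fixed, looping over coord for every row of conc would probably remain slow on a frame of this size.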

Answer 1

Thanks Wen!

In[1]:
conc.merge(coord,on='Sample',how='left')

Out[1]:
       Sample  analParam                  Conc  Units  x           y
0      CW7-1   1,1,1-Trichloroethane      0     UG/L   320800.000  396500.000
1      CW7-1   1,1,2,2-Tetrachloroethane  0     UG/L   320800.000  396500.000
2      CW7-1   1,1,2-Trichloroethane      0     UG/L   320800.000  396500.000
3      CW7-1   1,1-Dichloroethane         0     UG/L   320800.000  396500.000
4      CW7-1   1,1-Dichloroethylene       0     UG/L   320800.000  396500.000
5      CW7-1   1,1-Dichloropropene        0     UG/L   320800.000  396500.000
6      CW7-1   1,2,3-Trichlorobenzene     0     UG/L   320800.000  396500.000
... ... ... ... ... ... ...
50311  VOA2-2  Tetrachloroethylene        1.8   MG/KG  318536.337  398589.805
50312  VOA2-2  Toluene                    1.2   MG/KG  318536.337  398589.805
50313  VOA2-2  trans-1,3-Dichloropropene  1.8   MG/KG  318536.337  398589.805
50314  VOA2-2  Trichloroethylene          1.8   MG/KG  318536.337  398589.805
50315  VOA2-2  Vinyl Chloride             1.8   MG/KG  318536.337  398589.805
50316  VOA2-2  Xylenes (Total)            2.6   MG/KG  318536.337  398589.805
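
For anyone who wants to try the merge without the full dataset, here is a minimal sketch on a few rows copied from the tables in the question. Two things in it are assumptions rather than part of the answer above: the result is assigned back to `conc` (since `merge` returns a new dataframe rather than modifying `conc` in place), and `validate="many_to_one"` is added as an optional guard that raises an error if a Sample ID appears more than once in coord.

```python
import pandas as pd

# A few rows standing in for the conc dataframe from the question.
conc = pd.DataFrame({
    "Sample": ["CW7-1", "CW7-1", "VOA2-2"],
    "analParam": ["1,1,1-Trichloroethane", "1,1-Dichloroethane", "Toluene"],
    "Conc": [0.0, 0.0, 1.2],
    "Units": ["UG/L", "UG/L", "MG/KG"],
})

# Matching rows standing in for the coord dataframe.
coord = pd.DataFrame({
    "Sample": ["CW7-1", "VOA2-2"],
    "x": [320800.000, 318536.337],
    "y": [396500.000, 398589.805],
})

# Left merge keeps every conc row and attaches x and y by Sample.
# validate="many_to_one" (optional, an addition here) checks that
# each Sample occurs at most once in coord.
conc = conc.merge(coord, on="Sample", how="left", validate="many_to_one")
print(conc)
```

With `how='left'`, any Sample in conc that has no match in coord keeps its row and gets NaN for x and y.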


Answer 1: 0 votes, accepted, 2017-11-29 21:02:07
