Filling a Pandas column based on another DataFrame

I have two data frames and want to know how to add a column to one of them using certain values from the other. Specifically, I have data frames that look like:

import numpy as np
import pandas as pd
foo = pd.DataFrame(np.random.rand(3, 3), columns=['col_1', 'col_2', 'col_3'])

      col_1     col_2     col_3
0  0.661546  0.554032  0.753549
1  0.063641  0.490173  0.998119
2  0.370046  0.424208  0.125751


bar = pd.DataFrame([[1, 2], [1, 1], [0, 3], [1, 2], [2, 1], [0, 2]])

   0  1
0  1  2
1  1  1
2  0  3
3  1  2
4  2  1
5  0  2

I want to add a column to bar whose value is the value of foo at the location given by the columns of bar: column 0 holds the row index in foo and column 1 holds the column number (counting from 1). So, the desired result would be:

   0  1  anything
0  1  2  0.490173
1  1  1  0.063641
2  0  3  0.753549
3  1  2  0.490173
4  2  1  0.370046
5  0  2  0.554032

My application for this involves very large data frames, so I don't think iterating through the rows is a good option. Any help would be appreciated.

Try this

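foo.columns = [1, 2, 3]  # relabel columns so they match the numbers stored in bar[1]
foo['Index'] = foo.index  # expose the row label as an ordinary column
df = pd.melt(foo, id_vars=['Index'], value_vars=[1, 2, 3])  # long format: one row per (Index, variable)
df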
Out[563]: 
   Index variable     value
0      0        1  0.178661
1      1        1  0.065537
2      2        1  0.926429
3      0        2  0.139027
4      1        2  0.502449
5      2        2  0.971156
6      0        3  0.161616
7      1        3  0.530899
8      2        3  0.420385
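
To see the reshaping step on fixed numbers, here is a minimal sketch that rebuilds foo from the values printed in the question (instead of fresh random draws) and relabels its columns 1, 2, 3. The names long_foo, row and col are illustrative only and are not part of the answer.

```python
import pandas as pd

# foo rebuilt from the values printed in the question, with columns
# relabelled 1..3 so they match the numbers stored in bar's second column.
foo = pd.DataFrame(
    [[0.661546, 0.554032, 0.753549],
     [0.063641, 0.490173, 0.998119],
     [0.370046, 0.424208, 0.125751]],
    columns=[1, 2, 3],
)

# reset_index() exposes the row label as a column named "index";
# melt() then emits one row per (row label, column label) pair: 3 x 3 = 9 rows.
long_foo = foo.reset_index().melt(id_vars="index", var_name="col", value_name="value")
long_foo = long_foo.rename(columns={"index": "row"})
print(long_foo)
```

Each (row, col) pair now appears exactly once, which is what lets a left merge from bar pick up a single value per row.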



# left-join bar to the melted frame on (row index, column label), then drop the helper keys
bar.merge(df, left_on=[0, 1], right_on=['Index', 'variable'], how='left')\
    .drop(['Index', 'variable'], axis=1)

   0  1     value
0  1  2  0.502449
1  1  1  0.065537
2  0  3  0.161616
3  1  2  0.502449
4  2  1  0.926429
5  0  2  0.139027
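
As an alternative sketch (not the method used in the answer above), the same lookup can be written with NumPy integer-array indexing, which skips the reshape and the merge. It assumes, as in the question's example, that column 0 of bar holds 0-based row positions in foo and column 1 holds 1-based column numbers. foo is rebuilt from the printed values so the result can be compared with the desired output; rows and cols are illustrative names.

```python
import numpy as np
import pandas as pd

# Frames rebuilt from the values shown in the question.
foo = pd.DataFrame(
    [[0.661546, 0.554032, 0.753549],
     [0.063641, 0.490173, 0.998119],
     [0.370046, 0.424208, 0.125751]],
    columns=["col_1", "col_2", "col_3"],
)
bar = pd.DataFrame([[1, 2], [1, 1], [0, 3], [1, 2], [2, 1], [0, 2]])

# Assumption: bar[0] is a 0-based row position, bar[1] a 1-based column number.
rows = bar[0].to_numpy()
cols = bar[1].to_numpy() - 1  # shift to 0-based positions

# Pairing two integer arrays picks one element per (row, col) pair.
bar["anything"] = foo.to_numpy()[rows, cols]
print(bar)
```

On these inputs the new column matches the desired result from the question. Because this indexes by position, it only applies when bar stores positions rather than arbitrary labels; the melt and merge approach is the more general of the two.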
