
Python/Pandas: add value from one df to end of row in another df if there is a match

I need to return the value of one column in df1 and append it to a row in df2 whenever a value from df2 appears in df1.

Sample code

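import pandas as pd

df1 = pd.DataFrame(
        {
        'terms' : ['term1','term2'],
        'code1': ['1234x', '4321y'],
        'code2': ['2345x','5432y'],
        'code3': ['3456x','6543y']
        }
        )
# Put 'terms' first; naming the columns explicitly keeps the order
# independent of how the pandas version orders dict keys
df1 = df1[['terms', 'code1', 'code2', 'code3']]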

df2 = pd.DataFrame(
        {
        'name': ['Dan','Sara','Conroy'],
        'rate': ['3','3.5','5.2'],
        'location': ['FL','OH','NM'],
        'code': ['4444g','6543y','2345x']                           
         })
df2 = df2[['name','rate','location','code']]

I merge the "code" columns into a new column. The result is the value I want to add to the rows in the second dataframe where there is a match.

df1['allcodes'] = df1[df1.columns[1:]].apply(lambda x: ','.join(x.dropna().astype(str)),axis=1)

Now df1 looks like:

   terms  code1  code2  code3           allcodes
0  term1  1234x  2345x  3456x  1234x,2345x,3456x
1  term2  4321y  5432y  6543y  4321y,5432y,6543y

What I need to do is this: if df2['code'] appears in df1['allcodes'], add the corresponding value of allcodes to the end of the matching row in df2.

The end result should be:

     name rate location   code           allcodes
0    Sara  3.5       OH  6543y  4321y,5432y,6543y
1  Conroy  5.2       NM  2345x  1234x,2345x,3456x

Dan shouldn't be in there because his code isn't in df1.

I have looked at merge/join/concat, but as the tables are different sizes and the code from df2 can appear in any of several columns in df1, I don't see how to use those functions.

Is it time for a lambda function, maybe with map? Any thoughts appreciated.

Setup

df1
   terms  code1  code2  code3
0  term1  1234x  2345x  3456x
1  term2  4321y  5432y  6543y

df2
     name rate location   code
0     Dan    3       FL  4444g
1    Sara  3.5       OH  6543y
2  Conroy  5.2       NM  2345x

At the cost of some extra memory, one fast way to do this is to build two mappings and then chain two map calls.

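# code -> term (keyword arguments, since recent pandas no longer accepts a positional axis here)
m1 = df1.melt('terms').drop(columns='variable').set_index('value').terms
# term -> comma-joined code group
m2 = df1.set_index('terms').apply(lambda x: \
                      ','.join(x.values.ravel()), axis=1)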


df2['allcodes'] = df2.code.map(m1).map(m2)
df2 = df2.dropna(subset=['allcodes']) 

df2   
     name rate location   code           allcodes
1    Sara  3.5       OH  6543y  4321y,5432y,6543y
2  Conroy  5.2       NM  2345x  1234x,2345x,3456x

Details

m1 
value
1234x    term1
4321y    term2
2345x    term1
5432y    term2
3456x    term1
6543y    term2
Name: terms, dtype: object

m2
terms
term1    1234x,2345x,3456x
term2    4321y,5432y,6543y
dtype: object

m1 maps each code to its term, and m2 maps each term to its comma-joined code group.
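To make the chaining concrete, here is a minimal self-contained sketch that rebuilds both mappings and inspects the intermediate result of each map call. The names step1 and step2 are introduced only for illustration and are not part of the answer above. It also assumes every code belongs to exactly one term, so that m1 has a unique index.

```python
import pandas as pd

df1 = pd.DataFrame({
    'terms': ['term1', 'term2'],
    'code1': ['1234x', '4321y'],
    'code2': ['2345x', '5432y'],
    'code3': ['3456x', '6543y'],
})
df2 = pd.DataFrame({
    'name': ['Dan', 'Sara', 'Conroy'],
    'rate': ['3', '3.5', '5.2'],
    'location': ['FL', 'OH', 'NM'],
    'code': ['4444g', '6543y', '2345x'],
})

# code -> term
m1 = df1.melt('terms').drop(columns='variable').set_index('value')['terms']
# term -> comma-joined code group
m2 = df1.set_index('terms').apply(lambda row: ','.join(row), axis=1)

# First hop: Dan's code is not in m1, so his entry becomes NaN
step1 = df2['code'].map(m1)
# Second hop: NaN has no entry in m2 either, so it stays NaN
step2 = step1.map(m2)

print(step1)
print(step2)
```

Because unmatched codes end up as NaN after both hops, the later dropna(subset=['allcodes']) call is what removes Dan's row.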

Simple solution.

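# Each row of df1 (without 'terms') as a list of codes
xx = df1.set_index('terms').values.tolist()
# For every code in df2, collect the df1 rows that contain it
df2['New'] = df2.code.apply(lambda x: [y for y in xx if x in y])
# Keep only matched rows; copy so the next assignment does not warn
df2 = df2[df2.New.apply(len) > 0].copy()
# Join the first matching row into a comma-separated string
df2['New'] = df2.New.apply(lambda rows: ','.join(rows[0]))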
df2
Out[524]: 
     name  rate location   code                New
1    Sara   3.5       OH  6543y  4321y,5432y,6543y
2  Conroy   5.2       NM  2345x  1234x,2345x,3456x
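Since the question asks whether merge can be used at all, here is one possible sketch of a merge-based alternative. It is not taken from either answer above. The idea is to reshape df1 to long form so that each code sits on its own row next to its allcodes string, then inner-merge on code. The names code_cols, lookup, and result are introduced only for this example, and the sketch assumes each code appears in a single row of df1.

```python
import pandas as pd

df1 = pd.DataFrame({
    'terms': ['term1', 'term2'],
    'code1': ['1234x', '4321y'],
    'code2': ['2345x', '5432y'],
    'code3': ['3456x', '6543y'],
})
df2 = pd.DataFrame({
    'name': ['Dan', 'Sara', 'Conroy'],
    'rate': ['3', '3.5', '5.2'],
    'location': ['FL', 'OH', 'NM'],
    'code': ['4444g', '6543y', '2345x'],
})

code_cols = ['code1', 'code2', 'code3']

# Build the comma-joined group per row, then melt so each code
# gets its own row alongside the group it belongs to
lookup = (
    df1.assign(allcodes=df1[code_cols].apply(','.join, axis=1))
       .melt(id_vars='allcodes', value_vars=code_cols, value_name='code')
       [['code', 'allcodes']]
)

# An inner merge keeps only df2 rows whose code exists in df1
result = df2.merge(lookup, on='code', how='inner')
print(result)
```

The inner join drops Dan without a separate dropna step, and the merged frame gets a fresh 0-based index.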


