
Append row values from one df to another if no duplicates in pandas

I have these two dfs:


import pandas as pd

df1 = pd.DataFrame({'pupil': ["sarah", "john", "fred"],
                  'class': ["1a", "1a", "1a"]})


df2 = pd.DataFrame({'pupil_mixed': ["sarah", "john", "lex"],
                  'class': ["1a", "1c", "1a"]})


I want to append the values from the column "pupil_mixed" in df2 to the column "pupil" in df1, but only the values that are not already present in df1.

Desired outcome:

import numpy as np

df1 = pd.DataFrame({'pupil': ["sarah", "john", "fred", "lex"],
                  'class': ["1a", "1a", "1a", np.nan]})


I used append with loc:

df1 = df1.append(df2.loc[df2['pupil_mixed'] != df1['pupil'] ])

That just added the other column to the df and filled the non-matching cells with NaN:

    pupil   class   pupil_mixed
0   sarah   1a      NaN
1   john    1a      NaN
2   fred    1a      NaN
2   NaN     1a      lex
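
A minimal sketch of why the attempt above ends up with an extra column, using `pd.concat` as a stand-in for `append` (an assumption made here so the snippet also runs on pandas versions where `DataFrame.append` is no longer available). The names `mask`, `new_rows` and `stacked` are illustrative only. The `!=` comparison works row by row, and stacking aligns on column names, so `pupil_mixed` has no counterpart in df1:

```python
import pandas as pd

df1 = pd.DataFrame({'pupil': ["sarah", "john", "fred"],
                    'class': ["1a", "1a", "1a"]})
df2 = pd.DataFrame({'pupil_mixed': ["sarah", "john", "lex"],
                    'class': ["1a", "1c", "1a"]})

# Row-by-row comparison: it only asks whether row i of df2 differs from
# row i of df1, not whether the name occurs anywhere in df1.
mask = df2['pupil_mixed'] != df1['pupil']
new_rows = df2.loc[mask]

# Stacking aligns on column names. 'pupil_mixed' does not exist in df1,
# so it becomes a third column and the gaps are filled with NaN.
stacked = pd.concat([df1, new_rows])
print(stacked)
```

So the column has to be renamed to `pupil` before stacking, and the membership test has to look at the whole column rather than at the row in the same position.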




You could use concat + drop_duplicates:

res = pd.concat((df1, df2['pupil_mixed'].to_frame('pupil'))).drop_duplicates('pupil')

print(res)

Output:

   pupil class
0  sarah    1a
1   john    1a
2   fred    1a
2    lex   NaN
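
Note that the index label 2 appears twice in that output, because concat keeps the original labels of both frames. If a clean 0..n-1 index is wanted, one possible variant (a sketch, not part of the original answer) resets the index after dropping the duplicates:

```python
import pandas as pd

df1 = pd.DataFrame({'pupil': ["sarah", "john", "fred"],
                    'class': ["1a", "1a", "1a"]})
df2 = pd.DataFrame({'pupil_mixed': ["sarah", "john", "lex"],
                    'class': ["1a", "1c", "1a"]})

# Single-column frame whose column is named like the target column in df1
extra = df2['pupil_mixed'].to_frame('pupil')

# Stack, keep the first occurrence of each pupil (the df1 rows come first),
# then renumber the rows so that index labels are unique again
res = (pd.concat([df1, extra])
         .drop_duplicates('pupil')
         .reset_index(drop=True))
print(res)
```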

As an alternative you could filter first (with isin) and then concat:

# keep only the rows of df2 whose pupil does not already appear in df1
filtered = df2.loc[~df2['pupil_mixed'].isin(df1['pupil'])]

# turn pupil_mixed into a single-column DataFrame named pupil and stack it under df1
res = pd.concat((df1, filtered['pupil_mixed'].to_frame('pupil')))

print(res)

Both solutions use to_frame with the name parameter, which effectively renames the column.
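
For reuse, the filter-then-concat idea can be wrapped in a small helper. This is only a sketch: the function name `append_new_values` and its parameters are made up for illustration and are not part of pandas.

```python
import pandas as pd


def append_new_values(target, source, target_col, source_col):
    """Return target plus the values of source[source_col] it does not contain yet."""
    # True for source rows whose value is missing from the target column
    is_new = ~source[source_col].isin(target[target_col])
    # Rename via to_frame so concat lines the values up under target_col;
    # drop_duplicates guards against repeats inside the source column itself
    new_rows = source.loc[is_new, source_col].drop_duplicates().to_frame(target_col)
    # ignore_index renumbers the rows 0..n-1
    return pd.concat([target, new_rows], ignore_index=True)


df1 = pd.DataFrame({'pupil': ["sarah", "john", "fred"],
                    'class': ["1a", "1a", "1a"]})
df2 = pd.DataFrame({'pupil_mixed': ["sarah", "john", "lex"],
                    'class': ["1a", "1c", "1a"]})

res = append_new_values(df1, df2, 'pupil', 'pupil_mixed')
print(res)
```

Unlike the drop_duplicates variant, this leaves any duplicates that already exist inside df1 untouched; it only decides which rows of df2 get added.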

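import numpy as np
import pandas as pd

# tag each frame so every row can be traced back to its source
df1['tag'] = 1
df2['tag'] = 2

# give df2 the same column names as df1, then stack the two frames
# (DataFrame.append was removed in pandas 2.0; pd.concat replaces it)
df2.columns = df1.columns
df1 = pd.concat([df1, df2])
# for duplicated pupils keep the first occurrence, i.e. the row from df1
df1 = df1.drop_duplicates('pupil', keep='first').reset_index(drop=True)

# rows that came from df2 (tag == 2) get a null class
cond = df1['tag'] == 2
df1.loc[cond, 'class'] = np.nan
del df1['tag']

print(df1)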

Output:

   pupil class
0  sarah    1a
1   john    1a
2   fred    1a
3    lex   NaN

You could use a merge after renaming pupil_mixed in df2:

df1.merge(df2["pupil_mixed"].rename("pupil"), how="outer")

   pupil class
0  sarah    1a
1   john    1a
2   fred    1a
3    lex   NaN
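
A slightly more explicit sketch of the merge approach, under two assumptions that go beyond the original answer: the join key is spelled out with `on="pupil"`, and duplicates inside pupil_mixed are dropped first so that a repeated name in df2 cannot multiply rows of df1. The row order of an outer merge may differ between pandas versions, so the snippet does not rely on it.

```python
import pandas as pd

df1 = pd.DataFrame({'pupil': ["sarah", "john", "fred"],
                    'class': ["1a", "1a", "1a"]})
df2 = pd.DataFrame({'pupil_mixed': ["sarah", "john", "lex"],
                    'class': ["1a", "1c", "1a"]})

# One-column frame named like the join key, without repeated names
new_names = df2["pupil_mixed"].rename("pupil").drop_duplicates().to_frame()

# Outer join keeps every pupil from both sides; pupils that only exist
# in df2 get NaN in the columns that come from df1
res = df1.merge(new_names, on="pupil", how="outer")
print(res)
```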
