Pandas DataFrame - merge two DataFrames, but leave out entries with the same column value

Question

I'm trying to create a DataFrame out of two existing ones. I scrape the titles of some articles from the web; the first column is the title and the columns after it are timestamps.

I want to concat both DataFrames but leave out the rows that have the same title (column one).

I tried

df = pd.concat([df1,df2]).drop_duplicates().reset_index(drop=True)

but because the other columns may not always be exactly the same, drop_duplicates() does not treat those rows as duplicates. I need to leave out every row that has the same first column. How would I do this?

By the way, sorry for not knowing all the right terms for my problem.

Answer 1

You should first remove from df2 the rows whose title already appears in df1, and then concat the result with df1:

df = pd.concat([df1, df2[~df2.title.isin(df1.title)]]).reset_index(drop=True)
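A small worked example may make the filter easier to follow. This is only a sketch: the column names `title` and `first_seen`, the sample rows, and the helper names `concat_new_titles` and `concat_unique_titles` are made up for illustration and are not from the question.

```python
import pandas as pd

# Hypothetical sample data: "Article B" appears in both frames,
# but with a different timestamp, so the rows are not exact duplicates.
df1 = pd.DataFrame({
    "title": ["Article A", "Article B"],
    "first_seen": ["2020-07-10 12:00", "2020-07-10 12:05"],
})
df2 = pd.DataFrame({
    "title": ["Article B", "Article C"],
    "first_seen": ["2020-07-10 13:00", "2020-07-10 13:10"],
})


def concat_new_titles(left, right, key="title"):
    """Append only the rows of `right` whose key is not already in `left`."""
    is_new = ~right[key].isin(left[key])
    return pd.concat([left, right[is_new]]).reset_index(drop=True)


def concat_unique_titles(left, right, key="title"):
    """Alternative: concat everything, then keep the first row per key."""
    return (
        pd.concat([left, right])
        .drop_duplicates(subset=key, keep="first")
        .reset_index(drop=True)
    )


combined = concat_new_titles(df1, df2)
print(combined)
```

Both helpers keep the df1 row for "Article B" and drop the df2 one. They differ only when a frame repeats a title internally: the `isin` version leaves repeats inside df1 (and inside the new rows of df2) untouched, while `drop_duplicates(subset=...)` collapses them as well.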

Answer 2

This probably solves your problem (it drops the columns of df2 whose names also appear in df, then joins the two frames side by side):

import numpy as np
import pandas as pd

# Two small example frames that share one column name, 'blah'
df = pd.DataFrame(np.arange(2 * 5).reshape(2, 5),
                  columns=['blah1', 'blah2', 'blah3', 'blah4', 'blah'])
df2 = pd.DataFrame(np.arange(2 * 5).reshape(2, 5),
                   columns=['blah5', 'blah6', 'blah7', 'blah8', 'blah'])

# Drop from df2 every column whose name already exists in df.
# Dropping inside an index-based loop can raise IndexError once df2 shrinks,
# so compute the shared names first and drop them in one call.
shared = df2.columns.intersection(df.columns)
df2 = df2.drop(columns=shared)

print(pd.concat([df, df2], axis=1))
