Python DataFrame: prevent duplicates while concatenating

Question

I have two dataframes that I concatenate into one. The problem is that, while troubleshooting the code, I run the same concat code multiple times. Each run appends the rows again, so the dataframe ends up with the same rows repeated as many times as I ran the concat. I want to prevent this.

My code:

import pandas as pd

rdf = pd.DataFrame({'A': [10, 20]}, index=pd.date_range(start='2020-05-04 08:00:00', freq='1h', periods=2))
df2 = pd.DataFrame({'A': [30, 40]}, index=pd.date_range(start='2020-05-04 10:00:00', freq='1h', periods=2))

# Run it first time
rdf= pd.concat([rdf,df2])
# First time result
rdf
                      A
2020-05-04 08:00:00  10
2020-05-04 09:00:00  20
2020-05-04 10:00:00  30
2020-05-04 11:00:00  40

# Run it second time
rdf= pd.concat([rdf,df2])
# second time result produces duplicates
rdf
                      A
2020-05-04 08:00:00  10
2020-05-04 09:00:00  20
2020-05-04 10:00:00  30
2020-05-04 11:00:00  40
2020-05-04 10:00:00  30
2020-05-04 11:00:00  40

My solution: my approach is to write an extra line of code that drops the duplicates, keeping the first occurrence.

rdf= pd.concat([rdf,df2])
rdf.drop_duplicates(keep='first',inplace=True)
rdf
                      A
2020-05-04 08:00:00  10
2020-05-04 09:00:00  20
2020-05-04 10:00:00  30
2020-05-04 11:00:00  40
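
One thing to watch with drop_duplicates: it compares column values and ignores the index, so two rows with different timestamps but the same value in A would collapse into one. Below is a minimal sketch of an index-based alternative, assuming the timestamp index is what identifies a row. The same_values frame is made up purely for illustration and is not part of the original data.

```python
import pandas as pd

rdf = pd.DataFrame({'A': [10, 20]},
                   index=pd.date_range(start='2020-05-04 08:00:00', freq='1h', periods=2))
df2 = pd.DataFrame({'A': [30, 40]},
                   index=pd.date_range(start='2020-05-04 10:00:00', freq='1h', periods=2))

# Concatenate twice, as happens when the same cell is re-run
combined = pd.concat([rdf, df2])
combined = pd.concat([combined, df2])

# Keep the first row for each index label; column values are not compared
deduped = combined[~combined.index.duplicated(keep='first')]
print(deduped)

# Hypothetical data: distinct timestamps, identical values in A
same_values = pd.DataFrame({'A': [10, 10]},
                           index=pd.date_range(start='2020-05-04 08:00:00', freq='1h', periods=2))

# Value-based de-duplication treats the 09:00 row as a duplicate of the 08:00 row
by_value = same_values.drop_duplicates(keep='first')

# Index-based de-duplication keeps both rows because the timestamps differ
by_index = same_values[~same_values.index.duplicated(keep='first')]
```

Which of the two is appropriate depends on whether a repeated value at a different timestamp counts as a real observation in your data.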

Is there a better approach? I mean, can we prevent the duplicates during the concat itself, so that there is no need to write an extra line of code to drop them?

Answer 1

Then let us try combine_first, which can be run repeatedly without adding duplicate rows:

rdf = rdf.combine_first(df2)
rdf = rdf.combine_first(df2)
rdf
Out[115]: 
                        A
2020-05-04 08:00:00  10.0
2020-05-04 09:00:00  20.0
2020-05-04 10:00:00  30.0
2020-05-04 11:00:00  40.0
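
Why this works: combine_first takes the union of the two indexes and keeps rdf's value wherever it already has one, filling only the missing entries from df2. Repeating the call therefore finds nothing left to fill. The following is a small sketch to check that, reusing the frames from the question; the names once and twice are only for illustration. As the output above shows, the column may come back as float, and that can depend on the pandas version.

```python
import pandas as pd

rdf = pd.DataFrame({'A': [10, 20]},
                   index=pd.date_range(start='2020-05-04 08:00:00', freq='1h', periods=2))
df2 = pd.DataFrame({'A': [30, 40]},
                   index=pd.date_range(start='2020-05-04 10:00:00', freq='1h', periods=2))

# First call: rows from df2 whose index labels are missing in rdf get added
once = rdf.combine_first(df2)

# Second call: every label from df2 is already present, so nothing is added
twice = once.combine_first(df2)

print(len(once), len(twice))
print(twice)
```

Note that this is not a drop-in replacement for concat: if both frames hold different values for the same timestamp, combine_first keeps the value from the calling frame (rdf here), whereas concat would keep both rows.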

solution1
2 ACCEPTED 2021-05-10 00:29:47