
Is there any method to combine 2 pandas dataframes?

I want to combine 2 dataframes, but there is one problem: how do I combine 2 tables when the single value in one table has to be repeated for every row of the other table?

I have tried pandas.concat and pandas.merge.

df1={'id':[1]}
df2={'Brand':['volvo','audi'],
     'Price':[20000,30000]}

pd.concat([df1],[df2])

I expect the table to look like this:

id  Brand   Price
1   volvo   20,000
1   audi    30,000

That is, id 1 has both cars.

You have to decide how you want to merge if df1 holds multiple values. If there is just one, you can simply assign a temporary key to both dataframes:

df1['key'] = 1
df2['key'] = 1

Perform a merge on the temp key, then drop the temporary key:

df1.merge(df2).drop(columns=['key'])

Output:

   id  Brand  Price
0   1  volvo  20000
1   1   audi  30000
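
On pandas 1.2 or later, the temporary key can probably be skipped altogether, because merge accepts how='cross'. A minimal sketch, assuming df1 and df2 are built as real DataFrames from the data in the question:

```python
import pandas as pd

df1 = pd.DataFrame({'id': [1]})
df2 = pd.DataFrame({'Brand': ['volvo', 'audi'], 'Price': [20000, 30000]})

# how='cross' pairs every row of df1 with every row of df2 (needs pandas >= 1.2),
# so no temporary key column has to be added and dropped
result = df1.merge(df2, how='cross')
print(result)
```

On older pandas versions how='cross' is not available, and the temporary key approach above is the usual fallback.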

But you are performing a Cartesian product, so if there are multiple values in df1, e.g. [1, 2], every row of df2 is repeated once per id:

   id  Brand  Price
0   1  volvo  20000
1   1   audi  30000
2   2  volvo  20000
3   2   audi  30000
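
To make the growth concrete, here is a small sketch of the same temporary key merge with two ids. The name product is only illustrative, and assign is used so the original frames are left untouched:

```python
import pandas as pd

df1 = pd.DataFrame({'id': [1, 2]})
df2 = pd.DataFrame({'Brand': ['volvo', 'audi'], 'Price': [20000, 30000]})

# Add the same constant key to both sides, merge on it, then drop it again
product = (
    df1.assign(key=1)
       .merge(df2.assign(key=1), on='key')
       .drop(columns=['key'])
)

# A Cartesian product always has len(df1) * len(df2) rows
print(len(product), len(df1) * len(df2))
print(product)
```

The row count is the product of the two lengths, so this only stays manageable while at least one of the frames is small.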

This is my current solution:

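import pandas as pd

df1 = {'id': [1]}
df2 = {'Brand': ['Volvo', 'Heizen', 'Eizen'],
       'Price': [20000, 30000, 40000]}

person = pd.DataFrame(df1)
car = pd.DataFrame(df2)
person_id = person.loc[0, 'id']  # the single id value
car.insert(0, "id", person_id)   # the scalar is repeated for every row
print(car)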

This is my output:

   id   Brand  Price
0   1   Volvo  20000
1   1  Heizen  30000
2   1   Eizen  40000

This gives me the expected table, but is there a better solution?
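
One possible variation, sketched here under the assumption that there is exactly one id: DataFrame.assign repeats a scalar in the same way as insert, but returns a new frame instead of modifying car in place. The name person_id is introduced only for this example.

```python
import pandas as pd

person = pd.DataFrame({'id': [1]})
car = pd.DataFrame({'Brand': ['Volvo', 'Heizen', 'Eizen'],
                    'Price': [20000, 30000, 40000]})

# Take the single id value; this assumes person has exactly one row
person_id = person['id'].iloc[0]

# assign appends the new column at the end, so reorder to put id first
result = car.assign(id=person_id)[['id', 'Brand', 'Price']]
print(result)
```

Whether this is better than insert is mostly a matter of taste: the result is the same table, but car itself stays unchanged.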

In the question, df1 and df2 are plain dicts, not pandas DataFrames. Create the DataFrames first:

data1={'id':[1]}
data2={'Brand':['volvo','audi'],'Price':[20000,30000]} 

df1 = pd.DataFrame(data1) #creating dataframes
df2 = pd.DataFrame(data2)
frames = [df1,df2]

Concatenating them with

pd.concat(frames, sort=False)

yields:

    id  Brand    Price
0  1.0    NaN      NaN
0  NaN  volvo  20000.0
1  NaN   audi  30000.0
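
The row-wise concat stacks the frames on top of each other and fills the columns each one lacks with NaN, which is why it does not give the expected table. A rough sketch of one way to adapt it, assuming df1 has exactly one row: concatenate column-wise with axis=1, then forward-fill the id. The name combined is illustrative only.

```python
import pandas as pd

df1 = pd.DataFrame({'id': [1]})
df2 = pd.DataFrame({'Brand': ['volvo', 'audi'], 'Price': [20000, 30000]})

# axis=1 places the frames side by side, aligned on the index;
# df1 has only index 0, so id is NaN for the remaining rows
combined = pd.concat([df1, df2], axis=1)

# Forward-fill the single id down the column and restore the integer dtype
combined['id'] = combined['id'].ffill().astype(int)
print(combined)
```

This relies on both frames sharing the default integer index and on df1 holding a single value. With several ids, a merge is the safer choice.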

