I want to combine two DataFrames, but I have one problem: how do I combine two tables when the single value in one table has to be repeated for every row of the other?
I have tried pandas.concat and pandas.merge.
import pandas as pd
df1={'id':[1]}
df2={'Brand':['volvo','audi'],
'Price':[20000,30000]}
pd.concat([df1,df2])
I expect the resulting table to look like this:
id Brand Price
1 volvo 20,000
1 audi 30,000
This means that id 1 has both cars.
You have to decide how you want to merge if df1 holds multiple values. If it holds just one, you can simply assign a temporary key to both DataFrames:
df1['key'] = 1
df2['key'] = 1
Perform a merge on the temp key, then drop the temporary key:
df1.merge(df2).drop(columns=['key'])
Output:
id Brand Price
0 1 volvo 20000
1 1 audi 30000
Note that this performs a Cartesian product, so if df1 contains multiple values, e.g. [1, 2], you will get more duplicated data:
id Brand Price
0 1 volvo 20000
1 1 audi 30000
2 2 volvo 20000
3 2 audi 30000
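The same Cartesian product can be written without a temporary key. The following is a minimal sketch, assuming pandas 1.2 or newer (where `merge` accepts `how='cross'`) and the same illustrative data as above:

```python
import pandas as pd

# Same illustrative data as in the example above.
df1 = pd.DataFrame({'id': [1, 2]})
df2 = pd.DataFrame({'Brand': ['volvo', 'audi'], 'Price': [20000, 30000]})

# how='cross' pairs every row of df1 with every row of df2,
# so no helper 'key' column has to be added and dropped.
result = df1.merge(df2, how='cross')
print(result)
```

With a single id in df1 this gives one row per car, which is the table asked for in the question.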
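This is my current solution:
import pandas as pd
df1={'id':[1]}
df2 = {'Brand':['Volvo','Heizen','Eizen'],
'Price':[20000,30000,40000]}
person=pd.DataFrame(df1)
car=pd.DataFrame(df2)
# take the single id from the first row (named person_id to avoid shadowing the built-in id)
person_id=person.loc[0].id
# insert it as the first column; the scalar is repeated for every row
car.insert(0,"id",person_id)
print(car)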
This is my output:
id Brand Price
0 1 Volvo 20000
1 1 Heizen 30000
2 1 Eizen 40000
This produces the table I expected, but is there a better solution?
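One possible alternative, sketched under the assumption that `person` always holds exactly one row, is `DataFrame.assign`, which returns a new frame instead of modifying `car` in place. The names `person_id` and `result` are only illustrative:

```python
import pandas as pd

person = pd.DataFrame({'id': [1]})
car = pd.DataFrame({'Brand': ['Volvo', 'Heizen', 'Eizen'],
                    'Price': [20000, 30000, 40000]})

# Take the single id and broadcast it to every row of `car`.
person_id = person['id'].iloc[0]

# assign() appends the column at the end, so select the columns
# in the desired order to move `id` to the front.
result = car.assign(id=person_id)[['id', 'Brand', 'Price']]
print(result)
```

Because `assign` leaves `car` untouched, the snippet can be rerun safely, whereas calling `insert` a second time raises an error when the column already exists.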
In the question, df1 and df2 are plain dictionaries, not pandas DataFrames. Build DataFrames from them first:
data1={'id':[1]}
data2={'Brand':['volvo','audi'],'Price':[20000,30000]}
df1 = pd.DataFrame(data1)  # create DataFrames from the dictionaries
df2 = pd.DataFrame(data2)
frames = [df1,df2]
Concatenating them with
pd.concat(frames, sort=False)
yields:
id Brand Price
0 1.0 NaN NaN
0 NaN volvo 20000.0
1 NaN audi 30000.0
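The NaN values appear because `pd.concat` stacks the frames vertically by default and fills every column a frame lacks with NaN. As a rough sketch of one way to reach the table from the question, assuming df1 has a single row, the frames can be placed side by side with `axis=1` and the id forward-filled:

```python
import pandas as pd

df1 = pd.DataFrame({'id': [1]})
df2 = pd.DataFrame({'Brand': ['volvo', 'audi'], 'Price': [20000, 30000]})

# axis=1 puts the frames side by side, aligned on the row index.
# Row 1 has no id after alignment, so it is NaN at this point.
result = pd.concat([df1, df2], axis=1)

# Copy the id from row 0 downwards, then restore the integer dtype
# that was lost when NaN was introduced.
result['id'] = result['id'].ffill().astype(int)
print(result)
```

This only makes sense when a single id applies to all rows; with several ids in df1, a merge is the safer choice.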