python pandas: merging 2 dataframes
I want to join two DataFrames like the ones below using Python's pandas:
import pandas as pd

customer_orders = pd.DataFrame({'customerID': [1, 2, 2, 1],
                                'customerName': ['John', 'Anna', 'Anna', 'John'],
                                'customerAge': [21, 45, 45, 21],
                                'orderID': [255, 256, 257, 258],
                                'paymentType': ['visa', 'bank', 'master', 'paypal']})
which produces:
   customerAge  customerID customerName  orderID paymentType
0           21           1         John      255        visa
1           45           2         Anna      256        bank
2           45           2         Anna      257      master
3           21           1         John      258      paypal
and
order_products = pd.DataFrame({'orderID': [255, 255, 257, 258, 255, 257],
                               'price': [9.99, 23.40, 15.89, 3.99, 89.50, 23.40],
                               'productName': ['filter', 'cosmetic', 'shampoo', 'tissues', 'elecBrush', 'cosmetic']})
which produces:
   orderID  price productName
0      255   9.99      filter
1      255  23.40    cosmetic
2      257  15.89     shampoo
3      258   3.99     tissues
4      255  89.50   elecBrush
5      257  23.40    cosmetic
into something like the following.
Expected output:
customerAge  customerID customerName  orderID paymentType  orderID  price productName
         21           1         John      255        visa      255   9.99      filter
         21           1         John      255        visa      255  23.40    cosmetic
         21           1         John      255        visa      255  89.50   elecBrush
         45           2         Anna      256        bank     null   null        null
         45           2         Anna      257      master      257  15.89     shampoo
         45           2         Anna      257      master      257  23.40    cosmetic
         21           1         John      258      paypal      258   3.99     tissues
As far as I know, this is a SQL left join. But using
all = customer_orders.join(order_products, on="orderID", how='left', lsuffix='_left', rsuffix='_right')
does not give me what I want (too few rows, and NaN instead of the values from the second table).
What am I missing?
Left? No, this is an outer join.
customer_orders.merge(order_products, on="orderID", how='outer')
   customerAge  customerID customerName  orderID paymentType  price  \
0           21           1         John      255        visa   9.99
1           21           1         John      255        visa  23.40
2           21           1         John      255        visa  89.50
3           45           2         Anna      256        bank    NaN
4           45           2         Anna      257      master  15.89
5           45           2         Anna      257      master  23.40
6           21           1         John      258      paypal   3.99

  productName
0      filter
1    cosmetic
2   elecBrush
3         NaN
4     shampoo
5    cosmetic
6     tissues
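For what it's worth, on this sample data a left merge and an outer merge should return the same rows, because every orderID in order_products also appears in customer_orders. The two would only differ if order_products contained orders with no matching customer row. Here is a minimal sketch to check that, using the `indicator=True` option of `merge`; the variable names `left` and `outer` are just for illustration.

```python
import pandas as pd

customer_orders = pd.DataFrame({
    'customerID': [1, 2, 2, 1],
    'customerName': ['John', 'Anna', 'Anna', 'John'],
    'customerAge': [21, 45, 45, 21],
    'orderID': [255, 256, 257, 258],
    'paymentType': ['visa', 'bank', 'master', 'paypal'],
})
order_products = pd.DataFrame({
    'orderID': [255, 255, 257, 258, 255, 257],
    'price': [9.99, 23.40, 15.89, 3.99, 89.50, 23.40],
    'productName': ['filter', 'cosmetic', 'shampoo', 'tissues', 'elecBrush', 'cosmetic'],
})

# Left merge: keep every row of customer_orders, matched or not.
left = customer_orders.merge(order_products, on='orderID', how='left')

# Outer merge: keep unmatched rows from both sides.
# indicator=True adds a '_merge' column saying where each row came from.
outer = customer_orders.merge(order_products, on='orderID', how='outer',
                              indicator=True)

print(len(left), len(outer))
# Rows tagged 'right_only' would be products whose order has no customer row.
print(outer['_merge'].value_counts())
```

If the `_merge` counts show no `right_only` rows, `how='left'` and `how='outer'` are interchangeable for this particular input.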
Try using merge:
all = customer_orders.merge(order_products, on="orderID", how='left')
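As for why the original `join` call returned too few rows and only NaN: `DataFrame.join` with `on="orderID"` looks up the left frame's orderID values in the index of the other frame, not in its orderID column. order_products still has the default index 0 to 5 here, so none of the order IDs match. The sketch below reproduces that and shows two ways around it; the names `attempt`, `fixed` and `merged` are only illustrative.

```python
import pandas as pd

customer_orders = pd.DataFrame({
    'customerID': [1, 2, 2, 1],
    'customerName': ['John', 'Anna', 'Anna', 'John'],
    'customerAge': [21, 45, 45, 21],
    'orderID': [255, 256, 257, 258],
    'paymentType': ['visa', 'bank', 'master', 'paypal'],
})
order_products = pd.DataFrame({
    'orderID': [255, 255, 257, 258, 255, 257],
    'price': [9.99, 23.40, 15.89, 3.99, 89.50, 23.40],
    'productName': ['filter', 'cosmetic', 'shampoo', 'tissues', 'elecBrush', 'cosmetic'],
})

# The call from the question: orderID (255, 256, ...) is matched against
# order_products' index (0..5), so no row finds a partner.
attempt = customer_orders.join(order_products, on='orderID', how='left',
                               lsuffix='_left', rsuffix='_right')
print(len(attempt), attempt['price'].isna().all())

# Option 1: make orderID the index of the right-hand frame, then join.
fixed = customer_orders.join(order_products.set_index('orderID'),
                             on='orderID', how='left')

# Option 2: merge matches column against column directly.
merged = customer_orders.merge(order_products, on='orderID', how='left')

print(len(fixed), len(merged))
```

Both options should give one row per order/product pair plus one row with NaN for the order that has no products. `merge` is usually the simpler choice when the key is an ordinary column on both sides.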