python pandas: merging 2 dataframes
I want to join the two DataFrames shown below using Python's pandas:
import pandas as pd

customer_orders = pd.DataFrame({'customerID': [1, 2, 2, 1],
                                'customerName': ['John', 'Anna', 'Anna', 'John'],
                                'customerAge': [21, 45, 45, 21],
                                'orderID': [255, 256, 257, 258],
                                'paymentType': ['visa', 'bank', 'master', 'paypal']})
which produces:
customerAge customerID customerName orderID paymentType
0 21 1 John 255 visa
1 45 2 Anna 256 bank
2 45 2 Anna 257 master
3 21 1 John 258 paypal
and
order_products = pd.DataFrame({'orderID': [255, 255, 257, 258, 255, 257],
                               'price': [9.99, 23.40, 15.89, 3.99, 89.50, 23.40],
                               'productName': ['filter', 'cosmetic', 'shampoo', 'tissues', 'elecBrush', 'cosmetic']})
which produces:
orderID price productName
0 255 9.99 filter
1 255 23.40 cosmetic
2 257 15.89 shampoo
3 258 3.99 tissues
4 255 89.50 elecBrush
5 257 23.40 cosmetic
into something like the following.
Expected output:
customerAge customerID customerName orderID paymentType orderID price productName
21 1 John 255 visa 255 9.99 filter
21 1 John 255 visa 255 23.40 cosmetic
21 1 John 255 visa 255 89.50 elecBrush
45 2 Anna 256 bank null null null
45 2 Anna 257 master 257 15.89 shampoo
45 2 Anna 257 master 257 23.40 cosmetic
21 1 John 258 paypal 258 3.99 tissues
As far as I understand, this is a SQL left join. But using
all = customer_orders.join(order_products, on="orderID", how='left', lsuffix='_left', rsuffix='_right')
does not give me what I want (too few rows, and NaN instead of the values from the second table).
What am I missing?
Left? No, this is an outer join.
customer_orders.merge(order_products, on="orderID", how='outer')
customerAge customerID customerName orderID paymentType price \
0 21 1 John 255 visa 9.99
1 21 1 John 255 visa 23.40
2 21 1 John 255 visa 89.50
3 45 2 Anna 256 bank NaN
4 45 2 Anna 257 master 15.89
5 45 2 Anna 257 master 23.40
6 21 1 John 258 paypal 3.99
productName
0 filter
1 cosmetic
2 elecBrush
3 NaN
4 shampoo
5 cosmetic
6 tissues
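For the sample data in the question, how='left' and how='outer' should return the same rows, because every orderID in order_products also appears in customer_orders. The two only diverge when the right-hand table holds a key that the left one lacks. Here is a minimal sketch of that check; the extra row with orderID 999 and the helper normalized are hypothetical and not part of the original post.

```python
import pandas as pd

customer_orders = pd.DataFrame({'customerID': [1, 2, 2, 1],
                                'customerName': ['John', 'Anna', 'Anna', 'John'],
                                'customerAge': [21, 45, 45, 21],
                                'orderID': [255, 256, 257, 258],
                                'paymentType': ['visa', 'bank', 'master', 'paypal']})
order_products = pd.DataFrame({'orderID': [255, 255, 257, 258, 255, 257],
                               'price': [9.99, 23.40, 15.89, 3.99, 89.50, 23.40],
                               'productName': ['filter', 'cosmetic', 'shampoo',
                                               'tissues', 'elecBrush', 'cosmetic']})

left = customer_orders.merge(order_products, on='orderID', how='left')
outer = customer_orders.merge(order_products, on='orderID', how='outer')


def normalized(df):
    # Hypothetical helper: sort rows so the comparison ignores row order.
    return df.sort_values(['orderID', 'price']).reset_index(drop=True)


# On the sample data both join types contain the same rows.
same_on_sample = normalized(left).equals(normalized(outer))

# Hypothetical extra product row whose order is absent from customer_orders.
extra = pd.DataFrame({'orderID': [999], 'price': [1.00], 'productName': ['sample']})
order_products_extra = pd.concat([order_products, extra], ignore_index=True)

left_extra = customer_orders.merge(order_products_extra, on='orderID', how='left')
outer_extra = customer_orders.merge(order_products_extra, on='orderID', how='outer')

# The left join drops the unmatched product row; the outer join keeps it.
print(same_on_sample, len(left_extra), len(outer_extra))
```

If order_products can never contain an order that is missing from customer_orders, either join type works; otherwise how='left' is the one that matches the expected output in the question.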
Try using merge:
all = customer_orders.merge(order_products, on="orderID", how='left')
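A likely explanation for why the join call in the question fell short: DataFrame.join matches the column named in on against the index of the other frame, not against its orderID column. The index of order_products is simply 0 to 5, so none of the order IDs finds a partner. The sketch below reproduces that and shows two ways around it; the variable names attempt, via_merge and via_join are illustrative only.

```python
import pandas as pd

customer_orders = pd.DataFrame({'customerID': [1, 2, 2, 1],
                                'customerName': ['John', 'Anna', 'Anna', 'John'],
                                'customerAge': [21, 45, 45, 21],
                                'orderID': [255, 256, 257, 258],
                                'paymentType': ['visa', 'bank', 'master', 'paypal']})
order_products = pd.DataFrame({'orderID': [255, 255, 257, 258, 255, 257],
                               'price': [9.99, 23.40, 15.89, 3.99, 89.50, 23.40],
                               'productName': ['filter', 'cosmetic', 'shampoo',
                                               'tissues', 'elecBrush', 'cosmetic']})

# The call from the question: orderID values (255..258) are looked up in the
# index of order_products (0..5), so nothing matches and the right side is NaN.
attempt = customer_orders.join(order_products, on='orderID', how='left',
                               lsuffix='_left', rsuffix='_right')
print(len(attempt), attempt['price'].isna().all())

# Option 1: merge joins column to column.
via_merge = customer_orders.merge(order_products, on='orderID', how='left')

# Option 2: keep join, but move orderID into the index of the right frame first.
via_join = customer_orders.join(order_products.set_index('orderID'),
                                on='orderID', how='left')

print(len(via_merge), len(via_join))
```

Both options should yield one row per matching product plus a single NaN row for order 256, which has no products.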