How to combine order data into one row in R
I have a dataset that looks like the one below. The dataset is taken from Kaggle:
https://www.kaggle.com/aksha17/superstore-sales
head(supstore_df, 10)
ï..Row.ID Order.ID Order.Date Ship.Date Ship.Mode Customer.ID Customer.Name
1 1 CA-2016-152156 08-11-16 11-11-16 Second Class CG-12520 Claire Gute
2 2 CA-2016-152156 08-11-16 11-11-16 Second Class CG-12520 Claire Gute
3 3 CA-2016-138688 12-06-16 16-06-16 Second Class DV-13045 Darrin Van Huff
4 4 US-2015-108966 11-10-15 18-10-15 Standard Class SO-20335 Sean O'Donnell
5 5 US-2015-108966 11-10-15 18-10-15 Standard Class SO-20335 Sean O'Donnell
6 6 CA-2014-115812 09-06-14 14-06-14 Standard Class BH-11710 Brosina Hoffman
7 7 CA-2014-115812 09-06-14 14-06-14 Standard Class BH-11710 Brosina Hoffman
8 8 CA-2014-115812 09-06-14 14-06-14 Standard Class BH-11710 Brosina Hoffman
9 9 CA-2014-115812 09-06-14 14-06-14 Standard Class BH-11710 Brosina Hoffman
10 10 CA-2014-115812 09-06-14 14-06-14 Standard Class BH-11710 Brosina Hoffman
Segment Country City State Postal.Code Region Product.ID
1 Consumer United States Henderson Kentucky 42420 South FUR-BO-10001798
2 Consumer United States Henderson Kentucky 42420 South FUR-CH-10000454
3 Corporate United States Los Angeles California 90036 West OFF-LA-10000240
4 Consumer United States Fort Lauderdale Florida 33311 South FUR-TA-10000577
5 Consumer United States Fort Lauderdale Florida 33311 South OFF-ST-10000760
6 Consumer United States Los Angeles California 90032 West FUR-FU-10001487
7 Consumer United States Los Angeles California 90032 West OFF-AR-10002833
8 Consumer United States Los Angeles California 90032 West TEC-PH-10002275
9 Consumer United States Los Angeles California 90032 West OFF-BI-10003910
10 Consumer United States Los Angeles California 90032 West OFF-AP-10002892
Category Sub.Category
1 Furniture Bookcases
2 Furniture Chairs
3 Office Supplies Labels
4 Furniture Tables
5 Office Supplies Storage
6 Furniture Furnishings
7 Office Supplies Art
8 Technology Phones
9 Office Supplies Binders
10 Office Supplies Appliances
Product.Name Sales Quantity
1 Bush Somerset Collection Bookcase 261.9600 2
2 Hon Deluxe Fabric Upholstered Stacking Chairs, Rounded Back 731.9400 3
3 Self-Adhesive Address Labels for Typewriters by Universal 14.6200 2
4 Bretford CR4500 Series Slim Rectangular Table 957.5775 5
5 Eldon Fold 'N Roll Cart System 22.3680 2
6 Eldon Expressions Wood and Plastic Desk Accessories, Cherry Wood 48.8600 7
7 Newell 322 7.2800 4
8 Mitel 5320 IP Phone VoIP phone 907.1520 6
9 DXL Angle-View Binders with Locking Rings by Samsill 18.5040 3
10 Belkin F5C206VTEL 6 Outlet Surge 114.9000 5
Discount Profit
1 0.00 41.9136
2 0.00 219.5820
3 0.00 6.8714
4 0.45 -383.0310
5 0.20 2.5164
6 0.00 14.1694
7 0.00 1.9656
8 0.20 90.7152
9 0.20 5.7825
10 0.00 34.4700
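I want to merge the rows that belong to the same order into a single row, with the ordered items concatenated in a single field of that row.
This is what I tried: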
library(plyr)

df_supstore_order_list <- ddply(
  supstore_dfcopy,
  c("Customer.Name", "Product.Name", "Customer.ID", "Segment", "Category", "Sub.Category"),
  function(supstore_dfcopy) paste(supstore_dfcopy$Product.Name, supstore_dfcopy$Customer.ID,
                                  supstore_dfcopy$Segment, supstore_dfcopy$Category,
                                  supstore_dfcopy$Sub.Category, collapse = ",")
)
But the resulting data frame looks like this:
head(df_supstore_order_list, 5)
Customer.Name Product.Name Customer.ID
1 Claire Gute Bush Somerset Collection Bookcase CG-12520
2 Claire Gute Hon Deluxe Fabric Upholstered Stacking Chairs, Rounded Back CG-12520
3 Darrin Van Huff Self-Adhesive Address Labels for Typewriters by Universal DV-13045
4 Sean O'Donnell Bretford CR4500 Series Slim Rectangular Table SO-20335
5 Sean O'Donnell Eldon Fold 'N Roll Cart System SO-20335
Segment Category Sub.Category
1 Consumer Furniture Bookcases
2 Consumer Furniture Chairs
3 Corporate Office Supplies Labels
4 Consumer Furniture Tables
5 Consumer Office Supplies Storage
V1
1 Bush Somerset Collection Bookcase CG-12520 Consumer Furniture Bookcases
2 Hon Deluxe Fabric Upholstered Stacking Chairs, Rounded Back CG-12520 Consumer Furniture Chairs
3 Self-Adhesive Address Labels for Typewriters by Universal DV-13045 Corporate Office Supplies Labels
4 Bretford CR4500 Series Slim Rectangular Table SO-20335 Consumer Furniture Tables
5 Eldon Fold 'N Roll Cart System SO-20335 Consumer Office Supplies Storage
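As you can see, the customer name and the other fields are not merged into a single column the way I wanted. Any suggestions on how to do this?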
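It is not entirely clear what you want to combine, but as far as I can tell it is this: when the same person has placed multiple orders, you want those orders merged into one row.
That is: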
Name  ID  Item           Category  ItemsOrdered
John  1   book   ---->   John, 1   book, toy
John  1   toy
Assuming that reading is correct (let me know if it is not), this should work:
library(dplyr)  # group_by(), summarise(), %>%
library(tidyr)  # unite()

df <- data.frame(name = c('John', 'John', 'Jane', 'Jane'),
                 id   = c(1, 1, 2, 2),
                 item = c('chair', 'desk', 'hat', 'shirt'))

df %>%
  # Group by the columns that identify the items you want in the same row
  group_by(name, id) %>%
  # Paste together all items in each group, separated by ", "
  summarise(ItemsOrdered = paste(item, collapse = ', '), .groups = 'drop') %>%
  # Unite the columns you grouped by into a single column
  unite(col = Category, name, id, sep = ', ')
# A tibble: 2 x 2
Category ItemsOrdered
<chr> <chr>
1 Jane, 2 hat, shirt
2 John, 1 chair, desk
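The same pattern can be sketched against the column names from the question. In the original attempt, Product.Name is one of the ddply grouping columns, so each product ends up in its own group and there is nothing left to collapse. The sketch below groups only by the columns that identify an order. The `orders` data frame is a hypothetical miniature of supstore_df built from a few values in the head() output, and the "; " separator is an assumption chosen because some product names contain commas.

```r
library(dplyr)

# Hypothetical miniature of supstore_df, using column names and values
# taken from the head() output in the question
orders <- data.frame(
  Order.ID      = c("CA-2016-152156", "CA-2016-152156", "CA-2016-138688"),
  Customer.ID   = c("CG-12520", "CG-12520", "DV-13045"),
  Customer.Name = c("Claire Gute", "Claire Gute", "Darrin Van Huff"),
  Product.Name  = c("Bush Somerset Collection Bookcase",
                    "Hon Deluxe Fabric Upholstered Stacking Chairs, Rounded Back",
                    "Self-Adhesive Address Labels for Typewriters by Universal"),
  stringsAsFactors = FALSE
)

order_list <- orders %>%
  # Group only by the columns that identify an order, not by Product.Name
  group_by(Order.ID, Customer.ID, Customer.Name) %>%
  # Collapse every product in the order into one string
  summarise(ItemsOrdered = paste(Product.Name, collapse = "; "),
            .groups = "drop")

print(order_list)
```

This yields one row per order. To carry further order-level columns (Segment, for example), add them to group_by(); columns that vary within an order, such as Category, would need their own paste() inside summarise().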