繁体   English   中英

如何在R中将订单数据合并为一行

[英]How to combine order data into one row in R

我有一个如下所示的数据集。 数据集取自 Kaggle:

https://www.kaggle.com/aksha17/superstore-sales

  head(supstore_df, 10)
   ï..Row.ID       Order.ID Order.Date Ship.Date      Ship.Mode Customer.ID   Customer.Name
1          1 CA-2016-152156   08-11-16  11-11-16   Second Class    CG-12520     Claire Gute
2          2 CA-2016-152156   08-11-16  11-11-16   Second Class    CG-12520     Claire Gute
3          3 CA-2016-138688   12-06-16  16-06-16   Second Class    DV-13045 Darrin Van Huff
4          4 US-2015-108966   11-10-15  18-10-15 Standard Class    SO-20335  Sean O'Donnell
5          5 US-2015-108966   11-10-15  18-10-15 Standard Class    SO-20335  Sean O'Donnell
6          6 CA-2014-115812   09-06-14  14-06-14 Standard Class    BH-11710 Brosina Hoffman
7          7 CA-2014-115812   09-06-14  14-06-14 Standard Class    BH-11710 Brosina Hoffman
8          8 CA-2014-115812   09-06-14  14-06-14 Standard Class    BH-11710 Brosina Hoffman
9          9 CA-2014-115812   09-06-14  14-06-14 Standard Class    BH-11710 Brosina Hoffman
10        10 CA-2014-115812   09-06-14  14-06-14 Standard Class    BH-11710 Brosina Hoffman
     Segment       Country            City      State Postal.Code Region      Product.ID
1   Consumer United States       Henderson   Kentucky       42420  South FUR-BO-10001798
2   Consumer United States       Henderson   Kentucky       42420  South FUR-CH-10000454
3  Corporate United States     Los Angeles California       90036   West OFF-LA-10000240
4   Consumer United States Fort Lauderdale    Florida       33311  South FUR-TA-10000577
5   Consumer United States Fort Lauderdale    Florida       33311  South OFF-ST-10000760
6   Consumer United States     Los Angeles California       90032   West FUR-FU-10001487
7   Consumer United States     Los Angeles California       90032   West OFF-AR-10002833
8   Consumer United States     Los Angeles California       90032   West TEC-PH-10002275
9   Consumer United States     Los Angeles California       90032   West OFF-BI-10003910
10  Consumer United States     Los Angeles California       90032   West OFF-AP-10002892
          Category Sub.Category
1        Furniture    Bookcases
2        Furniture       Chairs
3  Office Supplies       Labels
4        Furniture       Tables
5  Office Supplies      Storage
6        Furniture  Furnishings
7  Office Supplies          Art
8       Technology       Phones
9  Office Supplies      Binders
10 Office Supplies   Appliances
                                                       Product.Name    Sales Quantity
1                                 Bush Somerset Collection Bookcase 261.9600        2
2       Hon Deluxe Fabric Upholstered Stacking Chairs, Rounded Back 731.9400        3
3         Self-Adhesive Address Labels for Typewriters by Universal  14.6200        2
4                     Bretford CR4500 Series Slim Rectangular Table 957.5775        5
5                                    Eldon Fold 'N Roll Cart System  22.3680        2
6  Eldon Expressions Wood and Plastic Desk Accessories, Cherry Wood  48.8600        7
7                                                        Newell 322   7.2800        4
8                                    Mitel 5320 IP Phone VoIP phone 907.1520        6
9              DXL Angle-View Binders with Locking Rings by Samsill  18.5040        3
10                                 Belkin F5C206VTEL 6 Outlet Surge 114.9000        5
   Discount    Profit
1      0.00   41.9136
2      0.00  219.5820
3      0.00    6.8714
4      0.45 -383.0310
5      0.20    2.5164
6      0.00   14.1694
7      0.00    1.9656
8      0.20   90.7152
9      0.20    5.7825
10     0.00   34.4700
> 

我想将同一个订单的行合并成一行,并且订购的项目在该行的单个字段中连接。

我试过这个:

df_supstore_order_list <- ddply(supstore_dfcopy,c("Customer.Name", "Product.Name", "Customer.ID", "Segment", "Category", "Sub.Category"), function(supstore_dfcopy)paste(supstore_dfcopy$Product.Name,supstore_dfcopy$Customer.ID, supstore_dfcopy$Segment, supstore_dfcopy$Category, supstore_dfcopy$Sub.Category                                               collapse = ","))

但生成的数据框如下所示:

head(df_supstore_order_list, 5)
    Customer.Name                                                Product.Name Customer.ID
1     Claire Gute                           Bush Somerset Collection Bookcase    CG-12520
2     Claire Gute Hon Deluxe Fabric Upholstered Stacking Chairs, Rounded Back    CG-12520
3 Darrin Van Huff   Self-Adhesive Address Labels for Typewriters by Universal    DV-13045
4  Sean O'Donnell               Bretford CR4500 Series Slim Rectangular Table    SO-20335
5  Sean O'Donnell                              Eldon Fold 'N Roll Cart System    SO-20335
    Segment        Category Sub.Category
1  Consumer       Furniture    Bookcases
2  Consumer       Furniture       Chairs
3 Corporate Office Supplies       Labels
4  Consumer       Furniture       Tables
5  Consumer Office Supplies      Storage
                                                                                                   V1
1                             Bush Somerset Collection Bookcase CG-12520 Consumer Furniture Bookcases
2      Hon Deluxe Fabric Upholstered Stacking Chairs, Rounded Back CG-12520 Consumer Furniture Chairs
3 Self-Adhesive Address Labels for Typewriters by Universal DV-13045 Corporate Office Supplies Labels
4                    Bretford CR4500 Series Slim Rectangular Table SO-20335 Consumer Furniture Tables
5                            Eldon Fold 'N Roll Cart System SO-20335 Consumer Office Supplies Storage

如您所见,客户名称等并未按照我的要求合并到一个列中。 关于如何做到这一点的任何建议?

虽然不太清楚你想结合什么,但据我所知你想结合

  1. 客户标识符为 1 列
  2. 产品标识符为 1 列。

IE。 对于同一个人下多个订单,您希望将这些订单合并到一行中。

因此,

Name   ID   Item       Category    ItemsOrdered
John    1   book ----> John, 1     book, toy
John    1   toy

所以假设这个假设是正确的(如果不正确,请告诉我)。

df <- data.frame(name = c('John', 'John', 'Jane', 'Jane'), id = c(1, 1, 2, 2), item = c('chair' , 'desk', 'hat' , 'shirt'))

df %>% 
  # Group by columns that identify Items you would like in the same row
  group_by(name, id) %>% 
  # paste together all items with ", "
  summarise(ItemsOrdered = paste(item, collapse = ', ')) %>% 
  # Unite the Columns you grouped by
  unite(col = Category, name, id, sep = ', ') 

# A tibble: 2 x 2
  Category ItemsOrdered
  <chr>    <chr>       
1 Jane, 2  hat, shirt  
2 John, 1  chair, desk 

暂无
暂无

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM