I want to combine rows that have almost the same values, merging the values that differ so that I don't lose information I want to analyse later.
I have the following dataset:
SessionId  Client id  Product_type  Item quantity
1          1          Couch         1
1          1          Table         1
2          2          Couch         1
2          2          Chair         5
I want to have an output like:
SessionId  Client id  Product_type  Item quantity
1          1          Couch, Table  2
2          2          Couch, Chair  6
So I need to merge rows based on the session id. For the product type column I want to concatenate the names into a single string, and for the item quantity column I want to sum the quantities. I have many more columns, but their values are the same within a session and can stay as they are.
Maybe I need to do it in two steps, but I'm not sure how to begin. Hopefully someone can help me out.
Try this.
library(dplyr)

d %>%
  group_by(SessionId, Client_id) %>%
  summarise(prod_type = toString(Product_type),                # "Couch, Table"
            sum_item_q = sum(Item_quantity, na.rm = TRUE))     # total per session
output as:
# A tibble: 2 x 4
# Groups: SessionId [2]
SessionId Client_id prod_type sum_item_q
<int> <int> <chr> <int>
1 1 1 Couch, Table 2
2 2 2 Couch, Chair 6
data
structure(list(SessionId = c(1L, 1L, 2L, 2L), Client_id = c(1L,
1L, 2L, 2L), Product_type = c("Couch", "Table", "Couch", "Chair"
), Item_quantity = c(1L, 1L, 1L, 5L)), row.names = c(NA, -4L), class = c("data.table",
"data.frame"))->d
This can be achieved like so
df <- read.table(text = "SessionId 'Client id' Product_type 'Item quantity'
1 1 Couch 1
1 1 Table 1
2 2 Couch 1
2 2 Chair 5", header = TRUE)
library(dplyr)

df %>%
  group_by(SessionId, Client.id) %>%
  summarise(Product_type = paste(Product_type, collapse = ", "),
            Item.quantity = sum(Item.quantity))
#> # A tibble: 2 x 4
#> # Groups: SessionId [2]
#> SessionId Client.id Product_type Item.quantity
#> <int> <int> <chr> <int>
#> 1 1 1 Couch, Table 2
#> 2 2 2 Couch, Chair 6
Created on 2020-05-23 by the reprex package (v0.3.0)
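The question also mentions further columns whose values are the same within a session. Here is a minimal sketch of one way to carry such columns through the summary, assuming dplyr 1.0.0 or later (for across() and the .groups argument). The Country column and the d2 and res names are hypothetical and only there for illustration:

```r
library(dplyr)

# Data from the question plus a hypothetical constant-per-session column
d2 <- data.frame(
  SessionId     = c(1L, 1L, 2L, 2L),
  Client_id     = c(1L, 1L, 2L, 2L),
  Product_type  = c("Couch", "Table", "Couch", "Chair"),
  Item_quantity = c(1L, 1L, 1L, 5L),
  Country       = c("NL", "NL", "BE", "BE"),   # hypothetical extra column
  stringsAsFactors = FALSE
)

res <- d2 %>%
  group_by(SessionId) %>%
  summarise(
    Product_type  = toString(Product_type),            # concatenate names
    Item_quantity = sum(Item_quantity, na.rm = TRUE),  # total quantity
    across(c(Client_id, Country), first),              # keep first value of constant columns
    .groups = "drop"
  )

res
```

Taking first() is only safe if those columns really are constant within each session; otherwise information is silently dropped.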
Base R solution:
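# Uses the data-frame method of aggregate() so that FUN is applied to each
# column separately and every column keeps its own type.
aggregate(df[c("Product_type", "Item.quantity")],
          by = df[c("SessionId", "Client.id")],
          FUN = function(x) if (is.numeric(x)) sum(x) else toString(as.character(x)))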
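The two-step idea raised in the question can also be written out explicitly in base R. This is a sketch only: it rebuilds the example data under the assumed name df2 with underscore column names, summarises each column on its own, and then joins the two results on the key columns:

```r
# Example data from the question (df2 is an assumed name for illustration)
df2 <- data.frame(
  SessionId     = c(1L, 1L, 2L, 2L),
  Client_id     = c(1L, 1L, 2L, 2L),
  Product_type  = c("Couch", "Table", "Couch", "Chair"),
  Item_quantity = c(1L, 1L, 1L, 5L),
  stringsAsFactors = FALSE
)

# Step 1: paste the product names together per session
products <- aggregate(Product_type ~ SessionId + Client_id, data = df2, FUN = toString)

# Step 2: sum the quantities per session
quantities <- aggregate(Item_quantity ~ SessionId + Client_id, data = df2, FUN = sum)

# Combine the two summaries on the shared key columns
result <- merge(products, quantities, by = c("SessionId", "Client_id"))

result
```

Splitting the work this way costs one extra merge() but keeps each aggregation simple, since every call deals with a single column type.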