String grouping (aggregation) with data.table (R 3.1.1)
Input: I have this data:
library(data.table)
ids <- c(10, 10, 10, 11, 12, 12)
items <- c('soup', 'rice', 'lemon', 'chicken', 'lamb', 'noodles')
orders <- as.data.table(list(id=ids, item=items))
> orders
id item
1: 10 soup
2: 10 rice
3: 10 lemon
4: 11 chicken
5: 12 lamb
6: 12 noodles
Goal: I need to arrive at this (all items grouped by their id):
id items
1: 10 soup,rice,lemon
2: 11 chicken
3: 12 lamb,noodles
What I did: I am using data.table on R 3.1.1 (the latest release) and tried the method below, which I expected to work:
orders[,list(items=list(item)), by=id]
But I am getting the following (incorrect) output:
id items
1: 10 lamb,noodles,lemon
2: 11 lamb,noodles,lemon
3: 12 lamb,noodles,lemon
What am I doing wrong, and what is the right way to group strings with data.table?
orders[, paste(item, collapse = ","), by = id]
## id V1
## 1: 10 soup,rice,lemon
## 2: 11 chicken
## 3: 12 lamb,noodles
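As a small variation on the call above, the result column can be given a name instead of the default V1. This is only a sketch: the column name items and the variable res are my own choices, and the data is rebuilt with data.table() so the block runs on its own.

```r
library(data.table)

# Same data as in the question, rebuilt so this block is self-contained
orders <- data.table(
  id   = c(10, 10, 10, 11, 12, 12),
  item = c("soup", "rice", "lemon", "chicken", "lamb", "noodles")
)

# Collapse the items of each id into one comma-separated string;
# naming the element inside list() names the result column
res <- orders[, list(items = paste(item, collapse = ",")), by = id]
print(res)
```

Here items is an ordinary character column with one string per id. The individual items can be recovered later with strsplit(res$items, ",") if needed.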
The syntax for what it sounds like you're looking for is a little bit awkward, but it makes sense when you think about how you would normally use list.
Try the following:
orders[, list(item = list(item)), by = "id"]
# id item
# 1: 10 soup,rice,lemon
# 2: 11 chicken
# 3: 12 lamb,noodles
str(.Last.value)
# Classes ‘data.table’ and 'data.frame': 3 obs. of 2 variables:
# $ id : num 10 11 12
# $ item:List of 3
# ..$ : chr "soup" "rice" "lemon"
# ..$ : chr "chicken"
# ..$ : chr "lamb" "noodles"
# - attr(*, ".internal.selfref")=<externalptr>
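The str() output above shows that item is a list column: each cell holds a character vector rather than a single string. A minimal sketch of working with such a column follows, assuming a data.table version in which the grouped list() call returns the correct result; the names nested, n_items, items_12 and items are hypothetical.

```r
library(data.table)

# Same data as in the question, rebuilt so this block is self-contained
orders <- data.table(
  id   = c(10, 10, 10, 11, 12, 12),
  item = c("soup", "rice", "lemon", "chicken", "lamb", "noodles")
)

# One row per id; each cell of 'item' is a character vector
nested <- orders[, list(item = list(item)), by = id]

# Number of items stored in each cell
n_items <- sapply(nested$item, length)

# Extract the items for a single id as a plain character vector
items_12 <- nested[id == 12, item][[1]]

# Flatten the list column into comma-separated strings when needed
nested[, items := sapply(item, paste, collapse = ",")]
print(nested)
```

The list column keeps the items separate for further processing, while the pasted version is easier to print or export. Which one fits depends on what happens to the result next.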