R dplyr or purrr group_by to list of vectors

I have data coming in from a database in key:value pairs, such as:

year:2012
discipline:'Chemistry'
subject:'General Chemistry'
subject:'General, Organic, and Biochemistry'

library(tibble)
# c() coerces the mixed values to character, so 2012 is stored as "2012"
incoming = tibble(field = c('year', 'discipline', 'subject', 'subject'),
                  setting = c(2012, 'Chemistry', 'General Chemistry', 'General, Organic, and Biochemistry'))

I would like to group_by the key and create a list in which each value is a vector of all the values in that group, such as:

$year = 2012
$discipline = 'Chemistry'
$subject = c('General Chemistry', 'General, Organic, and Biochemistry')

I know I could paste() and collapse them into, say, a |-separated string, and then break that back apart... but I figure there's probably a tidy function that can do it in one step. Suggestions?

I'm thinking it will be something like this, but I'm not sure what to put at the end of the pipe:

processed = incoming %>%
   group_by(field) %>%
   awesome_listmaker_function()

Base R's split() does this in one step, with no grouping needed:

split(incoming$setting, incoming$field)
# $discipline
# [1] "Chemistry"
#
# $subject
# [1] "General Chemistry"                  "General, Organic, and Biochemistry"
#
# $year
# [1] "2012"
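
If you would rather stay inside the pipe from the question, one possible sketch is to collect each group into a list-column with summarise() and then turn the two columns into a named list with tibble::deframe(). The final type.convert() step is an optional extra, not something the question asked for; it assumes you want "2012" turned back into a number.

```r
library(dplyr)
library(tibble)

# Same data as in the question, with the year written as a string
# (which is what c() produces anyway when mixing numbers and text).
incoming <- tibble(
  field   = c('year', 'discipline', 'subject', 'subject'),
  setting = c('2012', 'Chemistry', 'General Chemistry',
              'General, Organic, and Biochemistry')
)

processed <- incoming %>%
  group_by(field) %>%
  summarise(setting = list(setting)) %>%  # one row per key, values in a list-column
  deframe()                               # first column -> names, second -> values

# Optional: guess a better type for each element (e.g. "2012" becomes an integer).
processed <- lapply(processed, type.convert, as.is = TRUE)
```

The result is a named list like the one split() returns, ordered by key. For a single record the split() one-liner is shorter; the pipe version is mostly useful if you are already doing other dplyr steps.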

If you're receiving multiple groups at a time from the database, then it gets a little more complicated.
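
As a rough sketch of that more complicated case, suppose each row also carries an identifier saying which record it belongs to. The record_id column and the second record's values below are made up for illustration; the question does not say how the database marks separate groups. Under that assumption you can split twice, first by record and then by key:

```r
library(tibble)

# Hypothetical input: record_id is an assumed column, and the second
# record's values are invented purely for this example.
incoming_multi <- tibble(
  record_id = c(1, 1, 1, 1, 2, 2, 2),
  field     = c('year', 'discipline', 'subject', 'subject',
                'year', 'discipline', 'subject'),
  setting   = c('2012', 'Chemistry', 'General Chemistry',
                'General, Organic, and Biochemistry',
                '2013', 'Physics', 'Mechanics')
)

# Outer split: one data frame per record.
# Inner split: within each record, one vector of values per key.
processed_multi <- lapply(
  split(incoming_multi, incoming_multi$record_id),
  function(rec) split(rec$setting, rec$field)
)
```

Here processed_multi[["1"]] is the same list as in the single-record case, and processed_multi[["2"]]$subject holds the one subject of the second record.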
