
r: can someone explain this dplyr code to me?

Pretty straightforward question. Here is the code:

library(dplyr)
library(tidyr)

mtcars %>% group_by(gear) %>% select(hp, disp) %>% 
summarise_all(funs(n=sum(!is.na(.)), mean=mean(.,na.rm=T))) %>% 
gather(variable, value, -gear) %>% 
arrange(gear, sub('_.*', '', variable), sub('.*_', '', variable)) %>%
separate(variable, into = c('var', 'metric'), '_')

I understand everything up to the gather statement. I'm not familiar with these functions and the help files are not very useful.

Can anyone walk me through this? I'd like to build a function around these commands, but I need to understand how this all works before doing that.

gather moves the data from "wide" format to "long" format; -gear means don't gather gear. gather stacks the remaining columns into two columns: variable, which holds the old column names, and value, which holds their values.
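To make the gather step concrete, here is a minimal sketch on a toy data frame. The data frame `wide` and its numbers are made up for illustration and are not the actual summary of mtcars; only the column naming pattern (name, underscore, metric) mirrors the code in the question.

```r
library(tidyr)

# Hypothetical toy data (not from the question): one row per gear,
# with summary columns named <var>_<metric>.
wide <- data.frame(
  gear    = c(3, 4),
  hp_n    = c(15, 12),
  hp_mean = c(176, 90)
)

# Stack every column except gear into a variable/value pair.
long <- gather(wide, variable, value, -gear)
print(long)
```

Each original column contributes one block of rows, so the two rows of `wide` become four rows in `long`. Note that current tidyr documentation marks gather as superseded by pivot_longer; if you build a function around this, something like `pivot_longer(wide, -gear, names_to = "variable", values_to = "value")` should be a close equivalent, although the row order it returns differs.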

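arrange just sorts the rows by gear and then by variable. The two sub calls sort by the part of the name before the underscore and then by the part after it, which for these column names gives the same order as sorting by variable directly, so they are redundant here; you could change the arrange line to arrange(gear, variable).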
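The point about the sub calls can be checked with a small sketch. The data frame `long` below is a hypothetical stand-in for the output of the gather step, with illustrative values; the names `by_parts` and `by_name` are introduced here only for the comparison.

```r
library(dplyr)

# Hypothetical long-format rows, shaped like the output of the gather step.
long <- data.frame(
  gear     = c(3, 3, 3, 3),
  variable = c("hp_n", "disp_n", "hp_mean", "disp_mean"),
  value    = c(15, 15, 176, 326),
  stringsAsFactors = FALSE
)

# Ordering used in the question: by the part before "_", then the part after.
by_parts <- arrange(long, gear, sub('_.*', '', variable), sub('.*_', '', variable))

# Simpler ordering: by the full name.
by_name <- arrange(long, gear, variable)

# Compare the two row orders.
same_order <- identical(by_parts$variable, by_name$variable)
print(same_order)
```

For these four names both versions give disp_mean, disp_n, hp_mean, hp_n. This is a check for this naming pattern only; with other column names (for example, one name being a prefix of another) the two orderings could differ, so it is worth re-checking before generalising the code into a function.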

separate splits the variable column into two columns, var and metric, using _ as the delimiter.
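As a rough illustration of the separate step, the sketch below applies the same call to a hypothetical two-row data frame; the values are made up, and `out` is just a name chosen for the result.

```r
library(tidyr)

# Hypothetical long-format rows with names of the form <var>_<metric>.
long <- data.frame(
  gear     = c(3, 3),
  variable = c("hp_n", "hp_mean"),
  value    = c(15, 176),
  stringsAsFactors = FALSE
)

# Split "hp_n" into var = "hp" and metric = "n"; the variable column is replaced.
out <- separate(long, variable, into = c('var', 'metric'), sep = '_')
print(out)
```

The result has the columns gear, var, metric and value. One thing to watch when wrapping this in a function: the separator is matched everywhere it occurs, so a source column whose own name contains an underscore would split into more than two pieces, and with the default settings separate warns and drops the extra pieces.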


 