R: How to create a Quartile Column within Groups

I have managed to create the column "quartile" with the following code, but I'd also like to create a column called "quartile_team" that shows the quartiles within each team. I can't figure out how to do this.

Help is appreciated,

Paul

# generate dataset
teams <- c(rep("East", 6), rep("West", 8), rep("North", 7), rep("South", 9))
time_spent <- rnorm(30)
# build the data frame directly; cbind() would coerce time_spent to character
dataset <- data.frame(teams, time_spent)

# create quartile column
dataset <- within(dataset,
                  quartile <- cut(x = time_spent,
                                  breaks = quantile(time_spent, probs = seq(0, 1, 0.25)),
                                  labels = FALSE,
                                  include.lowest = TRUE))

There are far better ways to do this, but a quick and dirty solution would probably use plyr. I'll use your function for calculating quartiles within each team:

library(plyr)

# split by team, compute quartiles within each subset, then recombine
ddply(dataset, "teams", function(team) {
  team_quartile <- cut(x = team$time_spent,
                       breaks = quantile(team$time_spent, probs = seq(0, 1, 0.25)),
                       labels = FALSE,
                       include.lowest = TRUE)

  data.frame(team, team_quartile)
})

Basically, you want to split the data frame by team and then perform the calculation on each subset of the data frame. You could use tapply for this as well.
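
For reference, here is a minimal base R sketch of the tapply route, with no extra packages. The helper name quartile_bin, the fixed seed, and the column name quartile_team_ave are my own assumptions for illustration; the quartile logic is the same cut/quantile call as in the question.

```r
# Hypothetical helper (not from the question): quartile bin (1 to 4) for a numeric vector
quartile_bin <- function(x) {
  cut(x,
      breaks = quantile(x, probs = seq(0, 1, 0.25)),
      labels = FALSE,
      include.lowest = TRUE)
}

set.seed(1)  # assumption: fixed seed so the example is reproducible
teams <- c(rep("East", 6), rep("West", 8), rep("North", 7), rep("South", 9))
dataset <- data.frame(teams, time_spent = rnorm(30))

# tapply returns one vector of bins per team; unsplit puts them back in row order
bins_by_team <- tapply(dataset$time_spent, dataset$teams, quartile_bin)
dataset$quartile_team <- unsplit(bins_by_team, dataset$teams)

# ave does the same split, apply, recombine in a single call
dataset$quartile_team_ave <- ave(dataset$time_spent, dataset$teams, FUN = quartile_bin)
```

Both columns should agree. ave is probably the more convenient of the two here, because it returns a vector already aligned with the original rows.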
