Split dataframe and calculate averages for data subsets in R
I have this data frame in R:
steps day month
4758 Tuesday December
9822 Wednesday December
10773 Thursday December
I want to iterate over the data frame and apply a function to the steps column based on the value in the month column. I'm trying to work out the average number of steps per weekday for each month.
I want to output to a new data frame like the one below, where the weekdays repeat across months but there is only one average value per day:
average.steps day month
4500 Tuesday December
9000 Wednesday December
1000 Thursday December
I can work out the averages for the data frame as a whole, but I want to use a for loop to apply the calculation only to step values from the same month.
avgsteps <- ddply(DATA, "day", summarise, msteps = mean(steps))
My basic idea for the function was:
f <- function(m in month) {ddply(DATA, "day", summarise, msteps = mean(steps))}
But R won't parse it and throws the error:
Error: unexpected 'in' in "f <- function(m in"
Any help would be greatly appreciated!
EDIT:
So I've tried @agstudy's suggested fix (below) and it gets the right data structure (a single value for each weekday for each month), but the value assigned to each day is identical. I'm a bit confused about what could be going wrong.
steps.month.day.avg <- ddply(steps.month.day, .(fitbit.day,fitbit.month), summarise, msteps = mean(steps))
No need for a loop here. You just need to change the variables used to split the data frame:
ddply(DATA, .(day,month), summarise, msteps = mean(steps))
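For comparison, the same grouping can be sketched in base R with `aggregate`, which avoids the dependency on plyr. This is only an illustration: the column names follow the question, but the sample values are made up so that the block runs on its own.

```r
# Hypothetical sample data; only the column names come from the question.
DATA <- data.frame(
  steps = c(4000, 5000, 9000, 9000, 1000, 3000),
  day   = c("Tuesday", "Tuesday", "Wednesday", "Wednesday", "Tuesday", "Tuesday"),
  month = c("December", "December", "December", "December", "January", "January"),
  stringsAsFactors = FALSE
)

# One mean per (day, month) combination; no explicit loop is needed.
avg <- aggregate(steps ~ day + month, data = DATA, FUN = mean)

# Rename the result column to match the desired output in the question.
names(avg)[names(avg) == "steps"] <- "average.steps"
print(avg)
```

If every group ends up with the same mean, one thing worth checking is whether `steps` really exists as a column of the data frame being summarised. If it does not, R may fall back to an object called `steps` in the calling environment and average that whole vector for each group.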