R - Aggregate data with variable name using dcast

Python guy new to R, so forgive the naive question.

I have an R data frame named metrics with four columns; the ones used below are day, week, city, and count.

I want to pass the level of aggregation (day or week) as a variable to dcast.

agg_level <- c("week")

If I hard-code week in the function call, it aggregates data for each week correctly:

  • met <- dcast(metrics, week ~ city, value.var = "count", fun.aggregate = sum)
  • Output:

    week        NYC  CHI  SF
    2015-10-18    1    2   3
    2015-10-25    4    5   6

If I replace week with the variable, it fails. (It aggregates data for all weeks.)

  • met <- dcast(metrics, agg_level ~ city, value.var = "count", fun.aggregate = sum)
  • Output:

    agg_level  NYC  CHI  SF
    week         5    7   9

Based on this, metrics[[agg_level]] extracts the column whose name is stored in the variable, but using the same syntax inside dcast fails:

  • met <- dcast(m, [[agg_level]] ~ city, value.var = metric, fun.aggregate = sum)

  • Error in (function ... unexpected '[['

What is the correct way to do this?

The formula argument of dcast expects the names in it to be column names of the data frame being cast. It does not resolve agg_level to the column it names, so the week column is never used for grouping. As such, you have two options:

# Option 1
# Build the formula as a string from the variable, then convert it with as.formula().
agg_level <- "week"  # or "day", chosen by whatever condition applies
myFormula <- sprintf("%s ~ city", agg_level)
met <- dcast(metrics, as.formula(myFormula), fun.aggregate = sum, value.var = "count")

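# Option 2 - Untested
# Take advantage of dcast's alternative to the formula notation and pass a list instead.
# No idea if this will work. Note that .() comes from plyr and quotes its arguments
# unevaluated, so .(agg_level) may well run into the same problem as the formula.
met <- dcast(metrics, list(.(agg_level), .(city)), fun.aggregate = sum, value.var = "count")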
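To make Option 1 concrete, here is a minimal sketch that runs end to end. It assumes the reshape2 package is installed; the toy metrics data and the helper name cast_by are made up for illustration and are not from the question. The last lines show reformulate() from base R as an alternative way to build the same formula, which is my suggestion rather than part of the answer above.

```r
library(reshape2)

# Hypothetical toy data standing in for the asker's `metrics` data frame.
# Dates are kept as plain strings to keep the example simple.
metrics <- data.frame(
  day   = c("2015-10-19", "2015-10-20", "2015-10-26", "2015-10-27"),
  week  = c("2015-10-18", "2015-10-18", "2015-10-25", "2015-10-25"),
  city  = c("NYC", "CHI", "NYC", "NYC"),
  count = c(1, 2, 3, 4),
  stringsAsFactors = FALSE
)

# Hypothetical helper: build "<agg_level> ~ city" as text, convert it to a
# formula, and hand it to dcast so the real column is used for grouping.
cast_by <- function(data, agg_level) {
  f <- as.formula(sprintf("%s ~ city", agg_level))
  dcast(data, f, fun.aggregate = sum, value.var = "count")
}

by_week <- cast_by(metrics, "week")  # one row per distinct week
by_day  <- cast_by(metrics, "day")   # one row per distinct day

# Alternative: reformulate() builds the formula without string pasting.
f_alt <- reformulate("city", response = "week")
by_week_alt <- dcast(metrics, f_alt, fun.aggregate = sum, value.var = "count")
```

Wrapping the formula construction in a function keeps the string handling in one place, so callers only pass "day" or "week".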
