How to avoid repeating code when using plyr

Question

I want to produce the same type of chart for several combinations of data. Currently, I am using plyr to split the data and run some code for each combination.

For example, let's say the data frame has company, department, region, and revenue columns. Here's my pseudocode:

    d_ply(dataframe, .(company), function(df) {
      d_ply(df, .(department), function(df) {
        d_ply(df, .(region), function(df) {
          bar_chart(df$region, df$revenue)
        })
        bar_chart(df$department, df$revenue)
      })
      bar_chart(df$company, df$revenue)
    })

In my real example, I need to do multiple things at each level, and the code is 10 or so lines. Is there a way to avoid repeating that code for each combination, other than creating a function and passing the proper parameters? I was hoping there is some magic plyr trick.

Answer 1

Dummy data:

d <- data.frame(company=letters[1:26],
                department=sample(letters[1:10],26,replace=TRUE),
                region=sample(letters[1:3],26,replace=TRUE),
                revenue=round(runif(26)*10000))

Update

I think an explanation of your code is necessary:

d_ply(dataframe, .(company), function(df) { # by company
  d_ply(df, .(department), function(df) { # by department
    d_ply(df, .(region), function(df) { # by region
      bar_chart(df$region, df$revenue)
      # this part is essentially equal to
      # d_ply(dataframe, .(company, department, region), function(df) fun(df))
    })
    bar_chart(df$department, df$revenue)
    # this part is essentially equal to
    # d_ply(dataframe, .(company, department), function(df) fun(df))
  })
  bar_chart(df$company, df$revenue)
  # this part is essentially equal to
  # d_ply(dataframe, .(company), function(df) fun(df))
})
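
The equivalence noted in the comments above can be checked with a small sketch: count how many groups the nested splits reach at the innermost level and compare that with a single split on all three variables. The data here is made up for the check (companies repeat, unlike the dummy data above, so that the grouping is not trivial), and the counter names `nested` and `flat` are hypothetical.

```r
library(plyr)

# Illustrative data only: a few companies that each span several rows.
set.seed(1)
d <- data.frame(company    = sample(letters[1:4], 26, replace = TRUE),
                department = sample(letters[1:10], 26, replace = TRUE),
                region     = sample(letters[1:3], 26, replace = TRUE),
                revenue    = round(runif(26) * 10000))

# Count the innermost groups reached by three nested splits.
nested <- 0
d_ply(d, .(company), function(by_company) {
  d_ply(by_company, .(department), function(by_department) {
    d_ply(by_department, .(region), function(by_region) {
      nested <<- nested + 1
    })
  })
})

# Count the groups produced by one split on all three variables.
flat <- 0
d_ply(d, .(company, department, region), function(df) {
  flat <<- flat + 1
})

# Both approaches should visit the same number of groups.
print(nested == flat)
```

If the two counts agree, the innermost call does the same work as one `d_ply` on the combined grouping, which is what makes the flat form a drop-in replacement.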

I find your code to be highly unreadable. It could be replaced with:

some.fun <- function(df, ...) {
# ...
}

d_ply(d, .(company), function(df) some.fun(df, ...))
d_ply(d, .(company,department), function(df) some.fun(df, ...)) 
d_ply(d, .(company,department,region), function(df) some.fun(df, ...))
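
If even three near-identical calls feel repetitive, one possible sketch is to keep the grouping variables in a list and loop over it. This relies on `d_ply` accepting a character vector for its grouping argument. The names `summarise_group` and `groupings` are hypothetical, and the helper only prints a revenue total as a stand-in for the real charting code.

```r
library(plyr)

# Dummy data in the same shape as the example above.
set.seed(1)
d <- data.frame(company    = letters[1:26],
                department = sample(letters[1:10], 26, replace = TRUE),
                region     = sample(letters[1:3], 26, replace = TRUE),
                revenue    = round(runif(26) * 10000))

# Hypothetical stand-in for the ~10 lines of per-group work.
# `vars` names the columns that define the current group.
summarise_group <- function(df, vars) {
  label <- paste(sapply(df[1, vars, drop = FALSE], as.character),
                 collapse = "/")
  cat(label, ":", sum(df$revenue), "\n")
}

# One entry per level of grouping, from coarsest to finest.
groupings <- list("company",
                  c("company", "department"),
                  c("company", "department", "region"))

# The same function runs at every level; only the grouping changes.
for (vars in groupings) {
  d_ply(d, vars, function(df) summarise_group(df, vars))
}
```

Adding another level then means adding one entry to `groupings` rather than another `d_ply` call.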

Answer 1 (accepted), 2012-12-13 19:26:36
