
How to avoid repeating code while using plyr

I want to produce the same type of chart for several combinations of data. Currently, I am using plyr to split the data and execute some code for each combination.

For example, let's say the data frame has company, department, region, and revenue columns. Here's my pseudocode:

    d_ply(dataframe, .(company), function(df) {
      d_ply(df, .(department), function(df) {
        d_ply(df, .(region), function(df) {
          bar_chart(df$region, df$revenue)
        })
        bar_chart(df$department, df$revenue)
      })
      bar_chart(df$company, df$revenue)
    })

In my real example, I need to do multiple things, and the code is 10 or so lines. Is there a way to avoid repeating the code for each combination, other than creating a function and passing the proper parameters? I was hoping that there is some magic plyr trick.

Dummy data:

d <- data.frame(company=letters[1:26],
                department=sample(letters[1:10],26,replace=TRUE),
                region=sample(letters[1:3],26,replace=TRUE),
                revenue=round(runif(26)*10000))

Update

I think an explanation of your code is necessary:

d_ply(dataframe, .(company), function(df) {      # by company
  d_ply(df, .(department), function(df) {        # by department
    d_ply(df, .(region), function(df) {          # by region
      bar_chart(df$region, df$revenue)
      # this part is essentially equal to
      # d_ply(dataframe, .(company, department, region), function(df) fun(df))
    })
    bar_chart(df$department, df$revenue)
    # this part is essentially equal to
    # d_ply(dataframe, .(company, department), function(df) fun(df))
  })
  bar_chart(df$company, df$revenue)
  # this part is essentially equal to
  # d_ply(dataframe, .(company), function(df) fun(df))
})
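
The equivalence claimed in the comments can be checked by counting how often the innermost function runs. This is only a sketch: it uses the dummy data from the question with a fixed seed (an assumption added for reproducibility) and a counter in place of the real chart code.

```r
library(plyr)

set.seed(1)  # assumption: fixed seed so the dummy data is reproducible
d <- data.frame(company    = letters[1:26],
                department = sample(letters[1:10], 26, replace = TRUE),
                region     = sample(letters[1:3], 26, replace = TRUE),
                revenue    = round(runif(26) * 10000))

# Nested version: count how many times the innermost function is called.
nested_calls <- 0
d_ply(d, .(company), function(df) {
  d_ply(df, .(department), function(df) {
    d_ply(df, .(region), function(df) nested_calls <<- nested_calls + 1)
  })
})

# Flat version: a single split on all three variables at once.
flat_calls <- 0
d_ply(d, .(company, department, region),
      function(df) flat_calls <<- flat_calls + 1)

# Both counts equal the number of distinct combinations present in the data.
n_combos <- nrow(unique(d[c("company", "department", "region")]))
```

Both counters should agree with `n_combos`, which is why the nesting can be flattened without changing which subsets are processed.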

I find your code to be highly unreadable. It could be replaced with:

some.fun <- function(df, ...) {
  # ...
}

# one call per grouping level; `...` stands for any extra arguments
d_ply(d, .(company), function(df) some.fun(df, ...))
d_ply(d, .(company, department), function(df) some.fun(df, ...))
d_ply(d, .(company, department, region), function(df) some.fun(df, ...))
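
If even the three call sites feel repetitive, one possible variation is to keep the grouping levels in a list and loop over them. The sketch below is not from the answer above: the body of `some.fun` is a hypothetical stand-in for the real per-group work (it returns a small summary so the result can be inspected), and `ddply` is used instead of `d_ply` only so that there is something to look at afterwards. It relies on plyr accepting a character vector of column names as the grouping specification.

```r
library(plyr)

set.seed(1)  # assumption: fixed seed so the dummy data is reproducible
d <- data.frame(company    = letters[1:26],
                department = sample(letters[1:10], 26, replace = TRUE),
                region     = sample(letters[1:3], 26, replace = TRUE),
                revenue    = round(runif(26) * 10000))

# Hypothetical stand-in for the ~10 lines of real per-group code.
some.fun <- function(df) {
  data.frame(n = nrow(df), revenue = sum(df$revenue))
}

# Each element names the columns that define one grouping level.
groupings <- list("company",
                  c("company", "department"),
                  c("company", "department", "region"))

# Single call site: the same function is applied at every level.
results <- lapply(groupings, function(vars) ddply(d, vars, some.fun))
names(results) <- vapply(groupings, paste, character(1), collapse = "/")
```

For code that only draws charts and returns nothing, replacing `ddply` with `d_ply` inside the `lapply` call should work the same way, since both take the grouping variables in the same form.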

