For each column in R data frame

I was wondering how for loops work with R data frames. This is not a reproducible example, but I'm wondering whether the concept can work. If df has Date, ID, and Amount columns plus 4 variables, can I loop through the columns? I need to remove the rows with NA in each of the columns Var1 to Var4, create a "weight vector" based on the Amount column, and then calculate the weighted mean.

a <- names(df)
a <- a[4:7]

a
[1] "Var1" "Var2" "Var3" "Var4"


# df has Date, ID, Amount, Var1, Var2, Var3, Var4

for (i in a) {

  NEW <- df[!is.na(df$i), ]
  NEW <- NEW %>%
    group_by(Date) %>%
    mutate(Weights = Amount / sum(Amount))

  SUM <- NEW %>%
    group_by(Date) %>%
    summarise(Value = weighted.mean(i, Weights))

  write.csv(SUM, paste0(i, ".csv"))

}

You can loop through columns, but you have to make slight adjustments to your syntax. If you want to index your data frame with a column name stored in a variable (in your loop, the names are stored in the loop variable i), you can access the column in the following ways:

1.) With base-R subset syntax, use [, i] to select the column you want:

df[,i]

NOTE: df$i will not work here, because $ does not evaluate i; it looks for a column literally named "i".
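The difference between the three forms can be checked on a small example. This is only a sketch: the toy data frame and the names by_dollar, by_bracket, and by_double are made up for illustration, and df[[i]] is an additional base-R form that the answer itself does not mention.

```r
# Hypothetical toy data frame; column names follow the question.
df <- data.frame(
  Amount = c(10, 20, 30),
  Var1   = c(1.5, NA, 3.0)
)

i <- "Var1"

by_dollar  <- df$i     # looks for a column literally named "i"; there is none
by_bracket <- df[, i]  # uses the value stored in i, i.e. the column "Var1"
by_double  <- df[[i]]  # also uses the value stored in i

# Keep only the rows where the selected column is not NA
NEW <- df[!is.na(df[[i]]), ]
```

One reason to consider df[[i]]: on a tibble, df[, i] returns a one-column tibble rather than a plain vector, while df[[i]] returns the column itself for both data frames and tibbles.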

2.) In dplyr functions, you have to convert the character variable i into a symbol that refers to a column of your data frame. The function as.name does this. You then have to unquote the symbol so that the dplyr functions can work with it. This is done with the !! ("bang-bang") operator:

df %>% select(!!as.name(i))

or, in your case:

SUM <-  NEW %>%
   group_by(Date) %>%
   summarise(Value = weighted.mean(!!as.name(i), Weights))
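As a quick check, the snippet below runs the !!as.name(i) form on made-up data and compares it with dplyr's .data pronoun, which is a different way to refer to a column by a string and is not part of the answer above. The data values and the names with_bang and with_pronoun are assumptions for illustration only.

```r
library(dplyr)

# Hypothetical toy data; column names follow the question.
df <- data.frame(
  Date   = c("2020-01-01", "2020-01-01", "2020-01-02", "2020-01-02"),
  Amount = c(1, 3, 2, 2),
  Var1   = c(10, 20, 5, 15)
)

i <- "Var1"

# Approach from the answer: turn the string into a symbol and unquote it
with_bang <- df %>%
  group_by(Date) %>%
  mutate(Weights = Amount / sum(Amount)) %>%
  summarise(Value = weighted.mean(!!as.name(i), Weights))

# Alternative: look the column up by name through the .data pronoun
with_pronoun <- df %>%
  group_by(Date) %>%
  mutate(Weights = Amount / sum(Amount)) %>%
  summarise(Value = weighted.mean(.data[[i]], Weights))
```

Both results should contain one row per Date with the same Value column, so the choice between the two forms is mostly a matter of taste.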

Otherwise your syntax seems fine: just loop through a set of names and index the data frame in the ways I described. Hope this answers your question.
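Putting both adjustments together, the loop from the question might look like the sketch below. The data values are invented, and two details are assumptions that are not in the original: the CSV files go to a temporary directory (out_dir) instead of the working directory, and each summary is also kept in a list (results) so it can be inspected.

```r
library(dplyr)

# Hypothetical toy data with the column layout described in the question.
df <- data.frame(
  Date   = rep(c("2020-01-01", "2020-01-02"), each = 3),
  ID     = 1:6,
  Amount = c(1, 3, 4, 2, 2, 6),
  Var1   = c(10, 20, NA, 5, 15, NA),
  Var2   = c(1, NA, 3, 4, 5, 6),
  Var3   = c(2, 2, 2, 8, 8, 8),
  Var4   = c(NA, NA, 7, 1, NA, 3)
)

out_dir <- tempdir()  # assumption: write to a temp folder for this sketch
results <- list()     # assumption: keep the summaries in memory as well

for (i in names(df)[4:7]) {
  # Drop rows where the current column is NA, then compute per-date weights
  NEW <- df[!is.na(df[[i]]), ] %>%
    group_by(Date) %>%
    mutate(Weights = Amount / sum(Amount))

  # NEW is still grouped by Date, so this yields one weighted mean per date
  SUM <- NEW %>%
    summarise(Value = weighted.mean(!!as.name(i), Weights))

  results[[i]] <- SUM
  write.csv(SUM, file.path(out_dir, paste0(i, ".csv")), row.names = FALSE)
}
```

Since weighted.mean divides by the sum of the weights, passing Amount directly as the weights should give the same values; the explicit Weights column is kept here only to mirror the question.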
