
R data.table rule with column name as a string

I have a data.table that looks like this:

>DT
   ID Year Value ABC_1 ABC_2 ABC_3
1:  3 2015     5     0     1     0
2:  4 2015     2     1     0     1
3:  5 2015     1     0     1     1
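
To reproduce the table above, a minimal sketch follows. The column types are an assumption (plain numerics), since the printout does not show them.

```r
library(data.table)

# Rebuild the example table; numeric column types are assumed, not confirmed.
DT <- data.table(
  ID    = c(3, 4, 5),
  Year  = c(2015, 2015, 2015),
  Value = c(5, 2, 1),
  ABC_1 = c(0, 1, 0),
  ABC_2 = c(1, 0, 1),
  ABC_3 = c(0, 1, 1)
)
DT
```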

What I want to do for each ABC_... column is:

> unique(DT[Year == 2015 & ABC_1 == 1, .(Year = Year, ABC = ABC_1, N = .N, MEAN = mean(Value))])
   Year ABC N MEAN
1: 2015   1 1    2
> unique(DT[Year == 2015 & ABC_2 == 1, .(Year = Year, ABC = ABC_2, N = .N, MEAN = mean(Value))])
   Year ABC N MEAN
1: 2015   1 2    3
> unique(DT[Year == 2015 & ABC_3 == 1, .(Year = Year, ABC = ABC_3, N = .N, MEAN = mean(Value))])
   Year ABC N MEAN
1: 2015   1 2  1.5

I have more than 20 ABC_... columns, so I would like to put this statement in a for loop. My problem is that the filter condition needs the column name, and passing the name as a string does not work:

> abc_name <- names(DT)[names(DT) %like% 'ABC']
> abc_name
[1] "ABC_1" "ABC_2" "ABC_3"
> abc_row<- data.table(Year=0, ABC=0, N=0, MEAN=0)
> for (i in 1: length(abc_name)){
+   
+   temp_row <- unique(DT[Year == 2015 & abc_name[i] == 1, .(Year = Year, ABC = abc_name[i], N = .N, MEAN = mean(Value))])
+   abc_row <- rbind(abc_row, temp_row)
+ }
> abc_row
   Year ABC N MEAN
1:    0   0 0    0

temp_row is empty. When I replace abc_name[i] with ABC_1, it works:

> abc_name <- names(DT)[names(DT) %like% 'ABC']
> abc_name
[1] "ABC_1" "ABC_2" "ABC_3"
> abc_row<- data.table(Year=0, ABC=0, N=0, MEAN=0)
> for (i in 1: length(abc_name)){
+ 
+   temp_row <- unique(DT[Year == 2015 & ABC_1 == 1, .(Year = Year, ABC = ABC_1, N = .N, MEAN = mean(Value))])
+   abc_row <- rbind(abc_row, temp_row)
+ }
> abc_row
   Year ABC N MEAN
1:    0   0 0    0
2: 2015   1 1    2
3: 2015   1 1    2
4: 2015   1 1    2

How can I use abc_name in a for loop so that my script works? I hope my question is clear and that someone can help.

In the OP's loop, abc_name[i] == 1 compares the string itself (e.g. "ABC_1") with 1, which is always FALSE, so no rows are selected. Instead, loop over the name vector ('abc_name') with lapply, apply the logic from the OP's post, fetch the column's values with get, and combine the list elements with rbindlist.

lst <- lapply(abc_name, function(nm)
          unique(DT[Year == 2015 & get(nm) == 1,
          .(Year = Year, ABC = get(nm), N = .N, MEAN = mean(Value))]))

rbindlist(lst)
#   Year ABC N MEAN
#1: 2015   1 1  2.0
#2: 2015   1 2  3.0
#3: 2015   1 2  1.5
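
Since the question asks for a for loop, the same idea can also be sketched that way. This is only one possible variant, not the answer's original code: it groups by Year instead of calling unique(), and it stores the column name in ABC (rather than the constant 1) so each row can be traced back to its column. The table is rebuilt with assumed numeric column types.

```r
library(data.table)

# Example data rebuilt from the question (column types are assumed).
DT <- data.table(ID = 3:5, Year = 2015, Value = c(5, 2, 1),
                 ABC_1 = c(0, 1, 0), ABC_2 = c(1, 0, 1), ABC_3 = c(0, 1, 1))
abc_name <- grep("^ABC_", names(DT), value = TRUE)

# The string is compared with 1, not the column it names, hence the empty result.
abc_name[1] == 1   # FALSE

# Collect one result per column in a pre-allocated list, then bind once.
res <- vector("list", length(abc_name))
for (i in seq_along(abc_name)) {
  nm <- abc_name[i]
  res[[i]] <- DT[Year == 2015 & get(nm) == 1,
                 .(ABC = nm, N = .N, MEAN = mean(Value)),
                 by = Year]
}
abc_row <- rbindlist(res)
abc_row
```

Binding once at the end avoids growing abc_row with rbind on every iteration, and it removes the need for the dummy all-zero first row.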

Another option is to melt the data from 'wide' to 'long' format, filter in 'i' with the logical index (value == 1), and summarise grouped by 'variable' and 'Year':

melt(DT, measure.vars = abc_name)[value == 1,
     .(ABC = 1, N = .N, MEAN = mean(Value)),
     by = .(variable, Year)][, variable := NULL][]
#   Year ABC N MEAN
#1: 2015   1 1  2.0
#2: 2015   1 2  3.0
#3: 2015   1 2  1.5
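
In the output above the three rows cannot be told apart once 'variable' is dropped. A hedged variant of the melt approach keeps the source column name instead. The names ABC_col and flag are illustrative choices, not part of the original answer, and the table is rebuilt with assumed numeric column types.

```r
library(data.table)

# Example data rebuilt from the question (column types are assumed).
DT <- data.table(ID = 3:5, Year = 2015, Value = c(5, 2, 1),
                 ABC_1 = c(0, 1, 0), ABC_2 = c(1, 0, 1), ABC_3 = c(0, 1, 1))
abc_name <- grep("^ABC_", names(DT), value = TRUE)

# Reshape wide -> long; "flag" avoids confusion with the existing "Value" column.
long <- melt(DT, measure.vars = abc_name,
             variable.name = "ABC_col", value.name = "flag")

# Keep flagged rows for 2015 and summarise per year and source column.
res <- long[Year == 2015 & flag == 1,
            .(N = .N, MEAN = mean(Value)),
            by = .(Year, ABC_col)]
res
```

With more than 20 ABC_... columns, this keeps everything in one grouped query and the ABC_col column records which indicator each summary row belongs to.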

