
Using transform and plyr to add a counting column in R

I have a two-level dataset (say, classes nested within schools), and the dataset is coded like this:

School  Class
  A       1
  A       1
  A       2
  A       2
  B       1
  B       1
  B       2
  B       2

But to run an analysis I need each class to have a unique Class ID, regardless of school membership:

School  Class  NewClass
  A       1       1
  A       1       1
  A       2       2
  A       2       2
  B       1       3
  B       1       3
  B       2       4
  B       2       4 

I tried using transform and ddply, but I'm not sure how to keep NewClass incrementing for each combination of School and Class. I can think of a few inelegant ways to do this, but I'm sure there are much easier solutions that I just can't think of right now. Any help would be appreciated!

Use interaction to create a factor, and then coerce it to integer:

transform(dat, nn = as.integer(interaction(Class, School)))
#   School Class nn
# 1      A     1  1
# 2      A     1  1
# 3      A     2  2
# 4      A     2  2
# 5      B     1  3
# 6      B     1  3
# 7      B     2  4
# 8      B     2  4
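
For a version that can be pasted and run as-is, here is a minimal sketch that rebuilds the example data from the question. The name `dat` follows the call above, and the `drop = TRUE` argument is an addition that the original answer does not use: it discards School/Class combinations that never occur, so the IDs stay consecutive when some schools lack some class numbers.

```r
# Rebuild the example data from the question
dat <- data.frame(
  School = rep(c("A", "B"), each = 4),
  Class  = rep(c(1, 2, 1, 2), each = 2)
)

# interaction() varies its first argument fastest, so Class is passed first
# to get IDs ordered by School and then by Class.
# drop = TRUE removes unused combinations so the integer codes have no gaps.
dat$NewClass <- as.integer(interaction(dat$Class, dat$School, drop = TRUE))

dat
```

With the eight rows above, NewClass should match the nn column shown in the answer.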

Using data.table:

library(data.table)
dt = as.data.table(your_df)

dt[, NewClass := .GRP, by = list(School, Class)]
dt
#   School Class NewClass
#1:      A     1        1
#2:      A     1        1
#3:      A     2        2
#4:      A     2        2
#5:      B     1        3
#6:      B     1        3
#7:      B     2        4
#8:      B     2        4

.GRP is simply a group counter. Also note that you don't really need to create this column: you can keep using the combination list(School, Class) in whatever by operation you need to do.
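
As an illustration of that last point, here is a small sketch that groups directly on the School/Class pair without creating NewClass first. The data are rebuilt from the question, and the row count (`.N`) is only an example operation, not something the question asked for.

```r
library(data.table)

# Example data from the question
dt <- data.table(
  School = rep(c("A", "B"), each = 4),
  Class  = rep(c(1, 2, 1, 2), each = 2)
)

# Count the rows in each class, grouping on the pair itself
counts <- dt[, .N, by = list(School, Class)]
counts
```

Any other per-class summary can be written the same way by swapping `.N` for the expression you need.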


Note that from data.table versions >= 1.9.0, a function setDT is exported that converts a data.frame to a data.table by reference (no copy is made), in case you want to stick with data.tables.

require(data.table) ## >= 1.9.0
setDT(your_df)      ## your_df is now a data.table, changed by reference.
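
Putting the two pieces together, a sketch of the full workflow might look like this. It assumes `your_df` holds the example data from the question and that the installed data.table version provides setDT.

```r
library(data.table)

# Stand-in for your own data frame, using the data from the question
your_df <- data.frame(
  School = rep(c("A", "B"), each = 4),
  Class  = rep(c(1, 2, 1, 2), each = 2)
)

# Convert in place; your_df itself becomes a data.table
setDT(your_df)

# Add the group counter by reference, one ID per School/Class pair
your_df[, NewClass := .GRP, by = list(School, Class)]
your_df
```

Because both steps work by reference, there is no need to assign the result back to a new object.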
