
Divide data frame number equally in groups

I have a data frame with a column of names and further columns containing properties coded as 0 and 1 (meaning no and yes).

     Name     Running   Cycling
1     Adam     1         0
2     Steve    0         1
3     Aaron    1         1
4     Nick     1         0
5     Paul     1         0
6     Stuart   1         0

I now want to split the 1s (yes) in each column evenly into a given number of groups and record each row's group number in an additional column. If we divide Running and Cycling into two groups each, this should be the result:

     Name     Running   Cycling  Running-Group Cycling-Group
1     Adam     1         0        1             0
2     Steve    0         1        0             1
3     Aaron    1         1        1             2
4     Nick     1         0        1             0
5     Paul     1         0        2             0
6     Stuart   1         0        2             0

I can get the group number with:

ceiling(sum(column)/100*groups)

I am sure there is an easy way to do this in R, but I couldn't find a solution that ignores the 0s (no) and assigns the group number only to the 1s (yes).

Thanks for your help.

Maybe this helps:

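# Names for the new columns, e.g. Running_Group
nm1 <- paste(names(df1)[-1], 'Group', sep = "_")
df1[nm1] <- lapply(df1[-1], function(x) {
  x1 <- x == 1
  # Split the 1s into two consecutive groups; the 0s stay 0
  x[x1] <- gl(sum(x1), ceiling(sum(x1) / 2), sum(x1))
  x
})
df1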
 #    Name Running Cycling Running_Group Cycling_Group
 #1   Adam       1       0             1             0
 #2  Steve       0       1             0             1
 #3  Aaron       1       1             1             2
 #4   Nick       1       0             1             0
 #5   Paul       1       0             2             0
 #6 Stuart       1       0             2             0
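The answer above hard-codes two groups. As a rough sketch of how the same idea might extend to any number of groups, the helper below numbers the 1s in consecutive chunks of size `ceiling(n_ones / n_groups)`. The name `assign_groups` and its `n_groups` argument are made up for illustration and are not part of the original answer. One caveat of this chunking: when the count of 1s does not divide evenly, fewer than `n_groups` groups may be used.

```r
# Hypothetical helper (not from the answer above): label the 1s in a 0/1
# vector with consecutive group numbers, leaving the 0s at 0.
assign_groups <- function(x, n_groups = 2) {
  is_one <- x == 1
  n_ones <- sum(is_one)
  out <- numeric(length(x))
  if (n_ones > 0) {
    # Chunk size: the first groups are filled before the later ones
    size <- ceiling(n_ones / n_groups)
    out[is_one] <- (seq_len(n_ones) - 1) %/% size + 1
  }
  out
}

# Same data as in the question
df1 <- data.frame(
  Name    = c("Adam", "Steve", "Aaron", "Nick", "Paul", "Stuart"),
  Running = c(1, 0, 1, 1, 1, 1),
  Cycling = c(0, 1, 1, 0, 0, 0)
)

group_cols <- paste(names(df1)[-1], "Group", sep = "_")
df1[group_cols] <- lapply(df1[-1], assign_groups, n_groups = 2)
df1
```

With `n_groups = 2` this should reproduce the Running_Group and Cycling_Group columns shown above.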

Use the grps function shown below:

grp <- function(x) { 
  s <- seq_along(x)
  x * ((s > mean(s)) + 1)
}

grps <- function(x) ave(x, x, FUN = grp)

transform(DF, 
  Running_Group = grps(Running),
  Cycling_Group = grps(Cycling))

giving:

    Name Running Cycling Running_Group Cycling_Group
1   Adam       1       0             1             0
2  Steve       0       1             0             1
3  Aaron       1       1             1             2
4   Nick       1       0             1             0
5   Paul       1       0             2             0
6 Stuart       1       0             2             0
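To see why this works, the steps that `ave(x, x, FUN = grp)` performs can be traced by hand for the Running column. This is only an illustrative walk-through; the variable names (`running`, `ones`, `zeros`, `result`) are invented here. Note that `grp` as written only ever produces two groups, since it compares each position with the mean position.

```r
running <- c(1, 0, 1, 1, 1, 1)

# ave(x, x, FUN = grp) splits x by its own values, so grp sees the 1s and
# the 0s as two separate vectors.
ones  <- running[running == 1]   # five 1s
zeros <- running[running == 0]   # one 0

# Within the 1s: positions above the mean position fall into group 2.
s <- seq_along(ones)             # 1 2 3 4 5
ones_groups <- ones * ((s > mean(s)) + 1)

# Within the 0s: multiplying by x keeps every entry at 0.
z <- seq_along(zeros)
zeros_groups <- zeros * ((z > mean(z)) + 1)

# Put the pieces back in the original row order, as ave() does.
result <- running
result[running == 1] <- ones_groups
result[running == 0] <- zeros_groups
result
```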

Note: We used the following as DF:

Lines <- "     Name     Running   Cycling
1     Adam     1         0
2     Steve    0         1
3     Aaron    1         1
4     Nick     1         0
5     Paul     1         0
6     Stuart   1         0"

DF <- read.table(text = Lines, header = TRUE)
