
Apply sum by integer factor after rounding in R

Here is my question: I have data with 3000 observations and 5000 features. The 3000 observations have numeric names like 100.1, 100.3, 100.5, 100.7. I converted the names into an integer variable with segs <- as.integer(names), and now I want to use segs as a grouping factor to sum all 5000 features within each group. segs has 300 distinct values, so the final data frame should be 300 by 5000. I know tapply can compute the sum by factor for one variable, but I have to use a for loop to get all 5000 features summed. That is really time-consuming, so I would like to know whether there is a cleaner way to do this in R, or a package that solves this kind of problem.

This is my rough code; df0 is the data and df is what I want:

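# Note: 2:ncol(df0)-1 parses as (2:ncol(df0))-1, i.e. columns 1 to ncol(df0)-1
# segs is the integer grouping vector described above (the original used df2$segs, which is not defined here)
df <- NULL  # start from NULL so cbind() builds a matrix column by column
for (i in seq_len(ncol(df0) - 1)) {
    temp <- tapply(df0[, i], segs, sum)
    df <- cbind(df, temp)
}
colnames(df) <- names(df0)[seq_len(ncol(df0) - 1)]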

Thanks!

=====

Thanks, Roland. Demo data is shown below:

set.seed(42)    
df0 <- data.frame(
X = rnorm(100,10,10),
Y = rnorm(100), 
Z = rnorm(100))
df0$seq <- as.integer(df0$X)
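
A side note on the grouping key: as.integer() truncates toward zero rather than rounding, so values just below and just above zero land in the same group. The sketch below uses made-up values (x is only an illustration, not part of the data above) to compare it with round() and floor():

```r
# Hypothetical values chosen to show the difference between the three functions
x <- c(100.1, 100.7, -0.9, 0.9)

truncated <- as.integer(x)  # drops the fractional part: 100 100 0 0
rounded   <- round(x)       # nearest integer:           100 101 -1 1
floored   <- floor(x)       # next lower integer:        100 100 -1 0

print(data.frame(x, truncated, rounded, floored))
```

If true rounding or fixed-width bins are what is wanted, round(x) or floor(x) may be a better grouping key than as.integer(x).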

Try this...

set.seed(42)    
df0 <- data.frame(
    X = rnorm(100,10,10),
    Y = rnorm(100), 
    Z = rnorm(100))
df0$seq <- as.integer(df0$X)

library(data.table)
dt = data.table(df0)
dt[,lapply(.SD, sum), by=seq ]

    seq           X            Y           Z
 1:  23 164.8144774  1.293768670 -3.74807730
 2:   4   8.9247301  1.909529066 -0.06277254
 3:  13  40.2090180 -2.036599633  0.88836392
 4:  16 147.8571697 -2.571487358 -1.35542918
 5:  14  72.1640142  0.432493959 -1.49983832
 6:   8  42.8498355 -0.582031919 -1.35989852
 7:  25  75.9995653  0.896369560 -1.08024329
 8:   9  27.5244048  0.833429855 -1.19363017
 9:  30  30.1842371  0.188193035 -0.64574372
10:  32  32.8664539  0.108072728  2.03697217
11:  -3  -7.5714175 -0.899304085 -1.27286230
12:   7  29.6254908 -0.929790177  2.75906514

27:  12  50.2535374 -0.620793351 -3.80900436
28:  24  24.4410126 -0.433169033 -0.02671746
29: -19 -19.9309008 -0.533492330 -1.01759612
30:  11  11.8523056 -1.071782384  0.96954501
31:  19  38.5407490 -0.751408534 -4.81312992
32:   0  -0.9642319  1.453325156  2.20977601
33:  -1  -4.3685646 -0.834654913 -0.24624546
34:  18  18.2177311 -1.594588162  0.27369527
35:  -4  -4.5921400  0.586487537  0.86256338
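
For comparison, here is a minimal base R sketch of the same grouped sum without data.table, assuming the demo data above. It uses rowsum(), which sums every column within each group in a single call; the names feature_cols and sums are only illustrative. Unlike the data.table result, the groups come back sorted by seq and X is kept out of the grouping column.

```r
# Same demo data as above
set.seed(42)
df0 <- data.frame(
    X = rnorm(100, 10, 10),
    Y = rnorm(100),
    Z = rnorm(100))
df0$seq <- as.integer(df0$X)

# Sum all feature columns within each value of seq;
# the result has one row per group, with the group value as row name
feature_cols <- setdiff(names(df0), "seq")
sums <- rowsum(df0[, feature_cols], group = df0$seq)

# Cross-check one column against the tapply approach from the question
all.equal(sums$Y, as.vector(tapply(df0$Y, df0$seq, sum)))

head(sums)
```

Another base R option along the same lines is aggregate(. ~ seq, data = df0, FUN = sum), which returns seq as an ordinary column instead of row names.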

