
Merge two list components

I have a big list, but a minimal example looks like this:

A <- c("A", "a", "A", "a", "A")
B <- c("A", "A", "a", "a", "a")
C <- c(1, 2, 3, 1, 4) 
mylist <- list(A=A, B=B, C= C)

The expected output merges A with B so that each component looks like AB:

AA, aA, Aa, aa, Aa

Better still, each pair should be sorted so that the upper-case letter always comes first:

AA, Aa, Aa, aa, Aa

Thus the new list or matrix should have two columns or rows:

AA, Aa, Aa, aa, Aa
1,   2, 3,   1, 4

Now I want to calculate the average of C for each class: "AA", "Aa", and "aa".

It looks simple, but I could not figure it out easily.

> (ab <- paste(A, B, sep="") )  # the joining step
[1] "AA" "aA" "Aa" "aa" "Aa"
> (ab <- sub("([a-z])([A-Z])", "\\2\\1", ab) ) # swap so the uppercase letter comes first
[1] "AA" "Aa" "Aa" "aa" "Aa"

> rbind(ab, C)                  # matrix
   [,1] [,2] [,3] [,4] [,5]
ab "AA" "Aa" "Aa" "aa" "Aa"
C  "1"  "2"  "3"  "1"  "4" 
> data.frame(alleles=ab, count=C)  # dataframes are lists
  alleles count
1      AA     1
2      Aa     2
3      Aa     3
4      aa     1
5      Aa     4
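
The steps above stop at the combined table; the per-class average the question asks for can be sketched with base R's tapply. This is a minimal sketch, not part of the original answer, and the name class_means is only illustrative.

```r
# Rebuild the example data so the snippet runs on its own
A <- c("A", "a", "A", "a", "A")
B <- c("A", "A", "a", "a", "a")
C <- c(1, 2, 3, 1, 4)

# Join the pairs, then move the uppercase letter to the front
ab <- paste(A, B, sep = "")
ab <- sub("([a-z])([A-Z])", "\\2\\1", ab)

# Mean of C within each class (class_means is a hypothetical name)
class_means <- tapply(C, ab, mean)
print(class_means)
```

With this data the three "Aa" values (2, 3, 4) average to 3, while "AA" and "aa" each have the single value 1. The order in which the classes are printed depends on the locale's collation.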

This can be done with the plyr package if your data is arranged in a data.frame:

> A <- c("A", "a", "A", "a", "A")
> B <- c("A", "A", "a", "a", "a")
> C <- c(1, 2, 3, 1, 4) 
> (groups <- paste(A, B, sep=""))  # keep row order; sorting here would misalign groups with C
[1] "AA" "aA" "Aa" "aa" "Aa"
> my.df <- data.frame(A=A, B=B, C=C, group=groups)

> require(plyr)
> result <- ddply(my.df, "group", transform, group.means=mean(C))
> result[order(result$group, decreasing=TRUE),]
  A B C group group.means
5 A A 1    AA         1.0
3 A a 3    Aa         3.5
4 A a 4    Aa         3.5
2 a A 2    aA         2.0
1 a a 1    aa         1.0
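
Note that the output above keeps "aA" and "Aa" as separate groups, whereas the question treats them as one class. A possible base-R variant is sketched below; it assumes each element is a single letter, and the names upper_first and group.means are illustrative only. It uses ave() as a stand-in for the ddply/transform step rather than reproducing plyr itself.

```r
A <- c("A", "a", "A", "a", "A")
B <- c("A", "A", "a", "a", "a")
C <- c(1, 2, 3, 1, 4)

# Put the uppercase letter first: if A is already uppercase keep A then B,
# otherwise place B in front (upper_first is a hypothetical name)
upper_first <- ifelse(A == toupper(A), paste0(A, B), paste0(B, A))

my.df <- data.frame(A = A, B = B, C = C, group = upper_first,
                    stringsAsFactors = FALSE)

# ave() returns the group mean for every row (its default FUN is mean)
my.df$group.means <- ave(my.df$C, my.df$group)
print(my.df)
```

Because the comparison uses toupper() instead of sorting, the result does not depend on the locale's collation order.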

With your data:

A <- c("A", "a", "A", "a", "A")
B <- c("A", "A", "a", "a", "a")
C <- c(1, 2, 3, 1, 4) 

I define a data.frame using the combination of A and B as the key column:

AB <- paste(A, B, sep='')
df <- data.frame(id=AB, C=C)

> df
  id C
1 AA 1
2 aA 2
3 Aa 3
4 aa 1
5 Aa 4

If you need to order this data.frame before the aggregation, then:

df <- df[order(AB, decreasing=TRUE),]

> df
  id C
1 AA 1
3 Aa 3
5 Aa 4
2 aA 2
4 aa 1

And with aggregate you calculate the mean for each id:

meanDF <- aggregate(C~id, data=df, mean)

> meanDF

  id   C
1 aa 1.0
2 aA 2.0
3 Aa 3.5
4 AA 1.0

But if you want to order after the aggregation, then:

df <- data.frame(id=AB, C=C)
meanDF <- aggregate(C~id, data=df, mean)
meanDF <- meanDF[order(meanDF$id, decreasing=TRUE),]
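
Ordering character ids with order() depends on the locale's collation, so the row order can differ between systems. One way to pin the order, sketched here under the assumption that only the three classes "AA", "Aa" and "aa" occur, is to normalise the ids first and declare them as a factor with explicit levels. The sub() call is borrowed from the earlier answer and is not part of this one.

```r
A <- c("A", "a", "A", "a", "A")
B <- c("A", "A", "a", "a", "a")
C <- c(1, 2, 3, 1, 4)

AB <- paste(A, B, sep = "")
# Normalise "aA" to "Aa" so both spellings fall into the same class
AB <- sub("([a-z])([A-Z])", "\\2\\1", AB)

# Explicit levels fix the row order of the aggregated result
df <- data.frame(id = factor(AB, levels = c("AA", "Aa", "aa")), C = C)
meanDF <- aggregate(C ~ id, data = df, FUN = mean)
print(meanDF)
```

aggregate returns one row per factor level in level order, so no separate order() step is needed here.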
