R using 't' with ddply

Question

I need to transform some data like this:

library(plyr)

df <- data.frame(
  Plate    = rep("4660", 11),
  Well     = c("A1", "A2", "A3", "A4", "B1", "B2", "B3", "C1", "C2", "C3", "C4"),
  Result   = c(1, 10, 100, 1000, 1, 10, 100, 1, 10, 100, 1000),
  Compound = c("C1", "C1", "C1", "C1", "C2", "C2", "C2", "C3", "C3", "C3", "C3")
)
cmpds <- ddply(df, .(Compound), .fun = "t")

What I want to end up with is this:

   1     2     3     4
A  1     10    100   1000
B  1     10    100   NA
C  1     10    100   1000

Is there a way to fill the missing B4 cell with NA, or just ignore it? Either the t function or ddply seems to be choking on the fact that B has a different length than the others.

Thanks, J--

Answer 1

Like @Justin, I am assuming your column names come from the numeric part of the well specification. If so, here is a slightly more general solution (it will work for multi-digit numbers and multi-letter, um, letters).

library("gsubfn")
library("reshape2")

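# Anchor the pattern so multi-letter rows (e.g. "AA12") are captured whole;
# a leading ".*" would greedily swallow all but the last letter
wells <- strapply(as.character(df$Well), "^([A-Z]+)([0-9]+)$", c, simplify=rbind)
colnames(wells) <- c("well.letter", "well.number")
df <- cbind(df, wells)
# Store the column index as a number so that 10 sorts after 2
df$well.number <- as.integer(as.character(df$well.number))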

Then use dcast:

> dcast(df, Compound~well.number, value.var="Result")
  Compound 1  2   3    4
1       C1 1 10 100 1000
2       C2 1 10 100   NA
3       C3 1 10 100 1000

If the horizontal labels are meaningless and you just want to fill in however many values you have, you can do this with plyr:

ddply(df, .(Compound), function(DF) {
  as.data.frame(t(DF$Result))
})

which gives

  Compound V1 V2  V3   V4
1       C1  1 10 100 1000
2       C2  1 10 100   NA
3       C3  1 10 100 1000
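The NA appears because ddply combines the per-group pieces and fills columns that are missing from a piece. For illustration, here is a minimal base R sketch of the same padding idea, with no packages. The names by_compound, max_len and padded are made up for this example, and the data frame is a cut-down copy of the one in the question.

```r
# Cut-down copy of the question's data (Plate omitted; it is constant)
df <- data.frame(
  Well     = c("A1", "A2", "A3", "A4", "B1", "B2", "B3", "C1", "C2", "C3", "C4"),
  Result   = c(1, 10, 100, 1000, 1, 10, 100, 1, 10, 100, 1000),
  Compound = c("C1", "C1", "C1", "C1", "C2", "C2", "C2", "C3", "C3", "C3", "C3"),
  stringsAsFactors = FALSE
)

# One numeric vector of results per compound
by_compound <- split(df$Result, df$Compound)

# Length of the longest group
max_len <- max(sapply(by_compound, length))

# Extending a vector with `length<-` pads it with NA;
# sapply() returns one column per compound, so transpose to get one row each
padded <- t(sapply(by_compound, function(x) {
  length(x) <- max_len
  x
}))

print(padded)
```

The result is a plain matrix with one row per compound rather than a data frame, so wrap it in as.data.frame() if you need the same shape that ddply returns.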

What you want is not really clear since the rows in your example are labeled with the well letters, while the code implies splitting by compound name. Not sure which you really want.

Answer 2

You want your rows and columns to be the letters and numbers from the Well column, correct? You can split those out into two new columns:

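# as.character() is needed when Well is a factor (strsplit rejects factors).
# Splitting on '' yields single characters, so this assumes one letter + one digit.
well.split <- strsplit(as.character(df$Well), '')

df$well.letter <- sapply(well.split, '[', 1)
df$well.number <- sapply(well.split, '[', 2)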

Then I'd use dcast from the reshape2 package:

library(reshape2)

dcast(df, well.letter~well.number, value.var='Result')
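If the plate has more than nine columns, splitting into single characters will break wells such as A10. As a rough sketch under that assumption, the letter and number parts can be pulled apart with sub() and the grid built with tapply() in base R. The name grid is made up for this example, and the data frame is a cut-down copy of the one in the question.

```r
# Cut-down copy of the question's data
df <- data.frame(
  Well   = c("A1", "A2", "A3", "A4", "B1", "B2", "B3", "C1", "C2", "C3", "C4"),
  Result = c(1, 10, 100, 1000, 1, 10, 100, 1, 10, 100, 1000),
  stringsAsFactors = FALSE
)

# Strip the trailing digits to get the row letter(s)
df$well.letter <- sub("[0-9]+$", "", df$Well)

# Strip the leading letters and convert, so that 10 sorts after 2
df$well.number <- as.integer(sub("^[A-Za-z]+", "", df$Well))

# tapply() leaves NA in any letter/number combination with no data.
# Taking x[1] assumes each well appears at most once.
grid <- tapply(df$Result,
               list(df$well.letter, df$well.number),
               function(x) x[1])

print(grid)
```

Here grid is a matrix with the letters as row names and the well numbers as column names, and the B/4 cell is NA.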

2 answers

solution1
3 ACCEPTED 2012-05-24 18:10:25

solution2
1 2012-05-24 17:50:22
