Using the following data,
library(reshape)
P <- c( "D" , "D" , "P" )
a_0_2 <- c( "M" , "Y" , "M" )
a_3_5 <- c( "M" , "M" , "Y" )
n <- c( 48 , 57 , 15 )
df <- data.frame( P , a_0_2 , a_3_5 , n )
I'd like to get to the following data.frame:
P variable value nIDs
D a_0_2 M 48
D a_0_2 Y 57
P a_0_2 M 15
D a_3_5 M 48
D a_3_5 M 57
P a_3_5 Y 15
I tried melt( df , id.vars = "P" ), which of course doesn't treat the n variable correctly:
P variable value
1 D a_0_2 M
2 D a_0_2 Y
3 P a_0_2 M
4 D a_3_5 M
5 D a_3_5 M
6 P a_3_5 Y
7 D n <NA>
8 D n <NA>
9 P n <NA>
Warning message:
In `[<-.factor`(`*tmp*`, ri, value = c(48, 57, 15)) :
invalid factor level, NA generated
However, using the intuitive melt( df , id.vars = "P" , measure.vars = "n" ) call produces
P variable value
1 D n 48
2 D n 57
3 P n 15
which is further away from the objective. What am I missing? Thanks.
It looks like you simply need to treat n as an id variable as well, so that it is carried along with P instead of being melted:
melt(df, id.vars = c("P", "n"))
# P n variable value
# 1 D 48 a_0_2 M
# 2 D 57 a_0_2 Y
# 3 P 15 a_0_2 M
# 4 D 48 a_3_5 M
# 5 D 57 a_3_5 M
# 6 P 15 a_3_5 Y
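The result above has the right rows, but not the column order or the nIDs name shown in the question. A minimal sketch of one way to get that exact layout, assuming the same df as in the question (the name long is only illustrative):

```r
library(reshape)

# Same data as in the question; strings are kept as characters
df <- data.frame(
  P     = c("D", "D", "P"),
  a_0_2 = c("M", "Y", "M"),
  a_3_5 = c("M", "M", "Y"),
  n     = c(48, 57, 15),
  stringsAsFactors = FALSE
)

# Keep both P and n as id variables so only the a_* columns are melted
long <- melt(df, id.vars = c("P", "n"))

# Rename n to nIDs and move it to the last column
names(long)[names(long) == "n"] <- "nIDs"
long <- long[c("P", "variable", "value", "nIDs")]
long
```

The renaming and reordering are plain base R, so the same two lines should work on the output of the other approaches once their column names are adjusted.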
Or using the newer tidyr package:
library(tidyr)
gather(df, variable, value, a_0_2:a_3_5)
# P n variable value
# 1 D 48 a_0_2 M
# 2 D 57 a_0_2 Y
# 3 P 15 a_0_2 M
# 4 D 48 a_3_5 M
# 5 D 57 a_3_5 M
# 6 P 15 a_3_5 Y
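In tidyr 1.0.0 and later, gather is superseded by pivot_longer. A hedged sketch of the equivalent call, assuming a recent tidyr is installed and the same df as in the question (the name long is only illustrative):

```r
library(tidyr)

# Same data as in the question; strings are kept as characters
df <- data.frame(
  P     = c("D", "D", "P"),
  a_0_2 = c("M", "Y", "M"),
  a_3_5 = c("M", "M", "Y"),
  n     = c(48, 57, 15),
  stringsAsFactors = FALSE
)

# Pivot only the a_* columns; P and n are repeated for each of them
long <- pivot_longer(
  df,
  cols      = c(a_0_2, a_3_5),
  names_to  = "variable",
  values_to = "value"
)

# pivot_longer keeps the rows of each original record together, so sort by
# variable to get the same row order as the gather output
long <- long[order(long$variable), ]
long
```

Note that pivot_longer returns a tibble rather than a plain data.frame, so the printed output looks slightly different.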
If we assume that the columns in df[2:3] aren't factors (by adding stringsAsFactors = FALSE to the OP's data.frame call; this is the default since R 4.0.0), we can add a nice solution proposed by @Thela using base R only:
data.frame(df[c(1, 4)], stack(df[2:3]))
# P n values ind
# 1 D 48 M a_0_2
# 2 D 57 Y a_0_2
# 3 P 15 M a_0_2
# 4 D 48 M a_3_5
# 5 D 57 M a_3_5
# 6 P 15 Y a_3_5