I'd like to take some data that is currently in rows and transform it into columns. The idea is to have a single row for every value of x1 in df, and to split the data in x3 into two columns on the basis of each unique x1 and x2 combination.
> df
  x1 x2 x3
1  A  0  4
2  A  1  2
3  B  1  1
4  C  0  5
5  C  1  2
6  D  0  1
7  D  1  1
8  E  0  3
This may involve a multi-step cleanup process, but eventually I'd like to get something like the table below, df_rev. Note that the missing combinations B0 and E1 have been replaced with 0 values.
> df_rev
  x1 x3_0 x3_1
1  A    4    2
3  B    0    1
4  C    5    2
6  D    1    1
8  E    3    0
Right now I've been trying to fit this answer to my situation, but without much luck. Any help would be much appreciated.
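# Recreate the example data; the first column (No) holds the row numbers
df = '
No x1 x2 x3
1 A 0 4
2 A 1 2
3 B 1 1
4 C 0 5
5 C 1 2
6 D 0 1
7 D 1 1
8 E 0 3'
df = read.table(text = df, header = TRUE)
library(reshape)
# One row per x1, one column per x2 value, filled with the x3 values
nf = cast(df, x1 ~ x2, value = "x3")
colnames(nf) = c('x1', 'x3_0', 'x3_1')
# Missing x1/x2 combinations come back as NA; replace them with 0
nf[is.na(nf)] <- 0
nf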
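If you would rather not load a package, a rough base R sketch of the same idea is below. It uses xtabs, which is a different technique from the cast call above, and it assumes each x1/x2 pair occurs at most once (xtabs sums x3 within each pair, so duplicates would be added together). Combinations that never occur come out as 0 directly, so no NA cleanup is needed. The data frame is rebuilt inline here only to keep the example self-contained.

```r
# Example data from the question, built directly as a data frame
df <- data.frame(
  x1 = c("A", "A", "B", "C", "C", "D", "D", "E"),
  x2 = c(0, 1, 1, 0, 1, 0, 1, 0),
  x3 = c(4, 2, 1, 5, 2, 1, 1, 3)
)

# Cross-tabulate: rows are x1, columns are x2, cells are the sum of x3.
# Pairs that do not occur (B/0 and E/1) are filled with 0.
tab <- xtabs(x3 ~ x1 + x2, data = df)

# Turn the table back into an ordinary data frame with the desired names
df_rev <- data.frame(
  x1   = rownames(tab),
  x3_0 = as.vector(tab[, "0"]),
  x3_1 = as.vector(tab[, "1"])
)
df_rev
```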
You can also use the built-in reshape function. The sub expression just replaces the first . in each variable name with _, which might be more convenient than retyping all the new variable names if you have many "times" (here you just have two, but you can easily have many more than that):
df_rev = reshape(df, timevar="x2", idvar="x1", direction="wide")
names(df_rev) = sub("\\.", "_", names(df_rev))
df_rev[is.na(df_rev)] = 0
df_rev
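One detail worth knowing about the renaming step: sub only replaces the first match in each name, while gsub replaces every match. For names like x3.0 the two behave the same, but they differ if the original column names already contain dots. A small illustration, where the name my.var.0 is made up for the example:

```r
# Names similar to what reshape(..., direction = "wide") produces;
# "my.var.0" is a hypothetical name with more than one dot
nms <- c("x1", "x3.0", "x3.1", "my.var.0")

first_only <- sub("\\.", "_", nms)   # replaces only the first dot in each name
all_dots   <- gsub("\\.", "_", nms)  # replaces every dot in each name

first_only  # "x1" "x3_0" "x3_1" "my_var.0"
all_dots    # "x1" "x3_0" "x3_1" "my_var_0"
```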