
Multi-Row Data to Columns

I'd like to take some data that is currently in rows and transform it into columns. The idea is to have a single row for every value of x1 in df, and to split the data in x3 into two columns, one per value of x2, so that each unique x1 and x2 combination fills one cell.

> df
    x1 x2 x3
1    A  0  4
2    A  1  2
3    B  1  1
4    C  0  5
5    C  1  2
6    D  0  1
7    D  1  1
8    E  0  3

This may involve a multi-step cleanup process, but eventually I'd like to get something like the table below, df_rev. Note that the missing combinations B0 and E1 have been replaced with 0 values.

> df_rev
    x1 x3_0 x3_1
1    A    4    2
3    B    0    1
4    C    5    2
6    D    1    1
8    E    3    0

Right now I've been trying to fit this answer to my situation, but without much luck. Any help would be much appreciated.

df='
    No    x1 x2 x3
    1    A  0  4
    2    A  1  2
    3    B  1  1
    4    C  0  5
    5    C  1  2
    6    D  0  1
    7    D  1  1
    8    E  0  3'

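# Parse the text block above into a data frame
df <- read.table(text = df, header = TRUE)

library(reshape)
# One row per x1, one column per x2 value, cells taken from x3
nf <- cast(df, x1 ~ x2, value = "x3")
colnames(nf) <- c('x1', 'x3_0', 'x3_1')
# Combinations absent from the input come back as NA; set them to 0
nf[is.na(nf)] <- 0
nf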
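
If installing a package is not an option, the same table can probably be built with base R alone. The following is only a sketch, not part of the answer above: it assumes each x1 and x2 combination occurs at most once (xtabs sums x3 within a cell, so duplicates would be added together), and it rebuilds df inline so the snippet runs on its own.

```r
# Sample data from the question
df <- data.frame(
  x1 = c("A", "A", "B", "C", "C", "D", "D", "E"),
  x2 = c(0, 1, 1, 0, 1, 0, 1, 0),
  x3 = c(4, 2, 1, 5, 2, 1, 1, 3)
)

# Cross-tabulate x3 by x1 (rows) and x2 (columns);
# combinations that never occur get a cell value of 0
tab <- xtabs(x3 ~ x1 + x2, data = df)

# Turn the table into a plain data frame with the desired column names
df_rev <- data.frame(
  x1   = rownames(tab),
  x3_0 = as.vector(tab[, "0"]),
  x3_1 = as.vector(tab[, "1"])
)
df_rev
```

Because xtabs fills empty cells with 0 itself, no separate NA replacement step is needed in this variant.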

You can also use the built-in "reshape" function. The sub expression replaces the first . in each variable name with _, which might be more convenient than retyping all the new variable names if you have many "times" (here you have just two, but you could easily have many more):

df_rev = reshape(df, timevar="x2", idvar="x1", direction="wide")
names(df_rev) = sub("\\.", "_", names(df_rev))
df_rev[is.na(df_rev)] = 0
df_rev
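
One detail about the renaming step may be worth seeing in isolation: sub replaces only the first match in each string, while gsub replaces every match. The names below are made up for illustration (my.var.0 does not appear in the question); with the question's data, each new name contains a single dot, so both functions give the same result.

```r
# Hypothetical column names, including one with two dots
nms <- c("x1", "x3.0", "x3.1", "my.var.0")

# sub: only the first "." in each name is replaced
first_only <- sub("\\.", "_", nms)
first_only   # "x1" "x3_0" "x3_1" "my_var.0"

# gsub: every "." in each name is replaced
all_dots <- gsub("\\.", "_", nms)
all_dots     # "x1" "x3_0" "x3_1" "my_var_0"
```

If the original variable names already contain dots, check which of the two behaviors is wanted before renaming.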
