
Reorganizing dataframe with multiple header types following "tidy" approach in R

I have a dataframe that looks somewhat like this:

Age  A1U_sweet  A2F_dip  A3U_bbq  C1U_sweet  C2F_dip  C3U_bbq  Comments
23   1          2        1        NA         NA       NA       Good
54   NA         NA       NA       4          1        2        ABCD
43   2          4        7        NA         NA       NA       HiHi

I am trying to reorganize it in the way shown below to make it more "tidy". Is there a way to do this that also incorporates the Age and Comments columns in the same style as the other variables? One idea for incorporating them is shown below, but I am open to other suggestions. How would I modify the following code to account for several different styles of column name?

library(tidyr)

df <- data.frame(id = 1:nrow(df), df)
dfl <- gather(df, key = "key", value = "value", -id)
dfl <- separate(dfl, key, into = c("key", "kind", "type"), sep = c(1, 4))
df2 <- spread(dfl, key, value)
df2
##   id kind  type     A    C
## 1  1  Age   Age    23   23
## 2  1  1U_ sweet     1   NA
## 3  1  2F_   dip     2   NA
## 4  1  3U_   bbq     1   NA
## 5  1  Com   Com  Good Good
## 6  2  Age   Age    54   54
## 7  2  1U_ sweet    NA    4
## 8  2  2F_   dip    NA    1
## 9  2  3U_   bbq    NA    2
##10  2  Com   Com  ABCD ABCD
##11  3  Age   Age    43   43
##12  3  1U_ sweet     2   NA
##13  3  2F_   dip     4   NA
##14  3  3U_   bbq     7   NA
##15  3  Com   Com  HiHi HiHi

And how would I modify the following code to return the data back to how it originally was?

df <- gather(df2, key = "key", value = "value", A, B, C)
df <- unite(df, "key", key, kind, type, sep = "")
df <- spread(df, key, value)

For context, this question was prompted by Ista's comment under this question: Combining columns in R based on matching beginnings of column title names

Since Age and Comments are presumably measured at the level of whatever a row in your original data is, just bring them along for the ride:

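# Add a row identifier so each original row can be tracked through the reshaping
df <- data.frame(id = 1:nrow(df), df)

# Gather only the measurement columns; id, Age, and Comments are left as they are
dfl <- gather(df, key = "key", value = "value", -id, -Age, -Comments)
# Split each name by position: first character (key), next three (kind), remainder (type)
dfl <- separate(dfl, key, into = c("key", "kind", "type"), sep = c(1, 4))
df2 <- spread(dfl, key, value)
df2

# Combine A and C into a new column B, taking A where it is present and C otherwise
df2 <- transform(df2, B = ifelse(is.na(A), C, A))
df2

# Reverse the process: gather A, B, and C, rebuild the column names, and spread.
# The wide result also gains B1U_sweet, B2F_dip, and B3U_bbq columns.
df <- gather(df2, key = "key", value = "value", A, B, C)
df <- unite(df, "key", key, kind, type, sep = "")
df <- spread(df, key, value)
df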
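
As an alternative sketch, not part of the original answer: tidyr 1.0.0 and later provide pivot_longer() and pivot_wider(), which supersede gather() and spread(). Assuming the measurement column names always consist of one letter, then three characters, then the type, and using example data rebuilt from the table in the question, the same reshaping and the round trip might look like this:

```r
library(tidyr)

# Example data rebuilt from the table in the question (an assumption about types)
df <- data.frame(
  Age       = c(23, 54, 43),
  A1U_sweet = c(1, NA, 2),
  A2F_dip   = c(2, NA, 4),
  A3U_bbq   = c(1, NA, 7),
  C1U_sweet = c(NA, 4, NA),
  C2F_dip   = c(NA, 1, NA),
  C3U_bbq   = c(NA, 2, NA),
  Comments  = c("Good", "ABCD", "HiHi"),
  stringsAsFactors = FALSE
)
df <- data.frame(id = seq_len(nrow(df)), df)

# Long form: the first character of each name becomes a value column (A or C),
# the next three characters become `kind`, and the remainder becomes `type`.
# id, Age, and Comments are excluded from the pivot and carried along unchanged.
long <- pivot_longer(
  df,
  cols = -c(id, Age, Comments),
  names_to = c(".value", "kind", "type"),
  names_pattern = "^(.)(.{3})(.*)$"
)

# Back to wide form: paste the value-column name, kind, and type together
# with no separator to rebuild names such as "A1U_sweet".
wide <- pivot_wider(
  long,
  names_from = c(kind, type),
  values_from = c(A, C),
  names_sep = ""
)
```

Here long has one row per combination of id, kind, and type, with A and C as columns, and wide carries the original column names again (plus id). The regular expression in names_pattern is an assumption about the naming scheme and would need adjusting for other styles of column name.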


 