
R wide to long reshape with column names

I have data in this format:

A1 A2 B1 B2  C1  C2
10  5 11  5  21  10

And I want to convert it to:

  1  2
A 10 5
B 11 5
C 21 10

How can I do it in R?

We can gather the data into 'long' format, separate the 'key' column into two by splitting just before the numeric part, spread the result back to 'wide' format, and turn the 'key1' column into row names.

library(tidyverse)
gather(df1) %>%
    separate(key, into = c('key1', 'key2'), sep="(?=\\d)") %>% 
    spread(key2, value) %>% 
    column_to_rownames('key1')
#  1  2
#A 10  5
#B 11  5
#C 21 10
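
In tidyr 1.0.0 and later, gather() and spread() are superseded by pivot_longer() and pivot_wider(). The following is a minimal sketch of the same reshape in a single call, assuming every column name is made of a non-digit prefix followed by digits. The column name "let" and the variable name res are chosen here for illustration only.

```r
library(tidyr)

# Same single-row input as in the data section of this answer
df1 <- data.frame(A1 = 10L, A2 = 5L, B1 = 11L, B2 = 5L, C1 = 21L, C2 = 10L)

# names_pattern splits each name into a letter part and a digit part.
# The letter part becomes the "let" column; the special ".value" entry
# makes the digit part ("1", "2") the names of the output columns.
res <- pivot_longer(
  df1,
  cols = everything(),
  names_to = c("let", ".value"),
  names_pattern = "(\\D+)(\\d+)"
)

print(res)
```

The result is a tibble with one row per letter. If row names are needed as in the output above, tibble::column_to_rownames() can be applied to as.data.frame(res).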

Data:

df1 <- structure(list(A1 = 10L, A2 = 5L, B1 = 11L, B2 = 5L, C1 = 21L, 
     C2 = 10L), class = "data.frame", row.names = c(NA, -1L))

The question is tagged r, reshape and reshape2, so we show a solution using each of those.

1) xtabs A base R solution is the following.

let <- gsub("\\d", "", names(DF))
num <- gsub("\\D", "", names(DF))
tab <- xtabs(unlist(DF) ~ let + num)

giving:

> tab
   num
let  1  2
  A 10  5
  B 11  5
  C 21 10

or for a data frame:

cbind(let = rownames(tab), as.data.frame.matrix(tab))

giving:

  let  1  2
A   A 10  5
B   B 11  5
C   C 21 10
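
To see why the xtabs call in 1) works, it can help to look at the two helper vectors it is built from. This is a small self-contained sketch that rebuilds DF inline with data.frame() instead of read.table(), so that it runs on its own.

```r
# Same values as the DF defined in the Note
DF <- data.frame(A1 = 10L, A2 = 5L, B1 = 11L, B2 = 5L, C1 = 21L, C2 = 10L)

# Remove the digits to keep the letter part of each column name
let <- gsub("\\d", "", names(DF))
print(let)  # "A" "A" "B" "B" "C" "C"

# Remove the non-digits to keep the number part of each column name
num <- gsub("\\D", "", names(DF))
print(num)  # "1" "2" "1" "2" "1" "2"

# xtabs sums the values on the left-hand side within each let/num cell;
# every cell holds exactly one value here, so the sum is that value
tab <- xtabs(unlist(DF) ~ let + num)
print(tab["B", "1"])  # 11
```

Because xtabs sums within cells, duplicated column names in the input would be added together rather than reported as an error.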

2) reshape Another base R solution is the following. let and num are from above.

varying <- split(names(DF), num)
reshape(DF, dir = "long", varying = varying, v.names = names(varying),
  times = unique(let), timevar = "let")[-4]

giving:

    let  1  2
1.A   A 10  5
1.B   B 11  5
1.C   C 21 10
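
The varying argument in 2) is a list that groups the input columns by their number part, one list element per output column. A short self-contained sketch of what split() produces here, with DF rebuilt inline so the snippet runs on its own:

```r
# Same values as the DF defined in the Note
DF <- data.frame(A1 = 10L, A2 = 5L, B1 = 11L, B2 = 5L, C1 = 21L, C2 = 10L)
num <- gsub("\\D", "", names(DF))

# Group the column names by their numeric suffix
varying <- split(names(DF), num)
print(varying)
# element "1" holds "A1" "B1" "C1"
# element "2" holds "A2" "B2" "C2"
```

Each element of varying becomes one column of the long result, which is why names(varying) is passed as v.names.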

3) reshape2 Using let and num from above:

library(reshape2)

dcast(let ~ num, data = data.frame(value = unlist(DF)), value.var = "value")

giving:

  let  1  2
1   A 10  5
2   B 11  5
3   C 21 10

Note

The input in reproducible form:

Lines <- "
A1 A2 B1 B2  C1  C2
10  5 11  5  21  10"
DF <- read.table(text = Lines, header = TRUE)

A data.table solution:

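library(data.table)
library(magrittr)
# df1 is a plain data.frame; convert it first, because := only works on a data.table
melt(as.data.table(df1), measure.vars = names(df1)) %>%
  # split each name into single characters, e.g. "A1" -> "A" and "1"
  .[, c("l", "n") := tstrsplit(variable, "")] %>%
  dcast(l ~ n, value.var = "value")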

   l  1  2
1: A 10  5
2: B 11  5
3: C 21 10
