How to reshape data from wide to long with multiple variables?
I have a very large dataset that I need to reshape from wide to long.
Here is a demo of my dataset, which covers all the situations:
genename case1 case2 case3 strand
TP53 1 0 1 pos
TNN 0 0 1 pos
CD13 0 0 0 pos
AP35 1 1 1 neg
A case should be kept and reshaped to long format only where a 1 exists, like the following:
genename case strand
TP53 case1 pos
TP53 case3 pos
TNN case3 pos
AP35 case1 neg
AP35 case2 neg
AP35 case3 neg
How can I do this kind of reshape in R?
# sample data from the question
df <- read.table(text="genename case1 case2 case3 strand
TP53 1 0 1 pos
TNN 0 0 1 pos
CD13 0 0 0 pos
AP35 1 1 1 neg", header = TRUE)
library(tidyverse)
# stack the case columns into key/value pairs, then keep only the rows with a 1
df %>%
  gather( case, case_value, c(case1, case2, case3) ) %>%
  filter( case_value == 1 )
#   genename strand  case case_value
# 1     TP53    pos case1          1
# 2     AP35    neg case1          1
# 3     AP35    neg case2          1
# 4     TP53    pos case3          1
# 5      TNN    pos case3          1
# 6     AP35    neg case3          1
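In tidyr 1.0.0 and later, gather() is superseded by pivot_longer(). A minimal sketch of the same reshape with the newer function follows; it assumes tidyr >= 1.0.0 is installed, and the names `long` and `case_value` are just illustrative. It also drops the helper column so that only the three columns asked for in the question remain.

```r
library(dplyr)
library(tidyr)

# sample data from the question
df <- read.table(text = "genename case1 case2 case3 strand
TP53 1 0 1 pos
TNN 0 0 1 pos
CD13 0 0 0 pos
AP35 1 1 1 neg", header = TRUE)

long <- df %>%
  # stack every column whose name starts with "case"
  pivot_longer(cols = starts_with("case"),
               names_to = "case", values_to = "case_value") %>%
  # keep only the gene/case pairs that are flagged with a 1
  filter(case_value == 1) %>%
  # drop the helper column and use the column order from the question
  select(genename, case, strand)

print(long)
```

Unlike gather(), pivot_longer() keeps the rows of each gene together, so the result should come out in the same row order as the desired output in the question.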
library(data.table)
# note: setDT() converts df to a data.table by reference (df itself is modified)
data.table::melt( setDT(df),
                  id.vars = c("genename", "strand"),
                  measure.vars = c("case1", "case2", "case3") )[value == 1, ][]
#    genename strand variable value
# 1:     TP53    pos    case1     1
# 2:     AP35    neg    case1     1
# 3:     AP35    neg    case2     1
# 4:     TP53    pos    case3     1
# 5:      TNN    pos    case3     1
# 6:     AP35    neg    case3     1
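If you want the data.table result to have exactly the columns from the question, melt() can name the new columns directly and the filter step can select them. This is only a sketch: it uses as.data.table() to work on a copy instead of setDT(), so the original data frame stays untouched, and `dt`, `long` and `case_value` are illustrative names.

```r
library(data.table)

# sample data from the question
df <- read.table(text = "genename case1 case2 case3 strand
TP53 1 0 1 pos
TNN 0 0 1 pos
CD13 0 0 0 pos
AP35 1 1 1 neg", header = TRUE)

# as.data.table() returns a copy, so df remains a plain data.frame
dt <- as.data.table(df)

long <- melt(dt,
             id.vars = c("genename", "strand"),  # all other columns get stacked
             variable.name = "case",
             value.name = "case_value")[
  case_value == 1,                # keep only the flagged gene/case pairs
  .(genename, case, strand)       # drop the helper column, reorder
]

print(long)
```

The rows are still grouped by case rather than by gene, as in the output above, and `case` is a factor by default; pass `variable.factor = FALSE` to melt() if you prefer a character column.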
# benchmark both approaches on the sample data
microbenchmark::microbenchmark(
  tidyverse = { df %>%
      gather( case, case_value, c(case1, case2, case3) ) %>%
      filter( case_value == 1 ) },
  data.table = { melt( setDT(df),
                       id.vars = c("genename", "strand"),
                       measure.vars = c("case1", "case2", "case3") )[value == 1, ][] },
  times = 1000)
# Unit: milliseconds
#        expr      min       lq     mean   median       uq      max neval
#   tidyverse 2.335393 2.569323 3.157647 2.737729 3.089605 29.29513  1000
#  data.table 1.374062 1.551656 1.845519 1.676229 1.838309 28.23499  1000
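The timings above are for the four-row sample only, and because setDT() converts df in place, later tidyverse runs in that benchmark operate on a data.table too. Since the question mentions a very large dataset, it may be worth timing both approaches on something closer to your real size. Here is a rough sketch with synthetic data: the row count `n`, the data frame `big` and the 50% rate of 1s are made-up assumptions, and each approach gets its own input object so neither affects the other.

```r
library(dplyr)
library(tidyr)
library(data.table)

set.seed(1)
n <- 1e5  # hypothetical number of genes; adjust to your data

# synthetic wide data with the same layout as the question
big <- data.frame(
  genename = paste0("gene", seq_len(n)),
  case1 = rbinom(n, 1, 0.5),
  case2 = rbinom(n, 1, 0.5),
  case3 = rbinom(n, 1, 0.5),
  strand = sample(c("pos", "neg"), n, replace = TRUE)
)
big_dt <- as.data.table(big)  # separate copy for the data.table run

t_tidy <- system.time(
  res_tidy <- big %>%
    pivot_longer(starts_with("case"),
                 names_to = "case", values_to = "case_value") %>%
    filter(case_value == 1)
)

t_dt <- system.time(
  res_dt <- melt(big_dt,
                 id.vars = c("genename", "strand"),
                 variable.name = "case",
                 value.name = "case_value")[case_value == 1]
)

# elapsed seconds for each approach (values depend on your machine)
print(c(tidyverse = t_tidy[["elapsed"]], data.table = t_dt[["elapsed"]]))
```

A single system.time() call is noisy, so treat the numbers as a rough indication and repeat the run (or use microbenchmark as above) before drawing conclusions.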