R -- transform tab delimited string to long format
How do you reshape a string delimited by a tab or space into a long format? The string (called label here) can be of different lengths.
I have this:
var label
1 work 100 101
2 sleep 500 409 200
and I want this:
var code
1 work 100
2 work 101
3 sleep 500
4 sleep 409
5 sleep 200
# data
df = data.frame(var = c("work", 'sleep'), label = c('100 101', '500 409 200'))
library(tidyr)
df %>%
separate_rows(label)
# A tibble: 5 x 2
var label
<chr> <chr>
1 work 100
2 work 101
3 sleep 500
4 sleep 409
5 sleep 200
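Since the question mentions both tabs and spaces, here is a minimal sketch that passes an explicit separator pattern to separate_rows(). The sample data (one tab-delimited and one space-delimited label) and the name long are assumptions made for illustration; sep and convert are documented arguments of separate_rows().

```r
library(tidyr)

# Illustrative data: the first label uses a tab, the second uses spaces
df <- data.frame(var = c("work", "sleep"),
                 label = c("100\t101", "500 409 200"))

# Split on runs of spaces or tabs; convert = TRUE turns the pieces into numbers
long <- separate_rows(df, label, sep = "[ \t]+", convert = TRUE)

# Rename the split column to match the desired output
names(long)[names(long) == "label"] <- "code"
long
```

With convert = FALSE (the default), the code column stays character, which may be preferable if the codes can have leading zeros.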
A great answer was already posted. But let's say you had a strange delimiter, like this:
df = data.frame(var = c("work", 'sleep'), label = c('100-gh-101', '500-gh-409-gh-200'))
In that case, you could split on the delimiter with strsplit() (which takes a regex) and then unnest():
library(dplyr)  # mutate()
library(tidyr)  # unnest()
df %>%
  mutate(label2 = strsplit(label, "-gh-")) %>%
  unnest(label2)
var label label2
<chr> <chr> <chr>
1 work 100-gh-101 100
2 work 100-gh-101 101
3 sleep 500-gh-409-gh-200 500
4 sleep 500-gh-409-gh-200 409
5 sleep 500-gh-409-gh-200 200
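One caveat with strsplit() is that the split pattern is treated as a regular expression by default, so a delimiter containing metacharacters needs escaping or fixed = TRUE. The sketch below assumes a hypothetical "|" delimiter (not from the original question) and uses only base R: rep() with lengths() repeats each var once per piece.

```r
# Hypothetical data with a delimiter that is a regex metacharacter
df <- data.frame(var = c("work", "sleep"),
                 label = c("100|101", "500|409|200"))

# fixed = TRUE makes strsplit() treat "|" literally instead of as regex alternation
parts <- strsplit(df$label, "|", fixed = TRUE)

# Repeat each var by the number of pieces it produced, then flatten the pieces
long <- data.frame(var = rep(df$var, lengths(parts)),
                   code = unlist(parts))
long
```

The same rep()/lengths() pattern works with any split pattern, so it can stand in for unnest() when adding a tidyverse dependency is not desirable.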
Using strsplit in Map:
Map(cbind, df$var, strsplit(df$label, ' ')) |> do.call(what=rbind.data.frame)
# V1 V2
# work.1 work 100
# work.2 work 101
# sleep.1 sleep 500
# sleep.2 sleep 409
# sleep.3 sleep 200
or in by:
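# Group by var in order of first appearance (by() otherwise sorts groups alphabetically)
by(df, factor(df$var, levels=unique(df$var)), \(x) with(x, cbind(var, code=strsplit(label, split=' ')[[1]]))) |>
  do.call(what=rbind.data.frame)
# var code
# work.1 work 100
# work.2 work 101
# sleep.1 sleep 500
# sleep.2 sleep 409
# sleep.3 sleep 200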