Similar but different to: Split unique values into separate columns for multiple columns
My problem is the following. Suppose I have this data frame in R:
a<-data.frame(col1=c("a","a","a","d","a"),
col2=c("b","b","c","e","e"),
col3=c("c","d","e",NA,NA),
col4=c("d","e",NA,NA,NA),
col5=c("e",NA,NA,NA,NA))
print(a)
col1| col2| col3| col4| col5|
a b c d e
a b d e NA
a c e NA NA
d e NA NA NA
a e NA NA NA
I need another data frame like this:
b<-data.frame(col1=c("a","a","a",NA,"a"),
col2=c("b","b",NA,NA,NA),
col3=c("c",NA,"c",NA,NA),
col4=c("d","d",NA,"d",NA),
col5=c("e","e","e","e","e"))
print(b)
col1| col2| col3| col4| col5|
a b c d e
a b NA d e
a NA c NA e
NA NA NA d e
a NA NA NA e
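Sorry for not explaining the idea behind my problem better; that is why I am asking. I think I first need to split the values that differ from the rest of the group into new columns, and then align the rows so that equal values end up in the same column.
I think my question is similar to this one: Split unique values into separate columns for multiple columns
I would be very grateful if anyone could help.
Using some tidyverse libraries you can do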
library(dplyr)
library(tidyr)
a %>%
  mutate(id=row_number()) %>%
  pivot_longer(-id) %>%
  filter(!is.na(value)) %>%
  pivot_wider(id_cols=id, names_from="value", values_from="value") %>%
  select(-id)
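We use the pivot functions to reshape the data. The trick is simply to add an id column, which makes it easier to keep track of which row each value came from. This returns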
a b c d e
<chr> <chr> <chr> <chr> <chr>
1 a b c d e
2 a b NA d e
3 a NA c NA e
4 NA NA NA d e
5 a NA NA NA e
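To make the reshaping step easier to follow, here is a minimal sketch that splits the pipeline in two and inspects the intermediate long table. The names `long` and `wide` are just illustrative, and `names_sort = TRUE` is an addition that is not in the answer above: it asks `pivot_wider()` to order the new columns by value rather than by first appearance. Note that a row consisting only of NA would be dropped by the `filter()` step.

```r
library(dplyr)
library(tidyr)

a <- data.frame(col1 = c("a", "a", "a", "d", "a"),
                col2 = c("b", "b", "c", "e", "e"),
                col3 = c("c", "d", "e", NA, NA),
                col4 = c("d", "e", NA, NA, NA),
                col5 = c("e", NA, NA, NA, NA))

# Long format: one row per (id, value) pair, with NA entries removed
long <- a %>%
  mutate(id = row_number()) %>%
  pivot_longer(-id) %>%
  filter(!is.na(value))

# Wide format: one column per distinct value, filled with the value itself
wide <- long %>%
  pivot_wider(id_cols = id, names_from = value, values_from = value,
              names_sort = TRUE) %>%
  select(-id)

wide
```

If the original column names are wanted, `names(wide) <- names(a)` should work here, but only because this example happens to have as many distinct values as columns.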
Another base R option:
setNames(data.frame(sapply(sort(na.omit(unique(unlist(a)))),
                           function(x) ifelse(rowSums(a==x, na.rm=TRUE) > 0, x, NA))),
         colnames(a))
#> col1 col2 col3 col4 col5
#> 1 a b c d e
#> 2 a b <NA> d e
#> 3 a <NA> c <NA> e
#> 4 <NA> <NA> <NA> d e
#> 5 a <NA> <NA> <NA> e
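The one-liner above is fairly dense, so here is a sketch of the same idea unrolled into a helper. The function name `spread_values` is hypothetical, not from the answer. For each distinct non-NA value it builds one column that holds the value in every row where it occurs and NA elsewhere.

```r
a <- data.frame(col1 = c("a", "a", "a", "d", "a"),
                col2 = c("b", "b", "c", "e", "e"),
                col3 = c("c", "d", "e", NA, NA),
                col4 = c("d", "e", NA, NA, NA),
                col5 = c("e", NA, NA, NA, NA))

# Hypothetical helper: one output column per distinct value in df
spread_values <- function(df) {
  m <- as.matrix(df)
  vals <- sort(unique(m[!is.na(m)]))      # distinct non-NA values, sorted
  cols <- lapply(vals, function(v) {
    # TRUE for rows that contain v anywhere; keep v there, NA otherwise
    ifelse(rowSums(m == v, na.rm = TRUE) > 0, v, NA_character_)
  })
  names(cols) <- vals
  data.frame(cols, check.names = FALSE, stringsAsFactors = FALSE)
}

res <- spread_values(a)
# Reusing the original names only works because the number of distinct
# values equals the number of columns in this example
names(res) <- names(a)
res
```

Unlike the one-liner, the helper names the columns after the values themselves, so it should also cope with inputs where the number of distinct values differs from the number of columns (in that case skip the final renaming).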
We can do this in base R
t(apply(a, 1, function(x) {
  v1 <- character(length(x))
  v1[match(x, letters, nomatch = 0)] <- x
  v1}))
# [,1] [,2] [,3] [,4] [,5]
#[1,] "a" "b" "c" "d" "e"
#[2,] "a" "b" "" "d" "e"
#[3,] "a" "" "c" "" "e"
#[4,] "" "" "" "d" "e"
#[5,] "a" "" "" "" "e"
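The matrix above uses empty strings rather than NA, and the assignment may emit a length-mismatch warning for rows that contain NA. A possible variant, sketched under the assumption that every value is one of `letters[1:5]`, drops the NA entries first and returns a data frame with NA in the gaps. The names `lv` and `res` are illustrative.

```r
a <- data.frame(col1 = c("a", "a", "a", "d", "a"),
                col2 = c("b", "b", "c", "e", "e"),
                col3 = c("c", "d", "e", NA, NA),
                col4 = c("d", "e", NA, NA, NA),
                col5 = c("e", NA, NA, NA, NA))

lv <- letters[1:5]  # assumed set of possible values, one per output column

res <- t(apply(a, 1, function(x) {
  x <- x[!is.na(x)]                       # drop NA before matching
  out <- rep(NA_character_, length(lv))   # start from an all-NA row
  out[match(x, lv)] <- x                  # place each value at its slot
  out
}))

res <- as.data.frame(res, stringsAsFactors = FALSE)
names(res) <- names(a)
res
```

A value outside `lv` would make `match()` return NA and the assignment fail, so `lv` needs to cover everything that can appear in the data.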
Or another option is
b <- a
m1 <- t(apply(a, 1, function(x) {table(factor(x, levels = letters[1:5]))})) > 0
b[] <- colnames(m1)[col(m1)* NA^!m1]
b
# col1 col2 col3 col4 col5
#1 a b c d e
#2 a b <NA> d e
#3 a <NA> c <NA> e
#4 <NA> <NA> <NA> d e
#5 a <NA> <NA> <NA> e
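The `NA^!m1` expression is easy to misread, so here is a tiny sketch of what it does on a made-up 2 x 2 logical matrix (the matrix and the names `idx` and `picked` are purely illustrative). In R, `NA^0` is 1 and `NA^1` is NA, so multiplying by `NA^!m1` keeps the column index where `m1` is TRUE and turns it into NA where `m1` is FALSE.

```r
# Toy presence matrix: row 1 has both values, row 2 only has "b"
m1 <- matrix(c(TRUE, FALSE, TRUE, TRUE), nrow = 2,
             dimnames = list(NULL, c("a", "b")))

NA^c(TRUE, FALSE)            # NA^1 is NA, NA^0 is 1

# Column index where m1 is TRUE, NA where it is FALSE
idx <- col(m1) * NA^!m1

# Indexing the column names with idx yields the value or NA,
# in column-major order
picked <- colnames(m1)[idx]
picked
```

Assigning such a vector into `b[]` fills the data frame column by column, which is why the result lines up with the original shape.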
Or a slight variation of the above
t(apply(a, 1, function(x) {
  tbl1 <- table(factor(x, levels = letters[1:5]))
  ifelse(tbl1 >0, names(tbl1), NA)}))