Split Column by variable and create new column R
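I am trying to split the column below using the answer from the first question. Currently, I am creating a new column in the df using the letter. I would like to use the letter before each name as the new column name; in the case below these are G, D, W, C and UTIL. Since there is only a space between the category (G) and the name (First Person), I am struggling to separate the category from the first and last name and to put each name into the appropriate column.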
library(stringr)
test <- data.frame(Lineup = c("G First Person D Another Last W Fake Name C Test Another UTIL Another Test", "G Fake Name W Another Fake D Third person UTIL Another Name C Name Another "))
1 G First Person D Another Last W Fake Name C Test Another UTIL Another Test
2 G Fake Name W Another Fake D Third person UTIL Another Name C Name Another
test$G <- str_split_fixed(test$Lineup, " ", 2)
Result:
G
G
Desired result:
           G            D            W            C         UTIL
First Person Another Last    Fake Name Test Another Another Test
   Fake Name Third Person Another Fake Name Another Another Name
Here is one way to do it using tidyverse:
# example data
test <- data.frame(Lineup = c("G First Person D Another Last W Fake Name C Test Another UTIL Another Test",
                              "G Fake Name W Another Fake D Third person UTIL Another Name C Name Another "))
library(tidyverse)
# create a dataset of words and info about
# their initial row id
# whether they should be a column in our new dataset
# group to join on
dt_words = test %>%
  mutate(id = row_number()) %>%
  separate_rows(Lineup) %>%
  mutate(is_col = Lineup %in% c(LETTERS, "UTIL"),
         group = cumsum(is_col))
# get the corresponding values of your new dataset
dt_values = dt_words %>%
  filter(is_col == FALSE) %>%
  group_by(group, id) %>%
  summarise(values = paste0(Lineup, collapse = " "))
# get the columns of your new dataset
# join corresponding values
# reshape data
dt_words %>%
  filter(is_col == TRUE) %>%
  select(-is_col) %>%
  inner_join(dt_values, by=c("group","id")) %>%
  select(-group) %>%
  spread(Lineup, values) %>%
  select(-id)
#              C            D            G         UTIL            W
# 1 Test Another Another Last First Person Another Test    Fake Name
# 2 Name Another Third person    Fake Name Another Name Another Fake
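Note that the assumption here is that the values are always separated by a single capital letter (or UTIL), and that those markers become the columns of your new dataset. A name that is itself a single capital letter, such as a middle initial, would be treated as a column marker.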
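For comparison, the same idea (flag the marker words, number the groups with cumsum, paste the remaining words back together) can be sketched in base R without any packages. This is only a rough sketch: split_lineup is a hypothetical helper name, not something from the answer above, and it assumes the set of markers is known in advance (G, D, W, C, UTIL) instead of treating every capital letter as a marker.

# Hypothetical helper (an assumption, not part of the original answer):
# turn one lineup string into a named character vector, one element per marker.
split_lineup <- function(x, labels = c("G", "D", "W", "C", "UTIL")) {
  # split on whitespace after dropping leading/trailing spaces
  words <- strsplit(trimws(x), "[[:space:]]+")[[1]]
  is_label <- words %in% labels
  # every marker starts a new group; words before the first marker get group 0
  group <- cumsum(is_label)
  keep <- !is_label & group > 0
  # collect the words of each group and paste them back into one name
  pieces <- split(words[keep], group[keep])
  values <- vapply(pieces, paste, character(1), collapse = " ")
  # group i belongs to the i-th marker found in the string
  names(values) <- words[is_label][as.integer(names(pieces))]
  # reorder to the requested column order; markers that are absent become NA
  out <- unname(values[labels])
  names(out) <- labels
  out
}

test <- data.frame(Lineup = c("G First Person D Another Last W Fake Name C Test Another UTIL Another Test",
                              "G Fake Name W Another Fake D Third person UTIL Another Name C Name Another "),
                   stringsAsFactors = FALSE)

# one row per lineup, one column per marker
result <- as.data.frame(do.call(rbind, lapply(test$Lineup, split_lineup)),
                        stringsAsFactors = FALSE)
result

Passing the markers explicitly keeps the columns in the order of the desired result and avoids misreading a stray capital letter as a marker, at the cost of having to list the markers up front.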