Create new rows in data frame based on multiple values of column
I have been looking for a specific answer to my problem, without success.
First, I have a data frame of 48 variables that looks like this:
> df
Text Screen_Name ...
1 a text where @Sam and @Su and @Jim are addressed Peter
2 a text where @Eric is addressed Margret
3 a text where @Sarah and @Adam are addressed John
Now I extract all strings matching the pattern "@\\S+" and store them in a new column:
df$Addressees <- stringr::str_extract_all(df$Text, "@\\S+")
This gives me:
... Screen_Name Addressees ...
1 Peter c("@Sam", "@Su", "@Jim")
2 Margret @Eric
3 John c("@Sarah", "@Adam")
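Now I would like to create a new two-column data frame in which a new row is created for each addressee, repeating the corresponding value of the "Screen_Name" column: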
> df
Screen_Name Addressees
1 Peter Sam
2 Peter Su
3 Peter Jim
4 Margret Eric
5 John Sarah
6 John Adam
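I have tried solutions from similar questions, but none of them seemed to work.
Thank you very much for your help!
OK, here is a reproducible example: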
# create df with a list column holding each person's friends
ego <- c("peter", "margaret", "john")
friends <- list(c("sam", "su", "jim"), c("eric"), c("sarah", "adam"))
df <- data.frame(ego, friends = I(friends), stringsAsFactors = FALSE)
# repeat each row once per friend
times <- lengths(df$friends)
df <- df[rep(seq_len(nrow(df)), times), ]
# replace the list column with the flattened vector of friends
df$friends <- unlist(friends)
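The same repeat-and-unlist idea can be applied directly to the text column from the question. The following is only a minimal base R sketch: the sample data frame is a hypothetical reconstruction of the question's `Text` and `Screen_Name` columns, and `regmatches()` with `gregexpr()` stands in for `str_extract_all()`.

```r
# Hypothetical sample data mirroring the question's Text and Screen_Name columns
df <- data.frame(
  Text = c("a text where @Sam and @Su and @Jim are addressed",
           "a text where @Eric is addressed",
           "a text where @Sarah and @Adam are addressed"),
  Screen_Name = c("Peter", "Margret", "John"),
  stringsAsFactors = FALSE
)

# Extract every "@name" token per row; the result is a list of character vectors
mentions <- regmatches(df$Text, gregexpr("@\\S+", df$Text))

# Repeat each Screen_Name once per mention, then flatten the list
# and drop the leading "@"
result <- data.frame(
  Screen_Name = rep(df$Screen_Name, lengths(mentions)),
  Addressees  = sub("^@", "", unlist(mentions)),
  stringsAsFactors = FALSE
)
```

Rows whose text contains no mention get a length of zero and therefore do not appear in `result`.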
You can also try data.table, using the df created by @raistlin:
library(data.table)
setDT(df)[, .(friends = unlist(friends)), by = "ego"]
ego friends
1: peter sam
2: peter su
3: peter jim
4: margaret eric
5: john sarah
6: john adam
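Now, with the additional context provided by the OP, the data.table solution can be streamlined to tackle the underlying problem directly.
To remove the leading @ in the Addressees column, as requested by the OP, the regular expression needs to be modified to use a positive lookbehind.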
library(data.table)
# read data (to make it a reproducible example)
dt <- fread("Text; Screen_Name
a text where @Sam and @Su and @Jim are addressed; Peter
a text where @Eric is addressed; Margret
a text where @Sarah and @Adam are addressed; John")
# use str_extract_all with modified regex
dt[, .(Addressees = unlist(stringr::str_extract_all(Text, "(?<=@)\\S+"))),
by = .(Screen_Name)]
# Screen_Name Addressees
#1: Peter Sam
#2: Peter Su
#3: Peter Jim
#4: Margret Eric
#5: John Sarah
#6: John Adam
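To see what the lookbehind changes, the two patterns can be compared on a single string. This is a small base R sketch rather than part of the answer above; note that base R, unlike stringr, needs `perl = TRUE` for lookaround syntax.

```r
x <- "a text where @Sam and @Su and @Jim are addressed"

# Plain pattern: each match includes the leading "@"
with_at <- regmatches(x, gregexpr("@\\S+", x))[[1]]

# Positive lookbehind: "@" must precede the match but is not part of it
without_at <- regmatches(x, gregexpr("(?<=@)\\S+", x, perl = TRUE))[[1]]

print(with_at)     # "@Sam" "@Su"  "@Jim"
print(without_at)  # "Sam" "Su"  "Jim"
```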
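Does this help?
Input:
Screen_Name <- c("Peter", "Margaret", "John")
# keep Addressees as a list so each name stays paired with its own addressees
Addressees <- list(c("@Sam", "@Su", "@Jim"), "@Eric", c("@Sarah", "@Adam"))
The tidyverse way:
library(tidyr)
# unnest() expands the list column into one row per addressee
df <- tibble::tibble(Screen_Name, Addressees) %>% unnest(Addressees)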
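For the data frame described in the question, the list column produced by `str_extract_all()` can be expanded into one row per addressee with `tidyr::unnest()`. The following is a sketch under assumptions: the sample tibble is a hypothetical reconstruction of the question's data, and the lookbehind pattern is used so that the leading "@" is dropped.

```r
library(tidyr)  # provides unnest()

# Hypothetical sample data mirroring the question's Text and Screen_Name columns
df <- tibble::tibble(
  Text = c("a text where @Sam and @Su and @Jim are addressed",
           "a text where @Eric is addressed",
           "a text where @Sarah and @Adam are addressed"),
  Screen_Name = c("Peter", "Margret", "John")
)

# str_extract_all() returns a list, so Addressees becomes a list column
df$Addressees <- stringr::str_extract_all(df$Text, "(?<=@)\\S+")

# unnest() turns each element of the list column into its own row
long <- unnest(df[, c("Screen_Name", "Addressees")], Addressees)
```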