Get the column name and assign it as a value in an unlisted column of a data frame using R
I have a data frame that looks like this:
   negative     positive
1:              enjoyed
2: hate         famous,helpful
3: forget,poor
4: hate         enjoyed, kind
I want to convert it into something like this:
text sentiment
1 hate negative
2 forget negative
3 poor negative
4 enjoyed positive
5 famous positive
6 helpful positive
7 kind positive
I'd appreciate your help.
How about something like:
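df0 <- data.frame(
  negative = c("", "hate", "forget,poor", "hate"),
  positive = c("enjoyed", "famous,helpful", "", "enjoyed, kind"),
  stringsAsFactors = FALSE
)
# Split each column on commas, trim whitespace, and keep the unique terms.
# lapply() always returns a list; sapply() would simplify to a matrix
# whenever every column happens to yield the same number of terms.
values <- lapply(df0, function(x) unique(trimws(unlist(strsplit(x, ",")))))
# One row per term, labelled with the name of the column it came from.
df1 <- data.frame(
  text = unlist(values),
  sentiment = rep(names(values), lengths(values)),
  row.names = NULL
)
df1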
Note that I used stringsAsFactors = FALSE. If your variables are factors, you'll have to convert them to strings first.
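If the columns did arrive as factors, the conversion mentioned above could look like the sketch below. The data frame `df_factor` is a hypothetical stand-in for your own data, built with `stringsAsFactors = TRUE` only to reproduce the situation.

```r
# Hypothetical input whose columns were created as factors.
df_factor <- data.frame(
  negative = c("", "hate", "forget,poor", "hate"),
  positive = c("enjoyed", "famous,helpful", "", "enjoyed, kind"),
  stringsAsFactors = TRUE
)

# Find the factor columns and convert only those to character.
is_fac <- vapply(df_factor, is.factor, logical(1))
df_factor[is_fac] <- lapply(df_factor[is_fac], as.character)

# Every column is now a character vector that strsplit() accepts.
str(df_factor)
```

After this step the data frame can be passed to the splitting code unchanged.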
You could try something like this:
# create test data
test_data <- data.frame(
  negative = c("", "hate", "forget, poor", "hate"),
  positive = c("enjoyed", "famous, helpful", "", "enjoyed, kind"),
  stringsAsFactors = FALSE
)
# extract the negative and positive columns; split on a comma plus optional whitespace
neg <- unique(unlist(strsplit(test_data$negative, ",\\s*")))
pos <- unique(unlist(strsplit(test_data$positive, ",\\s*")))
# combine pos and neg into a data frame and add the sentiment column
combined <- data.frame(
  text = c(pos, neg),
  sentiment = c(rep("positive", length(pos)), rep("negative", length(neg)))
)
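The same idea can be wrapped in a reusable function that handles any number of columns. This is only a minimal sketch: the function name `terms_to_long` and its `sep` argument are hypothetical, and it assumes every column holds comma-separated terms and that empty cells should be dropped.

```r
# Hypothetical helper: turn columns of comma-separated terms into a long
# two-column data frame (term, name of the column it came from).
terms_to_long <- function(df, sep = ",") {
  terms <- lapply(df, function(col) {
    # as.character() also covers factor columns
    parts <- trimws(unlist(strsplit(as.character(col), sep, fixed = TRUE)))
    unique(parts[nzchar(parts)])  # drop empty strings, keep first occurrences
  })
  data.frame(
    text = unlist(terms, use.names = FALSE),
    sentiment = rep(names(terms), lengths(terms)),
    stringsAsFactors = FALSE
  )
}

example_df <- data.frame(
  negative = c("", "hate", "forget,poor", "hate"),
  positive = c("enjoyed", "famous,helpful", "", "enjoyed, kind"),
  stringsAsFactors = FALSE
)
result <- terms_to_long(example_df)
result
```

Because the labels come from `names(terms)`, the rows follow the column order of the input, so the negative terms are listed before the positive ones here.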