R: Convert type "List" to dataframe to convert to excel - Text Mining

When I stem and tokenize my reviews, the result is automatically turned into a list. The variable is of type "character" at first, but after applying the following code it becomes a "list":

reviews <- tokenize_word_stems(reviews)

I eventually want to export this to Excel, but my write_xlsx function can only convert dataframes, not lists.

The rest of my code looks like this; it goes "wrong" at the step where the words are stemmed:

reviews <- readLines("Reviewlist.csv")
reviews <- gsub(pattern = "\\W", replace = " ", reviews)
reviews <- tolower(reviews)
reviews <- gsub(pattern="\\b[A-z]\\b{1}", replace=" ", reviews)
reviews <- stripWhitespace(reviews)
reviews <- removeWords(reviews, stopwords())
reviews <- tokenize_word_stems(reviews)

The file:

Thanks in advance!

Creating a lorem-ipsum dummy input here, based on my assumption of what your "Reviewlist.csv" looks like.

library(dplyr)
library(stringi)

stri_rand_lipsum(5) %>%
  writeLines("Reviewlist.csv")

Then, this is just your original code without alterations, but using dplyr pipe syntax and explicitly loading the necessary libraries:

library(tm)
library(tokenizers)

reviews <- readLines("Reviewlist.csv") %>%
  gsub(pattern = "\\W", replace = " ", .) %>%
  tolower() %>%
  gsub(pattern="\\b[A-z]\\b{1}", replace=" ", .) %>%
  stripWhitespace() %>%
  removeWords(stopwords()) %>%
  tokenize_word_stems()
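At this point `reviews` is a list holding one character vector of tokens per input line, and those vectors usually differ in length, which is why the result cannot stay a plain character vector. A minimal base-R sketch of that shape follows. Note the assumptions: `strsplit()` is only a stand-in for the tokenizer here (it splits on spaces and does not stem), and `toy_input` / `toy_tokens` are made-up names for illustration.

```r
# Stand-in for the tokenizer: strsplit() does NOT stem, it only splits on
# spaces, but it returns the same kind of object (a list of character vectors).
toy_input  <- c("great product works well", "bad quality")
toy_tokens <- strsplit(toy_input, " ", fixed = TRUE)

# The container is a list with one element per input line ...
class(toy_tokens)

# ... and each element has its own number of tokens (a "ragged" list).
lengths(toy_tokens)
```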

Now, what you can do is bind your list items into a dataframe, so that you can write it out as an xlsx file:


library(purrr)
library(writexl)

reviews_df <- reviews %>%
    map_dfr(~ setNames(., sprintf("word_%04d", seq_along(.))))

reviews_df %>%
  write_xlsx("Reviewlist.xlsx")

That might create a very wide xlsx file for you, with one column per token position.
No idea whether Excel is really able to open it, but there you go :)
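If the wide layout turns out to be unwieldy, a long layout with one row per token might be easier to open and filter. This is only a sketch in base R under some assumptions: the small `reviews` list below is a hypothetical stand-in for the output of `tokenize_word_stems()`, and the column names `review_id` and `word` as well as the output file name are made up for illustration.

```r
# Hypothetical stand-in for the tokenizer output:
# a list with one character vector of stems per review.
reviews <- list(
  c("great", "product", "work"),
  c("bad", "qualiti")
)

# One row per token: repeat each review's index once per token it contains,
# then flatten the list of tokens into a single character column.
reviews_long <- data.frame(
  review_id = rep(seq_along(reviews), lengths(reviews)),
  word      = unlist(reviews, use.names = FALSE),
  stringsAsFactors = FALSE
)

# reviews_long is a plain data frame, so it could then be written with e.g.
# writexl::write_xlsx(reviews_long, "Reviewlist_long.xlsx")
```

The number of columns stays fixed at two no matter how long a review is; only the number of rows grows.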
