
How can I append my corpus metadata onto my DTM data frame export using the tm package in R?

Original post: 2020-12-15 21:35:59. Tags: r, tm, corpus

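I am currently using the tm package for some text mining. I would like to export my document-term matrix (DTM) as a data frame together with my corpus metadata (ID variables, etc.). This is my current workflow: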

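1. Import the dataset
2. Convert it to a corpus
3. Basic cleaning
4. Create a TF-IDF document-term matrix
5. Convert the DTM to a data frame
6. Export the data frame with the corpus metadata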
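
For readers who want something concrete to run, steps 1 to 5 of the workflow above might look roughly like the sketch below. The toy data, the `author` column, and the particular cleaning steps are assumptions made for illustration, not the asker's actual code. It also assumes a tm version whose `DataframeSource` expects `doc_id` and `text` as the first two columns.

```r
library(tm)

# Step 1: a small stand-in dataset (hypothetical). DataframeSource expects the
# first two columns to be doc_id and text; further columns are kept as metadata.
docs <- data.frame(
  doc_id = c("a1", "a2", "a3"),
  text   = c("Cats chase mice.", "Dogs chase cats.", "Mice eat cheese."),
  author = c("ann", "bob", "cy"),   # hypothetical metadata column
  stringsAsFactors = FALSE
)

# Step 2: convert to a corpus
corpus <- VCorpus(DataframeSource(docs))

# Step 3: basic cleaning (lower-casing and punctuation removal only)
corpus <- tm_map(corpus, content_transformer(tolower))
corpus <- tm_map(corpus, removePunctuation)

# Step 4: TF-IDF weighted document-term matrix
dtm <- DocumentTermMatrix(corpus, control = list(weighting = weightTfIdf))

# Step 5: convert the sparse DTM to a plain data frame (one row per document)
dtm_df <- as.data.frame(as.matrix(dtm))
```

Going through `as.matrix()` produces a dense matrix, which is fine for a small corpus but can use a lot of memory for a large one.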

Step 5 is where I am stuck. I feel the package must be able to do this, but I cannot find any documentation. Is the metadata lost when a DTM is created with tm?

1 Answer

Answering my own question here in case anyone else overlooks the same thing I did.

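The DTM produced by tm stores the doc_id variable as its row names. So you can use whichever row-names-to-column code you prefer to create a new variable, and then use that variable as the key for appending any other metadata.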

An example of one approach:

# dtm must already be a data frame here, e.g. dtm <- as.data.frame(as.matrix(dtm))
dtm <- tibble::rownames_to_column(dtm, var = "doc_id")
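
To show the whole round trip, here is a minimal sketch that turns the row names into a `doc_id` column and then joins the metadata back on that key. The toy data, the `author` column, and the temporary output file are assumptions for illustration. The metadata is taken from the original data frame rather than from the corpus object, which is one simple option among several.

```r
library(tm)

# Hypothetical input: doc_id and text first, then any metadata columns
docs <- data.frame(
  doc_id = c("a1", "a2"),
  text   = c("cats chase mice", "dogs chase cats"),
  author = c("ann", "bob"),          # hypothetical metadata column
  stringsAsFactors = FALSE
)

corpus <- VCorpus(DataframeSource(docs))
dtm <- DocumentTermMatrix(corpus)

# A DocumentTermMatrix is a sparse matrix, so convert it to a data frame first;
# the row names of the result are the document IDs.
dtm_df <- as.data.frame(as.matrix(dtm))
dtm_df <- tibble::rownames_to_column(dtm_df, var = "doc_id")

# Use doc_id as the key to attach the metadata kept in the source data frame
out <- merge(docs[, c("doc_id", "author")], dtm_df, by = "doc_id")

# Export; a temporary file is used here so the sketch has no lasting side effects
out_file <- tempfile(fileext = ".csv")
write.csv(out, out_file, row.names = FALSE)
```

If a term in the corpus happens to share a name with a metadata column (for example a document containing the word "author"), `merge()` will add suffixes to the clashing columns, so renaming the metadata columns beforehand may be worthwhile.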

