How can I count the frequency of specific words in each column using R?

Question

I am using this dataset: https://archive.ics.uci.edu/ml/datasets/Eco-hotel

I am trying to figure out how to count the frequency of certain words like "room" or "vacation" within each column. I have tried following tutorials online, but without success.

Answer 1

Using the iris dataset as an example, what you can do is:

library(tidyverse)

# For every column, count the rows whose value contains "setosa"
iris %>%
  summarize(across(everything(), ~ sum(str_detect(., 'setosa'))))

Of course, you'd need to change the search term to what you need.
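One thing to keep in mind: str_detect() only reports whether a value contains the pattern, so summing it gives the number of matching rows, not the number of occurrences. If a word can appear several times in one cell, str_count() is probably what you want. A minimal sketch on a made-up vector (x and its contents are only for illustration):

```r
library(stringr)

# Hypothetical example values, not taken from the dataset
x <- c("room with a view, big room", "no match", "room")

# Number of elements that contain "room" at least once
rows_with_match <- sum(str_detect(x, "room"))

# Total number of times "room" appears across all elements
total_occurrences <- sum(str_count(x, "room"))
```

Here the first element contains "room" twice, so the two results differ.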

If you want dedicated columns for each of your search patterns, you could alternatively do something like:

# Example data: two columns of random letters
df <- data.frame(x = sample(letters, 10, replace = TRUE),
                 y = sample(letters, 10, replace = TRUE))

# One across() call per pattern; .names appends the pattern to the column name
df |>
  summarize(across(c(x, y), ~ sum(str_count(., "u")), .names = "{.col}_u"),
            across(c(x, y), ~ sum(str_count(., "g")), .names = "{.col}_g"))

Here I'm searching for the letters "u" and "g", respectively.
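For whole words such as "room" or "vacation", the same idea can be wrapped in a small helper so you don't need one across() call per word. This is only a sketch: the reviews data frame, its column names, and the count_word() helper are invented for illustration, and I'm assuming you want case-insensitive, whole-word matches.

```r
library(stringr)

# Hypothetical review data; column names are made up for illustration
reviews <- data.frame(
  title = c("Great room", "Vacation spot", "Nice stay"),
  text  = c("The room was clean. Room service was slow.",
            "Perfect vacation, lovely room.",
            "Would return for another vacation."),
  stringsAsFactors = FALSE
)

words <- c("room", "vacation")

# Count whole-word, case-insensitive occurrences of `word` in one column
count_word <- function(column, word) {
  pattern <- regex(paste0("\\b", word, "\\b"), ignore_case = TRUE)
  sum(str_count(column, pattern), na.rm = TRUE)
}

# Result: a matrix with one row per word and one column per data frame column
counts <- sapply(reviews, function(column) {
  sapply(words, function(word) count_word(column, word))
})
counts
```

The \\b word boundaries keep "room" from matching inside longer words like "rooms" or "bedroom"; drop them if you want those counted as well.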
