
How to plot multi-word expressions with quanteda

I am using the quanteda package in R for textual data analysis. I would like to plot a keyword-in-context display produced by the kwic() command, which is useful for finding multi-word expressions in tokens.

library(quanteda)
library(magrittr)  # provides %>%

# Remove punctuation and symbols, then drop Spanish stopwords
toks_comments <- tokens(comments_corpus, remove_punct = TRUE, remove_symbols = TRUE, padding = TRUE) %>%
  tokens_remove(stopwords("spanish"), padding = TRUE)


# Get relevant keywords and phrases from dictionary
servicio <- c("servicio", "atencion", "atención", "personal", "mesera", "mesero",
              "muchacha", "muchacho", "joven", "pelado", "pelada", "meseros")

# Keyword-in-context
servicio_context <- kwic(toks_comments, pattern = phrase(servicio))  
View(servicio_context)

After running the lines above, I get the table shown in the screenshot below. I would like to graph the "pre" and "post" columns of that table, but I don't know how. Is there a way to include those words in a multi-word wordcloud or some other frequency visualization?

[Image: output of View(servicio_context)]

You could do both a wordcloud and a frequency bar graph.

Wordcloud

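library(quanteda)
library(quanteda.textplots)

# dfm() expects a tokens object in recent quanteda versions,
# so tokenize the "pre" context strings first
pre_dfm <- dfm(tokens(servicio_context$pre))
textplot_wordcloud(pre_dfm)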
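
Since the question asks about both the "pre" and "post" columns and about multi-word terms, here is a minimal sketch of one way to combine them. It assumes a small made-up set of comments in place of comments_corpus, and it uses tokens_ngrams() to build two-word terms, which is my own suggestion rather than something the answer above prescribes. The names comments, kw, context_toks, and bigram_dfm are hypothetical.

```r
library(quanteda)
library(quanteda.textplots)

# Toy comments standing in for comments_corpus (made-up data)
comments <- c(
  d1 = "el servicio fue muy lento",
  d2 = "muy buen servicio y comida rica",
  d3 = "el mesero fue muy amable"
)
toks <- tokens(comments, remove_punct = TRUE)

# Keyword-in-context with two words of context on each side
kw <- as.data.frame(kwic(toks, pattern = c("servicio", "mesero"), window = 2))

# Re-tokenize the "pre" and "post" strings together
context_toks <- tokens(c(kw$pre, kw$post))

# Single-word frequencies across both context columns
word_freq <- topfeatures(dfm(context_toks), n = 10)

# Two-word terms: adjacent context words joined with "_"
bigram_dfm <- dfm(tokens_ngrams(context_toks, n = 2))
bigram_freq <- topfeatures(bigram_dfm, n = 10)

# min_count = 1 only because the toy data is tiny
textplot_wordcloud(bigram_dfm, min_count = 1)
```

On real data you would probably raise min_count again and replace the toy comments with toks_comments from the question.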

Bar graph

library(ggplot2)

# Count how often each distinct "pre" context string occurs
ggplot(as.data.frame(servicio_context), aes(x = pre)) +
  geom_bar(stat = "count")
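
The bar graph above counts whole "pre" strings, so most bars will likely have a height of 1 on real data. If per-word frequencies are what you want, a rough sketch along these lines may be closer. It again uses made-up comments instead of comments_corpus, and the names comments, kw, and freq_df are hypothetical.

```r
library(quanteda)
library(ggplot2)

# Toy comments standing in for comments_corpus (made-up data)
comments <- c(
  d1 = "el servicio fue muy lento",
  d2 = "muy buen servicio y comida rica",
  d3 = "el mesero fue muy amable"
)
toks <- tokens(comments, remove_punct = TRUE)
kw <- as.data.frame(kwic(toks, pattern = c("servicio", "mesero"), window = 2))

# Count individual words in the "pre" and "post" context columns
freq <- topfeatures(dfm(tokens(c(kw$pre, kw$post))), n = 10)
freq_df <- data.frame(feature = names(freq), frequency = as.numeric(freq))

# Horizontal bars, most frequent word at the top
p <- ggplot(freq_df, aes(x = reorder(feature, frequency), y = frequency)) +
  geom_col() +
  coord_flip() +
  labs(x = "context word", y = "frequency")
print(p)
```

Swapping in tokens_ngrams() before dfm() would give the same chart for two-word terms.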
