Plotting differences with ggplot2

I have an R data frame (named frequency) like this:

word    author  proportion
a   Radicals    1.679437e-04
aa  Radicals    2.099297e-04
aaa Radicals    2.099297e-05
abbe    Radicals    NA
aboow   Radicals    NA
about   Radicals    NA
abraos  Radicals    NA
ytterst Conservatives   5.581042e-06
yttersta    Conservatives   5.581042e-06
yttra   Conservatives   2.232417e-05
yttrandefrihet  Conservatives   5.581042e-06
yttrar  Conservatives   2.232417e-05

I want to plot document differences using ggplot2.

I have the code below, but my plot ends up empty.

library(scales)
ggplot(frequency, aes(x = proportion, y = `Radicals`, color = abs(`Radicals` - proportion))) +
    geom_abline(color = "gray40", lty = 2) +
    geom_jitter(alpha = 0.1, size = 2.5, width = 0.3, height = 0.3) +
    geom_text(aes(label = word), check_overlap = TRUE, vjust = 1.5) +
  scale_x_log10(labels = percent_format()) +
  scale_y_log10(labels = percent_format()) +
  scale_color_gradient(limits = c(0, 0.001), low = "darkslategray4", high = "gray75") +
  facet_wrap(~author, ncol = 2) +
  theme(legend.position="none") +
  labs(y = "Radicals", x = NULL)

Your plot ends up empty because there is no column named 'Radicals' in the data frame; 'Radicals' is a value in the author column. If you're trying to narrow the data to only Radicals and then plot that, you should do something like:

 radical_frequency <- subset(frequency, author == 'Radicals')

Then you can do:

 library(ggplot2)
 library(scales)

 # The subset has no `Radicals` column, so color by proportion itself.
 ggplot(radical_frequency, aes(x = proportion, y = author, color = proportion)) +
   geom_jitter(alpha = 0.1, size = 2.5, width = 0.3, height = 0.3) +
   geom_text(aes(label = word), check_overlap = TRUE, vjust = 1.5) +
   # author is discrete, so only the x-axis gets a log scale
   scale_x_log10(labels = percent_format()) +
   scale_color_gradient(limits = c(0, 0.001), low = "darkslategray4", high = "gray75") +
   theme(legend.position = "none") +
   labs(y = "Radicals", x = NULL)

(I took out facet_wrap since you've already narrowed the data to Radicals. You could add it back and use your original code on the full data frame if you set y = author and keep facet_wrap(~author, ncol = 2).)

Basically, tl;dr: your error is caused by trying to map an axis to something that is not a column of the data frame.
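
A quick way to check this is to compare the name against the column names and against the values of the author column. The sketch below uses a small hypothetical data frame (toy) that mimics the shape of frequency; it is not the asker's actual data.

```r
# Hypothetical toy data in the same long format as `frequency`
toy <- data.frame(
  word = c("a", "aa", "ytterst"),
  author = c("Radicals", "Radicals", "Conservatives"),
  proportion = c(1.679437e-04, 2.099297e-04, 5.581042e-06)
)

names(toy)                   # the columns that aes() can refer to
"Radicals" %in% names(toy)   # FALSE: not a column
"Radicals" %in% toy$author   # TRUE: a value inside the author column
```

Anything passed to aes() has to be a column of the data (or an expression built from columns), which is why the long-format data needs either subsetting or reshaping first.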

If what you want is a plot comparing the frequency of one "author" (say, Conservatives) on the x-axis against another "author" (perhaps the Radicals) on the y-axis, you need to spread your data frame (spread is from the tidyr package) so that each author gets its own column and you can plot it that way.

library(tidyverse)
library(scales)

frequency %>%
  spread(author, proportion) %>%
  ggplot(aes(Conservatives, Radicals)) +
  geom_abline(color = "gray40", lty = 2) +
  geom_point() + 
  geom_text(aes(label = word), check_overlap = TRUE, vjust = 1.5) +
  scale_x_log10(labels = percent_format()) +
  scale_y_log10(labels = percent_format())
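
Once each author has its own column, the color mapping from the question (absolute difference between the two proportions) also becomes possible. The following is only a sketch on hypothetical toy data, and it assumes a tidyr version that provides pivot_wider (1.0.0 or later), which does the same long-to-wide reshape as spread.

```r
library(tidyr)
library(ggplot2)
library(scales)

# Hypothetical toy data in the same long format as `frequency`
toy <- data.frame(
  word = c("a", "aa", "abbe", "a", "aa", "abbe"),
  author = rep(c("Radicals", "Conservatives"), each = 3),
  proportion = c(1.7e-4, 2.1e-4, NA, 9.0e-5, 2.0e-4, 5.6e-6)
)

# One row per word, one column per author
wide <- pivot_wider(toy, names_from = author, values_from = proportion)

# Both authors are now real columns, so the difference can drive the color
p <- ggplot(wide, aes(Conservatives, Radicals,
                      color = abs(Radicals - Conservatives))) +
  geom_abline(color = "gray40", lty = 2) +
  geom_point(alpha = 0.5) +
  geom_text(aes(label = word), check_overlap = TRUE, vjust = 1.5) +
  scale_x_log10(labels = percent_format()) +
  scale_y_log10(labels = percent_format()) +
  scale_color_gradient(limits = c(0, 0.001),
                       low = "darkslategray4", high = "gray75") +
  theme(legend.position = "none")

print(p)
```

Words that are NA for either author (like abbe here) are dropped from the plot with a warning, since they have no position on one of the axes.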
