R: How to Fill Points in ggplot2 with a variable

I am attempting to make a ggplot2 scatter plot that is grouped by bins in R. I successfully made the first plot, for which I did not try to alter the fill. But when I tried to base the fill on my variable (Miss.), which is a numeric value ranging from 0.00 to 0.46, the plot essentially ignores the heat map scale and turns everything gray.

   ggplot(data = RightFB, mapping = aes(x = TMHrzBrk, y = TMIndVertBrk))+
   geom_bin_2d(bins = 15)+
   scale_fill_continuous(type = "viridis")+
   ylim(5, 20)+
   xlim(0,15)+
   coord_fixed(1.3)


   ggplot(data = RightFB, mapping = aes(x = TMHrzBrk, y = TMIndVertBrk,
                                        fill = Miss.))+
   geom_bin_2d(bins = 15)+
   scale_fill_continuous(type = "viridis")+
   ylim(5, 20)+
   xlim(0,15)+
   coord_fixed(1.3)

I appreciate any help! Thanks!

I think I understand your problem, so let's replicate it with a reproducible example. Obviously we don't have your data, but the following data frame has the same names, types and ranges as your own data, so this walk-through should work for you.

set.seed(1)

RightFB <- data.frame(TMHrzBrk = runif(1000, 0, 15),
                      TMIndVertBrk = runif(1000, 5, 20),
                      Miss. = runif(1000, 0, 0.46))

Your first plot will look something like this:

library(tidyverse)

ggplot(data = RightFB, mapping = aes(x = TMHrzBrk, y = TMIndVertBrk)) +
  geom_bin_2d(bins = 15) +
  scale_fill_continuous(type = "viridis") +
  ylim(5, 20) +
  xlim(0, 15) +
  coord_fixed(1.3)
#> Warning: Removed 56 rows containing missing values (`geom_tile()`).

Here, the fill colors represent the counts of observations within each bin. But if you try to map the fill to Miss., you get all gray squares:

ggplot(data = RightFB, mapping = aes(x = TMHrzBrk, y = TMIndVertBrk,
                                     fill = Miss.)) +
  geom_bin_2d(bins = 15) +
  scale_fill_continuous(type = "viridis") +
  ylim(5, 20) +
  xlim(0, 15) +
  coord_fixed(1.3)
#> Warning: The following aesthetics were dropped during statistical transformation: fill
#> i This can happen when ggplot fails to infer the correct grouping structure in
#>   the data.
#> i Did you forget to specify a `group` aesthetic or to convert a numerical
#>   variable into a factor?
#> Removed 56 rows containing missing values (`geom_tile()`).

This happens because, by default, geom_bin_2d calculates the bins and the count within each bin, and uses that count as the fill variable. There are multiple observations within each bin, and they all have different values of Miss., so there is no single value to map to the fill and the aesthetic is dropped. Furthermore, geom_bin_2d doesn't know how you want to summarize this variable. My guess is that you are looking for the average of Miss. within each bin, but this is difficult to achieve within the framework of geom_bin_2d.
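As a quick check of the point about multiple observations per bin, here is a minimal base R sketch. It rebuilds the simulated data frame from above and assumes unit-wide bins (the names x_bin, y_bin, n_per_bin and spread_per_bin are only illustrative); geom_bin_2d chooses its own break points, so the exact counts will differ from the plot.

```r
# Simulated data with the same names and ranges as in the example above
set.seed(1)
RightFB <- data.frame(TMHrzBrk = runif(1000, 0, 15),
                      TMIndVertBrk = runif(1000, 5, 20),
                      Miss. = runif(1000, 0, 0.46))

# Assign each observation to a unit-wide bin on both axes
x_bin <- cut(RightFB$TMHrzBrk, breaks = seq(0, 15, 1))
y_bin <- cut(RightFB$TMIndVertBrk, breaks = seq(5, 20, 1))

# Number of observations falling into each of the 15 x 15 bins
n_per_bin <- table(x_bin, y_bin)
summary(as.vector(n_per_bin))

# Spread (max - min) of Miss. inside each bin; NA where a bin is empty
spread_per_bin <- tapply(RightFB$Miss., list(x_bin, y_bin),
                         function(v) diff(range(v)))
summary(as.vector(spread_per_bin))
```

Most bins hold several rows whose Miss. values differ, which is why a single fill value per tile cannot be picked without first choosing a summary function.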

The alternative is to calculate the bins yourself, get the average of Miss. in each bin, and plot the result with geom_tile:

RightFB %>%
  # Bin both variables into unit-wide intervals, labelled by bin midpoint
  mutate(TMHrzBrk = cut(TMHrzBrk, breaks = seq(0, 15, 1),
                        labels = seq(0.5, 14.5, 1)),
         TMIndVertBrk = cut(TMIndVertBrk, breaks = seq(5, 20, 1),
                            labels = seq(5.5, 19.5, 1))) %>%
  group_by(TMHrzBrk, TMIndVertBrk) %>%
  # Average Miss. within each bin
  summarize(Miss. = mean(Miss., na.rm = TRUE), .groups = "drop") %>%
  # Convert the factor labels back to numeric midpoints for plotting
  mutate(across(TMHrzBrk:TMIndVertBrk, ~as.numeric(as.character(.x)))) %>%
  ggplot(aes(x = TMHrzBrk, y = TMIndVertBrk, fill = Miss.)) +
  geom_tile() +
  scale_fill_continuous(type = "viridis") +
  ylim(5, 20) +
  xlim(0, 15) +
  coord_fixed(1.3)
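Another option, which is not part of the original answer, may be ggplot2's stat_summary_2d, which bins x and y and applies a summary function to a z aesthetic. The sketch below assumes the same simulated data frame as above and that the mean is the summary you want; its bin edges are chosen by ggplot2, so the tiles will not line up exactly with the manual cut() approach.

```r
library(ggplot2)

# Simulated data with the same names and ranges as in the example above
set.seed(1)
RightFB <- data.frame(TMHrzBrk = runif(1000, 0, 15),
                      TMIndVertBrk = runif(1000, 5, 20),
                      Miss. = runif(1000, 0, 0.46))

# Map Miss. to z (not fill); the stat computes mean(z) per bin
# and uses that value for the fill colour
p <- ggplot(RightFB, aes(x = TMHrzBrk, y = TMIndVertBrk, z = Miss.)) +
  stat_summary_2d(fun = mean, bins = 15) +
  scale_fill_continuous(type = "viridis") +
  coord_fixed(1.3)

p
```

The manual approach gives full control over the break points and labels, while this version is shorter when the default binning is acceptable.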


EDIT

With the link to the data in the comments, here is a full reprex:

library(tidyverse)

RightFB <- read.csv(paste0("https://raw.githubusercontent.com/rileyfeltner/",
                           "FB-Analysis/main/Right%20FB.csv"))

RightFB <- RightFB[c(2:6, 9, 11, 13, 18, 19)]
RightFB$Miss. <- as.numeric(as.character(RightFB$Miss.))
#> Warning: NAs introduced by coercion
RightFB$TMIndVertBrk <- as.numeric(as.character(RightFB$TMIndVertBrk))
#> Warning: NAs introduced by coercion
RightFB <- na.omit(RightFB)
RightFB1 <- filter(RightFB, P > 24)

RightFB %>%
  # Bin both variables into unit-wide intervals, labelled by bin midpoint
  mutate(TMHrzBrk = cut(TMHrzBrk, breaks = seq(0, 15, 1),
                        labels = seq(0.5, 14.5, 1)),
         TMIndVertBrk = cut(TMIndVertBrk, breaks = seq(5, 20, 1),
                            labels = seq(5.5, 19.5, 1))) %>%
  group_by(TMHrzBrk, TMIndVertBrk) %>%
  # Average Miss. within each bin
  summarize(Miss. = mean(Miss., na.rm = TRUE), .groups = "drop") %>%
  # Convert the factor labels back to numeric midpoints for plotting
  mutate(across(TMHrzBrk:TMIndVertBrk, ~as.numeric(as.character(.x)))) %>%
  ggplot(aes(x = TMHrzBrk, y = TMIndVertBrk, fill = Miss.)) +
  geom_tile() +
  scale_fill_continuous(type = "viridis") +
  ylim(5, 20) +
  xlim(0, 15) +
  coord_fixed(1.3)
#> Warning: Removed 18 rows containing missing values (`geom_tile()`).

Created on 2022-11-23 with reprex v2.0.2
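The "NAs introduced by coercion" warnings in the reprex mean that some entries in those columns could not be parsed as numbers. Before dropping them with na.omit, it can be worth looking at what they were. A small sketch with made-up values (the vector raw is hypothetical and not taken from the actual CSV):

```r
# Made-up raw values; the real CSV may contain different non-numeric entries
raw <- c("0.12", "0.30", "-", "0.05", "n/a")

# Convert to numeric, silencing the coercion warning for this check
converted <- suppressWarnings(as.numeric(raw))

# Entries that were present but could not be parsed as numbers
failed <- raw[is.na(converted) & !is.na(raw)]
failed
```

Applying the same check to RightFB$Miss. and RightFB$TMIndVertBrk before conversion would show which placeholder strings are responsible for the removed rows.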
