ggplot2: two discrete variables + percentage of cases + continuous variable
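I am trying to produce a plot with one discrete variable on the x axis and another discrete variable on the y axis, where the colour of each point is determined by the mean of val and its size by the proportion of cases within x.
The data look like this: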
df1 <- data.frame(y=c("a","b","c","a","d","a","a","c","d","a","b","c","a","d","a","a","c","d","d","a","b","c","a","d","a","a","c","d"),
x=c("x","y","z","t","r","x","x","x","y","z","t","r","r","x","y","z","t","r","x","x","y","z","t","r","r","x","r","x"),
val=c(1,4,1,6,3,6,2,7,8,2,5,7,2,8,5,8,6,4,2,4,5,7,6,5,4,4,3,3))
I tried geom_count and the following:
ggplot(data = df1, aes(x = x, y = y, fill = val)) +
  stat_sum(aes(size = ..prop.., group = x)) +
  scale_size_area(max_size = 10)
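But there must be some odd overplotting that I am not aware of. The props computed for the size aesthetic are not correct: if I remove the fill variable from the plot, they come out different. Can anyone help? I have searched Google thoroughly but have not found a solution.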
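One option would be to compute the counts, the percentages and the mean of the fill value outside of ggplot, and then use geom_point to plot the aggregated data: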
library(ggplot2)
library(dplyr)
df2 <- df1 |>
  group_by(x, y) |>
  summarise(n = n(), val = mean(val)) |>
  mutate(pct = n / sum(n)) |>
  ungroup()
#> `summarise()` has grouped output by 'x'. You can override using the `.groups`
#> argument.
ggplot(df2, aes(x, y, size = pct, fill = val)) +
  geom_point(shape = 21) +
  scale_size_area(max_size = 10)
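
As a possible explanation for the discrepancy in the question: my understanding is that stat_sum() counts observations per unique combination of position and any other mapped aesthetics, so mapping fill = val would split each (x, y) cell by val before the proportions are computed. That is an assumption about the stat's internals, not something confirmed above. The sketch below only imitates the two groupings with dplyr::count() so the cell counts can be compared; the names by_xy, by_xy_val and check are made up for illustration.

```r
library(dplyr)

df1 <- data.frame(
  y = c("a","b","c","a","d","a","a","c","d","a","b","c","a","d",
        "a","a","c","d","d","a","b","c","a","d","a","a","c","d"),
  x = c("x","y","z","t","r","x","x","x","y","z","t","r","r","x",
        "y","z","t","r","x","x","y","z","t","r","r","x","r","x"),
  val = c(1,4,1,6,3,6,2,7,8,2,5,7,2,8,5,8,6,4,2,4,5,7,6,5,4,4,3,3)
)

# Intended cells: one row per (x, y) combination
by_xy <- count(df1, x, y)

# Cells if val also takes part in the grouping
# (assumption: this mimics what happens when fill = val is mapped)
by_xy_val <- count(df1, x, y, val)

# More rows here means the (x, y) cells were split by val
nrow(by_xy)
nrow(by_xy_val)

# Sanity check on the intended result: proportions within each x sum to 1
check <- by_xy |>
  group_by(x) |>
  mutate(pct = n / sum(n)) |>
  summarise(total = sum(pct))
check
```

If the second table has more rows than the first, several points are being drawn on top of each other at the same (x, y) position, each with its own smaller proportion, which would be consistent with the sizes changing once fill is removed. Aggregating beforehand, as in the answer, sidesteps this because every (x, y) cell has exactly one row.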