Creating a frequency table and plotting a histogram with dplyr/ggplot

Question

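I am new to pipes and dplyr in R and need some help. Note: I already have a solution to this problem that uses the cut function.

I would like to solve it with dplyr instead: build a frequency table with dplyr (without storing the table) and plot the data with ggplot.

Problem: I have data from two sensors, a reference instrument and the sensor I am evaluating. The sensor data is categorical (1, 2, or 3). I am trying to plot a histogram of the sensor status for different bins of the reference value. For example, when the reference value is between 1 and 5, I want to see the frequency distribution of the sensor 1 status (1, 2, or 3). Likewise, I want the frequency distribution of the sensor status for reference values of 6-10, and so on up to 95-100. Please see the sample data below. Any help is appreciated.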

dput(sample1)
structure(list(dusttrak_conc = c(1.2, 0.2, 0.6, 1.6, 1, 1, 0.4, 
0.4, 0.8, 0.8, 0.4, 0.2, 15.8, 59.2, 55.4, 54.8, 54.6, 54.2, 
49, 53, 47.2, 44, 40.2, 39, 34.2, 35.8, 33.4, 30.6, 29.4, 29.2, 
27.6, 24.8, 24, 22, 21.2, 20.6, 18.6, 18, 17, 17.2, 14.8, 15.2, 
13.2, 13.4, 12, 11.8, 11, 10.8, 10, 9.2, 8.8, 8.8, 8.4, 7.8, 
7.6, 6.6, 6.4, 6.2, 6, 5.8, 5.4, 5, 4.8, 4.4, 4.2, 4, 3.8, 3.6, 
3.6, 3.6, 3, 2.8, 3, 2.8, 2.6, 2.4, 2.4, 2.2, 2, 2.2, 2.2, 1.8, 
1.8, 1.6, 1.8, 1.8, 2.2, 71.2, 75.8, 74.6, 74.6, 74.2, 67.2, 
66.2, 60.6, 60.6, 54.8, 53.6, 48.4, 45.2), sensor1_status = c(1, 
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 3, 3, 3, 3, 3, 3, 3, 3, 
3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 
2, 2, 2, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 
1, 1, 1, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3)), row.names = 113:212, class = "data.frame")

Answer 1

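library(dplyr)
library(ggplot2)

# sample1 is the data frame from the question
sample1 %>%
  # bin the reference concentration into intervals of width 10: (0,10], (10,20], ...
  mutate(bin = cut(dusttrak_conc, breaks = seq(0, 100, by = 10))) %>%
  # frequency table: one row per (bin, status) combination with its count n
  count(bin, sensor1_status) %>%
  # heat map: bins on x, sensor status on y, fill shows the count
  ggplot(aes(bin, sensor1_status)) +
  geom_tile(aes(fill = n))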
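
To see what the pipeline hands to ggplot, it can help to stop after count() and print the intermediate frequency table. The following is a minimal sketch on a made-up seven-row data frame (toy is hypothetical, not the asker's data), using bins of width 5 as the question describes:

```r
library(dplyr)

# Hypothetical miniature data set with the same column names as sample1
toy <- data.frame(
  dusttrak_conc  = c(0.4, 1.2, 4.8, 6.2, 9.2, 12, 15.8),
  sensor1_status = c(1, 1, 1, 1, 1, 2, 1)
)

# Bin the reference values, then count rows per (bin, status) pair
freq <- toy %>%
  mutate(bin = cut(dusttrak_conc, breaks = seq(0, 100, by = 5))) %>%
  count(bin, sensor1_status)

print(freq)
```

In the real pipeline the table never has to be assigned to a variable, because count() pipes straight into ggplot().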

Answer 2

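It sounds like you want to see the frequency of each status level 1:3 as a bar chart for each range of reference values. One option here is faceting, which splits off a separate panel for each range of interest. To generate the splits, an alternative to base::cut() is plyr::round_any(). In my example I use bins of width 15 to keep the plot simple, but you can adjust this to suit.

Note: loading the {plyr} library is generally undesirable because several of its functions conflict with {dplyr} functions of the same name. You may therefore prefer to call this one function explicitly, or to define it manually in your script.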
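
As a sketch of the "define it manually" option: for numeric input, round_any() amounts to applying a rounding function to x / accuracy and scaling back up. The helper below is a stand-alone version written for illustration, assuming only numeric vectors are needed; compare it with the plyr source before relying on it.

```r
# Stand-alone helper mirroring plyr::round_any() for numeric vectors
# (illustrative re-definition, so plyr does not need to be installed)
round_any <- function(x, accuracy, f = round) {
  f(x / accuracy) * accuracy
}

# A few reference values from the question, rounded up to the next multiple of 15
ref <- c(1.2, 15.8, 59.2, 75.8)
rounded <- round_any(ref, accuracy = 15, f = ceiling)
print(rounded)
# 15 30 60 90
```

With this helper defined, the plyr:: prefix in the pipeline can simply be dropped.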

library(tidyverse)

d <- structure(list(reference = c(1.2, 0.2, 0.6, 1.6, 1, 1, 0.4, 0.4, 0.8, 0.8, 0.4, 0.2, 15.8, 59.2, 55.4, 54.8, 54.6, 54.2, 49, 53, 47.2, 44, 40.2, 39, 34.2, 35.8, 33.4, 30.6, 29.4, 29.2, 27.6, 24.8, 24, 22, 21.2, 20.6, 18.6, 18, 17, 17.2, 14.8, 15.2, 13.2, 13.4, 12, 11.8, 11, 10.8, 10, 9.2, 8.8, 8.8, 8.4, 7.8, 7.6, 6.6, 6.4, 6.2, 6, 5.8, 5.4, 5, 4.8, 4.4, 4.2, 4, 3.8, 3.6, 3.6, 3.6, 3, 2.8, 3, 2.8, 2.6, 2.4, 2.4, 2.2, 2, 2.2, 2.2, 1.8, 1.8, 1.6, 1.8, 1.8, 2.2, 71.2, 75.8, 74.6, 74.6, 74.2, 67.2, 66.2, 60.6, 60.6, 54.8, 53.6, 48.4, 45.2), 
                    sensor1_status = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3)), row.names = 113:212, class = "data.frame")


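d %>% 
  # round each reference value up to the next multiple of 15 (15, 30, 45, ...)
  mutate(ref_range = plyr::round_any(reference, accuracy = 15, f = ceiling)) %>%
  ggplot(aes(sensor1_status)) +
  geom_bar() +
  # one panel per reference range, each with its own y scale
  facet_grid(rows = vars(ref_range), scales = "free_y")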

Created on 2022-07-05 by the reprex package (v2.0.1)

Answer 3

dplyr is not necessary here:

library('ggplot2')

# create ggplot; specify data frame and x-axis variable
ggplot(sample1, aes(x = sensor1_status)) +
  
  # geom_bar() counts the number of cases at each x position
  geom_bar(stat = "count") +
  
  # facet_wrap() creates a square-ish grid of multiple panels
  # - facets defines the "grouping" per panel; cut creates the bins
  # - scales chooses to keep all x/y-axes the same or not
  # - drop chooses if empty groups should be dropped
  facet_wrap(facets = vars(reference_bin = cut(dusttrak_conc, seq(0, 100, 5), right = FALSE, include.lowest = TRUE)),
             scales = "free_y",
             drop = FALSE) +
  
  # format y axis: aim for about 4 ticks; round() + unique() to prevent fractions on the ticks
  # I used (the new base R) piping because you were interested in it :)
  scale_y_continuous(breaks = \(x) pretty(x, n = 4) |> round() |> unique())
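
To make the breaks function easier to follow, here is a small sketch that calls it on its own, outside ggplot. ggplot2 passes the axis limits to a breaks function; the two limit vectors below are hypothetical values chosen for illustration, and int_breaks is just a name given to the anonymous function from the answer. The \(x) shorthand and the |> pipe need R 4.1 or later.

```r
# Same function as passed to scale_y_continuous(breaks = ...)
int_breaks <- \(x) pretty(x, n = 4) |> round() |> unique()

# Hypothetical limits for a panel holding only a couple of observations:
# pretty() alone would propose fractional ticks here
b_small <- int_breaks(c(0, 2))

# Hypothetical limits for a fuller panel
b_large <- int_breaks(c(0, 23))

print(b_small)
print(b_large)
```

round() turns any fractional candidates into whole numbers, and unique() drops the duplicates that rounding can create, so every tick is a distinct integer count.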


Answer 1
1 2022-07-05 16:13:06

Answer 2
0 2022-07-05 17:43:47

Answer 3
0 2022-07-07 12:44:08
