Creating a time-series plot in R, binning instances per day and sizing points by the number of instances in each bin

Question

I have many months of data, with per-second readings for every day. A few values are missing. The data is in an R data frame in this format:

date       value
2015-01-01 100
2015-01-01 300
2015-01-01 350
2015-02-01 400
2015-02-01 50

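In my code this data frame is called `combined` and contains combined$time (the date) and combined$value (the value). I want to plot the values by day, showing, in quintiles, the number of instances in each value range (for example, the number of values per day between 100 and 200, between 200 and 300, and so on). I have already defined the bin boundaries as lowlimit, uplimit, etc. In this plot, I want the size of each point to correspond to the number of instances of values in that range on that day.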

(I made a sample image of the plot, but I don't have enough reputation points to post it!)

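I certainly haven't written the most efficient way of doing this, but my main question is how to actually produce the plot now that I have successfully binned the values for each day. I would also welcome any suggestions for a better approach. Here is my code so far: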

lim<-c(lowlimit, midlowlimit, midupperlimit, uplimit)
bin <- numeric(5)  # four limits define five ranges, so five counters
for (i in 2:length(combined$value)) {
  if (is.finite(combined$value[i])) {  # account for NA values
    if (combined$time[i]==combined$time[i-1]){
      if (combined$value[i] <= lowlimit){
        bin[1]=bin[1]+1
        i=i+1
      }
      else if (combined$value[i] > lowlimit && combined$value[i] <= midlowlimit){
        bin[2]=bin[2]+1
        i=i+1
      }
      else if (combined$value[i] > midlowlimit && combined$value[i] <= midupperlimit ){
        bin[3]=bin[3]+1
        i=i+1
      }
      else if (combined$value[i] > midupperlimit && combined$value[i] <= uplimit){
        bin[4]=bin[4]+1
        i=i+1
      }
      else if (combined$value[i] > uplimit ){
        bin[5]=bin[5]+1
        i=i+1
      }
    }

  else{
     ### I know the plotting portion here is incorrect ###
    for (j in 1:5){
    ggplot(combined$date[i], lim[j]) + geom_point(aes(size=bin[j]))}
    i = i+1}
  }
}

I would greatly appreciate any help you can offer!

Answer 1

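Here is my attempt. I hope I have read your question correctly. It seems you want to use cut() to create five groups for each day, then count how many data points fall in each group, and do this for every day. I created some sample data to demonstrate what I did.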

mydf <- data.frame(Date = as.Date(c("2015-01-01", "2015-01-01", "2015-01-01", "2015-01-01",
                                    "2015-01-02", "2015-01-02", "2015-01-02", "2015-01-02"),
                                    format = "%Y-%m-%d"),
                   Value = c(90, 300, 350, 430, 210, 330, 410, 500),
                   stringsAsFactors = FALSE)

### This is necessary later when you use left_join().
foo <- expand.grid(Date = as.Date(c("2015-01-01", "2015-01-02"), format = "%Y-%m-%d"),
                   group = c("a", "b", "c", "d", "e"))

library(dplyr)
library(ggplot2)
library(scales)

### You group your data by Date, and create five sub groups using cut().
### Then, you want to count how many data points exist for each date by
### group. This is done with count(). In this case, there are some subgroups
### which have no data points. They do not exist in the data frame that
### count() returns. So you want to use left_join() with foo. foo has all
### possible combination of Date and group. Once you join the two data frames,
### You want to replace NA with 0, which is done in the last mutate().

mutate(group_by(mydf, Date),
       group = cut(Value, breaks = c(0, 100, 200, 300, 400, 500),
       labels = c("a", "b", "c", "d", "e"))) %>%
count(Date, group) %>%
left_join(foo, ., by = c("Date" = "Date", "group" = "group")) %>%
rename(Total = n) %>%
mutate(Total = replace(Total, which(Total %in% NA), 0)) -> out
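
As a cross-check on the counts, here is a minimal base-R sketch of the same binning step. It assumes the same sample data and the same breaks and labels as above; the name `counts` is only illustrative. table() already returns a zero for every empty Date/group combination, so no join is needed here, and values that cut() cannot place (NA, or anything outside the breaks) are simply left out of the counts.

```r
# Same sample data as above, repeated so this block runs on its own.
mydf <- data.frame(
  Date  = as.Date(c("2015-01-01", "2015-01-01", "2015-01-01", "2015-01-01",
                    "2015-01-02", "2015-01-02", "2015-01-02", "2015-01-02")),
  Value = c(90, 300, 350, 430, 210, 330, 410, 500)
)

# cut() assigns each value to a right-closed interval: (0,100], (100,200], ...
mydf$group <- cut(mydf$Value,
                  breaks = c(0, 100, 200, 300, 400, 500),
                  labels = c("a", "b", "c", "d", "e"))

# table() counts every Date x group combination, including empty ones (as 0).
counts <- as.data.frame(table(Date = mydf$Date, group = mydf$group),
                        responseName = "Total")

# table() turns the dates into factor levels; convert them back to Date.
counts$Date <- as.Date(as.character(counts$Date))
```

If real data can fall outside the outer breaks, replacing the first and last break with -Inf and Inf would keep those values in the end bins.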


### Time to draw a figure
ggplot(data = out, aes(x = Date, y = Total, size = Total, color = group)) +
geom_point() +
scale_x_date(breaks = "1 day")

If you want to modify the y axis, you can use scale_y_continuous(). I hope this helps.
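
The question describes a slightly different layout, with the value range on the y axis and the point size showing the count. A rough sketch of that variant follows; it is an assumption about what was wanted, not part of the original answer. The data frame below is a hand-written stand-in with Date, group and Total columns holding the counts for the sample data, and scale_size_area() is one possible choice for making point area proportional to the count.

```r
library(ggplot2)

# Hypothetical stand-in for the summarised data: one row per Date and bin.
out <- data.frame(
  Date  = rep(as.Date(c("2015-01-01", "2015-01-02")), each = 5),
  group = factor(rep(c("a", "b", "c", "d", "e"), times = 2)),
  Total = c(1, 0, 1, 1, 1,  0, 0, 1, 1, 2)
)

# Drop empty bins so that no point is drawn for a zero count.
nonzero <- subset(out, Total > 0)

p <- ggplot(nonzero, aes(x = Date, y = group, size = Total)) +
  geom_point() +
  scale_size_area() +              # point area proportional to the count
  scale_y_discrete(drop = FALSE)   # keep bins that are empty on every day
print(p)
```

Here the bins appear as the labels "a" to "e"; passing range labels such as "0-100" to cut() would make the y axis easier to read.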

Answer 1: accepted, 2015-08-20 02:06:40
