R: Match and compare values from different vectors
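I am computing hourly average prices from a single vector of prices. I want to compare each hourly value with the daily average and remove every value that is more than twice the daily mean. I can compute the separate values easily enough, but I can't work out how to compare the hourly values with the daily ones.
A quick data example: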
df <- data.frame(dates = rep(seq(from = as.POSIXct("2013-01-01 00:00:00", tz = "UTC"),
                                 to = as.POSIXct("2013-01-30 23:00:00", tz = "UTC"), by = "hour"), 12),
                 price = runif(8640, min = -25, max = 225))
library(dplyr)
# Hourly averages: one row per hourly timestamp
results <- group_by(df, dates)
results <- summarise(results,
                     average = mean(price))
# Daily averages: one row per calendar day
day_results <- mutate(df, days = format(dates, "%Y-%m-%d"))
day_results <- group_by(day_results, days)
day_results <- summarise(day_results,
                         average_d = mean(price))
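I can't figure out how to compare the 24 average values of each day with that day's single average_d value.
Is it clear what I am trying to do?
Is it as simple as this: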
> df %>% group_by(dates) %>% filter(price>2*mean(price))
Source: local data frame [811 x 2]
Groups: dates
dates price
1 2013-01-01 02:00:00 182.4726
2 2013-01-01 07:00:00 155.5009
3 2013-01-01 20:00:00 139.6948
4 2013-01-01 22:00:00 132.3332
5 2013-01-02 06:00:00 222.0633
6 2013-01-03 01:00:00 217.6383
7 2013-01-03 15:00:00 224.7268
8 2013-01-03 18:00:00 215.8826
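That is, group the data by dates and then keep only the rows whose price is more than twice the mean of their group? Or, if you also want to keep the average price in the output, do this: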
> df %>% group_by(dates) %>% mutate(average=mean(price)) %>% filter(price > 2*average) %>% arrange(dates)
Source: local data frame [811 x 3]
Groups: dates
dates price average
1 2013-01-01 00:00:00 140.5748 70.12211
2 2013-01-01 00:00:00 201.6484 70.12211
3 2013-01-01 01:00:00 223.9240 89.91996
4 2013-01-01 01:00:00 196.5975 89.91996
5 2013-01-01 01:00:00 203.6165 89.91996
6 2013-01-01 02:00:00 182.4726 70.85858
7 2013-01-01 02:00:00 193.0930 70.85858
8 2013-01-01 02:00:00 177.7848 70.85858
9 2013-01-01 03:00:00 202.9842 92.84580
10 2013-01-01 03:00:00 217.1840 92.84580
That also uses arrange to sort the output by dates.
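Note that the pipelines above group by dates, so each price is compared with the mean of its own hourly timestamp. If the goal is the comparison described in the question (hourly averages against the daily average), one possible approach is to join the daily table onto the hourly one by day. This is only a sketch under assumptions: the tiny data frame df_small and its values are made up for illustration, and "remove" is taken to mean dropping the hours whose average is more than twice the daily average.

```r
library(dplyr)

# Hypothetical, deterministic data: two days, three hours per day,
# two price observations per hour.
hours <- as.POSIXct(c("2013-01-01 00:00:00", "2013-01-01 01:00:00",
                      "2013-01-01 02:00:00", "2013-01-02 00:00:00",
                      "2013-01-02 01:00:00", "2013-01-02 02:00:00"),
                    tz = "UTC")
df_small <- data.frame(dates = rep(hours, each = 2),
                       price = c(10, 10, 10, 10, 100, 100,
                                 20, 20, 30, 30, 40, 40))

# Hourly averages, with the calendar day attached to each hour.
results <- df_small %>%
  group_by(dates) %>%
  summarise(average = mean(price)) %>%
  mutate(days = format(dates, "%Y-%m-%d"))

# Daily averages computed from the raw prices.
day_results <- df_small %>%
  mutate(days = format(dates, "%Y-%m-%d")) %>%
  group_by(days) %>%
  summarise(average_d = mean(price))

# Put each day's average next to every hourly average of that day.
compared <- left_join(results, day_results, by = "days")

# Drop the hours whose average is more than twice the daily average.
kept <- filter(compared, average <= 2 * average_d)
print(kept)
```

With these made-up values, the 02:00 hour on 2013-01-01 (hourly average 100, daily average 40) is dropped and the other five hours are kept. To filter individual prices rather than hourly averages, the same join by days could be applied to the raw data frame instead of results.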