Using the split, cut and duplicated functions to filter data in R
I have a dataset that looks like this.
date <- strptime(c("2011-09-01 00:00:00","2011-09-01 06:00:00","2011-09-01 12:00:00","2011-09-01 18:00:00","2011-09-02 00:00:00",
"2011-09-02 06:00:00","2011-09-02 12:00:00","2011-09-02 18:00:00","2011-09-03 00:00:00","2011-09-03 06:00:00","2011-09-03 12:00:00",
"2011-09-03 18:00:00","2011-09-04 00:00:00","2011-09-04 06:00:00","2011-09-04 12:00:00","2011-09-04 18:00:00","2011-09-05 00:00:00",
"2011-09-05 06:00:00","2011-09-05 12:00:00","2011-09-05 18:00:00","2011-09-06 00:00:00"), format ="%Y-%m-%d %H:%M:%S")
volt <- c(7,8,9,10, 7, 8, 9, 10, 6.1, 11.1, 9.1, 10.1, 7, 8, 9, 10, 6.3, 9.4, 1.3, 19.1, 5.6)
sampV <- data.frame(date,volt)
sampV
date volt
2011-09-01 00:00:00 7
2011-09-01 06:00:00 8
2011-09-01 12:00:00 9
2011-09-01 18:00:00 10
2011-09-02 00:00:00 7
2011-09-02 06:00:00 8
2011-09-02 12:00:00 9
2011-09-02 18:00:00 10
2011-09-03 00:00:00 6.1
2011-09-03 06:00:00 11.1
2011-09-03 12:00:00 9.1
2011-09-03 18:00:00 10.1
2011-09-04 00:00:00 7
2011-09-04 06:00:00 8
2011-09-04 12:00:00 9
2011-09-04 18:00:00 10
2011-09-05 00:00:00 6.3
2011-09-05 06:00:00 9.4
2011-09-05 12:00:00 1.3
2011-09-05 18:00:00 19.1
2011-09-06 00:00:00 5.6
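Now I want to group the data by day using the date column, and then check whether the resulting groups of volt values are duplicated. For example, the volt data (7, 8, 9, 10) is repeated on September 1 and September 2.
I have been trying to split the data into separate days with the code below, but this is as far as I have got.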
t1 <- strptime("2011-09-01 00:00:00",format="%Y-%m-%d %H:%M:%S")
t2 <- strptime("2011-09-06 00:00:00",format="%Y-%m-%d %H:%M:%S")
seqD <- seq(t1,t2, by="day")
ctD <- cut(sampV$date, seqD, labels = FALSE)
spD <- split(sampV$date,ctD)
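So my question is: how can I use the duplicated function, or any related function, to extract the data that is repeated from one day to the next? I am just a beginner in R and still learning the ropes, so any help would be greatly appreciated. Thanks.
Assuming I have understood your question correctly, here is one way using split and duplicated: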
days <- format(sampV$date, '%Y%m%d')
filtered <- split(sampV, days)[! duplicated(split(sampV$volt, days))]
do.call(rbind, filtered)
# date volt
# 20110901.1 2011-09-01 00:00:00 7.0
# 20110901.2 2011-09-01 06:00:00 8.0
# 20110901.3 2011-09-01 12:00:00 9.0
# 20110901.4 2011-09-01 18:00:00 10.0
# 20110903.9 2011-09-03 00:00:00 6.1
# 20110903.10 2011-09-03 06:00:00 11.1
# 20110903.11 2011-09-03 12:00:00 9.1
# 20110903.12 2011-09-03 18:00:00 10.1
# 20110905.17 2011-09-05 00:00:00 6.3
# 20110905.18 2011-09-05 06:00:00 9.4
# 20110905.19 2011-09-05 12:00:00 1.3
# 20110905.20 2011-09-05 18:00:00 19.1
# 20110906 2011-09-06 00:00:00 5.6
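Note that the result above keeps only the first occurrence of each daily pattern. If the goal is instead to pull out the days that repeat an earlier pattern, the same split/duplicated idea can be turned around. The following is a minimal sketch, not a definitive solution. It rebuilds the sample data with seq() in UTC (an assumption made only to keep the example self-contained and independent of the local time zone), and the names byDay, isDup, inAny, sameAsPrev and repeated are made up for illustration.

```r
# Rebuild the sample data from the question (21 readings, 6 hours apart).
# Assumption: UTC is used here so the day boundaries do not depend on locale.
date <- seq(as.POSIXct("2011-09-01 00:00:00", tz = "UTC"),
            by = "6 hours", length.out = 21)
volt <- c(7, 8, 9, 10, 7, 8, 9, 10, 6.1, 11.1, 9.1, 10.1,
          7, 8, 9, 10, 6.3, 9.4, 1.3, 19.1, 5.6)
sampV <- data.frame(date, volt)

# One numeric vector of volt readings per calendar day.
days  <- format(sampV$date, "%Y%m%d")
byDay <- split(sampV$volt, days)

# duplicated() works on lists: TRUE marks a day whose whole volt vector
# already appeared on some earlier day (the first occurrence stays FALSE).
isDup <- duplicated(byDay)
names(byDay)[isDup]
# "20110902" "20110904"

# Days that share their pattern with any other day, first occurrence included.
inAny <- isDup | duplicated(byDay, fromLast = TRUE)
names(byDay)[inAny]
# "20110901" "20110902" "20110904"

# Stricter reading of "from one day to the next": compare each day only
# with the day immediately before it.
n <- length(byDay)
sameAsPrev <- c(FALSE, unname(mapply(identical, byDay[-1], byDay[-n])))
names(byDay)[sameAsPrev]
# "20110902"

# Rows of the original data frame that belong to the repeated days.
repeated <- sampV[days %in% names(byDay)[isDup], ]
```

Which of the three logical vectors to use depends on what "duplicated" should mean for your data: isDup flags later copies only, inAny flags every member of a repeated group, and sameAsPrev flags only back-to-back repeats.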