
Generating yearly frequency plot from multi-year data

I have hourly wind speed data in the following format:

DT,DIR,SPEED
2002/01/01 00:00,***,0.0
2002/01/01 03:00,360,0.89408
2002/01/01 06:00,070,4.91744
2002/01/01 09:00,050,4.4704
2002/01/01 15:00,050,2.2352
2002/01/01 18:00,050,3.12928
2002/01/01 21:00,020,0.89408

The data run from 2002 to 2012. The sampling interval is not constant: it starts at one record every three hours and ends with up to three records in a single hour, as below:

2012/12/31 00:00,***,0.0
2012/12/31 00:10,***,0.0
2012/12/31 00:40,***,0.0
2012/12/31 01:10,***,0.0
2012/12/31 01:40,***,0.0
2012/12/31 02:10,***,0.0
2012/12/31 02:40,***,0.0
2012/12/31 03:00,***,0.0
2012/12/31 03:10,310,2.2352
2012/12/31 03:40,060,4.02336
2012/12/31 04:40,060,3.12928
2012/12/31 05:10,070,4.91744

I am trying to create yearly frequency plots in R showing SPEED against number of hours. I tried histograms, but the records are unevenly spaced in time, so the bin counts do not represent a number of hours. How can this be solved?

Note: the DIR value is not used, and *** is treated as NA.

You could estimate the speed at every whole hour using the approx() function, then use those estimated hourly speeds to create your histograms. For example, assuming your data frame is called df:

library(lubridate)
# date/time as class POSIXct
df$DT2 <- ymd_hm(df$DT)

# create a new data frame, everyhour, with every hour between the first and the last in df
everyhour <- data.frame(
    DT2=seq(ceiling_date(min(df$DT2), "hour"), floor_date(max(df$DT2), "hour"), by=3600),
    FORHIST=TRUE)

# merge the observed data with the everyhour data
df2 <- merge(df, everyhour, all=TRUE)
# set missing FORHIST to FALSE
df2$FORHIST[is.na(df2$FORHIST)] <- FALSE
# define year
df2$YEAR <- year(df2$DT2)
# estimate speed for everyhour
df2$estSPEED <- approx(x=df2$DT2, y=df2$SPEED, xout=df2$DT2, method="linear")$y

# plot annual histograms of hourly speeds
suy <- sort(unique(df2$YEAR))
par(mfrow=n2mfrow(length(suy)), mar=c(3, 3, 2, 1), oma=c(2, 2, 0, 0))
for(i in seq_along(suy)) {
    # keep only the whole-hour rows for this year
    sel <- df2$YEAR==suy[i] & df2$FORHIST
    hist(df2$estSPEED[sel], xlab="", ylab="", main=suy[i])
}
mtext("Speed", side=1, outer=TRUE)
mtext("Frequency", side=2, outer=TRUE)
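
To check the interpolation step on its own, here is a minimal base R sketch using only the first sample rows from the question. It is an illustration under some assumptions, not part of the code above: the names txt, obs, hours, est and hours_per_bin are made up for this example, the time zone is assumed to be UTC, and the 1-unit speed bins are an arbitrary choice.

```r
# Sample rows from the question; "***" is read as NA
txt <- "DT,DIR,SPEED
2002/01/01 00:00,***,0.0
2002/01/01 03:00,360,0.89408
2002/01/01 06:00,070,4.91744
2002/01/01 09:00,050,4.4704
2002/01/01 15:00,050,2.2352
2002/01/01 18:00,050,3.12928
2002/01/01 21:00,020,0.89408"

obs <- read.csv(text = txt, na.strings = "***", stringsAsFactors = FALSE)

# Parse the time stamps (time zone assumed to be UTC)
obs$DT2 <- as.POSIXct(obs$DT, format = "%Y/%m/%d %H:%M", tz = "UTC")

# One time stamp per whole hour between the first and last observation
hours <- seq(min(obs$DT2), max(obs$DT2), by = "hour")

# Linear interpolation of speed at each whole hour
est <- approx(x = as.numeric(obs$DT2), y = obs$SPEED,
              xout = as.numeric(hours), method = "linear")$y

# Number of hours falling in each speed bin (bin width chosen arbitrarily)
bins <- cut(est, breaks = 0:5, include.lowest = TRUE)
hours_per_bin <- table(bins)
print(hours_per_bin)
```

Because est holds exactly one value per hour, the counts in hours_per_bin are numbers of hours, which is what hist() draws in the plots above. Note that linear interpolation also fills long gaps between records, so hours inside such gaps are estimates rather than observations.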
