R Can't Find Quantile Value Within My Data?

I have a dataframe of several vectors of velocity data that I have smoothed with rollmean(). From my rolled means, I'd like to calculate the 0.75 and 1 quantiles with quantile(), and then find the row position of the first instance of those particular values. This is the code that I have written to do that:

df_quantilesummary <- df %>%
  group_by(TrialID) %>%
  summarize(MaxVel = max(RollingMean, na.rm=TRUE), Qant75Vel = quantile(RollingMean, probs=c(0.75), na.rm=TRUE),
            TotalLength = length(RollingMean),RowNumMax = which(grepl(max(RollingMean, na.rm=TRUE), RollingMean)),
            RowNum75 = which(grepl(quantile(RollingMean, probs=c(0.75), na.rm=TRUE), RollingMean))[1])

No errors are thrown and everything seems to work. quantile() calculates the 75th percentile without trouble, but which(grepl(...)) sometimes returns NA, as if the 75th percentile value it calculated doesn't exist in the data. It doesn't happen for every trial, just some of them. The which(grepl(...)) lookup works fine for the maximum value (it doesn't matter whether I use max() or quantile(x, probs=1); both work).

I made a mock dataframe below, but the code works on it, so I am at a loss as to what is going on. Any insights would be helpful. Thank you.

library(zoo)    # provides rollmean()
library(dplyr)  # provides %>%, group_by(), summarize()

set.seed(82828)
dummyvelocity1 <- c(runif(200, min=-0.0523, max=1))
set.seed(3983289389)
dummyvelocity2 <- c(runif(200, min=-0.1, max=1.2))
set.seed(227272)
dummyvelocity3 <- c(runif(200, min=-0.08, max=0.9))
set.seed(27272728393)
dummyvelocity4 <- c(runif(200, min=-0.02, max=1.45))
set.seed(1488)
dummyvelocity5 <- c(runif(200, min=-0.07, max=1.03))
Velocity <- c(dummyvelocity1, dummyvelocity2, dummyvelocity3, dummyvelocity4, dummyvelocity5)
TrialID <- c(rep(1, 200), rep(2, 200), rep(3, 200), rep(4,200), rep(5, 200)) 
df <- data.frame(TrialID, Velocity)
dflist <- split(df$Velocity, df$TrialID)
RollingMeanList <- lapply(dflist, function(x) rollmean(x, 20, fill=NA))
RollingMean <- unlist(RollingMeanList)
df <- cbind(df, RollingMean)

This is likely a floating-point precision issue. You can try using near() from dplyr, which compares numbers within a small tolerance instead of matching their printed text:

df_quantilesummary <- df %>%
  group_by(TrialID) %>%
  summarize(
    MaxVel = max(RollingMean, na.rm=TRUE),
    Qant75Vel = quantile(RollingMean, probs=c(0.75), na.rm=TRUE),
    TotalLength = length(RollingMean),
    # [1] keeps only the first matching row, so each group yields one value
    RowNumMax = which(near(RollingMean, max(RollingMean, na.rm=TRUE)))[1],
    RowNum75 = which(near(RollingMean, quantile(RollingMean, probs=0.75, na.rm=TRUE)))[1]
  )
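Precision may not be the only cause. With its default type = 7, quantile() interpolates between neighbouring observations, so the 75th percentile is often a value that never occurs in the data, and neither grepl() nor near() can find a row for it. The sketch below illustrates this in base R on a small made-up vector. The vector and all variable names are hypothetical and are not taken from the question's data. It also shows three possible ways to get a row position anyway.

```r
# Hypothetical toy vector standing in for one trial's RollingMean
# (the NAs mimic the padding that rollmean(..., fill = NA) produces).
x <- c(NA, 0.12, 0.35, 0.41, 0.58, 0.77, 0.93, NA)

# Default type = 7 interpolates between the 4th and 5th sorted values.
q75 <- quantile(x, probs = 0.75, na.rm = TRUE)

# The interpolated value is not an element of x, so exact lookups are empty.
found_exact <- any(x == q75, na.rm = TRUE)

# grepl() coerces the number to text and matches it as a regular expression;
# with no match, which() returns integer(0) and [1] turns that into NA.
row_grepl <- which(grepl(q75, x))[1]

# Option 1: row of the observation closest to the quantile.
row_nearest <- which.min(abs(x - q75))

# Option 2: first row whose value reaches or exceeds the quantile.
row_first_reach <- which(x >= q75)[1]

# Option 3: type = 1 always returns an observed value, so == can find it.
q75_obs <- quantile(x, probs = 0.75, na.rm = TRUE, type = 1)
row_obs <- which(x == q75_obs)[1]

print(c(nearest = row_nearest, first_reach = row_first_reach, observed = row_obs))
```

If this turns out to be what is happening in the real data, any of the three expressions could replace the which(...) call inside summarize(). Which one fits depends on whether "the row of the 75th percentile" should mean the closest value, the first value to reach it, or an observed quantile. It would also be consistent with the mock data working: each mock trial has 181 non-NA rolling means, and 0.75 * (181 - 1) is a whole number, so no interpolation is needed there.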
