使用 R 中的 nls function 循環分組數據

Question

我有一個分組數據集。 我的數據按 GaugeID 分組。 我有一個 nls function 我想遍歷每個組並提供一個 output 值。

library(tidyverse)
library(stats)

# sample of data (yearly), first column is gauge (grouping variable), year, then two formula inputs PETvP and ETvP 

# A tibble: 10 x 4
   GaugeID  WATERYR  PETvP  ETvP 
   <chr>      <dbl>  <dbl> <dbl>  
 1 06892000    1981  0.854 0.754 
 2 06892000    1982  0.798 0.708 
 3 06892000    1983  1.12  0.856 
 4 06892000    1984  0.905 0.720  
 5 06892000    1985  0.721 0.618 
 6 06892000    1986  0.717 0.625 
 7 06892000    1987  0.930 0.783 
 8 06892000    1988  1.57  0.945 
 9 06892000    1989  1.15  0.739 
10 06892000    1990  0.933 0.805 
11 08171300    1981  0.854 0.754 
12 08171300    1982  0.798 0.708 
13 08171300    1983  1.12  0.856 
14 08171300    1984  0.905 0.720  
15 08171300    1985  0.721 0.618 
16 08171300    1986  0.717 0.625 
17 08171300    1987  0.930 0.783 
18 08171300    1988  1.57  0.945 
19 08171300    1989  1.15  0.739 
20 08171300    1990  0.933 0.805 

# attempted for loop
for (i in unique(yearly$GaugeID)) {
   myValue = nls(ETvP[i] ~ I(1 + PETvP[i] - (1 + PETvP[i]^(w))^(1/w)), data = yearly,
    start =  list(w = 2), trace = TRUE)
}

我收到以下錯誤

Error in model.frame.default(formula = ~ETvP + i + PETvP, data = yearly) : 
  variable lengths differ (found for 'i')

我沒有找到太多關於使用 nls function 循環的信息。 本質上，我正在制作曲線，並且需要每個儀表的曲線 (w) 值到 output。 如果我將公式分配給一個儀表（如果我對數據進行子集化，即第一個儀表），它會起作用，但當我嘗試在具有分組數據的整個數據幀上使用它時則不行。 例如，這有效

# gaugeA 
# A tibble: 10 x 4
   GaugeID  WATERYR  PETvP  ETvP 
   <chr>      <dbl>  <dbl> <dbl>  
 1 06892000    1981  0.854 0.754 
 2 06892000    1982  0.798 0.708 
 3 06892000    1983  1.12  0.856 
 4 06892000    1984  0.905 0.720  
 5 06892000    1985  0.721 0.618 
 6 06892000    1986  0.717 0.625 
 7 06892000    1987  0.930 0.783 
 8 06892000    1988  1.57  0.945 
 9 06892000    1989  1.15  0.739 
10 06892000    1990  0.933 0.805 

test = nls(ETvP ~ I(1 + PETvP - (1 + PETvP^(w))^(1/w)), data = gaugeA, 
    start =  list(w = 2), trace = TRUE)

1.574756    (4.26e+00): par = (2)
0.2649549   (1.46e+00): par = (2.875457)
0.09466832  (3.32e-01): par = (3.59986)
0.08543699  (2.53e-02): par = (3.881397)
0.08538308  (9.49e-05): par = (3.907099)
0.08538308  (1.13e-06): par = (3.907001)
> test
Nonlinear regression model
  model: ETvP ~ I(1 + PETvP - (1 + PETvP^(w))^(1/w))
   data: gaugeA
    w 
3.907 
 residual sum-of-squares: 0.08538

Number of iterations to convergence: 5 
Achieved convergence tolerance: 1.128e-06

關於如何獲得整個分組 dataframe 的子集結果的任何想法？ 它有超過 600 種不同的儀表。 先感謝您。

Answer 1

以下任何一項都將起作用：

使用summarise ：

df %>%
  group_by(GaugeID) %>%
  summarise(result = list(nls(ETvP ~ I(1 + PETvP - (1 + PETvP^(w))^(1/w)), 
                              data = cur_data(), 
                start =  list(w = 2)))) %>%
  pull(result)

[[1]]
Nonlinear regression model
  model: ETvP ~ I(1 + PETvP - (1 + PETvP^(w))^(1/w))
   data: cur_data()
    w 
3.607 
 residual sum-of-squares: 0.01694

Number of iterations to convergence: 5 
Achieved convergence tolerance: 7.11e-08

[[2]]
Nonlinear regression model
  model: ETvP ~ I(1 + PETvP - (1 + PETvP^(w))^(1/w))
   data: cur_data()
    w 
1.086 
 residual sum-of-squares: 0.1532

Number of iterations to convergence: 5 
Achieved convergence tolerance: 2.685e-07

使用map ：

df %>%
  group_split(GaugeID) %>%
  map(~nls(ETvP ~ I(1 + PETvP - (1 + PETvP^(w))^(1/w)), 
           data = .x, 
           start =  list(w = 2)))

Answer 2

我通常更喜歡purrr和dplyr在分組數據上循環函數。 我無法編輯數據，但也許這有效：

library(dplyr)
library(purrr)

yearly %>% group_by(GaugeID) %>% summarise(test = nls(ETvP ~ I(1 + PETvP - (1 + PETvP^(w))^(1/w)), data = gaugeA, start =  list(w = 2), trace = TRUE)

Answer 3

可以制定單個 model 消除環路。 確保 GaugeID 是一個因子，公式中的 GaugeID 為 w 下標，並提供一個起始值列表，其 w 分量是一個向量，每個級別的 GaugeID 都有一個起始值。

df$GaugeID <- factor(df$GaugeID)
fo <- ETvP ~ 1 + PETvP - (1 + PETvP^(w[GaugeID]))^(1/w[GaugeID])
st <- list(w = rep(2, nlevels(df$GaugeID)))
nls(fo, df, start = st)

使用 R 中的 nls function 循環分組數據

問題描述

3 個解決方案

解決方案1
0 2021-12-02 19:31:04

解決方案2
0 2021-12-02 19:31:31

解決方案3
0 2021-12-03 03:53:06

使用 R 中的 nls function 循環分組數據

問題描述

3 個解決方案

解決方案1 0 2021-12-02 19:31:04

解決方案2 0 2021-12-02 19:31:31

解決方案3 0 2021-12-03 03:53:06

解決方案1
0 2021-12-02 19:31:04

解決方案2
0 2021-12-02 19:31:31

解決方案3
0 2021-12-03 03:53:06