R: Compute a linear regression and get the slope for subsets of the data

Question

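My goal is to find the half-life (from the terminal phase, if anyone is familiar with pharmacokinetics).

I have some data containing the following:

1500 rows, with ID as the main "key". Each ID has 15 rows. I also have the columns TIME and CONCENTRATION. What I want to do is, for each ID, drop the first TIME (which equals "000", numeric), run lm() on the remaining 14 rows of that ID, take the absolute value of the slope with abs(), and save it in a new column named THALF. (If anyone is familiar with pharmacokinetics, maybe there is a better way to do this?)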

With my limited knowledge of R, however, I have not been able to do this.

This is what I have come up with so far:

data_new <- data %>%
  dplyr::group_by(data$ID) %>%
  dplyr::filter(data$TIME != 10) %>%
  dplyr::mutate(THAFL = abs(lm$coefficients[2](data$CONC ~ data$TIME)))

From what I gathered from other Stack Overflow answers, lm$coefficients[2] extracts the slope.

However, I cannot get this to work. When I run the code I get this error:

Error: Problem with `mutate()` input `..1`.
x Input `..1` can't be recycled to size 15.
i Input `..1` is `data$ID`.
i Input `..1` must be size 15 or 1, not 1500.
i The error occurred in group 1: data$ID = "pat1".

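Any suggestions on how to solve this? Let me know if you need more information.

(Also, if anyone is familiar with pharmacokinetics: when asked to obtain the half-life from the terminal phase, do I run lm() starting from the maximum concentration? I have a column containing the time at which the highest concentration was observed.)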

Answer 1

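If you still need the TIME == 10 observations after fitting the model, you can summarise after grouping by ID and then use a right join to bring the full data back: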

data %>% 
  filter(TIME != 10) %>% 
  group_by(ID) %>%
  summarise(THAFL = abs(lm(CONC ~ TIME)$coefficients[2])) %>% 
  right_join(data, by = "ID")


# A tibble: 30 x 16
   ID      THAFL Sex   Weight..kg. Height..cm. Age..yrs. T134A A443G G769C G955C A990C  TIME  CONC LBM   `data_combine$ID`  CMAX
   <chr>   <dbl> <chr>       <int>       <int>     <int> <int> <int> <int> <int> <int> <dbl> <dbl> <chr> <chr>             <dbl>
 1 pat1  0.00975 F              50         135        47     0     2     1     2     0    10  0    Under pat1                 60
 2 pat1  0.00975 F              50         135        47     0     2     1     2     0    20  6.93 Under pat1                 60
 3 pat1  0.00975 F              50         135        47     0     2     1     2     0    30 12.2  Under pat1                 60
 4 pat1  0.00975 F              50         135        47     0     2     1     2     0    45 14.8  Under pat1                 60
 5 pat1  0.00975 F              50         135        47     0     2     1     2     0    60 15.0  Under pat1                 60
 6 pat1  0.00975 F              50         135        47     0     2     1     2     0    90 12.4  Under pat1                 60
 7 pat1  0.00975 F              50         135        47     0     2     1     2     0   120  9.00 Under pat1                 60
 8 pat1  0.00975 F              50         135        47     0     2     1     2     0   150  6.22 Under pat1                 60
 9 pat1  0.00975 F              50         135        47     0     2     1     2     0   180  4.18 Under pat1                 60
10 pat1  0.00975 F              50         135        47     0     2     1     2     0   240  1.82 Under pat1                 60
# ... with 20 more rows

If you do not want the TIME == 10 rows in the data set after fitting the model, you can use mutate instead:

data %>% 
  filter(TIME != 10) %>% 
  group_by(ID) %>%
  mutate(THAFL = abs(lm(CONC ~ TIME)$coefficients[2]))

# A tibble: 28 x 16
# Groups:   ID [2]
   ID    Sex   Weight..kg. Height..cm. Age..yrs. T134A A443G G769C G955C A990C  TIME  CONC LBM   `data_combine$ID`  CMAX   THAFL
   <chr> <chr>       <int>       <int>     <int> <int> <int> <int> <int> <int> <dbl> <dbl> <chr> <chr>             <dbl>   <dbl>
 1 pat1  F              50         135        47     0     2     1     2     0    20  6.93 Under pat1                 60 0.00975
 2 pat2  M              75         175        29     0     2     0     0     0    20  6.78 Under pat2                 60 0.00835
 3 pat1  F              50         135        47     0     2     1     2     0    30 12.2  Under pat1                 60 0.00975
 4 pat2  M              75         175        29     0     2     0     0     0    30 11.6  Above pat2                 60 0.00835
 5 pat1  F              50         135        47     0     2     1     2     0    45 14.8  Under pat1                 60 0.00975
 6 pat2  M              75         175        29     0     2     0     0     0    45 13.5  Under pat2                 60 0.00835
 7 pat1  F              50         135        47     0     2     1     2     0    60 15.0  Under pat1                 60 0.00975
 8 pat2  M              75         175        29     0     2     0     0     0    60 13.1  Above pat2                 60 0.00835
 9 pat1  F              50         135        47     0     2     1     2     0    90 12.4  Under pat1                 60 0.00975
10 pat2  M              75         175        29     0     2     0     0     0    90  9.77 Under pat2                 60 0.00835
# ... with 18 more rows
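
For readers wondering why the attempt in the question fails: inside grouped dplyr verbs, a bare column name such as TIME is evaluated within the current group, whereas data$TIME always refers to the full 1500-row column, which is what triggers the "can't be recycled to size 15" error. The sketch below illustrates the working pattern on a small made-up data set. The `toy` data frame, its values, and the column name SLOPE are invented for illustration and are not taken from the question.

```r
library(dplyr)

# Hypothetical toy data: 2 subjects, 5 time points each (values invented)
toy <- tibble(
  ID   = rep(c("pat1", "pat2"), each = 5),
  TIME = rep(c(10, 20, 30, 40, 50), times = 2),
  CONC = c(0, 8, 6.5, 4, 2.5,
           0, 9, 6,   4, 2)
)

slopes <- toy %>%
  filter(TIME != 10) %>%      # bare TIME, not toy$TIME
  group_by(ID) %>%
  summarise(
    # lm() only sees the CONC and TIME values of the current group here
    SLOPE = abs(unname(coef(lm(CONC ~ TIME))["TIME"])),
    .groups = "drop"
  )

slopes
```

Note also that lm is a function, so the coefficient has to be extracted from the fitted model, as in lm(...)$coefficients[2] or coef(lm(...)), rather than from lm itself.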

Answer 2

You can use broom:

library(broom)
library(dplyr)
library(tidyr)  # provides unnest()
#Code
data %>% group_by(ID) %>%
  filter(TIME!=10) %>%
  do(fit = tidy(lm(CONC ~ TIME, data = .))) %>% 
  unnest(fit) %>%
  filter(term=='TIME') %>%
  mutate(estimate=abs(estimate))

Output:

# A tibble: 2 x 6
  ID    term  estimate std.error statistic p.value
  <chr> <chr>    <dbl>     <dbl>     <dbl>   <dbl>
1 pat1  TIME   0.00975   0.00334     -2.92  0.0128
2 pat2  TIME   0.00835   0.00313     -2.67  0.0204
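
As a side note, do() is marked as superseded in dplyr 1.0.0 and later. A minimal sketch of the same idea with group_modify() is shown here, assuming a recent dplyr and broom; the `toy` data frame and its values are invented for illustration.

```r
library(dplyr)
library(broom)

# Hypothetical toy data: 2 subjects, 5 time points each (values invented)
toy <- tibble(
  ID   = rep(c("pat1", "pat2"), each = 5),
  TIME = rep(c(10, 20, 30, 40, 50), times = 2),
  CONC = c(0, 8, 6.5, 4, 2.5,
           0, 9, 6,   4, 2)
)

fits <- toy %>%
  filter(TIME != 10) %>%
  group_by(ID) %>%
  # .x is the data of the current group (without the ID column)
  group_modify(~ tidy(lm(CONC ~ TIME, data = .x))) %>%
  ungroup() %>%
  filter(term == "TIME") %>%          # keep the slope row only
  mutate(estimate = abs(estimate))

fits
```

The result has one row per ID with the same columns as the tidy() output above, so no unnest() step is needed.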

If you need to join the result back to the original data, try:

#Code 2
data <- data %>% left_join(data %>% group_by(ID) %>%
  filter(TIME!=10) %>%
  do(fit = tidy(lm(CONC ~ TIME, data = .))) %>% 
  unnest(fit) %>%
  filter(term=='TIME') %>%
  mutate(estimate=abs(estimate)) %>%
  select(c(ID,estimate)))

This is similar to @RicS's answer.
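
On the pharmacokinetic part of the question: under the common assumption of first-order elimination, concentration falls exponentially in the terminal phase, C(t) = C0 * exp(-lambda_z * t), so log(C) is linear in time with slope -lambda_z and the half-life is log(2) / lambda_z. The absolute slope of CONC ~ TIME on the raw scale is therefore not itself a half-life. The sketch below is a simplification that treats every point after the observed maximum as terminal phase; in practice the terminal points are usually chosen more carefully. The function name `terminal_half_life` and the example values are hypothetical, and the input is assumed to be sorted by time.

```r
# Hypothetical helper: terminal half-life for one subject.
# Assumes `time` is sorted and elimination is first-order.
terminal_half_life <- function(time, conc) {
  i_max <- which.max(conc)
  # keep points strictly after the peak; log() needs positive values
  keep <- seq_along(conc) > i_max & conc > 0
  if (sum(keep) < 3) return(NA_real_)   # too few points to fit
  fit <- lm(log(conc[keep]) ~ time[keep])
  lambda_z <- -unname(coef(fit)[2])     # terminal elimination rate constant
  log(2) / lambda_z
}

# Invented example: after the peak the concentration halves every 30 time units
time <- c(10, 20, 30, 60, 90, 120, 150)
conc <- c(0, 6, 16, 8, 4, 2, 1)

terminal_half_life(time, conc)  # approximately 30
```

Within the grouped pipelines shown in the answers, such a helper could be called per subject, for example summarise(THALF = terminal_half_life(TIME, CONC)) after group_by(ID).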

Answer 1
1 Accepted 2020-12-06 15:03:12

Answer 2
1 2020-12-06 15:05:34