How to calculate the average change across a variable number of columns using score data

Question

I am trying to find the average difference in a score over repeated measures. The problem is that not every observation is measured equally often, and the values in the columns are scores on a 6-point scale.

The data is available in both long and wide format. The wide format looks like this:

ID    Type    M1    M2    M3    M4    M6
1      A       5     5    3
2      A       4     3    1
3      A       2     5    3     5      5
4      C       5     4    4     3
5      B       3 
6      F       4     2    3     4      1

This is the alternative (long) format:

ID    Type    M    Score
1       A     1      5
1       A     2      5
1       A     3      3
2       A     1      4
2       A     2      3
2       A     3      1
4       C     1      5
4       C     2      4
4       C     3      4
4       C     4      3

I am not really interested in the interim values. I need the difference between M1 and whatever the last measurement is for that ID, and then the average of those differences. I need this across all types first, and later broken down by type.

Packages installed are: dplyr, purrr, stringr, tidyr, tibble, data.table

The closest I got was the following:

df %>% group_by(M) %>%
    arrange(M) %>%
    summarize(avg = as.numeric(mean(diff(Score))),
              sd = as.numeric(sd(diff(Score))))

and

df %>% group_by(Type) %>%
    arrange(M) %>%
    summarize(avg = as.numeric(mean(diff(Score))),
              sd = as.numeric(sd(diff(Score))))

This was done on the long-format data and gave the result:

       M           avg       sd
     <fctr>       <dbl>    <dbl>
 1            1          NA       NA
 2            2          NA       NA
 3            3 -0.03370787 1.741534
 4            4 -0.04878049 2.036556
 5            5 -0.18181818 1.887760
 6            6  0.00000000 1.095445
 7            7         NaN       NA
 8            8         NaN       NA
 9            9         NaN       NA
10         <NA> -0.16666667 1.722401

The table above is taken from my actual analysis and is not related to the example tables. The NA and NaN values are a problem: I know there is data in some of those groups, but the average difference could not be calculated for them.

Answer 1

One solution, based on the OP's feedback, is to use dplyr to compute, for each ID, the absolute difference between the first and last measurement divided by the number of measurements.

library(dplyr)

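# Rows are ordered by M, so first()/last() pick the earliest and latest
# measurement within each ID
df %>% group_by(ID) %>%
  arrange(M) %>%
  # absolute first-to-last change, divided by the number of measurements
  summarise(avg = abs(first(Score) - last(Score))/n())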

#Result
#     ID   avg
#  <int> <dbl>
#1     1 0.667
#2     2 1.00 
#3     4 0.500
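
The question also asks for the average of the per-ID differences, across all types and then by type. The following is only a sketch of one way to do that, not the answer's original method: it rebuilds the long-format example from the question as `df`, keeps the sign of the change (last minus first), and assumes dplyr 1.0.0 or later for the `.groups` argument. The names `changes`, `overall`, and `by_type` are made up for illustration.

```r
library(dplyr)

# Long-format example data copied from the question
df <- data.frame(
  ID    = c(1, 1, 1, 2, 2, 2, 4, 4, 4, 4),
  Type  = c("A", "A", "A", "A", "A", "A", "C", "C", "C", "C"),
  M     = c(1, 2, 3, 1, 2, 3, 1, 2, 3, 4),
  Score = c(5, 5, 3, 4, 3, 1, 5, 4, 4, 3)
)

# One row per ID: signed change from the first to the last available measurement
changes <- df %>%
  filter(!is.na(Score)) %>%   # ignore measurements that were never taken
  arrange(ID, M) %>%          # make sure rows are in measurement order
  group_by(ID, Type) %>%
  summarise(change = last(Score) - first(Score), .groups = "drop")

# Average change across all IDs
overall <- changes %>%
  summarise(avg = mean(change), sd = sd(change))

# Average change broken down by Type
by_type <- changes %>%
  group_by(Type) %>%
  summarise(avg = mean(change), sd = sd(change), n_ids = n())
```

With this example data the per-ID changes are -2, -3 and -2. Note that `sd()` returns NA for a type that contains only one ID, since a standard deviation needs at least two values.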

The actual average and SD of the scores for each ID can be calculated as:

df %>% group_by(ID) %>%
  arrange(M) %>%
  summarise(avg = mean(Score), SD = sd(Score))

#Result
#     ID   avg    SD
#  <int> <dbl> <dbl>
#1     1  4.33 1.15 
#2     2  2.67 1.53 
#3     4  4.00 0.816
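
If only the wide table is at hand, it can be reshaped to the long format first. The sketch below assumes tidyr 1.0.0 or later (for `pivot_longer()`) and dplyr 1.0.0 or later, treats the blank cells of the question's wide table as NA, and drops IDs measured only once, since they have no change to report. That last choice is an assumption, not something the question specifies. The names `wide`, `long`, and `changes` are illustrative.

```r
library(dplyr)
library(tidyr)

# Wide-format example data copied from the question; blank cells become NA
wide <- data.frame(
  ID   = 1:6,
  Type = c("A", "A", "A", "C", "B", "F"),
  M1   = c(5, 4, 2, 5, 3, 4),
  M2   = c(5, 3, 5, 4, NA, 2),
  M3   = c(3, 1, 3, 4, NA, 3),
  M4   = c(NA, NA, 5, 3, NA, 4),
  M6   = c(NA, NA, 5, NA, NA, 1)
)

# One row per (ID, measurement); unmeasured cells are dropped
long <- wide %>%
  pivot_longer(cols = M1:M6, names_to = "M", names_prefix = "M",
               values_to = "Score", values_drop_na = TRUE) %>%
  mutate(M = as.integer(M))

# Signed change from first to last measurement, skipping single-measurement IDs
changes <- long %>%
  arrange(ID, M) %>%
  group_by(ID, Type) %>%
  filter(n() > 1) %>%
  summarise(change = last(Score) - first(Score), .groups = "drop")

# Average change over the remaining IDs
mean(changes$change)
```

Dropping the NA cells before calling `first()` and `last()` matters here: `mean()` and `sd()` return NA as soon as any input is NA, and `mean()` of an empty vector is NaN, which is one plausible source of the NA and NaN rows in the question's output.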

Answer 1: 0 votes, accepted, 2018-03-02 22:38:41
