
RSQLite using group-by for functions within the query

I am very new to SQL (currently using it in R with the RSQLite and DBI packages).

I am trying to create a column that is the absolute mean deviation, aka:

(i) - AVG(i,g)

Where i is the individual occurrence and the AVG component is the average for the group. What I am having trouble with is making sure the AVG component is the only part that gets grouped. When I use GROUP BY, it groups everything and doesn't give me the right number.

Here is the sample data:

student  class      grade
A        English    79
A        Spanish    65
A        Chemistry  92
B        English    46
B        Spanish    83
B        Chemistry  78
C        English    67
C        Spanish    87
C        Chemistry  98
D        English    99
D        Spanish    80
D        Chemistry  75

Basically, I want to compare a student's individual grade in each class with that student's own average (e.g. the individual English grade minus the student's overall average).

I have tried the following:

dbGetQuery(gradesdb, "SELECT student, 
                      ABS(grade-AVG(grade)) AS mad
                      FROM grades
                      GROUP BY student,class")

This gives me 0 for every mad value (which I gather is because the GROUP BY applies to all selected operations within the query).

How can I make it so that the AVG portion of the calculation is "grouped" only by student? I get the right calculation if I do:

dbGetQuery(gradesdb2, "SELECT student, 
                      ABS(grade-AVG(grade)) AS mad
                      FROM grades
                      GROUP BY student")

But then I only get the first class for each student, instead of the stat for each class measured against the student average.

I want to do this all in SQL and not calculate the average as a separate column with base R or tidyverse.

Thank you so much for your help!

Consider replacing the AVG() computed via GROUP BY with AVG() as a window function:

SELECT student, 
       ABS(grade - AVG(grade) OVER (PARTITION BY student)) AS mad
FROM grades
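
For completeness, here is a minimal sketch of running that query from R against the sample data in the question. It assumes an in-memory database, a connection named `con` (my own name, not from the question), and an RSQLite build that bundles SQLite 3.25 or newer, since that is the release where SQLite gained window functions. The extra `class` and `grade` columns and the `ORDER BY` are only there to make the result easier to read.

```r
library(DBI)

# Assumption: a throwaway in-memory database instead of the asker's gradesdb
con <- dbConnect(RSQLite::SQLite(), ":memory:")

# Sample data copied from the question
grades <- data.frame(
  student = rep(c("A", "B", "C", "D"), each = 3),
  class   = rep(c("English", "Spanish", "Chemistry"), times = 4),
  grade   = c(79, 65, 92, 46, 83, 78, 67, 87, 98, 99, 80, 75)
)
dbWriteTable(con, "grades", grades)

# AVG(grade) OVER (PARTITION BY student) yields the per-student mean on
# every row without collapsing rows, so each class keeps its own line
res <- dbGetQuery(con, "
  SELECT student,
         class,
         grade,
         ABS(grade - AVG(grade) OVER (PARTITION BY student)) AS mad
  FROM grades
  ORDER BY student, class")

dbDisconnect(con)
print(res)
```

This should return 12 rows, one per student and class. For student A, whose average is 236/3 (about 78.67), the English row comes out at roughly 0.33.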


 