RSQLite using group-by for functions within the query
I am very new to SQL (currently using it in R with the RSQLite and DBI packages).
I am trying to create a column that is the absolute mean deviation, i.e.:
(i) - AVG(i,g)
where i is the individual occurrence and the AVG component is the average for the group. What I am having trouble with is making sure the AVG component is the only part that gets grouped. When I do GROUP BY, it groups everything and doesn't give me the right number.
Here is the sample data:
student | class | grade |
---|---|---|
A | English | 79 |
A | Spanish | 65 |
A | Chemistry | 92 |
B | English | 46 |
B | Spanish | 83 |
B | Chemistry | 78 |
C | English | 67 |
C | Spanish | 87 |
C | Chemistry | 98 |
D | English | 99 |
D | Spanish | 80 |
D | Chemistry | 75 |
Basically, I want the individual grade for a student in a class compared with that student's average (e.g. the individual English grade minus the student's overall average).
I have tried the following:
dbGetQuery(gradesdb, "SELECT student,
ABS(grade-AVG(grade)) AS mad
FROM grades
GROUP BY student,class")
This gives me 0 for all of the stat values (which I gather is because the GROUP BY applies to every selected expression in the query).
How can I make it so that the AVG portion of the calculation is only "grouped" by the student? I get the right calculation if I do:
dbGetQuery(gradesdb2, "SELECT student,
ABS(grade-AVG(grade)) AS mad
FROM grades
GROUP BY student")
But then I only get the first class for each student, instead of the stat for each class against the student average.
I want to do this all in SQL and not calculate the average as a separate column with base R or tidyverse.
Thank you so much for your help!
Consider switching from AVG() via GROUP BY to AVG() via a window function:
SELECT student,
ABS(grade - AVG(grade) OVER (PARTITION BY student)) AS mad
FROM grades
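To check the window-function query against the sample data, here is a minimal sketch. It uses Python's built-in sqlite3 module rather than RSQLite purely so it runs without extra packages; the SQL string should work unchanged when passed to dbGetQuery. The in-memory database, the added class column in the SELECT, and the ORDER BY are my own additions for illustration, and window functions assume SQLite 3.25.0 or later.

```python
import sqlite3

# Sample data from the question: (student, class, grade)
rows = [
    ("A", "English", 79), ("A", "Spanish", 65), ("A", "Chemistry", 92),
    ("B", "English", 46), ("B", "Spanish", 83), ("B", "Chemistry", 78),
    ("C", "English", 67), ("C", "Spanish", 87), ("C", "Chemistry", 98),
    ("D", "English", 99), ("D", "Spanish", 80), ("D", "Chemistry", 75),
]

# In-memory database standing in for the question's gradesdb (assumption)
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE grades (student TEXT, class TEXT, grade INTEGER)")
conn.executemany("INSERT INTO grades VALUES (?, ?, ?)", rows)

# No GROUP BY: every row is kept, and AVG() is computed over the
# rows that share the same student (the window partition).
query = """
SELECT student,
       class,
       ABS(grade - AVG(grade) OVER (PARTITION BY student)) AS mad
FROM grades
ORDER BY student, class
"""

result = conn.execute(query).fetchall()
for student, cls, mad in result:
    print(student, cls, round(mad, 2))
```

Because there is no GROUP BY, all twelve rows come back, each with its own deviation from that student's average. This also shows why the first attempt returned 0: grouping by student and class leaves one row per group, so AVG(grade) equals grade.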