
Postgres, Rails and selecting columns that are not in group clause

I have the following query, in which I want to group by treatment_selections.treatment_id and also select the treatments.name column:

@search = Trial.joins(:quality_datum, treatment_selections: :treatment)
.select('DISTINCT ON (treatment_selections.treatment_id) treatment_selections.treatment_id, treatments.name, AVG(quality_data.yield) as yield')
.where("EXTRACT(year from season_year) BETWEEN #{params[:start_year]} AND #{params[:end_year]}")

I get the dreaded error:

PG::GroupingError: ERROR:  column "treatment_selections.treatment_id" must appear in the GROUP BY clause or be used in an aggregate function

So I switched to the following query:

@search = Trial.joins(:quality_datum, treatment_selections: :treatment)
.select('treatments.name, treatment_selections.treatment_id, treatments.name, AVG(quality_data.yield) as yield')
.where("EXTRACT(year from season_year) BETWEEN #{params[:start_year]} AND #{params[:end_year]}")  
.group('treatment_selections.treatment_id')

I know this won't work, because treatments.name is not referenced in the group clause. But I figured the first query should have worked, as I'm not grouping by anything. I understand that aggregate functions such as AVG and SUM do not need to be referenced in the group clause, but what about columns that are not wrapped in any aggregate function?

I have seen that nesting queries is a possible way of doing what I'm after, but I'm unsure how best to implement this using the above query. Hoping someone could help me out here.

Log

SELECT treatment_selections.treatment_id, treatment.name, AVG(quality_data.yield) as yield FROM "trials" INNER JOIN "treatment_selections" ON "treatment_selections"."trial_id" = "trials"."id" INNER JOIN "quality_data" ON "quality_data"."treatment_selection_id" = "treatment_selections"."id" INNER JOIN "treatment_selections" "treatment_selections_trials" ON "treatment_selections_trials"."trial_id" = "trials"."id" INNER JOIN "treatments" ON "treatments"."id" = "treatment_selections_trials"."treatment_id" WHERE (EXTRACT(year from season_year) BETWEEN 2018 AND 2018) GROUP BY treatment_selections.treatment_id)

Selecting non-aggregated columns together with aggregate functions is not possible unless you group by those columns. Otherwise there is no way to determine how the average should be calculated (over the entire data set or per group of something). You could do this:

@search = Trial.joins(:quality_datum, treatment_selections: :treatment)
.select('treatment_selections.treatment_id, treatments.name, AVG(quality_data.yield) as yield')
.where("EXTRACT(year from season_year) BETWEEN ? AND ?", params[:start_year], params[:end_year])  
.group('treatment_selections.treatment_id, treatments.name')

Although this might not work well for your use case if one treatments.id can be associated with multiple treatments.name values.

I am not an expert on Rails, but let's analyze the logged query:

SELECT treatment_selections.treatment_id, treatment.name, AVG(quality_data.yield) as yield
FROM "trials"
INNER JOIN "treatment_selections" ON "treatment_selections"."trial_id" = "trials"."id"
INNER JOIN "quality_data" ON "quality_data"."treatment_selection_id" = "treatment_selections"."id"
INNER JOIN "treatment_selections" "treatment_selections_trials" ON "treatment_selections_trials"."trial_id" = "trials"."id"
INNER JOIN "treatments" ON "treatments"."id" = "treatment_selections_trials"."treatment_id"
WHERE (EXTRACT(year from season_year) BETWEEN 2018 AND 2018)
GROUP BY treatment_selections.treatment_id

Maybe you are relying on the DISTINCT ON clause to make this work without specifying both columns. But as you can see in the log, it is not being translated into SQL.

SELECT [missing DISTINCT ON(treatment_selections.treatment_id)] treatment_selections.treatment_id, treatment.name, AVG(quality_data.yield) as yield
FROM "trials"
INNER JOIN "treatment_selections" ON "treatment_selections"."trial_id" = "trials"."id"
INNER JOIN "quality_data" ON "quality_data"."treatment_selection_id" = "treatment_selections"."id"
INNER JOIN "treatment_selections" "treatment_selections_trials" ON "treatment_selections_trials"."trial_id" = "trials"."id"
INNER JOIN "treatments" ON "treatments"."id" = "treatment_selections_trials"."treatment_id"
WHERE (EXTRACT(year from season_year) BETWEEN 2018 AND 2018)
GROUP BY treatment_selections.treatment_id

But even if you managed to force Rails to emit DISTINCT ON, you might not get your intended result, because DISTINCT ON returns only one row per treatment_id rather than aggregating the rows that share it.
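As a rough illustration of why DISTINCT ON is not a substitute for grouping, the plain-Ruby sketch below mimics both behaviours in memory. The sample rows are made up for illustration and are not data from the question. `uniq` with a block keeps the first row seen per treatment_id, much as DISTINCT ON keeps one row per key, whereas `group_by` collects every row of a group so that an average can be computed.

```ruby
# Hypothetical sample rows standing in for the joined result set.
rows = [
  { treatment_id: 1, name: "treatment 1", yield: 0.50 },
  { treatment_id: 1, name: "treatment 1", yield: 0.45 },
  { treatment_id: 2, name: "treatment 2", yield: 0.65 }
]

# DISTINCT ON-like behaviour: keep only the first row per treatment_id.
first_per_id = rows.uniq { |row| row[:treatment_id] }

# GROUP BY-like behaviour: gather all rows per treatment_id, then average.
avg_per_id = rows.group_by { |row| row[:treatment_id] }
                 .transform_values { |group| group.sum { |row| row[:yield] } / group.size }

first_per_id.each { |row| puts "first row for #{row[:treatment_id]}: #{row[:yield]}" }
avg_per_id.each   { |id, avg| puts "average for #{id}: #{avg.round(3)}" }
```

The first variant silently drops the second yield for treatment 1, while the second variant combines both values.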

The standard SQL way is to list both columns in the GROUP BY clause.

If treatment_id has a 1:1 relationship to treatment_name, then running the query without the AVG function (and without DISTINCT ON) would return data similar to:

|   treatment_id    |       name          |  yield    |  
------------------------------------------------------
|        1          |   treatment 1       |    0.50   |
|        1          |   treatment 1       |    0.45   |
|        2          |   treatment 2       |    0.65   |
|        2          |   treatment 2       |    0.66   |
|        3          |   treatment 3       |    0.85   |

Now, to use the average function you must aggregate by both treatment_id and treatment_name.

The reason you must specify both is that the database manager assumes the columns in the resulting data set are not related to each other. So, aggregating by both columns

SELECT treatment_selections.treatment_id, treatments.name, AVG(quality_data.yield) as yield
FROM "trials"
INNER JOIN "treatment_selections" ON "treatment_selections"."trial_id" = "trials"."id"
INNER JOIN "quality_data" ON "quality_data"."treatment_selection_id" = "treatment_selections"."id"
INNER JOIN "treatment_selections" "treatment_selections_trials" ON "treatment_selections_trials"."trial_id" = "trials"."id"
INNER JOIN "treatments" ON "treatments"."id" = "treatment_selections_trials"."treatment_id"
WHERE (EXTRACT(year from season_year) BETWEEN 2018 AND 2018)
GROUP BY treatment_selections.treatment_id, treatments.name

will give you the following result:

|   treatment_id    |       name          |   AVG(yield)   |  
------------------------------------------------------------
|        1          |   treatment 1       |      0.475     |
|        2          |   treatment 2       |      0.655     |
|        3          |   treatment 3       |      0.85      |
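The same aggregation can be sketched in plain Ruby, which may help when checking the expected numbers by hand. The rows are copied from the sample table of unaggregated data above. The helper name `average_yield_by` is hypothetical and is not part of Rails or ActiveRecord.

```ruby
# Rows copied from the sample table: [treatment_id, name, yield].
sample_rows = [
  [1, "treatment 1", 0.50],
  [1, "treatment 1", 0.45],
  [2, "treatment 2", 0.65],
  [2, "treatment 2", 0.66],
  [3, "treatment 3", 0.85]
]

# Hypothetical helper: group by the composite key [treatment_id, name],
# the in-memory counterpart of GROUP BY treatment_id, name.
def average_yield_by(rows)
  rows.group_by { |id, name, _yield| [id, name] }
      .map do |(id, name), group|
        average = group.sum { |row| row[2] } / group.size
        [id, name, average.round(3)]
      end
end

averages = average_yield_by(sample_rows)
averages.each { |id, name, avg| puts "#{id} | #{name} | #{avg}" }
```

Because each treatment_id maps to exactly one name here, grouping by the pair yields the same three groups as grouping by treatment_id alone would.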

To understand this better, suppose the data in the first two columns were not related, for example:

|   year    |       name          |   yield   |  
-----------------------------------------------
|    2000   |   treatment 1       |    0.1    |
|    2000   |   treatment 1       |    0.2    |
|    2000   |   treatment 2       |    0.3    |
|    2000   |   treatment 3       |    0.4    |
|    2001   |   treatment 2       |    0.5    |
|    2001   |   treatment 3       |    0.6    |
|    2002   |   treatment 3       |    0.7    |

You must still group by year and name and, in this case, the average is computed only over rows where both year and name are the same (it is not possible to do otherwise), resulting in:

|   year    |       name          |   AVG(yield)   |  
---------------------------------------------------
|    2000   |   treatment 1       |     0.15       |
|    2000   |   treatment 2       |     0.3        |
|    2000   |   treatment 3       |     0.4        |
|    2001   |   treatment 2       |     0.5        |
|    2001   |   treatment 3       |     0.6        |
|    2002   |   treatment 3       |     0.7        |
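One way to see why a non-aggregated column has to be part of the grouping is to group the year/name sample rows by year alone and look at how many different names fall into each group. This is a plain-Ruby sketch over those sample rows, not something the database executes, and the variable names are made up for illustration.

```ruby
# Rows copied from the year/name sample table: [year, name, yield].
rows = [
  [2000, "treatment 1", 0.1], [2000, "treatment 1", 0.2],
  [2000, "treatment 2", 0.3], [2000, "treatment 3", 0.4],
  [2001, "treatment 2", 0.5], [2001, "treatment 3", 0.6],
  [2002, "treatment 3", 0.7]
]

# Group by year only and collect the distinct names found in each group.
names_per_year = rows.group_by { |year, _name, _yield| year }
                     .transform_values { |group| group.map { |_year, name, _yield| name }.uniq }

names_per_year.each do |year, names|
  # More than one name means there is no single value to show for that year.
  status = names.size > 1 ? "ambiguous" : "unique"
  puts "#{year}: #{names.join(', ')} (#{status})"
end
```

For 2000 and 2001 several names share one group, so a query that selects name while grouping only by year has no single name to return. That is why the column has to be either grouped or aggregated.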
