Rails, Postgres: How to create a pivot table and LEFT JOIN it to another table without hard-coding the columns

Question

I am working with Rails and Postgres. I have a table, Problems, which has a few columns. I have another table, ExtraInfos, which references Problems and has three columns: problem_id, info_type, and info_value.

For example:

Problems:

id	problem_type	problem_group
0	type_x	grp_a
1	type_y	grp_b
2	type_z	grp_c

ExtraInfos:

id	problem_id	info_type:String	info_value
0	0	info_1	v1
1	0	info_2	v2
2	0	info_3	v3
3	1	info_1	v4
4	1	info_3	v5

As you can see, each problem has a variable number of extra-information rows.

What is the best way to join both tables to create something that looks like:

id	problem_type	problem_group	info_1	info_2	info_3
0	type_x	grp_a	v1	v2	v3
1	type_y	grp_b	v4		v5
2	type_z	grp_c
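
To make the target shape concrete, here is a minimal in-memory sketch in plain Ruby (no database involved) of a pivot followed by a left join. The helper name `pivot_left_join` and the hash layout are assumptions made for illustration only; this shows the desired result, not a recommended way to compute it at scale.

```ruby
# Hypothetical helper: pivots extra_infos by info_type and left-joins the
# result onto problems, so problems without extra info still appear.
def pivot_left_join(problems, extra_infos)
  # Column list is discovered from the data, not hard-coded.
  columns = extra_infos.map { |e| e[:info_type] }.uniq.sort
  by_problem = extra_infos.group_by { |e| e[:problem_id] }

  problems.map do |problem|
    infos = by_problem.fetch(problem[:id], [])
    values = infos.to_h { |e| [e[:info_type], e[:info_value]] }
    # Every discovered column is present on every row; missing ones are nil.
    problem.merge(columns.to_h { |c| [c, values[c]] })
  end
end

problems = [
  { id: 0, problem_type: "type_x", problem_group: "grp_a" },
  { id: 1, problem_type: "type_y", problem_group: "grp_b" },
  { id: 2, problem_type: "type_z", problem_group: "grp_c" }
]
extra_infos = [
  { problem_id: 0, info_type: "info_1", info_value: "v1" },
  { problem_id: 0, info_type: "info_2", info_value: "v2" },
  { problem_id: 0, info_type: "info_3", info_value: "v3" },
  { problem_id: 1, info_type: "info_1", info_value: "v4" },
  { problem_id: 1, info_type: "info_3", info_value: "v5" }
]

pivot_left_join(problems, extra_infos).each { |row| puts row.inspect }
```

Problem 2 has no extra info, so it comes back with nil in every pivoted column, which is the behavior a LEFT JOIN would give.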

I used the ruby pivot_table gem, and I did manage to create the view that I wanted, by

@table = PivotTable::Grid.new do |g|
  g.source_data  = ExtraInfos.all.includes(:problem)
  g.column_name  = :info_type
  g.row_name     = :problem
  g.field_name   = :info_value
end
@table.build

and then iterating over it by

...
<% @table.columns.each do |col| %>
  <th><%= col.header %></th>
<% end %>
...
<% if @table.row_headers.include? problem %>
  <% @table.rows[@table.row_headers.index(problem)].data.each do |cell| %>
    <td><%= cell %></td>
  <% end %>
<% end %>
...

but this is very clunky and doesn't leave me with good ways to, for instance, sort by these extra columns. As far as I know, the result is simply a grid object and can't be LEFT JOINed with my Problems.all relation, which would be the ideal solution.

I have tried looking up various pure SQL methods, but all seem to start with the assumption that these extra columns will be hard-coded, which is what I am trying to avoid. I came across crosstab, but I haven't managed to get it working as it should.

sql = "CREATE EXTENSION IF NOT EXISTS tablefunc;
    SELECT * FROM crosstab(
      'SELECT problem_id, info_type, info_value
      FROM pre_maslas
      ORDER BY 1,2'
    ) AS ct(problem_id bigint, info_type varchar(255), info_value varchar(255))"

@try = ActiveRecord::Base.connection.execute(sql)

This gives me the result {"problem_id"=>44, "info_type"=>"6", "info_value"=>"15"} {"problem_id"=>45, "info_type"=>"6", "info_value"=>"15"} which is clearly not correct.

Another method seems to be creating a separate reference table containing a list of all possible infoTypes, which will then be referenced by the ExtraInfos table, making it easier to join the tables. However, I don't want the infoTypes coded in at all. I want the user to be able to give me any type and value strings, and my tables should be able to deal with this.

What is the best solution for accomplishing this?

Answer 1

ActiveRecord is built on top of the AST query assembler Arel.

You can use this assembler to build dynamic queries as needed; basically, if you can hand-type it as a SQL query, Arel can build it.

In this case, the following will build your desired crosstab query based on the table structure provided in the post.

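# Requires the tablefunc extension (CREATE EXTENSION IF NOT EXISTS tablefunc)
conn = ExtraInfo.connection

# Get all distinct info_types to build columns; ordered by the database so the
# order matches the category query used inside crosstab
cols = ExtraInfo.distinct.order(:info_type).pluck(:info_type)
# extra_info Arel::Table
extra_infos_tbl = ExtraInfo.arel_table
# Arel::Table to use for querying
tbl = Arel::Table.new('ct')

# SQL data type for the pivoted columns (they hold info_value)
info_value_sql_type = ExtraInfo.columns_hash['info_value'].sql_type

# Part 1 of crosstab: source rows, ordered so each problem's rows are contiguous
qry_txt = extra_infos_tbl.project(
  extra_infos_tbl[:problem_id],
  extra_infos_tbl[:info_type],
  extra_infos_tbl[:info_value]
).order(extra_infos_tbl[:problem_id], extra_infos_tbl[:info_type])
# Part 2 of the crosstab: categories, in the same order as cols
cats = extra_infos_tbl.project(extra_infos_tbl[:info_type]).distinct
                      .order(extra_infos_tbl[:info_type])

# construct the ct portion of the crosstab query
# (column names are quoted because info_type values are user-supplied)
ct = Arel::Nodes::NamedFunction.new('ct', [
  Arel::Nodes::TableAlias.new(Arel.sql('problem_id'), Arel.sql('bigint')),
  *cols.map { |name| Arel::Nodes::TableAlias.new(Arel.sql(conn.quote_column_name(name)), Arel.sql(info_value_sql_type)) }
])

# build the crosstab(...) AS ct(...) statement;
# conn.quote wraps each inner query in a properly escaped string literal
crosstab = Arel::Nodes::As.new(
  Arel::Nodes::NamedFunction.new('crosstab', [Arel.sql(conn.quote(qry_txt.to_sql)),
    Arel.sql(conn.quote(cats.to_sql))]),
  ct
)

# final query construction
q = tbl.project(tbl[Arel.star]).from(crosstab)

Using this, q.to_sql will produce SQL equivalent to the following (reformatted here, with identifier quoting omitted for readability):

SELECT 
  ct.* 
FROM 
  crosstab('SELECT 
              extra_infos.problem_id, 
              extra_infos.info_type, 
              extra_infos.info_value 
            FROM 
              extra_infos 
            ORDER BY 
              extra_infos.problem_id, 
              extra_infos.info_type', 
           'SELECT DISTINCT 
              extra_infos.info_type 
            FROM 
              extra_infos 
            ORDER BY 
              extra_infos.info_type') AS ct(problem_id bigint, 
                                            info_1 varchar(255), 
                                            info_2 varchar(255), 
                                            info_3 varchar(255))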

And results in

problem_id	info_1	info_2	info_3
0	v1	v2	v3
1	v4		v5
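
The two-argument form of crosstab places each value in the output column whose position matches the category's position in the second query's result, and leaves NULL where a row has no value for a category. A rough plain-Ruby model of that behavior is sketched below; `crosstab_rows` is a hypothetical name (not part of tablefunc), and the input is assumed to be already sorted by row name.

```ruby
# Rough model of crosstab(source_sql, category_sql) in plain Ruby.
# `source` is an array of [row_name, category, value] triples sorted by
# row_name; `categories` is the ordered result of the category query.
def crosstab_rows(source, categories)
  position = categories.each_with_index.to_h # category => column index
  source.chunk_while { |a, b| a[0] == b[0] }.map do |group|
    row = Array.new(categories.size) # nil plays the role of SQL NULL
    group.each do |_row_name, category, value|
      idx = position[category]
      row[idx] = value if idx # categories missing from the list are skipped
    end
    [group.first[0], *row]
  end
end

source = [
  [0, "info_1", "v1"], [0, "info_2", "v2"], [0, "info_3", "v3"],
  [1, "info_1", "v4"], [1, "info_3", "v5"]
]
p crosstab_rows(source, %w[info_1 info_2 info_3])
# → [[0, "v1", "v2", "v3"], [1, "v4", nil, "v5"]]
```

Because values are matched to columns purely by position, the category list and the column definition list in ct(...) need to be in the same order.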

We can join this to the problems table as

sub = Arel::Table.new('subq')
sub_q = Arel::Nodes::As.new(q, Arel.sql(sub.name))

# LEFT OUTER JOIN so problems without any extra info are still returned
out = Problem
  .joins(Arel::Nodes::OuterJoin.new(sub_q,
            Arel::Nodes::On.new(Problem.arel_table[:id].eq(sub[:problem_id]))
  )).select(
     Problem.arel_table[Arel.star],
     *cols.map { |c| sub[c.intern] }
  )

This will return Problem objects where the info_type columns are virtual attributes, e.g. out.first.info_1 #=> 'v1'

Note: Personally, I would break the parts down into a class to make the assembly clearer, but the above will produce the desired outcome.
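
As an alternative that avoids the tablefunc extension, the same dynamic-column idea can be expressed with one filtered aggregate per info type. This is a different technique from the crosstab approach above, sketched here as plain Ruby string building. `quote_ident`, `quote_literal`, and `pivot_sql` are hypothetical helpers (inside Rails, `connection.quote_column_name` and `connection.quote` would be the natural replacements), the table and column names are assumed from the question, and the query assumes problems.id is the primary key.

```ruby
# Hypothetical stand-ins for the adapter's quoting methods.
def quote_ident(name)
  '"' + name.gsub('"', '""') + '"'
end

def quote_literal(value)
  "'" + value.gsub("'", "''") + "'"
end

# Builds a LEFT JOIN pivot query with one filtered aggregate per info type.
# `info_types` would come from something like ExtraInfo.distinct.pluck(:info_type).
def pivot_sql(info_types)
  pivots = info_types.sort.map do |type|
    "MAX(ei.info_value) FILTER (WHERE ei.info_type = #{quote_literal(type)}) AS #{quote_ident(type)}"
  end
  select_list = ["p.*", *pivots].join(",\n       ")
  <<~SQL
    SELECT #{select_list}
    FROM problems AS p
    LEFT JOIN extra_infos AS ei ON ei.problem_id = p.id
    GROUP BY p.id
    ORDER BY p.id
  SQL
end

puts pivot_sql(%w[info_1 info_2 info_3])
```

The generated string could then be run with something like Problem.find_by_sql, and sorting by a pivoted column becomes a matter of changing the ORDER BY clause. The trade-off is that the list of info types still has to be fetched in a separate query before the SQL is built.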

Answer 2

In Postgres, a pivot table or crosstab is a poor fit when the list of columns may vary over time, i.e. when the set of values in the info_type column may grow or shrink.

There is another solution, which consists of creating a composite type dynamically and then using the standard functions jsonb_object_agg and jsonb_populate_record:

Creating the composite type column_list dynamically:

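CREATE OR REPLACE PROCEDURE column_list() LANGUAGE plpgsql AS $$
DECLARE
  clist text ;
BEGIN
  -- One "name text" field per distinct info_type; quote_ident keeps
  -- arbitrary user-supplied values valid as field names
  SELECT string_agg(DISTINCT quote_ident(info_type) || ' text', ',')
    INTO clist
    FROM ExtraInfos ;

  -- Rebuild the composite type from scratch
  EXECUTE 'DROP TYPE IF EXISTS column_list' ;
  EXECUTE 'CREATE TYPE column_list AS (' || clist || ')' ;
END ; $$ ;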

Then setting up the composite type column_list for the first time:

CALL column_list() ;

But this composite type must be updated after every change to the info_type values in ExtraInfos. This can be achieved with a trigger function:

CREATE OR REPLACE FUNCTION After_Insert_Update_Delete_ExtraInfos () RETURNS trigger LANGUAGE plpgsql AS $$
BEGIN
  CALL column_list() ;
  RETURN NULL ;
END ; $$ ;

CREATE OR REPLACE TRIGGER After_Insert_Update_Delete_ExtraInfos AFTER INSERT OR UPDATE OF info_type OR DELETE ON ExtraInfos
FOR EACH STATEMENT EXECUTE FUNCTION After_Insert_Update_Delete_ExtraInfos () ;

The final query is:

SELECT p.id, p.problem_type, p.problem_group, (jsonb_populate_record(NULL :: column_list, jsonb_object_agg(info_type, info_value))).*
  FROM Problems AS p
 INNER JOIN ExtraInfos AS ei
    ON ei.problem_id = p.id
 GROUP BY p.id, p.problem_type, p.problem_group

which gives the result:

id	problem_type	problem_group	info_1	info_2	info_3
0	type_x	grp_a	v1	v2	v3
1	type_y	grp_b	v4	null	v5

See the test result in dbfiddle.
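
The final query above does two things per problem: jsonb_object_agg folds the (info_type, info_value) pairs into a single JSON object, and jsonb_populate_record spreads that object over the fields of column_list, leaving NULL for any field without a matching key. A loose plain-Ruby analogy is sketched below; `populate_by_problem` is a hypothetical name, and hashes stand in for the JSON objects.

```ruby
# Loose analogy for the GROUP BY + jsonb_object_agg + jsonb_populate_record
# combination. `rows` are the joined extra-info rows; `fields` plays the
# role of the composite type's ordered field list.
def populate_by_problem(rows, fields)
  rows.group_by { |r| r[:problem_id] }.map do |problem_id, group|
    # ~ jsonb_object_agg(info_type, info_value)
    object = group.to_h { |r| [r[:info_type], r[:info_value]] }
    # ~ (jsonb_populate_record(NULL::column_list, object)).*
    [problem_id, *fields.map { |f| object[f] }]
  end
end

rows = [
  { problem_id: 0, info_type: "info_1", info_value: "v1" },
  { problem_id: 0, info_type: "info_2", info_value: "v2" },
  { problem_id: 0, info_type: "info_3", info_value: "v3" },
  { problem_id: 1, info_type: "info_1", info_value: "v4" },
  { problem_id: 1, info_type: "info_3", info_value: "v5" }
]
p populate_by_problem(rows, %w[info_1 info_2 info_3])
# → [[0, "v1", "v2", "v3"], [1, "v4", nil, "v5"]]
```

As with the INNER JOIN in the query, a problem that has no extra-info rows never forms a group here, so it is absent from the output.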
