Rails, Postgres: How to create a pivot table and LEFT JOIN it to another table without hard-coding the columns

Question

I am working in Rails and Postgres. I have a table, Problems, which has a few columns. I have another table, ExtraInfos, which references Problems and has three columns: problem_id, info_type, info_value.

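For example:

Problems:

id	problem_type	problem_group
0	type_x	grp_a
1	type_y	grp_b
2	type_z	grp_c

ExtraInfos:

id	problem_id	info_type:String	info_value
0	0	info_1	v1
1	0	info_2	v2
2	0	info_3	v3
3	1	info_1	v4
4	1	info_3	v5

As you can see, each problem has a variable number of extra infos.

What is the best way to join the two tables to create something like this:

id	problem_type	problem_group	info_1	info_2	info_3
0	type_x	grp_a	v1	v2	v3
1	type_y	grp_b	v4		v5
2	type_z	grp_c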

I used the Ruby pivot_table gem, and I did manage to create the view I wanted, with

@table = PivotTable::Grid.new do |g|
  g.source_data  = ExtraInfos.all.includes(:problem)
  g.column_name  = :info_type
  g.row_name     = :problem
  g.field_name   = :info_value
end
@table.build

and then iterating over it

...
<% @table.columns.each do |col| %>
  <th><%= col.header %></th>
<% end %>
...
<% if @table.row_headers.include? problem %>
  <% @table.rows[@table.row_headers.index(problem)].data.each do |cell| %>
    <td><%= cell %></td>
  <% end %>
<% end %>
...

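But this is very clunky and leaves me no good way to, for example, sort by these extra columns. As far as I can tell, the table is just a grid, an object, and cannot be LEFT JOINed with my Problems.all table, which would be the ideal solution.

I have tried looking up various pure SQL approaches, but all of them seem to start from the assumption that these extra columns will be hard-coded, which is what I am trying to avoid. I came across crosstab, but I have not managed to get it working properly.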

sql = "CREATE EXTENSION IF NOT EXISTS tablefunc;
    SELECT * FROM crosstab(
      'SELECT problem_id, info_type, info_value
      FROM pre_maslas
      ORDER BY 1,2'
    ) AS ct(problem_id bigint, info_type varchar(255), info_value varchar(255))"

@try = ActiveRecord::Base.connection.execute(sql)

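This gives me the result {"problem_id"=>44, "info_type"=>"6", "info_value"=>"15"} {"problem_id"=>45, "info_type"=>"6", "info_value"=>"15"}, which is obviously incorrect.

Another approach seems to be creating a separate reference table containing a list of all possible info types, which would then be referenced by the ExtraInfos table, making it easier to join the tables. However, I do not want to hard-code the info types at all. I want users to be able to give me any type and value strings, and my tables should be able to handle that.

What is the best solution for achieving this?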

Answer 1

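ActiveRecord is built on top of Arel, an AST query assembler.

You can essentially use this assembler to build dynamic queries as needed: if you can type it by hand as a SQL query, Arel can build it.

In this case, the following builds the crosstab query you want, based on the table structure provided in the post.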

# Get all distinct info_types to build columns
cols = ExtraInfo.distinct.pluck(:info_type).sort
# extra_info Arel::Table
extra_infos_tbl = ExtraInfo.arel_table
# Arel::Table to use for querying 
tbl = Arel::Table.new('ct')

# SQL data type for the extra_infos.info_type column 
info_type_sql_type = ExtraInfo.columns.find {|c| c.name == 'info_type' }&.sql_type

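# Part 1 of crosstab (source query)
# NOTE: the tablefunc docs recommend that this query specify ORDER BY 1,
# e.g. by appending .order(extra_infos_tbl[:problem_id])
qry_txt = extra_infos_tbl.project(
  extra_infos_tbl[:problem_id],
  extra_infos_tbl[:info_type],
  extra_infos_tbl[:info_value]
)
# Part 2 of the crosstab (category query)
# NOTE: categories must come back in the same order as the ct(...) columns
# built from cols below; consider .order(extra_infos_tbl[:info_type])
cats = extra_infos_tbl.project(extra_infos_tbl[:info_type]).distinct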

# construct the ct portion of the crosstab query
ct = Arel::Nodes::NamedFunction.new('ct',[
  Arel::Nodes::TableAlias.new(Arel.sql('problem_id'), Arel.sql('bigint')),
  *cols.map {|name|  Arel::Nodes::TableAlias.new(Arel.sql(name), Arel.sql(info_type_sql_type))}
])

# build the crosstab(...) AS ct(...) statement
crosstab = Arel::Nodes::As.new(
  Arel::Nodes::NamedFunction.new('crosstab', [Arel.sql("'#{qry_txt.to_sql}'"),
    Arel.sql("'#{cats.to_sql}'")]),
  ct
)

# final query construction
q = tbl.project(tbl[Arel.star]).from(crosstab)

Calling q.to_sql on this produces:

SELECT 
  ct.* 
FROM 
  crosstab('SELECT 
              extra_infos.problem_id, 
              extra_infos.info_type, 
              extra_infos.info_value 
            FROM 
              extra_infos', 
           'SELECT DISTINCT 
              extra_infos.info_type 
            FROM 
              extra_infos') AS ct(problem_id bigint, 
                                  info_1 varchar(255), 
                                  info_2 varchar(255), 
                                  info_3 varchar(255))

and results in

problem_id	info_1	info_2	info_3
0	v1	v2	v3
1	v4		v5

We can then join this to the problems table:

sub = Arel::Table.new('subq')
sub_q = Arel::Nodes::As.new(q,Arel.sql(sub.name)) 

out = Problem
  .joins(Arel::Nodes::InnerJoin.new(sub_q,            
            Arel::Nodes::On.new(Problem.arel_table[:id].eq(sub[:problem_id]))
  )).select(
     Problem.arel_table[Arel.star],
     *cols.map {|c| sub[c.intern]}
  )

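This returns Problem objects in which the info_type columns are virtual attributes, e.g. out.first.info_1 #=> 'v1'. Because the join above is an InnerJoin, problems without any extra info are left out; swapping in Arel::Nodes::OuterJoin produces a LEFT OUTER JOIN that keeps them.

Note: Personally I would break the parts out into a class to make the assembly clearer, but the above produces the desired result.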
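For readers who find the Arel assembly hard to follow, the same idea can be sketched as plain string building. This is only an illustration, not the answer's code: the helper names (quote_ident, quote_literal, pivot_left_join_sql), the default value type of text, and the use of a VALUES list as the category query are all assumptions. It orders the source query and pins the category order to the column definitions, and it uses a LEFT JOIN so problems without extra info are kept.

```ruby
# Quote an identifier for PostgreSQL: wrap it in double quotes and double
# any embedded double quotes. (Hypothetical helper for this sketch.)
def quote_ident(name)
  %("#{name.to_s.gsub('"', '""')}")
end

# Quote a string literal for PostgreSQL by doubling single quotes.
# Assumes standard_conforming_strings is on (the PostgreSQL default).
def quote_literal(text)
  "'#{text.to_s.gsub("'", "''")}'"
end

# Build a LEFT JOIN from problems onto a crosstab of extra_infos.
# info_types: the distinct info_type values, e.g. ExtraInfo.distinct.pluck(:info_type)
# value_type: trusted SQL type name for the pivoted columns (an assumption here)
def pivot_left_join_sql(info_types, value_type: 'text')
  cols = info_types.uniq.sort
  raise ArgumentError, 'at least one info type is required' if cols.empty?

  # Source query: ordered by the row key so rows of one problem stay together.
  source_sql = 'SELECT problem_id, info_type, info_value FROM extra_infos ORDER BY 1'
  # Category query: a VALUES list in exactly the order of the column definitions.
  category_sql = "VALUES #{cols.map { |c| "(#{quote_literal(c)})" }.join(', ')}"

  col_defs = ['problem_id bigint', *cols.map { |c| "#{quote_ident(c)} #{value_type}" }].join(', ')
  select_list = ['problems.*', *cols.map { |c| "ct.#{quote_ident(c)}" }].join(', ')

  <<~SQL
    SELECT #{select_list}
    FROM problems
    LEFT JOIN crosstab(#{quote_literal(source_sql)}, #{quote_literal(category_sql)})
      AS ct(#{col_defs}) ON ct.problem_id = problems.id
    ORDER BY problems.id
  SQL
end
```

Something like Problem.find_by_sql(pivot_left_join_sql(ExtraInfo.distinct.pluck(:info_type))) could then load the rows, assuming the tablefunc extension is installed. Because user-supplied info types end up as identifiers and literals, the quoting helpers matter; value_type is interpolated as-is and should never come from user input.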

Answer 2

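In Postgres, pivot tables or crosstab are not relevant when the list of columns may vary over time, i.e. when the list of values in the info_type column may grow or shrink.

There is another solution, which consists of dynamically creating a composite type and then using the standard functions jsonb_object_agg and jsonb_populate_record:

Dynamically creating the composite type column_list: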

CREATE OR REPLACE PROCEDURE column_list() LANGUAGE plpgsql AS $$
DECLARE
  clist text ;
BEGIN
  SELECT string_agg(DISTINCT info_type || ' text', ',')
    INTO clist
    FROM ExtraInfos ;
 
  EXECUTE 'DROP TYPE IF EXISTS column_list' ;
  EXECUTE 'CREATE TYPE column_list AS (' || clist || ')' ;
END ; $$ ;

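Then set up the composite type column_list for the first time:

CALL column_list() ;

But this composite type must be updated after every change to the info_type values in ExtraInfos. This can be achieved with a trigger function (note that CREATE OR REPLACE TRIGGER requires PostgreSQL 14 or later):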

CREATE OR REPLACE FUNCTION After_Insert_Update_Delete_ExtraInfos () RETURNS trigger LANGUAGE plpgsql AS $$
BEGIN
  CALL column_list() ;
  RETURN NULL ;
END ; $$ ;

CREATE OR REPLACE TRIGGER After_Insert_Update_Delete_ExtraInfos AFTER INSERT OR UPDATE OF info_type OR DELETE ON ExtraInfos
FOR EACH STATEMENT EXECUTE FUNCTION After_Insert_Update_Delete_ExtraInfos () ;

The final query is:

SELECT p.id, p.problem_type, p.problem_group, (jsonb_populate_record(NULL :: column_list, jsonb_object_agg(info_type, info_value))).*
  FROM Problems AS p
 INNER JOIN ExtraInfos AS ei
    ON ei.problem_id = p.id
 GROUP BY p.id, p.problem_type, p.problem_group

The result is:

id	problem_type	problem_group	info_1	info_2	info_3
0	type_x	grp_a	v1	v2	v3
1	type_y	grp_b	v4	null	v5

See the test results in dbfiddle.
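If maintaining a composite type and a trigger feels too heavy, one possible variation (not part of this answer) is to stop at jsonb_object_agg and expand the keys on the Rails side. The sketch below assumes the query in EXTRAS_SQL, the extras column alias, the problems and extra_infos table names, and a driver that returns the jsonb column as JSON text; expand_extras is a hypothetical helper.

```ruby
require 'json'

# Assumed query: one row per problem, with all extra infos folded into a
# single jsonb object. The FILTER clause keeps the aggregate from seeing the
# NULL info_type that the LEFT JOIN produces for problems with no extras.
EXTRAS_SQL = <<~SQL
  SELECT p.id, p.problem_type, p.problem_group,
         jsonb_object_agg(ei.info_type, ei.info_value)
           FILTER (WHERE ei.info_type IS NOT NULL) AS extras
  FROM problems AS p
  LEFT JOIN extra_infos AS ei ON ei.problem_id = p.id
  GROUP BY p.id, p.problem_type, p.problem_group
SQL

# Expand the JSON object in each row into one key per info type.
# rows: array of hashes with string keys, where row[json_key] is JSON text or nil.
def expand_extras(rows, json_key: 'extras')
  parsed = rows.map { |row| [row, row[json_key] ? JSON.parse(row[json_key]) : {}] }
  # The union of all keys becomes the dynamic column list.
  columns = parsed.flat_map { |_, extras| extras.keys }.uniq.sort
  parsed.map do |row, extras|
    base = row.reject { |key, _| key == json_key }
    # Missing info types are filled with nil so every row has the same keys.
    columns.each_with_object(base) { |col, out| out[col] = extras[col] }
  end
end
```

With ActiveRecord this might be fed from ActiveRecord::Base.connection.select_all(EXTRAS_SQL).to_a, although depending on the adapter's type casting the extras value may already arrive as a Hash, in which case the JSON.parse step would need to be skipped. The trade-off is that sorting by the extra columns then happens in Ruby rather than in SQL.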
