
SQL Transpose Rows as Columns

I have an interesting conundrum which I believe can be solved in pure SQL. I have tables similar to the following:

responses:

user_id | question_id | body
----------------------------
1       | 1           | Yes
2       | 1           | Yes
1       | 2           | Yes
2       | 2           | No
1       | 3           | No
2       | 3           | No


questions:

id | body
-------------------------
1 | Do you like apples?
2 | Do you like oranges?
3 | Do you like carrots?

and I would like to get the following output:

user_id | Do you like apples? | Do you like oranges? | Do you like carrots?
---------------------------------------------------------------------------
1       | Yes                 | Yes                  | No
2       | Yes                 | No                   | No

I don't know how many questions there will be, and they will be dynamic, so I can't just code for every question. I am using PostgreSQL and I believe this is called transposition, but I can't seem to find anything that describes the standard way of doing this in SQL. I remember doing this in my database class back in college, but it was in MySQL and I honestly don't remember how we did it.

I'm assuming it will be a combination of joins and a GROUP BY statement, but I can't even figure out how to start.

Does anybody know how to do this? Thanks very much!

Edit 1: I found some information about using a crosstab, which seems to be what I want, but I'm having trouble making sense of it. Links to better articles would be greatly appreciated!

Use:

  SELECT r.user_id,
         MAX(CASE WHEN r.question_id = 1 THEN r.body ELSE NULL END) AS "Do you like apples?",
         MAX(CASE WHEN r.question_id = 2 THEN r.body ELSE NULL END) AS "Do you like oranges?",
         MAX(CASE WHEN r.question_id = 3 THEN r.body ELSE NULL END) AS "Do you like carrots?"
    FROM RESPONSES r
    JOIN QUESTIONS q ON q.id = r.question_id
GROUP BY r.user_id

This is a standard pivot query, because you are "pivoting" the data from rows to columns.
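Because the questions are not known in advance, the column list in a pivot query like this cannot be written by hand. One possible way around that, sketched here as an assumption rather than as part of the answer above, is to let PostgreSQL build the query text from the questions table with string_agg and format. The alias pivot_sql is made up for illustration.

```sql
-- Generate the text of a pivot query with one MAX(CASE ...) column per question.
-- %s inserts the question id as a plain value; %I quotes the question text
-- so that it can be used as a column name.
SELECT 'SELECT r.user_id, '
    || string_agg(
           format('MAX(CASE WHEN r.question_id = %s THEN r.body END) AS %I',
                  q.id, q.body),
           ', ' ORDER BY q.id)
    || ' FROM responses r GROUP BY r.user_id ORDER BY r.user_id'
       AS pivot_sql
FROM questions q;
```

This returns a single row containing the SQL text, which then has to be executed as a second step, either by the application or, in psql 9.6 or later, by ending the statement with \gexec instead of a semicolon. Column names longer than 63 bytes are truncated by PostgreSQL, so very long question texts may need shorter aliases.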

I implemented a truly dynamic function to handle this problem without having to hard-code any specific class of answers or use external modules/extensions. It also gives full control over column ordering and supports multiple key and class/attribute columns.

You can find it here: https://github.com/jumpstarter-io/colpivot

Example that solves this particular problem:

begin;

create temporary table responses (
    user_id integer,
    question_id integer,
    body text
) on commit drop;

create temporary table questions (
    id integer,
    body text
) on commit drop;

insert into responses values (1,1,'Yes'), (2,1,'Yes'), (1,2,'Yes'), (2,2,'No'), (1,3,'No'), (2,3,'No');
insert into questions values (1, 'Do you like apples?'), (2, 'Do you like oranges?'), (3, 'Do you like carrots?');

select colpivot('_output', $$
    select r.user_id, q.body q, r.body a from responses r
        join questions q on q.id = r.question_id
$$, array['user_id'], array['q'], '#.a', null);

select * from _output;

rollback;

This outputs:

 user_id | 'Do you like apples?' | 'Do you like carrots?' | 'Do you like oranges?' 
---------+-----------------------+------------------------+------------------------
       1 | Yes                   | No                     | Yes
       2 | Yes                   | No                     | No

You can solve this example with the crosstab function in this way:

drop table if exists responses;
create table responses (
user_id integer,
question_id integer,
body text
);

drop table if exists questions;
create table questions (
id integer,
body text
);

insert into responses values (1,1,'Yes'), (2,1,'Yes'), (1,2,'Yes'), (2,2,'No'), (1,3,'No'), (2,3,'No');
insert into questions values (1, 'Do you like apples?'), (2, 'Do you like oranges?'), (3, 'Do you like carrots?');

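-- The source query must be ordered by row name and then by category,
-- otherwise answers can land in the wrong column.
select *
from crosstab(
    'select responses.user_id, questions.body, responses.body
       from responses
       join questions on questions.id = responses.question_id
      order by responses.user_id, questions.id'
) as ct(userid integer, "Do you like apples?" text, "Do you like oranges?" text, "Do you like carrots?" text);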

First, you must install the tablefunc extension. Since version 9.1 you can do this with CREATE EXTENSION:

CREATE EXTENSION tablefunc;
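The one-argument form of crosstab fills the value columns from left to right, so a user who skipped a question would have the later answers shifted into the wrong columns. tablefunc also provides a two-argument form, crosstab(source_sql, category_sql), which places each value under its matching category. A minimal sketch against the responses and questions tables from the question (the output column names are chosen here for illustration):

```sql
-- Two-argument crosstab: the second query lists the categories in column order.
-- A user with no answer for a question gets NULL in that column.
SELECT *
FROM crosstab(
    $$ SELECT r.user_id, r.question_id, r.body
         FROM responses r
        ORDER BY 1 $$,
    $$ SELECT id FROM questions ORDER BY id $$
) AS ct(user_id integer,
        "Do you like apples?" text,
        "Do you like oranges?" text,
        "Do you like carrots?" text);
```

The column definition list still has to be written out, and the number of value columns in it must match the number of rows returned by the category query, so this form fixes the alignment problem but does not by itself make the columns dynamic.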

I wrote a function to generate the dynamic query. It generates the SQL for the crosstab and creates a view (dropping it first if it exists). You can then select from the view to get your results.

Here is the function:

CREATE OR REPLACE FUNCTION public.c_crosstab (
  eavsql_inarg varchar,
  resview varchar,
  rowid varchar,
  colid varchar,
  val varchar,
  agr varchar
)
RETURNS void AS
$body$
DECLARE
    casesql varchar;
    dynsql varchar;
    r record;
BEGIN
    dynsql := '';

    -- Drop the result view first if it already exists.
    FOR r IN
        SELECT * FROM pg_views WHERE lower(viewname) = lower(resview)
    LOOP
        EXECUTE 'DROP VIEW ' || resview;
    END LOOP;

    -- Collect the distinct attribute values; each becomes one output column.
    casesql := 'SELECT DISTINCT ' || colid || ' AS v FROM (' || eavsql_inarg || ') eav ORDER BY ' || colid;
    FOR r IN EXECUTE casesql LOOP
        -- quote_literal/quote_ident keep values with quotes or spaces from breaking the SQL.
        dynsql := dynsql || ', ' || agr || '(CASE WHEN ' || colid || ' = ' || quote_literal(r.v)
               || ' THEN ' || val || ' ELSE NULL END) AS ' || quote_ident(agr || '_' || r.v);
    END LOOP;

    -- Create the view: one row per rowid, one aggregated column per attribute value.
    dynsql := 'CREATE VIEW ' || resview || ' AS SELECT ' || rowid || dynsql || ' FROM (' || eavsql_inarg || ') eav GROUP BY ' || rowid;
    RAISE NOTICE 'dynsql %', dynsql;
    EXECUTE dynsql;
END;
$body$
LANGUAGE plpgsql
VOLATILE
CALLED ON NULL INPUT
SECURITY INVOKER
COST 100;

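And here is how I use it. The last argument names the aggregate to apply; note that first is not a built-in PostgreSQL aggregate, so it has to be defined separately, or a built-in such as max can be passed instead: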

SELECT c_crosstab('query_txt', 'view_name', 'entity_column_name', 'attribute_column_name', 'value_column_name', 'first');

Example: first you run:

SELECT c_crosstab('Select * from table', 'ct_view', 'usr_id', 'question_id', 'response_value', 'first');

Then:

Select * from ct_view;
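Applied to the tables from the question, a call might look like the sketch below. It assumes the c_crosstab function above has been created and that responses exists as shown in the question; max is used as the aggregate because it is built in, and the pivot column is question_id rather than the question text so that the generated column names stay simple identifiers.

```sql
-- Build the view: one row per user_id, one column per distinct question_id.
SELECT c_crosstab(
    'SELECT user_id, question_id, body FROM responses',  -- source query
    'ct_view',       -- name of the view to (re)create
    'user_id',       -- row identifier column
    'question_id',   -- column whose distinct values become output columns
    'body',          -- value column
    'max'            -- aggregate applied to the values
);

-- The view has the columns user_id, max_1, max_2 and max_3.
SELECT * FROM ct_view ORDER BY user_id;
```

Because the result is an ordinary view, it reflects the set of questions at the time c_crosstab was called; the function has to be run again after new question_id values appear.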

There is an example of this in contrib/tablefunc/.
