Combining multiple rows into one
I have a database structure in PostgreSQL that looks something like this:
DROP TABLE IF EXISTS medium CASCADE;
DROP TABLE IF EXISTS works CASCADE;
DROP DOMAIN IF EXISTS nameVal CASCADE;
DROP DOMAIN IF EXISTS numID CASCADE;
DROP DOMAIN IF EXISTS alphaID CASCADE;
CREATE DOMAIN alphaID AS VARCHAR(10);
CREATE DOMAIN numID AS INT;
CREATE DOMAIN nameVal AS VARCHAR(40);
CREATE TABLE works (
w_alphaID alphaID NOT NULL,
w_numID numID NOT NULL,
w_title nameVal NOT NULL,
PRIMARY KEY(w_alphaID,w_numID));
CREATE TABLE medium (
m_alphaID alphaID NOT NULL,
m_numID numID NOT NULL,
m_title nameVal NOT NULL,
FOREIGN KEY(m_alphaID,m_numID) REFERENCES
works ON UPDATE CASCADE ON DELETE CASCADE);
INSERT INTO works VALUES('AB',1,'Sunset'),
('CD',2,'Beach'),
('EF',3,'Flower');
INSERT INTO medium VALUES('AB',1,'Wood'),
('AB',1,'Oil'),
('CD',2,'Canvas'),
('CD',2,'Oil'),
('CD',2,'Bronze'),
('EF',3,'Paper'),
('EF',3,'Pencil');
SELECT * FROM works;
SELECT * FROM medium;
SELECT w_alphaID AS alphaID, w_numID AS numID, w_title AS
Name_of_work, m_title AS Material_used
FROM works, medium WHERE
works.w_alphaID = medium.m_alphaID
AND works.w_numID = medium.m_numID;
The output looks something like this:
w_alphaid | w_numid | w_title
-----------+---------+---------
AB | 1 | Sunset
CD | 2 | Beach
EF | 3 | Flower
(3 rows)
m_alphaid | m_numid | m_title
-----------+---------+---------
AB | 1 | Wood
AB | 1 | Oil
CD | 2 | Canvas
CD | 2 | Oil
CD | 2 | Bronze
EF | 3 | Paper
EF | 3 | Pencil
(7 rows)
alphaid | numid | name_of_work | material_used
---------+-------+--------------+---------------
AB | 1 | Sunset | Wood
AB | 1 | Sunset | Oil
CD | 2 | Beach | Canvas
CD | 2 | Beach | Oil
CD | 2 | Beach | Bronze
EF | 3 | Flower | Paper
EF | 3 | Flower | Pencil
(7 rows)
Now my question is: what query should I use to make the output of the last SELECT statement look something like this:
alphaid | numid | name_of_work | material_used_1 | material_used_2 | material_used_3
---------+-------+--------------+-----------------+-----------------+---------------
AB | 1 | Sunset | Wood | Oil |
CD | 2 | Beach | Canvas | Oil | Bronze
EF | 3 | Flower | Paper | Pencil |
(3 rows)
I looked into using string_agg(), but that puts all the values into one cell, and I am looking to have a separate cell for each value. I tried using a join to see if I could achieve such output, but with no success so far. I appreciate you taking the time to look at this question.
You can use string_agg() in a subquery and then break the string into separate columns. See also this question on how to split a string into columns.
SELECT alphaID, numID, Name_of_Work
,split_part(Material_used, ',', 1) AS Material_used_1
,split_part(Material_used, ',', 2) AS Material_used_2
,split_part(Material_used, ',', 3) AS Material_used_3
,split_part(Material_used, ',', 4) AS Material_used_4
FROM (
SELECT w_alphaID AS alphaID, w_numID AS numID, w_title AS Name_of_work,
String_Agg( m_title, ',' ) AS Material_used
FROM works, medium
WHERE works.w_alphaID = medium.m_alphaID
AND works.w_numID = medium.m_numID
GROUP BY w_alphaID, w_numID, w_title ) t
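
One caveat with the query above: string_agg() without an ORDER BY concatenates the materials in whatever order the rows happen to arrive, so which material lands in which column is not guaranteed. A minimal sketch of the same approach with an explicit sort follows. Sorting alphabetically by m_title is an assumption made here for illustration, since medium has no column that records insertion order, so the result is stable but will not match the order in the desired output (Oil sorts before Wood). It also assumes no material name contains a comma.

```sql
-- Sketch only: same string_agg() + split_part() idea, with a deterministic order.
-- Assumption: alphabetical order by m_title is acceptable.
SELECT alphaID, numID, Name_of_work
     , split_part(Material_used, ',', 1) AS Material_used_1
     , split_part(Material_used, ',', 2) AS Material_used_2
     , split_part(Material_used, ',', 3) AS Material_used_3
FROM  (
   SELECT w_alphaID AS alphaID, w_numID AS numID, w_title AS Name_of_work
          -- ORDER BY inside the aggregate fixes the position of each material
        , string_agg(m_title, ',' ORDER BY m_title) AS Material_used
   FROM   works
   JOIN   medium ON works.w_alphaID = medium.m_alphaID
                AND works.w_numID   = medium.m_numID
   GROUP  BY w_alphaID, w_numID, w_title
   ) t;
```

Note that split_part() returns an empty string, not NULL, when a work has fewer materials than there are columns.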
This would be simpler with a simpler schema:
- ... medium
- ... (a serial column) instead of the multicolumn PK and FK over two domain types.
- ... alpha_id instead of m_alphaID and w_alphaID etc.
That aside, here are solutions for your setup as is:
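crosstab() solution
There are several specific difficulties for your crosstab() query. As the solution shows, works has no single key column, so a surrogate rn is generated with row_number(); several extra columns (alphaid, numid, name_of_work) have to be carried along; medium has no category column, so mat_nr is generated; and the materials have no defined order, so the order used is arbitrary.
Solution: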
SELECT alphaid, numid, name_of_work, material_1, material_2, material_3
FROM crosstab(
'SELECT rn, w.alphaid, w.numid, w.name_of_work
, row_number() OVER (PARTITION BY rn) AS mat_nr -- order undefined!
, m_title AS Material_used
FROM (
SELECT w_alphaID AS alphaid, w_numID AS numid, w_title AS name_of_work
, row_number() OVER (ORDER BY w_alphaID, w_numID) AS rn
FROM works
) w
JOIN medium m ON w.alphaid = m.m_alphaID
AND w.numid = m.m_numID
ORDER BY rn, mat_nr'
, 'VALUES (1), (2), (3)' -- add more ...
)
AS ct (
rn bigint, alphaid text, numid int, name_of_work text
, material_1 text, material_2 text, material_3 text -- add more ...
);
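
The crosstab() function is not part of core PostgreSQL. It comes from the additional module tablefunc, which has to be installed once per database before the query above can run. A minimal sketch, assuming your role has the privileges to create extensions and the module's files are available on the server:

```sql
-- Run once per database; does nothing if the extension is already installed.
CREATE EXTENSION IF NOT EXISTS tablefunc;
```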
If the additional module tablefunc cannot be installed or if top performance is not important, this simpler query does the same, slower:
SELECT w_alphaid AS alphaid, w_numid AS numid, w_title AS name_of_work
, arr[1] AS material_used_1
, arr[2] AS material_used_2
, arr[3] AS material_used_3 -- add more?
FROM works w
LEFT JOIN (
SELECT m_alphaid, m_numid, array_agg(m_title::text) AS arr
FROM medium
GROUP BY m_alphaid, m_numid
) m ON w.w_alphaid = m.m_alphaid
AND w.w_numid = m.m_numid;
The cast to text (or varchar ...) is necessary because there is no predefined array type for your custom domain. Alternatively you could define the missing array type.
One subtle difference to the above: this query uses LEFT JOIN instead of just JOIN to preserve rows from works that have no related materials in medium at all.
Since you return the whole table, it's cheaper to aggregate rows in medium before you join. For a small selection it might be cheaper to join first and then aggregate.
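
For illustration, here is a rough sketch of the "join first, then aggregate" variant for a small selection. The filter on ('CD', 2) is a hypothetical example, not part of the question. As in the array query above, the order of the materials in the array is not defined.

```sql
-- Sketch only: filter and join first, aggregate afterwards.
SELECT alphaid, numid, name_of_work
     , arr[1] AS material_used_1
     , arr[2] AS material_used_2
     , arr[3] AS material_used_3
FROM  (
   SELECT w.w_alphaID AS alphaid, w.w_numID AS numid, w.w_title AS name_of_work
        , array_agg(m.m_title::text) AS arr   -- cast: see the note on domains above
   FROM   works w
   LEFT   JOIN medium m ON m.m_alphaID = w.w_alphaID
                       AND m.m_numID   = w.w_numID
   WHERE  w.w_alphaID = 'CD'                  -- hypothetical selection
   AND    w.w_numID   = 2
   GROUP  BY w.w_alphaID, w.w_numID, w.w_title
   ) sub;
```

Only the rows of medium that belong to the selected work are aggregated here, which is the point of joining first when the selection is small.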