
Postgres array query

(The following is a highly simplified description of my problem. Company policy does not allow me to describe the actual scenario in any detail.)

The DB tables involved are:

PRODUCTS:
ID   Name
---------
1    Ferrari
2    Lamborghini
3    Volvo


CATEGORIES:
ID    Name
----------
10    Sports cars
20    Safe cars
30    Red cars

PRODUCTS_CATEGORIES
ProductID    CategoryID
-----------------------
1            10
1            30
2            10
3            20

LOCATIONS:
ID      Name
------------
100     Sports car store
200     Safe car store
300     Red car store
400     All cars r us


LOCATIONS_CATEGORIES:
LocationID    CategoryID
------------------------
100           10
200           20
300           30
400           10
400           20
400           30
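
For anyone who wants to try the queries in this thread, the sample data above could be loaded with something like the following. It is only a sketch: the column types, keys and the unquoted lowercase identifiers are assumptions, since the question shows only the table contents. Answers that quote identifiers (for example "ID") would need the tables created with matching quoted names.

```sql
-- Assumed schema: integer keys, text names, unquoted (lowercase) identifiers.
CREATE TABLE products   (id integer PRIMARY KEY, name text NOT NULL);
CREATE TABLE categories (id integer PRIMARY KEY, name text NOT NULL);
CREATE TABLE locations  (id integer PRIMARY KEY, name text NOT NULL);

-- Link tables: one row per (product, category) and (location, category) pair.
CREATE TABLE products_categories (
  productid  integer REFERENCES products (id),
  categoryid integer REFERENCES categories (id),
  PRIMARY KEY (productid, categoryid)
);
CREATE TABLE locations_categories (
  locationid integer REFERENCES locations (id),
  categoryid integer REFERENCES categories (id),
  PRIMARY KEY (locationid, categoryid)
);

-- Sample rows copied from the tables above.
INSERT INTO products VALUES (1, 'Ferrari'), (2, 'Lamborghini'), (3, 'Volvo');
INSERT INTO categories VALUES (10, 'Sports cars'), (20, 'Safe cars'), (30, 'Red cars');
INSERT INTO locations VALUES
  (100, 'Sports car store'), (200, 'Safe car store'),
  (300, 'Red car store'),    (400, 'All cars r us');
INSERT INTO products_categories VALUES (1, 10), (1, 30), (2, 10), (3, 20);
INSERT INTO locations_categories VALUES
  (100, 10), (200, 20), (300, 30), (400, 10), (400, 20), (400, 30);
```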

Note that the locations are not directly connected to the products, only to the categories. The customer should be able to see a list of locations that can provide all the product categories that the products they want to buy belong to. So, for example:

A customer wants to buy a Ferrari. This would be available from stores in categories 10 or 30. This gives us stores 100, 300 and 400 but not 200.

However, if a customer wants to buy a Volvo and a Lamborghini, this would be available from stores in categories 10 and 20, which only gives us store 400.

Another customer wants to buy a Ferrari and a Volvo. They could get these from a store in either categories 10 + 20 (sporty and safe) or categories 30 + 20 (red and safe).

What I need is a Postgres query that takes a number of products and returns the locations where all of them can be found. I got started with arrays and the <@ operator but got lost quickly. Below is some example SQL that attempts to get stores where a Ferrari and a Lamborghini can be bought. It does not work correctly, since it requires the locations to satisfy all the categories that all the selected cars belong to. It returns location 400 only but should return locations 400 and 100.

SELECT l.* FROM locations l
WHERE 
(SELECT array_agg(DISTINCT(categoryid)) FROM products_categories WHERE productid IN (1,2))
<@
(SELECT array_agg(categoryid) FROM locations_categories WHERE locationid = l.id);

I hope my description makes sense.

Here is the query. Put the list of selected car IDs in the condition pc.ProductId in (1,3), and set the count in the HAVING clause to the number of selected cars: if you select cars 1 and 3, write HAVING COUNT(DISTINCT pc.ProductId) = 2; if you select three cars, it has to be 3. The HAVING condition ensures that ALL the selected cars are available at each returned location:

SELECT Id FROM Locations l
JOIN Locations_Categories lc on l.Id=lc.LocationId
JOIN Products_Categories pc on lc.CategoryId=pc.CategoryID
where pc.ProductId in (1,3)
GROUP BY l.id
HAVING COUNT(DISTINCT pc.ProductId) = 2

Sqlfiddle demo

For example, for one car it will be:

SELECT Id FROM Locations l
JOIN Locations_Categories lc on l.Id=lc.LocationId
JOIN Products_Categories pc on lc.CategoryId=pc.CategoryID
where pc.ProductId in (1)
GROUP BY l.id
HAVING COUNT(DISTINCT pc.ProductId) = 1

Only Ferrari demo. Volvo and a Lamborghini demo.
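
To avoid editing the HAVING count by hand every time the selection changes, the list of IDs could be supplied once as an array and its length reused. This is only a sketch under a few assumptions: unquoted lowercase column names (id, name, productid, categoryid, locationid), PostgreSQL 9.4 or later for cardinality(), and an array that contains no duplicate IDs. The CTE name wanted and its column ids are made up for illustration.

```sql
-- "wanted" is a hypothetical one-row CTE holding the selected product IDs.
WITH wanted AS (SELECT ARRAY[1, 3] AS ids)
SELECT l.id, l.name
FROM wanted w
JOIN products_categories  pc ON pc.productid = ANY (w.ids)
JOIN locations_categories lc ON lc.categoryid = pc.categoryid
JOIN locations            l  ON l.id = lc.locationid
GROUP BY l.id, l.name, w.ids
-- Keep a location only if every wanted product matched at least one
-- of its categories; assumes w.ids has no duplicates.
HAVING count(DISTINCT pc.productid) = cardinality(w.ids);
```

With the sample data and products 1 and 3 this should leave only location 400, the same as the hard-coded version.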

(This basically elaborates on @valex's answer, though I didn't realise that until I posted; please accept @valex's, not this one.)


This can be done using only joins and aggregation.

Build a join tree mapping locations to products, as normal. Then join it with the list of desired products (a one-column VALUES list) and filter the join to only the matching product names. You now have rows pairing each wanted product with every location where it can be found.

Now group by location and return the locations where the number of products present is equal to the number we're looking for (for ALL). For ANY we omit the HAVING filter, because any location row returned by the join is what we want.

So:

WITH wantedproducts(productname) AS (VALUES('Volvo'), ('Lamborghini'))
SELECT l."ID"
FROM locations l
INNER JOIN locations_categories lc ON (l."ID" = lc."LocationID")
INNER JOIN categories c ON (c."ID" = lc."CategoryID")
INNER JOIN products_categories pc ON (pc."CategoryID" = c."ID")
INNER JOIN products p ON (p."ID" = pc."ProductID")
INNER JOIN wantedproducts wp ON (wp.productname = p."Name")
GROUP BY l."ID"
HAVING count(DISTINCT p."ID") = (SELECT count(*) FROM wantedproducts);

is what you want, basically.

For "stores with any of the wanted products" queries, drop the HAVING clause.

You can also ORDER BY the aggregate if you want to show stores with any match but sort them by the number of matches.

You can also add a string_agg(p."Name", ', ') to the SELECT list if you want to list the products that can be found at that store (string_agg needs a delimiter as its second argument).
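
Putting the last three suggestions together, an "any match, best match first" variant might look like the sketch below. It assumes the same quoted identifiers as the query above; the output column names matches and products_found are invented for illustration, and the join through categories is left out because the two link tables already share "CategoryID".

```sql
WITH wantedproducts(productname) AS (VALUES ('Volvo'), ('Lamborghini'))
SELECT l."ID",
       -- How many of the wanted products this store can supply.
       count(DISTINCT p."ID") AS matches,
       -- DISTINCT avoids repeating a product that matches via several categories.
       string_agg(DISTINCT p."Name", ', ') AS products_found
FROM locations l
INNER JOIN locations_categories lc ON (l."ID" = lc."LocationID")
INNER JOIN products_categories pc ON (pc."CategoryID" = lc."CategoryID")
INNER JOIN products p ON (p."ID" = pc."ProductID")
INNER JOIN wantedproducts wp ON (wp.productname = p."Name")
GROUP BY l."ID"
-- No HAVING clause: any store with at least one match is listed.
ORDER BY matches DESC, l."ID";
```

Stores that offer none of the wanted products do not appear at all, because the inner joins produce no rows for them.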

If you want your input to be an array rather than a values-list, just replace the VALUES (...) with SELECT unnest($1) and pass your array as the parameter $1, or write it literally in place of $1.
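
As a rough illustration of the array form, here is the ALL query with a literal text array standing in for $1. The identifiers are the same quoted names as above; the alias u is hypothetical, and DISTINCT is added on the assumption that the caller might pass the same name twice, which would otherwise inflate the expected count.

```sql
WITH wantedproducts(productname) AS (
  -- Replace the literal array with $1 (typed text[]) in a prepared statement.
  SELECT DISTINCT u
  FROM unnest(ARRAY['Volvo', 'Lamborghini']) AS u
)
SELECT l."ID"
FROM locations l
INNER JOIN locations_categories lc ON (l."ID" = lc."LocationID")
INNER JOIN products_categories pc ON (pc."CategoryID" = lc."CategoryID")
INNER JOIN products p ON (p."ID" = pc."ProductID")
INNER JOIN wantedproducts wp ON (wp.productname = p."Name")
GROUP BY l."ID"
-- Every wanted name must be matched by a product sold at this location.
HAVING count(DISTINCT p."ID") = (SELECT count(*) FROM wantedproducts);
```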

ANSWER IN PROGRESS: (I will add answers as I get the required result)

For your first question:

A customer wants to buy a Ferrari. This would be available from stores in categories 10 or 30. This gives us stores 100, 300 and 400 but not 200.

SELECT DISTINCT l.id, l.name
FROM Products p
LEFT JOIN Products_Categories p_c
ON p.id = p_c.ProductId
LEFT JOIN Categories c
ON p_c.CategoryId = c.id
LEFT JOIN Locations_Categories l_c
ON c.id = l_c.CategoryId
LEFT JOIN Locations l
ON l_c.LocationId = l.id
WHERE p.id = 1

Second question:

However, if a customer wants to buy a Volvo and a Lamborghini, this would be available from stores in categories 10 and 20, which only gives us store 400.

SELECT DISTINCT l.id, l.name
FROM Products p
LEFT JOIN Products_Categories p_c
ON p.id = p_c.ProductId
LEFT JOIN Categories c
ON p_c.CategoryId = c.id
LEFT JOIN Locations_Categories l_c
ON c.id = l_c.CategoryId
LEFT JOIN Locations l
ON l_c.LocationId = l.id
WHERE l.id in (select loc.id
               from locations loc
               join locations_categories locat1
               on loc.id = locat1.LocationId
               join locations_categories locat2
               on loc.id = locat2.LocationId
               where locat1.CategoryId = 10
               AND locat2.CategoryId = 20)

RESULT FOR SECOND QUESTION USING INTERSECT: each SELECT returns the stores where one product can be found, and INTERSECT keeps only the stores that appear in every one of those results:

SELECT DISTINCT l.id, l.name
FROM Products p
LEFT JOIN Products_Categories p_c
ON p.id = p_c.ProductId
LEFT JOIN Categories c
ON p_c.CategoryId = c.id
LEFT JOIN Locations_Categories l_c
ON c.id = l_c.CategoryId
LEFT JOIN Locations l
ON l_c.LocationId = l.id
WHERE p.id = 2
INTERSECT
SELECT DISTINCT l.id, l.name
FROM Products p
LEFT JOIN Products_Categories p_c
ON p.id = p_c.ProductId
LEFT JOIN Categories c
ON p_c.CategoryId = c.id
LEFT JOIN Locations_Categories l_c
ON c.id = l_c.CategoryId
LEFT JOIN Locations l
ON l_c.LocationId = l.id
WHERE p.id = 3

For every additional product, add another INTERSECT and another SELECT with the wanted product id. SQLFIDDLE: http://sqlfiddle.com/#!15/ce97d/15

Well, it's hard to totally avoid arrays here, but I think I found a solution with fewer array functions.

Instead of selecting the needed locations, I excluded the invalid ones.

WITH needed_categories AS (
  SELECT p."ID", array_agg(pc."CategoryID") AS at_least_one_should_match
  FROM Products p
  JOIN Products_Categories pc ON p."ID" = pc."ProductID"
  WHERE p."ID" IN (1, 3)
  GROUP BY p."ID"
),
not_valid_locations AS (
  SELECT DISTINCT lc."LocationID", unnest(nc.at_least_one_should_match)
  FROM Locations_Categories lc
  JOIN needed_categories nc ON NOT ARRAY[lc."CategoryID"] && nc.at_least_one_should_match 
  EXCEPT
  SELECT * FROM Locations_Categories
) 
SELECT * 
FROM Locations
WHERE "ID" NOT IN (
  SELECT "LocationID" FROM not_valid_locations
);

Here is the SQLFiddle: http://sqlfiddle.com/#!15/e138d/78

This works, but I'm still trying to avoid the double seq scan of Locations_Categories. The fact that cars can belong to multiple categories is a bit tricky; I solved this using arrays, but I'm trying to get rid of these too.
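
One array-free way to express the same "exclude what does not fit" idea is a double NOT EXISTS (relational division): keep a location unless some wanted product has no category in common with it. The sketch below assumes the same quoted identifiers as the query above and has not been run against the linked fiddle. It may also behave differently from the EXCEPT-based query in one edge case: as far as I can tell, that query rejects a store that offers one of a product's categories plus an unrelated one (say a hypothetical store with categories 10 and 20 when looking for a Ferrari), whereas this version accepts it.

```sql
SELECT l.*
FROM Locations l
WHERE NOT EXISTS (
  -- Look for a wanted product that this location cannot supply ...
  SELECT 1
  FROM Products p
  WHERE p."ID" IN (1, 3)
    AND NOT EXISTS (
      -- ... i.e. a product sharing no category with the location.
      SELECT 1
      FROM Products_Categories pc
      JOIN Locations_Categories lc ON lc."CategoryID" = pc."CategoryID"
      WHERE pc."ProductID" = p."ID"
        AND lc."LocationID" = l."ID"
    )
);
```

Whether this scans Locations_Categories less often than the array version depends on the planner and the available indexes, so it is worth checking with EXPLAIN on real data.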
