
How to SELECT row B only if row A doesn't exist on GROUP BY

I'm facing the following situation and have not found a good solution to it. I am optimizing an API, so I am looking for the fastest possible solution.

The following description is not exactly what I am doing, but I think it represents the problem well.

Let's say I have a table of products:

+----+----------+
| id |   name   |
+----+----------+
|  1 | product1 |
|  2 | product2 |
+----+----------+

And I have a table of attachments for each product, separated by language:

+----+----------+------------+-----------------------+
| id | language | product_id |     attachment_url    |
+----+----------+------------+-----------------------+
|  1 |    bb    |      1     |     image1_bb.jpg     |
|  1 |    en    |      1     |     image1_en.jpg     |
|  1 |    pt    |      1     |     image1_pt.jpg     |
|  2 |    bb    |      1     |     image2_bb.jpg     |
|  2 |    pt    |      1     |     image2_pt.jpg     |
+----+----------+------------+-----------------------+

What I intend to do is get the correct attachment according to the language selected in the request. As you can see above, I can have several attachments for each product. We use Babel (bb) as a generic language, so whenever I don't have an attachment in the requested language, I should get the Babel version. It is also important to note that the primary key of the attachments table is a composite of id + language.

So, supposing I try to get all the data in pt, my first attempt at a SQL query was:

SELECT p.id, p.name,
    GROUP_CONCAT( '{',a.id,',',a.attachment_url, '}' ) as attachments_list
FROM products p
LEFT JOIN attachments a
    ON (a.product_id=p.id AND (a.language='pt' OR a.language='bb'))
GROUP BY p.id, p.name

The problem is that with this query I always get the bb data, and I only want it when there is no attachment in the requested language.

I already tried using a subquery, replacing attachments with:

(SELECT * FROM attachments GROUP BY id ORDER BY id ASC, language DESC)

but it doubles the request time.

I also tried using DISTINCT inside the GROUP_CONCAT, but it only works if the whole result of each row is equal, so it does not work for me.

Does anyone know of another solution that I can apply directly in the query?

EDIT:

Combining the answers of @Vulcronos and @Barmar made the final solution at least 2x faster than the one I first suggested.

Just to add some context for anybody else who is looking for it: I am using Phalcon. Because of that, I had a lot of trouble putting the pieces together, as Phalcon PHQL does not support subqueries, nor a lot of the other things I had to use.

For my scenario, where I had to deliver approximately 1.2MB of JSON content with more than 2100 objects, using custom queries made the total request up to 3x faster than Phalcon's native relation management methods (hasMany(), hasManyToMany(), etc.) and 10x faster than my original solution (which relied heavily on the find() method).

Try doing two joins instead of one:

SELECT p.id, p.name,
    GROUP_CONCAT( '{',COALESCE(a.id, b.id),',',COALESCE(a.attachment_url, b.attachment_url), '}' ) as attachments_list
FROM products p
LEFT JOIN attachments a
    ON (a.product_id=p.id AND a.language='pt')
LEFT JOIN attachments b
    ON (b.product_id=p.id AND b.language='bb')
GROUP BY p.id, p.name

and then use COALESCE to return b instead of a if a doesn't exist. You can also do it with a subselect if the above doesn't work.
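To see the COALESCE fallback in isolation, here is a minimal sketch using Python's sqlite3 module as a stand-in for MySQL. The data is hypothetical and simplified so that each product has at most one attachment per language; with several attachments per language, the two joins would multiply rows, so treat this as an illustration of the idea rather than a drop-in query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE attachments (
        id INTEGER, language TEXT, product_id INTEGER, attachment_url TEXT,
        PRIMARY KEY (id, language)
    );
    INSERT INTO products VALUES (1, 'product1'), (2, 'product2');
    -- Hypothetical data: product 1 has pt and bb, product 2 has only bb.
    INSERT INTO attachments VALUES
        (1, 'bb', 1, 'image1_bb.jpg'),
        (1, 'pt', 1, 'image1_pt.jpg'),
        (2, 'bb', 2, 'image2_bb.jpg');
""")

# One join per language; COALESCE picks the requested language when it
# exists and falls back to the generic 'bb' row otherwise.
rows = conn.execute("""
    SELECT p.id, p.name,
           COALESCE(a.attachment_url, b.attachment_url) AS attachment_url
    FROM products p
    LEFT JOIN attachments a ON a.product_id = p.id AND a.language = ?
    LEFT JOIN attachments b ON b.product_id = p.id AND b.language = 'bb'
    ORDER BY p.id
""", ("pt",)).fetchall()

for row in rows:
    print(row)
```

With this data, product1 gets its pt attachment and product2 falls back to the bb one.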

OR conditions tend to make queries slow, because it's hard to optimize them with indexes. Try joining separately for each of the two languages.

SELECT p.id, p.name,
    IFNULL(apt.attachment_url, abb.attachment_url) AS attachment_url
FROM products AS p
JOIN attachments AS abb ON abb.product_id = p.id
LEFT JOIN attachments AS apt ON apt.product_id = p.id AND apt.language = 'pt'
WHERE abb.language = 'bb'

This assumes that all products have a bb attachment, while pt is optional.

I left out the join to products, because it's not relevant to this problem. It's only needed to include the product name in the result set.
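As a rough check of this approach against the sample data from the question, the sketch below again uses Python's sqlite3 as a stand-in for MySQL. Two things are my own assumptions: the extra condition alang.id = abb.id, which pairs each translated row with the bb row of the same attachment (a product can have several attachments), and the alias name alang. The requested language is en here because attachment 2 has no en version, so the fallback actually triggers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE attachments (
        id INTEGER, language TEXT, product_id INTEGER, attachment_url TEXT,
        PRIMARY KEY (id, language)
    );
    INSERT INTO products VALUES (1, 'product1'), (2, 'product2');
    INSERT INTO attachments VALUES
        (1, 'bb', 1, 'image1_bb.jpg'),
        (1, 'en', 1, 'image1_en.jpg'),
        (1, 'pt', 1, 'image1_pt.jpg'),
        (2, 'bb', 1, 'image2_bb.jpg'),
        (2, 'pt', 1, 'image2_pt.jpg');
""")

# The bb rows drive the result (one row per attachment); the requested
# language is joined on the same attachment id and preferred when present.
rows = conn.execute("""
    SELECT p.id, p.name, abb.id AS attachment_id,
           IFNULL(alang.attachment_url, abb.attachment_url) AS attachment_url
    FROM products AS p
    JOIN attachments AS abb
        ON abb.product_id = p.id AND abb.language = 'bb'
    LEFT JOIN attachments AS alang
        ON alang.product_id = p.id
       AND alang.id = abb.id          -- assumption: pair rows by attachment id
       AND alang.language = ?
    ORDER BY p.id, abb.id
""", ("en",)).fetchall()

for row in rows:
    print(row)
```

Attachment 1 comes back in en and attachment 2 falls back to bb. Because the join to the bb rows is an inner join, product2 (which has no attachments in the sample data) drops out of the result.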

SELECT a.product_id, a.id, a.attachment_url FROM attachments a
WHERE a.language = ?
OR (a.language = 'bb' 
   AND NOT EXISTS
       (SELECT * FROM attachments
        WHERE language = ?
        AND id = a.id
        AND product_id = a.product_id));

Note: problems like this usually have many possible solutions. This is not necessarily the most efficient one.
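For anyone who wants to try the NOT EXISTS query, here is a small sketch that runs it on the question's sample data, using Python's sqlite3 in place of MySQL. The requested language is bound to both ? placeholders; the ORDER BY is added only to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE attachments (
        id INTEGER, language TEXT, product_id INTEGER, attachment_url TEXT,
        PRIMARY KEY (id, language)
    );
    INSERT INTO attachments VALUES
        (1, 'bb', 1, 'image1_bb.jpg'),
        (1, 'en', 1, 'image1_en.jpg'),
        (1, 'pt', 1, 'image1_pt.jpg'),
        (2, 'bb', 1, 'image2_bb.jpg'),
        (2, 'pt', 1, 'image2_pt.jpg');
""")

language = "en"  # attachment 2 has no 'en' row, so it should fall back to 'bb'

# Keep rows in the requested language, plus 'bb' rows for which no row in
# the requested language exists for the same attachment and product.
rows = conn.execute("""
    SELECT a.product_id, a.id, a.attachment_url FROM attachments a
    WHERE a.language = ?
    OR (a.language = 'bb'
        AND NOT EXISTS
            (SELECT * FROM attachments
             WHERE language = ?
             AND id = a.id
             AND product_id = a.product_id))
    ORDER BY a.id
""", (language, language)).fetchall()

for row in rows:
    print(row)
```

On this data the query returns the en version of attachment 1 and the bb version of attachment 2.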


 