
How to generate arbitrary subqueries/joins in a Jooq query

Situation: I am porting our application to Jooq to eliminate several n+1 problems and to ensure custom queries are type-safe (the DB server is Postgresql 13). In my example we have documents (ID, file name, file size). Each document can have several unique document attributes (document ID as FK, archive attribute ID, i.e. the type of the attribute, and the value). Example data:

Document:

acme=> select id, file_name, file_size from document;
                  id                  |        file_name        | file_size 
--------------------------------------+-------------------------+-----------
 1ae56478-d27c-4b68-b6c0-a8bdf36dd341 | My Really cool book.pdf |     13264
(1 row)

Document Attributes:

acme=> select * from document_attribute ;
             document_id              |         archive_attribute_id         |   value    
--------------------------------------+--------------------------------------+------------
 1ae56478-d27c-4b68-b6c0-a8bdf36dd341 | b334e287-887f-4173-956d-c068edc881f8 | JustReleased
 1ae56478-d27c-4b68-b6c0-a8bdf36dd341 | 2f86a675-4cb2-4609-8e77-c2063ab155f1 | Tax
 1ae56478-d27c-4b68-b6c0-a8bdf36dd341 | 30bb9696-fc18-4c87-b6bd-5e01497ca431 | ShippingRequired
 1ae56478-d27c-4b68-b6c0-a8bdf36dd341 | 2eb04674-1dcb-4fbc-93c3-73491deb7de2 | Bestseller
 1ae56478-d27c-4b68-b6c0-a8bdf36dd341 | a8e2f902-bf04-42e8-8ac9-94cdbf4b6778 | Paperback
(5 rows)

These documents and their attributes can be searched via a custom-built JDBC prepared statement. For example, a user created the following query for a document ID and two document attributes with matching values, which returned the book 'My Really cool book.pdf':

SELECT d.id FROM document d WHERE d.id = '1ae56478-d27c-4b68-b6c0-a8bdf36dd341'
AND d.id IN(SELECT da.document_id AS id0 FROM document_attribute da WHERE da.archive_attribute_id = '2eb04674-1dcb-4fbc-93c3-73491deb7de2' AND da.value = 'Bestseller')
AND d.id IN(SELECT da.document_id AS id1 FROM document_attribute da WHERE da.archive_attribute_id = 'a8e2f902-bf04-42e8-8ac9-94cdbf4b6778' AND da.value = 'Paperback');

(After that, the application fetches all document attributes for the returned document IDs, hence the n + 1 problem we want to solve.)

Please note that all document values and document attributes are optional. One can search by just the file name or file size of a document, but also by several document attributes.

Question/Problems:

I wanted to port this code to Jooq and use a multiset, but I am struggling with how to apply the arbitrary subquery or join condition to the document attributes:

1.) How can I achieve this arbitrary adding of subqueries?

2.) Is an INNER JOIN more performant than a subquery?

Code:

import org.jooq.Condition;
import org.jooq.impl.DSL;
import org.junit.jupiter.api.Test;

import java.util.List;
import java.util.Map;
import java.util.UUID;

import static org.jooq.impl.DSL.multiset;
import static org.jooq.impl.DSL.selectDistinct;

public class InSelectExample extends BaseTest {

    private record CustomDocumentAttribute(
        UUID documentId, // ID of the document the attribute belongs to
        UUID archiveAttributeId, // There are predefined attribute types in our application. This ID  references them
        String value // Real value of this attribute for the document
    ) {
    }

    private record CustomDocument(
        UUID documentId, // ID of the document
        String fileName, // File name of the document
        Integer fileSize, // File size in bytes of the document
        List<CustomDocumentAttribute> attributes // Attributes the document has
    ) {
    }

    @Test
    public void findPdfDocumentsWithParameters() {
        // Should print the single book
        List<CustomDocument> documents = searchDocuments(UUID.fromString("1ae56478-d27c-4b68-b6c0-a8bdf36dd341"), "My Really cool book.pdf", 13264, Map.of(
            UUID.fromString("2eb04674-1dcb-4fbc-93c3-73491deb7de2"), "Bestseller",
            UUID.fromString("a8e2f902-bf04-42e8-8ac9-94cdbf4b6778"), "Paperback"
        ));
        System.out.println("Size: " + documents.size()); // Should return 1 document

        // Should print no books because one of the document attribute value doesn't match (Booklet instead of Paperback)
        documents = searchDocuments(UUID.fromString("1ae56478-d27c-4b68-b6c0-a8bdf36dd341"), "My Really cool book.pdf", 13264, Map.of(
            UUID.fromString("2eb04674-1dcb-4fbc-93c3-73491deb7de2"), "Bestseller",
            UUID.fromString("a8e2f902-bf04-42e8-8ac9-94cdbf4b6778"), "Booklet"
        ));
        System.out.println("Size: " + documents.size()); // Should return 0 documents
    }

    private List<CustomDocument> searchDocuments(UUID documentId, String fileName, Integer fileSize, Map<UUID, String> attributes) {
        // Get the transaction manager
        TransactionManager transactionManager = getBean(TransactionManager.class);

        // Get the initial condition
        Condition condition = DSL.noCondition();

        // Check for an optional document ID
        if (documentId != null) {
            condition = condition.and(DOCUMENT.ID.eq(documentId));
        }

        // Check for an optional file name
        if (fileName != null) {
            condition = condition.and(DOCUMENT.FILE_NAME.eq(fileName));
        }

        // Check for an optional file size
        if (fileSize != null) {
            condition = condition.and(DOCUMENT.FILE_SIZE.eq(fileSize));
        }

        // Create the query
        var step1 = transactionManager.getDslContext().select(
            DOCUMENT.ID,
            DOCUMENT.FILE_NAME,
            DOCUMENT.FILE_SIZE,
            multiset(
                selectDistinct(
                    DOCUMENT_ATTRIBUTE.DOCUMENT_ID,
                    DOCUMENT_ATTRIBUTE.ARCHIVE_ATTRIBUTE_ID,
                    DOCUMENT_ATTRIBUTE.VALUE
                ).from(DOCUMENT_ATTRIBUTE).where(DOCUMENT_ATTRIBUTE.DOCUMENT_ID.eq(DOCUMENT.ID))
            ).convertFrom(record -> record.map(record1 -> new CustomDocumentAttribute(record1.value1(), record1.value2(), record1.value3())))
        ).from(DOCUMENT
        ).where(condition);

        // TODO: What to do here?
        var step3 = ...? What type?
        for (Map.Entry<UUID, String> attributeEntry : attributes.entrySet()) {
            // ???
            // Reference: AND d.id IN(SELECT da.document_id AS id0 FROM document_attribute da WHERE da.archive_attribute_id = ? AND da.value = ?)
            var step2 = step1.and(...??????)
        }

        // Finally fetch and return
        return step1.fetch(record -> new CustomDocument(record.value1(), record.value2(), record.value3(), record.value4()));
    }
}

After reading another question, jOOQ - join with nested subquery (and not realizing the solution), and playing around with generating Java code via https://www.jooq.org/translate/, it clicked. In combination with reading https://www.jooq.org/doc/latest/manual/sql-building/column-expressions/scalar-subqueries/, one can simply add the subquery as an IN() condition before executing the query. To be honest, I am not sure whether this is the most performant solution. The searchDocuments method then looks like this:

    private List<CustomDocument> searchDocuments(UUID documentId, String fileName, Integer fileSize, Map<UUID, String> attributes) {
        // Get the transaction manager
        TransactionManager transactionManager = getBean(TransactionManager.class);

        // Get the initial condition
        Condition condition = DSL.noCondition();

        // Check for an optional document ID
        if (documentId != null) {
            condition = condition.and(DOCUMENT.ID.eq(documentId));
        }

        // Check for an optional file name
        if (fileName != null) {
            condition = condition.and(DOCUMENT.FILE_NAME.eq(fileName));
        }

        // Check for an optional file size
        if (fileSize != null) {
            condition = condition.and(DOCUMENT.FILE_SIZE.eq(fileSize));
        }

        // Check for optional document attributes
        // (needs: import static org.jooq.impl.DSL.select;)
        if (attributes != null && !attributes.isEmpty()) {
            for (Map.Entry<UUID, String> entry : attributes.entrySet()) {
                // One IN subquery per requested attribute/value pair
                condition = condition.and(DOCUMENT.ID.in(
                    select(DOCUMENT_ATTRIBUTE.DOCUMENT_ID)
                        .from(DOCUMENT_ATTRIBUTE)
                        .where(DOCUMENT_ATTRIBUTE.DOCUMENT_ID.eq(DOCUMENT.ID)
                            .and(DOCUMENT_ATTRIBUTE.ARCHIVE_ATTRIBUTE_ID.eq(entry.getKey()))
                            .and(DOCUMENT_ATTRIBUTE.VALUE.eq(entry.getValue())))
                ));
            }
        }

        // Create the query
        return transactionManager.getDslContext().select(
            DOCUMENT.ID,
            DOCUMENT.FILE_NAME,
            DOCUMENT.FILE_SIZE,
            multiset(
                selectDistinct(
                    DOCUMENT_ATTRIBUTE.DOCUMENT_ID,
                    DOCUMENT_ATTRIBUTE.ARCHIVE_ATTRIBUTE_ID,
                    DOCUMENT_ATTRIBUTE.VALUE
                ).from(DOCUMENT_ATTRIBUTE).where(DOCUMENT_ATTRIBUTE.DOCUMENT_ID.eq(DOCUMENT.ID))
            ).convertFrom(record -> record.map(record1 -> new CustomDocumentAttribute(record1.value1(), record1.value2(), record1.value3())))
        ).from(DOCUMENT
        ).where(condition
        ).fetch(record -> new CustomDocument(record.value1(), record.value2(), record.value3(), record.value4()));
    }

Regarding your questions

1.) How can I achieve this arbitrary adding of subqueries?

You already found a solution to that question in your own answer, though I'll suggest an alternative that I personally prefer. Your approach creates N subqueries, hitting your table N times.

2.) Is an INNER JOIN more performant than a subquery?

There's no general rule to this. It's all just relational algebra. If the optimiser can prove two expressions are the same thing, they can be transformed into each other. However, an INNER JOIN is not the exact same thing as a semi join, i.e. an IN predicate (although sometimes it is, in the presence of constraints). So the two operators aren't exactly equivalent, logically.
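To make the logical difference concrete, here is a minimal in-memory sketch in plain Java (no jOOQ, no database). The class, record, and method names are hypothetical, and "doc-1" stands in for the sample document's UUID. It mimics an inner join and a semi join over the sample data: the join emits one row per matching attribute, while the semi join emits each document at most once.

```java
import java.util.List;

public class SemiJoinVsInnerJoin {

    // Hypothetical stand-ins for the DOCUMENT and DOCUMENT_ATTRIBUTE rows
    record Document(String id, String fileName) {}
    record Attribute(String documentId, String value) {}

    // INNER JOIN semantics: one output row per matching (document, attribute) pair
    static List<String> innerJoin(List<Document> docs, List<Attribute> attrs) {
        return docs.stream()
            .flatMap(d -> attrs.stream()
                .filter(a -> a.documentId().equals(d.id()))
                .map(a -> d.fileName()))
            .toList();
    }

    // Semi join semantics (IN / EXISTS): a document is emitted at most once
    static List<String> semiJoin(List<Document> docs, List<Attribute> attrs) {
        return docs.stream()
            .filter(d -> attrs.stream().anyMatch(a -> a.documentId().equals(d.id())))
            .map(Document::fileName)
            .toList();
    }

    public static void main(String[] args) {
        List<Document> docs = List.of(new Document("doc-1", "My Really cool book.pdf"));
        List<Attribute> attrs = List.of(
            new Attribute("doc-1", "Bestseller"),
            new Attribute("doc-1", "Paperback"));

        System.out.println(innerJoin(docs, attrs)); // the file name appears twice
        System.out.println(semiJoin(docs, attrs));  // the file name appears once
    }
}
```

If a constraint guarantees at most one matching attribute row per document (for example, when filtering on a single archive attribute ID that is unique per document), both variants return the same rows, which is the "in the presence of constraints" case.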

An alternative approach

Your own approach maps the Map<UUID, String> to subqueries, hitting the DOCUMENT_ATTRIBUTE table N times. I'm guessing that the PG optimiser might not be able to see through this and factor out the common parts into a single subquery (though technically, it could). So, I'd rather create a single subquery of the form:

WHERE document.id IN (
  SELECT a.document_id
  FROM document_attribute AS a
  WHERE (a.archive_attribute_id, a.value) IN (
    (?, ?),
    (?, ?), ...
  )
)

Or, dynamically, with jOOQ:

// Needs: import static org.jooq.impl.DSL.row; import static org.jooq.impl.DSL.select;
DOCUMENT.ID.in(
  select(DOCUMENT_ATTRIBUTE.DOCUMENT_ID)
  .from(DOCUMENT_ATTRIBUTE)
  .where(row(DOCUMENT_ATTRIBUTE.ARCHIVE_ATTRIBUTE_ID, DOCUMENT_ATTRIBUTE.VALUE).in(
    attributes.entrySet().stream().collect(Rows.toRowList(
      Entry::getKey,
      Entry::getValue
    ))
  ))
)

This uses the org.jooq.Rows::toRowList collectors.

Note: I don't think you have to further correlate the IN predicate's subquery by specifying a DOCUMENT_ATTRIBUTE.DOCUMENT_ID.eq(DOCUMENT.ID) predicate. That correlation is already implied by using IN itself.
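One thing worth checking before adopting the single subquery: as written, the row value IN predicate matches a document as soon as any one of the (attribute, value) pairs is present, whereas the N chained IN predicates from the question require all of them. The following plain Java sketch (no jOOQ; class, record, and method names are hypothetical, and the short string IDs stand in for the UUIDs) simulates both behaviours against the sample data. If "all pairs" is required, a common way to keep a single subquery is to add GROUP BY a.document_id HAVING COUNT(*) = N, which is what matchAll mimics under the assumption that (document_id, archive_attribute_id) is unique.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

public class AnyVsAllAttributes {

    // Hypothetical stand-in for a DOCUMENT_ATTRIBUTE row
    record Attribute(String documentId, String attributeId, String value) {}

    // Mirrors the single subquery: (attribute_id, value) IN ((?, ?), (?, ?), ...)
    static Set<String> matchAny(List<Attribute> rows, Map<String, String> wanted) {
        return rows.stream()
            .filter(r -> r.value().equals(wanted.get(r.attributeId())))
            .map(Attribute::documentId)
            .collect(Collectors.toSet());
    }

    // Mirrors the same subquery plus GROUP BY document_id HAVING COUNT(*) = N,
    // assuming (document_id, attribute_id) is unique
    static Set<String> matchAll(List<Attribute> rows, Map<String, String> wanted) {
        Map<String, Long> hitsPerDocument = rows.stream()
            .filter(r -> r.value().equals(wanted.get(r.attributeId())))
            .collect(Collectors.groupingBy(Attribute::documentId, Collectors.counting()));
        return hitsPerDocument.entrySet().stream()
            .filter(e -> e.getValue() == wanted.size())
            .map(Map.Entry::getKey)
            .collect(Collectors.toSet());
    }

    public static void main(String[] args) {
        List<Attribute> rows = List.of(
            new Attribute("doc-1", "attr-sales", "Bestseller"),
            new Attribute("doc-1", "attr-binding", "Paperback"));

        // Second search from the question's test: Booklet instead of Paperback
        Map<String, String> wanted = Map.of(
            "attr-sales", "Bestseller",
            "attr-binding", "Booklet");

        System.out.println(matchAny(rows, wanted)); // still contains doc-1
        System.out.println(matchAll(rows, wanted)); // empty, as the test expects
    }
}
```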
