How does DISTINCT work when using JPA and Hibernate

Question

What column does DISTINCT work on in JPA, and is it possible to change it?

Here's an example JPA query using DISTINCT:

select DISTINCT c from Customer c

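Which doesn't make a lot of sense: what column is the distinct based on? Is it specified on the entity as an annotation? I couldn't find one.

I would like to specify the column to make the distinction on, something like: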

select DISTINCT(c.name) c from Customer c

I am using MySQL and Hibernate.

Answer 1

You're close.

select DISTINCT(c.name) from Customer c

Answer 2

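Depending on the underlying JPQL or Criteria API query type, DISTINCT has two meanings in JPA.

Scalar queries

For scalar queries, which return a scalar projection, like the following query: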

List<Integer> publicationYears = entityManager
.createQuery(
    "select distinct year(p.createdOn) " +
    "from Post p " +
    "order by year(p.createdOn)", Integer.class)
.getResultList();

LOGGER.info("Publication years: {}", publicationYears);

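the DISTINCT keyword should be passed to the underlying SQL statement, because we want the database engine to filter out duplicates before returning the result set: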

SELECT DISTINCT
    extract(YEAR FROM p.created_on) AS col_0_0_
FROM
    post p
ORDER BY
    extract(YEAR FROM p.created_on)

-- Publication years: [2016, 2018]
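
As a rough in-memory analogy of what the database does for this scalar query (not how Hibernate executes it), the sketch below projects the year from a few creation dates, removes duplicates on the projected value, and sorts the result. The `ScalarDistinctSketch` class and the sample dates are assumptions made up for illustration; they do not come from the answer.

```java
import java.time.LocalDate;
import java.util.List;
import java.util.stream.Collectors;

public class ScalarDistinctSketch {

    // Hypothetical stand-in for the created_on column of the post table.
    static final List<LocalDate> CREATED_ON = List.of(
        LocalDate.of(2016, 8, 30),
        LocalDate.of(2018, 1, 1),
        LocalDate.of(2016, 10, 12),
        LocalDate.of(2018, 5, 8)
    );

    // Mirrors "select distinct year(p.createdOn) ... order by year(p.createdOn)".
    static List<Integer> distinctYears(List<LocalDate> createdOn) {
        return createdOn.stream()
            .map(LocalDate::getYear) // scalar projection
            .distinct()              // duplicates are removed on the projected value
            .sorted()                // order by
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // prints "Publication years: [2016, 2018]"
        System.out.println("Publication years: " + distinctYears(CREATED_ON));
    }
}
```

The point of the analogy is that, for a scalar projection, uniqueness is decided on the selected expression itself, which is why it makes sense to let the database do the filtering.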

Entity queries

For entity queries, DISTINCT has a different meaning.

Without DISTINCT, a query like the following one:

List<Post> posts = entityManager
.createQuery(
    "select p " +
    "from Post p " +
    "left join fetch p.comments " +
    "where p.title = :title", Post.class)
.setParameter(
    "title", 
    "High-Performance Java Persistence eBook has been released!"
)
.getResultList();

LOGGER.info(
    "Fetched the following Post entity identifiers: {}", 
    posts.stream().map(Post::getId).collect(Collectors.toList())
);

is going to JOIN the post and post_comment tables like this:

SELECT p.id AS id1_0_0_,
       pc.id AS id1_1_1_,
       p.created_on AS created_2_0_0_,
       p.title AS title3_0_0_,
       pc.post_id AS post_id3_1_1_,
       pc.review AS review2_1_1_,
       pc.post_id AS post_id3_1_0__
FROM   post p
LEFT OUTER JOIN
       post_comment pc ON p.id=pc.post_id
WHERE
       p.title='High-Performance Java Persistence eBook has been released!'

-- Fetched the following Post entity identifiers: [1, 1]

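But the parent post record is duplicated in the result set for each associated post_comment row. For this reason, the List of Post entities will contain duplicate Post entity references.

To eliminate the duplicate Post entity references, we need to use DISTINCT: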
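
To make the duplication concrete, here is a minimal plain-Java sketch of how one list entry per joined row ends up pointing at the same parent object. It is a simplified model, not Hibernate's actual row-mapping code; the `JoinFetchDuplicatesSketch` class, its nested `Post` type, and the `mapRows` helper are hypothetical names introduced for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class JoinFetchDuplicatesSketch {

    // Minimal hypothetical entity: only the identifier and child ids matter here.
    static class Post {
        final long id;
        final List<Long> commentIds = new ArrayList<>();

        Post(long id) {
            this.id = id;
        }
    }

    // Each row is {post id, comment id}, as a post JOIN post_comment would return it.
    static List<Post> mapRows(long[][] rows) {
        // Plays the role of the persistence context: one instance per identifier.
        Map<Long, Post> loaded = new HashMap<>();
        List<Post> result = new ArrayList<>();
        for (long[] row : rows) {
            Post post = loaded.computeIfAbsent(row[0], id -> new Post(id));
            post.commentIds.add(row[1]);
            // One list entry per row, even if this Post was already seen.
            result.add(post);
        }
        return result;
    }

    public static void main(String[] args) {
        List<Post> posts = mapRows(new long[][] {{1, 1}, {1, 2}});
        List<Long> ids = new ArrayList<>();
        for (Post p : posts) {
            ids.add(p.id);
        }
        System.out.println(ids);                          // prints [1, 1]
        System.out.println(posts.get(0) == posts.get(1)); // prints true
    }
}
```

Both entries are the same object reference, so the duplication is a property of the returned list, not of the data in the database.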

List<Post> posts = entityManager
.createQuery(
    "select distinct p " +
    "from Post p " +
    "left join fetch p.comments " +
    "where p.title = :title", Post.class)
.setParameter(
    "title", 
    "High-Performance Java Persistence eBook has been released!"
)
.getResultList();
 
LOGGER.info(
    "Fetched the following Post entity identifiers: {}", 
    posts.stream().map(Post::getId).collect(Collectors.toList())
);

But then DISTINCT is also passed to the SQL query, and that's not desirable at all:

SELECT DISTINCT
       p.id AS id1_0_0_,
       pc.id AS id1_1_1_,
       p.created_on AS created_2_0_0_,
       p.title AS title3_0_0_,
       pc.post_id AS post_id3_1_1_,
       pc.review AS review2_1_1_,
       pc.post_id AS post_id3_1_0__
FROM   post p
LEFT OUTER JOIN
       post_comment pc ON p.id=pc.post_id
WHERE
       p.title='High-Performance Java Persistence eBook has been released!'
 
-- Fetched the following Post entity identifiers: [1]

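When DISTINCT is passed to the SQL query, the execution plan runs an extra Sort phase, which adds overhead without bringing any value, since each parent-child combination is already unique thanks to the child PK column: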

Unique  (cost=23.71..23.72 rows=1 width=1068) (actual time=0.131..0.132 rows=2 loops=1)
  ->  Sort  (cost=23.71..23.71 rows=1 width=1068) (actual time=0.131..0.131 rows=2 loops=1)
        Sort Key: p.id, pc.id, p.created_on, pc.post_id, pc.review
        Sort Method: quicksort  Memory: 25kB
        ->  Hash Right Join  (cost=11.76..23.70 rows=1 width=1068) (actual time=0.054..0.058 rows=2 loops=1)
              Hash Cond: (pc.post_id = p.id)
              ->  Seq Scan on post_comment pc  (cost=0.00..11.40 rows=140 width=532) (actual time=0.010..0.010 rows=2 loops=1)
              ->  Hash  (cost=11.75..11.75 rows=1 width=528) (actual time=0.027..0.027 rows=1 loops=1)
                    Buckets: 1024  Batches: 1  Memory Usage: 9kB
                    ->  Seq Scan on post p  (cost=0.00..11.75 rows=1 width=528) (actual time=0.017..0.018 rows=1 loops=1)
                          Filter: ((title)::text = 'High-Performance Java Persistence eBook has been released!'::text)
                          Rows Removed by Filter: 3
Planning time: 0.227 ms
Execution time: 0.179 ms

Entity queries with HINT_PASS_DISTINCT_THROUGH

To eliminate the Sort phase from the execution plan, we need to use the HINT_PASS_DISTINCT_THROUGH JPA query hint:

List<Post> posts = entityManager
.createQuery(
    "select distinct p " +
    "from Post p " +
    "left join fetch p.comments " +
    "where p.title = :title", Post.class)
.setParameter(
    "title", 
    "High-Performance Java Persistence eBook has been released!"
)
.setHint(QueryHints.HINT_PASS_DISTINCT_THROUGH, false)
.getResultList();
 
LOGGER.info(
    "Fetched the following Post entity identifiers: {}", 
    posts.stream().map(Post::getId).collect(Collectors.toList())
);

Now the SQL query no longer contains DISTINCT, but the duplicate Post entity references are still removed:

SELECT
       p.id AS id1_0_0_,
       pc.id AS id1_1_1_,
       p.created_on AS created_2_0_0_,
       p.title AS title3_0_0_,
       pc.post_id AS post_id3_1_1_,
       pc.review AS review2_1_1_,
       pc.post_id AS post_id3_1_0__
FROM   post p
LEFT OUTER JOIN
       post_comment pc ON p.id=pc.post_id
WHERE
       p.title='High-Performance Java Persistence eBook has been released!'
 
-- Fetched the following Post entity identifiers: [1]
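
The duplicate references are removed on the Java side once the rows have been read. The answer does not show how Hibernate does this internally, so the following is only a sketch of the idea, under the assumption that duplicates are detected by object reference and that the first occurrence is kept in order. `InMemoryDistinctSketch` and `distinctByReference` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

public class InMemoryDistinctSketch {

    // Keeps the first occurrence of each object reference, preserving list order.
    static <T> List<T> distinctByReference(List<T> entities) {
        // Identity-based set: two entries match only if they are the same object.
        Set<T> seen = Collections.newSetFromMap(new IdentityHashMap<>());
        List<T> result = new ArrayList<>();
        for (T entity : entities) {
            if (seen.add(entity)) {
                result.add(entity);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Object post = new Object();                  // stands in for the managed Post instance
        List<Object> fromJoin = List.of(post, post); // two joined rows, same reference
        System.out.println(distinctByReference(fromJoin).size()); // prints 1
    }
}
```

Because this pass works on references already in memory, it needs no sorting and adds no work on the database side.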

And the execution plan confirms that this time there is no extra Sort phase:

Hash Right Join  (cost=11.76..23.70 rows=1 width=1068) (actual time=0.066..0.069 rows=2 loops=1)
  Hash Cond: (pc.post_id = p.id)
  ->  Seq Scan on post_comment pc  (cost=0.00..11.40 rows=140 width=532) (actual time=0.011..0.011 rows=2 loops=1)
  ->  Hash  (cost=11.75..11.75 rows=1 width=528) (actual time=0.041..0.041 rows=1 loops=1)
        Buckets: 1024  Batches: 1  Memory Usage: 9kB
        ->  Seq Scan on post p  (cost=0.00..11.75 rows=1 width=528) (actual time=0.036..0.037 rows=1 loops=1)
              Filter: ((title)::text = 'High-Performance Java Persistence eBook has been released!'::text)
              Rows Removed by Filter: 3
Planning time: 1.184 ms
Execution time: 0.160 ms

Answer 3

@Entity
@NamedQuery(name = "Customer.listUniqueNames", 
            query = "SELECT DISTINCT c.name FROM Customer c")
public class Customer {
        ...

        private String name;

        public static List<String> listUniqueNames() {
             return getEntityManager().createNamedQuery(
                   "Customer.listUniqueNames", String.class)
                   .getResultList();
        }
}

Answer 4

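Update: Please see the top-voted answer.

My own answer is now obsolete. It is kept here only for historical reasons.

Distinct in HQL is usually needed in joins, not in simple examples like yours.

See also How do you create a Distinct query in HQL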

Answer 5

I agree with kazanaki's answer, and it helped me. I wanted to select the whole entity, so I used

 select DISTINCT(c) from Customer c

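In my case I have a many-to-many relationship, and I want to load the entities together with their collections in one query.

I used LEFT JOIN FETCH, and at the end I had to make the result distinct.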

Answer 6

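I would use JPA's constructor expression feature. See also the following answer:

JPQL Constructor Expression - org.hibernate.hql.ast.QuerySyntaxException: Table is not mapped

Following the example in the question, it would be something like this.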

SELECT DISTINCT new com.mypackage.MyNameType(c.name) from Customer c
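
The answer does not show MyNameType, so here is a guess at a minimal shape for it. A constructor expression needs a constructor whose parameters match the selected attributes; everything else in this sketch (the getter, equals, hashCode, toString) is an assumption added for convenience. The class would normally be declared in the com.mypackage package, which is left out here to keep the sketch in one file.

```java
import java.util.Objects;

// Hypothetical value holder for the "new com.mypackage.MyNameType(c.name)" expression.
public class MyNameType {

    private final String name;

    // Matches the single selected attribute, c.name.
    public MyNameType(String name) {
        this.name = name;
    }

    public String getName() {
        return name;
    }

    // Value-based equality, so two holders with the same name compare equal.
    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (!(o instanceof MyNameType)) {
            return false;
        }
        return Objects.equals(name, ((MyNameType) o).name);
    }

    @Override
    public int hashCode() {
        return Objects.hashCode(name);
    }

    @Override
    public String toString() {
        return "MyNameType(" + name + ")";
    }
}
```

With such a class, the query result would be typed as a list of MyNameType rather than a list of String.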

Answer 7

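I am adding a slightly more specific answer, in case someone runs into the same problem I did and finds this question.

I used JPQL with query annotations (no query building). I needed to get distinct values of an entity that was embedded in another entity, with the relationship declared through a many-to-one annotation.

I have two database tables:

MainEntity, which I want distinct values of
LinkEntity, which is the relationship table between MainEntity and another table. It has a composite primary key made of three columns.

In the Java Spring code, this leads to three classes:

LinkEntity: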

@Entity
@Immutable
@Table(name="link_entity")
public class LinkEntity implements Entity {

    @EmbeddedId
    private LinkEntityPK pk;

    // ... Getter, setter, toString()
}

LinkEntityPK:

@Embeddable
public class LinkEntityPK implements Entity, Serializable {

    /** The main entity we want to have distinct values of */
    @ManyToOne
    @JoinColumn(name = "code_entity")
    private MainEntity mainEntity;

    /** */
    @Column(name = "code_pk2")
    private String codeOperation;

    /** */
    @Column(name = "code_pk3")
    private String codeFonction;

MainEntity:

@Entity
@Immutable
@Table(name = "main_entity")
public class MainEntity implements Entity {

    /** We use this for LinkEntity*/
    @Id
    @Column(name="code_entity")
    private String codeEntity;


    private String name;
    // And other attributes, getters and setters

So the final query to get distinct values of the main entity is:

@Repository
public interface EntityRepository extends JpaRepository<LinkEntity, String> {

    @Query(
        "Select " +
            "Distinct linkEntity.pk.mainEntity " +
        "From " +
            "LinkEntity as linkEntity " +
            "Join MainEntity as mainEntity On " +
                 "mainEntity = linkEntity.pk.mainEntity ")
    List<MainEntity> getMainEntityList();

}

Hope this helps someone.

Question

7 answers

Answer 1
74 2012-10-24 13:58:10

Answer 2
24 2018-11-21 06:03:04

Scalar queries

Entity queries

Entity queries with HINT_PASS_DISTINCT_THROUGH

Answer 3
13 2012-08-28 22:44:58

Answer 4
13 accepted 2009-08-28 15:40:50

Answer 5
12 2016-10-19 15:45:14

Answer 6
5 2017-06-29 06:32:04

Answer 7
0 2022-06-28 09:23:55
