LISTAGG in Oracle to return distinct values

I am trying to use the LISTAGG function in Oracle. I would like to get only the distinct values for that column. Is there a way to get only the distinct values without creating a function or a procedure?

  col1  col2 Created_by
   1     2     Smith 
   1     2     John 
   1     3     Ajay 
   1     4     Ram 
   1     5     Jack 

I need to select col1 and the LISTAGG of col2 (column 3 is not considered). When I do that, I get something like this as the result of LISTAGG: [2,2,3,4,5]

I need to remove the duplicate '2' here; I need only the distinct values of col2 against col1.

19c and later:

select listagg(distinct the_column, ',') within group (order by the_column)
from the_table
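
Applied to the data from the question, the 19c form can be sketched as follows. The WITH clause only rebuilds the sample rows inline so the statement is self-contained; the names tab and col2_list are illustrative assumptions, not part of the original answer.

```sql
-- Requires Oracle 19c or later (LISTAGG with DISTINCT).
WITH tab AS (
  SELECT 1 AS col1, 2 AS col2, 'Smith' AS created_by FROM dual UNION ALL
  SELECT 1, 2, 'John' FROM dual UNION ALL
  SELECT 1, 3, 'Ajay' FROM dual UNION ALL
  SELECT 1, 4, 'Ram'  FROM dual UNION ALL
  SELECT 1, 5, 'Jack' FROM dual
)
SELECT col1,
       LISTAGG(DISTINCT col2, ',') WITHIN GROUP (ORDER BY col2) AS col2_list
  FROM tab
 GROUP BY col1;
-- one row: col1 = 1, col2_list = 2,3,4,5
```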

18c and earlier:

select listagg(the_column, ',') within group (order by the_column)
from (
   select distinct the_column 
   from the_table
) t

If you need more columns, something like this might be what you are looking for:

select col1, listagg(col2, ',') within group (order by col2)
from (
  select col1, 
         col2,
         row_number() over (partition by col1, col2 order by col1) as rn
  from foo
  order by col1,col2
)
where rn = 1
group by col1;

From Oracle 19c this is built in (LISTAGG supports DISTINCT).

On 18c and earlier, try WITHIN GROUP.

Otherwise use regular expressions.

Here's how to solve your issue.

select  
      regexp_replace(
    '2,2,2.1,3,3,3,3,4,4' 
     ,'([^,]+)(,\1)*(,|$)', '\1\3')

from dual

returns

2,2.1,3,4

ANSWER below:

select col1,
       regexp_replace(
         listagg(col2, ',') within group (order by col2)  -- sorted
         , '([^,]+)(,\1)*(,|$)', '\1\3') as col2_list
from tableX
group by col1;

Note: The above will work in most cases. The list must be sorted, and you may have to trim all leading and trailing spaces depending on your data.
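
The two caveats in this note can be checked on literal strings. This is a small sketch using the same pattern as the answer; the column aliases are made up for illustration.

```sql
select
  -- sorted input: duplicates are adjacent, so they collapse to 2,3
  regexp_replace('2,2,3', '([^,]+)(,\1)*(,|$)', '\1\3') as sorted_list,
  -- unsorted input: the second 2 is not adjacent to the first, so 2,3,2 comes back unchanged
  regexp_replace('2,3,2', '([^,]+)(,\1)*(,|$)', '\1\3') as unsorted_list,
  -- padded input: '2 ' and '2' are different strings, so nothing is removed
  regexp_replace('2 ,2', '([^,]+)(,\1)*(,|$)', '\1\3') as padded_list
from dual;
```

This is why the LISTAGG feeding the regex needs an ORDER BY on the aggregated expression, and why values with stray spaces may need TRIM before aggregation.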

If a group has a lot of items (more than 20, say) or the strings are long, you might run into Oracle's string size limit: 'result of string concatenation is too long'.

From Oracle 12cR2 you can suppress this error. Alternatively, put a maximum on the number of members in each group. This only works if it is OK to list only the first members. If you have very long variable-length strings this may not work; you will have to experiment.

select col1,
       case
         when count(col2) < 100 then
           regexp_replace(
             listagg(col2, ',') within group (order by col2)
             , '([^,]+)(,\1)*(,|$)', '\1\3')
         else
           'Too many entries to list...'
       end as col2_list
from sometable
group by col1;

Another (not so simple) solution that hopefully avoids Oracle's string size limit, where the string size is limited to 4000. Thanks to a post by user3465996.

select col1  ,
    dbms_xmlgen.convert(  -- HTML decode
    dbms_lob.substr( -- limit size to 4000 chars
    ltrim( -- remove leading commas
    REGEXP_REPLACE(REPLACE(
         REPLACE(
           XMLAGG(
             XMLELEMENT("A",col2 )
               ORDER BY col2).getClobVal(),
             '<A>',','),
             '</A>',''),'([^,]+)(,\1)*(,|$)', '\1\3'),
                  ','), -- remove leading XML commas ltrim
                      4000,1) -- limit to 4000 string size
                      , 1)  -- HTML.decode
                       as col2
 from sometable
group by col1;

V1 - some test cases - FYI

regexp_replace('2,2,2.1,3,3,4,4','([^,]+)(,\1)+', '\1')
-> 2.1,3,4 Fail
regexp_replace('2 ,2 ,2.1,3 ,3 ,4 ,4 ','([^,]+)(,\1)+', '\1')
-> 2 ,2.1,3 ,4  -- success - fixed length items

V2 - items contained within items, e.g. 2,21

regexp_replace('2.1,1','([^,]+)(,\1)+', '\1')
-> 2.1 Fail
regexp_replace('2 ,2 ,2.1,1 ,3 ,4 ,4 ','(^|,)(.+)(,\2)+', '\1\2')
-> 2 ,2.1,1 ,3 ,4  -- success - NEW regex
 regexp_replace('a,b,b,b,b,c','(^|,)(.+)(,\2)+', '\1\2')
-> a,b,b,c fail!

v3 - regex, thanks to Igor! Works in all cases.

select  
regexp_replace('2,2,2.1,3,3,4,4','([^,]+)(,\1)*(,|$)', '\1\3') ,
---> 2,2.1,3,4 works
regexp_replace('2.1,1','([^,]+)(,\1)*(,|$)', '\1\3'),
--> 2.1,1 works
regexp_replace('a,b,b,b,b,c','([^,]+)(,\1)*(,|$)', '\1\3')
---> a,b,c works

from dual

You can use the undocumented wm_concat function (note that it is no longer available from Oracle 12c onwards).

select col1, wm_concat(distinct col2) col2_list 
from tab1
group by col1;

This function returns a CLOB column; if you want, you can use dbms_lob.substr to convert the CLOB to VARCHAR2.

If you want distinct values across MULTIPLE columns, want control over sort order, don't want to use an undocumented function that may disappear, and do not want more than one full table scan, you may find this construct useful:

with test_data as 
(
      select 'A' as col1, 'T_a1' as col2, '123' as col3 from dual
union select 'A', 'T_a1', '456' from dual
union select 'A', 'T_a1', '789' from dual
union select 'A', 'T_a2', '123' from dual
union select 'A', 'T_a2', '456' from dual
union select 'A', 'T_a2', '111' from dual
union select 'A', 'T_a3', '999' from dual
union select 'B', 'T_a1', '123' from dual
union select 'B', 'T_b1', '740' from dual
union select 'B', 'T_b1', '846' from dual
)
select col1
     , (select listagg(column_value, ',') within group (order by column_value desc) from table(collect_col2)) as col2s
     , (select listagg(column_value, ',') within group (order by column_value desc) from table(collect_col3)) as col3s
from 
(
select col1
     , collect(distinct col2) as collect_col2
     , collect(distinct col3) as collect_col3
from test_data
group by col1
);

I overcame this issue by grouping on the values first, then doing another aggregation with listagg. Something like this:

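select a, b, listagg(c, ',') within group (order by c) c, avg(d)
from (select a, b, c, avg(d) as d   -- alias so the outer query can reference d
      from   your_table             -- placeholder; TABLE itself is a reserved word
      group by a, b, c)
group by a, b
-- note: the outer avg(d) is an average of the per-(a,b,c) averages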

Only one full table access, and relatively easy to expand to more complex queries.

If the intent is to apply this transformation to multiple columns, I have extended a_horse_with_no_name's solution:

SELECT * FROM
(SELECT LISTAGG(GRADE_LEVEL, ',') within group(order by GRADE_LEVEL) "Grade Levels" FROM (select distinct GRADE_LEVEL FROM Students) t)                     t1,
(SELECT LISTAGG(ENROLL_STATUS, ',') within group(order by ENROLL_STATUS) "Enrollment Status" FROM (select distinct ENROLL_STATUS FROM Students) t)          t2,
(SELECT LISTAGG(GENDER, ',') within group(order by GENDER) "Legal Gender Code" FROM (select distinct GENDER FROM Students) t)                               t3,
(SELECT LISTAGG(CITY, ',') within group(order by CITY) "City" FROM (select distinct CITY FROM Students) t)                                                  t4,
(SELECT LISTAGG(ENTRYCODE, ',') within group(order by ENTRYCODE) "Entry Code" FROM (select distinct ENTRYCODE FROM Students) t)                             t5,
(SELECT LISTAGG(EXITCODE, ',') within group(order by EXITCODE) "Exit Code" FROM (select distinct EXITCODE FROM Students) t)                                 t6,
(SELECT LISTAGG(LUNCHSTATUS, ',') within group(order by LUNCHSTATUS) "Lunch Status" FROM (select distinct LUNCHSTATUS FROM Students) t)                     t7,
(SELECT LISTAGG(ETHNICITY, ',') within group(order by ETHNICITY) "Race Code" FROM (select distinct ETHNICITY FROM Students) t)                              t8,
(SELECT LISTAGG(CLASSOF, ',') within group(order by CLASSOF) "Expected Graduation Year" FROM (select distinct CLASSOF FROM Students) t)                     t9,
(SELECT LISTAGG(TRACK, ',') within group(order by TRACK) "Track Code" FROM (select distinct TRACK FROM Students) t)                                         t10,
(SELECT LISTAGG(GRADREQSETID, ',') within group(order by GRADREQSETID) "Graduation ID" FROM (select distinct GRADREQSETID FROM Students) t)                 t11,
(SELECT LISTAGG(ENROLLMENT_SCHOOLID, ',') within group(order by ENROLLMENT_SCHOOLID) "School Key" FROM (select distinct ENROLLMENT_SCHOOLID FROM Students) t)       t12,
(SELECT LISTAGG(FEDETHNICITY, ',') within group(order by FEDETHNICITY) "Federal Race Code" FROM (select distinct FEDETHNICITY FROM Students) t)                         t13,
(SELECT LISTAGG(SUMMERSCHOOLID, ',') within group(order by SUMMERSCHOOLID) "Summer School Key" FROM (select distinct SUMMERSCHOOLID FROM Students) t)                               t14,
(SELECT LISTAGG(FEDRACEDECLINE, ',') within group(order by FEDRACEDECLINE) "Student Decl to Prov Race Code" FROM (select distinct FEDRACEDECLINE FROM Students) t)          t15

This is Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - 64bit Production.
I was unable to use STRAGG because there is no way to DISTINCT and ORDER.

Performance scales linearly, which is good, since I am adding all columns of interest. The above took 3 seconds for 77K rows. For just one rollup, 0.172 seconds. I do wish there was a way to distinctify multiple columns in a table in one pass.

Upcoming Oracle 19c will support DISTINCT with LISTAGG.

LISTAGG with DISTINCT option:

This feature is coming with 19c:

select deptno, listagg(distinct sal, ', ') within group (order by sal)
from scott.emp
group by deptno;

EDIT:

Oracle 19c LISTAGG DISTINCT

The LISTAGG aggregate function now supports duplicate elimination by using the new DISTINCT keyword. The LISTAGG aggregate function orders the rows for each group in a query according to the ORDER BY expression and then concatenates the values into a single string. With the new DISTINCT keyword, duplicate values can be removed from the specified expression before concatenation into a single string. This removes the need to create complex query processing to find the distinct values prior to using the aggregate LISTAGG function. With the DISTINCT option, the processing to remove duplicate values can be done directly within the LISTAGG function. The result is simpler, faster, more efficient SQL.
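
Because DISTINCT is applied per LISTAGG call, several columns can be deduplicated independently in one grouped query on 19c. A minimal sketch with inline sample rows (the names tab, col2_list and col3_list are assumptions for illustration):

```sql
-- Requires Oracle 19c or later.
WITH tab AS (
  SELECT 1 AS col1, 2 AS col2, 3 AS col3 FROM dual UNION ALL
  SELECT 1, 2, 3 FROM dual UNION ALL
  SELECT 1, 3, 4 FROM dual UNION ALL
  SELECT 1, 4, 4 FROM dual UNION ALL
  SELECT 1, 5, 6 FROM dual
)
SELECT col1,
       LISTAGG(DISTINCT col2, ',') WITHIN GROUP (ORDER BY col2) AS col2_list,
       LISTAGG(DISTINCT col3, ',') WITHIN GROUP (ORDER BY col3) AS col3_list
  FROM tab
 GROUP BY col1;
-- one row: col2_list = 2,3,4,5 and col3_list = 3,4,6
```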

To get around the string length issue you can use XMLAGG, which is similar to listagg but returns a CLOB.

You can then parse it using regexp_replace to get the unique values, and then turn it back into a string using dbms_lob.substr(). If you have a huge number of distinct values you will still run out of space this way, but in a lot of cases the code below should work.

You can also change the delimiters you use. In my case I wanted '-' instead of ',', but you should be able to replace the dashes in my code with commas if you want that.

select col1,
    dbms_lob.substr(ltrim(REGEXP_REPLACE(REPLACE(
         REPLACE(
           XMLAGG(
             XMLELEMENT("A",col2)
               ORDER BY col2).getClobVal(),
             '<A>','-'),
             '</A>',''),'([^-]*)(-\1)+($|-)', 
           '\1\3'),'-'), 4000,1) as platform_mix
from sometable   -- placeholder; TABLE itself is a reserved word
group by col1

What about creating a dedicated function that handles the "distinct" part:

create or replace function listagg_distinct (t in str_t, sep IN VARCHAR2 DEFAULT ',') 
  return VARCHAR2
as 
  l_rc VARCHAR2(4096) := '';
begin
  SELECT listagg(val, sep) WITHIN GROUP (ORDER BY val)  -- ORDER BY 1 would sort by the constant 1
    INTO l_rc
    FROM (SELECT DISTINCT column_value val FROM table(t));
  RETURN l_rc;
end;
/

And then use it to do the aggregation:

SELECT col_1, listagg_distinct(cast(collect(col_2) as str_t ), ', ')
  FROM your_table
  GROUP BY col_1;
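
Both the function and the call above rely on a collection type named str_t that the answer does not define. A minimal sketch of such a type, assuming a nested table of strings is what was intended (the element size of 4000 is an assumption):

```sql
-- Hypothetical definition of the str_t type assumed by listagg_distinct.
-- It has to exist before the function is compiled.
CREATE OR REPLACE TYPE str_t AS TABLE OF VARCHAR2(4000);
/
```

A nested table type (rather than a VARRAY) is chosen here because it can be used both as the target of CAST(COLLECT(...)) and as the argument of TABLE().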

Further refining @YoYo's correction to @a_horse_with_no_name's row_number()-based approach, using DECODE instead of CASE. I see that @Martin Vrbovsky also has an answer using the CASE approach.

select
  col1, 
  listagg(col2, ',') within group (order by col2) AS col2_list,
  listagg(col3, ',') within group (order by col3) AS col3_list,
  SUM(col4) AS col4
from (
  select
    col1, 
    decode(row_number() over (partition by col1, col2 order by null),1,col2) as col2,
    decode(row_number() over (partition by col1, col3 order by null),1,col3) as col3,
    col4  -- needed by SUM(col4) in the outer query
  from foo
)
group by col1;

listagg() ignores NULL values, so as a first step you could use the lag() function to check whether the previous record had the same value: if yes, return NULL, otherwise return the new value.

WITH tab AS 
(           
          SELECT 1 as col1, 2 as col2, 'Smith' as created_by FROM dual
UNION ALL SELECT 1 as col1, 2 as col2, 'John'  as created_by FROM dual
UNION ALL SELECT 1 as col1, 3 as col2, 'Ajay'  as created_by FROM dual
UNION ALL SELECT 1 as col1, 4 as col2, 'Ram'   as created_by FROM dual
UNION ALL SELECT 1 as col1, 5 as col2, 'Jack'  as created_by FROM dual
)
SELECT col1
     , CASE 
       WHEN lag(col2) OVER (ORDER BY col2) = col2 THEN 
         NULL 
       ELSE 
         col2 
       END as col2_with_nulls
     , created_by
  FROM tab;

Results

      COL1 COL2_WITH_NULLS CREAT
---------- --------------- -----
         1               2 Smith
         1                 John
         1               3 Ajay
         1               4 Ram
         1               5 Jack

Note that the second 2 is replaced by NULL. Now you can wrap a SELECT with listagg() around it.

WITH tab AS 
(           
          SELECT 1 as col1, 2 as col2, 'Smith' as created_by FROM dual
UNION ALL SELECT 1 as col1, 2 as col2, 'John'  as created_by FROM dual
UNION ALL SELECT 1 as col1, 3 as col2, 'Ajay'  as created_by FROM dual
UNION ALL SELECT 1 as col1, 4 as col2, 'Ram'   as created_by FROM dual
UNION ALL SELECT 1 as col1, 5 as col2, 'Jack'  as created_by FROM dual
)
SELECT listagg(col2_with_nulls, ',') WITHIN GROUP (ORDER BY col2_with_nulls) col2_list
  FROM ( SELECT col1
              , CASE WHEN lag(col2) OVER (ORDER BY col2) = col2 THEN NULL ELSE col2 END as col2_with_nulls
              , created_by
           FROM tab );

Result

COL2_LIST
---------
2,3,4,5
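
The sample data contains a single col1 value, so the lag() window needs no partitioning there. With several col1 groups, a sketch of the same idea would add PARTITION BY col1 and GROUP BY col1; the extra rows for col1 = 2 are invented for illustration.

```sql
WITH tab AS (
  SELECT 1 AS col1, 2 AS col2 FROM dual UNION ALL
  SELECT 1, 2 FROM dual UNION ALL
  SELECT 1, 3 FROM dual UNION ALL
  SELECT 2, 3 FROM dual UNION ALL
  SELECT 2, 3 FROM dual
)
SELECT col1,
       LISTAGG(col2_with_nulls, ',') WITHIN GROUP (ORDER BY col2_with_nulls) AS col2_list
  FROM (SELECT col1,
               -- compare only with the previous row of the same col1 group
               CASE WHEN LAG(col2) OVER (PARTITION BY col1 ORDER BY col2) = col2
                    THEN NULL ELSE col2 END AS col2_with_nulls
          FROM tab)
 GROUP BY col1;
-- col1 = 1 -> 2,3   and   col1 = 2 -> 3
```

Without the PARTITION BY, the value 3 would be kept for only one of the two groups, because lag() would compare rows across col1 values.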

You can do this over multiple columns too.

WITH tab AS 
(           
          SELECT 1 as col1, 2 as col2, 'Smith' as created_by FROM dual
UNION ALL SELECT 1 as col1, 2 as col2, 'John'  as created_by FROM dual
UNION ALL SELECT 1 as col1, 3 as col2, 'Ajay'  as created_by FROM dual
UNION ALL SELECT 1 as col1, 4 as col2, 'Ram'   as created_by FROM dual
UNION ALL SELECT 1 as col1, 5 as col2, 'Jack'  as created_by FROM dual
)
SELECT listagg(col1_with_nulls, ',') WITHIN GROUP (ORDER BY col1_with_nulls) col1_list
     , listagg(col2_with_nulls, ',') WITHIN GROUP (ORDER BY col2_with_nulls) col2_list
     , listagg(created_by, ',')      WITHIN GROUP (ORDER BY created_by) created_by_list
  FROM ( SELECT CASE WHEN lag(col1) OVER (ORDER BY col1) = col1 THEN NULL ELSE col1 END as col1_with_nulls
              , CASE WHEN lag(col2) OVER (ORDER BY col2) = col2 THEN NULL ELSE col2 END as col2_with_nulls
              , created_by
           FROM tab );

Result

COL1_LIST COL2_LIST CREATED_BY_LIST
--------- --------- -------------------------
1         2,3,4,5   Ajay,Jack,John,Ram,Smith

I implemented this stored function:

CREATE TYPE LISTAGG_DISTINCT_PARAMS AS OBJECT (ELEMENTO VARCHAR2(2000), SEPARATORE VARCHAR2(10));

CREATE TYPE T_LISTA_ELEMENTI AS TABLE OF VARCHAR2(2000);

CREATE TYPE T_LISTAGG_DISTINCT AS OBJECT (

    LISTA_ELEMENTI T_LISTA_ELEMENTI,
        SEPARATORE VARCHAR2(10),

    STATIC FUNCTION ODCIAGGREGATEINITIALIZE(SCTX  IN OUT            T_LISTAGG_DISTINCT) 
                    RETURN NUMBER,

    MEMBER FUNCTION ODCIAGGREGATEITERATE   (SELF  IN OUT            T_LISTAGG_DISTINCT, 
                                            VALUE IN                    LISTAGG_DISTINCT_PARAMS ) 
                    RETURN NUMBER,

    MEMBER FUNCTION ODCIAGGREGATETERMINATE (SELF         IN     T_LISTAGG_DISTINCT,
                                            RETURN_VALUE OUT    VARCHAR2, 
                                            FLAGS        IN     NUMBER      )
                    RETURN NUMBER,

    MEMBER FUNCTION ODCIAGGREGATEMERGE       (SELF               IN OUT T_LISTAGG_DISTINCT,
                                                                                        CTX2                 IN         T_LISTAGG_DISTINCT    )
                    RETURN NUMBER
);

CREATE OR REPLACE TYPE BODY T_LISTAGG_DISTINCT IS 

    STATIC FUNCTION ODCIAGGREGATEINITIALIZE(SCTX IN OUT T_LISTAGG_DISTINCT) RETURN NUMBER IS 
    BEGIN
                SCTX := T_LISTAGG_DISTINCT(T_LISTA_ELEMENTI() , ',');
        RETURN ODCICONST.SUCCESS;
    END;

    MEMBER FUNCTION ODCIAGGREGATEITERATE(SELF IN OUT T_LISTAGG_DISTINCT, VALUE IN LISTAGG_DISTINCT_PARAMS) RETURN NUMBER IS
    BEGIN

                IF VALUE.ELEMENTO IS NOT NULL THEN
                        SELF.LISTA_ELEMENTI.EXTEND;
                        SELF.LISTA_ELEMENTI(SELF.LISTA_ELEMENTI.LAST) := TO_CHAR(VALUE.ELEMENTO);
                        SELF.LISTA_ELEMENTI:= SELF.LISTA_ELEMENTI MULTISET UNION DISTINCT SELF.LISTA_ELEMENTI;
                        SELF.SEPARATORE := VALUE.SEPARATORE;
                END IF;
        RETURN ODCICONST.SUCCESS;
    END;

    MEMBER FUNCTION ODCIAGGREGATETERMINATE(SELF IN T_LISTAGG_DISTINCT, RETURN_VALUE OUT VARCHAR2, FLAGS IN NUMBER) RETURN NUMBER IS
      STRINGA_OUTPUT            CLOB:='';
            LISTA_OUTPUT                T_LISTA_ELEMENTI;
            TERMINATORE                 VARCHAR2(3):='...';
            LUNGHEZZA_MAX           NUMBER:=4000;
    BEGIN

                IF SELF.LISTA_ELEMENTI.EXISTS(1) THEN -- if the list contains at least one element

                        -- initialize a new working list
                        LISTA_OUTPUT := T_LISTA_ELEMENTI();

                        -- copy over only the DISTINCT elements
                        LISTA_OUTPUT := SELF.LISTA_ELEMENTI MULTISET UNION DISTINCT SELF.LISTA_ELEMENTI;

                        -- sort the elements
                        SELECT CAST(MULTISET(SELECT * FROM TABLE(LISTA_OUTPUT) ORDER BY 1 ) AS T_LISTA_ELEMENTI ) INTO LISTA_OUTPUT FROM DUAL;

                        -- concatenate into a single string
                        FOR I IN LISTA_OUTPUT.FIRST .. LISTA_OUTPUT.LAST - 1
                        LOOP
                            STRINGA_OUTPUT := STRINGA_OUTPUT || LISTA_OUTPUT(I) || SELF.SEPARATORE;
                        END LOOP;
                        STRINGA_OUTPUT := STRINGA_OUTPUT || LISTA_OUTPUT(LISTA_OUTPUT.LAST);

                        -- if the string exceeds the configured maximum length, truncate it and append the terminator
                        IF LENGTH(STRINGA_OUTPUT) > LUNGHEZZA_MAX THEN
                                    RETURN_VALUE := SUBSTR(STRINGA_OUTPUT, 0, LUNGHEZZA_MAX - LENGTH(TERMINATORE)) || TERMINATORE;
                        ELSE
                                    RETURN_VALUE:=STRINGA_OUTPUT;
                        END IF;

                ELSE -- if there are no elements, return NULL

                        RETURN_VALUE := NULL;

                END IF;

        RETURN ODCICONST.SUCCESS;
    END;

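    MEMBER FUNCTION ODCIAGGREGATEMERGE(SELF IN OUT T_LISTAGG_DISTINCT, CTX2 IN T_LISTAGG_DISTINCT) RETURN NUMBER IS
    BEGIN
        -- combine the partial list from a parallel execution so that no elements are lost
        SELF.LISTA_ELEMENTI := SELF.LISTA_ELEMENTI MULTISET UNION DISTINCT CTX2.LISTA_ELEMENTI;
        RETURN ODCICONST.SUCCESS;
    END;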

END; -- end of type body

CREATE
FUNCTION LISTAGG_DISTINCT (INPUT LISTAGG_DISTINCT_PARAMS) RETURN VARCHAR2
    PARALLEL_ENABLE AGGREGATE USING T_LISTAGG_DISTINCT;

-- Example
SELECT LISTAGG_DISTINCT(LISTAGG_DISTINCT_PARAMS(OWNER, ', ')) AS LISTA_OWNER
FROM SYS.ALL_OBJECTS;

I'm sorry, but in some cases (for a very big set), Oracle could return this error:

Object or Collection value was too large. The size of the value
might have exceeded 30k in a SORT context, or the size might be
too big for available memory.

but I think this is a good starting point ;)

Has anyone thought of using a PARTITION BY clause? It worked for me in this query to get a list of application services and their access modes.

SELECT DISTINCT T.APP_SVC_ID, 
       LISTAGG(RTRIM(T.ACCESS_MODE), ',') WITHIN GROUP(ORDER BY T.ACCESS_MODE) OVER(PARTITION BY T.APP_SVC_ID) AS ACCESS_MODE 
  FROM APP_SVC_ACCESS_CNTL T 
 GROUP BY T.ACCESS_MODE, T.APP_SVC_ID

I had to cut out my WHERE clause because of an NDA, but you get the idea.

I think this could help: use CASE to turn the column's value into NULL if it is a duplicate, so that it is not appended to the LISTAGG string:

with test_data as 
(
      select 1 as col1, 2 as col2, 'Smith' as created_by from dual
union select 1, 2, 'John' from dual
union select 1, 3, 'Ajay' from dual
union select 1, 4, 'Ram' from dual
union select 1, 5, 'Jack' from dual
union select 2, 5, 'Smith' from dual
union select 2, 6, 'John' from dual
union select 2, 6, 'Ajay' from dual
union select 2, 6, 'Ram' from dual
union select 2, 7, 'Jack' from dual
)
SELECT col1  ,
      listagg(col2 , ',') within group (order by col2 ASC) AS orig_value,
      listagg(CASE WHEN rwn=1 THEN col2 END , ',') within group (order by col2 ASC) AS distinct_value
from 
    (
    select row_number() over (partition by col1,col2 order by 1) as rwn, 
           a.*
    from test_data a
    ) a
GROUP BY col1   

Results in:

COL1  ORIG         DISTINCT
1   2,2,3,4,5   2,3,4,5
2   5,6,6,6,7   5,6,7

You can do it via RegEx replacement. Here is an example:

-- Citations Per Year - Cited Publications main query. Includes list of unique associated core project numbers, ordered by core project number.
SELECT ptc.pmid AS pmid, ptc.pmc_id, ptc.pub_title AS pubtitle, ptc.author_list AS authorlist,
  ptc.pub_date AS pubdate,
  REGEXP_REPLACE( LISTAGG ( ppcc.admin_phs_org_code || 
    TO_CHAR(ppcc.serial_num,'FM000000'), ',') WITHIN GROUP (ORDER BY ppcc.admin_phs_org_code || 
    TO_CHAR(ppcc.serial_num,'FM000000')),
    '(^|,)(.+)(,\2)+', '\1\2')
  AS projectNum
FROM publication_total_citations ptc
  JOIN proj_paper_citation_counts ppcc
    ON ptc.pmid = ppcc.pmid
   AND ppcc.citation_year = 2013
  JOIN user_appls ua
    ON ppcc.admin_phs_org_code = ua.admin_phs_org_code
   AND ppcc.serial_num = ua.serial_num
   AND ua.login_id = 'EVANSF'
GROUP BY ptc.pmid, ptc.pmc_id, ptc.pub_title, ptc.author_list, ptc.pub_date
ORDER BY pmid;

Also posted here: Oracle - unique Listagg values

I wrote a function to handle this using regular expressions. The in parameters are: 1) the listagg call itself, 2) a repeat of the delimiter.

create or replace function distinct_listagg
  (listagg_in varchar2,
   delimiter_in varchar2)

   return varchar2
   as
   hold_result varchar2(4000);
   begin

   select rtrim( regexp_replace( (listagg_in)
      , '([^'||delimiter_in||']*)('||
      delimiter_in||'\1)+($|'||delimiter_in||')', '\1\3'), ',')
      into hold_result
      from dual;

return hold_result;

end;

Now you don't have to repeat the regular expression every time you do this, simply say:

select distinct_listagg(
                       listagg(myfield,', ') within group (order by myfield),  -- sorted, so duplicates are adjacent
                       ', '
                       )
     from mytable;

If you do not need a particular order of concatenated values, and the separator can be a comma, you can do:

select col1, stragg(distinct col2)
  from your_table
 group by col1

I needed a DISTINCT version of this and got this one working.

RTRIM(REGEXP_REPLACE(
                       (LISTAGG(value, ', ') WITHIN GROUP (ORDER BY value)),
                            '([^ ]+)(, \1)+','\1'),', ')

One annoying aspect of LISTAGG is that if the total length of the concatenated string exceeds 4000 characters (the limit for VARCHAR2 in SQL), the error below is thrown, which is difficult to manage in Oracle versions up to 12.1:

ORA-01489: result of string concatenation is too long

A new feature added in 12cR2 is the ON OVERFLOW clause of LISTAGG. A query including this clause would look like:

-- descr stands in for the original column name Desc, which is a reserved word in Oracle
SELECT pid, LISTAGG(descr, ' ' ON OVERFLOW TRUNCATE) WITHIN GROUP (ORDER BY seq) AS descr_list
FROM B GROUP BY pid;

The above will restrict the output to 4000 characters but will not throw the ORA-01489 error.

These are some of the additional options of the ON OVERFLOW clause:

  • ON OVERFLOW TRUNCATE 'Contd..' : This displays 'Contd..' at the end of the string (the default is '...').
  • ON OVERFLOW TRUNCATE '' : This displays the 4000 characters without any terminating string.
  • ON OVERFLOW TRUNCATE WITH COUNT : This displays the number of truncated values in parentheses after the terminating characters, e.g. '...(5512)'.
  • ON OVERFLOW ERROR : Use this if you expect the LISTAGG to fail with the ORA-01489 error (which is the default anyway).
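
On 12cR2 and 18c, where LISTAGG has the overflow clause but not yet DISTINCT, the two ideas can be combined by deduplicating in a subquery first. A minimal sketch with inline sample rows (the names tab and col2_list are assumptions for illustration):

```sql
-- Requires Oracle 12.2 or later (ON OVERFLOW clause).
WITH tab AS (
  SELECT 1 AS col1, 2 AS col2 FROM dual UNION ALL
  SELECT 1, 2 FROM dual UNION ALL
  SELECT 1, 3 FROM dual
)
SELECT col1,
       LISTAGG(col2, ',' ON OVERFLOW TRUNCATE '...' WITH COUNT)
         WITHIN GROUP (ORDER BY col2) AS col2_list
  FROM (SELECT DISTINCT col1, col2 FROM tab)   -- remove duplicates before aggregating
 GROUP BY col1;
-- one row: col1 = 1, col2_list = 2,3 (nothing is truncated for a list this short)
```

Deduplicating first also shortens the concatenated string, which makes hitting the limit less likely in the first place.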

Using SELECT DISTINCT ... as part of a subquery before calling LISTAGG is probably the best way for simple queries, as noted by @a_horse_with_no_name.

However, in more complex queries, it might not be possible, or easy, to accomplish this. I had this come up in a scenario that used a top-n approach with an analytic function.

So I found the COLLECT aggregate function. It is documented to have the UNIQUE or DISTINCT modifier available. In 10g only, it quietly fails (it ignores the modifier without an error). However, to overcome this, from another answer, I came to this solution:

SELECT
  ...
  (
    SELECT LISTAGG(v.column_value,',') WITHIN GROUP (ORDER BY v.column_value)
    FROM TABLE(columns_tab) v
  ) AS columns,
  ...
FROM (
  SELECT
    ...
    SET(CAST(COLLECT(UNIQUE some_column ORDER BY some_column) AS tab_typ)) AS columns_tab,
    ...
)

Basically, by using SET, I remove the duplicates in my collection.

You would still need to define tab_typ as a basic collection type, and in the case of a VARCHAR, this would be for example:

CREATE OR REPLACE type tab_typ as table of varchar2(100)
/

Also, as a correction to the answer from @a_horse_with_no_name for the multi-column situation, where you might still want to aggregate on a third (or further) column:

select
  col1, 
  listagg(CASE rn2 WHEN 1 THEN col2 END, ',') within group (order by col2) AS col2_list,
  listagg(CASE rn3 WHEN 1 THEN col3 END, ',') within group (order by col3) AS col3_list,
  SUM(col4) AS col4
from (
  select
    col1, 
    col2,
    col3,
    col4,
    row_number() over (partition by col1, col2 order by null) as rn2,
    row_number() over (partition by col1, col3 order by null) as rn3
  from foo
)
group by col1;

If you left rn = 1 as a WHERE condition in the query, you would aggregate the other columns incorrectly.
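
A small worked case shows the difference. The sample rows and the names foo, col2_list and col4_total are invented for illustration; the technique is the CASE-inside-LISTAGG form described above.

```sql
WITH foo AS (
  SELECT 1 AS col1, 2 AS col2, 10 AS col4 FROM dual UNION ALL
  SELECT 1, 2, 20 FROM dual UNION ALL
  SELECT 1, 3, 30 FROM dual
)
SELECT col1,
       -- only the first row of each (col1, col2) pair contributes to the list
       LISTAGG(CASE rn WHEN 1 THEN col2 END, ',') WITHIN GROUP (ORDER BY col2) AS col2_list,
       -- every row still contributes to the sum
       SUM(col4) AS col4_total
  FROM (SELECT col1, col2, col4,
               ROW_NUMBER() OVER (PARTITION BY col1, col2 ORDER BY NULL) AS rn
          FROM foo)
 GROUP BY col1;
-- one row: col2_list = 2,3 and col4_total = 60
```

With WHERE rn = 1 instead of the CASE, one of the two rows with col2 = 2 would be filtered out before aggregation, so the sum would come out as 40 or 50 rather than 60.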

select col1, listagg(col2,',') within group (order by col2) from your_table group by col1

This aggregates the strings (col2) into a list, keeping the order, and group by col1 merges the col1 duplicates into one group. Note that this on its own does not remove duplicate col2 values from the list; on the data from the question it still returns 2,2,3,4,5. It looks clean and simple, and if you want col3 as well you just need to add one more listagg():

select col1, listagg(col2,',') within group (order by col2), listagg(col3,',') within group (order by col3) from your_table group by col1

Use a listagg_clob function created like this:

create or replace package list_const_p
is
list_sep varchar2(10) := ',';
end list_const_p;
/
sho err

create type listagg_clob_t as object(
v_liststring varchar2(32767),
v_clob clob,
v_templob number,

static function ODCIAggregateInitialize(
sctx IN OUT listagg_clob_t
) return number,
member function ODCIAggregateIterate(
self IN OUT listagg_clob_t, value IN varchar2
) return number,
member function ODCIAggregateTerminate(
self IN OUT listagg_clob_t, returnValue OUT clob, flags IN number
) return number,
member function ODCIAggregateMerge(
self IN OUT listagg_clob_t, ctx2 IN OUT listagg_clob_t
) return number
);
/
sho err

create or replace type body listagg_clob_t is

static function ODCIAggregateInitialize(sctx IN OUT listagg_clob_t)
return number is
begin
sctx := listagg_clob_t('', '', 0);
return ODCIConst.Success;
end;

member function ODCIAggregateIterate(
self IN OUT listagg_clob_t,
value IN varchar2
) return number is
begin
if nvl(lengthb(v_liststring),0) + nvl(lengthb(value),0) <= 4000 then
self.v_liststring:=self.v_liststring || value || list_const_p.list_sep;
else
if self.v_templob = 0 then
dbms_lob.createtemporary(self.v_clob, true, dbms_lob.call);
self.v_templob := 1;
end if;
dbms_lob.writeappend(self.v_clob, length(self.v_liststring), v_liststring);
self.v_liststring := value || list_const_p.list_sep;
end if;
return ODCIConst.Success;
end;

member function ODCIAggregateTerminate(
self IN OUT listagg_clob_t,
returnValue OUT clob,
flags IN number
) return number is
begin
if self.v_templob != 0 then
dbms_lob.writeappend(self.v_clob, length(self.v_liststring), self.v_liststring);
dbms_lob.trim(self.v_clob, dbms_lob.getlength(self.v_clob) - 1);
else
self.v_clob := substr(self.v_liststring, 1, length(self.v_liststring) - 1);
end if;
returnValue := self.v_clob;
return ODCIConst.Success;
end;

member function ODCIAggregateMerge(self IN OUT listagg_clob_t, ctx2 IN OUT listagg_clob_t) return number is
begin
if ctx2.v_templob != 0 then
if self.v_templob != 0 then
dbms_lob.append(self.v_clob, ctx2.v_clob);
dbms_lob.freetemporary(ctx2.v_clob);
ctx2.v_templob := 0;
else
self.v_clob := ctx2.v_clob;
self.v_templob := 1;
ctx2.v_clob := '';
ctx2.v_templob := 0;
end if;
end if;
if nvl(lengthb(self.v_liststring),0) + nvl(lengthb(ctx2.v_liststring),0) <= 4000 then
self.v_liststring := self.v_liststring || ctx2.v_liststring;
ctx2.v_liststring := '';
else
if self.v_templob = 0 then
dbms_lob.createtemporary(self.v_clob, true, dbms_lob.call);
self.v_templob := 1;
end if;
dbms_lob.writeappend(self.v_clob, length(self.v_liststring), self.v_liststring);
dbms_lob.writeappend(self.v_clob, length(ctx2.v_liststring), ctx2.v_liststring);
self.v_liststring := '';
ctx2.v_liststring := '';
end if;
return ODCIConst.Success;
end;
end;
/
sho err

CREATE or replace FUNCTION listagg_clob (input varchar2) RETURN clob
PARALLEL_ENABLE AGGREGATE USING listagg_clob_t;
/
sho err 

The simplest way to handle multiple listaggs is to use one WITH clause (subquery factoring) per column, each containing a listagg of that column over a select distinct:

    WITH tab AS 
    (           
        SELECT 1 as col1, 2 as col2, 3 as col3, 'Smith' as created_by FROM dual
        UNION ALL SELECT 1 as col1, 2 as col2, 3 as col3,'John'  as created_by FROM dual
        UNION ALL SELECT 1 as col1, 3 as col2, 4 as col3,'Ajay'  as created_by FROM dual
        UNION ALL SELECT 1 as col1, 4 as col2, 4 as col3,'Ram'   as created_by FROM dual
        UNION ALL SELECT 1 as col1, 5 as col2, 6 as col3,'Jack'  as created_by FROM dual
    )
    , getCol2 AS
    (
        SELECT  DISTINCT col1, listagg(col2,',') within group (order by col2)  over (partition by col1) AS col2List
        FROM ( SELECT DISTINCT col1,col2 FROM tab)
    )
    , getCol3 AS
    (
        SELECT  DISTINCT col1, listagg(col3,',') within group (order by col3)  over (partition by col1) AS col3List
        FROM ( SELECT DISTINCT col1,col3 FROM tab)
    )
    select col1,col2List,col3List
    FROM getCol2
    JOIN getCol3
    using (col1)

Which gives:

col1  col2List  col3List
1     2,3,4,5   3,4,6

Very simple: use a subquery with a select distinct in your query:

SELECT question_id,
       LISTAGG(element_id, ',') WITHIN GROUP (ORDER BY element_id)
FROM
       (SELECT distinct question_id, element_id
       FROM YOUR_TABLE)
GROUP BY question_id;
