Find, delete and extract duplicates from an internal table

I have an internal table with 2 million rows that has been uploaded from a file. I want to delete every line that has a duplicate, extract the row numbers of those duplicates, and add them to another table. What is the most efficient way to do this with ABAP 7.40? Classic ABAP is also fine.

Here is an example of my original table. I want to find duplicates by comparing columns A and B:

A  | B  | C
-----------
a1 | b1 | c1
a1 | b2 | c1
a2 | b1 | C2
a1 | b1 | c2
a2 | b2 | c2

Rows 1 and 4 are duplicates, so I want to remove both of them and end up with:

A  | B  | C
-----------
a1 | b2 | c1
a2 | b1 | C2
a2 | b2 | c2

and also have another table that stores the duplicates:

Row number  | Error 
-------------------
1           | Duplicate
4           | Duplicate      

I've seen similar requests on this site, but they work a bit differently from what I need. Thanks.

The following code finds which lines are duplicates (valid for ABAP 7.40 and later):

TYPES : BEGIN OF ty_line,
          a TYPE c LENGTH 2,
          b TYPE c LENGTH 2,
          c TYPE c LENGTH 2,
        END OF ty_line,
        ty_lines TYPE STANDARD TABLE OF ty_line WITH EMPTY KEY.

" Sample data from the question
DATA(itab) = VALUE ty_lines(
    ( a = 'a1' b = 'b1' c = 'c1' )
    ( a = 'a1' b = 'b2' c = 'c1' )
    ( a = 'a2' b = 'b1' c = 'c2' )
    ( a = 'a1' b = 'b1' c = 'c2' )
    ( a = 'a2' b = 'b2' c = 'c2' ) ).

" Group the rows by columns A and B. Each group with more than one member
" contributes one string holding the comma-separated row numbers of its members.
DATA(duplicates) = VALUE string_table(
    FOR GROUPS <group> OF <line> IN itab
    GROUP BY ( a = <line>-a b = <line>-b size = GROUP SIZE )
    ( LINES OF COND #( WHEN <group>-size > 1 THEN VALUE string_table( (
        concat_lines_of(
            table = VALUE string_table(
                    FOR <line2> IN GROUP <group> INDEX INTO tabix ( |{ tabix }| ) )
            sep   = ',' ) ) ) ) ) ).

" Rows 1 and 4 share the same A and B values
ASSERT duplicates = VALUE string_table( ( `1,4` ) ).

I use LINES OF so that no line is generated when the group has a size of 1: COND without an ELSE branch then returns an initial (empty) table, and LINES OF an empty table adds nothing to the result.
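The code above only reports the duplicate row numbers as a string. The same grouping could probably be reused to build the two tables the question asks for. The following is a minimal sketch, not a tested solution for 2 million rows: the names ty_error, ty_errors, errors and unique are hypothetical and do not appear in the original post, and the error-table layout is only modelled on the question's example.

```abap
TYPES: BEGIN OF ty_line,
         a TYPE c LENGTH 2,
         b TYPE c LENGTH 2,
         c TYPE c LENGTH 2,
       END OF ty_line,
       ty_lines TYPE STANDARD TABLE OF ty_line WITH EMPTY KEY,
       " Hypothetical layout of the error table (row number + message)
       BEGIN OF ty_error,
         row   TYPE i,
         error TYPE string,
       END OF ty_error,
       ty_errors TYPE STANDARD TABLE OF ty_error WITH EMPTY KEY.

DATA(itab) = VALUE ty_lines(
    ( a = 'a1' b = 'b1' c = 'c1' )
    ( a = 'a1' b = 'b2' c = 'c1' )
    ( a = 'a2' b = 'b1' c = 'c2' )
    ( a = 'a1' b = 'b1' c = 'c2' )
    ( a = 'a2' b = 'b2' c = 'c2' ) ).

" Error table: one entry per member of every group with more than one row.
" idx is the row number of the member in itab, as in the code above.
DATA(errors) = VALUE ty_errors(
    FOR GROUPS <dup_group> OF <dup_line> IN itab
    GROUP BY ( a = <dup_line>-a b = <dup_line>-b size = GROUP SIZE )
    ( LINES OF COND #( WHEN <dup_group>-size > 1 THEN VALUE ty_errors(
        FOR <dup_member> IN GROUP <dup_group> INDEX INTO idx
        ( row = idx error = `Duplicate` ) ) ) ) ).

" Result table: only the rows whose (A, B) combination occurs exactly once.
" A new table is built instead of deleting rows from itab in place.
DATA(unique) = VALUE ty_lines(
    FOR GROUPS <uni_group> OF <uni_line> IN itab
    GROUP BY ( a = <uni_line>-a b = <uni_line>-b size = GROUP SIZE )
    ( LINES OF COND #( WHEN <uni_group>-size = 1 THEN VALUE ty_lines(
        FOR <uni_member> IN GROUP <uni_group> ( <uni_member> ) ) ) ) ).

ASSERT errors = VALUE ty_errors( ( row = 1 error = `Duplicate` )
                                 ( row = 4 error = `Duplicate` ) ).
ASSERT lines( unique ) = 3.
```

Building a second table rather than deleting rows from itab keeps the row numbers in errors valid for the original upload. The sketch groups the table twice for readability; whether that is fast enough for 2 million rows would need to be measured.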
