Python: Remove duplicates and indicate index

Below is a list of nine 3D points (each point is a list of three coordinates):

f = [[10, 20, 0], 
    [40, 20, 30], 
    [20, 0, 30], 
    [10, 10, 0], 
    [30, 10, 10], 
    [20, 0, 30], 
    [20, 10, 20], 
    [10, 10, 0],
    [20, 0, 30]]

Each point has an associated number (index) that indicates the type of the point (this is an assumption):

ic=[1,2,3,2,1,3,2,3,1]

Hence, the previous list could also be written with the type number appended to each point:

f = [[10, 20, 0, 1], 
    [40, 20, 30, 2], 
    [20, 0, 30, 3], 
    [10, 10, 0, 2], 
    [30, 10, 10, 1], 
    [20, 0, 30, 3], 
    [20, 10, 20, 2], 
    [10, 10, 0, 3],
    [20, 0, 30, 1]]

Here is my code:

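uniq = []   # first occurrences, stored as [index, x, y, z]
dup = []    # repeated points, stored as [index, x, y, z, "duplicate"]
count = 0
for i, j, k in f:
    # f.index() returns the position of the first occurrence of this point
    if [f.index([i, j, k]), i, j, k] not in uniq:
        uniq.append([count, i, j, k])
    else:
        dup.append([count, i, j, k, "duplicate"])
    count += 1
uniq.extend(dup)
print(uniq)

# append the type number from ic to every entry
for j in uniq:
    j.append(ic[j[0]])
print(uniq)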

The result I want to obtain is shown below:

Unique part:

index       point         equivalent points    index for same point
  0      [10, 20, 0, 1]           1                   [1]
  1      [40, 20, 30, 2]          1                   [2]
  2      [20, 0, 30, 3]           3                 [3,3,1]
  3      [10, 10, 0, 2]           2                  [2,3]
  4      [30, 10, 10, 1]          1                   [1]
  6      [20, 10, 20, 2]          1                   [2]

Duplicate part:

index       point         Duplicate or not
  5      [20, 0, 30, 3]       duplicate
  7      [10, 10, 0, 3]       duplicate
  8      [20, 0, 30, 1]       duplicate

My code is intended to pick out the duplicated points and report their indices in the list. In addition, for each point in the unique part, I need the number of equivalent points and the type numbers of those equivalent points.

How can I revise it?

I'm not sure I follow where you're getting your index points, but here's how I'd count the duplicates. Counter needs hashable items, and lists are not hashable, so first convert the sublists to tuples, then use collections.Counter to count them:

import pprint # do your imports first
import collections


f = [[10, 20, 0], [40, 20, 30], [20, 0, 30], [10, 10, 0], [30, 10, 10], [20, 0, 30], [20, 10, 20], [10, 10, 0], [20, 0, 30]]
t = [tuple(i) for i in f] # we need immutable datatypes to count

counts = collections.Counter(t)
pprint.pprint(dict(counts)) # plain dict, so pprint shows it sorted by key

prints

{(10, 10, 0): 2,
 (10, 20, 0): 1,
 (20, 0, 30): 3,
 (20, 10, 20): 1,
 (30, 10, 10): 1,
 (40, 20, 30): 1}

As you may intuit, Counter is a subclass of dict and has all the normal dict methods.
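
As a small illustration of that point, the sketch below treats a Counter like an ordinary dict and also uses one Counter-specific method. The variable names (points, n_distinct, missing, top_two) are made up for this example and are not part of the answer above.

```python
import collections

# Same nine points as in the question, already converted to tuples.
points = [(10, 20, 0), (40, 20, 30), (20, 0, 30), (10, 10, 0), (30, 10, 10),
          (20, 0, 30), (20, 10, 20), (10, 10, 0), (20, 0, 30)]
counts = collections.Counter(points)

# Ordinary dict operations work on a Counter.
n_distinct = len(counts)         # number of distinct points
n_repeats = counts[(20, 0, 30)]  # lookup by key

# Unlike a plain dict, a missing key gives 0 instead of raising KeyError.
missing = counts[(0, 0, 0)]

# Counter-specific helper: the n most common entries, most frequent first.
top_two = counts.most_common(2)

print(n_distinct, n_repeats, missing)
print(top_two)
```

Here most_common(2) returns the two repeated points together with their counts, which is another way to spot the duplicates.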

To get your uniques and dupes:

uniques = [k for k, v in counts.items() if v == 1]

which returns (in first-seen order on Python 3.7 and later)

[(10, 20, 0), (40, 20, 30), (30, 10, 10), (20, 10, 20)]

and

dupes = [k for k, v in counts.items() if v > 1]

returns

[(20, 0, 30), (10, 10, 0)]
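
The counts alone do not say where each point sits in the original list. If the positions matter, one possible approach (a sketch that is not part of the answer above; the names positions, first_seen and repeats are hypothetical) is to group the indices by point with a defaultdict:

```python
import collections

f = [[10, 20, 0], [40, 20, 30], [20, 0, 30], [10, 10, 0], [30, 10, 10],
     [20, 0, 30], [20, 10, 20], [10, 10, 0], [20, 0, 30]]

# Map each point (as a hashable tuple) to every index where it occurs.
positions = collections.defaultdict(list)
for index, point in enumerate(f):
    positions[tuple(point)].append(index)

# The first index of each point marks the "unique" entry ...
first_seen = {point: idxs[0] for point, idxs in positions.items()}
# ... and every later index is a duplicate.
repeats = sorted(i for idxs in positions.values() for i in idxs[1:])

print(first_seen)
print(repeats)
```

This makes a single pass over f, whereas calling f.index() or f.count() inside a loop rescans the list every time.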
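
Alternatively, you can keep your own approach. Starting from the uniq and dup lists built by the loop in the question (before uniq.extend(dup) is called), add the type number, the count, and the list of type numbers to each entry:

for j in uniq + dup:
    if "duplicate" not in j:
        # unique entry: add its type, the number of equal points in f, and a list of types
        j += ic[j[0]], f.count(j[1:4]), [ic[j[0]]]
    else:
        # duplicate entry: add its type only
        j.append(ic[j[0]])

# collect the type of every duplicate in the matching unique entry
for i in dup:
    for j in uniq:
        if i[1:4] == j[1:4]:
            j[-1].append(i[-1])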

After this, dup is

[[5, 20, 0, 30, 'duplicate', 3], [7, 10, 10, 0, 'duplicate', 3], [8, 20, 0, 30, 'duplicate', 1]]

and uniq is

[[0, 10, 20, 0, 1, 1, [1]], [1, 40, 20, 30, 2, 1, [2]], [2, 20, 0, 30, 3, 3, [3, 3, 1]], [3, 10, 10, 0, 2, 2, [2, 3]], [4, 30, 10, 10, 1, 1, [1]], [6, 20, 10, 20, 2, 1, [2]]]

This will add the count to each sublist without changing your original structure.
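
For comparison, here is a minimal sketch that builds both parts of the desired table in one pass with a plain dict keyed by the point. The record layout (the keys "index", "point", "count" and "types") and the names groups, duplicates and unique are assumptions made for this example, not something taken from the question or the answers.

```python
f = [[10, 20, 0], [40, 20, 30], [20, 0, 30], [10, 10, 0], [30, 10, 10],
     [20, 0, 30], [20, 10, 20], [10, 10, 0], [20, 0, 30]]
ic = [1, 2, 3, 2, 1, 3, 2, 3, 1]

groups = {}      # point tuple -> record for its first occurrence
duplicates = []  # one record for every later occurrence

for index, (point, kind) in enumerate(zip(f, ic)):
    key = tuple(point)  # lists are not hashable, tuples are
    if key not in groups:
        groups[key] = {"index": index, "point": point + [kind],
                       "count": 1, "types": [kind]}
    else:
        groups[key]["count"] += 1
        groups[key]["types"].append(kind)
        duplicates.append({"index": index, "point": point + [kind]})

unique = list(groups.values())  # first-seen order on Python 3.7+

for row in unique:
    print(row["index"], row["point"], row["count"], row["types"])
for row in duplicates:
    print(row["index"], row["point"], "duplicate")
```

Because point + [kind] creates a new list, the original f is left untouched.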
