Remove duplicates from a list in Python

Question

The code below fetches each page with a GET request and appends the resulting rows to the list RESULT:

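import pandas as pd

RESULT = []
for i in url:  # url is an iterable of page URLs
    df = pd.read_html(i, header=0)[0]  # first table on the page
    df = df.values.tolist()  # as_matrix() was removed in pandas 1.0
    for item in df:
        RESULT.append(item)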

I use the code below to exclude duplicate entries:

def unique_items(RESULT):
    found = set()  # first-column values seen so far
    for item in RESULT:
        if item[0] not in found:
            yield item
            found.add(item[0])

NOT_DUBLICATE = list(unique_items(RESULT))
print(NOT_DUBLICATE)

This seems suboptimal to me, because I first have to collect all the rows in a list before I can exclude the duplicates.

How can I detect duplicates before appending a row to the list RESULT?

For example, these are the rows I write to the list RESULT:

[[55323602, 'system'],
 [55323603, 'system']]
[[55323602, 'system'],
 [55323603, 'system']]

Answer 1

Instead of using a separate function to exclude duplicate entries, append an item only if it is not already in the list RESULT. Then you don't need unique_items() at all.

You can detect duplicates before appending a row to the list RESULT like this:

for i in url:
    df = pd.read_html(i, header=0)[0]
    df = df.values.tolist()  # as_matrix() was removed in pandas 1.0
    for item in df:
        if item not in RESULT:  # skip rows that are already stored
            RESULT.append(item)
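
One thing to keep in mind: `item not in RESULT` scans the whole list on every check, so the loop slows down as RESULT grows. A minimal sketch of one alternative, assuming every value in a row is hashable, keeps a companion set of the rows already seen. The names append_unique and seen are hypothetical, and the sample pages stand in for the tables returned by pd.read_html.

```python
def append_unique(rows, result, seen):
    """Append each row to result unless an equal row was already added."""
    for row in rows:
        key = tuple(row)  # lists are not hashable, tuples are
        if key not in seen:
            seen.add(key)
            result.append(row)


# Hypothetical stand-ins for the row lists parsed from two pages.
pages = [
    [[55323602, 'system'], [55323603, 'system']],
    [[55323602, 'system'], [55323603, 'system']],
]

RESULT = []
seen = set()
for rows in pages:
    append_unique(rows, RESULT, seen)

print(RESULT)  # [[55323602, 'system'], [55323603, 'system']]
```

To treat rows as duplicates when only the first column matches, as unique_items() in the question does, use row[0] as the key instead of tuple(row).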

Answer 2

Just use a set instead of a list.

result = set()
for i in url:
    df = pd.read_html(i, header=0)[0]
    df_list = df.values.tolist()  # as_matrix() was removed in pandas 1.0
    for item in df_list:
        result.add(tuple(item))  # lists are unhashable, so store tuples

The code above excludes any duplicates. The only difference from your version is that the elements of result are tuples instead of lists.

At the end, you can convert the set back to a list with:

result = list(result)
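
Note that a set does not keep insertion order, so list(result) may return the rows in a different order than they were read. If the order matters, one possible variant (a sketch, not part of the answer above) uses dict keys instead, which preserve insertion order in Python 3.7 and later. The sample rows are hypothetical stand-ins for the df_list values.

```python
# Hypothetical rows standing in for the df_list values from several pages.
rows = [
    [55323602, 'system'],
    [55323603, 'system'],
    [55323602, 'system'],
    [55323603, 'system'],
]

# dict.fromkeys keeps the first occurrence of each key, in insertion order.
unique = dict.fromkeys(tuple(row) for row in rows)

# Convert the tuple keys back to lists.
result = [list(key) for key in unique]
print(result)  # [[55323602, 'system'], [55323603, 'system']]
```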

solution1
1 ACCEPTED 2018-05-07 13:55:27

solution2
1 2018-05-07 13:57:23
