
Getting error while using itertools in Python

This is a continuation of OP1 and OP2.

Specifically, the objective is to remove duplicates when more than one dict has the same value for the key paper_title.

However, the groupby line throws an error when the input list is inconsistent, that is, when it contains a mix of dict and str elements:

TypeError: string indices must be integers

The complete code that reproduces the error is below:

from itertools import groupby


def extract_secondary():
    # Four dicts followed by a plain str ('all_result')
    test_list = [{"paper_title": 'This is duplicate', 'Paper_year': 2},
                 {"paper_title": 'This is duplicate', 'Paper_year': 3},
                 {"paper_title": 'Unique One', 'Paper_year': 3},
                 {"paper_title": 'Unique two', 'Paper_year': 3}, 'all_result']
    f = lambda x: x["paper_title"]
    already_removed = [next(g) for k, g in groupby(sorted(test_list, key=f), key=f)]


extract_secondary()

May I know which part of the code needs further tweaks? Appreciate any insight.

PS: Please let me know if this thread is considered a duplicate of OP1. However, I believe it merits its own thread because the issue is distinct.

Thanks to @Chris for pointing out that test_list contains a str ("all_result") in addition to the dicts.

sorted raises the error because the key function f cannot index a str with "paper_title", so the str needs to be removed from the list before sorting.
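To see the failure in isolation, here is a minimal sketch that calls the same key function on a dict and on a str. The variable names title and error_message are only illustrative and are not part of the original code.

```python
# Same key function as in the question
f = lambda x: x["paper_title"]

# Works for a dict: looks up the key
title = f({"paper_title": "Unique One", "Paper_year": 3})

# Fails for a str: a str can only be indexed with integers or slices
try:
    f("all_result")
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(title)
print(error_message)
```

sorted calls f on every element, so a single str anywhere in the list is enough to abort the whole sort.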

As in OP, the str can be removed with

list(filter('all_result'.__ne__, test_list))

Note that, in this case, the only str value in the list is 'all_result'.
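If the stray values are not guaranteed to be exactly 'all_result', filtering by type may be more robust. The sketch below is an alternative to the filter call above, not the approach from the linked post; the second marker string and the name dicts_only are made up for illustration. It also avoids relying on 'all_result'.__ne__ returning NotImplemented for dicts and that value being treated as truthy, which has been deprecated since Python 3.9.

```python
test_list = [
    {"paper_title": "This is duplicate", "Paper_year": 2},
    {"paper_title": "Unique One", "Paper_year": 3},
    "all_result",
    "some_other_marker",  # hypothetical extra stray string
]

# Keep only the dict entries, whatever the stray values happen to be
dicts_only = [item for item in test_list if isinstance(item, dict)]

print(dicts_only)
```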

The complete code is then:

from itertools import groupby


def extract_secondary():
    test_list = [{"paper_title": 'This is duplicate', 'Paper_year': 2},
                 {"paper_title": 'This is duplicate', 'Paper_year': 3},
                 {"paper_title": 'Unique One', 'Paper_year': 3},
                 {"paper_title": 'Unique two', 'Paper_year': 3}, 'all_result', 'all_result']
    # Drop every 'all_result' marker so that only dicts remain
    test_list = list(filter('all_result'.__ne__, test_list))
    f = lambda x: x["paper_title"]
    # groupby only merges adjacent items, so sort by the same key first
    already_removed = [next(g) for k, g in groupby(sorted(test_list, key=f), key=f)]
    return already_removed


print(extract_secondary())
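As a possible alternative, the same de-duplication can be done in one pass without sorting, by remembering which titles have already been seen. This is a sketch of a different technique (a seen set instead of sorted plus groupby), and dedupe_by_title is a hypothetical helper name, not something from the linked posts.

```python
def dedupe_by_title(items):
    """Keep the first dict seen for each paper_title, in original order."""
    seen = set()
    unique = []
    for item in items:
        if not isinstance(item, dict):
            continue  # skip stray non-dict entries such as 'all_result'
        title = item["paper_title"]
        if title not in seen:
            seen.add(title)
            unique.append(item)
    return unique


test_list = [
    {"paper_title": "This is duplicate", "Paper_year": 2},
    {"paper_title": "This is duplicate", "Paper_year": 3},
    {"paper_title": "Unique One", "Paper_year": 3},
    {"paper_title": "Unique two", "Paper_year": 3},
    "all_result",
    "all_result",
]

result = dedupe_by_title(test_list)
print(result)
```

Unlike the groupby version, which returns the entries sorted by title, this keeps them in their original order. Both keep the first entry for each title, since sorted is stable.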
