
Finding number of repeats inside a list in a file with Python

I need to find the number of times an entry in the list repeats consecutively, that is, on consecutive lines of a file. For example, consider the following file:

"hello hello [A B C]"
"my world [D C F L]"
"tick tock [A L]"

In this file, the number of times C repeats is 2.
A is not counted, because it does not repeat consecutively.

I am not sure about using re, as it wouldn't tell me whether an entry repeats consecutively. Any help would be appreciated.

The simplest way is to use re to parse the file.

A regular expression that could work: \[([A-Z]\s)+[A-Z]\]
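As a quick check of that pattern, here is a minimal sketch that runs it over the three sample lines from the question. It assumes the intended pattern uses the character range A-Z with single backslashes, and that the surrounding quotes in the question are not part of the file; the names BRACKET_LIST and sample_lines are only illustrative.

```python
import re

# Assumed pattern: one or more "letter + whitespace" pairs, then a final letter,
# all wrapped in square brackets. (?:...) is a non-capturing group.
BRACKET_LIST = re.compile(r"\[(?:[A-Z]\s)+[A-Z]\]")

sample_lines = [
    "hello hello [A B C]",
    "my world [D C F L]",
    "tick tock [A L]",
]

# search() finds the first bracketed list on each line.
found = [BRACKET_LIST.search(line).group(0) for line in sample_lines]
print(found)  # ['[A B C]', '[D C F L]', '[A L]']
```

Note that this pattern needs at least two entries inside the brackets, so a line ending in "[A]" would not match.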

Then, with the list of "list strings" (e.g. ["[A B C]", "[F G R]"]), convert each one to a list.

Each string must first be reduced to its bare letters, so that "[A B C]" becomes "ABC": remove the spaces and the [] from each one.

converted_string_list = list(str_list)

Printing converted_string_list then gives a list like this one for a string such as "ADF":

['A', 'D', 'F']

Then merge all the lists and find the duplicates.

This is a straightforward solution! I am sure a better solution exists.
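The clean-up, conversion and merge steps described above might look roughly like this. It is only a sketch: list_strings stands in for whatever the regex step returned, and collections.Counter is one possible way to "find duplicates", not something the answer prescribes.

```python
from collections import Counter

# Hypothetical input: the bracketed strings extracted from each line.
list_strings = ["[A B C]", "[D C F L]", "[A L]"]

merged = []
for s in list_strings:
    cleaned = s.strip("[]").replace(" ", "")  # "[A B C]" -> "ABC"
    merged.extend(list(cleaned))               # "ABC" -> ['A', 'B', 'C']

# Keep only the entries that occur more than once in the merged list.
occurrences = Counter(merged)
duplicates = {item: n for item, n in occurrences.items() if n > 1}
print(duplicates)  # {'A': 2, 'C': 2, 'L': 2}
```

Because the lists are merged before counting, A shows up as a duplicate even though its two occurrences are not on consecutive lines, so this approach alone does not cover the "consecutive" requirement.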

For counting the duplicates once you get them into a list:

initial_length = len(my_list)
new_length = len(set(my_list))
duplicates = initial_length - new_length
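For illustration, here is that calculation applied to the entries of the sample file, assuming they have already been merged into one flat list (the values of my_list are written out by hand here):

```python
# Entries of the three sample lines, merged in file order.
my_list = ['A', 'B', 'C', 'D', 'C', 'F', 'L', 'A', 'L']

initial_length = len(my_list)       # 9 entries in total
new_length = len(set(my_list))      # 6 distinct entries: A B C D F L
duplicates = initial_length - new_length
print(duplicates)  # 3
```

The result is the total number of surplus entries (one each for A, C and L). It does not say which entries repeat, and it ignores whether the repeats are consecutive.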

The function below tracks the items from the first line and counts how many consecutive lines each one appears on:

def find_repeats_in_list(lines):
    # get lists from every line
    all_items = []
    for line in lines:
        open_bracket = line.index('[')
        close_bracket = line.index(']')
        items = line[open_bracket+1:close_bracket].split()
        all_items.append(items)

    # initialize dictionaries to hold consecutive counts
    counts = dict()
    final = dict()
    if not all_items:                     # no lines, nothing to count
        return final

    # seed counts with list from first line
    for item in all_items[0]:
        counts[item] = 1

    # check for first line list items in subsequent lines
    for items in all_items[1:]:
        remove = []                       # reset once per line, not once per item
        for counted in counts:
            if counted not in items:      # not in current line, stop counting
                remove.append(counted)
                if counts[counted] > 1:   # but put in final if more than one
                    final[counted] = counts[counted]
        for item in remove:               # delete after the loop, not during it
            del counts[item]
        for item in set(items):           # now increment anything consecutive
            if item in counts:
                counts[item] += 1

    # items still being counted after the last line also belong in final
    for item, count in counts.items():
        if count > 1:
            final[item] = count
    return final
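The function above only follows items that appear on the first line. A variant that also picks up runs starting on later lines could be sketched as follows; longest_consecutive_runs is a hypothetical name, and it takes the already-split lists rather than raw lines.

```python
def longest_consecutive_runs(item_lists):
    """Map each item to its longest run of consecutive lists (runs > 1 only)."""
    current = {}   # item -> length of the run ending at the current list
    longest = {}   # item -> longest run seen so far
    for items in item_lists:
        present = set(items)
        # A run continues only for items present in this list; all others reset.
        current = {item: current.get(item, 0) + 1 for item in present}
        for item, run in current.items():
            if run > 1 and run > longest.get(item, 0):
                longest[item] = run
    return longest

lists = [["A", "B", "C"], ["D", "C", "F", "L"], ["A", "L"]]
print(sorted(longest_consecutive_runs(lists).items()))  # [('C', 2), ('L', 2)]
```

On the sample data this reports L as well as C, since L appears on lines 2 and 3. The question does not say whether that is wanted, so treat it as one possible reading of "consecutive".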
