
Count number of occurrences of list of tuples

I need to write a function that takes 3 arguments: data, year_start, and year_end.

The data is a list of tuples. year_start and year_end are inputs by the user.

The function needs to count the records in data whose year, stored at position [0], falls within the range from year_start to year_end.

I need to generate lists of tuples for earthquake_count_by_year = [] and total_damage_by_year = [], in the format [(year, value), (year, value)], with one entry for each year in the range.

Here's what I have:

def summary_statistics(data, year_start, year_end):
    earthquake_count_by_year = []
    total_damages_by_year = []
    casualties_by_year = []
    count = 0
    years = []
    year_start = int(year_start)
    year_end = int(year_end)
    
    if year_end >= year_start:
        # store range of years into list
        years = list(range(year_start, year_end+1))
        for index, tuple in enumerate(data):
            if tuple[0] in years:
                count[tuple[0]] += 1
        print(count)

The above is just my attempt to count the number of occurrences in the input for each year. I feel like if I can get this much, I can figure out the rest.

Here is the input for data:

[(2020, 1, 6.0, 'CHINA:  XINJIANG PROVINCE', 39.831, 77.106, 1, 0, 2, 0), (2020, 1, 6.7, 'TURKEY:  ELAZIG AND MALATYA PROVINCES', 38.39, 39.081, 41, 0, 1600, 0), (2018, 1, 7.7, 'CUBA: GRANMA;  CAYMAN IS;  JAMAICA', 19.44, -78.755, 0, 0, 0, 0), (2019, 2, 6.0, 'TURKEY: VAN;  IRAN', 38.482, 44.367, 10, 0, 60, 0), (2018, 3, 5.4, 'BALKANS NW:  CROATIA:  ZAGREB', 45.897, 15.966, 1, 0, 27, 6000.0), (2020, 3, 5.7, 'USA: UTAH', 40.751, -112.078, 0, 0, 0, 48.5), (2020, 3, 7.5, 'RUSSIA:  KURIL ISLANDS', 48.986, 157.693, 0, 0, 0, 0)]

Expected output for list_of_earthquake_count_by_year(data, 2018, 2020):

[(2020, 3), (2019, 0), (2018, 2)]

Ultimately, the rest of what I need is: casualties_by_year(data, 2018, 2020):

(year, (total_deaths, total_missing, total_injured))

Which ends up in:

L = [[earthquake_count_by_year], [casualties_by_year]]
return L

Any suggestion is appreciated.

for item in data:
    if year_start <= item[0] <= year_end:
        # this year is in the range, so update the tallies for item[0] here
        pass
The line count = 0 initializes count as an integer, but the line count[tuple[0]] += 1 treats it as a dictionary, which is the source of the problem. Initialize count as a dictionary instead:

count = {}

Because count is now a dictionary, the body of the loop needs a small change:

if tuple[0] in years:
    # If the key does not exist in the dictionary, create one
    if tuple[0] not in count:
        count[tuple[0]] = 0

    count[tuple[0]] += 1
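The explicit key check can also be left to the standard library. The sketch below uses collections.Counter, which returns 0 for keys it has not seen; the helper name tally_years and the sample records are hypothetical.

```python
from collections import Counter


def tally_years(data, year_start, year_end):
    """Tally how many records fall in each year of the inclusive range."""
    # Counter returns 0 for missing keys, so no "if key not in count" is needed.
    return Counter(
        record[0] for record in data if year_start <= record[0] <= year_end
    )


# Hypothetical records: (year, magnitude)
sample = [(2020, 6.0), (2020, 6.7), (2018, 7.7), (2021, 5.0)]
count = tally_years(sample, 2018, 2020)
print(count[2020], count[2019])  # 2 0
```

Looking up count[2019] yields 0 without adding a 2019 key, so years with no records still have to be added explicitly when building the final list.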

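The counts are then stored in the count dictionary, keyed by year. Note that the loop above only creates a key when a matching record is found, so a year with no records (such as 2019: 0 below) appears only if every year in the range is initialized to 0 before the loop. Using the counts from the expected output: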

{
    2020: 3,
    2018: 2,
    2019: 0
}

Finally, convert the dictionary to a list of tuples with items(). The tuples come out in the order the keys were inserted, so sort the list if you need a specific year order:

list_of_tuples = list(count.items())    # Returns list of tuples
return list_of_tuples
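Putting the pieces together, here is one possible sketch of the whole function. It rests on assumptions the question does not confirm: that positions 6, 7, and 8 of each record hold deaths, missing, and injured (a guess from the sample data), and that the result is a two-element list holding the two lists of tuples.

```python
def summary_statistics(data, year_start, year_end):
    """Return [earthquake_count_by_year, casualties_by_year] for the range."""
    year_start, year_end = int(year_start), int(year_end)
    years = range(year_end, year_start - 1, -1)  # newest year first
    counts = {year: 0 for year in years}
    # ASSUMPTION: positions 6, 7, 8 are deaths, missing, injured.
    casualties = {year: [0, 0, 0] for year in years}
    for record in data:
        year = record[0]
        if year in counts:
            counts[year] += 1
            for i, value in enumerate(record[6:9]):
                casualties[year][i] += value
    earthquake_count_by_year = list(counts.items())
    casualties_by_year = [(y, tuple(t)) for y, t in casualties.items()]
    return [earthquake_count_by_year, casualties_by_year]


# Three records copied from the question's sample data.
sample = [
    (2020, 1, 6.0, 'CHINA:  XINJIANG PROVINCE', 39.831, 77.106, 1, 0, 2, 0),
    (2020, 1, 6.7, 'TURKEY:  ELAZIG AND MALATYA PROVINCES', 38.39, 39.081, 41, 0, 1600, 0),
    (2019, 2, 6.0, 'TURKEY: VAN;  IRAN', 38.482, 44.367, 10, 0, 60, 0),
]
result = summary_statistics(sample, 2019, 2020)
print(result[0])  # [(2020, 2), (2019, 1)]
print(result[1])  # [(2020, (42, 0, 1602)), (2019, (10, 0, 60))]
```

Seeding both dictionaries from the same descending range keeps the two lists aligned and relies on dictionaries preserving insertion order (Python 3.7 and later).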


 