
How to find the min and max values of an index in nested lists with matching strings?

I have a text file that contains tide data. The first few lines are:

Thursday 4 January,11.58,0.38 meters

Thursday 4 January,16.95,0.73 meters

Thursday 4 January,23.68,0.02 meters

Friday 5 January,6.48,0.83 meters

Friday 5 January,12.42,0.33 meters

etc. and it continues for 5 more days of data.

I've turned this data into a nested list so it now prints as:

[['Thursday 4 January', 11.58, 0.38], ['Thursday 4 January', 16.95, 0.73], ['Thursday 4 January', 23.68, 0.02], ['Friday 5 January', 6.48, 0.83], ['Friday 5 January', 12.42, 0.33].....]

And so on for every line of data in the file, with the first index of each list being the day in a string, and the 2nd and 3rd index being floats.
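For reference, here is a minimal sketch of how lines in that format could be turned into such a nested list. The question does not show the actual parsing code, so the helper name `parse_tide_line` is made up for illustration, and it assumes every record looks exactly like `day,time,height meters`.

```python
def parse_tide_line(line):
    """Split 'Thursday 4 January,11.58,0.38 meters' into [day, time, height]."""
    day, time, height = line.strip().split(',')
    # The height field carries a trailing ' meters' unit, so keep only the number.
    return [day, float(time), float(height.split()[0])]

# A few sample records in the same format as the file (hypothetical stand-in for the file contents).
sample_lines = [
    'Thursday 4 January,11.58,0.38 meters',
    '',
    'Thursday 4 January,16.95,0.73 meters',
    'Friday 5 January,6.48,0.83 meters',
]

# Skip any blank lines before parsing.
split_tides = [parse_tide_line(line) for line in sample_lines if line.strip()]
print(split_tides)
```

With a real file, the same list comprehension could iterate over the open file object instead of `sample_lines`.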

For each group of nested lists with a matching [0] index (the day), I need to find the lowest and highest float at the [2] index (the tide height) and print these to the screen for each day.

for example:

Thursday 4 January: 0.02 meters at lowest and 0.73 meters at highest
Friday 5 January: 0.33 meters at lowest and 0.83 meters at highest

Since the lists are already in order by day, I was thinking of just counting the matching names and then comparing them manually. For example, since there are 3 entries for 'Thursday 4 January', I would compare split_tides[0][2] through split_tides[2][2] to find the min and max, and repeat for each day. But I thought there must be a way to automatically compare nested lists that have a matching string.

Thanks for the help.

EDIT: Image of text file


The snippet below should help:

from pandas import DataFrame

data = [['Thursday 4 January', 11.58, 0.38], ['Thursday 4 January', 16.95, 0.73], ['Thursday 4 January', 23.68, 0.02],
         ['Friday 5 January', 6.48, 0.83], ['Friday 5 January', 12.42, 0.33]]

df = DataFrame.from_records(data)

df.columns = ["Date", "Value", "Height"]

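# Highest and lowest tide height for each day.
# print() is needed when this runs as a script rather than in an interactive session.
print(df.groupby(['Date'])['Height'].max())
print(df.groupby(['Date'])['Height'].min())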

Output:

Date
Friday 5 January      0.83
Thursday 4 January    0.73

Date
Friday 5 January      0.33
Thursday 4 January    0.02
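As a possible variation on the same idea, both values can be computed in one pass with `agg`, and `sort=False` keeps the days in order of first appearance rather than alphabetical order. This is only a sketch using the sample rows from the question; the name `summary` is an assumption introduced here, and the column names are the ones chosen in the snippet above.

```python
from pandas import DataFrame

data = [['Thursday 4 January', 11.58, 0.38], ['Thursday 4 January', 16.95, 0.73],
        ['Thursday 4 January', 23.68, 0.02],
        ['Friday 5 January', 6.48, 0.83], ['Friday 5 January', 12.42, 0.33]]

df = DataFrame(data, columns=["Date", "Value", "Height"])

# One row per day, with a 'min' and a 'max' column for the tide height.
# sort=False keeps the days in the order they first appear in the data.
summary = df.groupby("Date", sort=False)["Height"].agg(["min", "max"])

# Print one line per day in the format asked for in the question.
for day, row in summary.iterrows():
    print('{}: {} meters at lowest and {} meters at highest'.format(day, row['min'], row['max']))
```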

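You can use itertools.groupby to group the rows by date, then sort each group by the height at index 2. The first and last items of the sorted group are then the lowest and highest tides. Sample code: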

    from itertools import groupby
    from operator import itemgetter

    data = [['Thursday 4 January', 11.58, 0.38], ['Thursday 4 January', 16.95, 0.73], ['Thursday 4 January', 23.68, 0.02],
         ['Friday 5 January', 6.48, 0.83], ['Friday 5 January', 12.42, 0.33]]

    for k, g in groupby(data, key=itemgetter(0)):
        a = sorted(g, key=itemgetter(2))
        print('{}: {} meters at lowest and {} meters at highest'.format(k, a[0][2], a[-1][2]))

output:

Thursday 4 January: 0.02 meters at lowest and 0.73 meters at highest
Friday 5 January: 0.33 meters at lowest and 0.83 meters at highest

By the way, itertools.groupby only groups consecutive items that share a key, so make sure your data is ordered by date before grouping on the date.
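To illustrate that caveat, here is a small sketch with the same sample rows deliberately shuffled (the names `shuffled`, `ordered` and `summary` are made up for this example). Without ordering, `groupby` starts a new group every time the day changes; sorting on the day string first makes equal days contiguous, at the cost of listing the days alphabetically rather than chronologically.

```python
from itertools import groupby
from operator import itemgetter

# Same records as above, but shuffled so that equal days are no longer adjacent.
shuffled = [['Thursday 4 January', 11.58, 0.38], ['Friday 5 January', 6.48, 0.83],
            ['Thursday 4 January', 23.68, 0.02], ['Friday 5 January', 12.42, 0.33],
            ['Thursday 4 January', 16.95, 0.73]]

# groupby only merges consecutive rows, so every change of day opens a new group.
naive_keys = [day for day, group in groupby(shuffled, key=itemgetter(0))]
print(len(naive_keys))  # 5 groups instead of the expected 2

# Sorting on the day string first puts equal days next to each other.
ordered = sorted(shuffled, key=itemgetter(0))

summary = {}
for day, group in groupby(ordered, key=itemgetter(0)):
    heights = [row[2] for row in group]
    summary[day] = (min(heights), max(heights))

print(summary)
```

If the chronological order of the days matters, grouping into a dict keyed by day avoids the sort altogether.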

Hope that will help you, and comment if you have further questions. : )

from collections import defaultdict

data = [['Thursday 4 January', 11.58, 0.38], ['Thursday 4 January', 16.95, 0.73], ['Thursday 4 January', 23.68, 0.02], ['Friday 5 January', 6.48, 0.83], ['Friday 5 January', 12.42, 0.33]]

# Collect every tide height (index 2) under its day (index 0).
# Index 1 is the time of day, so it must not be used for the min/max.
heights = defaultdict(list)
for day, _time, height in data:
    heights[day].append(height)

print(dict(heights))
"""
structure data for future use

{'Thursday 4 January': [0.38, 0.73, 0.02], 'Friday 5 January': [0.83, 0.33]}
"""

# Build one summary line per day from the lowest and highest height.
result = []
for day in heights:
    result.append('{}: {} meters at lowest and {} meters at highest'.format(day, min(heights[day]), max(heights[day])))

print(result)

"""output

['Thursday 4 January: 0.02 meters at lowest and 0.73 meters at highest', 'Friday 5 January: 0.33 meters at lowest and 0.83 meters at highest']
"""
