
Finding maximum temperature for every month in a csv file?

I need some help. I have a large CSV file (more than 8785 rows).

What I basically need is each month's maximum temperature. For instance (output):

Month Max Temperature

January 5.3
February 6.1
March 25.5
...

I wrote this:

temp = open("weather_2012.csv","r")
total = 0
maxt = 0.0

for line in temp:
    try:
        p = float(line.split(",")[1])
        total += 1
        maxt = max(maxt,p)
    except:
        pass

print("Maximum:",maxt)

But it only gives a single maximum temperature across all months (the overall maximum):

Maximum: 33.0

Here's an approach I think works well, because it avoids hardcoding most values into the code (so it works for any year) and uses locale-specific month names:

from calendar import month_name
import csv
from datetime import datetime
import sys

filename = 'weather_2012.csv'
max_temps = [-sys.maxsize] * 13  # has extra [0] entry

with open(filename, 'r', newline='') as csvfile:
    reader = csv.reader(csvfile)
    next(reader)  # skip header row
    for date, high_temp, *_ in reader:
        month = datetime.strptime(date, '%Y-%m-%d %H:%M:%S').month
        max_temps[month] = max(max_temps[month], float(high_temp))

print('Monthly Max Temperatures\n')
longest = max(len(month) for month in month_name)  # length of longest month name
for month, temp in enumerate(max_temps[1:], 1):
    print('{:>{width}}: {:5.1f}'.format(month_name[month], temp, width=longest))

Output:

Monthly Max Temperatures

  January:   5.3
 February:   6.1
    March:  25.5
    April:  27.8
      May:  31.2
     June:  33.0
     July:  33.0
   August:  32.8
September:  28.4
  October:  21.1
 November:  17.5
 December:  11.9

You need to find not one maximum but twelve, one per month. You could start with a list of month names and track the maximum for each month in that list. In your CSV file, the month is at character positions 5 and 6 (counting from zero) of the first field, i.e. the slice [5:7].
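As a quick illustration of that slicing, here is a minimal sketch. The sample string simply follows the file's timestamp format, and the variable names are only illustrative:

```python
# A timestamp in the same format as the first field of each row.
# The variable names here are illustrative, not from the original code.
first_field = "2012-01-01 00:00:00"

# Zero-based positions 5 and 6 hold the two-digit month.
month_digits = first_field[5:7]

# Convert to a number so it can index a list of month names (minus one).
month_number = int(month_digits)

print(month_digits, month_number)
```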

With this data format …

Date/Time,Temp (C),Dew Point Temp (C),Rel Hum (%),Wind Spd (km/h),Visibility (km),Stn Press (kPa),Weather
2012-01-01 00:00:00,-1.8,-3.9,86,4,8.0,101.24,Fog
2012-01-01 01:00:00,-1.8,-3.7,87,4,8.0,101.24,Fog
2012-01-01 02:00:00,-1.8,-3.4,89,7,4.0,101.26,"Freezing Drizzle,Fog"
2012-01-01 03:00:00,-1.5,-3.2,88,6,4.0,101.27,"Freezing Drizzle,Fog"
2012-01-01 04:00:00,-1.5,-3.3,88,7,4.8,101.23,Fog
… to be continued

… you could find the maxima with this program:

month=["January","February","March","April","May","June","July",
       "August","September","October","November","December"]
maxt = {}
with open("weather_2012.csv","r") as temp:
    for line in temp:
        try: # is there valid data in line?
            m0, p0, *junk = line.split(",")
            p = float(p0)
            m = month[int(m0[5:7])-1]
            try: # do we already have data for this month?
                maxt[m] = max(p, maxt[m])
            except KeyError: # first data of this month
                maxt[m] = p
        except (ValueError, IndexError): # skip header or malformed line
            pass

print("Maxima:")        
for m in month:
    print("%s: %g"%(m,maxt[m]))
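The "first data of this month" case can also be handled without a nested try, using dict.get with the current value as the default. Below is a minimal sketch of that variant on a few made-up lines (the sample values are hypothetical, not taken from the real file). Because nothing is initialised to 0.0, a month whose temperatures are all negative still gets the right maximum:

```python
# Hypothetical sample lines in the same layout as the file (values are made up).
sample_lines = [
    "Date/Time,Temp (C)",
    "2012-01-01 00:00:00,-1.8",
    "2012-01-01 01:00:00,2.5",
    "2012-02-01 00:00:00,-4.0",
    "2012-02-01 01:00:00,-6.1",
]

maxt = {}
for line in sample_lines:
    fields = line.split(",")
    try:
        p = float(fields[1])        # temperature
        m = int(fields[0][5:7])     # month number from the timestamp
    except (ValueError, IndexError):
        continue                    # header or malformed line
    # dict.get returns p itself when the month has no entry yet,
    # so the first value of a month becomes its initial maximum.
    maxt[m] = max(p, maxt.get(m, p))

print(maxt)
```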

First you have to group the values by the month found in the first column; then you can find the maximum temperature for each month.

I hope the following code helps:

import csv
months= {
    "01": "January",
    "02": "February",
    "03": "March",
    "04": "April",
    "05": "May",
    "06": "June",
    "07": "July",
    "08": "August",
    "09": "September",
    "10": "October",
    "11": "November",
    "12": "December"
}

results = {}

with open("weather_2012.csv", newline="") as csvfile:
    weather_file = csv.DictReader(csvfile, delimiter=',', quotechar='"')
    for row in weather_file:
        # get the two-digit month, e.g. "01" from "2012-01-01 00:00:00"
        month = row["Date/Time"].split(" ")[0].split("-")[1]
        temp = float(row["Temp (C)"])
        # first value for this month, or a new maximum
        if month not in results or temp > results[month]["max"]:
            results[month] = {"max": temp}

# order by month number and display
print("Max temp by month:")
for month in sorted(results):
    print("%s: %.2f" % (months[month], results[month]["max"]))

Output:

Max temp by month:
January: 5.30
February: 6.10
March: 25.50
April: 27.80
May: 31.20
June: 33.00
July: 33.00
August: 32.80
September: 28.40
October: 21.10
November: 17.50
December: 11.90

Another solution could look like this:

#-*- coding: utf-8 -*-
import csv
import datetime
import itertools
import collections

with open('weather_2012.csv', newline='') as fd:
    reader = csv.DictReader(fd, delimiter=',')
    rows = []
    for row in reader:
        # tag each row with a 'YYYYMM' key, e.g. '201201'
        row['yearmonth'] = datetime.datetime.strptime(row['Date/Time'], '%Y-%m-%d %H:%M:%S').strftime('%Y%m')
        rows.append(row)
# sort them so that groupby sees each month's rows as one consecutive run
rows.sort(key=lambda r: r['yearmonth'])
ans = collections.OrderedDict()
for yearmonth, values in itertools.groupby(rows, lambda r: r['yearmonth']):
    ans[yearmonth] = max(float(r['Temp (C)']) for r in values)

print(ans)

This solution first sorts the data by the yearmonth string and then uses the itertools.groupby function, which only groups consecutive items that share a key.
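To see why the sort matters, here is a small sketch with made-up (yearmonth, temperature) pairs. The helper name max_per_group is hypothetical and only serves the illustration. Without sorting, groupby starts a new group every time the key changes, so a later run for the same month overwrites the earlier one:

```python
from itertools import groupby

# Made-up (yearmonth, temperature) pairs, deliberately out of order.
readings = [("201201", 5.3), ("201202", 6.1), ("201201", -1.8), ("201202", 3.0)]

def max_per_group(pairs):
    """Maximum temperature per consecutive run of equal yearmonth keys."""
    result = {}
    for key, group in groupby(pairs, key=lambda pair: pair[0]):
        # A later run with the same key replaces the earlier entry.
        result[key] = max(temp for _, temp in group)
    return result

unsorted_result = max_per_group(readings)        # runs are split, maxima are wrong
sorted_result = max_per_group(sorted(readings))  # one run per month

print(unsorted_result)
print(sorted_result)
```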


 