Python CSV homework program

Question

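I have a homework assignment that involves reading a file with csv and using functions.

The basic idea is to calculate the rusher rating of football players over two years, using data from a file that was provided to us. A sample of the file is: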

name, ,pos,team,g,rush,ryds,rtd,rtdr,ravg,fum,fuml,fpts,year 
A.J.,Feeley,QB,STL,5,3,4,0,0,1.3,3,2,20.3,2011
Aaron,Brown,RB,DET,1,1,0,0,0,0,0,0,0.9,2011
Aaron,Rodgers,QB,GB,15,60,257,3,5,4.3,4,0,403.4,2011
Adrian,Peterson,RB,MIN,12,208,970,12,5.8,4.7,1,0,188.9,2011
Ahmad,Bradshaw,RB,NYG,12,171,659,9,5.3,3.9,1,1,156.6,2011

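In other words, we have to skip the first line of the file and read the rest of the lines, split on commas.

To calculate the rusher rating, we need:

Yds is the average yardage gained per attempt. This is [total yards / (4.05 * attempts)]. If this number is greater than 2.375, then 2.375 should be used instead.

perTDs is the percentage of touchdowns per carry. This is [(39.5 * touchdowns) / attempts]. If this number is greater than 2.375, then 2.375 should be used instead.

perFumbles is the percentage of fumbles per carry. This is [2.375 - ((21.5 * fumbles) / attempts)].

The rusher rating is [Yds + perTDs + perFumbles] * (100 / 4.5).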
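As a sanity check on the four formulas above, here is a minimal sketch that applies them to Adrian Peterson's 2011 row from the sample (208 rushes, 970 yards, 12 touchdowns, 1 fumble). The function name `rusher_rating` is made up for this illustration, and the sketch assumes attempts is greater than zero.

```python
def rusher_rating(attempts, total_yards, touchdowns, fumbles):
    """Rusher rating from the formulas above (assumes attempts > 0)."""
    # Both Yds and perTDs are capped at 2.375; perFumbles has no cap.
    yds = min(total_yards / (4.05 * attempts), 2.375)
    per_tds = min(39.5 * touchdowns / attempts, 2.375)
    per_fumbles = 2.375 - 21.5 * fumbles / attempts
    return (yds + per_tds + per_fumbles) * (100 / 4.5)

# Adrian Peterson, 2011: rush=208, ryds=970, rtd=12, fum=1
rating = rusher_rating(208, 970, 12, 1)
print("%.2f" % rating)  # prints 126.71
```

Working it by hand gives the same figure: Yds is about 1.151, perTDs about 2.279, perFumbles about 2.272, and their sum times 100/4.5 is about 126.71.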

My code so far:

playerinfo = []
teaminfo10 = []
teaminfo11 = []

import csv

file = raw_input("Enter filename: ")
read = open(file,"rU")
read.readline()
fileread = csv.reader(read)

#Each line is iterated through, and if rush attempts are greater than 10, the
#player may be used for further statistics.
for playerData in fileread:
    if int(playerData[5]) > 10:
    
        attempts = int(playerData[5])
        totalYards = int(playerData[6])
        touchdowns = int(playerData[7])
        fumbles = int(playerData[10])
    
        #Rusher rating for each player is found. This rating, coupled with other
        #data about the player is formatted and appended into a list of players.
        rushRating = ratingCalc(attempts,totalYards,touchdowns,fumbles)
        rusherData = rushFunc(playerData,rushRating)
        playerinfo.append(rusherData)
    
        #Different data about the player is formatted and added to one of two
        #lists of teams, based on year. 
        teamData = teamFunc(playerData)
        if playerData[13] == '2010':
            teaminfo10.append(teamData)
        else:
            teaminfo11.append(teamData)

#The list of players is sorted in order of decreasing rusher rating.
playerinfo.sort(reverse = True)
#The two team lists of players are sorted by team.
teaminfo10.sort()
teaminfo11.sort()

print "The following statistics are only for the years 2010 and 2011."
print "Only those rushers who have rushed more than 10 times are included."
print
print "The top 50 rushers based on their rusher rating in individual years are:"

#50 players, in order of decreasing rusher ratings, are printed along with other
#data.
rushPrint(playerinfo,50)

#A similar list of running backs is created, in order of decreasing rusher
#ratings.
RBlist = []
for player in playerinfo:
    if player[2] == 'RB':
        RBlist.append(player)

print "\nThe top 20 running backs based on their rusher rating in individual \
years are:"
#The top 20 running backs on the RBlist are printed, with other data.
rushPrint(RBlist,20)


#The teams with the greatest overall rusher rating (if their attempts are
#greater than 10) are listed in order of decreasing rusher rating, for both 2010
#and 2011.
teamListFunc(teaminfo10,'2010')

teamListFunc(teaminfo11,'2011')

#The player(s) with the most yardage is printed.
yardsList = mostStat(6,fObj,False)
print "\nThe people who rushed for the most yardage are:"
for item in yardsList:
    print "%s rushing for %d yards for %s in %s."\
    % (item[1],item[0],item[2],item[3])

#The player(s) with the most touchdowns is printed.
TDlist = mostStat(7,fObj,False)
print"\nThe people who have scored the most rushing touchdowns are:"
for item in TDlist:
    print "%s rushing for %d touchdowns for %s in %s."\
    % (item[1],item[0],item[2],item[3])

#The player(s) with the most yardage per rushing attempt is printed.
ypaList = mostStat(6,fObj,True)
print "\nThe people who have the highest yards per rushing attempt with over 10 \
rushes are:"
for item in ypaList:
    print "%s with a %.2f yards per attempt rushing average for %s in %s."\
    % (item[1],item[0],item[2],item[3])

#The player(s) with the most fumbles is printed.
fmblList = mostStat(10,fObj,False)
print"\nThere are %d people with the most fumbles. They are:" % (len(fmblList))
for item in fmblList:
    print "%s with %d fumbles for %s in %s." % (item[1],item[0],item[2],item[3])


def ratingCalc(atts,totalYrds,TDs,fmbls):
    """Calculates rusher rating."""
    yrds = totalYrds / (4.05 * atts)
    if yrds > 2.375:
        yrds = 2.375

    perTDs = 39.5 * TDs / atts
    if perTDs > 2.375:
        perTDs = 2.375

    perFumbles = 2.375 - (21.5 * fmbls / atts)

    rating = (yrds + perTDs + perFumbles) * (100/4.5)

    return rating    

def rushFunc(information,rRating):
    """Formats player info into [rating,name,pos,team,yr,atts]"""
    rusherInfo = []
    rusherInfo.append(rRating)
    name = information[0] + ' ' + information[1]
    rusherInfo.append(name)
    rusherInfo.append(information[2])
    rusherInfo.append(information[3])
    rusherInfo.append(information[13])
    rusherInfo.append(information[5])

    return rusherInfo


def teamFunc(plyrInfo):
    """Formats player info into [team,atts,yrds,TDs,fmbls] for team sorting"""
    teamInfo = []
    teamInfo.append(plyrInfo[3])
    teamInfo.append(plyrInfo[5])
    teamInfo.append(plyrInfo[6])
    teamInfo.append(plyrInfo[7])
    teamInfo.append(plyrInfo[10])

    return teamInfo

def rushPrint(lst,num):
    """Prints players and their data in order of rusher rating."""
    print "Name                           Pos   Year  Attempts   Rating  Team"
    count = 0
    while count < num:
        index = lst[count]
        print "%-30s %-5s %4s  %5s      %3.2f  %s"\
              % (index[1],index[2],index[4],index[5],index[0],index[3])
        count += 1

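So yes, I still have a lot of functions to define. But what do you think of the code so far? Is it inefficient? Can you tell me what is wrong with it? It looks to me like this code is going to be very long (around 300 lines), but the teacher said it should be a relatively short project.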

Answer 1

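Here is a snippet of code that could simplify your whole project considerably.

It may take a little time to understand the task at hand, but in general your life gets easier once you work with the correct data types and "associative arrays" (dicts).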

import csv

reader = csv.DictReader(open('mycsv.txt', 'r'))
#opens the csv file into a dictionary

list_of_players = map(dict, reader)
#puts all the dictionaries (by row) as a separate element in a list. 
#this way, its not a one-time iterator and all your info is easily accessible

for i in list_of_players:
    for stat in ['rush','ryds','rtd','fum','fuml','year']:
        i[stat] = int(i[stat])
    #the above loop makes all the intended integers..integers instead of strings
    for stat in ['fpts','ravg','rtdr']:
        i[stat] = float(i[stat])
    #the above loop makes all the intended floats..floats instead of strings

for i in list_of_players:
    print i['name'], i[' '], i['fpts']
    #now you can easily access and loop through your players with meaningful names
    #using 'fpts' rather than predetermined numbers [5]

This sample code shows how easy it is to work with the players' names and stats (here first name, last name, and fpts):

>>> 
A.J. Feeley 20.3
Aaron Brown 0.9
Aaron Rodgers 403.4
Adrian Peterson 188.9
Ahmad Bradshaw 156.6

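Of course, some tweaking will be needed to get all the requested stats (maximums and so on), but keeping the data types correct from the start makes those tasks much less verbose.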
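For example, a "who has the most of X" statistic could look roughly like the sketch below. It is written for Python 3 (the code above is Python 2), it inlines three of the sample rows so it runs on its own, and `players_with_most` is a hypothetical helper name, not something from the assignment. It returns a list so that ties are kept.

```python
import csv
import io

# Three rows copied from the sample file; the second header really is a
# single space, which is why the last name is read with row[" "].
SAMPLE = """\
name, ,pos,team,g,rush,ryds,rtd,rtdr,ravg,fum,fuml,fpts,year
Aaron,Rodgers,QB,GB,15,60,257,3,5,4.3,4,0,403.4,2011
Adrian,Peterson,RB,MIN,12,208,970,12,5.8,4.7,1,0,188.9,2011
Ahmad,Bradshaw,RB,NYG,12,171,659,9,5.3,3.9,1,1,156.6,2011
"""

def players_with_most(players, stat):
    """Return every player dict tied for the highest value of `stat`."""
    best = max(p[stat] for p in players)
    return [p for p in players if p[stat] == best]

players = []
for row in csv.DictReader(io.StringIO(SAMPLE)):
    # Convert the counting stats to int once, right after reading.
    for stat in ("rush", "ryds", "rtd", "fum"):
        row[stat] = int(row[stat])
    players.append(row)

for p in players_with_most(players, "ryds"):
    print("%s %s rushing for %d yards for %s in %s."
          % (p["name"], p[" "], p["ryds"], p["team"], p["year"]))
```

The same helper works for touchdowns or fumbles by passing `"rtd"` or `"fum"`, so one short function can cover several of the required printouts.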

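With these structures the assignment can be done in fewer than 300 lines, and the more you use Python, the more of its conventional idioms you will pick up. lambda expressions and sorted() are examples of tools you will soon come to love!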
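As a small taste of sorted() with a lambda, here is a sketch (Python 3) that ranks rushers by yards per attempt, keeping only those with more than 10 rushes as the assignment requires. The numbers are hand-copied from the sample rows, and the combined `name` values and the `ypa` key are made up for this illustration.

```python
# Sample rows reduced to the fields needed here, already typed as ints.
players = [
    {"name": "A.J. Feeley", "rush": 3, "ryds": 4},
    {"name": "Aaron Brown", "rush": 1, "ryds": 0},
    {"name": "Aaron Rodgers", "rush": 60, "ryds": 257},
    {"name": "Adrian Peterson", "rush": 208, "ryds": 970},
    {"name": "Ahmad Bradshaw", "rush": 171, "ryds": 659},
]

# Only rushers with more than 10 attempts qualify.
eligible = [p for p in players if p["rush"] > 10]
for p in eligible:
    p["ypa"] = p["ryds"] / p["rush"]  # true division in Python 3

# sorted() returns a new list; the lambda picks the field to sort on.
# Slicing with [:50] just returns fewer items if the list is shorter.
top = sorted(eligible, key=lambda p: p["ypa"], reverse=True)[:50]

for p in top:
    print("%-20s %.2f" % (p["name"], p["ypa"]))
```

Because the slice never raises on a short list, this avoids the index error a counting while loop would hit when fewer than 50 players qualify.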

Solution 1
3 2012-04-05 22:55:01
