
Parsing CSV and Analysing Data

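I'm working through some exercises on Hacker Rank, but I can't figure out why it won't accept my answer.

Here is a link to the original repository.

The goal is to print the name of the team with the smallest difference between its "Goals" and "Goals Allowed" values.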

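There seem to be two candidates, Leicester and Aston Villa: Leicester has the most negative difference between goals scored and goals allowed (30 - 64 = -34), while Aston Villa has the smallest absolute difference (46 - 47 = -1). However, neither of these is accepted.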

Any ideas?

import sys
import os
import csv

text = '''Team,Games,Wins,Losses,Draws,Goals,Goals Allowed,Points
Arsenal,38,26,9,3,79,36,87
Liverpool,38,24,8,6,67,30,80
Manchester United,38,24,5,9,87,45,77
Newcastle,38,21,8,9,74,52,71
Leeds,38,18,12,8,53,37,66
Chelsea,38,17,13,8,66,38,64
West_Ham,38,15,8,15,48,57,53
Aston_Villa,38,12,14,12,46,47,50
Tottenham,38,14,8,16,49,53,50
Blackburn,38,12,10,16,55,51,46
Southampton,38,12,9,17,46,54,45
Middlesbrough,38,12,9,17,35,47,45
Fulham,38,10,14,14,36,44,44
Charlton,38,10,14,14,38,49,44
Everton,38,11,10,17,45,57,43
Bolton,38,9,13,16,44,62,40
Sunderland,38,10,10,18,29,51,40
Ipswich,38,9,9,20,41,64,36
Derby,38,8,6,24,33,63,30
Leicester,38,5,13,20,30,64,28'''

with open('football.csv', 'w') as f:
    f.write(text)



def read_data(filename):
    """Returns a list of lists representing the rows of the csv file data.

    Arguments: filename is the name of a csv file (as a string)
    Returns: list of lists of strings, where every line is split into a list of values.
        ex: ['Arsenal', '38', '26', '9', '3', '79', '36', '87']
    """
    # Open the file that was passed in (not a hard-coded name) and close it when done.
    with open(filename, 'rt') as ifile:
        reader = csv.reader(ifile)

        listed = []
        for row in reader:
            print(row)
            listed.append(row)

    return listed

data = read_data('football.csv')

def get_index_with_min_abs_score_difference(goals):
    net_goals = []

    for i in goals[1:]:
        net_goals.append(int(i[5]) - int(i[6]))

    return net_goals.index(min(net_goals))+1

def get_team(index_value, parsed_data):
    return parsed_data[index_value][0]

footballTable = read_data('football.csv')
minRow = get_index_with_min_abs_score_difference(footballTable)
print(str(get_team(minRow, footballTable)))

I also tried the alternative solution (i.e. the team with the smallest absolute difference between goals scored and goals allowed).

def get_index_with_min_abs_score_difference(goals):
    """Returns the index of the team with the smallest difference
    between 'for' and 'against' goals, in terms of absolute value.

    Arguments: goals is a list of lists of cleaned strings (header row first)
    Returns: integer row index
    """
    net_goals = []

    for i in goals[1:]:
        net_goals.append(abs(int(i[5]) - int(i[6])))

    return net_goals.index(min(net_goals))+1

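This isn't an exact answer, but I have some comments on your solution.

You spend a lot of lines reading a csv file row by row just to put it into a list (which you then process item by item later), and then you need special logic to skip the header row. If you used csv.DictReader instead, and consumed the resulting iterator directly rather than reading it into a list first, the solution would be much simpler. Consider the output of the following: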

with open('football.csv', 'rt') as ifile:
    footballTable = csv.DictReader(ifile)
    for row in footballTable:
        print(row)

This shows you something like the following (the key order depends on your Python version):

{'Draws': '3', 'Wins': '26', 'Losses': '9', 'Goals Allowed': '36', 'Points': '87', 'Games': '38', 'Goals': '79', 'Team': 'Arsenal'}
{'Draws': '6', 'Wins': '24', 'Losses': '8', 'Goals Allowed': '30', 'Points': '80', 'Games': '38', 'Goals': '67', 'Team': 'Liverpool'}
{'Draws': '9', 'Wins': '24', 'Losses': '5', 'Goals Allowed': '45', 'Points': '77', 'Games': '38', 'Goals': '87', 'Team': 'Manchester United'}
...

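You'll notice that:

  • The header row is handled for you automatically
  • You can now refer to columns by name instead of relying on magic indexes in your code ( i[5] ). That is, you can ask for i['Goals'] and i['Goals Allowed']

With just a few more lines inside that loop, you can solve your problem.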
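
As a rough sketch of what those few lines might look like, the version below reads the rows with csv.DictReader and keeps track of the smallest absolute gap seen so far. The helper name team_with_smallest_goal_gap and the in-memory sample are my own illustration, not part of the exercise, and I'm assuming the absolute difference is what the grader wants:

```python
import csv
import io

# A few rows copied from the question's table; the real script would open
# 'football.csv' instead of this in-memory sample.
sample = io.StringIO(
    "Team,Games,Wins,Losses,Draws,Goals,Goals Allowed,Points\n"
    "Arsenal,38,26,9,3,79,36,87\n"
    "Aston_Villa,38,12,14,12,46,47,50\n"
    "Leicester,38,5,13,20,30,64,28\n"
)


def team_with_smallest_goal_gap(csv_file):
    """Return the team whose Goals and Goals Allowed are closest (hypothetical helper)."""
    best_team = None
    best_gap = None
    # DictReader consumes the header row itself, so no [1:] slicing is needed.
    for row in csv.DictReader(csv_file):
        gap = abs(int(row['Goals']) - int(row['Goals Allowed']))
        if best_gap is None or gap < best_gap:
            best_team, best_gap = row['Team'], gap
    return best_team


print(team_with_smallest_goal_gap(sample))
```

Note that the function only returns the name and leaves the printing to the caller, so nothing other than the final answer ends up on standard output.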

