Python 2.7 - min built-in function not working as expected

I'm working through the Google Python exercises and don't understand the behaviour of the built-in min() function, which doesn't seem to produce the expected result. The exercise is "babynames", and I'm testing the code with the 'baby1990.html' file (https://developers.google.com/edu/python/exercises/baby-names).

def extract_names(filename):
    f = open(filename, 'r').read()
    res = []
    d = {}
    match = re.search(r'<h3(.*?)in (\d+)</h3>', f)
    if match:
            res.append(match.group(2))

    vals = re.findall(r'<td>(\d+)</td><td>(\w+)</td><td>(\w+)</td>', f)
    for n, m, f in vals:
            if m=='Adrian' or f=='Adrian':
                    if m not in d:
                            d[m] = n
                    else:
                            d[m] = min(n, d[m])

                    if f not in d:       
                            d[f] = n
                    else:
                            print "min( "+str(n)+", "+str(d[f])+") = "+str( min(n, d[f]) ) 
                            d[f] = min( [n, d[f]] )

    for name,rank in sorted(d.items()):
            res.append(name+" "+str(rank))

    return res

vals is a list of tuples (rank, male_name, female_name), and I want to store each name (male and female) in the dictionary 'd' with the name as key and the rank as value. If there's a duplicate, I want to keep the lower rank value.

I noticed that the name 'Adrian' appears twice in the collection, the first time as a male name with rank 94 and the second time as a female name with rank 603, and I want the smaller of the two values.

So the first time 'Adrian' is matched, it is stored in the dictionary with rank 94, correctly. When it is matched the second time, the execution flow correctly enters the second branch of the second if, but the result becomes 603, even though min(94, 603) = 94. So the result is:

min( 603, 94) = 603
1990
Adrian 603
Anton 603
Ariel 94

I don't understand where the bug is. In the interpreter, min(94, 603) = 94, as expected. What am I missing?

Thanks for your help.

PS: I also tried min(n, d[f]), the same call without the list, but the result is still 603.

You are comparing strings, not numbers:

>>> min('603', '94')
'603'

Lexicographically, '6' sorts before '9'. Regular expressions work on strings, so the matches they return are strings even when only digits are matched. Use int() to turn your strings into integers:

vals = re.findall(r'<td>(\d+)</td><td>(\w+)</td><td>(\w+)</td>', f)
for n, m, f in vals:
    n = int(n)
    # ...
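To see the fix end to end, here is a minimal sketch of the dictionary-building step with the conversion applied. The two table rows in SAMPLE_HTML are made up to mirror the 'Adrian' case from the question, and lowest_ranks is a hypothetical helper name, not part of the exercise:

```python
import re

# Hypothetical sample rows shaped like the baby-names table;
# 'Adrian' appears as a male name (rank 94) and a female name (rank 603).
SAMPLE_HTML = (
    '<tr align="right"><td>94</td><td>Adrian</td><td>Ariel</td>\n'
    '<tr align="right"><td>603</td><td>Anton</td><td>Adrian</td>\n'
)


def lowest_ranks(html):
    """Map each name to its lowest rank, comparing ranks as integers."""
    ranks = {}
    rows = re.findall(r'<td>(\d+)</td><td>(\w+)</td><td>(\w+)</td>', html)
    for rank_text, male, female in rows:
        rank = int(rank_text)  # regex groups are strings, so convert first
        for name in (male, female):
            # Keep the smaller of the new rank and any rank already stored.
            ranks[name] = min(rank, ranks.get(name, rank))
    return ranks


result = lowest_ranks(SAMPLE_HTML)
string_min = min('603', '94')  # string comparison: '6' sorts before '9'
int_min = min(603, 94)         # numeric comparison
```

With integer ranks, 'Adrian' keeps 94, whereas the string comparison picks '603'.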

When debugging Python code, use repr() instead of str() to detect type problems; had you used repr(), you would have seen '94' printed instead of 94 (the quotes denote a string).
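A small illustration of that difference, using made-up variable names for a rank as the regex returns it and the same rank as an integer:

```python
rank_from_regex = '94'  # a regex group is always a string
rank_as_int = 94

# str() renders both values identically, so debug output hides the type.
print("str:  %s vs %s" % (str(rank_from_regex), str(rank_as_int)))

# repr() keeps the quotes around the string, exposing the difference.
print("repr: %s vs %s" % (repr(rank_from_regex), repr(rank_as_int)))
```

The first line prints the same text for both values; the second shows '94' with quotes next to a bare 94.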
