
Strange Python Behavior with Conditionals

Hello, I have the following list:

lines = ['', '', ' ', '', ' ', '', 'FA19 CHEM 102 3', 'FA19 CHEM 104 4', 'FA19 CLCV 115 ADB', 'FA19 CS 101 6', 'FA19 CS 125 PRO', 'FA19 CS 126 SL1', 'FA19 CS 173 BL2', 'FA19 ECON 1-- 7', 'FA19 ENG 100 CSA', '', '3.0 PS', '3.0 PS', '3.0 A', '3.0 PS', '4.0 PS', '3.0 A-', '3.0 A-', '3.0 PS', '0.0 S', '', '', '', '5/6', '', '', '5/4/2020', '', '', '', 'FA19 MATH 220 1', 'FA19 MATH 220 5', 'FA19 MATH 231 2', 'FA19 MATH 241 BL1', 'FA19 RHET 105 1', 'SP20 CS 225 AL2', 'SP20 CS 233 AL2', 'SP20 CWL 207 AL1', 'SP20 KIN 249 ON', 'FA20 CLCV 224 ADB', 'FA20 EPSY 220 OL', 'FA20 MATH 415 AL3', '', '0.0 PS >D', '5.0 PS', '3.0 PS', '4.0 A', '4.0 PS', '4.0 IP', '4.0 IP', '3.0 IP', '3.0 IP', '3.0 IP', '3.0 IP', '3.0 IP', '', '', '', '', '', '', '', '', '', '\uf00c', '']

I am trying to remove all the strings in the list that are empty strings, single spaces, or dates in the formats of '5/6' and '5/4/2020'. I have the following code to do this:

    for s in lines:
        trans = re.sub('^([1-9]|1[012])[/]([0-9]|[1-9][0-9])[/](19|20)\d\d$', '', s)
        trans = re.sub('^([1-9]|1[012])[/]([0-9]|[1-9][0-9])$', '', trans)
        if (trans == '' or trans == ' '):
            lines.remove(s)

I can assure you that my regex is working correctly; however, this code does not remove the string '5/6' from the list. But if I change the for loop to skip strings in the list that are empty or a single space, like so:

    for s in lines:
        if (s == '' or s == ' '):
            continue
        trans = re.sub('^([1-9]|1[012])[/]([0-9]|[1-9][0-9])[/](19|20)\d\d$', '', s)
        trans = re.sub('^([1-9]|1[012])[/]([0-9]|[1-9][0-9])$', '', trans)
        if (trans == '' or trans == ' '):
            lines.remove(s)

then the string '5/6' is removed from the list. What in the world is going on? Am I just missing something embarrassingly obvious?

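You're mutating the iterable while looping over it, which is a big no-no. (Dicts notice that and fail with a RuntimeError; lists don't.) Each `lines.remove(s)` shifts the later elements one position to the left while the loop's internal index keeps advancing, so the element right after a removed one is never visited.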
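To see the skipping in isolation, here is a minimal sketch on a shortened list. The helper name `remove_blanks_while_iterating` and the sample data are made up for illustration; the loop deliberately repeats the pattern from the question and records which elements the `for` loop actually visits.

```python
def remove_blanks_while_iterating(items):
    """Reproduce the pitfall: remove blank strings from the list being looped over."""
    visited = []
    for s in items:
        visited.append(s)        # record what the loop actually sees
        if s.strip() == '':
            items.remove(s)      # shifts every later element one slot to the left
    return visited

sample = ['', '', 'a', '', '5/6', 'b']
visited = remove_blanks_while_iterating(sample)

print(visited)  # ['', 'a', '', 'b']      -> '5/6' was never visited
print(sample)   # ['a', '', '5/6', 'b']   -> a blank string survived as well
```

In the question's list, '5/6' sits right after a run of empty strings, so it appears to fall into one of these skipped slots. When the empty strings are skipped with `continue` instead of removed, nothing shifts before the loop reaches '5/6', which would explain why the second version removes it.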

I'd suggest refactoring to a filter predicate function and a list comprehension:

    import re

    def is_valid(s):
        # Blank out strings that are entirely a date such as '5/4/2020' or '5/6'
        trans = re.sub(r'^([1-9]|1[012])[/]([0-9]|[1-9][0-9])[/](19|20)\d\d$', '', s)
        trans = re.sub(r'^([1-9]|1[012])[/]([0-9]|[1-9][0-9])$', '', trans)
        return bool(trans.strip())  # Valid only if something other than whitespace is left

    lines = [line for line in lines if is_valid(line)]
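If the list has to be updated in place (for example because other code holds a reference to the same list object), slice assignment is one option. The sketch below is a variation on this idea, not the code above: `DATE_RE` and `keep` are hypothetical names, the two date patterns are merged into one, and `fullmatch` is used instead of `re.sub` with `^...$`, which is slightly stricter about a trailing newline.

```python
import re

# Month/day with an optional /19xx or /20xx year, e.g. '5/6' or '5/4/2020'.
# Hypothetical helper pattern, merged from the two regexes in the question.
DATE_RE = re.compile(r'([1-9]|1[012])/([0-9]|[1-9][0-9])(/(19|20)\d\d)?')

def keep(s):
    """Return True for strings that are neither blank nor a date."""
    return bool(s.strip()) and DATE_RE.fullmatch(s) is None

sample = ['', ' ', 'FA19 CS 101 6', '5/6', '5/4/2020', '3.0 A-']

# Build the filtered list first, then replace the contents of the same list object.
sample[:] = [s for s in sample if keep(s)]

print(sample)  # ['FA19 CS 101 6', '3.0 A-']
```

Because the comprehension is fully evaluated before the slice assignment runs, the list is never modified while it is being iterated.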
