
Strange Python Behavior with Conditionals

Hello, I have the following list:

lines = ['', '', ' ', '', ' ', '', 'FA19 CHEM 102 3', 'FA19 CHEM 104 4', 'FA19 CLCV 115 ADB', 'FA19 CS 101 6', 'FA19 CS 125 PRO', 'FA19 CS 126 SL1', 'FA19 CS 173 BL2', 'FA19 ECON 1-- 7', 'FA19 ENG 100 CSA', '', '3.0 PS', '3.0 PS', '3.0 A', '3.0 PS', '4.0 PS', '3.0 A-', '3.0 A-', '3.0 PS', '0.0 S', '', '', '', '5/6', '', '', '5/4/2020', '', '', '', 'FA19 MATH 220 1', 'FA19 MATH 220 5', 'FA19 MATH 231 2', 'FA19 MATH 241 BL1', 'FA19 RHET 105 1', 'SP20 CS 225 AL2', 'SP20 CS 233 AL2', 'SP20 CWL 207 AL1', 'SP20 KIN 249 ON', 'FA20 CLCV 224 ADB', 'FA20 EPSY 220 OL', 'FA20 MATH 415 AL3', '', '0.0 PS >D', '5.0 PS', '3.0 PS', '4.0 A', '4.0 PS', '4.0 IP', '4.0 IP', '3.0 IP', '3.0 IP', '3.0 IP', '3.0 IP', '3.0 IP', '', '', '', '', '', '', '', '', '', '\uf00c', '']

I am trying to remove every string in the list that is an empty string, a single space, or a date in the format of '5/6' or '5/4/2020'. I have the following code to do this:

    for s in lines:
        trans = re.sub('^([1-9]|1[012])[/]([0-9]|[1-9][0-9])[/](19|20)\d\d$', '', s)
        trans = re.sub('^([1-9]|1[012])[/]([0-9]|[1-9][0-9])$', '', trans)
        if (trans == '' or trans == ' '):
            lines.remove(s)

I can assure you that my regex is working correctly; however, this code does not remove the string '5/6' from the list. But if I change the for loop to skip the strings in the list that are empty strings or spaces, like this:

    for s in lines:
        if (s == '' or s == ' '):
            continue
        trans = re.sub('^([1-9]|1[012])[/]([0-9]|[1-9][0-9])[/](19|20)\d\d$', '', s)
        trans = re.sub('^([1-9]|1[012])[/]([0-9]|[1-9][0-9])$', '', trans)
        if (trans == '' or trans == ' '):
            lines.remove(s)

then the string '5/6' is removed from the list. What in the world is going on? Am I just missing something embarrassingly obvious?

You're mutating the iterable while looping over it, which is a big no-no. (Dicts notice that and fail with a RuntimeError, lists don't.)
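To make the skipping concrete, here is a minimal sketch on a made-up four-element list (the names `items` and `visited` are only for illustration). The loop walks the list by index, so once an element is removed, everything after it shifts left by one and the element that slides into the freed slot is never visited. That is most likely what happens to '5/6' in the question: the empty string just before it gets removed, and '5/6' moves into the position the loop has already passed.

```python
items = ['a', '', '5/6', 'b']
visited = []

for s in items:
    visited.append(s)      # record what the loop actually sees
    if s == '' or s == '5/6':
        items.remove(s)    # shrinks the list while the loop index keeps advancing

print(visited)  # ['a', '', 'b']   -> '5/6' was never seen
print(items)    # ['a', '5/6', 'b']
```

This also suggests why the second version in the question appears to work: with the `continue`, the empty strings are no longer removed inside the loop, so nothing shifts before '5/6' is reached. The side effect is that the blank strings stay in the list.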

I'd suggest refactoring to a filter predicate function and a list comprehension:

    import re

    def is_valid(s):
        # Blank out the string if it is a full M/D/YYYY date or a short M/D date
        trans = re.sub(r'^([1-9]|1[012])[/]([0-9]|[1-9][0-9])[/](19|20)\d\d$', '', s)
        trans = re.sub(r'^([1-9]|1[012])[/]([0-9]|[1-9][0-9])$', '', trans)
        # Keep the string only if something other than whitespace remains
        return bool(trans.strip())

    lines = [line for line in lines if is_valid(line)]
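If you would rather not run `re.sub` just to test for a match, one possible variant is sketched below. It assumes the two patterns can be merged into one with an optional year, and it uses `re.fullmatch` instead of the `^...$` anchors. The names `DATE_RE`, `keep`, and `sample` are hypothetical, and `sample` is a shortened stand-in for the list in the question. The slice assignment `sample[:] = ...` replaces the contents of the existing list object, which is useful if other code holds a reference to it.

```python
import re

# Assumed merge of the two original patterns: month/day with an optional /year
DATE_RE = re.compile(r'([1-9]|1[012])/([0-9]|[1-9][0-9])(/(19|20)\d\d)?')

def keep(s):
    """Return True for strings that are neither blank nor a date."""
    return bool(s.strip()) and DATE_RE.fullmatch(s) is None

sample = ['', ' ', 'FA19 CS 101 6', '5/6', '5/4/2020', '3.0 A-']
sample[:] = [s for s in sample if keep(s)]  # filter in place, no mutation during iteration

print(sample)  # ['FA19 CS 101 6', '3.0 A-']
```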

