
Python if-statement does not loop through the elif and else statement parts

I'm trying to open an XML file and parse it, looking through its tags and finding the text within each specific tag. If the text within the tag matches a string, I want it to remove part of the string or substitute it with something else.

However, for some reason the code stays inside the third if-statement and thinks that end_int always equals None. I'm not sure why: when I printed the value of end_int as it was computed, it showed all the 'end_char' tag values from the XML file, which is what end_int should be. But inside the if-statement, it thinks end_int is always None.

The mfn_pn variable is a barcode entered by the user, something similar to ATL-157-1815, DFW-184-8378., ATL-324-3243., DFW-432-2343, ATL 343 8924, DFW 342 3413, DFW-324 3423 T&R.

The XML file has the following data:

<?xml version="1.0" encoding="utf-8"?>
<metadata>
    <filter>
        <regex>ATL|LAX|DFW</regex >
        <start_char>3</start_char>
        <end_char></end_char>
        <action>remove</action>
    </filter>
    <filter>
        <regex>DFW.+\.$</regex >
        <start_char>3</start_char>
        <end_char>-1</end_char>
        <action>remove</action>
    </filter>
    <filter>
        <regex>\-</regex >
        <replacement></replacement>
        <action>substitute</action>
    </filter>
    <filter>
        <regex>\s</regex >
        <replacement></replacement>
        <action>substitute</action>
    </filter>
    <filter>
        <regex>1P</regex >
        <start_char>2</start_char>
        <end_char></end_char>
        <action>remove</action>
    </filter>
    <filter>
        <regex>T&#038;R$</regex >
        <start_char></start_char>
        <end_char>-4</end_char>
        <action>remove</action>
    </filter>
</metadata>

The Python code I'm using is:

import re
from xml.etree.ElementTree import ElementTree

# filters.xml is the file that holds the things to be filtered
tree = ElementTree()
tree.parse("filters.xml")

# Get the data in the XML file 
root = tree.getroot()

# Loop through filters
for x in root.findall('filter'):

    # Find the text inside the regex tag
    regex = x.find('regex').text
    # Find the text inside the start_char tag
    start_prim = x.find('start_char')
    
    # If the element exists assign its text to start variable
    start = start_prim.text if start_prim is not None else None
    start_int = int(start) if start is not None else None
    print('start: ', start_int)

    # Find the text inside the end_char tag
    end_prim = x.find('end_char')

    # If the element exists assign its text to end variable
    end = end_prim.text if end_prim is not None else None
    end_int = int(end) if end is not None else None
    print('end: ', end_int)

    # Find the text inside the action tag
    action = x.find('action').text

    if action == 'remove':
        if re.match(r'%s' % regex, mfn_pn, re.IGNORECASE):
            print('if statement start:', start_int)
            print('if statement end:', end_int)
            if end_int == None:
                print('if statement start_int:', start_int)
                print('if statement end_int:', end_int)
                mfn_pn = mfn_pn[start_int:]
            elif start_int == None:
                print('elif statement start_int:' ,start_int)
                print('elif statement end_int:', end_int)
                mfn_pn = mfn_pn[:end_int]
            else: 
                print('else statement start_int:', start_int)
                print('else statement end_int:', end_int)
                mfn_pn = mfn_pn[start_int:end_int]
    elif action == 'substitute':
        mfn_pn = re.sub(r'%s' % regex, '', mfn_pn)

None of the print statements inside the elif and else branches produce any output because, for some reason, the code thinks start_int never equals None, and the else branch is never reached either. It thinks that end_int == None is always true, and I'm not sure why, because printing end_int outside the if-statements shows all the end_char values from the XML file.

Try 'DFW-324 3423 T&R':

mfn_pn = 'DFW-324 3423 T&R'
  • the first filter removes the first three characters
    • mfn_pn = '-324 3423 T&R'
  • the second filter regex does not match because the pattern requires the string to start with 'DFW'
    • mfn_pn = '-324 3423 T&R'
  • the third filter removes the dash
    • mfn_pn = '324 3423 T&R'
  • the fourth filter removes all the spaces
    • mfn_pn = '3243423T&R'
  • the T&R filter fails to remove T&R because the regex pattern is ' T&R$' (notice the space in the pattern); in addition, re.match only looks for a match at the start of the string
    • mfn_pn = '3243423T&R'
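
The walk-through above can be reproduced with a small trace. This is only a sketch under a few assumptions: the filters are embedded as a string (compacted to one filter per line) instead of being read from filters.xml, the T&R pattern is used as it appears in the XML data (without a leading space), and the names FILTERS_XML, to_int and trace_filters are made up for illustration.

```python
import re
import xml.etree.ElementTree as ET

# The filters from the question, embedded as a string so the demo is self-contained.
FILTERS_XML = r"""<metadata>
    <filter><regex>ATL|LAX|DFW</regex><start_char>3</start_char><end_char></end_char><action>remove</action></filter>
    <filter><regex>DFW.+\.$</regex><start_char>3</start_char><end_char>-1</end_char><action>remove</action></filter>
    <filter><regex>\-</regex><replacement></replacement><action>substitute</action></filter>
    <filter><regex>\s</regex><replacement></replacement><action>substitute</action></filter>
    <filter><regex>1P</regex><start_char>2</start_char><end_char></end_char><action>remove</action></filter>
    <filter><regex>T&amp;R$</regex><start_char></start_char><end_char>-4</end_char><action>remove</action></filter>
</metadata>"""


def to_int(element):
    """Return the element's text as an int, or None if the tag is missing or empty."""
    if element is None or not element.text:
        return None
    return int(element.text)


def trace_filters(mfn_pn):
    """Apply each filter in order and record the string after every step."""
    steps = []
    for f in ET.fromstring(FILTERS_XML).findall("filter"):
        regex = f.find("regex").text
        action = f.find("action").text
        if action == "remove":
            # re.match only succeeds if the pattern matches at the START of the string
            if re.match(regex, mfn_pn, re.IGNORECASE):
                start, end = to_int(f.find("start_char")), to_int(f.find("end_char"))
                mfn_pn = mfn_pn[start:end]
        elif action == "substitute":
            mfn_pn = re.sub(regex, "", mfn_pn)
        steps.append(mfn_pn)
    return steps


for i, value in enumerate(trace_filters("DFW-324 3423 T&R"), start=1):
    print(i, repr(value))
```

Only the first 'remove' filter ever matches for this input, and it has an empty end_char, which is why the code only ever reaches the branch where end_int is None.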

Your XML data for the T&R filter is wrong. Change it to

...
    <filter>
        <regex>.*T&amp;R$</regex >
        <start_char></start_char>
        <end_char>-4</end_char>
        <action>remove</action>
    </filter>

or change it to

...
    <filter>
        <regex>T&amp;R$</regex >
        <start_char></start_char>
        <end_char>-4</end_char>
        <action>substitute</action>
    </filter>
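
To see why either change helps, here is a minimal sketch comparing re.match with re.search and re.sub on the string left after the earlier filters. The variable names are illustrative only.

```python
import re

s = "3243423T&R"  # the string after the dash and spaces have been removed

# re.match anchors at the beginning of the string, so a suffix-only pattern fails
print(re.match(r"T&R$", s))            # None

# Prefixing .* lets re.match consume everything before the suffix
print(bool(re.match(r".*T&R$", s)))    # True

# re.search (and re.sub) scan the whole string and find the suffix directly
print(bool(re.search(r"T&R$", s)))     # True

# First variant: 'remove' with end_char -4 slices off the last four characters
removed = s[:-4] if re.match(r".*T&R$", s) else s
print(removed)                         # 324342

# Second variant: 'substitute' deletes only the matched text
substituted = re.sub(r"T&R$", "", s)
print(substituted)                     # 3243423
```

Note that the two variants do not give the same result on this input: because the space has already been removed, slicing with -4 also drops the last digit, while the substitution removes only T&R.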

If you want the second filter to remove a single period at the end, change it to

...
    <filter>
        <regex>[.]$</regex >
        <start_char>3</start_char>
        <end_char>-1</end_char>
        <action>substitute</action>
    </filter>
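
As a quick check of that pattern, the sketch below applies the same substitution the question's code performs for a 'substitute' action (re.sub with an empty replacement, which ignores start_char and end_char) to a few of the sample barcodes. The helper name strip_trailing_period is hypothetical.

```python
import re


def strip_trailing_period(code):
    """Remove a single period at the very end of the string, if present."""
    return re.sub(r"[.]$", "", code)


for barcode in ("DFW-184-8378.", "ATL-324-3243.", "DFW-432-2343"):
    print(strip_trailing_period(barcode))
```

Barcodes without a trailing period pass through unchanged.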

Be careful: every filter iteration may mutate the string, so the order of removal and substitution is important.
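
A small illustration of the ordering issue, assuming a T&R pattern with a leading space and a made-up input that combines two of the sample barcode shapes. The function apply_in_order is hypothetical and only runs substitutions.

```python
import re


def apply_in_order(text, patterns):
    """Delete every match of each pattern, one pattern after another."""
    for pattern in patterns:
        text = re.sub(pattern, "", text)
    return text


barcode = "DFW 342 3413 T&R"   # illustrative input, not taken from real data
strip_spaces = r"\s"
strip_suffix = r" T&R$"        # note the leading space in the pattern

# Suffix first: the space before T&R is still there, so the pattern matches
print(apply_in_order(barcode, [strip_suffix, strip_spaces]))   # DFW3423413

# Spaces first: the leading space is gone, so the suffix pattern no longer matches
print(apply_in_order(barcode, [strip_spaces, strip_suffix]))   # DFW3423413T&R
```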


 