
How to return in Python when reading multiple .xml files

I am writing a script in Python that walks through a folder and its subfolders and reads XML files; there are more than 100 of them. If I hard-code this logic outside a function, it reads all 100 XML files into temp0. However, if I put the code inside a function and use return, the function always returns only one file; that is, it reads only one file. Can anyone explain why return works this way? Thanks in advance.

import os
import re

def raw_input(doc):
    for root, dirs, packs in doc:
        for files in packs:
            if files == 'abc.xml':
                filename = os.path.join(root, files)
                open_file = open(filename, 'r')
                perpX_ = open_file.read()
                # print(perpX_)
                outputX_ = re.compile('<test (.*?)</text>', re.DOTALL | re.IGNORECASE).findall(perpX_)
                temp0 = str("|".join(outputX_))
                #print(temp0)
                return temp0

doc=os.walk('./data/')
raw_input(doc)

temp0 = raw_input(doc)
print(temp0)

return hands back the function's result: as soon as a return statement is reached, Python exits the function and uses the value of the expression after return as the function's output.

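Your return is inside the for loops, so it is reached on the first iteration that finds a matching file. Python treats temp0 as the final result of the function call and exits, so the remaining files are never read.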
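
To see the effect in isolation, here is a minimal sketch that leaves the XML files out entirely. The function names first_only and all_items are made up for this illustration and are not part of the original script.

```python
def first_only(items):
    # return sits inside the loop, so the function exits on the first pass
    for item in items:
        return item.upper()


def all_items(items):
    # collect every result, then return once after the loop has finished
    collected = []
    for item in items:
        collected.append(item.upper())
    return collected


print(first_only(["a", "b", "c"]))  # A
print(all_items(["a", "b", "c"]))   # ['A', 'B', 'C']
```

Note that first_only returns None when the list is empty, because the loop body never runs and no return is reached.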

You can collect the values from all files in a list and return them together after the loops, for example like this:

import os
import re

def raw_input(doc):
    result = []    # the output of every matching file is collected here
    for root, dirs, packs in doc:
        for files in packs:
            if files == 'abc.xml':
                filename = os.path.join(root, files)
                # the with block closes the file even if reading fails
                with open(filename, 'r') as open_file:
                    perpX_ = open_file.read()
                outputX_ = re.compile('<test (.*?)</text>', re.DOTALL | re.IGNORECASE).findall(perpX_)
                # append the output for the current file to the list
                result.append("|".join(outputX_))
    # return the joined string once, AFTER both for loops have finished
    return '|'.join(result)

doc = os.walk('./data/')

temp0 = raw_input(doc)
print(temp0)

This way you get the output of all files as a single string.
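
If you would rather keep the results of each file separate, one possible variant returns the list itself instead of joining it. This is only a sketch: the function name collect_matches, the simplified pattern `<text>(.*?)</text>`, and the throwaway directory with two sample files are assumptions made for the demonstration, not taken from the original script.

```python
import os
import re
import tempfile

# Assumed pattern for illustration only; adjust it to the real file layout.
TEXT_RE = re.compile(r"<text>(.*?)</text>", re.DOTALL | re.IGNORECASE)


def collect_matches(top, target="abc.xml"):
    """Return one list of matches per file named `target` found under `top`."""
    per_file = []
    for root, dirs, names in os.walk(top):
        for name in names:
            if name == target:
                with open(os.path.join(root, name), "r") as handle:
                    per_file.append(TEXT_RE.findall(handle.read()))
    # a single return after the loops, so every file is visited
    return per_file


# Build a throwaway tree with two abc.xml files to try the function out.
samples = (("one", "<text>a</text><text>b</text>"), ("two", "<text>c</text>"))
with tempfile.TemporaryDirectory() as top:
    for sub, body in samples:
        os.makedirs(os.path.join(top, sub))
        with open(os.path.join(top, sub, "abc.xml"), "w") as handle:
            handle.write(body)
    found = collect_matches(top)

# os.walk does not guarantee the directory order, so sort before printing
print(sorted(found))  # [['a', 'b'], ['c']]
```

Returning the list leaves it to the caller to decide how to join or process the per-file results.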

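There is also such a thing as a generator. A generator is an object that you can iterate over, and it produces its values one at a time. With it you can make your code lazily evaluated (on demand):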

import os
import re

# now raw_input is a generator function
def raw_input(doc):
    # no storage list is needed now
    for root, dirs, packs in doc:
        for files in packs:
            if files == 'abc.xml':
                filename = os.path.join(root, files)
                with open(filename, 'r') as open_file:
                    perpX_ = open_file.read()
                outputX_ = re.compile('<test (.*?)</text>', re.DOTALL | re.IGNORECASE).findall(perpX_)
                # yield the current value; the function pauses here until the next value is requested
                yield "|".join(outputX_)

doc = os.walk('./data/')
results = raw_input(doc)
# results is a generator; nothing has been read yet
# you can get the first output like this
# (next raises StopIteration if no matching file is left):
first_out = next(results)
# and then the second:
second_out = next(results)
# or iterate over it, just like over an ordinary list:
for res in results:
    print(res)
# note that the loop only covers the remaining values
# (not the first and second ones, which were already consumed)

# and now results is exhausted (we've reached the end of the generator)
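
The single-use behaviour of a generator can be checked with a small sketch that does not touch any XML files. The generator shout is a made-up example. The second half relies on os.walk being a generator too, so a walker that has already been iterated does not start over.

```python
import os
import tempfile


def shout(words):
    # hypothetical generator: produces one result per input item, on demand
    for word in words:
        yield word.upper()


gen = shout(["a", "b", "c"])
first = next(gen)        # takes the first value
rest = list(gen)         # consumes whatever is left
after = next(gen, None)  # the default is returned instead of raising StopIteration

print(first, rest, after)  # A ['B', 'C'] None

# os.walk behaves the same way: a walker that has been iterated does not restart
with tempfile.TemporaryDirectory() as top:
    walker = os.walk(top)
    first_pass = list(walker)
    second_pass = list(walker)

print(len(first_pass), second_pass)  # 1 []
```

This is why passing the same os.walk object to a function twice does not visit the directory tree twice; call os.walk again to get a fresh walker.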

