How to return in Python when reading multiple .xml files
I am writing a Python script that walks through a folder and its subfolders and reads XML files; there are more than 100 of them. If I hard-code this code outside a function, it reads all 100 XML files into temp0. However, if I put the code inside a function and use return, the function always returns only one file; that is, it reads only one file. Can anyone explain why return works this way? Thanks in advance.
import os
import re

def raw_input(doc):
    for root, dirs, packs in doc:
        for files in packs:
            if files == 'abc.xml':
                filename = os.path.join(root, files)
                open_file = open(filename, 'r')
                perpX_ = open_file.read()
                # print(perpX_)
                outputX_ = re.compile('<test (.*?)</text>', re.DOTALL | re.IGNORECASE).findall(perpX_)
                temp0 = str("|".join(outputX_))
                #print(temp0)
                return temp0

doc=os.walk('./data/')
raw_input(doc)
temp0 = raw_input(doc)
print(temp0)
return returns the function result, so as soon as return is reached, Python exits the function and takes the value of the expression next to return as the output of the function.
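Your return is inside the for loop, so it is reached during the first iteration that finds a matching file. The Python interpreter treats temp0 as the final result of your function call and exits the function right there, before the remaining files are visited.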
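To see the effect in isolation, here is a minimal sketch that has nothing to do with XML files. The names first_only and all_items are hypothetical and exist only for this illustration; the point is just where the return statement sits relative to the loop.

```python
def first_only(items):
    # return sits inside the loop, so the function exits on the first pass
    for item in items:
        return item.upper()


def all_items(items):
    # collect every value first, then return once, after the loop has finished
    collected = []
    for item in items:
        collected.append(item.upper())
    return collected


print(first_only(["a", "b", "c"]))  # A
print(all_items(["a", "b", "c"]))   # ['A', 'B', 'C']
```

The first function never gets to "b" or "c", which is the same thing that happens to the second and later XML files in the question.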
You can collect multiple values in a list and return them together after the loops, for example like this:
import os
import re

def raw_input(doc):
    result = []  # this is where your output will be aggregated
    for root, dirs, packs in doc:
        for files in packs:
            if files == 'abc.xml':
                filename = os.path.join(root, files)
                # the with block closes the file once it has been read
                with open(filename, 'r') as open_file:
                    perpX_ = open_file.read()
                # print(perpX_)
                outputX_ = re.compile('<test (.*?)</text>', re.DOTALL | re.IGNORECASE).findall(perpX_)
                # We append the output for the current file to the list
                result.append("|".join(outputX_))
    # And now we return our string, at the end of the function,
    # AFTER the for loops
    return '|'.join(result)

doc = os.walk('./data/')
temp0 = raw_input(doc)
print(temp0)
This way, you will get the output as a single string.
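If you would rather keep the results of each file separate, the function can return the list itself instead of joining it. The following is only a sketch under assumptions: collect_matches, its target parameter, and the module-level PATTERN are hypothetical names introduced here, and the regular expression is copied unchanged from the question.

```python
import os
import re

# Same pattern as in the question, compiled once instead of once per file.
PATTERN = re.compile('<test (.*?)</text>', re.DOTALL | re.IGNORECASE)


def collect_matches(top, target='abc.xml'):
    """Return a list with one '|'-joined string per matching file under top."""
    per_file = []
    for root, dirs, names in os.walk(top):
        for name in names:
            if name == target:
                # the with block closes each file as soon as it has been read
                with open(os.path.join(root, name), 'r') as handle:
                    per_file.append("|".join(PATTERN.findall(handle.read())))
    # a single return, placed after both loops have finished
    return per_file


for entry in collect_matches('./data/'):
    print(entry)
```

The caller can still build the single string with "|".join(collect_matches('./data/')) when that is what is needed, but it can also tell which text came from which file.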
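Besides, there is also such a thing as a generator. A generator is an object that you can iterate over. With one, you can make your code evaluate lazily (on demand):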
import os
import re

# now raw_input is a generator
def raw_input(doc):
    # we don't need a storage now
    for root, dirs, packs in doc:
        for files in packs:
            if files == 'abc.xml':
                filename = os.path.join(root, files)
                with open(filename, 'r') as open_file:
                    perpX_ = open_file.read()
                outputX_ = re.compile('<test (.*?)</text>', re.DOTALL | re.IGNORECASE).findall(perpX_)
                # now we yield the current value and the function temporarily pauses its evaluation
                yield "|".join(outputX_)

doc = os.walk('./data/')
results = raw_input(doc)
# now results is a generator. It is not evaluated yet
# you can get the first output like this
# (next raises StopIteration if there is no matching file left):
first_out = next(results)
# and then the second:
second_out = next(results)
# or iterate over it, just like over an ordinary list:
for res in results:
    print(res)
# note that it will iterate only over the remaining values
# (excluding the first and second ones, since they have already been consumed)
# and now results is exhausted (we've reached the end of the generator)
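The consumption behaviour described in the comments above can be checked with a small stand-alone sketch. The generator shout is a hypothetical stand-in for raw_input, so that the example runs without any files on disk.

```python
def shout(words):
    # a generator: each yield hands back one value and pauses the function
    for word in words:
        yield word.upper()


gen = shout(["a", "b", "c"])
first = next(gen)            # runs the body up to the first yield
rest = list(gen)             # consumes everything that is left
leftover = next(gen, None)   # the default avoids StopIteration once exhausted

# a fresh generator can be passed straight to join to build one string
joined = "|".join(shout(["a", "b", "c"]))
print(first, rest, leftover, joined)
```

Passing a default to next is one way to avoid an exception when there are fewer matching files than expected; a generator that has been exhausted cannot be rewound, so call the function again if you need the values a second time.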