
Breaking the loop properly in Python

Currently I am trying to upload a set of files via an API call. The files have sequential names: part0.xml, part1.xml, etc. The code loops through all the files and uploads them properly, but it doesn't seem to break out of the loop: after it uploads the last available file in the directory, I get an error:

No such file or directory.

I don't really understand how to make it stop as soon as the last file in the directory is uploaded. It is probably a very dumb question, but I am really lost. How do I stop it from looping through non-existent files?

The code:

part = 0
with open('part%d.xml' % part, 'rb') as xml:

    #here goes the API call code

part +=1

I also tried something like this:

import glob
part = 0
for fname in glob.glob('*.xml'):
    with open('part%d.xml' % part, 'rb') as xml:

        #here goes the API call code

    part += 1

Edit: Thank you all for the answers, learned a lot. Still lots to learn. :)

Alternatively, you can simply use a regex.

import os
import re

# Keep only names of the form part<digits>.xml, e.g. part0.xml, part12.xml
files = [f for f in os.listdir() if re.fullmatch(r'part\d+\.xml', f)]
for f in files:
    pass  # process f

This is really useful if you need more advanced filtering.

Note: you can apply similar filtering to the list returned by glob.glob().
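As a rough sketch of that note, the snippet below globs broadly and then narrows the result with a regex. The helper name part_files and the throwaway directory with its sample file names are assumptions made up for the demonstration, not part of the original question.

```python
import glob
import os
import re
import tempfile

PART_NAME = re.compile(r'part\d+\.xml')

def part_files(directory):
    """Glob for a broad match, then keep only names like part<digits>.xml."""
    candidates = glob.glob(os.path.join(directory, 'part*.xml'))
    return sorted(p for p in candidates
                  if PART_NAME.fullmatch(os.path.basename(p)))

# Demo in a throwaway directory (hypothetical file names)
with tempfile.TemporaryDirectory() as tmp:
    for name in ('part0.xml', 'part1.xml', 'partial.xml', 'foo.xml'):
        open(os.path.join(tmp, name), 'w').close()
    found = [os.path.basename(p) for p in part_files(tmp)]
    print(found)
```

Here partial.xml passes the glob but is rejected by the regex, which is the kind of case where the extra filtering pays off.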

If you are not familiar with list comprehensions and regexes, I recommend reading:

  1. Regex - howto
  2. List Comprehensions

You almost had it. This is your code with some stuff removed:

import glob

for fname in glob.glob('part*.xml'):
    with open(fname, 'rb') as xml:
        # here goes the API call code

It is possible to make the glob more specific, but as it stands it already solves the "foo.xml" problem. The key is to not use counters in Python; the idiomatic iteration is for x in y: and you don't need a counter.

glob makes no promise about the order of the filenames it returns (it follows whatever order the operating system lists them in), so call sorted() on the result if order matters. Even then, remember that ['part1', 'part10', 'part2'] sort in that order. There are a few ways to cope with that, but it would be a separate question.
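One possible way to cope with the part1 / part10 / part2 ordering is to sort on the number embedded in the name. This is only a sketch: the helper part_number and its fallback value for names without a number are assumptions, not something from the original answer.

```python
import re

def part_number(filename):
    """Return the integer embedded in names like 'part10.xml'."""
    match = re.search(r'(\d+)\.xml$', filename)
    # Names without a number sort first; this fallback is an arbitrary choice.
    return int(match.group(1)) if match else -1

names = ['part1.xml', 'part10.xml', 'part2.xml']
by_string = sorted(names)                    # plain string comparison
by_number = sorted(names, key=part_number)   # compares 1, 10, 2 as integers
print(by_string)
print(by_number)
```

The same key function can be passed to sorted(glob.glob('part*.xml'), key=part_number) when the files must be uploaded in numeric order.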

Consider what happens if there are other files that match '*.xml'.

Suppose that you have 11 files "part0.xml"..."part10.xml" but also a file called "foo.xml".

Then the for loop will iterate 12 times (since there are 12 matches for the glob). On the 12th iteration, you are trying to open "part11.xml", which doesn't exist.
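The mismatch can be reproduced without touching the disk. The list below is a simulated directory listing (an assumption standing in for what glob.glob('*.xml') would return), and the loop mirrors the counter logic from the question.

```python
# Simulated glob result: part0.xml .. part10.xml plus one unrelated file
matches = ['part%d.xml' % n for n in range(11)] + ['foo.xml']

part = 0
attempted = []
for fname in matches:  # one iteration per match, so 12 in total
    # The question's code ignores fname and opens this name instead
    attempted.append('part%d.xml' % part)
    part += 1

# Names the loop would try to open even though they are not in the listing
missing = [name for name in attempted if name not in matches]
print(len(matches), missing)
```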

One approach is to drop the glob and just handle the exception.

part = 0
while True:
    try:
        with open('part%d.xml' % part, 'rb') as xml:
            pass  # here goes the API call code
        part += 1
    except IOError:  # the next part file does not exist
        break

When you use a counter, you need to test whether the file exists:

import os
from itertools import count

for part in count():
    filename = 'part%d.xml' % part
    if not os.path.exists(filename):
        break
    with open(filename) as inp:
        # do something

Your for loop is saying "for every file that ends with .xml"; if you have any file that ends with .xml that isn't a sequential part%d.xml, you're going to get an error. Imagine you have part0.xml and foo.xml. The for loop is going to loop twice; on the second iteration, it's going to try to open part1.xml, which doesn't exist.

Since you know the filenames already, you don't even need to use glob.glob(); just check whether each file exists before opening it, until you find one that doesn't exist.

import os
from itertools import count

# Lazily generate part0.xml, part1.xml, ...
filenames = ('part%d.xml' % part_num for part_num in count())

for filename in filenames:
    if os.path.exists(filename):
        with open(filename, 'rb') as xmlfile:
            do_stuff(xmlfile)
            # here goes the API call code
    else:
        break

If for any reason you're worried about files disappearing between os.path.exists(filename) and open(filename, 'rb'), this code is more robust:

import os

from itertools import count


filenames = ('part%d.xml' % part_num for part_num in count())

for filename in filenames:
    try:
        xmlfile = open(filename, 'rb')
    except IOError:
        break
    else:
        with xmlfile:
            do_stuff(xmlfile)
            # here goes the API call code

You are doing it wrong. Suppose the folder has 3 files: part0.xml, part1.xml and foo.xml. The loop will iterate 3 times and fail on the third iteration, when it tries to open part2.xml, which is not present.

Don't loop through all files with the .xml extension.

Only loop through files whose names start with 'part', have a digit just before the extension, and have the extension .xml.

So your code will look like this:

import glob

for fname in glob.glob('part*[0-9].xml'):
    with open(fname, 'rb') as xml:
        #here goes the API call code
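To check what the 'part*[0-9].xml' pattern accepts, it can be tried against a few names with fnmatch, whose wildcard rules glob relies on. The sample names here are made up for illustration.

```python
from fnmatch import fnmatchcase

# Hypothetical file names used only to probe the pattern
names = ['part0.xml', 'part12.xml', 'part_old3.xml', 'part.xml', 'foo.xml']

# '*' matches any run of characters, '[0-9]' exactly one digit
matched = [n for n in names if fnmatchcase(n, 'part*[0-9].xml')]
print(matched)
```

Note that part_old3.xml also matches, because * accepts any characters between 'part' and the final digit; if that is too loose, a regex filter is stricter.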

Read: glob - Filename pattern matching

If you want the files to be uploaded in sequential order, read: String Natural Sort

