Currently I am trying to upload a set of files via API call. The files have sequential names: part0.xml, part1.xml, etc. It loops through all the files and uploads them properly, but it seems it doesn't break the loop and after it uploads the last available file in the directory I am getting an error:
No such file or directory.
And I don't really understand how to make it stop as soon as the last file in the directory is uploaded. Probably it a very dumb question, but I am really lost. How do I stop it from looping through non-existent files?
The code:
part = 0
with open('part%d.xml' % part, 'rb') as xml:
#here goes the API call code
part +=1
I also tried something like this:
import glob
part = 0
for fname in glob.glob('*.xml'):
with open('part%d.xml' % part, 'rb') as xml:
#here goes the API call code
part += 1
Edit: Thank you all for the answers, learned a lot. Still lots to learn. :)
Alternatively, you can simply use a regex.
import os, re
files = [f for f in os.listdir() if re.search(r'part[\d]+\.xml$', f)]
for f in files:
#process..
This will be really useful in case you require advanced filtering.
Note: you can do similar filtering using list returned by glob.glob()
If you are not familiar with the list comprehension and regex, I would recommend you to refer to:
You almost had it. This is your code with some stuff removed:
import glob
for fname in glob.glob('part*.xml'):
with open(fname, 'rb') as xml:
# here goes the API call code
It is possible to make the glob more specific, but as it is it solves the "foo.xml" problem. The key is to not use counters in Python; the idiomatic iteration is for x in y:
and you don't need a counter.
glob
will return the filenames in alphabetical order so you don't even have to worry about that, however remember that ['part1', 'part10', 'part2'] sort in that order. There are a few ways to cope with that but it would be a separate question.
Consider what happens if there are other files that match the '*.xml'
suppose that you have 11 files "part0.xml"..."part10.xml" but also a file called "foo.xml"
Then the for loop will iterate 12 times (since there are 12 matches for the glob). On the 12th iteration, you are trying to open "part11.xml" which doesn't exist.
On approach is to dump the glob and just handle the exception.
part = 0
while True:
try:
with open('part%d.xml' % part, 'rb') as xml:
#here goes the API call code
part += 1
except IOerror:
break
When you use a counter, you need to test, if the file exists:
import os
from itertools import count
for part in count():
filename = 'part%d.xml' % part
if not os.path.exists(filename):
break
with open(filename) as inp:
# do something
Your for
loop is saying "for every file that ends with .xml
"; if you have any file that ends with .xml
that isn't a sequential part%d.xml
, you're going to get an error. Imagine you have part0.xml
and foo.xml
. The for
loop is going to loop twice; on the second loop, it's going to try to open part1.xml
, which doesn't exist.
Since you know the filenames already, you don't even need to use glob.glob()
; just check if each file exists before opening it, until you find one that doesn't exist.
import os
from itertools import count
filenames = ('part%d.xml' % part_num for part_num in count())
for filename in filenames:
if os.path.exists(filename):
with open(filename, 'rb') as xmlfile:
do_stuff(xml_file)
# here goes the API call code
else:
break
If for any reason you're worried about files disappearing between os.path.exists(filename)
and open(filename, 'rb')
, this code is more robust:
import os
from itertools import count
filenames = ('part%d.xml' % part_num for part_num in count())
for filename in filenames:
try:
xmlfile = open(filename, 'rb')
except IOError:
break
else:
with xmlfile:
do_stuff(xmlfile)
# here goes the API call code
You are doing it wrong. Suppose folder has 3 files- part0.xml part1.xml and foo.xml. So loop will iterate 3 times and it will give error for third iteration, it will try to open part2.xml, which is not present.
Don't loop through all files with extension .xml.
Only Loop through files which start with 'part', have a digit in the name before the extension and having extension .xml
So your code will look like this:
import glob
for fname in glob.glob('part*[0-9].xml'):
with open(fname, 'rb') as xml:
#here goes the API call code
Read - glob – Filename pattern matching
If you want files to be uploaded in sequential order then read : String Natural Sort
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.