Python Multiple filetype support with glob.glob

I'm trying to use glob.glob to support more than one file type. My code is supposed to find files with the extensions '.pdf', '.xls', and '.xlsx' in the directory '/mnt/Test' and then run the processing steps below on each match.

When I replace the existing for loop with just

for filename in glob.glob("*.xlsx"):
    print filename

it works just fine.

When attempting to run the following code:

def main():
    os.chdir("/mnt/Test")
    extensions = ("*.xls", ".xlsx", ".pdf")
    filename = []
    for files in extensions:
        filename.extend(glob.glob(files))
        print filename
        sys.stdout.flush()
        doc_id, version = doc_placeholder(filename)

        print 'doc_id:', doc_id, 'version:', version

        workspace_upload(doc_id, version, filename)

        print "%s has been found. Preparing next phase..." % filename
        ftp_connection.cwd(remote_path)
        fh = open(filename, 'rb')
        ftp_connection.storbinary('STOR %s' % timestr + '_' + filename, fh)
        fh.close()

        send_email(filename)

I run into the following error:

Report /mnt/Test/fileTest.xlsx has been added.
[]
Exception in thread Thread-1:
Traceback (most recent call last):
File "/usr/lib/python2.7/threading.py", line 810, in __bootstrap_inner
self.run()
File "/usr/local/lib/python2.7/dist-packages/watchdog/observers/api.py", line 199, in run
self.dispatch_events(self.event_queue, self.timeout)
File "/usr/local/lib/python2.7/dist-packages/watchdog/observers/api.py", line 368, in dispatch_events
handler.dispatch(event)
File "/usr/local/lib/python2.7/dist-packages/watchdog/events.py", line 330, in dispatch
_method_map[event_type](event)
File "observe.py", line 14, in on_created
fero.main()
File "/home/tesuser/project-a/testing.py", line 129, in main
doc_id, version = doc_placeholder(filename)
File "/home/testuser/project-a/testing.py", line 58, in doc_placeholder
payload = {'documents':[{'document':{'name':os.path.splitext(filename)[0],'parentId':parent_id()}}]}
File "/usr/lib/python2.7/posixpath.py", line 105, in splitext
return genericpath._splitext(p, sep, altsep, extsep)
File "/usr/lib/python2.7/genericpath.py", line 91, in _splitext
sepIndex = p.rfind(sep)
AttributeError: 'list' object has no attribute 'rfind'

How can I edit the code above to achieve what I need?

Thanks in advance, everyone. Appreciate the help.

doc_placeholder includes the snippet os.path.splitext(filename). In your code, filename is the list you built with extend, so you have passed a list to os.path.splitext, which expects a single string. That is what the AttributeError at the bottom of the traceback is reporting.
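To make the string-versus-list point concrete, here is a small sketch. The file names are made up for illustration. On Python 2.7 the list call fails with the AttributeError shown in the traceback; on Python 3 the same mistake surfaces as a TypeError, so the sketch catches both.

```python
import os.path

# Hypothetical file names, standing in for whatever glob.glob returns.
filenames = ["fileTest.xlsx", "report.pdf"]

# splitext takes one path string, so apply it to each item in turn.
stems = [os.path.splitext(name)[0] for name in filenames]

# Handing it the whole list fails instead of splitting every element.
try:
    os.path.splitext(filenames)
    list_accepted = True
except (AttributeError, TypeError):
    list_accepted = False
```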

Fix this by iterating over each filename instead of trying to process the entire list at once.

import glob
import os
import sys

def main():
    os.chdir("/mnt/Test")
    # every pattern needs the leading "*" wildcard (the original had ".xlsx" and ".pdf")
    extensions = ("*.xls", "*.xlsx", "*.pdf")
    filenames = []  # made 'filename' plural to indicate it's a list

    # building list of filenames moved to separate loop
    for pattern in extensions:
        filenames.extend(glob.glob(pattern))

    # iterate over filenames
    for filename in filenames:
        print filename
        sys.stdout.flush()
        doc_id, version = doc_placeholder(filename)

        print 'doc_id:', doc_id, 'version:', version

        workspace_upload(doc_id, version, filename)

        print "%s has been found. Preparing next phase..." % filename
        ftp_connection.cwd(remote_path)
        # 'with' closes the file even if the upload raises
        with open(filename, 'rb') as fh:
            ftp_connection.storbinary('STOR %s' % timestr + '_' + filename, fh)

        send_email(filename)
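If you would rather not change the working directory, the collection step could be pulled into a small helper along these lines. This is only a sketch: find_files is a hypothetical name, not something from your code or from the glob module, and it assumes you are fine with getting full paths back instead of bare file names.

```python
import glob
import os


def find_files(directory, patterns):
    """Return a sorted list of paths in `directory` matching any glob pattern."""
    matches = set()
    for pattern in patterns:
        # Join the directory into the pattern so no os.chdir() is needed.
        matches.update(glob.glob(os.path.join(directory, pattern)))
    # A set avoids duplicates if two patterns happen to match the same file.
    return sorted(matches)
```

You would then loop with something like `for filename in find_files("/mnt/Test", ("*.xls", "*.xlsx", "*.pdf")):`. Note that a pattern without the wildcard, such as ".xlsx", only matches a file whose entire name is ".xlsx", which is why the original tuple found nothing for two of its three entries.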
