How do I extract a specific name from the filenames of Word documents inside a for loop (in Python)?

Question

Below is the for loop that iterates over all the Word document files. As you can see, I print the filename to check its output.

import os

# root_dir, output, mynotes_extractor and cleanString are defined elsewhere.
for filename in os.listdir(root_dir):
    source_directory = os.path.join(root_dir, filename)
    # The output of filename is shown in the next section.
    print(filename)  # <- this is the line in question
    arr = mynotes_extractor.get_mynotes(source_directory)
    list2str = str(arr)
    c = cleanString(newstring=list2str)
    new_arr = [c]
    # Append the cleaned text to the output file; 'with' closes it afterwards.
    with open(output, 'a', encoding='utf-8') as text_file:
        for item in new_arr:
            text_file.write("%s\n" % item)

Below is the output after printing the filenames:

12345_Cat_A_My Notes.docx
6789_Cat_B_My Notes.docx
54321_Cat_A_My Notes.docx
12234_Cat_C_My Notes.docx
86075_Cat_D_My Notes.docx
34324_Cat_E_My Notes.docx

Inside the for loop, I only want to extract one specific part of each Word document's filename, namely "My Notes".

For instance:
         Filename before extraction: 34324_Cat_E_My Notes.docx
         Result after extraction: My Notes

Answer 1

A one-liner, though it may look confusing at first.

filename.split('.')[0].split('_')[-1]

Output: 'My Notes'
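To try the one-liner on every filename from the question, it can be wrapped in a small function and called in a loop. This is only a minimal sketch: `extract_name` is a hypothetical helper name that does not appear in the original post, and the hard-coded `filenames` list stands in for `os.listdir(root_dir)`.

```python
# Hypothetical helper wrapping the one-liner from this answer.
def extract_name(filename):
    return filename.split('.')[0].split('_')[-1]

# Stand-in for os.listdir(root_dir), using the filenames from the question.
filenames = [
    '12345_Cat_A_My Notes.docx',
    '6789_Cat_B_My Notes.docx',
    '54321_Cat_A_My Notes.docx',
    '12234_Cat_C_My Notes.docx',
    '86075_Cat_D_My Notes.docx',
    '34324_Cat_E_My Notes.docx',
]

# Collect the extracted part of every filename.
names = [extract_name(f) for f in filenames]

for name in names:
    print(name)  # prints "My Notes" once per file
```

In the loop from the question, the same call would presumably go right after `print(filename)`.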

Step by step:

filename = '12345_Cat_A_My Notes.docx'

.split('.') splits the string at every period

>>>['12345_Cat_A_My Notes', 'docx']

[0] takes the first element of that list

>>>'12345_Cat_A_My Notes'

.split('_') splits this string at every underscore

>>>['12345', 'Cat', 'A', 'My Notes']

[-1] finally returns the last item of the list

>>>'My Notes'
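One caveat with the steps above: `.split('.')[0]` keeps only the text before the first period, so a filename whose name part itself contains a dot would be cut short. A possible variant, sketched here with the standard-library `os.path.splitext` and `str.rsplit`, removes only the final extension and splits only at the last underscore. The function name `extract_name_safe` and the second sample filename are made up for illustration and are not from the original post.

```python
import os

# Hypothetical variant of the one-liner, not part of the original answer.
def extract_name_safe(filename):
    # splitext strips only the last extension, e.g. '.docx'.
    stem, _ext = os.path.splitext(filename)
    # rsplit with maxsplit=1 cuts only at the last underscore.
    return stem.rsplit('_', 1)[-1]

print(extract_name_safe('12345_Cat_A_My Notes.docx'))       # My Notes
print(extract_name_safe('12345_Cat_A_My Notes v1.2.docx'))  # My Notes v1.2
```

For the filenames shown in the question, both versions give the same result; the difference only shows up when extra dots appear before the extension.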

Solution 1 · 2 · Accepted · 2018-10-08 14:28:58
