Match file names in a directory to Pandas series, delete non matching files
I am using Python 2.7.
I have a bunch of files (mostly Outlook emails) in a directory. Example file names:
RE: We have Apple.msg
RE: Orange are in stock.msg
RE: Pick up some cabbage please.msg
I have a pandas Series:
Granny Smith Apple
High Quality Orange
Delicious soup
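How can I iterate over the directory, look for file names that contain words from the pandas Series, and delete the files for which no match is found? In the example above, RE: Pick up some cabbage please.msg would be deleted: Apple and Orange are found in the Series, so the first two files are kept, while the cabbage file matches nothing.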
Edit: I want to actually delete the files in the directory for which no match is found.
We can use str.contains:
s1[pd.Series(l).str.contains('|'.join(s.str.split().sum()))]
Out[560]:
0 RE: We have Apple.msg
1 RE: Orange are in stock.msg
dtype: object
Data input:
l=['RE: We have Apple.msg',
'RE: Orange are in stock.msg',
'RE: Pick up some cabbage please.msg']
s1=pd.Series(l)
s=pd.Series(['Granny Smith Apple','High Quality Orange','Delicious soup'])
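The selection above can be extended to also list the files that did not match, which are the ones the question wants deleted. The following is a minimal sketch, not the answerer's code: the helper name build_pattern, the use of re.escape, and case=False are assumptions added for illustration. Escaping matters if a keyword ever contains a regex metacharacter such as "+" or ".".

```python
import re
import pandas as pd

def build_pattern(phrases):
    """Join every word of every phrase into one alternation regex.

    build_pattern is a hypothetical helper, not part of pandas.
    """
    words = [w for phrase in phrases for w in phrase.split()]
    # re.escape keeps characters such as '+' or '.' from acting as regex syntax
    return '|'.join(re.escape(w) for w in words)

s = pd.Series(['Granny Smith Apple', 'High Quality Orange', 'Delicious soup'])
s1 = pd.Series(['RE: We have Apple.msg',
                'RE: Orange are in stock.msg',
                'RE: Pick up some cabbage please.msg'])

# True where the file name contains at least one keyword (case-insensitive)
mask = s1.str.contains(build_pattern(s), case=False)
to_keep = s1[mask].tolist()
to_delete = s1[~mask].tolist()  # names with no keyword match
```

Here to_delete holds only the cabbage file name. Note that this is substring matching, so a keyword like "soup" would also match a file name containing "soups".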
You can use os and listdir, then use str.contains:
import pandas as pd
from os import listdir
from os.path import isfile, join
m = '/' # your path
files_in_directory = [f for f in listdir(m) if isfile(join(m, f))]
files = pd.Series(files_in_directory)
s = pd.Series(["Granny Smith Apple",
"High Quality Orange",
"Delicious soup"])
z = pd.Series(s.str.split().sum())
files.str.contains('|'.join(z))
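The snippet above only produces a boolean mask; it does not remove anything. One way the deletion step might look is sketched below using only the standard library, so it accepts any iterable of phrases (a pandas Series also works, since iterating it yields its values). The function name delete_unmatched and the dry_run flag are assumptions made for this example, not part of the answer.

```python
import os
import re

def delete_unmatched(directory, phrases, dry_run=True):
    """Delete files in `directory` whose names contain none of the words.

    Hypothetical helper: returns the names that were (or, when dry_run is
    True, would be) deleted. Subdirectories are never touched.
    """
    words = [w for phrase in phrases for w in phrase.split()]
    # With no words at all the pattern is empty and matches every name,
    # so nothing is deleted.
    pattern = re.compile('|'.join(re.escape(w) for w in words), re.IGNORECASE)
    removed = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if os.path.isfile(path) and not pattern.search(name):
            if not dry_run:
                os.remove(path)
            removed.append(name)
    return removed
```

Calling delete_unmatched(m, s) first with the default dry_run=True lets you inspect the returned list before running it again with dry_run=False, since os.remove cannot be undone.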
This is the solution I found that works for me:
import os

# checklist contains the strings we want to filter on (its definition is not shown here)
checklist = [x.lower() for x in checklist]
m = r''  # path where our files are contained
new_directory = r''  # path where we will move the matched files to
for each_checklist in checklist:
    print('now checking for keyword ' + str(each_checklist))
    for root, dirs, files in os.walk(m):
        for i in files:
            if each_checklist in i.lower():
                # this moves the file from root to the target directory
                os.rename(os.path.join(root, i), os.path.join(new_directory, i))