
How to search items of a list within strings in Python

In Python 2.7, I want to find and count the files whose names contain any string from a given list.

List of files:

  • Passport_Mike.pdf
  • David-Passport.pdf
  • Iain ID Card.pdf
  • CopyPassport Michael.pdf
  • Driving License John.pdf

I want to count all the files that have 'Passport' or 'ID' in their names.

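So far, I have found a way to split each filename into separate words based on delimiters (such as '_', '-' and ' '). Some files are still missed, because a filename cannot always be split this way: 'CopyPassport Michael', for example, has no delimiter separating 'Passport' from 'Copy'.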
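The failure described above can be reproduced in a few lines. This is only a minimal sketch: it uses the standard `re.split` in place of the `fileSplit` helper from the question, and the character class is an assumption based on the delimiters passed to that helper.

```python
import re

filename = 'CopyPassport Michael.pdf'

# Split on underscore, space, hyphen, or period.
tokens = re.split(r'[_ \-.]', filename)
print(tokens)  # ['CopyPassport', 'Michael', 'pdf']

# Comparing whole tokens misses the term, because 'Passport'
# is glued to 'Copy' with no delimiter in between...
token_match = 'Passport' in tokens
# ...while a plain substring test on the filename finds it.
substring_match = 'Passport' in filename

print(token_match)      # False
print(substring_match)  # True
```

The `in` operator means "is an element of" for a list but "is a substring of" for a string, which is why the two tests disagree.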

My code is based on an answer given to another question. For this code, I use collections.Counter().

Here is my code:

from collections import Counter

listOfFiles = ['Passport_Mike.pdf', 'David-Passport.pdf', 'Iain ID Card.pdf', 'CopyPassport Michael.pdf', 'Driving License John.pdf']
searrchTermsList = ['Passport', 'ID']

def fileSplit(string, delimiters):
    delimiters = tuple(delimiters)
    stack = [string,]

    for delimiter in delimiters:
        for i, substring in enumerate(stack):
            substack = substring.split(delimiter)
            stack.pop(i)
            for j, _substring in enumerate(substack):
                stack.insert(i+j, _substring)
    return stack
# This split function is complicated, but it breaks each filename into parts for the next function. Other split methods didn't work for me.

def searchTermsCount(listOfFiles, searchTermsList):
    # Tally every upper-cased token across all filenames.
    counts = Counter()
    for myFile in listOfFiles:
        myFileSplit = fileSplit(myFile, ('_', ' ', '-', '.'))
        counts.update(word.upper() for word in myFileSplit)
    myCount = 0
    for word in searchTermsList:
        # Tokens were upper-cased above, so upper-case the term as well.
        myCount += counts[word.upper()]
    print("Count files: %d" % myCount)

What is a Python 2.7 way to count the files whose names contain any string from a list, without relying on delimiters?

Try this:

listOfFiles = ['Passport_Mike.pdf', 'David-Passport.pdf', 'Iain ID Card.pdf', 'CopyPassport Michael.pdf', 'Driving License John.pdf']
searrchTermsList = ["Passport", 'ID']
relevantfiles = [filename for filename in listOfFiles if any(searchterm in filename for searchterm in searrchTermsList)]
print(relevantfiles)

Output:

['Passport_Mike.pdf', 'David-Passport.pdf', 'Iain ID Card.pdf', 'CopyPassport Michael.pdf']

listOfFiles = ['Passport_Mike.pdf', 'David-Passport.pdf', 'Iain ID Card.pdf', 'CopyPassport Michael.pdf', 'Driving License John.pdf']
searrchTermsList = ['Passport', 'ID']

filenamesWithTerms = []
for filename in listOfFiles:
    for term in searrchTermsList:
        if term in filename:
            filenamesWithTerms.append(filename)
            break  # one matching term is enough; avoids adding the file twice
print(filenamesWithTerms)
# Output: ['Passport_Mike.pdf', 'David-Passport.pdf', 'Iain ID Card.pdf', 'CopyPassport Michael.pdf']
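
The snippets above collect the matching filenames, while the question asks for a count. One way to get the number directly is sketched below. The helper name `count_matching_files` is hypothetical and not part of either answer. The code is written to run on both Python 2.7 and Python 3.

```python
def count_matching_files(filenames, terms):
    """Return how many filenames contain at least one of the terms.

    Hypothetical helper for illustration; matching is a case-sensitive
    substring test, the same test the answers above use.
    """
    return sum(
        1 for name in filenames
        if any(term in name for term in terms)
    )


listOfFiles = ['Passport_Mike.pdf', 'David-Passport.pdf', 'Iain ID Card.pdf',
               'CopyPassport Michael.pdf', 'Driving License John.pdf']
searrchTermsList = ['Passport', 'ID']

matching_count = count_matching_files(listOfFiles, searrchTermsList)
print(matching_count)  # 4
```

The test is deliberately left case-sensitive. Lower-casing both sides would make a short term such as 'ID' match unrelated names, since 'id' is a substring of 'david'.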

