Python: matching integers in file names?

Question

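I am creating a data file whose first column ('id') holds identifiers made up of a name and a number (e.g. name22, name43, name185). I am trying to take the number from each identifier in the data file and match it against the numbers in the names of files that live in the directory above the current one. Those files have different names but the same corresponding numbers (e.g. old22, old43, old185).

How can I match the numbers in the 'id' column of the data file with the numbers in the file names? I wrote the script below, but it produces no output and no errors.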

import os
import fnmatch
import pandas as pd

os.system('grep id *log > data.txt')
df = pd.read_table("data.txt", delim_whitespace=True, header = None)
df.columns = ['id','anum','aname','iso']
num = df.id.str.extract('(\d+)')
regex = r'\d+'

for filename in os.listdir('../'):
    if fnmatch.fnmatch(regex,'*.txt'):
         f = open(filename,"r"):
         ...do more things....

Answer 1

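This module provides support for Unix shell-style wildcards, which are not the same as regular expressions (which are documented in the re module). The special characters used in shell-style wildcards are: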

Pattern     Meaning
*   matches everything
?   matches any single character
[seq]   matches any character in seq
[!seq]  matches any character not in seq

(from the fnmatch documentation)
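
To make the difference concrete, here is a minimal sketch of how fnmatch.fnmatch treats its arguments. The file names are made up for illustration. Note that the first argument is the name being tested and the second is the pattern, so the call in the question, which passes the regex string as the name, never succeeds.

```python
import fnmatch

# fnmatch.fnmatch(name, pattern): the name comes first, the pattern second.
names = ['old22.txt', 'old43.txt', 'notes.md']  # hypothetical file names

# Shell wildcard: '*' matches any run of characters.
txt_matches = [n for n in names if fnmatch.fnmatch(n, '*.txt')]

# A regex such as r'\d+' is not read as "one or more digits";
# fnmatch treats the backslash, 'd' and '+' as literal characters.
regex_matches = [n for n in names if fnmatch.fnmatch(n, r'*\d+.txt')]

# The check from the question tests the string r'\d+' itself against '*.txt'.
# That is False on every loop iteration, so the loop body never runs.
question_check = fnmatch.fnmatch(r'\d+', '*.txt')

print(txt_matches)
print(regex_matches)
print(question_check)
```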

This means you cannot use full regular expressions to find file names, only shell wildcards. I suggest combining * with the ID, as in *123.txt

import fnmatch
import os

# assuming you have an id variable holding the number to look for
for filename in os.listdir('.'):
    if fnmatch.fnmatch(filename, '*{0}.txt'.format(id)):
        f = open(filename, "r")  # ...

You can also use the fnmatch.filter function, since the solution above is not the most efficient one.
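
A minimal sketch of the fnmatch.filter variant follows. The list of file names and the id_number variable are assumptions made for illustration; in practice the list would come from os.listdir.

```python
import fnmatch

# Hypothetical directory listing; in practice this would be os.listdir('.').
filenames = ['old22.txt', 'old43.txt', 'old185.txt', 'mold43.png']
id_number = 43  # hypothetical id number taken from the data file

# fnmatch.filter(names, pattern) returns the names that match the pattern.
matching = fnmatch.filter(filenames, '*{0}.txt'.format(id_number))
print(matching)
```

One caveat with this kind of pattern: because * can absorb digits as well, *43.txt would also match a name such as old143.txt. If that matters, comparing the extracted numbers with a regular expression is the safer route.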

Answer 2

If your id column looks like this:

f_s = pd.Series(['name22', 'name43', 'name185'])

and os.listdir('../') looks like this:

others = ['old22.txt', 'old43.txt', 'old185.txt', 'mold43.png']

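you can build a set of the numbers in the id column:

# expand=False returns a Series of strings; with the DataFrame returned by
# default in recent pandas, set() would collect column labels instead
id_nbrs = set(f_s.str.extract(r'(\d+)', expand=False))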

Then use a function to filter out the files you want:

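import os
import re

digits = re.compile(r'(\d+)$')

def f(s):
    # splitext also copes with names that contain more than one dot
    name, ext = os.path.splitext(s)
    match = digits.search(name)
    # keep only .txt files whose trailing number is one of the id numbers
    return match is not None and match.group() in id_nbrs and ext == '.txt'

for thing in filter(f, others):
    print(thing)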

>>>
old22.txt
old43.txt
old185.txt
>>>
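
Putting the pieces together, the following is a rough sketch that pairs each id with its file using only the re module. The sample ids and file names are assumptions mirroring the examples above, and the helper names (id_by_number, pairs) are made up for illustration. Numbers are compared as whole strings, so an id ending in 43 is not paired with a file such as old143.txt.

```python
import re

# Hypothetical inputs: values of the 'id' column and a directory listing.
ids = ['name22', 'name43', 'name185']
filenames = ['old22.txt', 'old43.txt', 'old185.txt', 'mold43.png', 'old143.txt']

trailing_number = re.compile(r'(\d+)$')    # digits at the end of an id
file_pattern = re.compile(r'(\d+)\.txt$')  # digits right before '.txt'

# Map each number to the id that carries it.
id_by_number = {}
for identifier in ids:
    m = trailing_number.search(identifier)
    if m:
        id_by_number[m.group(1)] = identifier

# Pair every .txt file with the id that shares its number.
pairs = {}
for filename in filenames:
    m = file_pattern.search(filename)
    if m and m.group(1) in id_by_number:
        pairs[id_by_number[m.group(1)]] = filename

print(pairs)
```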

Answer 1
2 2018-02-14 20:14:12

Answer 2
1 accepted 2018-02-14 21:56:34
