Count how many times a regex pattern occurs in a list of strings

Question

Suppose I have a list of schools:

schools = [
    '00A000',
    '01A000',
    '00B000',
    '01B000',
    '00C000',
    '01C000'
]

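I'm doing some data exploration, and the first thing I want to do is count all schools that look like %A% (that is, with an A in the middle).

I thought I could use something like the following: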

schools.count('\BA')

But it looks like the only way I can use a regular expression is through the re module:

[re.findall(r'\BA', x) for x in schools].count(['A'])

Is this the simplest way?

Full code:

import re

schools = [
    '00A000',
    '01A000',
    '00B000',
    '01B000',
    '00C000',
    '01C000'
]

# Data exploration. Find count of all district A schools.

# I thought I could use list's built in count and some kind of string regex for it to
# take in:
schools.count('\BA')
# Above example is invalid.

# It looks like I must loop over with regex and then add a count after, right?
[re.findall(r'\BA', x) for x in schools].count(['A'])

# Repeat for B and C...

Answer 1

You can drop regular expressions altogether. If you really want to match "xyAuv" but not "Axyuv" or "xyuvA", you can use:

len([1 for school in schools if 'A' in school[1:-1]])

If an 'A' anywhere in the string is acceptable, then of course just use 'A' in school instead.
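
To make the difference between the two checks concrete, here is a small sketch. The two extra entries, 'A0D000' and '01B00A', are borrowed from the proof of concept in Answer 3, and the variable names are only illustrative.

```python
schools = [
    '00A000', '01A000', '00B000', '01B000',
    '00C000', '01C000', 'A0D000', '01B00A',
]

# 'A' strictly inside the string: the slice drops the first and last character.
interior = len([1 for school in schools if 'A' in school[1:-1]])

# 'A' anywhere in the string, including the first or last position.
anywhere = len([1 for school in schools if 'A' in school])

print(interior, anywhere)  # 2 4
```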

A fancier way to write it would be:

sum('A' in school for school in schools)

but it can be confusing, and it is a bit slower.

Or:

from functools import reduce
from operator import add

reduce(add, ('A' in school for school in schools))

which is also fancy, but faster.
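
The speed remarks above are easy to check on your own machine. The following is only a sketch using timeit: the helper names are made up for illustration, the list is repeated to give the timer something to measure, and the timings will vary with the Python version and the data, so no numbers are quoted here.

```python
import timeit
from functools import reduce
from operator import add

# Repeat the sample list so each call does a measurable amount of work.
schools = ['00A000', '01A000', '00B000', '01B000', '00C000', '01C000'] * 1000


def count_len():
    return len([1 for school in schools if 'A' in school])


def count_sum():
    return sum('A' in school for school in schools)


def count_reduce():
    return reduce(add, ('A' in school for school in schools))


# All three variants must agree before the timings mean anything.
assert count_len() == count_sum() == count_reduce()

for fn in (count_len, count_sum, count_reduce):
    seconds = timeit.timeit(fn, number=100)
    print(f'{fn.__name__}: {seconds:.4f}s')
```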

Answer 2

How about joining the list into a single string and counting the occurrences:

import re
print(len(re.findall(r'\BA',','.join(schools))))

Output (for the list in the question):

2
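
One thing to keep in mind with the joined-string approach is that it counts matches, not schools, so a code containing two qualifying A's is counted twice. If the goal is the number of matching strings, a per-string re.search is one possible alternative. The sketch below assumes that distinction matters for your data; the value '0AA000' is invented purely to show the difference.

```python
import re

schools = ['00A000', '01A000', '00B000', '0AA000']

# 'A' that is not at the start of a word (same pattern as above).
pattern = re.compile(r'\BA')

# Counts every match in the joined string.
occurrences = len(pattern.findall(','.join(schools)))

# Counts each string at most once, however many matches it contains.
matching_schools = sum(1 for school in schools if pattern.search(school))

print(occurrences, matching_schools)  # 4 3
```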

Answer 3

As I said in the comments, I would go with:

len(re.findall(r'\BA\B', ','.join(schools)))

Here is a proof of concept:

Python 3.7.6 (default, Dec 19 2019, 22:52:49) 
[GCC 9.2.1 20190827 (Red Hat 9.2.1-1)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import re
>>> schools = [
...     '00A000',
...     '01A000',
...     '00B000',
...     '01B000',
...     '00C000',
...     '01C000',
...     'A0D000',
...     '01B00A'
... ]
>>> 
>>> len(re.findall('\BA\B', ','.join(schools)))
2
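
Since the question goes on to repeat the count for B and C, one possible generalisation is to tally every interior letter in a single pass with collections.Counter. This is only a sketch: it assumes a district is a single uppercase letter with a word character on both sides, which is a guess based on the sample data, and the names interior_letter and district_counts are illustrative.

```python
import re
from collections import Counter

schools = [
    '00A000', '01A000', '00B000', '01B000',
    '00C000', '01C000', 'A0D000', '01B00A',
]

# \B on both sides requires a word character before and after the letter,
# so a letter in the first or last position does not match.
interior_letter = re.compile(r'\B[A-Z]\B')

# set() makes each school count at most once per letter.
district_counts = Counter(
    letter
    for school in schools
    for letter in set(interior_letter.findall(school))
)

print(district_counts['A'], district_counts['B'], district_counts['C'])  # 2 3 2
```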


Answer 1
1 2020-01-17 18:12:16

Answer 2
0 2020-01-17 17:14:40

Answer 3
0 2020-01-17 17:18:51
