How to count the average number of places between a letter in a string?
I have a string like this:
F-F-F+F+F+F-F-F-F+F+F+F-F+F+F-F+F-F-F-F
I want to compute the average number of characters between consecutive occurrences of "F", "+" and "-".
So for this example, it would be:
Average chars between Fs: 1
Average chars between +s: 2.25
Average chars between -s: 3
What is the most efficient way to do this?
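Here is one variant.
First I collect the indices i of all occurrences of each character; then I compute the mean of the differences between neighbouring indices: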
from collections import defaultdict
from itertools import islice
from statistics import mean

strg = "F-F-F+F+F+F-F-F-F+F+F+F-F+F+F-F+F-F-F-F"

# Map each character to the list of indices where it occurs.
dct = defaultdict(list)
for i, char in enumerate(strg):
    dct[char].append(i)

for char, occurrences in dct.items():
    # Distance between neighbouring indices, minus 1, is the number of characters in between.
    avg = mean(b - a for a, b in zip(occurrences, islice(occurrences, 1, None))) - 1
    print(f"Average chars between {char}s: {avg}")
This prints:
Average chars between Fs: 1
Average chars between -s: 3
Average chars between +s: 2.25
After the first for loop, dct contains entries like this:
'-': [1, 3, 11, 13, 15, 23, 29, 33, 35, 37]
As described above, the second for loop computes the mean of the differences.
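To make the second loop concrete, here is a small worked sketch for the '-' entry shown above. It uses list slicing instead of islice purely for brevity; the variable names distances and gaps are illustrative and not part of the original answer.

```python
from statistics import mean

# Indices of '-' taken from the dct entry shown above.
occurrences = [1, 3, 11, 13, 15, 23, 29, 33, 35, 37]

# Pair each index with its successor and subtract to get the distances.
distances = [b - a for a, b in zip(occurrences, occurrences[1:])]
print(distances)  # [2, 8, 2, 2, 8, 6, 4, 2, 2]

# A distance of d between two indices means d - 1 characters lie in between.
gaps = [d - 1 for d in distances]
avg = mean(gaps)
print(avg)  # 3
```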
I would do this with regular expressions (the re module) as follows:
import re

txt = "F-F-F+F+F+F-F-F-F+F+F+F-F+F+F-F+F-F-F-F"
chars = set(txt)
between = dict()
for i in chars:
    # Non-greedy match of whatever lies between two consecutive occurrences of i.
    between[i] = re.findall('(?<='+re.escape(i)+').*?(?='+re.escape(i)+')', txt)
for i in chars:
    if len(between[i]) == 0:
        between[i] = 0.0
    else:
        between[i] = sum(len(s) for s in between[i]) / len(between[i])
print(between)
Output:
{'F': 1.0, '+': 2.25, '-': 3.0}
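Explanation: I look for the substrings between occurrences of the given character, scanning left to right in a non-greedy way and using zero-length assertions (so "F-F-F" gives ["-","-"] rather than ["-F-"]), then simply compute the mean of their lengths. Note that I use re.escape to handle characters with special meaning (such as +).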
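The difference between the non-greedy and greedy patterns can be checked directly on a short string. This is only an illustrative sketch; the names sample, lazy and greedy are made up for the demonstration.

```python
import re

sample = "F-F-F"
char = re.escape("F")

# Non-greedy: stops at the nearest following F, so each gap is matched separately.
lazy = re.findall(f"(?<={char}).*?(?={char})", sample)
# Greedy: runs to the last F, swallowing the F in the middle.
greedy = re.findall(f"(?<={char}).*(?={char})", sample)

print(lazy)    # ['-', '-']
print(greedy)  # ['-F-']

# re.escape turns a regex metacharacter such as + into a literal.
print(re.escape("+"))  # \+
```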
Another approach that uses no libraries:
string = 'F-F-F+F+F+F-F-F-F+F+F+F-F+F+F-F+F-F-F-F'
string = list(string)
chars = set(string)
for char in chars:
    # Indices at which char occurs.
    ind = [i for i, x in enumerate(string) if x == char]
    # Number of characters between each pair of neighbouring occurrences.
    diff = [ind[i+1]-ind[i] - 1 for i in range(len(ind)-1)]
    print(f'Average chars between {char}s: {sum(diff) / len(diff)}')
Output:
Average chars between +s: 2.25
Average chars between Fs: 1.0
Average chars between -s: 3.0
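Since the question asks about efficiency, one possible single-pass variant is sketched below. It relies on the fact that the gaps sum to (last index - first index - number of gaps), so the individual differences never need to be stored. The function name average_gaps and the choice to return None for a character that occurs only once are assumptions made for this sketch, not part of the answers above.

```python
def average_gaps(text):
    """Return {char: average number of characters between consecutive occurrences}."""
    first, last, count = {}, {}, {}
    for i, ch in enumerate(text):
        first.setdefault(ch, i)   # remember the first index only
        last[ch] = i              # keep overwriting with the latest index
        count[ch] = count.get(ch, 0) + 1

    result = {}
    for ch in first:
        n_gaps = count[ch] - 1
        if n_gaps == 0:
            # Assumption: a character seen once has no gaps, so report None.
            result[ch] = None
        else:
            result[ch] = (last[ch] - first[ch] - n_gaps) / n_gaps
    return result


averages = average_gaps("F-F-F+F+F+F-F-F-F+F+F+F-F+F+F-F+F-F-F-F")
for ch, avg in averages.items():
    print(f"Average chars between {ch}s: {avg}")
```

Unlike the versions above, this one does not raise an error when a character appears only once in the string.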