I have strings like so:
F-F-F+F+F+F-F-F-F+F+F+F-F+F+F-F+F-F-F-F
And I would like to count the average number of places between instances of 'F', '+' and '-'.
So for this example it would be:
Average chars between Fs: 1
Average chars between +s: 2.25
Average chars between -s: 3
What would be the most efficient way of doing this?
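This is a variant. First I collect the indices of the occurrences of every character; then I calculate the mean of the differences between consecutive indices (minus 1, because two adjacent occurrences differ by 1 in index but have 0 characters between them):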
from collections import defaultdict
from itertools import islice
from statistics import mean
strg = "F-F-F+F+F+F-F-F-F+F+F+F-F+F+F-F+F-F-F-F"
dct = defaultdict(list)
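# Map each character to the list of indices at which it occurs.
for i, char in enumerate(strg):
    dct[char].append(i)
# The mean difference between consecutive indices, minus 1, is the mean gap.
for char, occurrences in dct.items():
    avg = mean(b - a for a, b in zip(occurrences, islice(occurrences, 1, None))) - 1
    print(f"Average chars between {char}s: {avg}")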
This prints:
Average chars between Fs: 1
Average chars between -s: 3
Average chars between +s: 2.25
After the first for loop there will be entries like this in dct:
'-': [1, 3, 11, 13, 15, 23, 29, 33, 35, 37]
And, as mentioned, the second for loop calculates the mean of the differences.
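Since the question asks about efficiency, one observation may help: the differences between consecutive indices telescope, so their sum is just the last index minus the first. For the '-' entry above that is (37 - 1) / 9 - 1 = 3. Below is a minimal single-pass sketch built on that idea. The function name average_gaps and the choice to return None for a character that occurs only once are my own assumptions, not part of the answer above.

```python
def average_gaps(text):
    """Return {char: average number of characters between consecutive occurrences}."""
    first, last, count = {}, {}, {}
    for i, ch in enumerate(text):
        first.setdefault(ch, i)           # index of the first occurrence
        last[ch] = i                      # index of the most recent occurrence
        count[ch] = count.get(ch, 0) + 1  # number of occurrences so far
    result = {}
    for ch, n in count.items():
        if n < 2:
            # Assumption: a character seen only once has no gap, so report None.
            result[ch] = None
        else:
            # Consecutive differences sum to last - first; subtract 1 per gap.
            result[ch] = (last[ch] - first[ch]) / (n - 1) - 1
    return result


gaps = average_gaps("F-F-F+F+F+F-F-F-F+F+F+F-F+F+F-F+F-F-F-F")
print(gaps)  # {'F': 1.0, '-': 3.0, '+': 2.25}
```

This keeps only three small dictionaries instead of a full index list per character, which might matter for very long strings.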
I would do that using regular expressions (the re module) in the following way:
import re
txt = "F-F-F+F+F+F-F-F-F+F+F+F-F+F+F-F+F-F-F-F"
chars = set(txt)
between = {}
for ch in chars:
    # Lazily match the text between one occurrence of ch and the next.
    pattern = '(?<=' + re.escape(ch) + ').*?(?=' + re.escape(ch) + ')'
    between[ch] = re.findall(pattern, txt)
for ch in chars:
    if len(between[ch]) == 0:
        between[ch] = 0.0
    else:
        between[ch] = sum(len(s) for s in between[ch]) / len(between[ch])
print(between)
Output:
{'F': 1.0, '+': 2.25, '-': 3.0}
Explanation: I look for the substrings between consecutive occurrences of a given character (using zero-length assertions), scanning from left to right in a non-greedy manner (hence "F-F-F" gives ["-","-"] rather than ["-F-"]), then simply calculate the average of their lengths. Note that I used re.escape to deal with characters that have a special meaning in regular expressions (like +).
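To make the non-greedy point concrete, here is a small sketch comparing the lazy and greedy forms of the pattern on the short string "F-F-F", plus one case where re.escape matters. The variable names are illustrative only.

```python
import re

sample = "F-F-F"

# Lazy quantifier: stop at the nearest following F.
lazy = re.findall(r"(?<=F).*?(?=F)", sample)
# Greedy quantifier: stretch to the last following F.
greedy = re.findall(r"(?<=F).*(?=F)", sample)

print(lazy)    # ['-', '-']
print(greedy)  # ['-F-']

# '+' is a regex metacharacter, so it has to be escaped before being
# placed inside the lookbehind and lookahead.
plus = re.escape("+")
escaped = re.findall("(?<=" + plus + ").*?(?=" + plus + ")", "+F-F+")
print(escaped)  # ['F-F']
```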
Another way without using any libraries:
string = 'F-F-F+F+F+F-F-F-F+F+F+F-F+F+F-F+F-F-F-F'
string = list(string)
chars = set(string)
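for char in chars:
    # Indices at which char occurs.
    ind = [i for i, x in enumerate(string) if x == char]
    # Number of characters between each pair of consecutive occurrences.
    diff = [ind[i+1] - ind[i] - 1 for i in range(len(ind) - 1)]
    print(f'Average chars between {char}s: {sum(diff) / len(diff)}')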
Output:
Average chars between +s: 2.25
Average chars between Fs: 1.0
Average chars between -s: 3.0