Given a string S as input, the program must count the substrings matching the pattern a*b, where * represents one or more alphabetic characters.
import re
s = input()
matches = re.findall(r'MAGIC',s)
print(len(matches))
'''
i/p - aghba34bayyb
o/p - 2
(i.e aghb,ayyb)
It should not take a34b in count.
i/p - aabb
o/p - 3
(i.e aab abb aabb)
i/p : adsbab
o/p : 2
(i.e adsb ab)'''
You can use a non-greedy pattern:
a[a-zA-Z]+?b
import re
s = input()
matches = re.findall(r'a[a-zA-Z]+?b',s)
print(len(matches))
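One caveat worth checking: re.findall scans left to right and returns non-overlapping matches only. The sketch below runs the same pattern over the three sample inputs from the question, without input(), so the counts can be compared with the expected ones. The helper name count_non_overlapping is made up for illustration.

```python
import re

# Same pattern as in the answer above: 'a', one or more letters (non-greedy), 'b'.
PATTERN = re.compile(r'a[a-zA-Z]+?b')

def count_non_overlapping(s):
    """Count matches the way re.findall does: left to right, no overlaps."""
    return len(PATTERN.findall(s))

for sample in ['aghba34bayyb', 'aabb', 'adsbab']:
    print(sample, PATTERN.findall(sample), count_non_overlapping(sample))
```

For aghba34bayyb this gives 2, as expected, but for aabb and adsbab it gives 1, because overlapping candidates such as abb or adsbab are never reported.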
Using re.finditer to match all substrings, including overlapping ones:
import re

inputs = ['aghba34bayyb',
          'aabb',
          'adsbab']

def all_substrings(s):
    # seen holds (start, end) spans already reported, expressed in indices of s
    length, seen = len(s), set()
    for i in range(length):
        for j in range(i + 1, length + 1):
            # [^\d] matches any non-digit character, not only letters
            for g in re.finditer(r'(a[^\d]+b)', s[i:j]):
                if (i+g.start(), i+g.end()) in seen:
                    continue
                seen.add((i+g.start(), i+g.end()))
                yield g.groups()[0]

for i in inputs:
    print('Input="{}" Matches:'.format(i))
    for s in all_substrings(i):
        print(' "{}"'.format(s))
Prints:
Input="aghba34bayyb" Matches:
"aghb"
"ayyb"
Input="aabb" Matches:
"aab"
"aabb"
"abb"
Input="adsbab" Matches:
"adsb"
"adsbab"
You can find the positions of a and b in the word, build every substring that starts at an a and ends at a later b, and then keep only the substrings that contain one or more letters in between:
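import re
from itertools import product

words = ['aghba34bayyb', 'aabb', 'adsbab']
for word in words:
    # indices of every 'a' and every 'b' in the word
    a_pos = [i for i,c in enumerate(word) if c=='a']
    b_pos = [i for i,c in enumerate(word) if c=='b']
    # every substring that starts at an 'a' and ends at a later 'b'
    all_substrings = [word[s:e+1] for s,e in product(a_pos, b_pos) if e>s]
    # keep those with one or more letters (and nothing else) in between
    substrings = [s for s in all_substrings if re.match(r'a[a-zA-Z]+b$', s)]
    print (word, substrings)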
Output
aghba34bayyb ['aghb', 'ayyb']
aabb ['aab', 'aabb', 'abb']
adsbab ['adsb', 'adsbab']
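If only the count is needed, the substrings do not have to be built at all. The following is a minimal sketch, not part of the answers above, assuming that "alphabets" means ASCII letters: it splits the input into runs of letters and, inside each run, counts every b together with the number of a characters at least two positions to its left. The function name count_a_star_b is hypothetical.

```python
import re

def count_a_star_b(s):
    """Count substrings 'a' + one or more letters + 'b', overlaps included."""
    total = 0
    # A valid substring cannot cross a non-letter, so handle each run of letters alone.
    for run in re.findall(r'[A-Za-z]+', s):
        a_seen = 0  # number of 'a' at index <= j - 2 within the current run
        for j, ch in enumerate(run):
            if j >= 2 and run[j - 2] == 'a':
                a_seen += 1
            if ch == 'b':
                # every earlier 'a' with at least one letter in between pairs with this 'b'
                total += a_seen
    return total

for word in ['aghba34bayyb', 'aabb', 'adsbab']:
    print(word, count_a_star_b(word))
```

This runs in time linear in the length of the input and gives 2, 3 and 2 for the three sample inputs, the same counts as the substring lists above.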
re.findall(r'a[A-Za-z]+?b',s)
where [A-Za-z] matches an alphabetic character, + means one or more of them, and the trailing ? makes the quantifier non-greedy.
You could match a followed by one character a-z, then use a character class that matches a or c-z zero or more times, and then match the first b:
a[a-z][ac-z]*b
If you want to match all following b's, so that aabb is matched instead of aab, you could use
a[a-z][ac-z]*b+
import re
s = input()
matches = re.findall(r'a[a-z][ac-z]*b+',s)
print(len(matches))
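To see how the two variants differ, here is a small sketch that applies both patterns to the sample inputs from the question. The constant names FIRST_B and ALL_BS are made up for illustration. Both patterns still report non-overlapping matches only.

```python
import re

# Stops at the first 'b' after at least one letter.
FIRST_B = re.compile(r'a[a-z][ac-z]*b')
# Also consumes any 'b' characters that follow immediately.
ALL_BS = re.compile(r'a[a-z][ac-z]*b+')

for sample in ['aghba34bayyb', 'aabb', 'adsbab']:
    print(sample, FIRST_B.findall(sample), ALL_BS.findall(sample))
```

For aabb the first pattern returns ['aab'] and the second returns ['aabb'], while for aghba34bayyb both return ['aghb', 'ayyb'].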