I have a list of strings, and I want to extract only the words within each string that contain a specific character sequence.
For example:
l1=["grad madd have", "ddim middle left"]
I want all the words that contain the sequence "dd",
so I would like to get:
[["madd"], ["ddim", "middle"]]
I've been trying patterns of the form
[re.findall(r'(\b.*?dd.*\s+)',word) for word in l1]
but have had little success.
You can just use a list comprehension for this. You don't need regex to accomplish what you're trying to do.
l1=["grad madd have", "ddim middle left"]
print([s for a in l1 for s in a.split() if 'dd' in s])
This loops over l1 and splits each string on whitespace. It then tests each resulting word to see whether it contains dd and keeps it if it does. Note that the result is a single flat list, ['madd', 'ddim', 'middle'], not one list per string.
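The one-liner above returns a single flat list. If you need one list per input string, as in the desired output from the question, a nested comprehension is one way to get it. This is only a minimal sketch; the helper name words_containing is made up for illustration and is not from the original answer.

```python
def words_containing(strings, seq):
    """For each string, return the whitespace-separated words that contain seq."""
    # Outer comprehension: one result list per input string.
    # Inner comprehension: keep only the words that contain seq.
    return [[word for word in s.split() if seq in word] for s in strings]


l1 = ["grad madd have", "ddim middle left"]
result = words_containing(l1, "dd")
print(result)  # [['madd'], ['ddim', 'middle']]
```

A string with no matching word contributes an empty list, so the output always has the same length as the input.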
You were close. You want to use \w* to match word characters zero or more times:
[re.findall(r'\w*dd\w*', word) for word in l1]
You can try this regex: \b\w*dd\w*\b
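As a quick check, here is a sketch of that pattern applied to the list from the question. The names pattern and result are only illustrative. The pattern is written as a raw string so that \b is read as a word boundary rather than a backspace character.

```python
import re

l1 = ["grad madd have", "ddim middle left"]

# \b marks a word boundary; \w* matches zero or more word characters
pattern = re.compile(r'\b\w*dd\w*\b')

# One findall call per string gives one list of matches per string
result = [pattern.findall(s) for s in l1]
print(result)  # [['madd'], ['ddim', 'middle']]

# \w does not match punctuation, so a trailing comma is left out of the match
print(pattern.findall("madd, have"))  # ['madd']
```

One difference from the split-based approach: splitting "madd, have" on whitespace would give "madd," with the comma attached, whereas the regex returns only the word characters.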
Try this in one line:
l1=["grad madd have", "ddim middle left"]
print(list(map(lambda x:list(filter(lambda y:'dd' in y,x.split())),l1)))
output:
[['madd'], ['ddim', 'middle']]