pandas remove all words before a specific word and get the first n words after that specific word
Get all words that come after specific word
I have a list containing strings like this:
['White Buns (Hot Dog)', ' 2 x Danish (Almond Danish) - A delight with coffee! \nThe best fruit fillings and of coarse butter filled flaky puff pastry.\n']
I want to keep everything before the parenthesized words, plus the parenthesized words themselves, to get the following:
White Buns (Hot Dog)
2 x Danish (Almond Danish)
Is there a way to do this? I tried a regular expression, but I don't know how to specify (a word in parentheses).
I would use str.partition:
li=['White Buns (Hot Dog)', ' 2 x Danish (Almond Danish) - A delight with coffee! \nThe best fruit fillings and of coarse butter filled flaky puff pastry.\n']
>>> [''.join(s.partition(')')[:2]) for s in li]
['White Buns (Hot Dog)', ' 2 x Danish (Almond Danish)']
This works whether or not a ')' is present.
Given:
li=['No paren in this string...',
'Divide me (after) <- please',
'More that (1) paren (2) here']
>>> [''.join(s.partition(')')[:2]) for s in li]
['No paren in this string...', 'Divide me (after)', 'More that (1)']
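The expected output in the question has no leading space, while the partition results above keep whatever whitespace the input had. A minimal sketch that also strips it, assuming surrounding whitespace is never wanted (the helper name keep_through_paren is made up for illustration, and the second string is shortened):

```python
def keep_through_paren(s):
    """Return s up to and including the first ')', with outer whitespace removed."""
    head, sep, _tail = s.partition(')')
    # If there is no ')', sep is '' and head is the whole string.
    return (head + sep).strip()

li = ['White Buns (Hot Dog)',
      ' 2 x Danish (Almond Danish) - A delight with coffee! \nThe best fruit fillings.\n']
cleaned = [keep_through_paren(s) for s in li]
print(cleaned)  # ['White Buns (Hot Dog)', '2 x Danish (Almond Danish)']
```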
If you want to use a regular expression, use re.sub to find and keep the first part:
>>> import re
>>> [re.sub(r'([^)]*[)]).*', r'\1', s) for s in li]
['No paren in this string...', 'Divide me (after)', 'More that (1)']
Don't forget the .* after ([^)]*[)]); without it, re.sub just reassembles your original string.
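One caveat with the re.sub approach: by default, . does not match a newline, so for a multi-line string like the second item in the question, the text after the first line break is left in place. A sketch that passes flags=re.DOTALL so that .* consumes the rest of the string (the variable names are illustrative, and the sample string is shortened):

```python
import re

s = ' 2 x Danish (Almond Danish) - A delight with coffee! \nThe best fruit fillings.\n'

# Without DOTALL, '.*' stops at the newline, so the second line survives.
default = re.sub(r'([^)]*[)]).*', r'\1', s)

# With DOTALL, '.' also matches '\n', so everything after the first ')' is dropped.
dotall = re.sub(r'([^)]*[)]).*', r'\1', s, flags=re.DOTALL)

print(repr(default))  # ' 2 x Danish (Almond Danish)\nThe best fruit fillings.\n'
print(repr(dotall))   # ' 2 x Danish (Almond Danish)'
```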
To specify parentheses in a pattern, escape them: '\)' matches a literal ')' and '\(' matches a literal '('.
import re  # regular expression library
pattern = r"(.*?)(\(.*?\))"  # raw string, so the backslashes reach the regex engine unchanged
text = 't2 x Danish (Almond Danish) - A delight with coffee! \nThe best fruit fillings and of coarse butter filled flaky puff pastry.\n'
result = re.search(pattern, text)
print(result[0])  # the entire match
print(result[1])  # first group: (.*?)
print(result[2])  # second group: (\(.*?\))
Result:
t2 x Danish (Almond Danish)
t2 x Danish
(Almond Danish)
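The snippet above handles a single string and would raise a TypeError if there were no parentheses, because re.search returns None in that case. A possible way to apply the same pattern to the whole list, assuming strings without parentheses should be returned unchanged (through_first_parens is a hypothetical helper name, and the second string is shortened):

```python
import re

# Same pattern as above, compiled once.
PATTERN = re.compile(r'(.*?)(\(.*?\))')

def through_first_parens(text):
    """Return text up to and including the first '(...)', or text unchanged if none."""
    match = PATTERN.search(text)
    # re.search returns None when nothing matches, so guard before indexing.
    return match[0] if match else text

items = ['White Buns (Hot Dog)',
         ' 2 x Danish (Almond Danish) - A delight with coffee! \nThe best fruit fillings.\n',
         'No paren in this string...']
results = [through_first_parens(s) for s in items]
print(results)
```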
Here is one way to achieve this:
list1 = ['White Buns (Hot Dog)', ' 2 x Danish (Almond Danish) - A delight with coffee! \nThe best fruit fillings and of coarse butter filled flaky puff pastry.\n']
b = []  # collects the truncated strings
for word in list1:
    for i, letter in enumerate(word):
        if letter == ')':
            # keep everything up to and including the first ')'
            b.append(word[0:i+1])
            break
print(b)
Output
['White Buns (Hot Dog)', ' 2 x Danish (Almond Danish)']