Get all words that come after specific word
I have a list containing strings like this:
['White Buns (Hot Dog)', ' 2 x Danish (Almond Danish) - A delight with coffee! \nThe best fruit fillings and of coarse butter filled flaky puff pastry.\n']
I want to keep everything before the parenthesized words, plus the parenthesized words themselves, to get the following:
White Buns (Hot Dog)
2 x Danish (Almond Danish)
Is there a way to do this? I tried a regex, but I don't know how to specify "a word in parentheses".
I would use str.partition:
li=['White Buns (Hot Dog)', ' 2 x Danish (Almond Danish) - A delight with coffee! \nThe best fruit fillings and of coarse butter filled flaky puff pastry.\n']
>>> [''.join(s.partition(')')[:2]) for s in li]
['White Buns (Hot Dog)', ' 2 x Danish (Almond Danish)']
This works whether or not a ')' is present.
Given:
li=['No paren in this string...',
'Divide me (after) <- please',
'More that (1) paren (2) here']
>>> [''.join(s.partition(')')[:2]) for s in li]
['No paren in this string...', 'Divide me (after)', 'More that (1)']
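The reason this holds even without a ')' is the shape of the tuple that str.partition returns. The sketch below prints that tuple for both cases and wraps the idea in a small helper; the name keep_through_first_paren is only an illustration and is not part of the original answer.

```python
def keep_through_first_paren(s):
    """Return s up to and including the first ')', or s unchanged if there is none."""
    head, sep, _tail = s.partition(')')
    # When ')' is absent, partition returns (s, '', ''), so head + sep == s.
    return head + sep

parts_found = 'Divide me (after) <- please'.partition(')')
parts_missing = 'No paren in this string...'.partition(')')

print(parts_found)    # ('Divide me (after', ')', ' <- please')
print(parts_missing)  # ('No paren in this string...', '', '')
print(keep_through_first_paren('More that (1) paren (2) here'))  # More that (1)
```

Joining only the first two elements therefore drops the tail when a ')' exists and leaves the string untouched when it does not.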
If you want to use a regex, use re.sub to match the whole string and keep only the first part:
>>> [re.sub(r'([^)]*[)]).*', r'\1', s) for s in li]
['No paren in this string...', 'Divide me (after)', 'More that (1)']
Don't forget the .* after ([^)]*[)]); without it, re.sub just reassembles your original string.
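One caveat worth checking: by default '.' does not match a newline, so on a multi-line string like the one in the question the trailing .* stops at the first '\n' and the following lines survive. A minimal sketch of the difference, using a shortened version of the question's string (the shortening is mine) and the standard re.DOTALL flag:

```python
import re

s = ' 2 x Danish (Almond Danish) - A delight with coffee! \nThe best fruit fillings.\n'

# Without DOTALL, '.*' stops at the first newline, so the second line is kept.
single_line = re.sub(r'([^)]*[)]).*', r'\1', s)

# With DOTALL, '.*' also consumes newlines, so everything after ')' is dropped.
whole_string = re.sub(r'([^)]*[)]).*', r'\1', s, flags=re.DOTALL)

print(repr(single_line))   # ' 2 x Danish (Almond Danish)\nThe best fruit fillings.\n'
print(repr(whole_string))  # ' 2 x Danish (Almond Danish)'
```

The str.partition approach does not have this issue, since it splits on the first ')' regardless of newlines.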
To match literal parentheses in a regex, escape them: '\)' matches ')' and '\(' matches '('.
import re  # regular expression library
pattern = r"(.*?)(\(.*?\))"  # raw string, so the backslashes reach the regex engine unchanged
text = 't2 x Danish (Almond Danish) - A delight with coffee! \nThe best fruit fillings and of coarse butter filled flaky puff pastry.\n'
result = re.search(pattern, text)
print(result[0])  # the whole match
print(result[1])  # first group: (.*?)
print(result[2])  # second group: (\(.*?\))
Result:
t2 x Danish (Almond Danish)
t2 x Danish
(Almond Danish)
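To apply the same pattern to every string in the question's list, one possible sketch is shown below. The helper name through_first_parens, the fallback for strings without parentheses, and the call to strip() (to drop the leading space, as in the asker's desired output) are my assumptions, not part of the answer above.

```python
import re

# Lazily match everything up to and including the first "(...)" group.
# DOTALL lets '.' cross newlines in case the parentheses come after a line break.
PATTERN = re.compile(r'.*?\(.*?\)', re.DOTALL)

def through_first_parens(s):
    """Return the text through the first parenthesized group, stripped of outer whitespace."""
    match = PATTERN.search(s)
    # Fall back to the whole string when there is no parenthesized group.
    return match.group(0).strip() if match else s.strip()

li = ['White Buns (Hot Dog)',
      ' 2 x Danish (Almond Danish) - A delight with coffee! \nThe best fruit fillings and of coarse butter filled flaky puff pastry.\n']

cleaned = [through_first_parens(s) for s in li]
print(cleaned)  # ['White Buns (Hot Dog)', '2 x Danish (Almond Danish)']
```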
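Here is one way to achieve this:
list1 = ['White Buns (Hot Dog)', ' 2 x Danish (Almond Danish) - A delight with coffee! \nThe best fruit fillings and of coarse butter filled flaky puff pastry.\n']
b = []  # collects the truncated strings
for word in list1:
    for i, letter in enumerate(word):
        if letter == ')':
            b.append(word[0:i+1])  # keep everything up to and including ')'
            break  # stop at the first ')' so each string is added only once
print(b)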
Output
['White Buns (Hot Dog)', ' 2 x Danish (Almond Danish)']
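The character-by-character scan above can also be written with str.find, which returns the index of the first ')' or -1 when there is none. This is a sketch of that variant; the helper name cut_after_first_paren is hypothetical, and unlike the loop it keeps strings that contain no ')' instead of skipping them.

```python
def cut_after_first_paren(word):
    """Slice word through its first ')'; return it unchanged if no ')' is found."""
    i = word.find(')')  # -1 means there is no ')'
    return word if i == -1 else word[:i + 1]

list1 = ['White Buns (Hot Dog)',
         ' 2 x Danish (Almond Danish) - A delight with coffee! \nThe best fruit fillings and of coarse butter filled flaky puff pastry.\n']

b = [cut_after_first_paren(word) for word in list1]
print(b)  # ['White Buns (Hot Dog)', ' 2 x Danish (Almond Danish)']
```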