How can I get a specific word after a phrase?
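I'm working with a dataset that contains text, and I want to extract a name from that text. The dataset has a tweet_id column and a text column, and I want to pull the name out of the tweet text.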
if text.startswith('This is ') and re.match(r'[A-Z].*', text.split()[2]):
    new_names.append(text.split()[2].strip(',').strip('.'))
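This is what I use to extract the name that follows "This is".
I also want to extract names that may appear in the middle of the text, for example after "name is" and "named". How can I do that?
If I understand you correctly, this is the solution: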
import re
s = "This is Shahab .... my name is Shahab .... he is named Gholam"
# [Tt] matches either case; inside a character class, "|" is a literal pipe, not alternation
names_regex = re.compile(r"[Tt]his\sis\s(\w+)|named\s(\w+)|name\sis\s(\w+)")
names = names_regex.findall(s)
print(names)
Output:
[('Shahab', '', ''), ('', '', 'Shahab'), ('', 'Gholam', '')]
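Each tuple in the output above has one slot per capture group, so every match leaves two empty strings behind. If a flat list of names is more convenient, one option is to put the three phrases in a non-capturing group and keep a single capture group for the name. This is only a sketch of that variation, not part of the original answer; the variable names are illustrative.

```python
import re

s = "This is Shahab .... my name is Shahab .... he is named Gholam"

# (?:...) groups the alternatives without capturing them, so findall
# returns only the one captured group: the word after the phrase.
names_regex = re.compile(r"(?:[Tt]his\sis|named|name\sis)\s(\w+)")

names = names_regex.findall(s)
print(names)
```

With a single capture group, findall returns a list of strings rather than a list of tuples.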
import re
text = "this pooch's name is Pepper. She's a sweet lovable monster. Although she has a lot of good qualities she also pees in the house and won't stop killing birds! Because of that we gave her an 8/10. Not bad."
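# Raw string so that \s reaches the regex engine unchanged
m = re.search(r'(?<=name is\s)[A-Za-z]*', text, flags=re.IGNORECASE)
# re.search returns None when the phrase is absent, so check before calling group()
name = m.group(0) if m else None
print(name)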