How to remove all words before a specific word using Python (if there are multiple specific words)?

I want to remove all words before a specific word, but that word appears several times in my sentence. For example:

dvdrentalLOG: statement: SELECT email, actor.last_name, count(actor.last_name) FROM (SELECT email, actor_id FROM (SELECT email, film_id FROM (SELECT email, inventory_id FROM customer as cu JOIN rental ON cu.customer_id = rental.customer_id ORDER BY email) as sq JOIN inventory ON sq.inventory_id = inventory.inventory_id) as sq2 JOIN film_actor ON sq2.film_id = film_actor.film_id) as sq3 JOIN actor ON sq3.actor_id = actor.actor_id GROUP BY email, actor.last_name ORDER BY COUNT(actor.last_name) DESC

In the example above, I want to remove all the words before the first SELECT. I've already tried the approach from "How to remove all characters before a specific character in Python?".

Any idea what I need to do?

You can use this regex and replace the match with an empty string:

^.+?(?=SELECT)

Like this:

import re

result = re.sub(r"^.+?(?=SELECT)", "", your_string)

Explanation:

Because you want to remove everything before the first SELECT, the match starts at the beginning of the string (^). You then lazily match any characters (.+?) until the lookahead (?=SELECT) sees SELECT. The lookahead does not consume SELECT, so it stays in the result.

Alternatively, drop the lookahead and replace the match with SELECT:

result = re.sub(r"^.+?SELECT", "SELECT", your_string)
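
One edge case worth noting: because .+? needs at least one character, a string that already starts with SELECT loses everything up to the second SELECT. Below is a minimal sketch of a more defensive variant. The function name drop_prefix_regex and its keyword parameter are made up for illustration, and the choice of .*?, re.escape, and re.DOTALL reflects my own assumptions about what you might want (an empty prefix allowed, a literal keyword, multi-line input), not something stated in the answer above.

```python
import re

def drop_prefix_regex(text, keyword="SELECT"):
    """Remove everything before the first occurrence of keyword (hypothetical helper)."""
    # re.escape treats the keyword literally, even if it contains regex metacharacters.
    # .*? (instead of .+?) allows an empty prefix, so text that already
    # starts with the keyword is returned unchanged.
    # re.DOTALL lets "." cross newlines, in case the prefix spans several lines.
    pattern = r"^.*?(?=" + re.escape(keyword) + r")"
    return re.sub(pattern, "", text, count=1, flags=re.DOTALL)

line = "dvdrentalLOG: statement: SELECT email FROM (SELECT email FROM customer) as sq"
print(drop_prefix_regex(line))

# The original pattern with .+? on input that already starts with SELECT:
print(re.sub(r"^.+?(?=SELECT)", "", "SELECT a SELECT b"))  # SELECT b
print(drop_prefix_regex("SELECT a SELECT b"))              # SELECT a SELECT b
```

If the keyword does not occur at all, the pattern cannot match and the text comes back unchanged.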

EDIT:

I found yet another way to do this, with partition:

partitions = your_string.partition("SELECT")
result = partitions[1] + partitions[2]
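
Keep in mind that partition returns the whole string followed by two empty strings when the separator is missing, so the two lines above produce an empty string for input that contains no SELECT. Here is a small sketch that leaves such input unchanged instead; the helper name drop_prefix_partition is hypothetical, and returning the original text is just one possible way to handle the missing-keyword case.

```python
def drop_prefix_partition(text, keyword="SELECT"):
    """Remove everything before the first occurrence of keyword (hypothetical helper)."""
    # partition splits at the FIRST occurrence only: (before, separator, after).
    _before, sep, after = text.partition(keyword)
    # sep is "" when the keyword is absent; in that case return the text as-is
    # (an assumption about the desired behavior).
    return sep + after if sep else text

print(drop_prefix_partition("LOG: statement: SELECT a FROM (SELECT b FROM t) as sq"))
print(drop_prefix_partition("no keyword here"))
```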

If you are concerned only with the first occurrence of the word, this is easy to do. Consider the following example:

import re
txt = 'blah blah blah SELECT something SELECT something another SELECT'
# count=1 limits re.sub to a single substitution (the first match)
output = re.sub(r'.*?(?=SELECT)', '', txt, count=1)
print(output)  # SELECT something SELECT something another SELECT

I used a so-called zero-length assertion (a lookahead) inside the pattern, so it matches only if SELECT follows, and I pass 1 as the fourth re.sub argument (count), meaning that at most one substitution is made.
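
To make the "zero-length" part concrete, here is a short sketch that uses re.search on the same string and inspects where the match ends. The variable names m and rest are only for illustration.

```python
import re

txt = 'blah blah blah SELECT something SELECT something another SELECT'

# The lookahead (?=SELECT) checks that SELECT follows but consumes nothing,
# so the match covers only the text before the first SELECT.
m = re.search(r'.*?(?=SELECT)', txt)
print(repr(m.group()))               # 'blah blah blah '
print(m.end(), txt.find('SELECT'))   # 15 15

# Slicing from the end of the match keeps the first SELECT and everything after it.
rest = txt[m.end():]
print(rest)
```

Because the match stops exactly where the first SELECT begins, removing the match leaves SELECT itself in place.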
