I want to remove everything before a specific word, but that word occurs several times in my sentence. Here is an example:
dvdrentalLOG: statement: SELECT email, actor.last_name, count(actor.last_name) FROM (SELECT email, actor_id FROM (SELECT email, film_id FROM (SELECT email, inventory_id FROM customer as cu JOIN rental ON cu.customer_id = rental.customer_id ORDER BY email) as sq JOIN inventory ON sq.inventory_id = inventory.inventory_id) as sq2 JOIN film_actor ON sq2.film_id = film_actor.film_id) as sq3 JOIN actor ON sq3.actor_id = actor.actor_id GROUP BY email, actor.last_name ORDER BY COUNT(actor.last_name) DESC
In the example above, I want to remove everything before the first SELECT. I have already tried this: How to remove all characters before a specific character in Python?
Any idea what I need to do?
You can use this regular expression and replace the match with an empty string:
^.+?(?=SELECT)
Like this:
import re
result = re.sub(r"^.+?(?=SELECT)", "", your_string)
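Explanation:
Because you want to remove everything before the first SELECT, the match starts at the beginning of the string (^). You then lazily match any character (.+?) until you see SELECT. The lookahead (?=SELECT) only checks that SELECT follows, so SELECT itself is not part of the match and is not removed.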
Alternatively, drop the lookahead and replace the match with SELECT:
result = re.sub(r"^.+?SELECT", "SELECT", your_string)
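To compare the two patterns side by side, here is a minimal sketch. The log line is a shortened stand-in for the one in the question, and the variable names are only for illustration. It also shows one caveat: .+? needs at least one character, so an input that already starts with SELECT gets cut at the second SELECT.

```python
import re

# Shortened stand-in for the log line in the question (an assumption, not the real log).
log_line = "dvdrentalLOG: statement: SELECT email FROM (SELECT email FROM customer) AS sq"
expected = "SELECT email FROM (SELECT email FROM customer) AS sq"

# Variant 1: the lookahead leaves SELECT in place, so the prefix is replaced with nothing.
with_lookahead = re.sub(r"^.+?(?=SELECT)", "", log_line)

# Variant 2: SELECT is consumed by the match, so the replacement puts it back.
without_lookahead = re.sub(r"^.+?SELECT", "SELECT", log_line)

# With no SELECT in the input, nothing matches and the string comes back unchanged.
no_keyword = re.sub(r"^.+?(?=SELECT)", "", "no keyword here")

# Caveat: .+? needs at least one character, so an input that already starts
# with SELECT is cut at the second SELECT instead of being left alone.
already_clean = re.sub(r"^.+?(?=SELECT)", "", "SELECT a FROM (SELECT b)")

print(with_lookahead)   # SELECT email FROM (SELECT email FROM customer) AS sq
print(already_clean)    # SELECT b)
```

Also note that . does not match newlines by default, so a statement spread over several lines would probably need flags=re.DOTALL.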
Edit:
I found another way, using partition:
partitions = your_string.partition("SELECT")
result = partitions[1] + partitions[2]
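One thing to watch with partition: when the separator is not found, it returns the whole string followed by two empty strings, so partitions[1] + partitions[2] would be empty. A small sketch of a guard for that case follows; keep_from is a hypothetical helper name, not something from the standard library.

```python
def keep_from(text: str, keyword: str) -> str:
    """Return text starting at the first occurrence of keyword (hypothetical helper)."""
    before, found, after = text.partition(keyword)
    # partition returns (text, "", "") when keyword is absent; found + after
    # would then be empty, so fall back to the original text instead.
    return found + after if found else text


print(keep_from("LOG: statement: SELECT a FROM (SELECT b)", "SELECT"))
# SELECT a FROM (SELECT b)
print(keep_from("no keyword here", "SELECT"))
# no keyword here
```

Whether returning the input unchanged is the right fallback depends on your data; raising an error may suit you better if every line is expected to contain the keyword.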
If you only care about the first occurrence of the word, this is easy to do. Consider the following example:
import re
txt = 'blah blah blah SELECT something SELECT something another SELECT'
output = re.sub(r'.*?(?=SELECT)','',txt,1)
print(output) #SELECT something SELECT something another SELECT
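I used a so-called zero-length assertion (a lookahead) in the pattern, so it matches only when SELECT follows, and I passed 1 as the 4th argument to re.sub (the count parameter), which means only one replacement is made.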
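If the word is not always SELECT, the same idea can be wrapped in a function. The following is only a sketch under a few assumptions: drop_before_word is a made-up name, the word consists of letters, digits or underscores (so that \b behaves as expected), and you want whole-word matches so that SELECT inside something like PRESELECTED is skipped.

```python
import re


def drop_before_word(text: str, word: str) -> str:
    """Remove everything before the first whole-word occurrence of word (hypothetical helper)."""
    # re.escape guards against regex metacharacters in word; \b keeps
    # "SELECT" from matching inside a longer word such as "PRESELECTED".
    pattern = rf".*?(?=\b{re.escape(word)}\b)"
    # count=1 limits the substitution to the first match; re.DOTALL lets "."
    # cross newlines in multi-line input.
    return re.sub(pattern, "", text, count=1, flags=re.DOTALL)


txt = 'blah blah blah SELECT something SELECT something another SELECT'
print(drop_before_word(txt, "SELECT"))
# SELECT something SELECT something another SELECT
```

Passing count and flags by keyword is a stylistic choice here; it reads more clearly than a bare positional 1.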