How do I remove words containing a substring from a Python string?

Question

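When using the Twitter API, I get several strings (tweets) that contain links, that is, substrings starting with 'http://'.

How can I get rid of these links? In other words, I want to remove the whole word.

Suppose I have: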

'Mi grupo favorito de CRIMINALISTICA. Ultima clase de cuatrimestre http://t.co/Ad2oWDNd4u'

I want to get:

'Mi grupo favorito de CRIMINALISTICA. Ultima clase de cuatrimestre'

These substrings may appear anywhere in the string.

Answer 1

You can use re.sub() to replace all links with an empty string:

>>> import re
>>> pattern = re.compile(r'http[s]?://(?:[a-zA-Z]|[0-9]|[$-_@.&+]|[!*\(\),]|(?:%[0-9a-fA-F][0-9a-fA-F]))+')
>>> s = 'Mi grupo favorito de CRIMINALISTICA. Ultima clase de cuatrimestre http://t.co/Ad2oWDNd4u'
>>> pattern.sub('', s)
'Mi grupo favorito de CRIMINALISTICA. Ultima clase de cuatrimestre '

It replaces every link, wherever it appears in the string:

>>> s = "I've used google https://google.com and found a regular expression pattern to find links here https://stackoverflow.com/questions/6883049/regex-to-find-urls-in-string-in-python"
>>> pattern.sub('', s)
"I've used google  and found a regular expression pattern to find links here "

The regular expression was taken from this thread:

Regex to find urls in string in Python
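
Note that the substitution above leaves a stray space where each link used to be, while the question asks for a result without it. A minimal sketch of one way to tidy that up follows. It swaps in a simpler pattern than the one in this answer (it assumes a link is 'http://' or 'https://' followed by any run of non-whitespace), and the names LINK_RE and remove_links are hypothetical, chosen only for illustration:

```python
import re

# Assumption: a link is "http://" or "https://" followed by non-whitespace.
# This is a simpler pattern than the one quoted in the answer.
LINK_RE = re.compile(r'https?://\S+')

def remove_links(text):
    """Remove links, then collapse the whitespace they leave behind."""
    without_links = LINK_RE.sub('', text)
    # split() with no arguments drops leading, trailing and repeated whitespace
    return ' '.join(without_links.split())

s = 'Mi grupo favorito de CRIMINALISTICA. Ultima clase de cuatrimestre http://t.co/Ad2oWDNd4u'
print(remove_links(s))
```

Because \S+ is greedy, punctuation glued to the end of a link (for example a trailing full stop) is removed together with the link, and every run of whitespace in the text is collapsed to a single space.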

Answer 2

If the link is always at the end of the string and is preceded by a space, you can slice it off:

s[:s.index('http://')-1]
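
The slice above raises ValueError when the string contains no 'http://', and the -1 assumes exactly one space before the link. A minimal sketch of a more forgiving variant, using str.find and rstrip, might look like this (the helper name cut_at_link is hypothetical, not part of the answer):

```python
def cut_at_link(s, marker='http://'):
    """Return s up to the first occurrence of marker, without trailing spaces."""
    pos = s.find(marker)      # find() returns -1 instead of raising
    if pos == -1:
        return s              # no link: leave the string untouched
    return s[:pos].rstrip()   # drop whatever whitespace preceded the link

s = 'Mi grupo favorito de CRIMINALISTICA. Ultima clase de cuatrimestre http://t.co/Ad2oWDNd4u'
print(cut_at_link(s))
```

Like the original slice, this discards everything from the link onward, so it only suits links at the end of the string.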

If it is not always at the end, you can do the following:

your_list = s.split()
i = 0
while i < len(your_list):
    if your_list[i].startswith('http://'):
        del your_list[i]
    else:
        i+=1
s = ' '.join(your_list)
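
The same word-by-word filtering can be sketched more compactly with a list comprehension. The version below is an illustration under stated assumptions rather than the answer's own code: the name drop_link_words is hypothetical, and it additionally treats 'https://' as a link prefix, which the loop above does not.

```python
def drop_link_words(s, prefixes=('http://', 'https://')):
    """Rebuild s from its whitespace-separated words, skipping link words."""
    # str.startswith accepts a tuple and matches if any prefix matches
    kept = [word for word in s.split() if not word.startswith(prefixes)]
    return ' '.join(kept)

s = 'Mi grupo favorito http://t.co/Ad2oWDNd4u de CRIMINALISTICA.'
print(drop_link_words(s))
```

As with the loop, splitting and rejoining normalizes all whitespace to single spaces, and a word is dropped only when the link prefix is at its very start.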


Answer 1
4 (accepted) 2014-04-08 03:26:22

Answer 2
0 2014-04-08 03:25:04
