Extracting part of a string with a regular expression?

Question

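I have a few hundred messy URLs, along with sample data that is missing the TLD extension. I have been trying to extract just the name, without the extension.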

My sample data looks like this:

My expected output is:

I am using a regular expression for this, but I am still a beginner with regex. This is what I tried:

new = re.findall(r'\.(.+)\.', name_Extract)

Any help would be greatly appreciated.

Answer 1

pip install tldextract

Then, in the Python interpreter:

import tldextract
tldextract.extract('www.ghi').domain

This works for all three of your examples. I am using Python 2.7.12.
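
If installing a package is not an option, a plain-regex approach may be enough for simple inputs. The following is only a rough sketch: the function name `extract_name` and every sample input other than 'www.ghi' are assumptions made for illustration and do not come from the question. It drops an optional scheme and an optional leading "www.", then takes the first label that follows.

```python
import re

# Hypothetical helper, not part of tldextract or the original question.
_NAME_RE = re.compile(
    r'^(?:[a-z][a-z0-9+.-]*://)?'  # optional scheme such as http://
    r'(?:www\.)?'                  # optional leading "www."
    r'([^./:?#]+)',                # first label after that: the name
    re.IGNORECASE,
)


def extract_name(url):
    """Return the first host label after an optional scheme and 'www.', or None."""
    match = _NAME_RE.match(url.strip())
    return match.group(1) if match else None


# Illustrative inputs; only 'www.ghi' appears in the thread.
for sample in ['www.ghi', 'http://www.abc.com/path', 'def.org']:
    print(extract_name(sample))
```

Note that this takes the first label blindly, so a host with another subdomain (something like 'blog.example.com') would come back as 'blog'. Handling that correctly requires knowing the list of public suffixes, which is what tldextract relies on.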