re.findall on a list in Python

Question

I have a list like this:

sample_text = ['199.72.81.55 -- [01/Jul/1995:00:00:01 -0400] "Get /histpry/appollo/HTTP/1.0" 200 6245',
    'unicomp6.unicomp.net -- [01/Jul/1995:00:00:06 -0400] "Get /shuttle/countdown/HTTP/1.0" 200 3985', 
    '199.120.110.21 -- [01/Jul/1995:00:00:01 -0400] "Get /histpry/appollo/HTTP/1.0" 200 6245',
    'burger.letters.com -- [01/Jul/1995:00:00:06 -0400] "Get /shuttle/countdown/HTTP/1.0" 200 3985', 
    '205.172.11.25 -- [01/Jul/1995:00:00:01 -0400] "Get /histpry/appollo/HTTP/1.0" 200 6245']

I need to get all the host names as a list. The expected result is:

['199.72.81.55', 'unicomp6.unicomp.net', '199.120.110.21', 'burger.letters.com', '205.172.11.25']

My code is:

import re

host = []
for i in range(0, len(sample_text)):
    s = sample_text[i]
    # findall returns a list, so append stores one list per line
    host.append(re.findall(r'[\d]*[.][\d]*[.][\d]*[.][\d]*|[a-z0-9]*[.][a-z]*[.][a-z]*', s))
print(host)

My output:

[['199.72.81.55'], ['unicomp6.unicomp.net'], ['199.120.110.21'], ['burger.letters.com'], ['205.172.11.25']]

How can I get a flat list like the expected result?

Answer 1

Without using a regular expression, you can str.split on '--' and take the first part:

>>> [i.split('--')[0].strip() for i in sample_text]
['199.72.81.55', 'unicomp6.unicomp.net', '199.120.110.21', 'burger.letters.com', '205.172.11.25']

A similar idea, but using a regular expression:

>>> import re
>>> [re.match(r'(.*) -- .*', i).group(1) for i in sample_text]
['199.72.81.55', 'unicomp6.unicomp.net', '199.120.110.21', 'burger.letters.com', '205.172.11.25']

In both cases, a list comprehension replaces your for loop.
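
Both approaches rely on the host being the first field of each line. A minimal sketch under that same assumption, taking the first whitespace-delimited token rather than splitting on '--' (the helper name `first_field` and the shortened `sample` list are made up for illustration):

```python
def first_field(line):
    """Return the first whitespace-delimited token of a log line."""
    # split() without a separator splits on runs of whitespace and ignores
    # leading whitespace; maxsplit=1 stops after the first token.
    return line.split(maxsplit=1)[0]

sample = [
    '199.72.81.55 -- [01/Jul/1995:00:00:01 -0400] "Get /histpry/appollo/HTTP/1.0" 200 6245',
    'unicomp6.unicomp.net -- [01/Jul/1995:00:00:06 -0400] "Get /shuttle/countdown/HTTP/1.0" 200 3985',
]

hosts = [first_field(line) for line in sample]
print(hosts)
```

Note that an empty or whitespace-only line would raise IndexError here, so real log data may need a guard.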

Answer 2

You can easily build host as a flat list by concatenating each result instead of appending it:

import re

host = []
for i in range(0, len(sample_text)):
    s = sample_text[i]
    # += concatenates the matches onto host instead of nesting them
    host += re.findall(r'[\d]*[.][\d]*[.][\d]*[.][\d]*|[a-z0-9]*[.][a-z]*[.][a-z]*', s)
print(host)

Output：

['199.72.81.55', 'unicomp6.unicomp.net', '199.120.110.21', 'burger.letters.com', '205.172.11.25']
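
The same flattening can also be written as a nested list comprehension. This is only a sketch: the name `HOST_RE` and the shortened `sample` list are illustrative, and the pattern is the one from the question, written as a raw string and compiled once.

```python
import re

# Pattern from the question: a dotted-quad alternative, then a
# three-label host name alternative.
HOST_RE = re.compile(r'\d*[.]\d*[.]\d*[.]\d*|[a-z0-9]*[.][a-z]*[.][a-z]*')

sample = [
    '199.72.81.55 -- [01/Jul/1995:00:00:01 -0400] "Get /histpry/appollo/HTTP/1.0" 200 6245',
    'unicomp6.unicomp.net -- [01/Jul/1995:00:00:06 -0400] "Get /shuttle/countdown/HTTP/1.0" 200 3985',
]

# The outer loop walks the lines, the inner loop walks each line's matches,
# so the result is already flat.
hosts = [match for line in sample for match in HOST_RE.findall(line)]
print(hosts)
```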

Answer 3

re.findall() returns a list of strings.

Documentation: https://docs.python.org/3/library/re.html#re.findall

.append adds that whole list to the new list as a single item.

Try:

host.extend(re.findall(r'[\d]*[.][\d]*[.][\d]*[.][\d]*|[a-z0-9]*[.][a-z]*[.][a-z]*', s))

Documentation: https://docs.python.org/3/tutorial/datastructures.html
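
To make the difference concrete, here is a small sketch comparing the two list methods on one line of the sample data. The variable names `appended` and `extended` are illustrative, and only the host-name half of the question's pattern is used to keep it short.

```python
import re

line = 'burger.letters.com -- [01/Jul/1995:00:00:06 -0400] "Get /shuttle/countdown/HTTP/1.0" 200 3985'
matches = re.findall(r'[a-z0-9]*[.][a-z]*[.][a-z]*', line)

appended = []
appended.append(matches)   # adds the whole list as one element

extended = []
extended.extend(matches)   # adds each match as its own element

print(appended)  # [['burger.letters.com']]
print(extended)  # ['burger.letters.com']
```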

Answer 4

I solved this simply by using .extend instead of append.

host.extend(re.findall(r'[\d]*[.][\d]*[.][\d]*[.][\d]*|[a-z0-9]*[.][a-z]*[.][a-z]*', s))

Answer 5

Maybe try something like this on the nested list you already have:

sum(host, [])
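
As a sketch of what this does: `sum` with an empty list as the start value concatenates the inner lists one by one. The `nested` list below is a shortened copy of the question's output, and `itertools.chain.from_iterable` is shown as an alternative that this answer did not propose.

```python
from itertools import chain

nested = [['199.72.81.55'], ['unicomp6.unicomp.net'], ['199.120.110.21']]

# sum starts from [] and evaluates [] + nested[0] + nested[1] + ...
flat_sum = sum(nested, [])

# chain.from_iterable walks the inner lists without building
# an intermediate list at every step.
flat_chain = list(chain.from_iterable(nested))

print(flat_sum)
print(flat_chain)
```

Because each `+` builds a new list, `sum(host, [])` gets slow on long inputs; for a handful of log lines the difference does not matter.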

Answer 1
4 2020-06-10 11:31:48

Answer 2
2 accepted 2020-06-10 11:33:22

Answer 3
0 2020-06-10 11:36:31

Answer 4
0 2020-06-10 11:40:51

Answer 5
-1 2020-06-10 11:32:35
