
Regular expression to extract a group of words

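I want to extract the string in the Description column from every row of the table below. Because the string I am after contains spaces and the columns are also separated by spaces, I am not sure how to parse the rightmost field of each row.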

Name     PCI Device    Driver  Admin Status  Link Status  Speed  Duplex  MAC Address         MTU  Description
-------  ------------  ------  ------------  -----------  -----  ------  -----------------  ----  ----------------------------------------------------------------
vmnic0   0000:3d:00.0  i40en   Up            Down             0  Half    00:00:00:00:03:14  1500  Intel(R) Ethernet Connection X722 for 10GbE SFP+
vmnic1   0000:3d:00.1  i40en   Up            Down             0  Half    00:00:00:00:03:15  1500  Intel(R) Ethernet Connection X722 for 10GbE SFP+
vmnic10  0000:d9:00.1  ixgben  Up            Down             0  Half    a0:36:9f:d9:b9:11  1500  Intel(R) Ethernet Controller 10G X550
vmnic11  0000:01:00.0  i40en   Up            Down             0  Half    3c:fd:fe:a9:4e:b8  1500  Intel(R) Ethernet Controller XXV710 for 25GbE SFP28
vmnic12  0000:01:00.1  i40en   Up            Up           10000  Full    3c:fd:fe:a9:4e:b9  1500  Intel(R) Ethernet Controller XXV710 for 25GbE SFP28
vmnic2   0000:00:1f.6  ne1000  Up            Down             0  Half    88:88:88:88:87:88  1500  Intel Corporation Ethernet Connection (3) I219-LM
vmnic3   0000:3d:00.2  i40en   Up            Down             0  Half    00:00:00:00:03:16  1500  Intel(R) Ethernet Connection X722 for 10GbE SFP+
vmnic4   0000:3d:00.3  i40en   Up            Down             0  Half    00:00:00:00:03:17  1500  Intel(R) Ethernet Connection X722 for 10GbE SFP+
vmnic5   0000:18:00.0  ixgben  Up            Down             0  Half    90:e2:ba:37:50:a8  1500  Intel Corporation 82599EB 10-Gigabit SFI/SFP+ Network Connection
vmnic6   0000:18:00.1  ixgben  Up            Down             0  Half    90:e2:ba:37:50:a9  1500  Intel Corporation 82599EB 10-Gigabit SFI/SFP+ Network Connection
vmnic7   0000:81:00.0  ixgben  Up            Up           10000  Full    90:e2:ba:1e:b6:24  1500  Intel Corporation 82599EB 10-Gigabit SFI/SFP+ Network Connection
vmnic8   0000:81:00.1  ixgben  Up            Down             0  Half    90:e2:ba:1e:b6:25  1500  Intel Corporation 82599EB 10-Gigabit SFI/SFP+ Network Connection
vmnic9   0000:d9:00.0  ixgben  Up            Up            1000  Full    a0:36:9f:d9:b9:10  1500  Intel(R) Ethernet Controller 10G X550

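It looks like your delimiter is "more than one space". The regular expression for two or more whitespace characters is \s{2,}, so for each line here: description = re.split(r'\s{2,}', line)[-1]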
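As a rough sketch of how that one-liner might be applied to the whole table, the snippet below skips the header and the dashed ruler line and collects the last field of every remaining row. The helper name `extract_descriptions` and the two-row `SAMPLE` excerpt are made up for illustration, and the approach assumes that no description contains two consecutive spaces.

```python
import re

def extract_descriptions(table_text):
    """Return the last column of each data row (hypothetical helper)."""
    # Drop blank lines so leading/trailing newlines in the input do not matter.
    lines = [ln for ln in table_text.splitlines() if ln.strip()]
    # lines[0] is the header and lines[1] is the dashed ruler; data starts at index 2.
    return [re.split(r"\s{2,}", ln.strip())[-1] for ln in lines[2:]]

# Two rows copied from the table in the question.
SAMPLE = """\
Name     PCI Device    Driver  Admin Status  Link Status  Speed  Duplex  MAC Address         MTU  Description
-------  ------------  ------  ------------  -----------  -----  ------  -----------------  ----  ----------------------------------------------------------------
vmnic0   0000:3d:00.0  i40en   Up            Down             0  Half    00:00:00:00:03:14  1500  Intel(R) Ethernet Connection X722 for 10GbE SFP+
vmnic9   0000:d9:00.0  ixgben  Up            Up            1000  Full    a0:36:9f:d9:b9:10  1500  Intel(R) Ethernet Controller 10G X550
"""

descriptions = extract_descriptions(SAMPLE)
print(descriptions)
```

Taking the last element with `[-1]` means the number of columns to the left does not matter, as long as they are separated from the description by at least two spaces.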

Using pandas:

from io import StringIO
import pandas as pd

TESTDATA = StringIO("""
Name     PCI Device    Driver  Admin Status  Link Status  Speed  Duplex  MAC Address         MTU  Description
-------  ------------  ------  ------------  -----------  -----  ------  -----------------  ----  ----------------------------------------------------------------
vmnic0   0000:3d:00.0  i40en   Up            Down             0  Half    00:00:00:00:03:14  1500  Intel(R) Ethernet Connection X722 for 10GbE SFP+
vmnic1   0000:3d:00.1  i40en   Up            Down             0  Half    00:00:00:00:03:15  1500  Intel(R) Ethernet Connection X722 for 10GbE SFP+
vmnic10  0000:d9:00.1  ixgben  Up            Down             0  Half    a0:36:9f:d9:b9:11  1500  Intel(R) Ethernet Controller 10G X550
vmnic11  0000:01:00.0  i40en   Up            Down             0  Half    3c:fd:fe:a9:4e:b8  1500  Intel(R) Ethernet Controller XXV710 for 25GbE SFP28
vmnic12  0000:01:00.1  i40en   Up            Up           10000  Full    3c:fd:fe:a9:4e:b9  1500  Intel(R) Ethernet Controller XXV710 for 25GbE SFP28
vmnic2   0000:00:1f.6  ne1000  Up            Down             0  Half    88:88:88:88:87:88  1500  Intel Corporation Ethernet Connection (3) I219-LM
vmnic3   0000:3d:00.2  i40en   Up            Down             0  Half    00:00:00:00:03:16  1500  Intel(R) Ethernet Connection X722 for 10GbE SFP+
vmnic4   0000:3d:00.3  i40en   Up            Down             0  Half    00:00:00:00:03:17  1500  Intel(R) Ethernet Connection X722 for 10GbE SFP+
vmnic5   0000:18:00.0  ixgben  Up            Down             0  Half    90:e2:ba:37:50:a8  1500  Intel Corporation 82599EB 10-Gigabit SFI/SFP+ Network Connection
vmnic6   0000:18:00.1  ixgben  Up            Down             0  Half    90:e2:ba:37:50:a9  1500  Intel Corporation 82599EB 10-Gigabit SFI/SFP+ Network Connection
vmnic7   0000:81:00.0  ixgben  Up            Up           10000  Full    90:e2:ba:1e:b6:24  1500  Intel Corporation 82599EB 10-Gigabit SFI/SFP+ Network Connection
vmnic8   0000:81:00.1  ixgben  Up            Down             0  Half    90:e2:ba:1e:b6:25  1500  Intel Corporation 82599EB 10-Gigabit SFI/SFP+ Network Connection
vmnic9   0000:d9:00.0  ixgben  Up            Up            1000  Full    a0:36:9f:d9:b9:10  1500  Intel(R) Ethernet Controller 10G X550
    """)

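# The separator is a regex, so request the Python parsing engine explicitly.
# Row 0 is the dashed ruler under the header; .iloc[1:] drops it.
df = pd.read_csv(TESTDATA, sep=r"\s{2,}", engine="python").iloc[1:]
descriptions = df['Description'].tolist()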

This outputs:

['Intel(R) Ethernet Connection X722 for 10GbE SFP+',
 'Intel(R) Ethernet Connection X722 for 10GbE SFP+',
 'Intel(R) Ethernet Controller 10G X550',
 'Intel(R) Ethernet Controller XXV710 for 25GbE SFP28',
 'Intel(R) Ethernet Controller XXV710 for 25GbE SFP28',
 'Intel Corporation Ethernet Connection (3) I219-LM',
 'Intel(R) Ethernet Connection X722 for 10GbE SFP+',
 'Intel(R) Ethernet Connection X722 for 10GbE SFP+',
 'Intel Corporation 82599EB 10-Gigabit SFI/SFP+ Network Connection',
 'Intel Corporation 82599EB 10-Gigabit SFI/SFP+ Network Connection',
 'Intel Corporation 82599EB 10-Gigabit SFI/SFP+ Network Connection',
 'Intel Corporation 82599EB 10-Gigabit SFI/SFP+ Network Connection',
 'Intel(R) Ethernet Controller 10G X550']

I assume you can get each row as a single string.

>>> import re
>>> s = "vmnic0   0000:3d:00.0  i40en   Up            Down             0  Half    00:00:00:00:03:14  1500  Intel(R) Ethernet Connection X722 for 10GbE SFP+"
>>> row = re.split(r"\s{2,}", s)
>>> description = row[-1]
>>> description
'Intel(R) Ethernet Connection X722 for 10GbE SFP+'
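Splitting on two or more spaces depends on the description never containing a double space itself. A possible alternative, not taken from the answers above, is to treat the output as fixed-width and read the column offset from the dashed ruler line. The sketch below assumes the ruler is always the second line and that every row is aligned to it; `description_column` and `TABLE` are hypothetical names used only for this example.

```python
import re

def description_column(lines):
    """Slice the last column using the dashed ruler (hypothetical helper)."""
    header, ruler, *rows = lines
    # Each run of dashes marks one column; the last run starts the Description column.
    start = [m.start() for m in re.finditer(r"-+", ruler)][-1]
    return [row[start:].strip() for row in rows]

# Header, ruler, and two rows copied from the table in the question.
TABLE = """\
Name     PCI Device    Driver  Admin Status  Link Status  Speed  Duplex  MAC Address         MTU  Description
-------  ------------  ------  ------------  -----------  -----  ------  -----------------  ----  ----------------------------------------------------------------
vmnic0   0000:3d:00.0  i40en   Up            Down             0  Half    00:00:00:00:03:14  1500  Intel(R) Ethernet Connection X722 for 10GbE SFP+
vmnic9   0000:d9:00.0  ixgben  Up            Up            1000  Full    a0:36:9f:d9:b9:10  1500  Intel(R) Ethernet Controller 10G X550
""".splitlines()

descriptions = description_column(TABLE)
print(descriptions)
```

Because the slice starts at a fixed offset, everything to the right of it is kept verbatim, whatever spacing the description itself contains.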

