How to Split a String at Selected White Spaces in Python

I am trying to split the following string in Python. Is it possible to produce the output below from the given input?

Input

Platforms: Linux Applies to versions: 10.0 Upgrades to: 10.0 Severity: 10 - High Impact/High Probability of Occurrence \Categories: Availability, Compatibility, Data, Function, Performance, Security Vulnerability (Sec/Int), Serviceability, Usability Abstract: SqlGuard Patch 10.0p4052 Sniffer Update

Output

Platforms: Linux
Applies to versions: 10.0
Upgrades to: 10.0
Severity: 10 - High Impact/High Probability of Occurrence 
Categories: Availability, Compatibility, Data, Function, Performance, Security Vulnerability (Sec/Int), Serviceability, Usability 
Abstract: SqlGuard Patch 10.0p4052 Sniffer Update

Since the fields are fixed, split on the field names rather than on whitespace:

>>> fields = [
...     "Platforms: ",
...     "Applies to versions: ",
...     "Upgrades to: ",
...     "Severity: ",
...     "Categories: ",
...     "Abstract: ",
... ]
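>>> import re
>>> # s holds the input string from the question
>>> for k, v in zip(fields, re.split("|".join(map(re.escape, fields)), s)[1:]):
...     print(k + v.rstrip(" \\"))  # drop the trailing space and the stray backslash
...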
Platforms: Linux
Applies to versions: 10.0
Upgrades to: 10.0
Severity: 10 - High Impact/High Probability of Occurrence
Categories: Availability, Compatibility, Data, Function, Performance, Security Vulnerability (Sec/Int), Serviceability, Usability
Abstract: SqlGuard Patch 10.0p4052 Sniffer Update

Since the other answer relies on a known list of fields, let's try a solution that does not need to know the fields in advance:

import re

string = r"Platforms: Linux Applies to versions: 10.0 Upgrades to: 10.0 Severity: 10 - High Impact/High Probability of Occurrence \Categories: Availability, Compatibility, Data, Function, Performance, Security Vulnerability (Sec/Int), Serviceability, Usability Abstract: SqlGuard Patch 10.0p4052 Sniffer Update"

iterable = iter(re.split(r"([A-Z][a-z ]+:)", string)[1:])  # field names look like "Applies to versions:"

for field in iterable:
    print(field, next(iterable), sep='')

OUTPUT

> python3 test.py
Platforms: Linux 
Applies to versions: 10.0 
Upgrades to: 10.0 
Severity: 10 - High Impact/High Probability of Occurrence \
Categories: Availability, Compatibility, Data, Function, Performance, Security Vulnerability (Sec/Int), Serviceability, Usability 
Abstract: SqlGuard Patch 10.0p4052 Sniffer Update
>

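Can you explain the logic behind the regular expression?

We are doing a re.split(), but with capturing parentheses, so that whatever pattern we split on is kept in the result. All the field names follow the same pattern, e.g. "Applies to versions:"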

(  # retain split pattern match
[A-Z]  # starts with a capital letter
[a-z ]+  # continues with lower case letters and spaces
:  # a colon marks the end of the field name
)
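
To make the effect of the capturing parentheses concrete, here is a minimal sketch that runs the same pattern with and without the group. The shortened `text` sample and the variable names are assumptions made for illustration; they are not part of the original answer.

```python
import re

# Shortened sample in the same shape as the question's input (assumed).
text = "Platforms: Linux Upgrades to: 10.0"

# Without a capturing group, the matched field names are discarded.
without_group = re.split(r"[A-Z][a-z ]+:", text)

# With a capturing group, the matched field names are kept in the result.
with_group = re.split(r"([A-Z][a-z ]+:)", text)

print(without_group)  # ['', ' Linux ', ' 10.0']
print(with_group)     # ['', 'Platforms:', ' Linux ', 'Upgrades to:', ' 10.0']
```

Both lists start with an empty string because the text begins with a match.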

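When we run re.split(), the string actually begins with a pattern match, which makes re.split() return an empty string before the first item, so re.split(...)[1:] throws away that empty first item. We now have a list of alternating field names and field bodies, which we walk through in pairs using an iterator.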
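
As a variation on the pairwise walk, the alternating list could also be paired up with slicing and collected into a dictionary, stripping the surrounding spaces and the stray backslash along the way. This is only a sketch: the shortened `text` sample and the names `parts` and `record` are assumptions for illustration.

```python
import re

# Shortened sample in the same shape as the question's input (assumed).
text = r"Platforms: Linux Severity: 10 - High \Categories: Data, Usability Abstract: Sniffer Update"

# Drop the leading empty string produced because the text starts with a match.
parts = re.split(r"([A-Z][a-z ]+:)", text)[1:]

# Even indexes hold field names, odd indexes hold field bodies.
record = {
    name.rstrip(":"): body.strip(" \\")  # trim spaces and the stray backslash
    for name, body in zip(parts[0::2], parts[1::2])
}

for name, body in record.items():
    print(f"{name}: {body}")
```

Note that the pattern treats any capital letter followed by lowercase letters or spaces and a colon as a field name, so it would split incorrectly if a field value ever contained such text.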

