re.split() with special cases

Question

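I'm new to regular expressions and have a question about the re.split function.

In my case the split has to respect a "special escape".

The text should be split at ; except where the ; is preceded by a ?.

Edit: in that case the two parts should not be split, and the ? must be removed.

Here is an example and the result I am hoping for: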

import re
txt = 'abc;vwx?;yz;123'
re.split(r'magical pattern', txt)
['abc', 'vwx;yz', '123']

This is what I have tried so far:

re.split(r'(?<!\?);', txt)

which gives:

['abc', 'vwx?;yz', '123']

Sadly, the unconsumed ? causes trouble, and the following list comprehension is performance-critical:

[part.replace('?;', ';') for part in re.split(r'(?<!\?);', txt)]
['abc', 'vwx;yz', '123']

Is there a "fast" way to reproduce this behaviour?

Could the re.findall function be a solution?

For example, an extended version of this code:

re.findall(r'[^;]+', txt)
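One way the findall pattern above might be extended, as a rough sketch rather than a recommendation: let a match continue across a ; that is preceded by a ?. The function name `findall_fields` is made up for illustration. Two caveats apply: the ? is still not consumed, so the unescaping step remains, and unlike re.split this drops empty fields (for example the one between two adjacent separators).

```python
import re

# A field is a run of non-';' characters, plus any ';' that directly follows a '?'.
FIELD = re.compile(r'(?:[^;]|(?<=\?);)+')

def findall_fields(txt):
    """Hypothetical helper: collect fields with findall, then unescape '?;'."""
    return [part.replace('?;', ';') for part in FIELD.findall(txt)]

print(findall_fields('abc;vwx?;yz;123'))  # ['abc', 'vwx;yz', '123']
```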

I am using Python 2.7.3.

Thanks in advance!

Answer 1

Regular expressions are not the tool for this job. Use the csv module instead:

>>> import csv
>>> txt = 'abc;vwx?;yz;123'
>>> r = csv.reader([txt], delimiter=';', escapechar='?')
>>> next(r)
['abc', 'vwx;yz', '123']
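The same csv idea can be wrapped in a small function. This is a sketch under a few assumptions: the name `split_escaped` is hypothetical, the input contains no newlines (the reader works line by line), and `quoting=csv.QUOTE_NONE` is an addition not present in the answer, included so that double quotes in the text are kept as ordinary characters. Note that csv treats ? as escaping whatever character follows it, so ?? yields a single ?, which differs slightly from the lookbehind approach in the question.

```python
import csv

def split_escaped(txt, sep=';', esc='?'):
    """Hypothetical helper: split one line on sep, honouring esc as an escape."""
    reader = csv.reader([txt], delimiter=sep, escapechar=esc,
                        quoting=csv.QUOTE_NONE)  # do not interpret '"' specially
    return next(reader)

print(split_escaped('abc;vwx?;yz;123'))  # ['abc', 'vwx;yz', '123']
```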

Answer 2

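You can't do what you want with a single regular expression. Unescaping ?; after the split is an entirely separate task, not something the re module can do for you while it is splitting.

Just keep the two tasks separate; you could use a generator to do the unescaping for you: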

def unescape(iterable):
    for item in iterable:
        yield item.replace('?;', ';')

for elem in unescape(re.split(r'(?<!\?);', txt)):
    print(elem)

But this won't be any faster than your list comprehension.
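To check that claim on your own machine, a timing harness along these lines could be used. It is only a sketch: the wrapper names `with_listcomp` and `with_generator` are invented here, precompiling the pattern is an addition, and the numbers will vary with input size and Python version, so no expected timings are shown.

```python
import re
import timeit

txt = 'abc;vwx?;yz;123'
SPLIT = re.compile(r'(?<!\?);')  # split on ';' not preceded by '?'

def with_listcomp(s):
    return [part.replace('?;', ';') for part in SPLIT.split(s)]

def unescape(iterable):
    for item in iterable:
        yield item.replace('?;', ';')

def with_generator(s):
    return list(unescape(SPLIT.split(s)))

if __name__ == '__main__':
    for fn in (with_listcomp, with_generator):
        # Total seconds for 100000 calls on the sample string.
        seconds = timeit.timeit(lambda: fn(txt), number=100000)
        print(fn.__name__, seconds)
```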

Answer 3

I would do it like this (this assumes | never occurs in the text):

re.sub(r'(?<!\?);', '|', txt).replace('?;', ';').split('|')
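A slightly more defensive variant of the same substitute-then-split idea might look like the sketch below. The function name `split_via_sentinel` and the choice of a NUL character as the placeholder are assumptions made for illustration; the function refuses input that already contains the placeholder instead of silently splitting at the wrong place.

```python
import re

UNESCAPED_SEP = re.compile(r'(?<!\?);')

def split_via_sentinel(txt, sentinel='\x00'):
    """Hypothetical helper: mark real separators, unescape, then split."""
    if sentinel in txt:
        raise ValueError('sentinel character already present in input')
    # A function replacement avoids any template processing of the sentinel.
    marked = UNESCAPED_SEP.sub(lambda m: sentinel, txt)
    return marked.replace('?;', ';').split(sentinel)

print(split_via_sentinel('abc;vwx?;yz;123'))  # ['abc', 'vwx;yz', '123']
```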

Answer 4

Try this :-)

def split( txt, sep, esc, escape_chars):
    ''' Split a string
        txt - string to split
        sep - separator, one character
        esc - escape character
        escape_chars - List of characters allowed to be escaped
    '''
    l = []
    tmp = []
    i = 0
    while i < len(txt):
        if len(txt) > i + 1 and txt[i] == esc and txt[i+1] in escape_chars:
            i += 1
            tmp.append(txt[i])
        elif txt[i] == sep:
            l.append("".join(tmp))
            tmp = []
        elif txt[i] == esc:
            print('Escape Error')
        else:
            tmp.append(txt[i])
        i += 1
    l.append("".join(tmp))
    return l

if __name__ == "__main__":
    txt = 'abc;vwx?;yz;123'
    print(split(txt, ';', '?', [';','\\','?']))

Output:

['abc', 'vwx;yz', '123']
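For comparison, the same single-pass idea can be sketched with re.finditer doing the scanning instead of a character loop. This is an illustrative alternative, not the answer's code: the name `split_unescape` is hypothetical, and it only treats the escape character specially when it directly precedes the separator, which matches the behaviour asked for in the question rather than the configurable escape list above.

```python
import re

def split_unescape(txt, sep=';', esc='?'):
    """Hypothetical helper: split on sep and unescape esc+sep in one pass."""
    # Group 1 matches an escaped separator; otherwise a bare separator matched.
    pattern = re.compile('(%s%s)|%s' % (re.escape(esc), re.escape(sep),
                                        re.escape(sep)))
    parts, current, pos = [], [], 0
    for m in pattern.finditer(txt):
        current.append(txt[pos:m.start()])  # literal text before the match
        if m.group(1):
            current.append(sep)             # escaped separator: keep sep only
        else:
            parts.append(''.join(current))  # real separator: close the field
            current = []
        pos = m.end()
    current.append(txt[pos:])
    parts.append(''.join(current))
    return parts

print(split_unescape('abc;vwx?;yz;123'))  # ['abc', 'vwx;yz', '123']
```

Unlike the findall approach, this keeps empty fields, so the result has the same shape as what re.split would return.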

4 answers

Answer 1
5 2013-03-22 17:03:33

Answer 2
0 accepted 2013-03-22 16:54:29

Answer 3
0 2013-03-22 17:06:59

Answer 4
0 2013-03-25 18:14:35
