Regex to match anything between two symbols

Question

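While trying to learn more about regular expressions in Python, I am finding it hard to match any characters (including newlines, tabs, spaces, etc.) between two symbols, with the symbols themselves included in the match.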

For example:

foobar89\n\nfoo\tbar; '''blah blah blah'8&^"''' needs to match '''blah blah blah'8&^"'''
fjfdaslfdj; '''blah\n blah\n\t\t blah\n'8&^"''' needs to match '''blah\n blah\n\t\t blah\n'8&^"'''

(Note that \n and \t stand for the newlines and tabs in the text file.)

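Following a related question, I tried ^.*\'''(.*)\'''.*$ and *?\'''(.*)\'''.* without success.

Can someone point out what I am doing wrong? A short explanation would also be appreciated.

Also, to understand how escaping special characters works, I would like to know whether the regex would still work (on the corresponding strings) if the two symbols were replaced, for example from ''' to """ or ***.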

For example:

fjfdaslfdj; """blah\n blah\n\t\t blah\n'8&^""" needs to match """blah\n blah\n\t\t blah\n'8&^"""

Update

This is the code I am using to test the regexes (taken from another source and modified):

import collections
import re

Token = collections.namedtuple('Token', ['typ', 'value', 'line', 'column'])

def tokenize(code):
    token_specification = [
        # regexes suggested by Thomas Ayoub
        ('BOTH',      r'([\'"]{3}).*?\2'), # for both triple-single quotes and triple-double quotes
        ('SINGLE',    r"('''.*?''')"),     # triple-single quotes 
        ('DOUBLE',    r'(""".*?""")'),     # triple-double quotes 
        # regexes which match OK
        ('COM',       r'#.*'),
        ('NUMBER',  r'\d+(\.\d*)?'),  # Integer or decimal number
        ('ASSIGN',  r':='),           # Assignment operator
        ('END',     r';'),            # Statement terminator
        ('ID',      r'[A-Za-z]+'),    # Identifiers
        ('OP',      r'[+\-*/]'),      # Arithmetic operators
        ('NEWLINE', r'\n'),           # Line endings
        ('SKIP',    r'[ \t]+'),       # Skip over spaces and tabs
        ('MISMATCH',r'.'),            # Any other character
    ]

    test_regexes = ['COM', 'BOTH', 'SINGLE', 'DOUBLE']

    tok_regex = '|'.join('(?P<%s>%s)' % pair for pair in token_specification)
    line_num = 1
    line_start = 0
    for mo in re.finditer(tok_regex, code):
        kind = mo.lastgroup
        value = mo.group(kind)
        if kind == 'NEWLINE':
            line_start = mo.end()
            line_num += 1
        elif kind == 'SKIP':
            pass
        elif kind == 'MISMATCH':
            pass
        else:
            if kind in test_regexes:
                print(kind, value)
            column = mo.start() - line_start
            yield Token(kind, value, line_num, column)

f = r'C:\path_to_python_file_with_above_examples'

with open(f) as sfile:
    content = sfile.read()

for t in tokenize(content):
    pass #print(t)

Answer 1

You can use:

((['"]{3}).*?\2)

See it running live in Python, or as a live regex demo.
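
As a rough check, the pattern above can be applied to the sample strings from the question. This is only a minimal sketch: the helper name `find_triple_quoted` and the variable names are made up for illustration, and `re.DOTALL` is assumed so that `.` can cross line breaks.

```python
import re

# Group 2 captures the opening delimiter (three quote characters) and \2
# requires exactly the same three characters to close the match.
# re.DOTALL lets '.' match newlines, so the match may span several lines.
TRIPLE_QUOTED = re.compile(r"((['\"]{3}).*?\2)", re.DOTALL)

def find_triple_quoted(text):
    """Return every triple-quoted span in text, delimiters included.

    Hypothetical helper, not part of the original answer.
    """
    return [m.group(1) for m in TRIPLE_QUOTED.finditer(text)]

# The samples from the question, with real newlines and tabs.
single_line = "foobar89\n\nfoo\tbar; '''blah blah blah'8&^\"'''"
multi_line = "fjfdaslfdj; '''blah\n blah\n\t\t blah\n'8&^\"'''"
double_quoted = 'fjfdaslfdj; """blah\n blah\n\t\t blah\n\'8&^"""'

for sample in (single_line, multi_line, double_quoted):
    print(find_triple_quoted(sample))
```

`finditer` with `group(1)` is used instead of `findall` because the pattern has two groups, and `findall` would return a tuple for each match.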

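^.*\'''(.*)\'''.*$ => you anchored the pattern at the start and end of the line, and since . does not match newlines by default, this cannot work when the match has to span several lines
*?\'''(.*)\'''.* => syntax error: the leading *? has nothing to repeat
re.compile(ur'(([\'"]{3}).*?\2)', re.MULTILINE | re.DOTALL) => re.DOTALL makes . match newlines as well (the ur prefix is valid in Python 2 only; in Python 3 write r'...')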
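
The three notes above can be checked directly. The sketch below is illustrative only: the sample text and the helper `between` are made up for the example. Its last part touches on the question about other delimiters such as `***`, assuming `re.escape` is used to make the delimiter literal.

```python
import re

text = "fjfdaslfdj; '''blah\n blah\n\t\t blah\n'8&^\"'''"

# 1. First attempt: without re.DOTALL, '.' stops at each newline, so the
#    pattern cannot span the multi-line block and nothing matches.
anchored = r"^.*\'''(.*)\'''.*$"
print(re.search(anchored, text))  # None

# With re.DOTALL the same pattern matches; group 1 holds the inner text.
print(repr(re.search(anchored, text, re.DOTALL).group(1)))

# 2. Second attempt: a quantifier with nothing before it is rejected when
#    the pattern is compiled.
second_attempt_error = None
try:
    re.compile(r"*?\'''(.*)\'''.*")
except re.error as exc:
    second_attempt_error = str(exc)
    print("invalid pattern:", second_attempt_error)

# 3. Other delimiters: re.escape turns metacharacters such as '*' into
#    literals, so the same approach works for ''', """ or ***.
def between(delimiter, source):
    """Return the spans enclosed by delimiter, delimiters included.

    Hypothetical helper for illustration only.
    """
    d = re.escape(delimiter)
    return re.findall(d + r".*?" + d, source, flags=re.DOTALL)

print(between("***", "x; ***blah\n blah*** y"))
```

Quote characters have no special meaning in a regex, so `'''` and `"""` need no escaping inside the pattern itself; `*` does, which is why an unescaped `***` would not compile.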

Solution 1
2 Accepted 2016-06-20 18:04:43
