
Different way to specify matching new line in Python regex

I found that there are different ways to match a newline in a Python regex. For example, all of the patterns in the code below match a newline:

import re

text = 'abc\n123'  # renamed from str to avoid shadowing the built-in
pattern = '\n'     # print outputs a newline
pattern2 = '\\n'   # print outputs \n
pattern3 = '\\\n'  # print outputs \ and a newline
pattern4 = r'\n'   # print outputs \n
s = re.search(pattern, text).group()
print('a' + s + 'a')

I have two questions about this:

  1. pattern is a newline, while pattern2 and pattern4 are \n (a backslash followed by n). Why does the Python regex engine generate the same pattern for different strings?

  2. I am not sure why pattern3 also generates the same pattern. When passed to the re parser, pattern3 stands for \ + newline, so why does the re parser translate that into just matching a newline?

I am using Python 3.

The combo \n indicates a 'newline character' both in Python itself and in re expressions ( https://docs.python.org/2.0/ref/strings.html ).

In a regular Python string, \n gets translated to a newline. The newline character is then fed into the re parser as a literal character.
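A minimal sketch of this first case, assuming the sample text 'abc\n123' from the question (the variable names are only illustrative):

```python
import re

text = 'abc\n123'
pattern = '\n'  # Python turns this escape into one newline character

# The pattern string holds a single character, the newline itself.
pattern_length = len(pattern)

# re receives that character and matches it literally.
matched = re.search(pattern, text).group()

print(pattern_length)  # 1
print(repr(matched))   # '\n'
```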

A double backslash in a Python string gets translated to a single one. Therefore, the string "\\n" gets stored internally as "\n" (a backslash followed by the letter n), and when it is sent to the re parser, the parser in turn recognizes this combo \n as indicating a newline.
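This second case can be checked with a short sketch along these lines, again assuming the question's sample text (names are illustrative):

```python
import re

text = 'abc\n123'
pattern2 = '\\n'  # stored as two characters: a backslash and the letter n

# Inspect what the string actually contains after Python's own escaping.
chars = list(pattern2)

# The re parser sees backslash + n and treats it as the newline escape.
matched = re.search(pattern2, text).group()

print(chars)          # ['\\', 'n']
print(repr(matched))  # '\n'
```

Here the translation to a newline happens inside the re parser rather than in the Python string literal, yet the resulting match is the same.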

The r notation is a shortcut that avoids having to type doubled backslashes:

backslashes are not handled in any special way in a string literal prefixed with 'r' ( https://docs.python.org/2/library/re.html )
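To check that the r prefix produces the same pattern as the doubled backslash, a sketch like the following may help. It also covers pattern3 from the question, which the explanation above does not address directly: it relies on the re rule that a backslash followed by a character other than an ASCII letter or digit matches that character itself. Variable names are illustrative.

```python
import re

text = 'abc\n123'
pattern2 = '\\n'
pattern3 = '\\\n'  # a backslash followed by an actual newline character
pattern4 = r'\n'

# The raw string and the doubled-backslash string are the same two characters.
same_pattern = (pattern2 == pattern4)

# pattern3 reaches re as backslash + newline. An escaped non-alphanumeric
# character stands for itself, so this also matches a single newline.
matched3 = re.search(pattern3, text).group()

print(same_pattern)    # True
print(repr(matched3))  # '\n'
```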
