Python regex search pattern

Question

I'm searching a block of text for a newline followed by a period.

pat = '\n\.'
block = 'Some stuff here. And perhaps another sentence here.\n.Some more text.'

For some reason, when I use regex to search for my pattern, it changes the value of pat (using Python 2.7).

import re
mysrch = re.search(pat, block)

Now the value of pat has been changed to:

'\n\\.'

This is messing with the next search that I use pat for. Why is this happening, and how can I avoid it?

Thanks very much in advance.

Answer 1

The extra backslash isn't actually part of the string: the string itself hasn't changed at all.

Here's an example:

>>> pat = '\n\.'
>>> pat
'\n\\.'
>>> print pat

\.

As you can see, when you print pat, it has only one \ in it. When you dump the value of a string at the interactive prompt, Python uses the __repr__ method, which is designed to show you unambiguously what is in the string, so it shows the escaped version of characters. Just as \n is the escaped version of a newline, \\ is the escaped version of \.
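One way to convince yourself of this is to look at the characters the string actually holds rather than at its echoed form. The following is a minimal sketch in Python 3 syntax (the question uses Python 2.7, where the same idea applies); the variable names chars, length and shown are only illustrative.

```python
# Python 3 syntax. The pattern is spelled with an escaped backslash here,
# which produces the same three characters as the question's '\n\.'.
pat = '\n\\.'

chars = list(pat)   # the individual characters stored in the string
length = len(pat)   # newline, backslash, period
shown = repr(pat)   # the escaped form that the interactive prompt echoes

print(length)
print(chars)
print(shown)
```

The length counts a single backslash, while the repr spells that same backslash as two characters so the output can be pasted back in as a valid string literal.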

Your regex is probably not matching how you expect because it has an actual newline character in it, not the literal two-character string "\n" (as a repr: "\\n").

You should either make your regex a raw string (as suggested in the comments):

>>> pat = r"\n\."
>>> pat
'\\n\\.'
>>> print pat
\n\.

Or you could just escape the backslashes and use:

pat = "\\n\\."
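As a quick check of the points above, the sketch below (Python 3 syntax, reusing the block text from the question) searches with both spellings and confirms that the pattern string is the same before and after re.search. The names pat_plain, pat_raw, m_plain and m_raw are made up for this illustration. As far as I can tell, the re module treats a real newline character and the escape \n in a pattern the same way, so both versions should find the same match.

```python
import re

block = 'Some stuff here. And perhaps another sentence here.\n.Some more text.'

pat_plain = '\n\\.'  # a real newline character, then backslash + period
pat_raw = r'\n\.'    # backslash + n, then backslash + period

m_plain = re.search(pat_plain, block)
m_raw = re.search(pat_raw, block)

# Strings are immutable, so searching cannot alter the pattern.
unchanged = (pat_plain == '\n\\.')

# The raw string and the doubled-backslash spelling are the same string.
same_spelling = (pat_raw == '\\n\\.')

print(repr(m_plain.group()), repr(m_raw.group()))
print(unchanged, same_spelling)
```

If a later search with the same pattern behaves differently, the cause is more likely to be in the text being searched than in the pattern variable.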

Solution 1
1 Accepted 2014-09-12 21:37:12
