Python regular expression sub space

Question

CODE:

import re

word = 'aiuhsdjfööäö ; sdfdfd'
word1=re.sub('[^^äÄöÖåÅA-Za-z0-9\t\r\n\f()!{$}.+?|]',"""\[^^0-9\t\r\n\f(!){$}.+?|\]*""", word) ; print 'word=  ', word
word2=re.sub('[^^äÄöÖåÅA-Za-z0-9\t\r\n\f()!{$}.+?|]',"""\[^^0-9\\t\\r\\n\\f(!){$}.+?|\]*""", word) ; print 'word=  ', word
word3=re.sub('[^^äÄöÖåÅA-Za-z0-9\t\r\n\f()!{$}.+?|]',"""\[^^0-9\\\t\\\r\\\n\\\f(!){$}.+?|\]*""", word) ; print 'word=  ', word
word4=re.sub('[^^äÄöÖåÅA-Za-z0-9\s()!{$}.+?|]',"""\[^^0-9\s(!){$}.+?|\]*""", word) ; print 'word=  ', word
word5=re.sub('[^^äÄöÖåÅA-Za-z0-9\s()!{$}.+?|]',"""\[^^0-9\\s(!){$}.+?|\]*""", word) ; print 'word=  ', word
word6=re.sub('[^^äÄöÖåÅA-Za-z0-9\s()!{$}.+?|]',"""\[^^0-9\\\s(!){$}.+?|\]*""", word) ; print 'word=  ', word

F=open('suoriP.txt','w')
F.writelines(word1+'\n\n'+word2+'\n\n'+word3+'\n\n'+word4+'\n\n'+word5+'\n\n'+word6)
F.close()

RESULT:

aiuhsdjfööäö\[^^0-9 

(!){$}.+?|\]*\[^^0-9    

(!){$}.+?|\]*\[^^0-9    

(!){$}.+?|\]*sdfdfd

aiuhsdjfööäö\[^^0-9 

(!){$}.+?|\]*\[^^0-9    

(!){$}.+?|\]*\[^^0-9    

(!){$}.+?|\]*sdfdfd

aiuhsdjfööäö\[^^0-9\    \
\
\(!){$}.+?|\]*\[^^0-9\  \
\
\(!){$}.+?|\]*\[^^0-9\  \
\
\(!){$}.+?|\]*sdfdfd

aiuhsdjfööäö \[^^0-9\s(!){$}.+?|\]* sdfdfd

aiuhsdjfööäö \[^^0-9\s(!){$}.+?|\]* sdfdfd

aiuhsdjfööäö \[^^0-9\s(!){$}.+?|\]* sdfdfd

QUESTION:

I do not understand why:

re does not substitute backslashes: \s, \\s, and \\\s are all substituted as \s
re does not substitute \\t\\r\\n\\f for ';'
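
A minimal sketch of the two layers of escaping involved, written for Python 3 (the code in the question is Python 2). The sample text 'a;b' and the variable names are made up for illustration. First the string literal is interpreted by Python, then re.sub interprets backslash escapes in the replacement string again:

```python
import re

text = 'a;b'  # hypothetical sample input

# '\t' is already a real tab character before re.sub sees it.
tab_from_literal = re.sub(';', '\t', text)

# '\\t' is backslash + t; re.sub's replacement handling turns it into a tab.
tab_from_template = re.sub(';', '\\t', text)

# r'\\t' is two backslashes + t; re.sub reduces it to a literal backslash + t.
literal_backslash_t = re.sub(';', r'\\t', text)

print(repr(tab_from_literal))     # 'a\tb'
print(repr(tab_from_template))    # 'a\tb'
print(repr(literal_backslash_t))  # 'a\\tb'
```

This would explain why word1 and word2 in the question produce the same output: both replacements end up as real tab, carriage return, newline, and form feed characters.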

I am trying to generate complicated re patterns with variable names by analyzing a file.

I am not able to generate the representation of the whitespace characters in [^^äÄöÖåÅA-Za-z0-9\t\r\n\f()!{$}.+?|]. I mean that if I find ';' in the text file with word1=re.sub('[^^äÄöÖåÅA-Za-z0-9\t\r\n\f()!{$}.+?|]',....

I am not able to replace this character ';' with the string '[^^äÄöÖåÅA-Za-z0-9\t\r\n\f()!{$}.+?|]'.

This string is a pattern string, which I use in re.search to extract certain words as variables.

SOLUTION (found later and added afterwards):

In the end I substituted xxxx in place of the whitespace special characters. Later I split the string on xxxx and merged it again, adding '\t\n\f\v\r'.

strsub=smart_str('[^^äÄöÖåÅA-Za-z0-9xxxx()!{$}.+?|`\"£$\%&_+~#\'@><]+', encoding='utf-8', strings_only=False, errors='replace' )
word=re.sub('[^^äÄöÖåÅA-Za-z0-9\t\n\r\f()!{$}.+?|£$\%&_+~#\'@><]+',strsub,word)

for line in word.split('xxxx'):
     str2=str2+'\\t\\n\\f\\v\\r'+line 
     F.writelines(str2)

Answer 1

When you use re.sub, the second argument (the replacement) is not a regex. To reuse matched text, group it in the pattern and refer to it as \1 or \2, for example:

 word="aiuhsdjfööäö"
 # raw string, so \1 and \2 reach re.sub as group references, not octal escapes
 word1=re.sub("(.+?)[äa](.+?)",r"\1a\2 [corrected]",word)

What I did above is completely unnecessary, but I did it to show my point: [ does not have to come after \ when you use it in the second argument of re.sub.
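
For the case in the question, where the replacement is itself a pattern string that should be inserted unchanged, one possible approach is sketched below. It is written for Python 3, and the names PATTERN_TEXT and insert_literal are hypothetical. It relies on the fact that re.sub does not process backslash escapes when the replacement is a function instead of a string:

```python
import re

# Hypothetical pattern text to insert; the raw string keeps every backslash.
PATTERN_TEXT = r'[^^0-9\t\r\n\f(!){$}.+?|]*'

def insert_literal(text):
    # The return value of a replacement function is inserted as-is,
    # so the backslashes in PATTERN_TEXT survive unchanged.
    return re.sub(';', lambda match: PATTERN_TEXT, text)

result = insert_literal('abc ; def')
print(result)

# Alternative: keep a string replacement but double every backslash first.
escaped = PATTERN_TEXT.replace('\\', '\\\\')
same_result = re.sub(';', escaped, 'abc ; def')
```

Both variants should leave the text \t\r\n\f in the output as backslash-letter pairs instead of real whitespace characters.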

Answer 1 | 0 | 2013-07-18 15:35:10
