Python regex sub space
CODE:
import re

word = 'aiuhsdjfööäö ; sdfdfd'
word1=re.sub('[^^äÄöÖåÅA-Za-z0-9\t\r\n\f()!{$}.+?|]',"""\[^^0-9\t\r\n\f(!){$}.+?|\]*""", word) ; print 'word= ', word
word2=re.sub('[^^äÄöÖåÅA-Za-z0-9\t\r\n\f()!{$}.+?|]',"""\[^^0-9\\t\\r\\n\\f(!){$}.+?|\]*""", word) ; print 'word= ', word
word3=re.sub('[^^äÄöÖåÅA-Za-z0-9\t\r\n\f()!{$}.+?|]',"""\[^^0-9\\\t\\\r\\\n\\\f(!){$}.+?|\]*""", word) ; print 'word= ', word
word4=re.sub('[^^äÄöÖåÅA-Za-z0-9\s()!{$}.+?|]',"""\[^^0-9\s(!){$}.+?|\]*""", word) ; print 'word= ', word
word5=re.sub('[^^äÄöÖåÅA-Za-z0-9\s()!{$}.+?|]',"""\[^^0-9\\s(!){$}.+?|\]*""", word) ; print 'word= ', word
word6=re.sub('[^^äÄöÖåÅA-Za-z0-9\s()!{$}.+?|]',"""\[^^0-9\\\s(!){$}.+?|\]*""", word) ; print 'word= ', word
F=open('suoriP.txt','w')
F.writelines(word1+'\n\n'+word2+'\n\n'+word3+'\n\n'+word4+'\n\n'+word5+'\n\n'+word6)
F.close()
RESULT:
aiuhsdjfööäö\[^^0-9
(!){$}.+?|\]*\[^^0-9
(!){$}.+?|\]*\[^^0-9
(!){$}.+?|\]*sdfdfd
aiuhsdjfööäö\[^^0-9
(!){$}.+?|\]*\[^^0-9
(!){$}.+?|\]*\[^^0-9
(!){$}.+?|\]*sdfdfd
aiuhsdjfööäö\[^^0-9\ \
\
\(!){$}.+?|\]*\[^^0-9\ \
\
\(!){$}.+?|\]*\[^^0-9\ \
\
\(!){$}.+?|\]*sdfdfd
aiuhsdjfööäö \[^^0-9\s(!){$}.+?|\]* sdfdfd
aiuhsdjfööäö \[^^0-9\s(!){$}.+?|\]* sdfdfd
aiuhsdjfööäö \[^^0-9\s(!){$}.+?|\]* sdfdfd
QUESTION:
I do not understand why:
re does not substitute backslashes: the replacements written with \s, \\s and \\\s (word4 to word6) all come out as \s
re does not substitute \\t\\r\\n\\f for ';'
I am trying to generate complicated re patterns with variable names by analyzing a file.
I am not able to generate the whitespace character representation [^^äÄöÖåÅA-Za-z0-9\t\r\n\f()!{$}.+?|].
I mean that if I find ';' in the text file with word1=re.sub('[^^äÄöÖåÅA-Za-z0-9\t\r\n\f()!{$}.+?|]',.... I am not able to replace this character ';' with the string '[^^äÄöÖåÅA-Za-z0-9\t\r\n\f()!{$}.+?|]'.
This string is a pattern string, which I use in re.search to extract certain words as variables.
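For reference, both symptoms can be reproduced in isolation. This is a minimal sketch, assuming Python 3 (the code above is Python 2) and a shortened, made-up input string. The replacement argument goes through two layers of escape processing, first the Python string literal and then the re.sub replacement template, and each layer consumes one level of backslashes.

```python
import re

text = 'a;b'  # shortened stand-in for the original input

# The literal '\\t' is two characters: backslash, t.
# The replacement template then turns that pair into a real tab.
one_level = re.sub(';', '\\t', text)

# The literal '\\\\t' is three characters: backslash, backslash, t.
# The template turns the doubled backslash into one backslash and keeps the t.
two_levels = re.sub(';', '\\\\t', text)

# A raw literal skips only the first layer; r'\\t' is the same three characters.
raw_two_levels = re.sub(';', r'\\t', text)

print(repr(one_level))       # 'a\tb'   (a real tab character)
print(repr(two_levels))      # 'a\\tb'  (a backslash followed by t)
print(repr(raw_two_levels))  # 'a\\tb'
```

So to get the two characters backslash and t into the output, the replacement that reaches re.sub has to contain a doubled backslash.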
SOLUTION (which emerged later and was added afterwards):
In the end I substituted the placeholder xxxx for the whitespace special characters. Later I split the string on the placeholder and merged it again, adding the literal text '\t\n\f\v\r' in front of each piece.
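# smart_str is not defined in this post; its signature matches django.utils.encoding.smart_str (assumption)
strsub=smart_str('[^^äÄöÖåÅA-Za-z0-9xxxx()!{$}.+?|`\"£$\%&_+~#\'@><]+', encoding='utf-8', strings_only=False, errors='replace' )
word=re.sub('[^^äÄöÖåÅA-Za-z0-9\t\n\r\f()!{$}.+?|£$\%&_+~#\'@><]+',strsub,word)
str2=''  # accumulator for the rebuilt string
for line in word.split('xxxx'):
    str2=str2+'\\t\\n\\f\\v\\r'+line
F.writelines(str2)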
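The split-and-merge step can also be written with str.join. This is a minimal sketch, assuming Python 3 and a made-up sample string that already contains the xxxx placeholder; MARKER and rebuild are illustrative names, not part of the original code. Unlike the loop above, join puts the marker only between the pieces, not in front of the first one.

```python
# Literal text \t\n\f\v\r: ten characters, five backslash-letter pairs.
MARKER = '\\t\\n\\f\\v\\r'

def rebuild(text, placeholder='xxxx'):
    """Replace every placeholder with the literal marker text."""
    return MARKER.join(text.split(placeholder))

sample = 'abcxxxxdef'  # hypothetical input
result = rebuild(sample)
print(result)  # abc\t\n\f\v\rdef
```

Because neither split nor join goes through the regex replacement template, the backslashes in MARKER arrive in the output unchanged.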
When you use re.sub, the second argument is not a regex. You simply group what you need in the pattern and refer to it as \1 or \2, for example:
word="aiuhsdjfööäö"
word1=re.sub("(.+?)[äa](.+?)",r"\1a\2 [corrected]",word)  # raw string, so \1 and \2 reach re as group references
What I did above is completely unnecessary, but I did it to make my point: when you use [ in the second argument of re.sub, it does not need a \ in front of it.
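If the goal is to insert a pattern string verbatim, one option is to pass a function as the second argument. re.sub then uses the function's return value as it is and does no escape processing on it. This is a minimal sketch, assuming Python 3; the names PATTERN_TEXT and insert_literal and the shortened character class are illustrative, not taken from the question.

```python
import re

# Text to insert verbatim; the raw literal keeps every backslash.
PATTERN_TEXT = r'[^A-Za-z0-9\t\r\n\f]*'

def insert_literal(match):
    """Return the replacement unchanged; re.sub does not process escapes in it."""
    return PATTERN_TEXT

word = 'abc ; def'
# Only ';' matches: letters, digits and whitespace are excluded by the class.
result = re.sub(r'[^A-Za-z0-9\s]', insert_literal, word)
print(result)  # abc [^A-Za-z0-9\t\r\n\f]* def
```

An alternative is to double every backslash in the replacement string before passing it to re.sub.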