regex python string ignore special character

This is what I have now:

import re

x = "From: Joyce IP: 192.111.1.1 Source: 192.168.1.1"
x = x.replace(' ', '')
m = re.findall(r'(?<=:)\S+', x)
print(m)

I want output like the following, so that I can run $ script.py > result.txt:

Joyce 192.111.1.1 192.168.1.1

Instead of matching the text you want to keep, it may be easier to remove the parts you don't want:

>>> import re
>>> x = "From: Joyce IP: 192.111.1.1 Source: 192.168.1.1"
>>> re.sub(r'\w+:\s', '', x)
'Joyce 192.111.1.1 192.168.1.1'
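
As a rough sketch of how the same substitution could be used in a script whose output is redirected to a file, the snippet below applies it line by line. The helper name strip_labels and the second sample line are made up for illustration; only the pattern comes from the answer above.

```python
import re

# Same pattern as above: a word, a colon, and one whitespace character.
LABEL = re.compile(r'\w+:\s')


def strip_labels(line):
    """Remove every 'Label: ' prefix from a line (hypothetical helper)."""
    return LABEL.sub('', line)


lines = [
    "From: Joyce IP: 192.111.1.1 Source: 192.168.1.1",
    "From: Alex IP: 10.0.0.5 Source: 10.0.0.1",  # invented sample line
]

for line in lines:
    print(strip_labels(line))
```

Running it as $ script.py > result.txt would then write one cleaned line per input line.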

However, if you prefer to use re.findall(), here is one option that is similar to your current approach:

>>> ' '.join(re.findall(r'(?<=:\s)\S+', x))
'Joyce 192.111.1.1 192.168.1.1'

You need the \s in the lookbehind (a positive lookbehind, written (?<=...)) because each colon in your input string is followed by a space.
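
To make the effect of the \s concrete, here is a small comparison on the same input string. The variable names are only illustrative.

```python
import re

x = "From: Joyce IP: 192.111.1.1 Source: 192.168.1.1"

# Without \s: \S+ would have to start directly after a colon,
# but every colon here is followed by a space, so nothing matches.
without_space = re.findall(r'(?<=:)\S+', x)

# With \s: the lookbehind consumes "colon + space", so \S+ can start
# at the first character of each value.
with_space = re.findall(r'(?<=:\s)\S+', x)

print(without_space)
print(with_space)
```

The first list comes back empty, while the second holds the three values.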

A slight change to your code (don't remove the spaces, and include them in the lookbehind) works perfectly:

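import re

x = "From: Joyce IP: 192.111.1.1 Source: 192.168.1.1"
# Keep the spaces and match "colon + whitespace" in the lookbehind.
m = re.findall(r'(?<=:\s)\S+', x)
print(" ".join(m))

If you only need the IP addresses, you can match them directly:

import re

x = "From: Joyce IP: 192.111.1.1 Source: 192.168.1.1"

# One to three digits, followed by three dot-separated groups of digits.
reg = r"\d{1,3}(?:[.]\d+){3}"

m = re.findall(reg, x)

for i in m:
    print(i)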

Result:

192.111.1.1
192.168.1.1
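
Note that this pattern is fairly loose: it would also accept something like 999.1.1.1. If that matters, one possible refinement is to filter the candidates with the standard-library ipaddress module. This is a sketch that assumes Python 3; the function name find_ipv4 is hypothetical and not part of the answer above.

```python
import ipaddress
import re


def find_ipv4(text):
    """Return regex candidates that also parse as valid IPv4 addresses."""
    candidates = re.findall(r"\d{1,3}(?:[.]\d+){3}", text)
    valid = []
    for candidate in candidates:
        try:
            # Raises a ValueError subclass if an octet is out of range.
            ipaddress.IPv4Address(candidate)
        except ValueError:
            continue
        valid.append(candidate)
    return valid


x = "From: Joyce IP: 192.111.1.1 Source: 192.168.1.1"
for ip in find_ipv4(x):
    print(ip)
```

For the sample string the output is the same two addresses; the extra check only makes a difference on input that contains out-of-range values.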
