Remove part of the string using Regex in Python
I have a text file which contains information in the format below.
2018/03/21-17:08:48.638553 508 7FF4A8F3D704 snononsonfvnosnovoosr
2018/03/21-17:08:48.985053 346K 7FE9D2D51706 ahelooa afoaona woom
2018/03/21-17:08:50.486601 1.5M 7FE9D3D41706 qojfcmqcacaeia
2018/03/21-17:08:50.980519 16K 7FE9BD1AF707 user: number is 93823004
2018/03/21-17:08:50.981908 1389 7FE9BDC2B707 user 7fb31ecfa700
2018/03/21-17:08:51.066967 0 7FE9BDC91700 Exit Status = 0x0
2018/03/21-17:08:51.066968 1 7FE9BDC91700 std:ZMD:
Expected Result
I want to remove the part of each line up to the third space (that is, up to and including 7FF4A8F3D704). The result should look like:
snononsonfvnosnovoosr
ahelooa afoaona woom
qojfcmqcacaeia
user: number is 93823004
user 7fb31ecfa700
Exit Status = 0x0
std:ZMD:
Solution
I can remove the timestamp (for example "2018/03/21-17:08:48.638553") with the code below, but I am trying to replace the whole prefix with ''.
import re
# The dot before the microseconds is escaped so it matches a literal '.'
Regex_list = [r'\d{4}/\d{2}/\d{2}-\d{2}:\d{2}:\d{2}\.\d{6}']
for p in Regex_list:
    text = re.sub(p, ' ', file)
If this is the exact structure of your text file, why don't you simply cut off the first n uninteresting characters?
for line in txt.splitlines():
    print(line[53:])  # assumes every line has a fixed-width 53-character prefix
#snononsonfvnosnovoosr
#ahelooa afoaona woom
#qojfcmqcacaeia
#user: number is 93823004
#user 7fb31ecfa700
#Exit Status = 0x0
#std:ZMD:
Because it looks like the first 3 column values won't ever have any spaces in them, match \S+\s+ to get a column value and its associated space-padding to the right, and repeat it 3 times:
output = re.sub(r'(?m)^(?:\S+\s+){3}', '', input)
https://regex101.com/r/YHXTJs/1
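To see the substitution above in action, here is a minimal, self-contained sketch. The variable names `sample_log` and `stripped` are made up for this illustration, and the input is three of the lines from the question.

```python
import re

# Three sample lines copied from the question (names here are illustrative only).
sample_log = (
    "2018/03/21-17:08:48.638553 508 7FF4A8F3D704 snononsonfvnosnovoosr\n"
    "2018/03/21-17:08:50.980519 16K 7FE9BD1AF707 user: number is 93823004\n"
    "2018/03/21-17:08:51.066967 0 7FE9BDC91700 Exit Status = 0x0"
)

# (?m) lets ^ match at the start of every line.
# (?:\S+\s+){3} consumes three space-free fields plus the whitespace after each.
stripped = re.sub(r'(?m)^(?:\S+\s+){3}', '', sample_log)

print(stripped)
# snononsonfvnosnovoosr
# user: number is 93823004
# Exit Status = 0x0
```

One thing to watch: \s also matches a line break, so a line that has only three fields would be swallowed together with its newline. If that can happen in your file, replacing \s+ with [ \t]+ keeps the match on a single line.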
Another approach uses re.split() and limits it to 3 splits. This assumes that there are no spaces in the first three fields. It splits on one or more whitespace characters.
import re
for data in L.splitlines():
    print(re.split(r'\s+', data, maxsplit=3)[-1])
Output:
snononsonfvnosnovoosr
ahelooa afoaona woom
qojfcmqcacaeia
user: number is 93823004
user 7fb31ecfa700
Exit Status = 0x0
std:ZMD:
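If the fields are always separated by plain whitespace, the same idea may not need a regex at all: str.split() with no separator also splits on runs of whitespace and accepts a split limit. This is a variation on the re.split() answer, not something from the original posts; the helper name `drop_leading_fields` and its `n_fields` parameter are hypothetical.

```python
def drop_leading_fields(line, n_fields=3):
    """Return `line` without its first `n_fields` whitespace-separated fields."""
    # With sep=None, split() treats any run of whitespace as one separator.
    parts = line.split(None, n_fields)
    # A line with nothing after the prefix yields an empty string.
    return parts[n_fields] if len(parts) > n_fields else ''


for line in [
    "2018/03/21-17:08:50.981908 1389 7FE9BDC2B707 user 7fb31ecfa700",
    "2018/03/21-17:08:51.066968 1 7FE9BDC91700 std:ZMD:",
]:
    print(drop_leading_fields(line))
# user 7fb31ecfa700
# std:ZMD:
```

Unlike indexing with [-1], the length check avoids returning the third field by mistake when a line has no message part.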