
Find a pattern across multiple lines in Python

I spent two days trying to build a regular expression that checks whether two words/numbers occur sequentially on different lines. I have a file with text like this:

  1  [pid 29743] 18:58:19 prctl(PR_CAPBSET_DROP, 0x9, 0, 0, 0 <unfinished ...>
  2  [pid 29746] 18:58:19 <... mprotect resumed> ) = 0
  3  [pid 29743] 18:58:19 <... prctl resumed> ) = 0
  4  [pid   615] 18:58:19 <... ioctl resumed> , 0xffffffffffb4f054) = 0
  5  [pid 29743] 18:58:19 prctl(PR_CAPBSET_READ, 0xa, 0, 0, 0 <unfinished ...>
  6  [pid   615] 18:58:19 ioctl(13, 0x40047703 <unfinished ...>
  7  [pid 29743] 18:58:19 <... prctl resumed> ) = 1
  8  [pid 29746] 18:58:19 mprotect(0xfffffffff4ae2000, 4096, PROT_NONE <unfinished ...>
  9  [pid 29743] 18:58:19 prctl(PR_CAPBSET_DROP, 0xa, 0, 0, 0 <unfinished ...>
  10 [pid   615] 18:58:19 <... ioctl resumed> , 0x7fd19062e0) = 0
  11 [pid 29743] 18:58:19 <... prctl resumed> ) = 0
  12 [pid 29746] 18:58:19 <... mprotect resumed> ) = 0
  13 [pid 29743] 18:58:19 prctl(PR_CAPBSET_READ, 0xb, 0, 0, 0 <unfinished ...>
  14 [pid 29746] 18:58:19 ioctl(13, 0x40047703, 
  <unfinished ...>
  15 [pid 29743] 18:58:19 <... prctl resumed> ) = 1
  16 [pid   615] 18:58:19 <... ioctl resumed> , 0x7fd19064b0) = 0

I am looking for two values, 0x7fd19062e0 and 0x7fd19064b0, that appear sequentially in the text. They appear on lines 10 and 16. I want to build a regular expression that tells me whether they appear sequentially or not. Here is my code:

    import re

    text = ""
    file = open("text.txt", "a+")
    for line in file:
        text += line
    if re.findall(r"^.*0x7fd19062e0.*0x7fd19064b0", text, re.M):
        print('found a match!')
    else:
        print('no match')

re.M only modifies the behavior of the ^ and $ anchors, so that they match at the start and end of each line. For the "dot matches newline" option, you need re.S. Also, if you just want to find out whether there is a match, don't use re.findall(); re.search() is enough:

import re

with open("text.txt") as file:  # why append mode?
    text = file.read()          # no need to build the string line by line
if re.search(r"\b0x7fd19062e0\b.*\b0x7fd19064b0\b", text, re.S):
    print('found a match!')
else:
    print('no match')

Note that I added word boundary anchors to ensure that only entire hex numbers are matched (otherwise, a value could also match as part of a longer number). This may or may not be relevant in your case, but it's probably good practice.
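To see the effect of the flags and of the word boundaries, here is a small self-contained sketch. The `sample` string is made up for illustration (modeled on lines 10 and 16 of the trace), and `appears_in_order` is a hypothetical helper name, not something from the original answer.

```python
import re

# Made-up two-line sample modeled on the trace in the question.
sample = (
    "10 [pid   615] 18:58:19 <... ioctl resumed> , 0x7fd19062e0) = 0\n"
    "16 [pid   615] 18:58:19 <... ioctl resumed> , 0x7fd19064b0) = 0\n"
)

def appears_in_order(first, second, text, flags=re.S):
    """Return True if `first` occurs somewhere before `second` in `text`."""
    pattern = r"\b" + re.escape(first) + r"\b.*\b" + re.escape(second) + r"\b"
    return re.search(pattern, text, flags) is not None

# With re.S the dot also matches newlines, so the match can span lines.
print(appears_in_order("0x7fd19062e0", "0x7fd19064b0", sample))        # True
# With re.M only ^ and $ change; the dot still stops at each newline.
print(appears_in_order("0x7fd19062e0", "0x7fd19064b0", sample, re.M))  # False
# The values in reverse order are not found.
print(appears_in_order("0x7fd19064b0", "0x7fd19062e0", sample))        # False
# The word boundary rejects a prefix of a longer number.
print(appears_in_order("0x7fd19062e", "0x7fd19064b0", sample))         # False
```

`re.escape` is not strictly needed for plain hex values, but it keeps the helper safe if the search strings ever contain regex metacharacters.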

No need for a regular expression:

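with open('text.txt') as f:
    # (index, line) pairs for every line in the file
    enumerated_lines = list(enumerate(f.readlines()))
# keep only the lines that contain the value
lines_with_pattern = list(filter(lambda l: '0x7fd19062e0' in l[1], enumerated_lines))
# pair each matching line with the next matching line
pairs = zip(lines_with_pattern, lines_with_pattern[1:])
# keep the pairs whose lines are directly adjacent in the file
result = list(filter(lambda p: p[0][0] + 1 == p[1][0], pairs))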
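If the goal is the one stated in the question (the second value appearing anywhere after the first), a plain string search is another way to avoid a regular expression. The following is a minimal sketch, assuming the whole text fits in memory; `find_in_order` is a hypothetical helper and `sample` is made-up data. Unlike the regex with word boundaries, it would also match a value that is part of a longer number.

```python
def find_in_order(text, first, second):
    """Return 1-based line numbers (of first, of second), or None."""
    start = text.find(first)
    if start == -1:
        return None
    # Only look for the second value after the end of the first hit.
    end = text.find(second, start + len(first))
    if end == -1:
        return None
    # Count the newlines before each hit to get 1-based line numbers.
    return text.count("\n", 0, start) + 1, text.count("\n", 0, end) + 1

# Made-up sample: the two values sit on lines 1 and 3.
sample = "a 0x7fd19062e0\nb\nc 0x7fd19064b0\n"
print(find_in_order(sample, "0x7fd19062e0", "0x7fd19064b0"))  # (1, 3)
print(find_in_order(sample, "0x7fd19064b0", "0x7fd19062e0"))  # None
```

The line numbers returned here are positions in the text itself, not the numbers printed at the start of each trace line.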
