
How would you find text in a string in Python and then look for a number after it?

I have a log file, and every line in it ends with the string Line:# where # is the line number.

I am trying to get the # and compare it to the previous line's number. What would be the best way to do that in Python?

I would probably use str.split because it seems easy:

with open('logfile.log') as fin:
    numbers = [ int(line.split(':')[-1]) for line in fin ]

Now you can use zip to compare each number with the next one:

for num1,num2 in zip(numbers,numbers[1:]):
    compare(num1,num2)  #do comparison here.
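
The compare call is left undefined in this answer. As a minimal sketch, assuming the goal is to flag places where the line number does not increase by exactly one, it could be something like the following. The name find_gaps and the sample values are hypothetical, not part of the original answer.

```python
def find_gaps(numbers):
    """Return (previous, current) pairs where current != previous + 1."""
    gaps = []
    # zip pairs each number with its successor; the last number has no successor.
    for prev, cur in zip(numbers, numbers[1:]):
        if cur != prev + 1:
            gaps.append((prev, cur))
    return gaps

# Made-up line numbers for illustration only.
sample = [120, 121, 123, 124]
print(find_gaps(sample))  # prints [(121, 123)]
```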

Of course, this isn't lazy: it stores every line number in the file at once when you really only need two at a time, so it might take up a lot of memory if your files are huge. It wouldn't be hard to make it lazy, though:

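def elem_with_next(iterable):
    """Lazily yield (previous, current) pairs of consecutive items."""
    ii = iter(iterable)
    try:
        prev = next(ii)
    except StopIteration:
        return  # empty input: nothing to pair (a bare next() here raises RuntimeError in Python 3.7+)
    for here in ii:
        yield prev, here
        prev = here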

with open('logfile.log') as fin:
    numbers = ( int(line.split(':')[-1]) for line in fin )
    for num1,num2 in elem_with_next(numbers):
        compare(num1,num2)
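
An alternative sketch of the same lazy pairing uses itertools.tee, following the well-known "pairwise" recipe from the itertools documentation (Python 3.10 and later also ship it as itertools.pairwise). The helper name pairwise_lazy and the sample lines are assumptions made for illustration.

```python
from itertools import tee

def pairwise_lazy(iterable):
    """Return an iterator of (item, next_item) pairs without building a list."""
    a, b = tee(iterable)
    next(b, None)  # advance the second iterator by one; safe on empty input
    return zip(a, b)

# Made-up log lines standing in for the real file.
lines = ["foo Line:1\n", "bar Line:2\n", "baz Line:5\n"]
numbers = (int(line.split(':')[-1]) for line in lines)  # int() ignores the trailing newline
for prev, cur in pairwise_lazy(numbers):
    print(prev, cur)
```

Because the two iterators are consumed in lockstep, tee only ever buffers a single item here.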

I'm assuming that you don't have anything convenient to split the string on, in which case a regular expression might make more sense. That is, if the lines in your log file are structured like:

date: 1-15-2013, error: mildly_annoying, line: 121
date: 1-16-2013, error: err_something_bad, line: 123

Then you won't be able to use line.split('#') as mgilson suggested, although if there is always a colon, line.split(':') might work. In any case, a regular expression solution would look like:

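import re

numbers = []
# `log` is assumed to be an open file object (or any iterable of lines).
for line in log:
    # Raw string, so \d reaches the regex engine instead of being read as a string escape.
    digit_match = re.search(r"(\d+)$", line)
    if digit_match is not None:
        numbers.append(int(digit_match.group(1)))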

Here the expression "(\d+)$" matches one or more digits followed by the end of the line. We extract the digits with the group(1) method of the returned match object and then add them to our list of line numbers.

If you're not confident that the "Line: #" will always come at the end of the line, you could replace the regular expression used above with something akin to "Line:\s*(\d+)", which checks for the string "Line:", then some (or no) whitespace, and then one or more digits.
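
Putting the "Line:\s*(\d+)" idea together with the comparison to the previous number, a rough sketch might look like this. It assumes that matching should be case-insensitive (the sample lines use a lowercase "line:") and that a jump other than +1 is worth reporting; the helper name line_numbers and the third sample line are made up.

```python
import re

# Compiled once; IGNORECASE lets it match both "Line:" and "line:".
LINE_NO = re.compile(r"line:\s*(\d+)", re.IGNORECASE)

def line_numbers(lines):
    """Yield the integer that follows 'Line:' for each line that has one."""
    for line in lines:
        match = LINE_NO.search(line)
        if match is not None:
            yield int(match.group(1))

sample_log = [
    "date: 1-15-2013, error: mildly_annoying, line: 121",
    "date: 1-16-2013, error: err_something_bad, line: 123",
    "a line with no number",  # skipped: no "line:" followed by digits
]

prev = None
for number in line_numbers(sample_log):
    if prev is not None and number != prev + 1:
        print(f"gap between {prev} and {number}")
    prev = number
```

Lines without a match are silently skipped here; whether that is the right behaviour depends on the log format.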

