
How to use RegEx to search for some numbers in Python?

I have a large text file containing words and numbers. It has many lines like this:

Linear regression is done. value: 123.235

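The numbers differ throughout the document, of course. The problem is that I really need those numbers, but going through 100,000 lines by hand to collect them all would take far too long. I tried RegEx, but I am not good at it. Can anyone help?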

import re

file = open('filename.txt', 'r')
x = re.findall("value", file)
print(value)

It would be great if you could help me get all the numbers that follow "value".

We can use re.findall as follows:

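import re

# Read the whole file into one string; re.findall needs a string, not a file object.
with open('filename.txt', 'r') as file:
    data = file.read()

# Capture the integer or decimal number that follows "value:".
nums = re.findall(r'\bvalue:\s*(\d+(?:\.\d+)?)', data)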
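To see what this pattern does, here is a minimal sketch that runs it on an in-memory string instead of a file. The sample text and the variable names `data` and `values` are made up for illustration. Since re.findall returns the captured group as strings, the sketch also converts the matches to float.

```python
import re

# Hypothetical sample text standing in for the contents of filename.txt.
data = (
    "Linear regression is done. value: 123.235\n"
    "some unrelated line\n"
    "Linear regression is done. value: 42\n"
)

# \bvalue:          the literal word "value" followed by a colon
# \s*               optional whitespace
# (\d+(?:\.\d+)?)   captured integer or decimal number
nums = re.findall(r'\bvalue:\s*(\d+(?:\.\d+)?)', data)

# findall returns strings; convert them before doing arithmetic.
values = [float(n) for n in nums]
print(values)  # [123.235, 42.0]
```

Note that the pattern is case-sensitive, so a line containing "Value:" would be skipped unless re.IGNORECASE is passed as the flags argument.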

Given the following sample.txt file, which contains "Linear regression is done. value: <value>" 6 times:

sample.txt:

Linear regression is done. value: 000.00 ssdfsdfsdfhklshdfkhskldhflsdf
Sed ut perspiciatis unde omnis iste natus error sit voluptatem accusantium
Linear regression is done. value: 123.12 doloremque, Linear regression is done. value: 0.0123 eaque
dolores eos qui ratione voluptatem sequi nesciunt. Neque porro quisquam est, qui dolorem
ipsum quia Linear regression is done. value: 234.23 dolor sit amet, consectetur, adipisci velit, sed
quia non numquam eius modi tempora incidunt ut labore et dolore magnam aliquam quaerat voluptatem. Ut enim ad
minima veniam, quis nostrum exercitationem ullam corporis suscipit Linear regression is done. value: 345.34 laboriosam,
nisi ut aliquid ex ea commodi consequatur? Quis autem vel eum iure reprehenderit qui in ea voluptate velit esse quam
nihil molestiae consequatur, vel illum qui dolorem eum fugiat quo voluptas nulla pariatur?
lskdfhlshdfl Linear regression is done. value: 456.45

Here is one way to do it:

import re

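# Raw string so the backslashes reach the regex engine unchanged; the dot after
# "done" is escaped, and the optional sign sits inside the group so it is captured.
REGEX = r'Linear regression is done\. value: ([+-]?(?:[0-9]+\.?[0-9]*|\.[0-9]+))'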

if __name__ == '__main__':
    numbers_in_text = []
    with open('sample.txt', 'r') as file:
        for line in file:
            numbers_in_line = re.findall(REGEX, line)
            numbers_in_text.extend(numbers_in_line)
    
    print(numbers_in_text)
    assert 6 == len(numbers_in_text), 'It is not reading all the numbers'

This prints:

['000.00', '123.12', '0.0123', '234.23', '345.34', '456.45']
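For a file with around 100,000 lines, it may be convenient to compile the pattern once and yield the numbers as floats while reading line by line. The following is only a sketch under assumptions: `iter_values`, `PATTERN`, and the throwaway demo file are hypothetical names and data, not part of the answer above, and the shorter `value:` prefix is used in place of the full sentence.

```python
import re
import tempfile
from pathlib import Path

# Compiled once and reused for every line; the optional sign is captured too.
PATTERN = re.compile(r'value:\s*([+-]?(?:\d+\.?\d*|\.\d+))')

def iter_values(path):
    """Yield every number that follows 'value:' in the file, as a float."""
    with open(path, 'r') as file:
        for line in file:
            # finditer handles several matches on the same line.
            for match in PATTERN.finditer(line):
                yield float(match.group(1))

# Demo on a throwaway file with made-up content.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / 'demo.txt'
    demo_path.write_text(
        'Linear regression is done. value: 000.00 abc\n'
        'noise Linear regression is done. value: -1.5 and value: .25\n'
    )
    values = list(iter_values(demo_path))

print(values)  # [0.0, -1.5, 0.25]
```

Because the function is a generator, the file is never held in memory all at once, and the caller can sum, filter, or collect the values as needed.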

