
Regex to match scientific notation

I'm trying to match numbers in scientific notation (regex from here):

>>> import re
>>> scinot = re.compile(r'[+\-]?(?:0|[1-9]\d*)(?:\.\d*)?(?:[eE][+\-]?\d+)')
>>> re.findall(scinot, 'x = 1e4')
['1e4']
>>> re.findall(scinot, 'x = c1e4')
['1e4']

I'd like it to match x = 1e4 but not x = c1e4. What should I change?

Update: The answer here has the same problem: it incorrectly matches 'x = c1e4'.

Add an anchor at the end of the regex, and require whitespace or an equals sign before the number:

[\s=]+([+-]?(?:0|[1-9]\d*)(?:\.\d*)?(?:[eE][+\-]?\d+))$
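As a quick check of the anchored pattern above, here is a minimal sketch in Python. The name `anchored` and the sample strings are illustrative assumptions. Because the pattern contains one capturing group, `re.findall` returns only the captured number, not the leading whitespace or `=`.

```python
import re

# Pattern from the answer above: one or more whitespace or '=' characters,
# then the number (captured), then the end of the string.
anchored = re.compile(r'[\s=]+([+-]?(?:0|[1-9]\d*)(?:\.\d*)?(?:[eE][+\-]?\d+))$')

print(anchored.findall('x = 1e4'))    # ['1e4']
print(anchored.findall('x = c1e4'))   # []
print(anchored.findall('x = 1e4;'))   # [] because '$' requires the number to end the string
```

The trade-off is that this only finds a number sitting at the very end of the input, so it suits single assignments rather than free-form text.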

Simply add [^\w]? to exclude any alphanumeric character that precedes your first digit:

 [+\-]?[^\w]?(?:0|[1-9]\d*)(?:\.\d*)?(?:[eE][+\-]?\d+)

Technically, \w will also exclude numeric characters, but that's fine because the rest of your regex will catch them.

If you want to be truly rigorous, you can replace \w with A-Za-z:

 [+\-]?[^A-Za-z]?(?:0|[1-9]\d*)(?:\.\d*)?(?:[eE][+\-]?\d+)
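One caveat worth checking before relying on this: the added character class is followed by `?`, so it is optional, and the engine can skip it and still start a match at the `1` in `c1e4`. A negative lookbehind is a common alternative. The sketch below is an assumption and not part of the answers here; the names `optional_class` and `scinot_lb` are hypothetical.

```python
import re

# Pattern from the answer above: the optional [^\w]? may match nothing at all.
optional_class = re.compile(r'[+\-]?[^\w]?(?:0|[1-9]\d*)(?:\.\d*)?(?:[eE][+\-]?\d+)')
print(optional_class.findall('x = c1e4'))   # ['1e4']

# Assumed alternative: (?<![\w.]) refuses to start a match right after
# a word character or a dot, without consuming any text.
scinot_lb = re.compile(r'(?<![\w.])[+\-]?(?:0|[1-9]\d*)(?:\.\d*)?(?:[eE][+\-]?\d+)')
print(scinot_lb.findall('x = 1e4'))     # ['1e4']
print(scinot_lb.findall('x = c1e4'))    # []
print(scinot_lb.findall('x=-2.5E-3'))   # ['-2.5E-3']
```

Since a lookbehind is zero-width, the returned matches contain only the number itself, and a number at the start of the string still matches.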

Another sneaky way is to add a space at the beginning of your regex; that forces every match to begin with a space character.
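A rough sketch of the leading-space idea, assuming the question's original pattern. The capturing group is an addition made here so that `re.findall` returns the number without the space, and the name `spaced` is hypothetical.

```python
import re

# A literal leading space, then the number in a capturing group.
spaced = re.compile(r' ([+\-]?(?:0|[1-9]\d*)(?:\.\d*)?(?:[eE][+\-]?\d+))')

print(spaced.findall('x = 1e4'))    # ['1e4']
print(spaced.findall('x = c1e4'))   # []
print(spaced.findall('x=1e4'))      # [] no space before the number
print(spaced.findall('1e4'))        # [] number at the start of the string
```

The last two lines show the cost of this shortcut: numbers not preceded by a space are missed.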

scinot = re.compile(r'[-+]?\d+\.?\d*[Ee](?:[-+]?\d+)?')

This regex will help you find all numbers in scientific notation in a text.
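To see how this pattern behaves, here is a small sketch; the sample strings are made up for illustration. Note that the exponent digits are optional in this pattern, and nothing stops a match from starting in the middle of an identifier, so it does not by itself reject `c1e4`.

```python
import re

# Same pattern as above, written as a raw string.
scinot = re.compile(r'[-+]?\d+\.?\d*[Ee](?:[-+]?\d+)?')

print(scinot.findall('a = 1.5e-3, b = 2E10, c = 7'))   # ['1.5e-3', '2E10']
print(scinot.findall('x = c1e4'))                      # ['1e4']
print(scinot.findall('1e'))                            # ['1e'] exponent digits are optional
```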

By the way, here is a link to a similar question: Extract scientific number from string
