Python regex to extract version from a string

The string looks like this (\n is used to break the line):

MySQL-vm
Version 1.0.1

WARNING:: NEVER EDIT/DELETE THIS SECTION

What I want is only 1.0.1.

I am trying re.search(r"Version+'([^']*)'", my_string, re.M).group(1) but it is not working.

re.findall(r'\d+', version) gives me a list of the numbers, which I then have to join back together.

How can I improve the regex?

Use the regex below and get the version number from group index 1.

Version\s*([\d.]+)

Demo:

>>> import re
>>> s = """MySQL-vm
... Version 1.0.1
... 
... WARNING:: NEVER EDIT/DELETE THIS SECTION"""
>>> re.search(r'Version\s*([\d.]+)', s).group(1)
'1.0.1'

Explanation:

Version                  'Version'
\s*                      whitespace (\n, \r, \t, \f, and " ") (0 or
                         more times)
(                        group and capture to \1:
  [\d.]+                   any character of: digits (0-9), '.' (1
                           or more times)
)                        end of \1
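
As a minimal sketch of how this pattern might be wrapped for reuse, the following returns None instead of raising when no version line is present. The helper name `extract_version` is hypothetical and not part of the answer.

```python
import re

# Same pattern as in the answer above, compiled once for reuse.
VERSION_RE = re.compile(r'Version\s*([\d.]+)')

def extract_version(text):
    """Return the captured version string, or None if there is no match."""
    match = VERSION_RE.search(text)
    return match.group(1) if match else None

s = "MySQL-vm\nVersion 1.0.1\n\nWARNING:: NEVER EDIT/DELETE THIS SECTION"
print(extract_version(s))           # 1.0.1
print(extract_version("MySQL-vm"))  # None
```

Note that `[\d.]+` accepts any run of digits and dots, so an input such as `Version 1.0.1.` would yield `1.0.1.` with the trailing dot included.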

You can also try a positive look-behind, which does not consume characters in the string but only asserts whether a match is possible. With the regex below you don't need findall or a capture group; the whole match is the version.

(?<=Version )[\d.]+

Explanation:

  (?<=                     look behind to see if there is:
    Version                  'Version '
  )                        end of look-behind
  [\d.]+                   any character of: digits (0-9), '.' (1 or more times)
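
A short sketch of the look-behind variant in Python, using the sample string from the question. Because the look-behind consumes nothing, the whole match (`group(0)`) is already the version. One caveat: Python's `re` module only accepts fixed-width look-behinds, so a variable-width form such as `(?<=Version\s*)` is rejected at compile time.

```python
import re

s = "MySQL-vm\nVersion 1.0.1\n\nWARNING:: NEVER EDIT/DELETE THIS SECTION"

# The look-behind asserts that 'Version ' precedes the match without including it.
match = re.search(r'(?<=Version )[\d.]+', s)
version = match.group(0) if match else None
print(version)  # 1.0.1

# A variable-width look-behind is not supported by the re module.
try:
    re.compile(r'(?<=Version\s*)[\d.]+')
    variable_width_ok = True
except re.error:
    variable_width_ok = False
print(variable_width_ok)  # False
```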

(?<=Version\s)\S+

Try this. Use it with re.findall.

import re

x = """MySQL-vm
  Version 1.0.1

  WARNING:: NEVER EDIT/DELETE THIS SECTION"""

print(re.findall(r"(?<=Version\s)\S+", x))

Output: ['1.0.1']

See demo.

http://regex101.com/r/dK1xR4/12

https://regex101.com/r/5Us6ow/1

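A repeated-group pattern to match versions like 1, 1.0, 1.0.1:

import re

def version_parser(v):
    # One or more digits, then any number of '.digits' groups: 1, 1.0, 1.0.1, ...
    version_pattern = r'\d+(?:\.\d+)*'
    match = re.search(version_pattern, v)
    # Return None rather than raising when the string contains no version.
    return match.group(0) if match else None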

Old question, but none of the answers cover corner cases such as Version 1.2.3. (ending with a dot) or Version 1.2.3.A (ending with a non-numeric value). Here is my solution:

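import re

ver = "Version 1.2.3.9\nWarning blah blah..."
# Raw string avoids invalid-escape warnings; the final \d requires the version to end with a digit.
print(bool(re.match(r"Version\s*[\d.]+\d", ver)))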
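
To extract the version in those corner cases rather than only test for it, one possible sketch is to require that every dot be followed by digits. The helper name `clean_version` and the extra sample inputs are illustrative assumptions.

```python
import re

# Each dot must be followed by at least one digit, so a trailing '.' or '.A' is left out.
PATTERN = re.compile(r'Version\s*(\d+(?:\.\d+)*)')

def clean_version(text):
    """Return the numeric version after 'Version', or None if absent."""
    match = PATTERN.search(text)
    return match.group(1) if match else None

print(clean_version("Version 1.2.3.9\nWarning blah blah..."))  # 1.2.3.9
print(clean_version("Version 1.2.3."))                         # 1.2.3
print(clean_version("Version 1.2.3.A"))                        # 1.2.3
```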

We can use Python's re library. The regex below is for versions containing numbers only.

import re

# Dots are escaped so they match a literal '.' rather than any character
versions = re.findall(r'[0-9]+\.[0-9]+\.?[0-9]*', AVAILABLE_VERSIONS)

unique_versions = set(versions)  # convert to a set to keep only unique versions

Here AVAILABLE_VERSIONS is a string containing the versions.
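
As a sketch of how this might look end to end, with the dots escaped so they match only a literal `.`: the contents of `AVAILABLE_VERSIONS` below are made up for illustration, and sorting by numeric components is an optional extra rather than part of the answer.

```python
import re

# Hypothetical input; the original answer does not show what the string contains.
AVAILABLE_VERSIONS = "pkg 1.0.1, pkg 1.10, pkg 1.0.1, pkg 1.9.3"

# Two or three dot-separated numeric components; the third is optional.
versions = re.findall(r'[0-9]+\.[0-9]+(?:\.[0-9]+)?', AVAILABLE_VERSIONS)
unique_versions = set(versions)

# Sort by numeric components so that 1.10 comes after 1.9.3.
ordered = sorted(unique_versions, key=lambda v: tuple(int(p) for p in v.split('.')))
print(ordered)  # ['1.0.1', '1.9.3', '1.10']
```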
