Regex to extract a string after a date in Python

Question

I have these two types of string:

1625212673449-2021_07_02_07_55_05_111040_0CM.csv.zip

1635508858063-1625212673449-2021_07_02_07_55_05_111040_0CM.csv.zip

How can I use a regex to get the 111040 part of the string? It always has 6 digits.

My approach is: "take the 6-digit code after the YYYY_MM_DD_HH_MM_SS_ part", but any other approach is also welcome.

EDIT: The last part, _0CM.csv.zip, is susceptible to change.

Thanks in advance.

Answer 1

You wanted a regex, so here it is:

[0-9]{4}(?:_[0-9]{2}){5}_([0-9]{6})

[0-9]{4} : matches the 4-digit year; this is our starting anchor
(?:_[0-9]{2}){5} : the year is followed by five two-digit numbers (month, day, hour, minute, second), each preceded by an underscore, so we can put them in one non-capturing group, repeat it 5 times, and ignore them
_([0-9]{6}) : an underscore, then the 6 digits that follow the previous expression, captured as group 1

The desired number is in capture group 1 of this regex:

import re
# Raw string, so the regex is passed to re unchanged
regex = r'[0-9]{4}(?:_[0-9]{2}){5}_([0-9]{6})'
match = re.search(regex, '1625212673449-2021_07_02_07_55_05_111040_0CM.csv.zip')
print(match.group(1))  # 111040
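
If it helps, the same pattern can be written with re.VERBOSE so that each piece carries its own comment. This is only a sketch: the names DATE_CODE and extract_code, and the choice to return None when nothing matches, are my own and not part of the question.

```python
import re

# Same pattern as above, spelled out with re.VERBOSE
# (whitespace and #-comments inside the pattern are ignored).
DATE_CODE = re.compile(r"""
    [0-9]{4}          # year
    (?:_[0-9]{2}){5}  # _month _day _hour _minute _second
    _([0-9]{6})       # group 1: the 6-digit code
""", re.VERBOSE)

def extract_code(name):
    """Return the 6-digit code after the timestamp, or None if there is no match."""
    match = DATE_CODE.search(name)
    return match.group(1) if match else None

print(extract_code("1625212673449-2021_07_02_07_55_05_111040_0CM.csv.zip"))
print(extract_code("1635508858063-1625212673449-2021_07_02_07_55_05_111040_0CM.csv.zip"))
print(extract_code("Test"))
```

Both sample names from the question give 111040, and a string without a timestamp gives None instead of raising AttributeError.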

Answer 2

How about this pattern? It works if you match the lines one at a time:

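import re
# Raw string, so \d reaches the regex engine instead of being read as a string escape
pattern = re.compile(r'\d{4}_\d{2}_\d{2}_\d{2}_\d{2}_\d{2}_(\d{6})')
print(pattern.findall("1625212673449-2021_07_02_07_55_05_111040_0CM.csv.zip"))  # ['111040']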
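
To show what matching line by line could look like, here is a minimal sketch. It assumes the file names arrive as one multi-line string; the variable names text and codes are made up for illustration.

```python
import re

pattern = re.compile(r'\d{4}_\d{2}_\d{2}_\d{2}_\d{2}_\d{2}_(\d{6})')

# Assumed input: one file name per line.
text = """1625212673449-2021_07_02_07_55_05_111040_0CM.csv.zip
1635508858063-1625212673449-2021_07_02_07_55_05_111040_0CM.csv.zip"""

codes = []
for line in text.splitlines():
    match = pattern.search(line)
    if match:  # skip lines that contain no timestamp
        codes.append(match.group(1))

print(codes)  # ['111040', '111040']
```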

Answer 3

This returns '' for any string where no match is found. Note that the pattern only looks for 6 digits between two underscores; it does not check for the date in front of them.

import re

strings = [
    "1625212673449-2021_07_02_07_55_05_111040_0CM.csv.zip",
    "1635508858063-1625212673449-2021_07_02_07_55_05_111040_0CM.csv.zip",
    'Test'
]

pattern = re.compile(r'_(\d{6})_')

# Search once per string (assignment expression, Python 3.8+); fall back to '' when nothing matches
digits = [m.group(1) if (m := pattern.search(string)) else '' for string in strings]

print(digits)
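
One thing to keep in mind: _(\d{6})_ picks up the first run of six digits between underscores, wherever it sits. A small sketch of how that can differ from a date-anchored pattern; the file name and the helper first_group below are made up for illustration and do not come from the question.

```python
import re

loose = re.compile(r'_(\d{6})_')
anchored = re.compile(r'\d{4}(?:_\d{2}){5}_(\d{6})')

# Hypothetical name with an extra 6-digit field before the timestamp.
name = "batch_999999_2021_07_02_07_55_05_111040_0CM.csv.zip"

def first_group(pattern, string):
    """Return group 1 of the first match, or '' if there is none."""
    match = pattern.search(string)
    return match.group(1) if match else ''

print(first_group(loose, name))     # 999999
print(first_group(anchored, name))  # 111040
```

For the two name formats in the question both patterns give the same result, so this only matters if other 6-digit fields can show up.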

3 answers

Answer 1: 2, accepted, 2021-11-02 08:04:45
Answer 2: 1, 2021-11-02 08:05:48
Answer 3: 1, 2021-11-02 08:14:20
