How to extract a substring using a regular expression in Python?

Question

I have a string, and I want to extract the substring that starts with a number and ends with a number.

My string is "05/24/2019 04:33 PM 582 atm1.py".

I tried the following pattern: ^\d.+\s+\d$

import re

i = "05/24/2019  04:33 PM               582 atm1.py"
print(re.match(r"^\d.+\s+\d$", i))

Expected output: "05/24/2019 04:33 PM 582". Actual output: I am getting the entire string.

Answer 1

A very specific pattern:

print(re.match(r"\d+/\d+/\d+\s+\d+:\d+\s+PM\s+\d+", i).group(0))

Or use a looser one that matches greedily up to the last run of whitespace:

print(re.match(r".+\s+", i).group(0))

Output:

05/24/2019  04:33 PM               582
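
As a rough check of the two patterns above, the sketch below runs both against the sample string and prints the results with repr() so that whitespace is visible. The variable names strict and loose are illustrative only and are not part of the original answer.

```python
import re

i = "05/24/2019  04:33 PM               582 atm1.py"

# Strict pattern: spells out the date, the time, a literal "PM", and the number.
strict = re.match(r"\d+/\d+/\d+\s+\d+:\d+\s+PM\s+\d+", i).group(0)

# Loose pattern: the greedy .+ backtracks to the last run of whitespace,
# so the match includes the space that precedes "atm1.py".
loose = re.match(r".+\s+", i).group(0)

print(repr(strict))
print(repr(loose))
print(loose.rstrip() == strict)
```

The looser pattern keeps one trailing space, so calling rstrip() on its result is probably wanted before comparing or storing it.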

Answer 2

Try the following regex: \d[\d\s:APM/]*\d

import re

s = "05/24/2019  04:33 PM               582 atm1.py"
pattern = r"\d[\d\s:APM/]*\d"
print(re.match(pattern, s).group(0))

Regex breakdown:
1. \d : a digit (0-9).
2. [\d\s:APM/]* : the * means zero or more of the characters inside the square brackets. Inside the brackets, \d matches digits (0-9), \s matches whitespace, and :APM/ are literal characters (: for the time, A, P and M for AM and PM, and / for the date).
3. \d : a digit (0-9).

Output: 05/24/2019  04:33 PM               582
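
To see how the character class bounds the match, here is a minimal sketch. The helper extract_prefix and the second and third input strings are made up for illustration and do not come from the original answer.

```python
import re

# Character class: digits, whitespace, ':', '/', and the letters A, P, M.
PATTERN = re.compile(r"\d[\d\s:APM/]*\d")

def extract_prefix(text):
    """Return the leading digit-to-digit run, or None if there is no match."""
    m = PATTERN.match(text)
    return m.group(0) if m else None

first = extract_prefix("05/24/2019  04:33 PM               582 atm1.py")
second = extract_prefix("01/02/2020  09:15 AM  17 notes.txt")
third = extract_prefix("atm1.py")  # does not start with a digit

print(repr(first))
print(repr(second))
print(third)
```

One caveat with this approach: if the file name itself begins with a digit (for example "582 2atm.py"), that digit is still inside the character class, so the match would extend into the file name.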


Answer 3

If you want to extract, from a longer string, the substring that starts with the first number that is a whole word and ends with the last number that is a whole word, you may use

r'\b\d+\b.*\b\d+\b'

Details

\b\d+\b - a word boundary, one or more digits, and a word boundary (no digit, letter or underscore is allowed immediately before or after)
.* - any zero or more characters, as many as possible (without the re.DOTALL or re.S flag, only non-linebreak characters are matched)
\b\d+\b - a word boundary, one or more digits, and a word boundary (no digit, letter or underscore is allowed immediately before or after)

In Python, use

import re
i="05/24/2019  04:33 PM               582 atm1.py"
m = re.search(r'\b\d+\b.*\b\d+\b', i)
if m:
    print(m.group()) # => 05/24/2019  04:33 PM               582
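
The effect of the word boundaries can be checked with a small comparison. The pattern without \b is only a variation added here for contrast and is not part of the original answer.

```python
import re

i = "05/24/2019  04:33 PM               582 atm1.py"

with_boundaries = re.search(r"\b\d+\b.*\b\d+\b", i)
without_boundaries = re.search(r"\d+.*\d+", i)

# With \b, the "1" in "atm1" is rejected because a letter precedes it,
# so the match ends at "582".
print(repr(with_boundaries.group()))

# Without \b, the greedy .* runs on to that "1",
# so the match ends at "atm1".
print(repr(without_boundaries.group()))
```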

