Split a string into 2 strings using a regular expression

Question

Good morning. I have a question: using web scraping, I extract a piece of information in string format like this:

"Issued May2018No expiration date"

What I want is to split this string into 2 strings using a regular expression. My idea is: wherever 4 digits are followed by "No", I want to produce the following string:

"Issued May2018 - No expiration date".

That way, I can apply the "split" method on "-" and end up with two strings:

Issued May2018
No expiration date

I was thinking of using a regex like

\d\d\d\dNo

which should recognise 2018No, but I don't know how to proceed so that I can replace it with

May2018 - No expiration date

and set the stage for using the split function.

Any suggestions? Other approaches are welcome.

Answer 1

You can use a capture group to capture the 4 digits, followed by matching No.

In the replacement, use the value of capture group 1 followed by - No.

import re

s = "Issued May2018No expiration date"
pattern = r"(\d{4})No "
print(re.sub(pattern, r"\1 - No ", s))

Output

Issued May2018 - No expiration date

See a Python demo and a regex demo.
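As a follow-up sketch, the substituted string can then be split on the inserted separator to get the two parts the question asks for. The helper name `split_issued` is made up for illustration, and passing a maxsplit of 1 is an assumption that guards against a later " - " appearing in the text.

```python
import re

def split_issued(text):
    # Hypothetical helper: insert " - " between the 4-digit year and "No",
    # then split once on that separator.
    marked = re.sub(r"(\d{4})No ", r"\1 - No ", text)
    return marked.split(" - ", 1)

parts = split_issued("Issued May2018No expiration date")
print(parts)  # ['Issued May2018', 'No expiration date']
```

If the pattern does not occur, the input comes back unchanged as a one-element list, so it is worth checking the length of the result before unpacking it.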

Answer 2

Use re.sub.

In the string passed to the repl parameter of re.sub(), \g<1> refers to the text matched by capture group 1.

import re

s = "Issued May2018No expiration date"
print(re.sub(r"(\d{4})(No)", r"\g<1> - \g<2>", s))

# 'Issued May2018 - No expiration date'
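To show how \g<1> relates to the shorter \1 form, here is a small sketch. The second example (the "5x" input and its pattern) is invented purely for illustration and is not part of the question.

```python
import re

s = "Issued May2018No expiration date"

# Both spellings refer to capture group 1 (and 2) and give the same result here.
with_backslash = re.sub(r"(\d{4})(No)", r"\1 - \2", s)
with_g = re.sub(r"(\d{4})(No)", r"\g<1> - \g<2>", s)
print(with_backslash == with_g)  # True

# When a literal digit follows the reference, only \g<1> is unambiguous:
# r"\10" would be read as a reference to group 10, not group 1 plus "0".
print(re.sub(r"(\d)x", r"\g<1>0", "5x"))  # 50
```

For this question either form works; \g<1> mainly pays off when the replacement text continues with a digit.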

Answer 3

import re

string = "Issued May2018No expiration date"

m = re.findall(r"^(.*[0-9]{4})(No.*)$", string)

print(m[0][0] + " - " + m[0][1])

->

Issued May2018 - No expiration date
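Since the two groups already hold the two halves, the intermediate " - " can be skipped if the goal is just the two strings. A minimal sketch, assuming the input has no line breaks; `split_at_year` is a hypothetical name, and the `re.split` variant relies on splitting at empty matches, which needs Python 3.7 or later.

```python
import re

def split_at_year(text):
    # Hypothetical helper: return (before, after), cutting right after a
    # 4-digit run that is followed by "No", or None if there is no such spot.
    m = re.fullmatch(r"(.*\d{4})(No.*)", text)
    return m.groups() if m else None

print(split_at_year("Issued May2018No expiration date"))
# ('Issued May2018', 'No expiration date')

# Alternative: split on the empty position between the year and "No"
# using a lookbehind and a lookahead (Python 3.7+).
print(re.split(r"(?<=\d{4})(?=No)", "Issued May2018No expiration date"))
# ['Issued May2018', 'No expiration date']
```

Returning None on a failed match avoids the IndexError that m[0] would raise when findall comes back empty.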

3 answers

Answer 1: 2022-01-10 14:16:11
Answer 2: 2022-01-10 14:19:20
Answer 3: 2022-01-10 14:21:08
