Using a regular expression in Python to replace the symbol preceding a match

Question

I have strings such as:

text1 = ('SOME STRING,99,1234 FIRST STREET,9998887777,ABC')
text2 = ('SOME OTHER STRING,56789 SECOND STREET,6665554444,DEF')
text3 = ('ANOTHER STRING,#88,4321 THIRD STREET,3332221111,GHI')

Desired output:

SOME STRING 99,1234 FIRST STREET,9998887777,ABC
SOME OTHER STRING,56789 SECOND STREET,6665554444,DEF
ANOTHER STRING #88,4321 THIRD STREET,3332221111,GHI

My idea: use a regex to find a run of 1-5 digits, possibly preceded by a symbol, that sits between two commas and is not followed by a space and letters, then replace it with the same match minus the preceding comma. Something like:

text.replace(r'(,\d{0,5},)','.........')

Answer 1

If you use the third-party regex module instead of re, then possibly:

import regex
# Named "text" rather than "str" to avoid shadowing the built-in str type.
text = "ANOTHER STRING,#88,4321 THIRD STREET,3332221111,GHI"
print(regex.sub(r'(?<!^.*,.*),(?=#?\d+,\d+)', ' ', text))

You might be able to use re if you are sure that no other substring matches the pattern in the lookahead.

import re
# Named "text" rather than "str" to avoid shadowing the built-in str type.
text = "ANOTHER STRING,#88,4321 THIRD STREET,3332221111,GHI"
print(re.sub(r',(?=#?\d+,\d+)', ' ', text))
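To check the re-only version against all three strings from the question, here is a minimal sketch. The `count=1` argument and the `'NAME,12,34,56,78'` input are my own additions for illustration, not part of the answer; they show how limiting the substitution to the first match may help when later fields also fit the lookahead.

```python
import re

# Same lookahead pattern as in the answer above.
PATTERN = re.compile(r',(?=#?\d+,\d+)')

texts = [
    'SOME STRING,99,1234 FIRST STREET,9998887777,ABC',
    'SOME OTHER STRING,56789 SECOND STREET,6665554444,DEF',
    'ANOTHER STRING,#88,4321 THIRD STREET,3332221111,GHI',
]

results = [PATTERN.sub(' ', t) for t in texts]
for r in results:
    print(r)

# Hypothetical input in which several commas satisfy the lookahead.
tricky = 'NAME,12,34,56,78'
all_matches = PATTERN.sub(' ', tricky)           # replaces every matching comma
first_only = PATTERN.sub(' ', tricky, count=1)   # replaces only the first match
print(all_matches)
print(first_only)
```

Whether `count=1` is appropriate depends on the data: it assumes the first comma that fits the lookahead is always the one to replace.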

Answer 2

An easier-to-read alternative, if SOME STRING, SOME OTHER STRING, and ANOTHER STRING never contain commas:

text1.replace(",", " ", 1)

which just replaces the first comma with a space.
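A quick check of this approach on the three strings from the question, as a sketch. Note that text2 has no short numeric field, so replacing its first comma gives a result that differs from the desired output in the question; this approach seems to fit only when every input carries the extra field.

```python
text1 = 'SOME STRING,99,1234 FIRST STREET,9998887777,ABC'
text2 = 'SOME OTHER STRING,56789 SECOND STREET,6665554444,DEF'
text3 = 'ANOTHER STRING,#88,4321 THIRD STREET,3332221111,GHI'

# The third argument of str.replace caps the number of replacements at one.
out1 = text1.replace(",", " ", 1)
out2 = text2.replace(",", " ", 1)
out3 = text3.replace(",", " ", 1)

print(out1)  # matches the desired output
print(out2)  # first comma is replaced here too, unlike the desired output
print(out3)  # matches the desired output
```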

Answer 3

Simple, yet effective:

import re

my_pattern = r"(,)(\W?\d{0,5},)"

p = re.compile(my_pattern)

p.sub(r" \2", text1) # 'SOME STRING 99,1234 FIRST STREET,9998887777,ABC'
p.sub(r" \2", text2) # 'SOME OTHER STRING,56789 SECOND STREET,6665554444,DEF'
p.sub(r" \2", text3) # 'ANOTHER STRING #88,4321 THIRD STREET,3332221111,GHI'

A second pattern, using a non-capturing group and verbose compilation:

my_pattern = r"""
    (?:,)           # Non-capturing group for single comma.
    (\W?\d{0,5},)   # Capture zero or one non-word character, zero to five digits, and a comma
"""

# re.X (verbose mode) ignores whitespace in the pattern and allows inline comments
p = re.compile(my_pattern, flags = re.X)

# This time we use \1 to reference the first (and only) captured group
p.sub(r" \1", text1)
p.sub(r" \1", text2)
p.sub(r" \1", text3)
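The question asks for 1-5 digits, while `\d{0,5}` also accepts zero digits, so an empty field such as `,,` would match as well. One possible tightening, offered as a sketch only: the helper name `merge_short_field`, the `STRICT` variant, and the `'A,,B'` input are hypothetical and not from the answer.

```python
import re

# Pattern from the answer, written without a group around the leading comma.
LOOSE = re.compile(r",(\W?\d{0,5},)")
# Assumed variant: require at least one digit in the short field.
STRICT = re.compile(r",(\W?\d{1,5},)")


def merge_short_field(text):
    """Replace the comma before a short numeric field with a space."""
    return STRICT.sub(r" \1", text)


text1 = 'SOME STRING,99,1234 FIRST STREET,9998887777,ABC'
text2 = 'SOME OTHER STRING,56789 SECOND STREET,6665554444,DEF'
text3 = 'ANOTHER STRING,#88,4321 THIRD STREET,3332221111,GHI'

for t in (text1, text2, text3):
    print(merge_short_field(t))

# Hypothetical input with an empty field between two commas.
loose_result = LOOSE.sub(r" \1", 'A,,B')    # the empty field still matches
strict_result = merge_short_field('A,,B')   # left untouched
print(loose_result)
print(strict_result)
```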

3 answers

Answer 1
1 Accepted 2020-07-02 17:24:26

Answer 2
0 2020-07-02 17:46:35

Answer 3
0 2020-07-02 19:11:58