Regex to capture and replace all digits in a string, except within a special pattern

Question

I have a text in which digits appear in every possible way. For example,

text = "hello23 the2e are 13 5.12apples *specially_x00123 named 31st"

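I want to replace every digit with '#', except for digits inside a special pattern that starts with *, followed by a word, an underscore, a single letter and then digits, i.e. \*\w+_[a-z]\d+ (for example *specially_x00123).

I tried lookaround syntax and non-capturing groups, but I could not find a way to turn the text into exactly the following: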

text_cleaned = "hello## the#e are ## #.##apples *specially_x00123 named ##st"

I tried a pattern like this:

p1 = r'\d(?<!\*\w+_\w+)'

But then it complains: "look-behind requires fixed-width pattern".

I also tried a non-capturing group:
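
The complaint comes from Python's built-in re module, which only accepts lookbehind assertions that match a fixed number of characters; \w+ has no fixed length. A minimal sketch that reproduces this (the helper name compiles is made up for illustration):

```python
import re

def compiles(pattern):
    """Return True if `pattern` is accepted by Python's re module."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False

# Variable-width lookbehind (\w+ can match any length): rejected.
print(compiles(r'\d(?<!\*\w+_\w+)'))
# Fixed-width lookbehind (exactly one character): accepted.
print(compiles(r'\d(?<!\*)'))
```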

p2 = r'(?:\*[a-z]+_\w+)\b|\d'

It matches the special token (*specially_x00123) as well as all the digits. I think this could be part of the solution, but I cannot figure out how to use it. Any ideas?

Answer 1

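What you could do is capture the digit in a capturing group (\d) and use a callback in the replacement that checks for the first capturing group.

If group 1 is present, replace with #; otherwise return the match unchanged.

As \w+ also matches an underscore, you could use the negated character class [^\W_\n]+ to match word characters except the underscore: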
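
To illustrate the difference between the two character classes, here is a small sketch on the token from the question (the variable names are only for illustration):

```python
import re

token = "specially_x00123"

# \w+ is greedy and includes "_", so it swallows the whole token.
with_w = re.match(r"\w+", token).group()
print(with_w)              # specially_x00123

# [^\W_\n]+ stops at the first underscore.
without_underscore = re.match(r"[^\W_\n]+", token).group()
print(without_underscore)  # specially
```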

\*[^\W_\n]+_[a-z]\d+\b|(\d)

Regex demo | Python demo

import re
text = "hello23 the2e are 13 5.12apples *specially_x00123 named 31st"
# Group 1 is set only when a bare digit matched; special tokens match the first branch.
pattern = r"\*[^\W_\n]+_[a-z]\d+\b|(\d)"
print(re.sub(pattern, lambda x: "#" if x.group(1) else x.group(), text))

Result

hello## the#e are ## #.##apples *specially_x00123 named ##st
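
If this is needed in more than one place, the same idea could be wrapped in a small function. This is only a sketch: the name mask_digits and its mask parameter are hypothetical, and the test string with a second special token is made up to show that several tokens are preserved.

```python
import re

# Special tokens are matched first and left alone; any other digit lands in group 1.
PATTERN = re.compile(r"\*[^\W_\n]+_[a-z]\d+\b|(\d)")

def mask_digits(text, mask="#"):
    """Replace digits with `mask`, except inside *word_x123 style tokens."""
    def repl(m):
        # group(1) is None when the special-token branch matched.
        return mask if m.group(1) is not None else m.group()
    return PATTERN.sub(repl, text)

print(mask_digits("a1 *specially_x00123 b22 *other_y7 0"))
# a# *specially_x00123 b## *other_y7 #
```

Checking group(1) against None rather than relying on truthiness makes the intent explicit, although a matched digit such as "0" is a non-empty string and therefore truthy either way.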

Answer 2

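One option might be to split the string into the part before the star and the part after it. The expression (\d) captures each digit before the star, which we can simply replace with #, and then join it with $2: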

(\d)|(\*.*)

Test

# coding=utf8
# the above tag defines encoding for this document and is for Python 2.x compatibility

import re

regex = r"(\d)|(\*.*)"

test_str = ("hello23 the2e are 13 5.12apples *specially_x00123 named\n\n"
    "hello## the#e are ## #.##apples *specially_x00123 named")

subst = "#\\2"

# You can limit the number of replacements by changing the count argument (0 means no limit)
result = re.sub(regex, subst, test_str, count=0, flags=re.MULTILINE)

if result:
    print(result)

# Note: for Python 2.7 compatibility, use ur"" to prefix the regex and u"" to prefix the test string and substitution.
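
Note that on the original input from the question, the template substitution above does not quite give the desired text in Python 3: the star match itself is prefixed with #, and digits after the star (31st) stay as they are because (\*.*) consumes the rest of the line. The sketch below shows this, along with a callback variant that avoids the extra #, assuming everything from the star to the end of the line should be kept verbatim:

```python
import re

text = "hello23 the2e are 13 5.12apples *specially_x00123 named 31st"
regex = r"(\d)|(\*.*)"

# Plain template substitution: every match, including the star part, gets a "#".
with_template = re.sub(regex, r"#\2", text)
print(with_template)
# hello## the#e are ## #.##apples #*specially_x00123 named 31st

# Callback: mask only when group 1 (a digit) matched, otherwise keep the match.
with_callback = re.sub(regex, lambda m: "#" if m.group(1) else m.group(), text)
print(with_callback)
# hello## the#e are ## #.##apples *specially_x00123 named 31st
```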

regex101.com

const regex = /(\d)|(\*.*)/gm;
const str = `hello23 the2e are 13 5.12apples *specially_x00123 named`;
const subst = `#$2`;

// The substituted value will be contained in the result variable
const result = str.replace(regex, subst);

console.log('Substitution result: ', result);

Answer 1: 3 votes, accepted, 2019-05-25 06:53:50

Answer 2: 0 votes, 2019-05-25 01:35:54
