Replacing a specific string in a file with a regular expression in Python

Question

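I am tagging a file with Stanford NER and want to replace every "O" tag with "NONE". I have tried the code below, but it produces the wrong output: it replaces every "O" in the string, not just the tags. I am not familiar with regular expressions and don't know which pattern fits my problem. TIA.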

Here is my code:

    import re
    tagged_text = st.tag(per_word(input_file))
    string_type = "\n".join(" ".join(line) for line in tagged_text)

    for line in string_type:
        output_file.write (re.sub('O$', 'NONE', line))

Sample input:

    Tropical O
    Storm O
    Jolina O
    affects O
    2,000 O
    people O
    MANILA LOCATION
    , O
    Philippines LOCATION
    – O
    Initial O
    reports O
    from O
    the O

Output:

Tropical NONE
Storm NONE
Jolina NONE
affects NONE
2,000 NONE
people NONE
MANILA LNONECATINONEN
, NONE
Philippines LNONECATINONEN
– NONE
Initial NONE
reports NONE
from NONE
the NONE

Answer 1

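You don't need to iterate over string_type. Iterating over a string yields one character at a time, so every "O" is treated as its own "line" and matches O$. Applying re.sub directly to the whole string should work: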

    s = """Tropical O
    Storm O
    Jolina O
    affects O
    2,000 O
    people O
    MANILA LOCATION
    , O
    Philippines LOCATION
    – O
    Initial O
    reports O
    from O
    the O"""

    import re
    print(re.sub(r"\bO(?=\n|$)", "NONE", s))

This gives:

    Tropical NONE
    Storm NONE
    Jolina NONE
    affects NONE
    2,000 NONE
    people NONE
    MANILA LOCATION
    , NONE
    Philippines LOCATION
    – NONE
    Initial NONE
    reports NONE
    from NONE
    the NONE

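Here \bO(?=\n|$) matches a standalone letter O (the \b word boundary keeps it from matching inside a word such as LOCATION) that is followed by a newline character \n or by the end of the string ($).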
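As a minimal sketch of why the original loop misbehaved and of two equivalent fixes, the snippet below uses a shortened, made-up sample string (sample is an assumption, not the asker's real data). It first reproduces the bug by iterating over the string character by character, then applies the substitution per real line, and finally does the same in one call with re.MULTILINE, which makes $ match before every newline.

```python
import re

# Shortened stand-in for the tagged text (hypothetical sample data).
sample = "MANILA LOCATION\n, O\nthe O"

# Iterating over a str yields single characters, so each "O" is a
# one-character string on its own and matches 'O$'.
broken = "".join(re.sub("O$", "NONE", ch) for ch in sample)

# Splitting into real lines first lets '$' anchor at the end of each line.
per_line = "\n".join(
    re.sub(r"\bO$", "NONE", line) for line in sample.splitlines()
)

# Single call on the whole string: re.MULTILINE makes '$' match
# before each newline as well as at the end of the string.
multiline = re.sub(r"\bO$", "NONE", sample, flags=re.MULTILINE)

print(broken)
print(per_line)
print(multiline)
```

The first print shows the same kind of damage as in the question (LOCATION turned into LNONECATINONEN), while the other two leave LOCATION intact and only rewrite the trailing O tags.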

Solution 1
1 Accepted 2017-10-14 03:21:38
