
Replace words that don't start with a certain prefix using sed

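I want to use sed to replace every instance of word that does not already have the prefix pre with preword. So an existing preword should not be replaced, while a standalone word should be replaced with preword.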

I tried the usual negative lookbehind regex, like this:

sed -E -i 's/(?<!pre)word/preword/g'

But this gives me the error:

sed: -e expression #1, char 22: Invalid preceding regular expression

I have read that GNU sed handles regular expressions somewhat differently. What can I do to achieve this?

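sed's regular expressions do not support lookbehind. Instead of trying to match "word" only when it is not preceded by "pre", you can match "pre" optionally and replace it systematically: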

sed -E 's/(pre)?word/preword/g'
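As a rough illustration of why the optional group works, here is a small sketch. The sample input is an assumption made up for this example, not something from the question.

```shell
# Hypothetical sample input, chosen only for illustration.
input='word preword wordpreword'

# "(pre)?" swallows an existing prefix when there is one, so every
# match, prefixed or not, is rewritten to exactly one "preword".
result=$(printf '%s\n' "$input" | sed -E 's/(pre)?word/preword/g')

printf '%s\n' "$result"
# prints: preword preword prewordpreword
```

Because an already prefixed word is replaced by itself, the output is the same as if only the unprefixed occurrences had been touched.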

Another way (more general) is to put everything that is not "pre" into a capture group and write it back in the replacement:

sed -E 's/(^|[^e]|^e|[^r]e|^re|[^p]re)word/\1preword/g'
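The following sketch runs that expression on a made-up input (the sample string is an assumption). It also shows one limitation that seems worth knowing: the capture group consumes the character before word, so with the g flag a second word that directly follows a replaced one is not seen again.

```shell
# Hypothetical sample input, chosen only for illustration.
input='word preword wordword'

# The group matches any left context that does not end in "pre";
# \1 puts that context back in front of the inserted prefix.
result=$(printf '%s\n' "$input" |
  sed -E 's/(^|[^e]|^e|[^r]e|^re|[^p]re)word/\1preword/g')

printf '%s\n' "$result"
# prints: preword preword prewordword
```

In wordword only the first word gets the prefix, because the d that would serve as left context for the second one was already consumed by the previous match. The optional-prefix form (pre)?word does not have this restriction.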

With GNU sed:

sed 's/\bword\b/preword/g' file

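\b is a zero-width word boundary, so this only replaces word where it stands alone as a whole word. The word inside preword is left untouched because there is no boundary between e and w.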
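A short sketch of what the boundary version does and does not match. It assumes GNU sed, since \b is a GNU extension, and the sample input is made up for this example.

```shell
# Hypothetical sample input, chosen only for illustration.
input='word preword wordword word.word'

# \b requires a transition between a word character and a non-word
# character (or the line edge) on both sides of "word".
result=$(printf '%s\n' "$input" | sed 's/\bword\b/preword/g')

printf '%s\n' "$result"
# prints: preword preword wordword preword.preword
```

Note that wordword is left alone as well: neither half is a whole word, so this variant is narrower than "not preceded by pre".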

If you need a complex regular expression, you can also consider writing a small parser instead.

$ cat r.awk
BEGIN {
    re_wrd = "^[A-Za-z]+" # what we consider a word
    re_sep  = "^."        # the rest is a separator
}

function advance() { # sets `tag' and `tok'; eats a part of `line'
    if      (match(line, re_wrd)) tag = "wrd"
    else if (match(line, re_sep)) tag = "sep"
    tok  = substr(line, 1,          RLENGTH)
    line = substr(line, RLENGTH + 1        )
}

function process_sep() { # copy to output
    ans = ans tok
}

function process_wrd() {
    sub(/^word/, "preword", tok) # replace only at the beginning
    ans = ans tok
}

{
    line = $0; ans = tag = tok = ""
    while (length(line) > 0) {
        advance()
        # uncomment for tracing
        # print tag, "<" tok ">" | "cat 1>&2"
        if      (tag == "sep") process_sep()
        else if (tag == "wrd") process_wrd()
    }
    print ans
}

Usage:

$ echo 'preword...microsoftword word wordword,word.word-preword' | awk -f r.awk
preword...microsoftword preword prewordword,preword.preword-preword

Trace:

wrd <preword>
sep <.>
sep <.>
sep <.>
wrd <microsoftword>
sep < >
wrd <word>
sep < >
wrd <wordword>
sep <,>
wrd <word>
sep <.>
wrd <word>
sep <->
wrd <preword>

