Multi-line search and replace
I have an unstructured file in which I want to search for and replace string patterns.
The file looks like this:
col4 is required to be upper so
make col4 upper
abc 12345 !$% DATA SELECT
col1 as col1,
col2 as col2.
col3,
sch.col4 as col4,
sch.tab.col4 as col4_1,
col4,
col5 FROM sch.tab
xyz 34354 ^&* DATA SELECT
col5 as col5,
col3,
col4,
col4 as col4,
col4 FROM
blah blah blah
I want to replace:
col4, with upper(col4) as col4,
sch.col4 with upper(sch.col4)
sch.tab.col4 with upper(sch.tab.col4)
col4 (if col4 is at the end of the select query) with upper(col4) as col4
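The file is on a Linux server. I tried using sed and awk to narrow down the lines that contain col4, but I could not get any further from there.
I was able to identify one of the patterns with: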
awk '/SELECT/,/FROM/' test_file.txt | awk '/col4/{print $0, NR}' | awk -F AS '{print $1}'
Find the text between SELECT and FROM
Identify the lines that contain col4
Print the first field
And with sed:
sed -n -e '/SELECT/,/FROM/p' -e 's/\(\([a-zA-Z]\{1,\}\.\)\{0,\}\)col4/upper(\0)/g' test_file.txt
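The steps of the awk pipeline above (restrict to SELECT..FROM, then pick out the col4 lines) can also be sketched in Python. This is only an illustration under assumptions: the function name col4_lines and the shortened sample list are made up here, and the block boundaries are detected by a plain substring test, mirroring awk's /SELECT/,/FROM/ range.

```python
import re

def col4_lines(lines):
    """Return (line_number, text) for lines mentioning col4 inside SELECT..FROM."""
    hits = []
    in_block = False
    for number, line in enumerate(lines, start=1):
        if "SELECT" in line:          # a block starts on the SELECT line
            in_block = True
        if in_block and re.search(r"\bcol4\b", line):
            hits.append((number, line))
        if "FROM" in line:            # the FROM line is the last line of the block
            in_block = False
    return hits

# Shortened, hypothetical excerpt of the file from the question.
sample = [
    "make col4 upper",
    "abc 12345 !$% DATA SELECT",
    "col1 as col1,",
    "sch.col4 as col4,",
    "col4,",
    "col5 FROM sch.tab",
    "blah blah blah",
]

print(col4_lines(sample))
# [(4, 'sch.col4 as col4,'), (5, 'col4,')]
```

The first line mentions col4 but is not reported, because it lies outside any SELECT..FROM block.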
Actual:
col4 is required to be upper so
make col4 upper
abc 12345 !$% DATA SELECT
col1 as col1,
col2 as col2.
col3,
sch.col4 as col4,
sch.tab.col4 as col4_1,
col4,
col5 FROM sch.tab
xyz 34354 ^&* DATA SELECT
col5 as col5,
col3,
col4,
col4 as col4,
col4 FROM
blah blah blah
Expected result:
col4 is required to be upper so
make col4 upper
abc 12345 !$% DATA SELECT
col1 as col1,
col2 as col2.
col3,
upper(sch.col4) as col4,
upper(sch.tab.col4) as col4_1,
upper(col4) as col4,
col5 FROM sch.tab
xyz 34354 ^&* DATA SELECT
col5 as col5,
col3,
upper(col4) as col4,
upper(col4) as col4,
upper(col4) as col4 FROM
blah blah blah
Any help is greatly appreciated!!
I think this gets at least 95% of the way there. Please tell me if there are any errors:
import re

with open('ej.txt', 'r') as file:
    string = file.read()

# A line-leading col4 with an optional schema/table prefix and an optional alias.
col4 = re.compile(r'^((?:\w+\.)*col4\b)( as \w+)?', re.MULTILINE)

def add_upper(m):
    # Keep an existing alias; otherwise add "as col4".
    return 'upper(' + m.group(1) + ')' + (m.group(2) or ' as col4')

# Rewrite only the text between each SELECT and the next FROM.
string = re.sub(r'SELECT.*?FROM',
                lambda block: col4.sub(add_upper, block.group(0)),
                string, flags=re.DOTALL)
print(string)
Here is a short awk script that does what you asked:
awk '/SELECT/,/FROM/ {$0=gensub(/^[^[:space:]]*col4/,"upper(\\0)",-1);}1' input.txt
abc 12345 !$% DATA SELECT
col1 as col1,
col2 as col2.
col3,
upper(sch.col4) as col4,
upper(sch.tab.col4) as col4_1,
upper(col4),
col5 FROM sch.tab
xyz 34354 ^&* DATA SELECT
col5 as col5,
col3,
upper(col4),
upper(col4) as col4,
upper(col4) FROM
blah blah blah
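/SELECT/,/FROM/
An inclusive range, from a line matching /SELECT/ to a line matching /FROM/.
$0=gensub(***)
Update the current line with the result of the gensub() substitution.
/^[^[:space:]]*col4/
Search, at the start of the line, for col4 together with any non-space prefix.
"upper(\\0)",-1
Replace only the first match, turning the matched text into upper(matched text).
1
Print the current line.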
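For readers more at home in Python, the gensub() call can be approximated with re.sub. This is a rough sketch, not the answer's own code: the function name wrap_first_col4 is made up here, \S* stands in for [^[:space:]]*, \g<0> plays the role of \\0 (the whole match), and count=1 limits the substitution to the first match. The SELECT..FROM range handling is deliberately left out.

```python
import re

def wrap_first_col4(line):
    # ^\S*col4 : col4 at the start of the line, with any non-space prefix.
    # \g<0>    : the whole matched text, like \\0 in gensub().
    return re.sub(r"^\S*col4", r"upper(\g<0>)", line, count=1)

for line in ["sch.tab.col4 as col4_1,", "col4,", "col5 as col5,"]:
    print(wrap_first_col4(line))
# upper(sch.tab.col4) as col4_1,
# upper(col4),
# col5 as col5,
```

As in the awk output above, a bare col4 becomes upper(col4) without an "as col4" alias.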
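Your description of the required transformation is incomplete (for example, you say you want to change col4, to upper(col4) as col4, but line 7 of your expected output does not reflect that), so I am setting that aside and just writing something that produces the output you want from the input you provided (using GNU awk for the third argument to match()). Hopefully that is what you really want: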
$ cat tst.awk
/SELECT/ { inBlock=1 }
inBlock {
    if ( match($0,/^((sch\.(tab\.)?)?col4\>)( as .*)/,a) ) {
        $0 = "upper(" a[1] ")" a[4]
    }
    else if ( match($0,/^(col4\>)(.*)/,a) ) {
        $0 = "upper(" a[1] ") as " a[1] a[2]
    }
}
/FROM/ { inBlock=0 }
{ print }

$ awk -f tst.awk file
col4 is required to be upper so
make col4 upper
abc 12345 !$% DATA SELECT
col1 as col1,
col2 as col2.
col3,
upper(sch.col4) as col4,
upper(sch.tab.col4) as col4_1,
upper(col4) as col4,
col5 FROM sch.tab
xyz 34354 ^&* DATA SELECT
col5 as col5,
col3,
upper(col4) as col4,
upper(col4) as col4,
upper(col4) as col4 FROM
blah blah blah
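Where GNU awk is not available, the same flag-based logic could be ported to Python roughly as follows. This is a sketch under assumptions: the names rewrite, WITH_ALIAS and BARE are invented for the port, \b replaces the GNU-specific \> word boundary, and the input is assumed to be already split into lines.

```python
import re

# col4 with an optional sch. / sch.tab. prefix, followed by an existing alias.
WITH_ALIAS = re.compile(r"^((?:sch\.(?:tab\.)?)?col4\b)( as .*)")
# A bare col4 at the start of the line, with whatever follows it.
BARE = re.compile(r"^(col4\b)(.*)")

def rewrite(lines):
    out = []
    in_block = False
    for line in lines:
        if "SELECT" in line:
            in_block = True
        if in_block:
            m = WITH_ALIAS.match(line)
            if m:
                # Keep the existing alias, only wrap the column.
                line = f"upper({m.group(1)}){m.group(2)}"
            else:
                m = BARE.match(line)
                if m:
                    # No alias yet: wrap the column and add "as col4".
                    line = f"upper({m.group(1)}) as {m.group(1)}{m.group(2)}"
        if "FROM" in line:
            in_block = False
        out.append(line)
    return out

block = [
    "col4 is required to be upper so",
    "xyz 34354 ^&* DATA SELECT",
    "col3,",
    "col4,",
    "col4 as col4,",
    "col4 FROM",
]
print("\n".join(rewrite(block)))
```

The first line of block is left alone because the flag is only set once a SELECT line has been seen.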
With sed:
sed '/SELECT/,/FROM/ {s/as col4 *//;s/\([A-Za-z]*\.\)\{0,\}col4/upper(&) as col4/;}' file
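Explanation:
s/as col4 *//
: remove an existing "as col4", to prevent duplication after the second substitution
\([A-Za-z]*\.\)\{0,\}col4
: search for zero or more groups of letters followed by a dot, followed by col4
upper(&) as col4/;
: replace with the new text (using & to insert the matched string)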
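To see what the two s commands do to an individual line, they can be traced in Python. This is only an approximation for inspection: the function name sed_like is made up, the SELECT..FROM address range is omitted, and count=1 mimics sed replacing only the first match when no g flag is given.

```python
import re

def sed_like(line):
    # Step 1: s/as col4 *//  (drop the first "as col4" plus trailing spaces)
    line = re.sub(r"as col4 *", "", line, count=1)
    # Step 2: s/\([A-Za-z]*\.\)\{0,\}col4/upper(&) as col4/
    return re.sub(r"(?:[A-Za-z]*\.)*col4", r"upper(\g<0>) as col4", line, count=1)

for line in ["col4,", "col4 FROM", "sch.col4 as col4,", "sch.tab.col4 as col4_1,"]:
    print(sed_like(line))
# upper(col4) as col4,
# upper(col4) as col4 FROM
# upper(sch.col4) as col4 ,
# upper(sch.tab.col4) as col4 _1,
```

In this trace, lines that already carry an alias come out slightly different from the expected result: a space is left before the comma, and the alias col4_1 is split because step 1 also matches the "as col4" inside "as col4_1". It is worth checking the real sed output for those lines before relying on it.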