How to extract multiple strings from a line using sed regex in Linux and write them to a file?
I have an XML file with multiple lines like the ones below (I only care about the lines that start with SOURCE).
SOURCE BUSINESSNAME ="" DATABASETYPE ="Oracle" DBDNAME ="OrclExp11g" DESCRIPTION ="" NAME ="EMPLOYEES" OBJECTVERSION ="1"
SOURCE BUSINESSNAME ="" DATABASETYPE ="Oracle" DBDNAME ="OrclExp11g" DESCRIPTION ="" NAME ="HR" OBJECTVERSION ="1"
In every line that starts with SOURCE, I need to get 3 strings and write them to another file, like below.
Oracle,OrclExp11g,EMPLOYEES
Oracle,OrclExp11g,HR
sed -n -e '/SOURCE /p' InputFile.XML | sed -r 's/.* NAME \=\"(.+)\" OBJECTVERSION \=\".*/\1/' > $Source_List.Out
I am new to sed, and so far I have only been able to extract one string with it. I would really appreciate it if anyone could show me how to extract all 3 strings. Thanks so much in advance!
As you guessed, sed is your friend. You can refer to the parts of the line matched by each parenthesized group in the regex using \1, \2 and so on in the replacement.
$ sed -nE '/SOURCE/{s/^.*DATABASETYPE ="([^"]*)".*DBDNAME ="([^"]*)".*NAME ="([^"]*)".*$/\1,\2,\3/;p}' file >outputfile
Output
$ cat outputfile
Oracle,OrclExp11g,EMPLOYEES
Oracle,OrclExp11g,HR
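As a self-contained way to try this out, the sketch below builds a small sample file and runs a slightly stricter variant of the command. The file name sample.xml and the extra TARGET line are made up for illustration, and the variant assumes that the lines begin literally with SOURCE, as in the sample from the question. It puts a space before each attribute name so that NAME cannot match inside BUSINESSNAME or DBDNAME, and it uses the p flag on the s command so that a line is printed only when the substitution actually succeeded.

```shell
# Build a small sample input (sample.xml is a hypothetical file name).
cat > sample.xml <<'EOF'
SOURCE BUSINESSNAME ="" DATABASETYPE ="Oracle" DBDNAME ="OrclExp11g" DESCRIPTION ="" NAME ="EMPLOYEES" OBJECTVERSION ="1"
TARGET BUSINESSNAME ="" DATABASETYPE ="Oracle" NAME ="IGNORED" OBJECTVERSION ="1"
SOURCE BUSINESSNAME ="" DATABASETYPE ="Oracle" DBDNAME ="OrclExp11g" DESCRIPTION ="" NAME ="HR" OBJECTVERSION ="1"
EOF

# /^SOURCE /   : act only on lines that start with SOURCE
# " NAME ="    : the leading space keeps BUSINESSNAME and DBDNAME from matching
# trailing p   : print the line only if the substitution was made
result=$(sed -nE '/^SOURCE /s/^.* DATABASETYPE ="([^"]*)".* DBDNAME ="([^"]*)".* NAME ="([^"]*)".*$/\1,\2,\3/p' sample.xml)

printf '%s\n' "$result"
```

With the original form, {s/.../.../;p}, a SOURCE line that does not match the pattern is still printed unchanged; the s///p form skips such lines instead. Which behaviour is preferable depends on whether you want to notice malformed lines in the output.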
Notes
-E enables extended regular expressions.
-n suppresses sed's normal output, so only the lines that you explicitly print with p are printed.
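The effect of -n is easiest to see side by side. The following minimal sketch uses two made-up input lines and compares sed with and without -n; the variable names are only for illustration.

```shell
# Without -n, sed prints every line automatically, so p prints the matching line a second time.
with_autoprint=$(printf 'SOURCE a\nOTHER b\n' | sed '/SOURCE/p')

# With -n, automatic printing is off and only the line printed by p appears.
without_autoprint=$(printf 'SOURCE a\nOTHER b\n' | sed -n '/SOURCE/p')

printf '%s\n---\n%s\n' "$with_autoprint" "$without_autoprint"
```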