Printing lines based on pattern matching in multiple fields using awk
Suppose I have an HTML input line like this:
<li>this is a html input line</li>
I want to filter all such input lines from a file, i.e. those which begin with <li> and end with </li>. My idea was to search for the pattern <li> in the first field and the pattern </li> in the last field using the awk command below:
awk '$1 ~ /\<li\>/ ; $NF ~ /\</li\>/ {print $0}'
but it looks like there is no provision to match two fields at a time, or I am making some syntax mistake. Could you please help me here?
PS: I am working on a Solaris SunOS machine.
There's a lot going wrong in your script on Solaris:
awk '$1 ~ /\<li\>/ ; $NF ~ /\</li\>/ {print $0}'
1. On Solaris, use /usr/xpg4/bin/awk instead of the default awk. There's also nawk, but it has fewer POSIX features (e.g. no support for character classes).
2. \<...\> are gawk-specific word boundaries. There is no awk on Solaris that would recognize those.
3. To test that both conditions hold, put && between them, not ;, which is just the statement terminator in lieu of a newline.
4. The default action when a condition is true is {print $0}, so you don't need to explicitly write that code.
5. / is the awk regexp delimiter, so you do need to escape it in mid-regexp.
6. With the default whitespace field separator, $1 and $NF for your sample input will be <li>this and line</li>, not <li> and </li>.
So if you DID for some reason need to compare multiple fields, you could do:
awk '($1 ~ /^<li>.*/) && ($NF ~ /.*<\/li>$/)'
but this is probably what you really want:
awk '/^<li>.*<\/li>$/'
in which case you could just use grep:
grep '^<li>.*</li>$'
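To see how field splitting and the choice between ; and && play out, here is a small sketch run against a made-up file. The name sample.txt and its three lines are assumptions for illustration only, and the regexps are already written without the word-boundary escapes so that only the ; versus && difference shows. On Solaris, substitute /usr/xpg4/bin/awk for awk.

```shell
# Hypothetical sample input: only the first line both starts with <li>
# and ends with </li>.
cat > sample.txt <<'EOF'
<li>this is a html input line</li>
<li>unterminated item
<p>not a list item</p>
EOF

# With the default field separator, fields are split on whitespace,
# so $1 and $NF carry the neighbouring words along with the tags.
fields=$(awk 'NR == 1 { print $1; print $NF }' sample.txt)
printf '%s\n' "$fields"      # <li>this, then line</li>

# ';' separates two independent rules: a line matching only the first
# condition is still printed, and a line matching both is printed twice.
separate=$(awk '$1 ~ /^<li>/ ; $NF ~ /<\/li>$/ { print $0 }' sample.txt)
printf '%s\n' "$separate"    # 3 lines: the first input line twice, then the second

# '&&' requires both conditions to hold on the same line.
both=$(awk '($1 ~ /^<li>/) && ($NF ~ /<\/li>$/)' sample.txt)
printf '%s\n' "$both"        # only the first input line
```

The middle command is the interesting one: nothing is reported as an error, the script simply answers a different question than the one intended.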
Why not just use a regex to match the start and end of the line, like:
awk '/^[[:space:]]*<li>.*<\/li>[[:space:]]*$/ {print}'
though in general, if you're trying to process HTML you'll be better off using a tool that's really designed to handle that.
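As a quick check of what the optional [[:space:]]* parts buy you, this sketch compares a strictly anchored pattern with the whitespace-tolerant one on indented input. The file name indented.txt and its contents are invented for the example, and it assumes an awk that supports POSIX character classes (on Solaris that would be /usr/xpg4/bin/awk rather than nawk).

```shell
# Hypothetical input: the second item is indented and has trailing spaces,
# the third has no closing tag.
printf '%s\n' \
    '<li>flush left</li>' \
    '   <li>indented item</li>  ' \
    '<li>no closing tag' > indented.txt

# Strict anchors: the indented line is rejected.
strict=$(awk '/^<li>.*<\/li>$/' indented.txt)
printf '%s\n' "$strict"

# Whitespace-tolerant anchors: both complete items match.
# Matches are counted rather than printed so that the trailing spaces
# of the second line do not obscure the result.
tolerant=$(awk '/^[[:space:]]*<li>.*<\/li>[[:space:]]*$/ { n++ } END { print n + 0 }' indented.txt)
printf '%s\n' "$tolerant"
```

Whether the tolerant form is the right choice depends on how the file is laid out; if every item is known to start in column one, the strict form is enough.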