sed/awk regex with capture - how to get a somewhat simple regex to work?

Sample contents of FILE.txt are shown below. How would I modify the regex used by sed to do a capture that produces the output shown in the Desired Output section? I would prefer to do this with POSIX awk or sed if possible. I've looked into doing this solely with awk, but I don't follow how to get the same behavior as a capture with the options it provides.

One of the problems I've run into while trying various solutions is how to make the double quotes optional.

sed -e 's/.Include .*"*\(.*\)"*/\1/g' FILE.txt


FILE.txt
##########################################################################
# Indexes Includes FollowSymLinks SymLinksifOwnerMatch ExecCGI Multiviews
# Options MultiViews Indexes SymLinksIfOwnerMatch IncludesNoExec
# Possible values include: debug, info, notice, warn, error, crit,
# does not include the trailing slash. 
AddOutputFilter INCLUDES .shtml .html
    Options -Indexes FollowSymLinks Includes
LoadModule include_module modules/mod_include.so
Include /opt/file.conf
Include "/opt/file.conf"
Include /usr/bin/abcOutput.conf
Include /usr/bin/ed_Output.conf
###########################################################################


**Desired Output:**
/opt/file.conf
/opt/file.conf
/usr/bin/abcOutput.conf
/usr/bin/ed_Output.conf

Use extended regular expressions to avoid the unnecessary \( and \), and use [[:blank:]] as a character class that matches space and tab. The ? indicates 0 or 1 matches. The + indicates 1 or more.

sed -rn 's/^Include[[:blank:]]+"?([^"]+)"?/\1/p' FILE.txt

Note: The -E option for extended regular expressions (instead of -r) is the more portable spelling. It is accepted by both GNU sed and BSD/macOS sed, whereas -r originated as a GNU extension.
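Since the question asks for POSIX tools, the same substitution can also be written with basic regular expressions, which need neither -r nor -E. The following is only a sketch under a few assumptions: the `sample` variable is a trimmed copy of FILE.txt from the question, `bre_out` is a name introduced here for illustration, and the pattern is anchored with `$` on the assumption that nothing follows the path on an Include line.

```shell
# Sample input, trimmed from FILE.txt in the question.
sample='# does not include the trailing slash.
LoadModule include_module modules/mod_include.so
Include /opt/file.conf
Include "/opt/file.conf"
Include /usr/bin/ed_Output.conf'

# POSIX BRE: \{1,\} stands in for +, \{0,1\} stands in for ?,
# and \( \) delimit the capture that \1 refers to.
# -n together with the p flag prints only lines where the substitution succeeded.
bre_out=$(printf '%s\n' "$sample" |
  sed -n 's/^Include[[:blank:]]\{1,\}"\{0,1\}\([^"]*\)"\{0,1\}$/\1/p')

printf '%s\n' "$bre_out"
```

The interval forms `\{1,\}` and `\{0,1\}` are wordier than `+` and `?`, but they are part of the basic regular expression syntax that every POSIX sed is expected to understand.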

NOTE: Assuming you don't want the '#############' strings in the output, and based solely on the example you've provided ...

How about an awk/sed combo:

$ awk '/^Include/ { print $2 } ' FILE.txt | sed 's/\"//g'
/opt/file.conf
/opt/file.conf
/usr/bin/abcOutput.conf
/usr/bin/ed_Output.conf

Perhaps not as efficient as a single sed command, but easier to understand and maintain (KISS), and unless you're calling this kind of construct a LOT the performance difference will be negligible.

Or a purely awk example (and a bit more efficient than the awk/sed idea):

$ awk '/^Include/ { gsub("\"","") ; print $2 } ' FILE.txt
/opt/file.conf
/opt/file.conf
/usr/bin/abcOutput.conf
/usr/bin/ed_Output.conf
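On the question of how to imitate a capture in awk: POSIX awk's sub() and gsub() have no \1 backreferences (GNU awk adds gensub() for that), so one common workaround is to copy the line and strip away everything around the part you want. The sketch below takes that approach. The `sample` and `awk_out` names are introduced here for illustration, and the line containing a space inside the quotes is a hypothetical addition that is not in the original FILE.txt; it is there to show a case where printing $2 would cut the path short.

```shell
# Sample input: lines from FILE.txt plus one hypothetical quoted path with a space.
sample='AddOutputFilter INCLUDES .shtml .html
Include /opt/file.conf
Include "/opt/file.conf"
Include "/opt/my dir/file.conf"
Include /usr/bin/abcOutput.conf'

awk_out=$(printf '%s\n' "$sample" | awk '
  /^Include[ \t]/ {
    path = $0
    sub(/^Include[ \t]+"?/, "", path)  # drop the keyword, the blanks, and an optional opening quote
    sub(/"?[ \t]*$/, "", path)         # drop an optional closing quote and any trailing blanks
    print path
  }')

printf '%s\n' "$awk_out"
```

Because the path is taken from the whole line instead of from the second whitespace-separated field, a quoted path that contains spaces is kept in one piece.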
