Regex: Ending with a specific thing

I am trying to find an expression that begins with Hello and ends in one of two ways: either with nothing after the "Hello", or, if anything follows, with that trailing text introduced by "//". After the //, anything goes.

I tried: grep '^Hello(//.*)?$' but this does not work.
There is something wrong with the last part: (//.*)?$

Sample Input:
Hello
Hello blah
Hi
Hello //
Hello // blah blah
Hello //blah

Sample Output using egrep:
Hello
Hello //
Hello // blah blah
Hello //blah

This is pretty straightforward with egrep:

egrep '^Hello(\s*\/\/.*)?$' input.txt

That is:

  • ^ ... - Force the match to start at the beginning of the line.
  • Hello - Match the required literal text Hello.
  • (\s* ... ) - Allow optional whitespace to follow Hello. (\s is a GNU extension; [[:space:]] is the portable POSIX spelling.)
  • ( ... \/\/ ... ) - Match the two forward slashes. The backslashes are not actually required: / has no special meaning in grep regexes, and the single quotes already protect the pattern from the shell.
  • ( ... .*) - Allow anything after the slashes.
  • ( ... )? - The question mark makes the parenthesized part optional.
  • ... $ - Force the regex to match only if it consumes everything through the end of the line.
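
The breakdown above can be checked against the sample input. This is only a minimal sketch: it swaps in the POSIX class [[:space:]] for \s and leaves the slashes unescaped, on the assumption that portability beyond GNU grep matters, and the variable names input and matches are illustrative rather than part of the original answer.

```shell
# Sample input from the question (variable names are illustrative).
input='Hello
Hello blah
Hi
Hello //
Hello // blah blah
Hello //blah'

# ^Hello                 literal Hello at the start of the line
# ([[:space:]]*//.*)?    optional group: whitespace, two slashes, then anything
# $                      the match must reach the end of the line
# "|| true" keeps the script going even if grep finds nothing.
matches=$(printf '%s\n' "$input" | grep -E '^Hello([[:space:]]*//.*)?$' || true)

printf '%s\n' "$matches"
```

With the sample input this should print the four expected lines and skip "Hello blah" and "Hi".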

You were using grep instead of egrep. Plain grep uses basic regular expressions, a simpler syntax in which some of the operators you might like to use are not special. Notably, in plain grep, unescaped parentheses and ? are just plain characters, not meta-characters for grouping and optional matching, so plain grep was searching for a literal (, ) and ? in your file. When in doubt, prefer egrep.

(And yes, for the pedantic folks in the audience, egrep is indeed just an alternate name for grep -E or grep --extended-regexp, but it's much easier to remember and type egrep than either of the other two "native" forms. Note, though, that recent GNU grep releases print a warning that egrep is obsolescent, so grep -E is the safer spelling in scripts.)
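
To make the difference concrete, here is a small sketch that runs the pattern from the question in both modes. The variable names are illustrative. It also shows that, even in extended mode, the original pattern matches only the bare Hello line, because it leaves no room for the whitespace before the //.

```shell
# Sample input from the question (variable names are illustrative).
input='Hello
Hello blah
Hi
Hello //
Hello // blah blah
Hello //blah'

# Basic regex (plain grep): ( ) and ? are literal characters,
# so the pattern looks for the text "Hello(//...)?" and finds nothing.
bre_out=$(printf '%s\n' "$input" | grep '^Hello(//.*)?$' || true)

# Extended regex: ( ) group and ? makes the group optional.
# Only the bare "Hello" line matches, since the pattern has no whitespace before //.
ere_out=$(printf '%s\n' "$input" | grep -E '^Hello(//.*)?$' || true)

printf 'basic regex matches:    [%s]\n' "$bre_out"
printf 'extended regex matches: [%s]\n' "$ere_out"
```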

Given:

$ echo "$txt" 
Hello
Hello blah
Hi
Hello //
Hello // blah blah
Hello //blah

With grep:

$ echo "$txt" | grep -E '^Hello$|^Hello[[:space:]]+//'
Hello
Hello //
Hello // blah blah
Hello //blah

Or with awk:

$ echo "$txt" | awk '/^Hello$/ || /^Hello[[:space:]]+\/\//'
Hello
Hello //
Hello // blah blah
Hello //blah

Or, if you want to require at least one non-whitespace character immediately after the //:

$ echo "$txt" | grep -E '^Hello$|^Hello[[:space:]]+//[^[:space:]]+'
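
A short sketch of what that stricter pattern accepts on the sample input. The second pattern, named loose here, is not from the original answer: it is an assumed variant that allows whitespace right after the // as long as some non-whitespace text follows later on the line.

```shell
# Sample input from the question (variable names are illustrative).
input='Hello
Hello blah
Hi
Hello //
Hello // blah blah
Hello //blah'

# Pattern from the answer: a non-whitespace character must follow // directly.
strict=$(printf '%s\n' "$input" | grep -E '^Hello$|^Hello[[:space:]]+//[^[:space:]]+' || true)

# Assumed variant: anything may follow //, provided a non-whitespace character appears.
loose=$(printf '%s\n' "$input" | grep -E '^Hello$|^Hello[[:space:]]+//.*[^[:space:]]' || true)

printf 'strict:\n%s\n\n' "$strict"
printf 'loose:\n%s\n' "$loose"
```

On this input the strict pattern keeps only "Hello" and "Hello //blah", while the variant also keeps "Hello // blah blah"; both reject the bare "Hello //".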

Use the "match only whole lines" option of egrep (-x).
Then look for optional whitespace ([[:space:]]*), two escaped slashes (\/\/), followed by anything or nothing (.*).
Use parentheses with the optional specifier ((...)?) to allow the special ending without requiring it.

egrep -x "Hello([[:space:]]*\/\/.*)?"

Another awk proposal. The first part, /^Hello$/, matches a line consisting of a single Hello; the second part matches // anywhere in the line. In either case awk prints the entire line. Note that the second pattern is not anchored, so it also prints lines containing // that do not start with Hello.

awk '/^Hello$/||/\/\//' file

Hello
Hello //
Hello // blah blah
Hello //blah
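
The second pattern in the awk one-liner above is not anchored to Hello, which is harmless for the sample input but matters on other data. The sketch below illustrates this with one made-up extra line; the anchored pattern is an assumed alternative, not part of the original answer, and it uses [ \t] rather than [[:space:]] on the assumption that some older awk implementations lack POSIX character classes.

```shell
# Illustrative input: the middle line is made up to show the difference.
input='Hello
Goodbye // not a Hello line
Hello // blah blah'

# One-liner from the answer: prints any line that contains //.
loose=$(printf '%s\n' "$input" | awk '/^Hello$/ || /\/\//')

# Assumed anchored alternative: Hello, then optionally blanks, //, and anything.
anchored=$(printf '%s\n' "$input" | awk '/^Hello([ \t]*\/\/.*)?$/')

printf 'unanchored:\n%s\n\n' "$loose"
printf 'anchored:\n%s\n' "$anchored"
```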
