
Grep for lines not beginning with “//”

I'm trying but failing to write a regex to grep for lines that do not begin with "//" (i.e. C++-style comments). I'm aware of the "grep -v" option, but I am trying to learn how to pull this off with regex alone. I've searched and found various answers on grepping for lines that don't begin with a character, and even one on how to grep for lines that don't begin with a string, but I'm unable to adapt those answers to my case, and I don't understand what my error is.

> cat bar.txt
hello
//world
> cat bar.txt | grep "(?!\/\/)"
-bash: !\/\/: event not found

I'm not sure what this "event not found" is about. One of the answers I found used paren-question mark-exclamation-string-paren, which I've done here, and which still fails.

> cat bar.txt | grep "^[^\/\/].+"
(no output)

Another answer I found used a caret within square brackets and explained that this syntax meant "search for the absence of what's in the square brackets (other than the caret)". I think the ".+" means "one or more of anything", but I'm not sure if that's correct and, if it is correct, what distinguishes it from ".*".

In a nutshell: how can I construct a regex to pass to grep to search for lines that do not begin with "//" ?

To be even more specific, I'm trying to search for lines that have "#include" that are not preceded by "//".

Thank you.

  1. You're using a negative lookahead, which is a PCRE feature and requires the -P option.
  2. Your negative lookahead won't work without a start anchor (^).
  3. This will, of course, require GNU grep.
  4. You must use single quotes to use ! in your regex; otherwise bash attempts history expansion with the text after the !, which is the reason for the !\/\/: event not found error.

So you can use:

grep -P '^(?!\h*//)' file
hello

\h* matches zero or more horizontal whitespace characters.
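To see why the start anchor matters, here is a small sketch that runs the lookahead with and without ^. The temporary sample file and the variable names are made up for illustration, and the -P probe is there because not every grep build supports PCRE.

```shell
# Sample input: the two lines from the question plus an indented comment.
tmp=$(mktemp)
printf '%s\n' 'hello' '//world' '  // indented comment' > "$tmp"

# -P needs GNU grep built with PCRE support, so probe for it first.
if printf 'x\n' | grep -qP 'x' 2>/dev/null; then
    have_pcre=yes
    # Anchored: the lookahead is tested only at the start of each line.
    anchored=$(grep -P '^(?!\h*//)' "$tmp")
    # Unanchored: grep may try every position, so every line matches somewhere.
    unanchored_count=$(grep -cP '(?!//)' "$tmp")
    printf 'anchored:\n%s\nunanchored matches %s lines\n' "$anchored" "$unanchored_count"
else
    have_pcre=no
    echo 'grep -P is not available here'
fi
rm -f "$tmp"
```

Without the anchor, a line such as //world still matches, because the lookahead succeeds one character in, where the remaining text is /world.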

Without -P, or with a non-GNU grep, you can use grep -v:

grep -v '^[[:blank:]]*//' file
hello
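If the goal is to do this with a regex alone and without PCRE, one possible approach (not from the answers here, just a sketch) is to spell out the allowed line starts in an extended regex: an optional single slash followed by either a non-slash character or the end of the line. The sample lines and variable names below are made up.

```shell
# Sample input covering the edge cases: empty line, single slash, indented comment.
tmp=$(mktemp)
printf '%s\n' 'hello' '//world' '/single slash' '' '  // indented' > "$tmp"

# A line does not begin with "//" when, after at most one leading slash,
# the next thing is a non-slash character or the end of the line.
regex_only=$(grep -E '^/?([^/]|$)' "$tmp")

# Reference result using the inverted match.
inverted=$(grep -v '^//' "$tmp")
rm -f "$tmp"

printf '%s\n' "$regex_only"
```

Both commands should select the same lines. Note that the indented comment is kept by both, since neither pattern skips leading whitespace.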

To find #include lines that are not preceded by // (or /* …), you can use:

grep '^[[:space:]]*#[[:space:]]*include[[:space:]]*["<]'

The regex looks for start of line, optional spaces, # , optional spaces, include , optional spaces and either " or < . It will find all #include lines except lines such as #include MACRO_NAME , which are legitimate but rare, and screwball cases such as:

#/*comment*/include/*comment*/<stdio.h>
#\
include\
<stdio.h>

If you have to deal with software containing such notations, (a) you have my sympathy and (b) fix the code to a more orthodox style before hunting the #include lines. The regex will also pick up false positives such as:

/* Do not include this:
#include <does-not-exist.h>
*/

You could omit the final [[:space:]]*["<] with minimal chance of confusion, which will then pick up the macro name variant.
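As a quick check of the two variants, the sketch below counts matches on a few made-up lines. The sample content and the variable names strict and loose are only for illustration.

```shell
# Sample source lines: two ordinary includes, one commented out,
# one using a macro name, and one line that is not an include at all.
tmp=$(mktemp)
printf '%s\n' \
    '#include <stdio.h>' \
    '  #  include "local.h"' \
    '// #include <commented.h>' \
    '#include MACRO_NAME' \
    'int x; /* not an include */' > "$tmp"

# Full pattern: requires " or < after include, so the macro form is skipped.
strict=$(grep -c '^[[:space:]]*#[[:space:]]*include[[:space:]]*["<]' "$tmp")

# Shortened pattern: drops the trailing part, so the macro form is found too.
loose=$(grep -c '^[[:space:]]*#[[:space:]]*include' "$tmp")
rm -f "$tmp"

printf 'strict: %s, loose: %s\n' "$strict" "$loose"
```

Neither pattern matches the // line, because only whitespace is allowed between the start of the line and the # character.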


To find lines that do not start with a double slash, use -v (to invert the match) and '^//' to look for slashes at the start of a line:

grep -v '^//'

The first line of the error output tells you that the problem comes from bash (your shell). Bash finds the ! and attempts to inject into your command the last command you entered that begins with \/\/ . To avoid this, escape the ! or use single quotes. For an example of ! , try !cat : it will execute the last command beginning with cat that you entered.

You don't need to escape / ; it has no special meaning in regular expressions. You also don't need to write a complicated regular expression to invert a match. Rather, just supply the -v argument to grep. Most of the time, simple is better. You also don't need to cat the file to grep; just give grep the file name, e.g.:

grep -v "^//" bar.txt | grep "#include"

If you're really hung up on using regular expressions, then a simple one would look like this (match start of line ^ , any amount of whitespace [[:space:]]* , exactly two forward slashes /{2} , any number of any characters .* , followed by #include ). Note that this pattern selects the commented-out #include lines rather than the active ones:

grep -E "^[[:space:]]*/{2}.*#include" bar.txt
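To see which lines each of the two commands selects, here is a small sketch on made-up input. The sample lines and the variable names active and commented are hypothetical.

```shell
# Sample input: one live include, two comment lines mentioning #include,
# and one unrelated line of code.
tmp=$(mktemp)
printf '%s\n' \
    '#include <a.h>' \
    '//#include <b.h>' \
    '// see #include <c.h>' \
    'int main(void) { return 0; }' > "$tmp"

# Pipeline: drop lines starting with //, then keep those containing #include.
active=$(grep -v '^//' "$tmp" | grep '#include')

# Single regex: lines that start with // and mention #include later on.
commented=$(grep -E '^[[:space:]]*/{2}.*#include' "$tmp")
rm -f "$tmp"

printf 'active:\n%s\ncommented:\n%s\n' "$active" "$commented"
```

The pipeline gives the live include, while the single regex gives the two comment lines.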

You must use the -P (Perl) option:

grep -P '^(?!//)' bar.txt

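For lines that do not begin with "//", you can use (^[^/]{2}.*$) with grep -E. Note that this requires the first two characters to both be non-slashes, so it also skips lines such as /foo or a/b and lines shorter than two characters.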

If you don't like grep -v for this then you could just use awk:

awk '!/^\/\//' file

Since awk supports compound conditions instead of just regexps, it's often easier to specify what you want to match with awk than with grep, e.g. to search for a and b in any order with grep:

grep -E 'a.*b|b.*a'

while with awk:

awk '/a/ && /b/'
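The same idea can be applied to the original question. The following sketch first checks that the grep and awk forms agree on some made-up input, then combines "does not start with //" and "contains #include" in one awk condition. The sample lines and variable names are hypothetical.

```shell
# Part 1: a and b in any order, with grep alternation and with awk conditions.
tmp=$(mktemp)
printf '%s\n' 'a then b' 'b then a' 'only a' 'only b' > "$tmp"
with_grep=$(grep -E 'a.*b|b.*a' "$tmp")
with_awk=$(awk '/a/ && /b/' "$tmp")
rm -f "$tmp"

# Part 2: negate one pattern and require another in a single condition.
includes=$(printf '%s\n' '#include <a.h>' '//#include <b.h>' 'int x;' |
    awk '!/^\/\// && /#include/')

printf '%s\n' "$with_awk" "$includes"
```

Inside an awk /.../ regex the slashes have to be escaped as \/ , which is why the pattern looks noisier than the grep version.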

