Assume I have a file called text.txt. In text.txt, I have a number of occurrences of the following pattern:
/**
* @something
**/
I want to replace this pattern with an empty string. What is the easiest Linux command to do this?
Suppose that our input file is:
$ cat text.txt
before
/**
* @something
**/
after
We can filter out the comments with awk:
$ awk '/\/\*\*/ {c=1; next} /\*\*\// {c=0; next} c==0 {print}' text.txt
before
after
The awk script works by using a variable, c, as a flag. When we start, c=0, signaling that we are not in a comment. When the start-of-comment line, /**, appears, we set c=1. c stays at 1 until the next end-of-comment line, **/, appears, at which point c is set back to 0. A line is printed only if c=0, so anything between the opening and closing comment lines, whatever its format, is not printed.
The code looks a bit odd because both / and * are special inside an awk regular expression: / delimits the expression and * is a repetition operator. So they both need to be escaped with backslashes. Thus, the regular expression that matches the start-of-comment line is \/\*\* while the regular expression for the end-of-comment line is \*\*\/.
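If the backslashes are hard to read, one alternative is to write the patterns as strings containing bracket expressions. The following is only a sketch of that idea using the same flag logic and the same sample input; the file name demo.txt and the variable result are names introduced here for illustration.

```shell
# Recreate the sample input (demo.txt is a hypothetical file name).
printf 'before\n/**\n* @something\n**/\nafter\n' > demo.txt

# Same flag logic, but each pattern is a string with bracket expressions,
# so neither / nor * needs a backslash.
result=$(awk '
  $0 ~ "/[*][*]" { c = 1; next }   # start-of-comment line: set the flag, skip it
  $0 ~ "[*][*]/" { c = 0; next }   # end-of-comment line: clear the flag, skip it
  c == 0         { print }         # print only while outside a comment
' demo.txt)

printf '%s\n' "$result"
```

With the sample input this prints the lines before and after, the same as the escaped version.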
Suppose the input file has a more complex structure such as illustrated in JS's example:
$ cat file
something
/**
* @something
**/ random
hello
hi /**
* @something
**/ bye
hola
gracias
bye
We can handle this with awk as follows:
$ awk -v RS='\\*\\*/\n*' '{sub(/\n*\/\*\*.*/,"",$0); print $0}' file
something
random
hello
hi
bye
hola
gracias
bye
The above was tested with GNU awk. Since it uses a multi-character record separator, it may not work with older versions of awk.
While awk normally reads a file line by line, in the version above we have set the record separator, RS, to match the end of a comment (plus any newlines that follow it). Then, within each record, we delete everything from the comment start to the end of the record and print what is left.
Here is a simple awk command that selects the text between two given patterns, excluding the pattern lines themselves:
$ cat file
before
/**
* @something
**/
after
$ awk '/\*\*\//{f=0} f; /\/\*\*/{f=1}' file
* @something
If you do not want to include the START/END pattern lines, this is one of the simplest awk idioms for the job:
awk '/END/{f=0} f; /START/{f=1}'
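Note that this flag idiom prints the lines between the two patterns. To delete the block instead, as the question asks, the flag test can be inverted. The following is a minimal sketch under the assumption that the comment delimiters sit on their own lines; demo.txt and the variable kept are names introduced here for illustration.

```shell
# Recreate the sample input (demo.txt is a hypothetical file name).
printf 'before\n/**\n* @something\n**/\nafter\n' > demo.txt

# Set the flag on the start line, print only while the flag is unset,
# and clear the flag once the end line has been skipped.
kept=$(awk '/\/\*\*/{f=1} !f; /\*\*\//{f=0}' demo.txt)

printf '%s\n' "$kept"
```

The order of the three rules matters: the start rule runs before the print rule so the opening line is dropped, and the end rule runs after it so the closing line is dropped as well. On the sample input this prints before and after.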
If you specifically want to remove just the string you posted, then using GNU awk (for its multi-char RS, which lets us read the whole file as one string) that'd be:
$ cat file
foo/**
* @something
**/bar and more/**
* @something
**/stuff
$ awk -v RS='^$' -v ORS= -v pat='/**
* @something
**/' '{
while ( s=index($0,pat) ) {
$0 = substr($0,1,s-1) substr($0,s+length(pat))
}
print
}' file
foobar and morestuff
or if you actually just want to remove everything between each occurrence of /** and / all you need is:
$ awk -v RS='/[*][*][^/]+/' -v ORS= '1' file
foobar and morestuff
cat text.txt | egrep -v "[/]" | egrep -v "[*] @" > newtext.txt
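This can do the job, but you may have to modify it slightly depending on what else is in the file, since it drops every line that contains a / or the text "* @".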