
How to delete comments for Java project with sed?

I have a Java project in which I have JavaDoc comments

/** ... */

other multi-line comments

/* ... */

line comments

// ...

and my own "explanatory comments"

//* ...

When I release my code, I would like to have all line comments removed, but not the other comments. I thought I would do it with sed, but so far I have not been successful. I am trying the following:

#!/bin/bash

while read -d $'\0' findfile ; do
  echo "${findfile}"
  mv "${findfile}" "${findfile}".veryold
  cat "${findfile}".veryold | sed -e 's|//[^\*"]*[^"]*||' -e 's/[ ^I]*$//' | grep -A1 . | grep -v '^--$' > "${findfile}"
  rm -f "${findfile}".veryold
done < <(find "${1}" -type f -print0)

What am I doing wrong? Note that a // inside "..." should not be removed, since it might be part of a URL.

The crucial part is

-e 's|//[^\*"]*[^"]*||'

For a test file that looks like this:

/** This should stay */
/* And this
 * should stay
 * as well */
// This one should be removed
//* But this one should stay
code here // This part should go, but not the next line
"http://test.com"
code here //* This should stay

you can do it like this:

$ sed '\#^//[^*]#d;s#//[^*"][^"]*$##' test.java
/** This should stay */
/* And this
 * should stay
 * as well */
//* But this one should stay
code here
"http://test.com"
code here //* This should stay

The first expression, \#^//[^*]#d , deletes all lines that start with // (but not //* ). This avoids leaving empty lines in the output when a whole line is removed.

The second expression, s#//[^*"][^"]*$## , matches from // (but not //* or //" ) to the end of the line, unless there is a " anywhere between the // and the end of the line.
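To make the quote guard concrete, here is a minimal sketch that runs only the second expression. The three sample lines and the helper name strip_trailing are made up for illustration and do not come from the question.

```shell
# Second expression only; the input lines below are illustrative assumptions.
strip_trailing() {
  sed 's#//[^*"][^"]*$##'
}

printf '%s\n' \
  'int a = 1; // plain comment' \
  'String u = "http://test.com";' \
  'int b = 2; // says "hi"' | strip_trailing
```

The first line loses its comment and the URL line is left alone. The third line keeps its comment as well, because a " after the // blocks the match; that is a side effect of the guard, so comments containing quotes would need separate handling.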

Your expression s|//[^\*"]*[^"]*|| did almost the same, except:

  • There is no need to escape characters in bracket expressions; [\*] matches both \ and * .
  • You want the first bracket expression to match exactly once, not zero or more times; your expression does match //* .
  • You don't anchor at the end of the line, so the expression matches // anywhere in the line and removes it, even when it is followed by " or * .
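The differences listed above can be checked by running both substitutions on two lines taken from the test file. This is only a sketch; old_expr and new_expr are hypothetical helper names introduced here.

```shell
# The question's substitution and the corrected one, side by side.
# old_expr / new_expr are hypothetical names used only for this comparison.
old_expr() { sed -e 's|//[^\*"]*[^"]*||'; }
new_expr() { sed -e 's#//[^*"][^"]*$##'; }

for line in 'code here //* This should stay' '"http://test.com"'; do
  printf 'old: %s\n' "$(printf '%s\n' "$line" | old_expr)"
  printf 'new: %s\n' "$(printf '%s\n' "$line" | new_expr)"
done
```

With the old expression the //* comment is stripped and "http://test.com" is cut down to "http:" ; the corrected expression leaves both lines untouched.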

For the peculiar case of a zero-length comment at the end of the line,

code here //

where you'd want to remove the // , you'd have to add a third substitution, s#//$## , because the existing one expects at least one character after the // .

Note that this is not perfectly clean: it can leave trailing whitespace at the end of lines, but that is easily remedied with s/[[:space:]]*$// .
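Putting the pieces together, one possible way to apply all four expressions to a source tree is sketched below. The function names strip_line_comments and strip_tree, the *.java filter and the .tmp suffix are assumptions made for this example, and the loop assumes file names contain no newlines.

```shell
#!/bin/sh
# Sketch: the four expressions discussed above in a single sed call.
# strip_line_comments and strip_tree are hypothetical helper names.
strip_line_comments() {
  sed -e '\#^//[^*]#d' \
      -e 's#//[^*"][^"]*$##' \
      -e 's#//$##' \
      -e 's/[[:space:]]*$//'
}

# Rewrite every *.java file under directory $1 via a temporary file.
# Assumes file names without newlines; the *.java filter is an assumption.
strip_tree() {
  find "$1" -type f -name '*.java' | while IFS= read -r f; do
    strip_line_comments < "$f" > "$f.tmp" && mv "$f.tmp" "$f"
  done
}
```

It would be called as strip_tree followed by the project directory. Writing to a temporary file and moving it into place only on success avoids truncating a source file if sed fails, which the mv / cat / rm sequence in the question does not guard against.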
