
Shell tool (awk/grep/sed etc.) to remove text after a number and before a fixed marker

I have lines with the pattern

<positive_integer> <textA(which may include an integer)> | <larger_integer> <textA>

For instance:

1544 input packet chains processed with length greater than 4 | 1545 input packet chains processed with length greater than 4

I'm not sure what the whitespace rules are; there may be tabs or spaces. I think the second textA will be the same as the first, but there may be something in netstat output where that is not true.
If it helps, I am working on the output of diff -y fileA fileB, where the lines of fileA and fileB came from netstat -s at different times, after a bit of filtering:

Based on suggestions, better filtering for me is:
netstat -s | awk '/error|length|bad|overflow|failure|dropped|loss|unknown|detect|^[[:lower:]]*:$/ { if ($1!= 0) { $1=$1; print} }'  
(This keeps the protocol type lines like tcp: and ip:. Call this the flag; it may be useful.
I hope to prepend this flag to each line (stored in a variable), and maybe add the
number from the line after the flag, which shows the total data of that type.)

Deprecated code was: 
netstat -s | awk '{$1=$1};1' | grep -v "^0" |
grep "error\|length\|bad\|overflow\|failure\|dropped\|loss\|unknown\|detect"

I am hunting down network issues...

I'd like to get out (with a simple one-line, pipe-able OS X command):

 1544 | 1545 input packet chains processed with length greater than 4

If it's easy and compact in the same command, I'd show the data change more clearly as

 1544 > 1545 input packet chains processed with length greater than 4

This will be compact and readable in a log file or on screen...

Is there a better way to get here from fileA and fileB than first running diff -y?
Or a better way to detect anomalies in my network?

My test file:

238 times recovered from bad retransmission using DSACK       | 239 times recovered from bad retransmission using DSACK
17576 dropped due to full socket buffers              | 17593 dropped due to full socket buffers
14016 with data size > data length                | 14057 with data size > data length
3609 packets for unknown/unsupported protocol             | 3610 packets for unknown/unsupported protocol
13562 packets received for unknown multicast group        | 13571 packets received for unknown multicast group
4909 input packet chains processed with length greater than 2 | 4911 input packet chains processed with length greater than 2
1544 input packet chains processed with length greater than 4 | 1545 input packet chains processed with length greater than 4
1473 message too big failures                     | 1481 message too big failures
13 send failures                          | 17 send failures

You could use the command

$ sed -e 's/\([0-9][0-9]*\).*|/\1 |/' < input-file

If input-file contains

1544 input packet chains processed with length greater than 4 | 1545 input packet chains processed with length greater than 4

what you get out will be

1544 | 1545 input packet chains processed with length greater than 4
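On the side question of whether diff -y is needed at all: a rough sketch of one alternative is to let awk read both snapshots and pair the counters by their text. The function name compare_counts and the file names are made up for illustration, and it assumes that a counter's text is unique within a snapshot (the same wording under two protocol sections would collide).

```shell
# Hypothetical helper: print "old > new text" for every counter whose value
# differs between two netstat -s snapshots.
compare_counts() {
  awk '
    # Ignore lines that do not start with a number (e.g. "tcp:" headers).
    $1 !~ /^[0-9]+$/ { next }
    # Remember the count, then blank it so $0 holds only the text,
    # with whitespace normalized to single spaces.
    { count = $1; $1 = "" }
    # First file: store the old count under its text.
    NR == FNR { old[$0] = count; next }
    # Second file: report only counters that changed.
    ($0 in old) && old[$0] != count { print old[$0] " > " count $0 }
  ' "$1" "$2"
}
```

Usage would be something like netstat -s > fileA, wait a while, netstat -s > fileB, then compare_counts fileA fileB. Unchanged counters are dropped, so any grep-style filtering can be applied before or after.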

Thanks to the inspiration from awk (above) to try this route, I have reduced and added to the magic incantations until... tan-tat-ta-TA-taraaa: sed 's/\([0-9]*\)[^|]*|/\1 >/'
I think .* is greedy: it runs to the line end and only then backtracks to the last |.
A repeated "not |", written [^|]*, works and stops at the first |.
And I have built in my change of | to > for nice formatting, with my test file rendering as:

238 > 239 times recovered from bad retransmission using DSACK
17576 > 17593 dropped due to full socket buffers
14016 > 14057 with data size > data length
3609 >  3610 packets for unknown/unsupported protocol
13562 > 13571 packets received for unknown multicast group
4909 >  4911 input packet chains processed with length greater than 2
1544 >  1545 input packet chains processed with length greater than 4
1473 >  1481 message too big failures
13 >    17 send failures

Pretty enough for me! I could not manage to insert a tab character before the >...
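On the tab: as far as I know, the BSD sed shipped with OS X does not turn \t in the replacement into a tab, so one workaround is to put a literal tab into a shell variable and splice it into a double-quoted sed script. This is only a sketch; the function name arrow_with_tab is invented for illustration.

```shell
# Hold a literal tab in a variable; printf understands \t even where sed does not.
tab=$(printf '\t')

# Hypothetical helper: same substitution as above, but with a tab before the ">".
# Inside double quotes the shell leaves \( \) and \1 alone and expands ${tab}.
arrow_with_tab() {
  sed "s/\([0-9]*\)[^|]*|/\1${tab}>/"
}
```

It can be used at the end of the pipeline, for example diff -y fileA fileB | arrow_with_tab. Note that [0-9]* may match nothing if a line starts with whitespace, so this relies on the leading blanks having been stripped already (as the awk filter with $1=$1 does).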

