Print n characters after a pattern match

I have lines such as:

jan 02:12:00 YRU QRS : ASP.net Bird
feb 02:12:00 YRU QRS : ASP.net Dog

I want a script that searches each line for the pattern ASP.NET and prints the 10 characters that follow it, so that the output covers both Bird and Dog.

Thanks

One way using perl:

perl -ne 'm/asp\.net\s+(.{0,10})/i && print "$1\n"' infile

That yields:

Bird
Dog

EDIT to explain the syntax of the perl one-liner:

m/.../i tries to match a regular expression anywhere in the current line, and the i flag makes the match case-insensitive. The regular expression is the literal asp.net, followed by one or more whitespace characters, followed by a capture group that greedily takes between 0 and 10 characters. If the match succeeds, the instruction after && runs and prints what the parentheses captured.
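To see the 10-character cap in action, here is a small sketch that wraps the same one-liner in a shell function and feeds it a longer line. The function name after_aspnet and the input line are made up for illustration; they are not part of the original answer.

```shell
# Hypothetical helper: reads lines on stdin and prints up to 10 characters
# that follow "asp.net" (any case) and the whitespace after it.
after_aspnet() {
    perl -ne 'm/asp\.net\s+(.{0,10})/i && print "$1\n"'
}

# Made-up input with more than 10 characters after the pattern.
printf '%s\n' 'mar 02:12:00 YRU QRS : ASP.net Hummingbird and more' | after_aspnet
# prints: Hummingbir
```

Because the quantifier is greedy, the group takes as many characters as it can, but never more than 10, and it stops early at the end of the line.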

perl -lne 'print $1 if /ASP\.net (.{0,10})/'

GNU and BSD greps have a nice extension, --only-matching or -o, which outputs only the part of the line that matched:

grep -Eio 'asp\.net.{0,10}' <<< 'jan 02:12:00 YRU QRS : ASP.net Bird
feb 02:12:00 YRU QRS : ASP.net Dog'
ASP.net Bird
ASP.net Dog
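The grep output above still includes the pattern itself. One possible way to drop it, sketched here under the assumption that the matched pattern is always the 7 ASCII characters of "ASP.net", is to pipe the result through cut. The function name after_aspnet_grep is hypothetical.

```shell
# Hypothetical helper: grep -o keeps "ASP.net" plus up to 10 following
# characters; cut -c8- then removes the 7-character pattern from the front.
after_aspnet_grep() {
    grep -Eio 'asp\.net.{0,10}' | cut -c8-
}

printf '%s\n' \
    'jan 02:12:00 YRU QRS : ASP.net Bird' \
    'feb 02:12:00 YRU QRS : ASP.net Dog' | after_aspnet_grep
```

The leading space is kept, since it is one of the characters that follow the pattern.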

Bash can do this with its regular expression functionality, though it's probably better to turn on case-insensitive matching first:

shopt -s nocasematch
while read -r; do
    if [[ $REPLY =~ asp\.net(.{0,10}) ]]; then
        echo "${BASH_REMATCH[1]}"
    fi
done <<< 'jan 02:12:00 YRU QRS : ASP.net Bird
feb 02:12:00 YRU QRS : ASP.net Dog'
 Bird
 Dog

awk one-liner:

awk -F'ASP\\.net' '{print substr($2,1,10)}' file

Note that this prints the 10 characters immediately after ASP.net, which means starting from the space. If you don't want the space, use the line below:

awk -F'ASP\\.net ' '{print substr($2,1,10)}' file

Assuming the text is in a file named "input.txt", the following one-liner will do the job:

awk '/ASP\.net/ {print substr($0, index($0,"ASP.net") + length("ASP.net"), 10)}' input.txt

Explanation:

  • On lines that contain the text "ASP.net"
  • print 10 characters
  • starting right after the location of "ASP.net"
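The question writes the pattern as ASP.NET while the sample lines use ASP.net, so a case-insensitive variant may be useful. The sketch below applies the same index/substr steps to a lowercased copy of the line; the function name after_aspnet_awk and the sample input are assumptions for illustration.

```shell
# Hypothetical helper: find "asp.net" in a lowercased copy of the line, then
# take 10 characters from the original line right after that position.
after_aspnet_awk() {
    awk '{
        pos = index(tolower($0), "asp.net")      # 0 when the text is absent
        if (pos > 0)
            print substr($0, pos + length("asp.net"), 10)
    }'
}

# Made-up input line that uses a different capitalization.
printf '%s\n' 'mar 02:12:00 YRU QRS : ASP.NET Cat' | after_aspnet_awk
```

Lowercasing does not change positions for this ASCII text, so the offset found in the copy can be used on the original line.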

This might work for you (GNU sed):

sed -nr '/.*ASP\.net(.{,10}).*/s//\1/p' file
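The command above uses GNU-specific syntax (the -r option and the {,10} interval). A rough equivalent in POSIX basic regular expressions, as a sketch only, could look like this. The function name after_aspnet_sed is hypothetical.

```shell
# Hypothetical helper: -n suppresses automatic printing; the s command replaces
# the whole line with the captured group (up to 10 characters after "ASP.net"),
# and the p flag prints only the lines where that substitution happened.
after_aspnet_sed() {
    sed -n 's/.*ASP\.net\(.\{0,10\}\).*/\1/p'
}

printf '%s\n' \
    'jan 02:12:00 YRU QRS : ASP.net Bird' \
    'feb 02:12:00 YRU QRS : ASP.net Dog' | after_aspnet_sed
```

As with the first awk one-liner, the captured text starts at the space that follows ASP.net.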
