
Use awk to detect regex patterns and print by line?

I have a text file in this format: [ONE testing 1 2 3] [TWO lorem ipsum] [ONE 123]

I want to print each `[ONE.+]` match on its own line.

An example output would be:

[ONE testing 1 2 3]
[ONE 123]

I've tried awk '/\\[ONE.+\\]/ { print $1 }' but it didn't work. Can anyone explain why, and what the correct way is?

awk works line by line, so the pattern is only tested once per line. To do it in awk, you can call the match function in a loop. You'd also have to make your regex less greedy, since .+ doesn't magically stop at the first ].
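The match-in-a-loop idea could look roughly like the sketch below. The function name print_one_groups and the variable rest are made up for illustration, and the pattern \[ONE[^]]*\] is one possible non-greedy replacement for \[ONE.+\], assuming the groups never contain a nested ].

```shell
# Hypothetical helper (the name is made up for this sketch): print every
# [ONE ...] group found on each input line, one group per output line.
print_one_groups() {
    awk '{
        rest = $0
        # match() returns 0 when nothing matches; otherwise it sets
        # RSTART (start position) and RLENGTH (length) of the leftmost match.
        while (match(rest, /\[ONE[^]]*\]/)) {
            print substr(rest, RSTART, RLENGTH)
            # Continue searching in the text after this match.
            rest = substr(rest, RSTART + RLENGTH)
        }
    }' "$@"
}

echo "[ONE testing 1 2 3] [TWO lorem ipsum] [ONE 123]" | print_one_groups
```

Called without arguments it reads standard input; with file names it reads those files instead.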

It might be easier to just use grep:

echo  "[ONE testing 1 2 3] [TWO lorem ipsum] [ONE 123]" | grep -o '\[ONE[^]]*\]'

You can try something like this:

sed -re 's/(\[ONE[^\[]*\])/\n\1\n/g' temp.txt

Input

[ONE testing 1 2 3] [TWO lorem ipsum] [ONE 123]

Output

[ONE testing 1 2 3]
 [TWO lorem ipsum] 
[ONE 123]

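If you also want to remove the TWO group, add a second substitution. Note that [^ONE] is a bracket expression that excludes the single characters O, N and E, not the word ONE, so this only works when the unwanted groups start with some other character: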

sed -re 's/(\[ONE[^\[]*\])()/\n\1\n/g; s/(\[[^ONE][^\[]*\])//g' temp.txt

Output

[ONE testing 1 2 3]

[ONE 123]

If this is part of something bigger:

BEGIN {
    # Change the field separator, from the default blank, to the end-marker
    # for each "field"
    FS = "] "
}
# Get rid of lines which can't possibly match
!/\[ONE/ { next }
{
    # Test and report each of three fields for starting with [ONE,
    # "closing" the field with FS, except for the last which will
    # already be "closed"
    if ($1 ~ /^\[ONE/) {
        print $1 FS
    }
    if ($2 ~ /^\[ONE/) {
        print $2 FS
    }
    if ($3 ~ /^\[ONE/) {
        print $3
    }
}

The three "if"s can be replaced by a single one inside a loop if you are so inclined, but watch the final field: it does not need the FS (field separator) appended, unless you have a trailing blank in your data.
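A loop version might look something like this. It is only a sketch: the function name print_one_fields and the variable out are invented here, and unlike the script above it appends a bare "]" rather than the whole FS, so the printed lines carry no trailing blank.

```shell
# Hypothetical helper: loop form of the per-field checks, for any number of fields.
print_one_fields() {
    awk '
    BEGIN { FS = "] " }          # same separator as the script above
    !/\[ONE/ { next }            # skip lines that cannot match
    {
        for (i = 1; i <= NF; i++) {
            if ($i ~ /^\[ONE/) {
                out = $i
                # Splitting removed the "]" from every field but the last,
                # so put it back only where it is missing.
                if (i < NF) out = out "]"
                print out
            }
        }
    }' "$@"
}

echo "[ONE testing 1 2 3] [TWO lorem ipsum] [ONE 123]" | print_one_fields
```

Because the check is i < NF rather than a fixed field number, a line with a trailing blank still works: the last real group is then no longer the final field and gets its bracket restored.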

By default, "awk" splits each line into fields on whitespace, and 'print $1' prints only the first of those fields. With the input above that is just [ONE, not the whole bracketed group.
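A quick way to see the default splitting in action, using the sample line from the question (the variable name line is just for this illustration):

```shell
line='[ONE testing 1 2 3] [TWO lorem ipsum] [ONE 123]'

# With the default field separator, $1 is everything up to the first blank.
printf '%s\n' "$line" | awk '{ print $1 }'
# prints: [ONE
```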

Try this out:

Suppose there is a text file named 'test.txt' containing these three lines.

cat test.txt

[ONE testing 1 2 3]

[TWO lorem ipsum]

[ONE 123]

grep -h '\[ONE' test.txt

[ONE testing 1 2 3]

[ONE 123]

