
How to grep strings from a file that consist of a query string preceded by any 2 characters, e.g. "xxTHISISMYSTRING"?

I have a multi-line file in the following format:

hhhhhhhhhhhhhhhhhhhhhaaaahhhhhhhhhhhhhh
hhhhhhhhhhhhhhhhhhhoaaaaahhhhhhhhhhhhhh
hhhhhhhhhhhhhbaaaahhhhhhhhhhhhhhhhhhhhh
hhhhhhhhhhhhhhhhhhhhhfbaaaahhhhhhhhhhhh

I want to find all strings that contain the "aaaa" motif, together with the two letters preceding it.

How would I grep out the strings hhaaaa, oaaaaa, hbaaaa, fbaaaa, with "aaaa" as my input?

To match any single character in a regex, use . (a dot):

$ grep -o ..aaaa file
hhaaaa
hoaaaa
hbaaaa
fbaaaa

The -o option tells grep to print only the matched text, not the rest of the line in which the match was found.
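
To see what -o changes, the same pattern can be run with and without it. This is only a sketch: the file sample.txt and its two lines are made up for illustration and are not the file from the question.

```shell
# Hypothetical sample file in a temporary directory (name and contents are illustrative).
workdir=$(mktemp -d)
printf '%s\n' 'hhhhhhaaaahhhh' 'hhhfbaaaahhhh' > "$workdir/sample.txt"

# Without -o, grep prints every line that contains a match, in full.
whole_lines=$(grep '..aaaa' "$workdir/sample.txt")

# With -o, grep prints only the text that matched the pattern.
matches_only=$(grep -o '..aaaa' "$workdir/sample.txt")

printf 'whole lines:\n%s\n' "$whole_lines"
printf 'matches only:\n%s\n' "$matches_only"

rm -rf "$workdir"
```

The first command echoes both input lines unchanged, while the second reduces each line to the six matched characters.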

To restrict the match to alphabetic characters, use the alphabetic character class:

$ grep -Eo '[[:alpha:]]{2}aaaa' file
hhaaaa
hoaaaa
hbaaaa
fbaaaa

[[:alpha:]] matches any alphabetic character. Unlike a-z, this is Unicode-safe. The {2} indicates two such characters. In basic regex the braces would have to be escaped as \{2\}; to avoid the backslashes, we have added the -E flag to turn on extended regex.
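
The difference between . and [[:alpha:]], and between the escaped and unescaped brace forms, can be checked with a short sketch. The two input lines are hypothetical; one has digits in front of the motif so that the two patterns disagree.

```shell
# Hypothetical input: one motif preceded by letters, one preceded by digits.
input=$(printf '%s\n' 'xxhbaaaaxx' 'xx12aaaaxx')

# . matches any character, so the digits are accepted too.
any_char=$(printf '%s\n' "$input" | grep -o '..aaaa')

# Basic regex (no -E): the braces must be escaped as \{2\}.
alpha_bre=$(printf '%s\n' "$input" | grep -o '[[:alpha:]]\{2\}aaaa')

# Extended regex (-E): the same pattern with plain braces.
alpha_ere=$(printf '%s\n' "$input" | grep -Eo '[[:alpha:]]{2}aaaa')

printf 'any character: %s\n' "$any_char"
printf 'alpha, basic regex: %s\n' "$alpha_bre"
printf 'alpha, extended regex: %s\n' "$alpha_ere"
```

The dot pattern reports both hbaaaa and 12aaaa, while both [[:alpha:]] variants report only hbaaaa.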

grep -oh "..aaaa" file.txt

will do. From the grep man page:

-h, --no-filename
Suppress the prefixing of file names on output. This is the default
when there is only one file (or only standard input) to search.
-o, --only-matching
Print only the matched (non-empty) parts of a matching line,
with each such part on a separate output line.
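
Since -h only matters when more than one file is searched, a two-file sketch shows its effect. The files a.txt and b.txt and their contents are hypothetical.

```shell
# Two hypothetical files in a temporary directory.
workdir=$(mktemp -d)
printf '%s\n' 'hhhhbaaaahh' > "$workdir/a.txt"
printf '%s\n' 'hhfbaaaahhh' > "$workdir/b.txt"

# With several files, grep prefixes each match with the file name.
# The cd runs inside the command substitution, so the caller's directory is unchanged.
with_names=$(cd "$workdir" && grep -o '..aaaa' a.txt b.txt)

# Adding -h suppresses the file name prefix.
without_names=$(cd "$workdir" && grep -oh '..aaaa' a.txt b.txt)

printf 'with names:\n%s\n' "$with_names"
printf 'without names:\n%s\n' "$without_names"

rm -rf "$workdir"
```

With a single file, as in the question, -h makes no difference and can be left out.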

grep -o '..aaaa' file

should do it. If the objective were to count the total number of matches, you could do:

grep -o '..aaaa' file | wc -l

The grep man page says:

-o, --only-matching
Print only the matched (non-empty) parts of a matching line, with each such part on a separate output line.

The wc man page says:

-l, --lines
print the newline counts
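
Because -o puts each match on its own line, piping to wc -l counts matches rather than matching lines. The sketch below contrasts this with grep -c, which counts lines; the input is hypothetical and chosen so that one line contains two matches.

```shell
# Hypothetical input: the first line holds two matches, the second holds one.
input=$(printf '%s\n' 'hbaaaahhfbaaaah' 'hhhhaaaahh')

# grep -o emits one line per match, so wc -l counts matches.
# tr strips the padding that some wc implementations add around the number.
match_count=$(printf '%s\n' "$input" | grep -o '..aaaa' | wc -l | tr -d ' ')

# grep -c counts matching lines instead, so it reports a smaller number here.
line_count=$(printf '%s\n' "$input" | grep -c '..aaaa')

printf 'matches: %s\n' "$match_count"
printf 'matching lines: %s\n' "$line_count"
```

Here the pipeline reports 3 matches while grep -c reports 2 lines, so the two are interchangeable only when no line contains more than one match.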
