
Grep pattern matching lower case string enclosed in double quotes

I'm having a bit of an issue with grep that I can't seem to figure out. I'm trying to search for all instances of lower case words enclosed in double quotes (C strings) in a set of source files. Using bash and GNU grep:

grep -e '"[a-z]+"' *.cpp

gives me no matches, while

grep -e '"[a-z]*"' *.cpp

gives me matches like "Abc", which is not just lower case characters. What is the proper regular expression to match only "abc"?

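You're forgetting to escape the metacharacter. In grep's default basic regular expression syntax, a bare + is a literal plus sign; GNU grep treats \+ as "one or more".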

grep -e '"[a-z]\+"'
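To see the difference between a bare + and \+ in the default syntax, here is a small sketch. It assumes GNU grep; the two sample strings and the variable names are made up for illustration, and LC_ALL=C is set only to keep the range [a-z] unambiguous.

```shell
# Sample input (hypothetical): one all-lowercase string, one containing a plus sign.
# In basic regular expressions a bare + is a literal character,
# so this pattern means: quote, one lowercase letter, a plus sign, quote.
bre_literal=$(printf '%s\n' '"abc"' '"a+"' | LC_ALL=C grep -e '"[a-z]+"')

# With \+ (a GNU extension in basic syntax) the pattern means:
# quote, one or more lowercase letters, quote.
bre_escaped=$(printf '%s\n' '"abc"' '"a+"' | LC_ALL=C grep -e '"[a-z]\+"')

echo "bare +    matched: $bre_literal"
echo "escaped + matched: $bre_escaped"
```

The first command only matches the string that really contains a plus sign, which is why the original pattern found nothing in ordinary source files.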

For the second part, the reason it matches mixed-case strings is your locale. As follows:

$ echo '"Abc"' | grep -e '"[a-z]\+"'
"Abc"
$ export LC_ALL=C
$ echo '"Abc"' | grep -e '"[a-z]\+"'
$

To get the "ASCII-like" behavior, you need to set your locale to "C", as specified in the grep man page:

Within a bracket expression, a range expression consists of two characters separated by a hyphen. It matches any single character that sorts between the two characters, inclusive, using the locale's collating sequence and character set. For example, in the default C locale, [a-d] is equivalent to [abcd]. Many locales sort characters in dictionary order, and in these locales [a-d] is typically not equivalent to [abcd]; it might be equivalent to [aBbCcDd], for example. To obtain the traditional interpretation of bracket expressions, you can use the C locale by setting the LC_ALL environment variable to the value C.
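The equivalence described in the man page can be used directly: a bracket expression that lists every letter explicitly does not depend on the collating sequence at all. The following is a minimal sketch assuming GNU grep; the sample strings and variable names are hypothetical.

```shell
# Hypothetical sample input: one lowercase string and one mixed-case string.
sample=$(printf '%s\n' '"abc"' '"Abc"')

# Explicit list of letters: no range, so collation order is irrelevant.
explicit=$(printf '%s\n' "$sample" | grep -e '"[abcdefghijklmnopqrstuvwxyz]\+"')

# Range expression, with the C locale applied to this one command only.
ranged_c=$(printf '%s\n' "$sample" | LC_ALL=C grep -e '"[a-z]\+"')

echo "explicit list: $explicit"
echo "range in C   : $ranged_c"
```

Prefixing a single command with LC_ALL=C, as in the second call, leaves the locale of the rest of the shell session untouched, unlike export.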

Escape the +:

grep -e '"[a-z]\+"' *.cpp

or use egrep:

egrep '"[a-z]+"' *.cpp

Maybe you had -E in mind:

grep -E '"[a-z]+"' *.cpp

The lowercase -e is used, for example, to specify multiple search patterns.
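As a quick illustration of how the two options combine, here is a sketch that passes two patterns with -e while -E selects extended syntax. It assumes GNU grep; the second pattern (quoted digits), the sample input, and the variable name are made up for the example.

```shell
# -E switches to extended regular expressions (bare + means "one or more");
# each -e adds one pattern, and a line is printed if any pattern matches.
multi=$(printf '%s\n' '"abc"' '"Abc"' '"123"' \
  | LC_ALL=C grep -E -e '"[a-z]+"' -e '"[0-9]+"')

echo "$multi"
```

Only the all-lowercase and the all-digit strings are printed; the mixed-case one matches neither pattern.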

The uppercase matches probably originate from your locale, which you can work around with:

LC_ALL=C egrep '"[a-z]+"' *.cpp

You probably need to escape the +:

grep -e '"[a-z]\+"' *.cpp

If you don't want to mess with the locale, this works for me:

grep -e '"[[:lower:]]\+"'
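A possible variation on the character-class approach, sketched under the assumption of GNU grep: the -o option prints only the matched strings rather than whole lines, which can be handy when listing string literals. The sample source lines and the variable name are invented for illustration.

```shell
# Hypothetical C-like source lines fed on stdin instead of *.cpp files.
# [[:lower:]] matches lowercase letters as defined by the current locale,
# so no LC_ALL override is needed; -o prints just the matching part.
lower_only=$(printf '%s\n' 'puts("abc");' 'puts("Abc");' 'puts("xyz");' \
  | grep -o -E '"[[:lower:]]+"')

echo "$lower_only"
```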
