
grep for lines containing only “standard” US characters

I'm trying to figure out how to grep for lines that consist exclusively of A-Z and a-z, that is, the letters of the "American" alphabet. I would expect this to work, but it does not:

$ echo -e "Jutland\nJastrząb" | grep -x '[A-Za-z]*'
Jutland
Jastrząb

I want this to print only "Jutland", because ą is not a letter in the American alphabet. How can I achieve this?

You need to add LC_ALL=C before grep:

printf '%b\n' "Jutland\nJastrząb" | LC_ALL=C grep -x '[A-Za-z]*'

Jutland

You may also use the -i switch to ignore case and shorten the regex:

printf '%b\n' "Jutland\nJastrząb" | LC_ALL=C grep -ix '[a-z]*'

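LC_ALL=C avoids locale-dependent effects. In the C locale a range such as [a-z] is resolved by byte value, so it covers only the ASCII letters; in other locales the range may follow the locale's collation order, which is how ą can end up matching [a-zA-Z].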
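As a minimal sketch of the same idea, the locale override can be wrapped in a small function so that it applies to the one grep call only and leaves the rest of the shell session untouched. The name ascii_letters_only is hypothetical and not part of the original answer.

```shell
# Hypothetical helper: print only the lines made up entirely of ASCII letters.
# The LC_ALL=C prefix affects this single grep invocation, not the calling shell.
ascii_letters_only() {
    LC_ALL=C grep -x '[A-Za-z]*'
}

printf '%s\n' 'Jutland' 'Jastrząb' | ascii_letters_only
# prints: Jutland
```

Because the pattern uses *, empty lines also pass the filter; a stricter variant would be grep -Ex '[A-Za-z]+'.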

You can use a Perl-compatible regex:

$ echo -e "Jutland\nJastrząb" | grep -P '^[[:ascii:]]+$'
Jutland

It's experimental though:

-P, --perl-regexp
      Interpret  the  pattern as a Perl-compatible regular expression (PCRE).  This is experimental and
      grep -P may warn of unimplemented features.

EDIT

For letters only, use [A-Za-z]:

$ echo -e "L'Egyptienne\nJutland\nJastrząb" | grep -P '^[A-Za-z]+$'
Jutland
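To make the difference between the two patterns concrete, here is a small sketch that runs both over the same input. It assumes a grep built with PCRE support (the -P option); the sample function is a hypothetical helper used only to avoid repeating the input.

```shell
# Hypothetical helper: one line with an ASCII apostrophe, one plain ASCII line,
# and one line containing a non-ASCII letter.
sample() {
    printf '%s\n' "L'Egyptienne" 'Jutland' 'Jastrząb'
}

# Any ASCII character is allowed, so the apostrophe passes.
sample | grep -P '^[[:ascii:]]+$'
# prints: L'Egyptienne, then Jutland

# Only ASCII letters are allowed, so the apostrophe is rejected as well.
sample | grep -P '^[A-Za-z]+$'
# prints: Jutland
```

The first pattern answers "is this line pure ASCII?", while the second answers the original question of "is this line made of the letters A-Z and a-z only?".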
