How do I filter lines in a text file that start with a capital letter and end with a positive integer, using regex on the Linux command line?
I am trying to use a regex with the grep command in the Linux terminal to filter lines in a text file that start with a capital letter and end with a positive integer. Is there a way to modify my command so that it does this in one line, with one call of grep instead of two? I am using Windows Subsystem for Linux with Ubuntu from the Microsoft Store.
Text file:
C line 1
c line 2
B line 3
d line 4
E line five
The command that I have gotten to work:
grep ^[A-Z] cap*| grep [0-9]$ cap*
The output:
C line 1
B line 3
This works, but I feel like the two regexes could be combined somehow. However,
grep ^[A-Z][0-9]$
does not yield the same result as the command above.
You need to use
grep '^[A-Z].*[0-9]$'
grep '^[[:upper:]].*[0-9]$'
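To check either command against the sample lines from the question, something like the following sketch should do. The temporary directory and the file name caps.txt are placeholders I made up; the question only shows that the real file name starts with "cap".

```shell
# Recreate the sample file from the question in a throwaway directory.
# The file name caps.txt is a placeholder for this sketch.
tmpdir=$(mktemp -d)
printf '%s\n' 'C line 1' 'c line 2' 'B line 3' 'd line 4' 'E line five' > "$tmpdir/caps.txt"

# One grep call: uppercase letter first, any text in between, digit last.
result=$(grep '^[[:upper:]].*[0-9]$' "$tmpdir/caps.txt")
printf '%s\n' "$result"

# Clean up the throwaway directory.
rm -r "$tmpdir"
```

With the five sample lines, only "C line 1" and "B line 3" should be printed.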
See the online demo. The regex matches:
^ - start of the line
[A-Z] / [[:upper:]] - an uppercase letter
.* - any zero or more chars ([^0-9]* matches zero or more non-digit chars)
[0-9] - a digit
$ - end of the line.
Also, if you want to make sure there is no - before the number at the end of the line, you need to use a negated bracket expression, like
grep -E '^[[:upper:]].*[^-0-9][0-9]+$'
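A small comparison may make the difference between the two patterns clearer. The input lines below are invented for illustration (the question's file has no negative numbers), so treat this as a sketch rather than output from the original data.

```shell
# Made-up input: two positive numbers, one negative number, one lowercase start.
lines=$(printf '%s\n' 'A value 12' 'B value -12' 'C value 7' 'd value 3')

# The simple pattern only checks the last character, so "-12" still passes.
loose=$(printf '%s\n' "$lines" | grep '^[[:upper:]].*[0-9]$')

# The ERE pattern requires a character that is neither a digit nor "-" right before the trailing digits.
strict=$(printf '%s\n' "$lines" | grep -E '^[[:upper:]].*[^-0-9][0-9]+$')

printf 'loose:\n%s\n' "$loose"
printf 'strict:\n%s\n' "$strict"
```

The loose result should keep the "B value -12" line, while the strict result should drop it.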
Here, the POSIX ERE regex (due to the -E option) matches:
^[[:upper:]].* - an uppercase letter at the start and then any text
[^-0-9] - any char other than a digit and -
[0-9]+ - one or more digits
$ - end of the line.
When you use a pipeline, you want the second grep to act on standard input, not on the file you originally grepped from.
grep '^[A-Z]' cap* | grep '[0-9]$'
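To see why the file argument on the second grep matters, here is a sketch that runs both variants on the sample lines. grep only reads standard input when it is given no file operands, so a second grep with a file name ignores whatever the first one pipes in. The file name caps.txt is a placeholder, and I use [[:upper:]] instead of [A-Z] only to keep the result independent of the locale.

```shell
# Recreate the sample file from the question (caps.txt is a placeholder name).
tmpdir=$(mktemp -d)
printf '%s\n' 'C line 1' 'c line 2' 'B line 3' 'd line 4' 'E line five' > "$tmpdir/caps.txt"

# Second grep is given the file again: it reads the file and ignores the pipe,
# so the uppercase filter from the first grep has no effect.
with_file=$(grep '^[[:upper:]]' "$tmpdir/caps.txt" | grep '[0-9]$' "$tmpdir/caps.txt")

# Second grep has no file operand: it filters the output of the first grep.
from_pipe=$(grep '^[[:upper:]]' "$tmpdir/caps.txt" | grep '[0-9]$')

printf 'with file argument:\n%s\n' "$with_file"
printf 'from the pipe:\n%s\n' "$from_pipe"

rm -r "$tmpdir"
```

The first variant should list every line ending in a digit, including the lowercase ones; the second should list only "C line 1" and "B line 3".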
However, you need to expand the second regex if you want to exclude negative numbers. Anyway, a better solution altogether might be to switch to Awk:
awk '/^[A-Z]/ && /[0-9]$/ && $NF > 0' cap*
The output format will be slightly different from grep's; if you want to include the name of the matching file, you have to specify that separately:
awk '/^[A-Z]/ && /[0-9]$/ && $NF > 0 { print FILENAME ":" $0 }' cap*
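As a rough check of the two Awk commands, the sketch below runs them on a small made-up file that adds a negative number and a zero to the kind of lines in the question. The file name caps.txt and its contents are assumptions for illustration only.

```shell
# Made-up test file: one positive, one negative, one non-numeric ending,
# one lowercase start, and one zero.
tmpdir=$(mktemp -d)
printf '%s\n' 'C line 1' 'B line -3' 'E line five' 'd line 4' 'A line 0' > "$tmpdir/caps.txt"

# $NF is the last whitespace-separated field; it must compare greater than 0.
matches=$(awk '/^[A-Z]/ && /[0-9]$/ && $NF > 0' "$tmpdir/caps.txt")

# Same filter, but prefix each match with the file name, as grep does for multiple files.
with_name=$(awk '/^[A-Z]/ && /[0-9]$/ && $NF > 0 { print FILENAME ":" $0 }' "$tmpdir/caps.txt")

printf '%s\n' "$matches"
printf '%s\n' "$with_name"

rm -r "$tmpdir"
```

Only "C line 1" should survive, since -3 and 0 both fail the $NF > 0 test.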
The regex ^[A-Z][0-9]$ matches exactly two characters, the first of which must be an uppercase letter, and the second one has to be a digit. If you want to permit arbitrary text between them, that would be ^[A-Z].*[0-9]$ (and for less arbitrary, use something a bit more specific than .* , like (.*[^-0-9])? perhaps, where you need grep -E for the parentheses and the question mark for optional, or backslashes before each of these in the BRE regex dialect you get out of the box with POSIX grep; note that \? in BRE is a GNU extension rather than part of POSIX).
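The sketch below puts the two-character regex next to the optional-group variant, in both ERE and BRE spelling. The input lines are invented, and two details are my own assumptions rather than part of the answer: I wrote [0-9]+ (or [0-9][0-9]* in BRE) so that multi-digit numbers still match, and the BRE form relies on GNU grep accepting \? as an extension.

```shell
# Made-up input covering the two-character case, free text, a negative number,
# and a multi-digit number.
lines=$(printf '%s\n' 'C1' 'C line 1' 'C -1' 'C 12')

# Exactly one uppercase letter followed by exactly one digit.
exact=$(printf '%s\n' "$lines" | grep '^[[:upper:]][0-9]$')

# ERE: optional group that must end in a char that is neither a digit nor "-".
ere=$(printf '%s\n' "$lines" | grep -E '^[[:upper:]](.*[^-0-9])?[0-9]+$')

# BRE: same idea with backslashed parentheses; \? is a GNU grep extension.
bre=$(printf '%s\n' "$lines" | grep '^[[:upper:]]\(.*[^-0-9]\)\?[0-9][0-9]*$')

printf 'exact:\n%s\n' "$exact"
printf 'ERE:\n%s\n' "$ere"
printf 'BRE:\n%s\n' "$bre"
```

The two-character regex should match only "C1", while both optional-group forms should match every line except "C -1".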