How can I use regex on the Linux command line to filter lines in a text file that start with a capital letter and end with a positive integer?

Question

I am trying to use a regex with the grep command in the Linux terminal to filter lines in a text file that start with a capital letter and end with a positive integer. Is there a way to modify my command so that it does this in one line with a single call to grep instead of two? I am using Windows Subsystem for Linux with Ubuntu from the Microsoft Store.

Text file:

C line 1
c line 2
B line 3
d line 4
E line five

The command that I have gotten to work:

grep ^[A-Z] cap*| grep [0-9]$ cap*

The output:

C line 1
B line 3

This works, but I feel like the two regex statements could be combined somehow. However,

grep ^[A-Z][0-9]$

does not yield the same result as the command above.

Answer 1

You need to use either of the following:

grep '^[A-Z].*[0-9]$'
grep '^[[:upper:]].*[0-9]$'

See the online demo. The regex matches:

^ - start of string (for grep, the start of each line)
[A-Z] / [[:upper:]] - an uppercase letter
.* - any zero or more chars (by contrast, [^0-9]* would match zero or more non-digit chars)
[0-9] - a digit
$ - end of string (for grep, the end of each line).
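
As a rough check of the breakdown above, the single-call pattern can be run against the sample file from the question. This is only a sketch: the file name cap.txt is a hypothetical stand-in for whatever the cap* glob matches, and LC_ALL=C is an added assumption so that [A-Z] means ASCII uppercase regardless of the locale in use.

```shell
# Scratch directory; cap.txt is a hypothetical name standing in for the cap* glob.
tmpdir=$(mktemp -d)
printf '%s\n' 'C line 1' 'c line 2' 'B line 3' 'd line 4' 'E line five' > "$tmpdir/cap.txt"

# One grep call: uppercase letter first, any text in between, a digit last.
# The pattern is quoted so the shell cannot expand it as a glob.
result=$(LC_ALL=C grep '^[A-Z].*[0-9]$' "$tmpdir/cap.txt")
printf '%s\n' "$result"

rm -r "$tmpdir"
```

On this sample it should print only the lines "C line 1" and "B line 3".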

Also, if you want to make sure there is no - before the number at the end of the string, you need to use a negated bracket expression, like

grep -E '^[[:upper:]].*[^-0-9][0-9]+$'

Here, the POSIX ERE regex (enabled by the -E option) matches

^[[:upper:]].* - an uppercase letter at the start and then any text,
[^-0-9] - any char other than a digit or -
[0-9]+ - one or more digits
$ - end of string.
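
To see how the negated bracket expression behaves, here is a small sketch on made-up input. The sample lines and the file name nums.txt are assumptions for illustration, not data from the question.

```shell
# Hypothetical sample: some lines end in a number preceded by a minus sign.
tmpdir=$(mktemp -d)
printf '%s\n' 'C line 1' 'B line -3' 'A 42' 'D x-7' 'E line 10' > "$tmpdir/nums.txt"

# [^-0-9] requires the character just before the trailing digits
# to be neither a digit nor a minus sign.
positive=$(grep -E '^[[:upper:]].*[^-0-9][0-9]+$' "$tmpdir/nums.txt")
printf '%s\n' "$positive"

rm -r "$tmpdir"
```

Note that this pattern needs at least one character between the leading letter and the digits, so a two-character line such as "Z9" would not match.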

Answer 2

When you use a pipeline, you want the second grep to act on standard input, not on the file you originally grepped from.

grep ^[A-Z] cap*| grep [0-9]$
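
The difference can be sketched on the sample file from the question. The file name cap.txt is hypothetical, and the quoting and LC_ALL=C are additions made here to keep the shell from globbing the patterns and to keep [A-Z] limited to ASCII uppercase.

```shell
tmpdir=$(mktemp -d)
printf '%s\n' 'C line 1' 'c line 2' 'B line 3' 'd line 4' 'E line five' > "$tmpdir/cap.txt"

# Second grep is given a file operand, so it reads the file and never reads the pipe.
ignores_pipe=$(LC_ALL=C grep '^[A-Z]' "$tmpdir/cap.txt" | LC_ALL=C grep '[0-9]$' "$tmpdir/cap.txt")

# Second grep has no file operand, so it filters the first grep's output.
uses_pipe=$(LC_ALL=C grep '^[A-Z]' "$tmpdir/cap.txt" | LC_ALL=C grep '[0-9]$')

printf 'with file operand:\n%s\n' "$ignores_pipe"
printf 'reading the pipe:\n%s\n' "$uses_pipe"

rm -r "$tmpdir"
```

With the file operand, the second grep returns every line of the file that ends in a digit, whatever its first character; reading the pipe, it keeps only lines that passed both filters.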

However, you need to expand the second regex if you want to exclude negative numbers. Anyway, a better solution altogether might be to switch to Awk:

awk '/^[A-Z]/ && /[0-9]$/ && $NF > 0' cap*

The output format will be slightly different from that of grep; if you want to include the name of the matching file, you have to specify that separately:

awk '/^[A-Z]/ && /[0-9]$/ && $NF > 0 { print FILENAME ":" $0 }' cap*
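
A minimal sketch of the Awk variant on made-up input follows. The sample lines and the file name cap.txt are assumptions, and LC_ALL=C is added so that [A-Z] means ASCII uppercase.

```shell
tmpdir=$(mktemp -d)
printf '%s\n' 'C line 1' 'c line 2' 'B line -3' 'd line 4' 'E line five' 'A total 42' > "$tmpdir/cap.txt"

# Run from inside the scratch directory so FILENAME is just "cap.txt".
# $NF is the last whitespace-separated field; "-3" compares numerically and fails $NF > 0.
matches=$(cd "$tmpdir" && LC_ALL=C awk '/^[A-Z]/ && /[0-9]$/ && $NF > 0 { print FILENAME ":" $0 }' cap*)
printf '%s\n' "$matches"

rm -r "$tmpdir"
```

One caveat worth keeping in mind: if the last field does not look like a number (for example "x-7"), Awk compares it with 0 as a string, so the $NF > 0 test is only a reliable sign check when the final field is purely numeric.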

The regex ^[A-Z][0-9]$ matches exactly two characters, the first of which must be an uppercase letter and the second a digit. If you want to permit arbitrary text between them, that would be ^[A-Z].*[0-9]$ (and for less arbitrary, use something a bit more specific than .*, like (.*[^-0-9])? perhaps, where you need grep -E for the parentheses and for the question mark that makes the group optional, or backslashes before each of these in the BRE regex dialect that grep uses out of the box; note that \? in BRE is a GNU extension rather than part of POSIX).
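
The following sketch compares the two-character pattern with the optional-group idea on made-up input. Two things here are assumptions rather than part of the answer: the trailing [0-9] is written as "one or more digits" so that multi-digit numbers match, and the BRE form uses the POSIX intervals \{0,1\} and \{1,\} in place of the GNU-only \? and \+.

```shell
tmpdir=$(mktemp -d)
printf '%s\n' 'Z9' 'C line 1' 'B line -3' 'A 42' 'c 5' > "$tmpdir/cap.txt"

# Exactly two characters: one uppercase letter followed by one digit.
two_chars=$(LC_ALL=C grep '^[A-Z][0-9]$' "$tmpdir/cap.txt")

# ERE: optional middle part that must not end in a digit or a minus sign.
ere=$(LC_ALL=C grep -E '^[A-Z](.*[^-0-9])?[0-9]+$' "$tmpdir/cap.txt")

# Same idea in BRE: escaped parentheses plus interval expressions.
bre=$(LC_ALL=C grep '^[A-Z]\(.*[^-0-9]\)\{0,1\}[0-9]\{1,\}$' "$tmpdir/cap.txt")

printf 'two chars:\n%s\n' "$two_chars"
printf 'ERE:\n%s\n' "$ere"
printf 'BRE:\n%s\n' "$bre"

rm -r "$tmpdir"
```

Because the group is optional, a line such as "Z9" still matches, while "B line -3" is rejected by the minus sign in front of the final digit.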

2 solutions

Solution 1
0 2021-12-20 23:45:52

Solution 2
0 2021-12-21 07:53:27
