How do I filter the lines of a text file that are 8 characters long and end in .com?

Question

I have a list of a million domain names in name.txt:

hello.com
abc.com
gogogo.us
goodbye.me
...
...

How do I send only the domain names that are exactly 8 characters long (including the .com) and end in .com to names_new.txt?

I'm looking for a simple command, not a script or anything more elaborate.

Answer 1

grep is the first tool to reach for when pattern matching:

egrep -x '[a-z]{4}\.com' name.txt > names_new.txt
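To see what -x contributes, here is a minimal sketch on a small made-up file (sample.txt and its contents are assumptions for illustration, not data from the question). It uses grep -E, which is the same as egrep; recent GNU grep releases print an obsolescence warning for the egrep name. The -x option requires the whole line to match, so no ^ or $ anchors are needed.

```shell
# Hypothetical sample data, one domain per line
printf '%s\n' hello.com abcd.com gogogo.us abcd.com.nz xabcd.com > sample.txt

# -E: extended regular expressions ({4} repetition)
# -x: the pattern must match the entire line
grep -Ex '[a-z]{4}\.com' sample.txt > sample_out.txt

cat sample_out.txt
```

Only abcd.com is kept: hello.com and xabcd.com are 9 characters long, and the other two lines do not end in .com.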

Answer 2

Try:

 egrep "^[a-z][a-z][a-z][a-z]\.com$" name.txt > names_new.txt
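The explicit ^ and $ anchors here should select the same lines as the -x option in Answer 1. A quick way to check that on a small made-up file (sample_cmp.txt and the output file names are assumptions for illustration) might look like this:

```shell
# Hypothetical sample data, including near misses on both sides
printf '%s\n' abcd.com hello.com xabcd.com abcd.comx wxyz.com > sample_cmp.txt

# Anchored form: four letters, a literal dot, then com, and nothing else
grep -E '^[a-z][a-z][a-z][a-z]\.com$' sample_cmp.txt > out_anchored.txt

# Whole-line form with a repetition count
grep -Ex '[a-z]{4}\.com' sample_cmp.txt > out_x.txt

# cmp exits with status 0 when the two files are byte-for-byte identical
cmp out_anchored.txt out_x.txt && echo "identical output"
```

Both commands keep abcd.com and wxyz.com and drop the lines with extra characters before or after.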

Answer 3

Use Awk. With . as the field separator, the domain name is split into fields.

The first field is tested for a length of 4, since .com accounts for the other 4 characters.

The second field should be com.

When both conditions are met, the line is printed.

awk -F. 'length($1)==4 && $2=="com"' name.txt > names_new.txt

Note: this command may produce false positives if the list contains names with additional labels, e.g. mail.com.nz
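One possible way to rule out the mail.com.nz case is to also require exactly two fields with NF. This is a sketch, not part of the original answer, and the file names sample_awk.txt and sample_awk_out.txt are made up for illustration:

```shell
# Hypothetical sample data: one false-positive candidate, one real match
printf '%s\n' mail.com.nz mail.com hello.com abcd.us > sample_awk.txt

# NF == 2 rejects any line with more than one dot-separated label after the name;
# a pattern with no action block prints the matching line by default
awk -F. 'NF == 2 && length($1) == 4 && $2 == "com"' sample_awk.txt > sample_awk_out.txt

cat sample_awk_out.txt
```

With the extra NF test, mail.com.nz (three fields) is dropped and only mail.com is printed.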

Answer 4

Domain names may contain dashes or digits.
-i makes egrep match regardless of case.

egrep -i "^[a-z0-9-]{4}\.com$" name.txt > names_new.txt
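A small sketch of how this pattern behaves on mixed input follows; the sample file and its contents are assumptions for illustration. It uses grep -E, which is equivalent to egrep. Note that -i applies to the whole pattern, so an upper-case .COM suffix is accepted as well.

```shell
# Hypothetical sample data with digits, a dash, upper case, and an underscore
printf '%s\n' AB-1.com a1b2.COM abcd.com abcde.com ab_1.com > sample_mixed.txt

# [a-z0-9-]: letters, digits, or a dash (the trailing - is literal)
# -i: ignore case for both the name and the .com suffix
grep -Ei '^[a-z0-9-]{4}\.com$' sample_mixed.txt > sample_mixed_out.txt

cat sample_mixed_out.txt
```

The lines AB-1.com, a1b2.COM and abcd.com are kept; abcde.com is too long and ab_1.com contains an underscore, which is outside the character class. The pattern would also accept a name that starts or ends with a dash, so it is a length filter rather than a validity check.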

Answer 1
4 2013-03-16 20:03:21

Answer 2
0 2013-03-16 20:00:54

Answer 3
0 2013-03-16 20:15:10

Answer 4
0 2013-03-20 03:11:22
