How to generate a list of unique lines in a text file using a Linux shell script?
Suppose I have a file that contains a bunch of lines, some repeating:
line1
line1
line1
line2
line3
line3
line3
What Linux command(s) should I use to generate a list of unique lines:
line1
line2
line3
Does this change if the file is unsorted, i.e. repeated lines may not appear in contiguous blocks?
If you don't mind the output being sorted, use
sort -u
This sorts the lines and removes duplicates in one step.
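A minimal sketch, assuming a sample file named lines.txt (the filename and contents are just for illustration):

```shell
# Sample file with unsorted, repeating lines.
printf 'line3\nline1\nline3\nline2\nline1\n' > lines.txt

# sort -u sorts the lines and keeps exactly one copy of each.
sort -u lines.txt
# line1
# line2
# line3
```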
Use cat to output the contents, piped to sort to sort them, piped to uniq to print out the unique values:

cat test1.txt | sort | uniq
You don't need the sort part if the file contents are already sorted.
Create a new sorted file containing only the unique lines:
sort -u file >> unique_file
Create a new file with the unique lines, preserving the original order (note that uniq only removes adjacent duplicates):
cat file | uniq >> unique_file
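A small sketch of this order-preserving variant (the filenames file and unique_file follow the command above; the sample data is hypothetical):

```shell
# Input whose duplicates are already grouped in adjacent blocks.
printf 'line1\nline1\nline2\nline3\nline3\n' > file

# uniq keeps the original order and collapses each adjacent run to one line.
uniq file > unique_file
cat unique_file
# line1
# line2
# line3
```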
If we do not care about the order, then the best solution is actually:
sort -u file
If we also want to ignore case, we can add -f (case is folded for the comparison only, so one line from each case-insensitive group of duplicates is kept, with its case unchanged):
sort -fu file
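To see the case folding in action, a sketch with a hypothetical file mixed_case.txt (which of the equal-ignoring-case variants survives may depend on the sort implementation, so only the line count is shown):

```shell
# 'Line1' and 'line1' compare equal under -f; 'LINE2' is distinct.
printf 'Line1\nline1\nLINE2\n' > mixed_case.txt

# Two lines remain: one from the Line1/line1 group, plus LINE2.
sort -fu mixed_case.txt | wc -l
```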
It would seem that an even better idea would be to use the command:
uniq file
and if we also want to ignore case (the first line of each run of duplicates is returned, with its case unchanged):
uniq -i file
However, this may return a completely different result than the sort command does, because uniq does not detect repeated lines unless they are adjacent.
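The adjacency caveat can be demonstrated with a hypothetical file scattered.txt whose duplicates are not next to each other:

```shell
# The two copies of line1 are separated by line2.
printf 'line1\nline2\nline1\n' > scattered.txt

# uniq alone misses the scattered duplicate: all 3 lines come through.
uniq scattered.txt

# Sorting first makes the duplicates adjacent, so only 2 lines remain.
sort scattered.txt | uniq
```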