
How to generate a list of unique lines in a text file using a Linux shell script?

Suppose I have a file that contains a bunch of lines, some of them repeated:

line1
line1
line1
line2
line3
line3
line3

What Linux command(s) should I use to generate a list of unique lines:

line1
line2
line3

Does this change if the file is unsorted, i.e. repeated lines may not be in blocks?

If you don't mind the output being sorted, use

sort -u

This sorts the lines and removes duplicates.
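As a quick check of this behaviour, the sketch below runs sort -u on the sample input from the question, once in its original grouped order and once shuffled. The scratch file names are made up for illustration, and LC_ALL=C is set only to keep the sort order predictable.

```shell
#!/bin/sh
# Hypothetical scratch files; the names are illustrative only.
tmpdir=$(mktemp -d)
printf '%s\n' line1 line1 line1 line2 line3 line3 line3 > "$tmpdir/grouped.txt"
printf '%s\n' line3 line1 line2 line3 line1 line3 line1 > "$tmpdir/shuffled.txt"

# sort -u sorts the input and keeps one copy of each distinct line.
grouped_out=$(LC_ALL=C sort -u "$tmpdir/grouped.txt")
shuffled_out=$(LC_ALL=C sort -u "$tmpdir/shuffled.txt")

printf '%s\n' "$grouped_out"
rm -rf "$tmpdir"
```

Both inputs should produce the same three lines, since the original order of the input does not matter to sort -u.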

cat outputs the contents, which are piped to sort to sort them, then piped to uniq to print the unique values:

cat test1.txt | sort | uniq

You don't need the sort step if the file contents are already sorted.
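One way to act on this is to let sort -c test whether the file is already in order and only sort when it is not. This is a minimal sketch, not something taken from the answer: unique_lines is a hypothetical helper name, and LC_ALL=C is an assumption made to keep the ordering rules simple.

```shell
#!/bin/sh
# unique_lines is a hypothetical helper name, not a standard command.
# It prints the unique lines of the file given as $1.
unique_lines() {
    if LC_ALL=C sort -c "$1" 2>/dev/null; then
        # Already sorted: identical lines are adjacent, so uniq alone is enough.
        uniq "$1"
    else
        # Not sorted: sort first so that duplicates become adjacent.
        LC_ALL=C sort "$1" | uniq
    fi
}

sample=$(mktemp)
printf '%s\n' line3 line1 line3 line2 line1 > "$sample"
unique_lines "$sample"
rm -f "$sample"
```

In practice a plain sort -u is usually simpler; the extra check is only likely to pay off for large files that are often already sorted.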

Create a new sorted file with unique lines:

sort -u file > unique_file

Create a new file without sorting (note that uniq removes only adjacent duplicates, so repeated lines that are not next to each other are kept):

uniq file > unique_file
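Because uniq only collapses adjacent duplicates, it does not fully de-duplicate an unsorted file. One common alternative, which is not part of the answers here, is an awk filter that prints a line only the first time it is seen, so the original order is kept. This is a minimal sketch, assuming the set of distinct lines fits in memory; dedupe_keep_order is a made-up helper name.

```shell
#!/bin/sh
# dedupe_keep_order is a hypothetical helper name.
# seen[$0]++ evaluates to 0 (false) the first time a line appears,
# so !seen[$0]++ is true only for first occurrences, which awk then prints.
dedupe_keep_order() {
    awk '!seen[$0]++' "$@"
}

printf '%s\n' line3 line1 line3 line2 line1 | dedupe_keep_order
```

The helper reads standard input when called without arguments, or the named files otherwise, for example dedupe_keep_order file > unique_file.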

If we do not care about the order, then the best solution is actually:

sort -u file

If we also want to ignore letter case, we can add the -f option (lowercase letters are folded to uppercase for the comparison only; the lines that are output keep their original case):

sort -fu file
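A small sketch of what the case-insensitive variant does to mixed-case input. The sample words are invented for illustration, and LC_ALL=C is an assumption made to keep the comparison rules predictable.

```shell
#!/bin/sh
# Case-insensitive de-duplication with sort -f -u.
# LC_ALL=C is an assumption made here to keep the comparison rules predictable.
folded=$(printf '%s\n' banana Apple apple | LC_ALL=C sort -f -u)
printf '%s\n' "$folded"

# Two lines remain; the surviving "apple" line keeps its original spelling
# and is not rewritten as APPLE.
printf '%s\n' "$folded" | wc -l
```

Which of the spellings survives may depend on the sort implementation, so it is safer not to rely on it.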

It might seem that an even better idea is to use the command:

uniq file

and, to ignore letter case as well (the first line of each run of duplicates is returned, with its case unchanged):

uniq -i file
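For illustration, here is uniq -i applied to a run of adjacent lines that differ only in case. The input is invented; the -i option is available in GNU and BSD uniq but is not required by POSIX, so treat its availability as an assumption.

```shell
#!/bin/sh
# uniq -i ignores case when comparing adjacent lines (GNU and BSD uniq;
# not required by POSIX, so other implementations may lack it).
ci_out=$(printf '%s\n' Line1 line1 LINE1 line2 | uniq -i)
printf '%s\n' "$ci_out"
```

With GNU uniq this prints Line1 followed by line2: the first line of the run is kept exactly as written.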

However, this may return a completely different result than the sort command, because uniq does not detect repeated lines unless they are adjacent.
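The difference is easy to see on input where no duplicates are adjacent. A minimal sketch with invented sample lines (LC_ALL=C is only there to make the sort order predictable):

```shell
#!/bin/sh
# The same unsorted input, passed through uniq alone and through sort -u.
input=$(printf '%s\n' line1 line2 line1 line3 line2)

# No duplicates are adjacent, so uniq removes nothing.
uniq_out=$(printf '%s\n' "$input" | uniq)

# sort -u groups equal lines first, so every duplicate is removed.
sort_out=$(printf '%s\n' "$input" | LC_ALL=C sort -u)

echo "uniq:    $(printf '%s\n' "$uniq_out" | wc -l) lines"
echo "sort -u: $(printf '%s\n' "$sort_out" | wc -l) lines"
```

Here uniq leaves all five lines in place, while sort -u reduces them to three.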

