Shell Script - How to merge two text files without repeating lines

My case is apparently easy, but I couldn't solve it in a simple way, and I need a simple solution because the real files are very large.

So, I have two txt files and I would like to generate a new file containing the contents of both without duplicating any lines. Something like this:

file1.txt

192.168.0.100
192.168.0.101
192.168.0.102

file2.txt

192.168.0.100
192.168.0.101
192.168.1.200
192.168.1.201

I would like to merge the files above and generate another one like this:

result.txt

192.168.0.100
192.168.0.101
192.168.0.102
192.168.1.200
192.168.1.201

Any simple suggestions? Thank you

There's a semi-standard idiom in awk for removing duplicates:

awk '!a[$0]++ {print}' file1.txt file2.txt

The array a counts occurrences of each line, and a line is printed only the first time it is seen (i.e., when a[$0] is 0 before it is incremented).

This is asymptotically faster than sorting the input (and preserves the input order), but requires more memory, since every distinct line is kept in the array.
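As a minimal sketch of the idiom applied to the sample files from the question: the function name merge_unique, the array name seen, and the scratch directory are illustrative assumptions, not part of the original answer. The {print} action is omitted because printing is awk's default action when a pattern is true.

```shell
#!/bin/sh
# Hypothetical helper (the name is ours): merge the given files, keeping
# only the first occurrence of each line, in input order.
merge_unique() {
    # seen[$0]++ evaluates to 0 the first time a line is met, so
    # !seen[$0]++ is true exactly once per distinct line.
    awk '!seen[$0]++' "$@"
}

# Recreate the sample files from the question in a scratch directory.
workdir=$(mktemp -d)
printf '%s\n' 192.168.0.100 192.168.0.101 192.168.0.102 > "$workdir/file1.txt"
printf '%s\n' 192.168.0.100 192.168.0.101 192.168.1.200 192.168.1.201 > "$workdir/file2.txt"

# Write the merged, de-duplicated lines to result.txt and show them.
merge_unique "$workdir/file1.txt" "$workdir/file2.txt" > "$workdir/result.txt"
cat "$workdir/result.txt"
```

Redirecting the output to a new file, rather than to one of the inputs, avoids truncating an input before awk has read it.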

If changing the order is not an issue:

sort -u file1.txt file2.txt > result.txt

First this sorts the lines of both files (in memory, or via temporary files for large inputs in implementations such as GNU sort), then it runs through them and outputs each unique line only once (the -u flag).
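To make the ordering difference between the two approaches concrete, here is a small sketch on the question's sample data. Listing file2.txt first, the scratch directory, the output file names, and the LC_ALL=C prefix (which forces byte-wise comparison so the result does not depend on the locale) are all assumptions added for illustration.

```shell
#!/bin/sh
# Recreate the sample files from the question in a scratch directory.
workdir=$(mktemp -d)
printf '%s\n' 192.168.0.100 192.168.0.101 192.168.0.102 > "$workdir/file1.txt"
printf '%s\n' 192.168.0.100 192.168.0.101 192.168.1.200 192.168.1.201 > "$workdir/file2.txt"

# file2.txt is deliberately listed first so the two orderings differ.
# sort -u: output is in sorted order, regardless of input order.
LC_ALL=C sort -u "$workdir/file2.txt" "$workdir/file1.txt" > "$workdir/sorted.txt"

# awk idiom: output keeps the order in which lines were first seen.
awk '!a[$0]++' "$workdir/file2.txt" "$workdir/file1.txt" > "$workdir/first_seen.txt"

echo "sort -u:"; cat "$workdir/sorted.txt"
echo "awk:";     cat "$workdir/first_seen.txt"
```

Note that this sort compares lines as text, not as IP addresses, so 192.168.0.9 would be placed after 192.168.0.100.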
