Shell script - How to merge two text files without duplicating lines

Question

My case looks easy, but I couldn't find a simple way to do it, and I need one because the real files are very large.

So, I have two txt files and I would like to generate a new file containing the content of both, without duplicating lines. Something like this:

file1.txt

192.168.0.100
192.168.0.101
192.168.0.102

file2.txt

192.168.0.100
192.168.0.101
192.168.1.200
192.168.1.201

I would like to merge the files above and generate another one like this:

result.txt

192.168.0.100
192.168.0.101
192.168.0.102
192.168.1.200
192.168.1.201

Any simple suggestions? Thank you

Answer 1

There's a semi-standard idiom in awk for removing duplicates:

awk '!a[$0]++ {print}' file1.txt file2.txt

The array a counts occurrences of each line, and a line is printed only the first time it is seen (i.e., when a[$0] is 0 before it is incremented).

This is asymptotically faster than sorting the input (and preserves the input order), but requires more memory, since every distinct line is kept in the array.
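As a rough sketch of the idiom applied to the sample data from the question: the scratch directory and the array name seen are illustrative choices, not part of the original answer, and the shorter form without {print} relies on awk's default action of printing the line.

```shell
# Work in a scratch directory so existing files are not touched (assumption).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Recreate the two sample files from the question.
printf '%s\n' 192.168.0.100 192.168.0.101 192.168.0.102 > file1.txt
printf '%s\n' 192.168.0.100 192.168.0.101 192.168.1.200 192.168.1.201 > file2.txt

# seen[$0]++ evaluates to 0 the first time a line appears, so the negation
# is true exactly once per distinct line. With no action block, awk prints
# the line whenever the pattern is true.
awk '!seen[$0]++' file1.txt file2.txt > result.txt

cat result.txt
```

With these inputs, result.txt should contain the five addresses in the order shown in the question, because each line is emitted at its first occurrence.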

Answer 2

If changing the order is not an issue:

sort -u file1.txt file2.txt > result.txt

First this sorts the lines of both files (in memory), then it runs through them and outputs each unique line only once (the -u flag).
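To make the ordering difference concrete, here is a small sketch that runs sort -u and the awk idiom from Answer 1 on the question's sample data, with file2.txt listed first. The scratch directory, the output file names sorted.txt and ordered.txt, and the use of LC_ALL=C (to get a locale-independent byte order) are assumptions added for illustration.

```shell
# Work in a scratch directory so existing files are not touched (assumption).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Recreate the two sample files from the question.
printf '%s\n' 192.168.0.100 192.168.0.101 192.168.0.102 > file1.txt
printf '%s\n' 192.168.0.100 192.168.0.101 192.168.1.200 192.168.1.201 > file2.txt

# sort -u: output is in sorted order, whatever order the files are given in.
# LC_ALL=C forces plain byte-wise comparison.
LC_ALL=C sort -u file2.txt file1.txt > sorted.txt

# awk idiom: output keeps the order in which lines were first seen,
# so 192.168.0.102 (only in file1.txt) ends up last here.
awk '!seen[$0]++' file2.txt file1.txt > ordered.txt

echo "sort -u:"; cat sorted.txt
echo "awk:";     cat ordered.txt
```

Note that the sort here is textual, not numeric, so for arbitrary IP addresses the sorted order may not match numeric address order.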

2 solutions

Solution 1
2 2020-04-22 11:42:08

Solution 2
1 Accepted 2020-04-22 11:24:13
