Compare 2 Unix files and output matching lines to a new file?

Question

I have 2 Unix files. All of the data is on one single line in each file. Each value is separated by a null character. Some of the values in the data match.

How would I parse this data into a new file listing only the matching values?

I figure I could use sed to change the null characters into newlines? From there on I'm not really sure...

Any ideas?

Answer 1

Use tr, sort and comm:

Convert nulls into newlines, and sort the result:

$ tr '\000' '\n' < file1 | sort > file1.txt
$ tr '\000' '\n' < file2 | sort > file2.txt

Then use comm to get the lines that are common to both files:

$ comm -1 -2 file1.txt file2.txt
<lines shown here are the common lines between file1.txt and file2.txt>
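
As a small worked example of the steps above, here is a sketch that also writes the result to a new file, as the question asks. The file names sample1, sample2 and common.txt and the fruit values are made up for illustration; they are not from the original post.

```shell
# Hypothetical sample data: NUL-separated values, all on one line.
printf 'apple\0cherry\0banana\0' > sample1
printf 'banana\0date\0apple\0' > sample2

# Convert NULs to newlines and sort, as in the answer above.
tr '\000' '\n' < sample1 | sort > sample1.txt
tr '\000' '\n' < sample2 | sort > sample2.txt

# -12 is shorthand for -1 -2: suppress the lines unique to each file,
# leaving only the lines found in both. Save them to a new file.
comm -12 sample1.txt sample2.txt > common.txt

cat common.txt
# prints:
# apple
# banana
```

comm expects both inputs to be sorted the same way, which is why the sort step comes first.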

Answer 2

If there are no duplicate values within file1 or file2, you can do this:

( tr '\0' '\n' < file1; tr '\0' '\n' < file2 ) | sort | uniq -c | grep -Ev '^ *1 '

This prints every value that occurs in both files, prefixed with its count.

If the order of the fields is important, you can do this:

comm -1 -2 <(tr '\0' '\n' < file1) <(tr '\0' '\n' < file2)

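This approach is not portable: it requires the process substitution feature of Bash. Note also that comm expects its inputs to be sorted, so it gives reliable results only when the values in each file are already in sorted order.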
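
Where Bash is not available, named pipes can play the role of process substitution in a plain POSIX shell. The following is only a sketch: the names sample1, sample2, pipe1, pipe2 and matches.txt are hypothetical, and a sort step is added on the assumption that the values are not already sorted.

```shell
# Hypothetical sample data: NUL-separated values, all on one line.
printf 'pear\0fig\0kiwi\0' > sample1
printf 'kiwi\0lime\0fig\0' > sample2

# Named pipes stand in for Bash's <( ... ) syntax.
mkfifo pipe1 pipe2

# Feed each pipe in the background; comm reads from both below.
tr '\000' '\n' < sample1 | sort > pipe1 &
tr '\000' '\n' < sample2 | sort > pipe2 &

# Keep only the lines common to both inputs.
comm -12 pipe1 pipe2 > matches.txt

# Wait for the background writers, then clean up the pipes.
wait
rm -f pipe1 pipe2

cat matches.txt
# prints:
# fig
# kiwi
```

Like process substitution, this avoids intermediate files holding the converted data, at the cost of a few extra lines.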

Answer 3

This might work for you:

parallel 'tr "\000" "\n" <{} | sort -u' ::: file{1,2} | sort | uniq -d
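
If GNU parallel is not installed, the same idea can be sketched with an ordinary loop. The file names and colour values below are invented for illustration. Each file is de-duplicated first with sort -u, so a value repeated inside a single file is not mistaken for a match; uniq -d then keeps only the values that appear in both files.

```shell
# Hypothetical sample data; note the repeats inside each file.
printf 'red\0red\0blue\0' > sample1
printf 'blue\0green\0green\0' > sample2

# De-duplicate each file on its own, merge the two streams,
# then print only the values that now occur more than once.
for f in sample1 sample2; do
    tr '\000' '\n' < "$f" | sort -u
done | sort | uniq -d > shared.txt

cat shared.txt
# prints:
# blue
```

Here red and green each occur twice, but only within one file, so they are correctly left out.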

Answer 1 (12, accepted): 2012-01-04 04:58:11
Answer 2 (5): 2012-01-04 05:14:00
Answer 3 (1): 2012-02-11 22:21:33
