Show data differences between two CSV files using bash or awk

Question

I need your advice on comparing two CSV files in bash:

file1.csv

300000493|300000323|300000323|300000000|2|0|12619|0|0|+000000000000043.000|15|0|49300|1|42|4
300315830|300315830|300000419|300000000|2|0|12619|0|0|+000000000004020.000|18|0|31583000|89|43|4
300000493|300000323|300000323|300000000|10|0|12619|0|0|+000000000000210.000|14|0|49300|1|43|4
300000493|300000323|300000323|300000000|16|0|12619|0|0|+000000000000014.000|16|0|49300|89|42|4
300146897|300146897|300000394|300000000|609|1|12619|0|0|+000000000000020.000|1|0|14689700|7|36|4

file2.csv

300000493|300000323|300000323|300000000|2|0|12619|0|0|+000000000000053.000|1|0|49300|1|42|4
300315830|300315830|300000419|300000000|2|0|12619|0|0|+000000000004020.000|18|0|49300|89|43|4
300000493|300000323|300000323|300000000|10|0|12619|0|0|+000000000000219.000|14|0|49300|1|43|5

The diff -y file1.csv file2.csv command shows output similar to what I'm looking for:

300000493|300000323|300000323|300000000|2|0|12619|0|0|+000000000000043.000|15|0|49300|1|42|4       |    300000493|300000323|300000323|300000000|2|0|12619|0|0|+000000000000053.000|1|0|49300|1|42|4
300315830|300315830|300000419|300000000|2|0|12619|0|0|+000000000004020.000|18|0|31583000|89|43|4   |    300315830|300315830|300000419|300000000|2|0|12619|0|0|+000000000004020.000|18|0|49300|89|43|4
300000493|300000323|300000323|300000000|10|0|12619|0|0|+000000000000210.000|14|0|49300|1|43|4      |    300000493|300000323|300000323|300000000|10|0|12619|0|0|+000000000000219.000|14|0|49300|1|43|5
300000493|300000323|300000323|300000000|16|0|12619|0|0|+000000000000014.000|16|0|49300|89|42|4     <
300146897|300146897|300000394|300000000|609|1|12619|0|0|+000000000000020.000|1|0|14689700|7|36|4   <

However, I'm trying to get a more advanced output that marks the differences between cells with an asterisk (*) and, if a whole row does not exist on one side, fills that row with dashes (-). Finally, I want to create one output file per side (because afterwards I'm going to convert each output CSV to HTML in order to embed them in an HTML file), something like:

file1.out.csv

300000493|300000323|300000323|300000000|2|0|12619|0|0|+000000000000043.000*|15|0|49300|1|42|4
300315830|300315830|300000419|300000000|2|0|12619|0|0|+000000000004020.000|18|0|31583000*|89|43|4
300000493|300000323|300000323|300000000|10|0|12619|0|0|+000000000000210.000*|14|0|49300|1|43|4*
300000493|300000323|300000323|300000000|16|0|12619|0|0|+000000000000014.000|16|0|49300|89|42|4
300146897|300146897|300000394|300000000|609|1|12619|0|0|+000000000000020.000|1|0|14689700|7|36|4

file2.out.csv

300000493|300000323|300000323|300000000|2|0|12619|0|0|+000000000000053.000*|1|0|49300|1|42|4
300315830|300315830|300000419|300000000|2|0|12619|0|0|+000000000004020.000|18|0|49300*|89|43|4
300000493|300000323|300000323|300000000|10|0|12619|0|0|+000000000000219.000*|14|0|49300|1|43|5*
-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-
-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-

Hopefully you can help me here. Thanks!

Answer 1

I think a possible solution would be to use:

paste -d '\n' file1.csv file2.csv > pasted.csv

And then read the output file to generate what I need.
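The paste idea above can be sketched as follows. This is only one possible approach, not a tested solution from the original poster: the function name csv_cell_diff and its four arguments are made up for illustration, and it assumes that rows are paired purely by line position (as paste does) and that an empty line means the row is missing on that side. Note that it compares cells as strings, so with the sample data the 11th field of the first row (15 vs 1) would be marked as well, although the hand-written expected output in the question does not mark it.

```shell
# Hypothetical helper: csv_cell_diff LEFT RIGHT LEFT_OUT RIGHT_OUT
csv_cell_diff() {
    # paste interleaves the inputs: odd lines from LEFT, even lines from RIGHT.
    # Once one file runs out, paste emits an empty line in its place.
    paste -d '\n' "$1" "$2" | awk -v out1="$3" -v out2="$4" '
        # Return a placeholder row of n dash cells (at least one).
        function dashes(n,    i, s) {
            s = "-"
            for (i = 2; i <= n; i++) s = s "|-"
            return s
        }
        NR % 2 == 1 { left = $0; next }    # remember the LEFT row
        {
            right = $0
            nl = split(left, a, "|")
            nr = split(right, b, "|")
            # Assumption: an empty line means the row is missing on that side.
            if (left == "")  { print dashes(nr) > out1; print right > out2; next }
            if (right == "") { print left > out1; print dashes(nl) > out2; next }
            n = (nl > nr) ? nl : nr
            l = r = ""
            for (i = 1; i <= n; i++) {
                # Appending "" forces a string comparison, so 1 and 1.0 differ.
                mark = ((a[i] "") != (b[i] "")) ? "*" : ""
                if (i <= nl) l = l (i > 1 ? "|" : "") a[i] mark
                if (i <= nr) r = r (i > 1 ? "|" : "") b[i] mark
            }
            print l > out1
            print r > out2
        }
    '
}
```

It could then be called as csv_cell_diff file1.csv file2.csv file1.out.csv file2.out.csv. Because pairing is positional, a row inserted in the middle of one file would shift every later comparison; matching rows by a key column would need a different approach.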

Solution 1
0 2018-07-11 18:32:47