Comparing two large files line by line

Question

I was wondering if there is an efficient way to compare two large files line by line.

File 1

2
3
2

File 2

2 | haha
3 | hoho
4 | hehe

I am just taking the first character of each line and comparing them. Currently I am using a very naive method: iterating through both files in a double for loop.

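Something like this:

counter = 0
for line1 in open('file1.txt'):
    # Re-scan file 2 from the top until we reach the line at index `counter`
    for line_number, line2 in enumerate(open('file2.txt')):
        if line_number == counter:
            same = line1[0] == line2[0]  # compare the first characters
            break
    counter += 1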

Reading both files into memory is not an option. I am using Python on Linux, but I am open to both bash and Python solutions.

Answer 1

What about something like this:

diff <(cut -c 1 file1.txt) <(cut -c 1 file2.txt)

diff is the tool for comparing files line by line. Process substitution (an anonymous pipe, a bash feature) lets you hand diff a version of each file that contains only the first character of every line, produced here by cut -c 1.

Answer 2

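You could zip the two file objects and iterate over them together. In Python 3, zip is lazy, so neither file is read into memory in full (in Python 2, use itertools.izip instead).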

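flag = True

# zip pulls one line from each file at a time and stops at the end of the shorter file.
with open('File 1') as f1, open('File 2') as f2:
    for file1_line, file2_line in zip(f1, f2):
        # Compare only the first character of each line
        if file1_line[0] != file2_line[0]:
            flag = False
            break

print(flag)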
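
If you need to know which lines differ rather than just whether any do, the same idea can be extended. The following is only a sketch, not part of the original answer: the function name first_char_mismatches and the file names are made up for illustration, and it assumes that a file running out of lines early should count as a mismatch. It uses itertools.zip_longest so the extra lines of the longer file are not silently ignored.

```python
from itertools import zip_longest


def first_char_mismatches(path1, path2):
    """Yield (line_number, char1, char2) for each line whose first character differs.

    A character is None when that file has already ended.
    Hypothetical helper; only one line per file is held in memory at a time.
    """
    with open(path1) as f1, open(path2) as f2:
        pairs = zip_longest(f1, f2)  # pads the shorter file with None
        for line_number, (line1, line2) in enumerate(pairs, start=1):
            # Slicing with [:1] never raises, even on an empty string
            c1 = line1[:1] if line1 is not None else None
            c2 = line2[:1] if line2 is not None else None
            if c1 != c2:
                yield line_number, c1, c2
```

With the two sample files from the question, list(first_char_mismatches('file1.txt', 'file2.txt')) gives [(3, '2', '4')], since only the third line differs.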


solution1
2 ACCEPTED 2015-09-07 06:31:55

solution2
0 2015-09-07 06:42:48
