I was wondering if there is any efficient way to compare 2 large files line by line.
File 1
2
3
2
File 2
2 | haha
3 | hoho
4 | hehe
I am only taking the first character of each line in both files and comparing them. Currently I am using a very naive method: iterating through the files in a nested for loop.
Like:
for i in file 1:
    line number = 0
    for j in file 2:
        loop until line number == counter, else add 1 to line number
        compare line 1
    increase counter
Reading both files into memory is not an option. I am using Python on Linux, but I am open to both bash and Python solutions.
What about something like this:
diff <(cut -c 1 file1.txt) <(cut -c 1 file2.txt)
diff is the tool you use to compare files line by line. You can use process substitution (anonymous pipes) to compare a version of each file that contains only the first character of each line (using cut).
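You could zip the two files and iterate over them together. In Python 3, zip is lazy, so only one line from each file is held in memory at a time (in Python 2, use itertools.izip instead, because zip builds the full list first).
    flag = True
    # Iterating over a file object reads it one line at a time.
    with open('File 1') as f1, open('File 2') as f2:
        for file1_line, file2_line in zip(f1, f2):
            # Compare only the first character of each line.
            if file1_line[0] != file2_line[0]:
                flag = False
                break
    print(flag)
Note that zip stops at the end of the shorter file, so extra lines in the longer file are not compared.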
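If you also want to know where the files diverge, or to treat files of different lengths as a mismatch, a variation on the zip approach might look like the sketch below. The function name first_mismatch and its parameters are made up for illustration, and it assumes that a file ending early should count as a difference.

```python
from itertools import zip_longest


def first_mismatch(path1, path2):
    """Return the 1-based number of the first line whose first character
    differs between the two files, or None if every line matches.

    Assumption: a file that ends early counts as a mismatch at the
    first line that is missing from it.
    """
    with open(path1) as f1, open(path2) as f2:
        # zip_longest is lazy, so only one line per file is in memory.
        pairs = zip_longest(f1, f2, fillvalue=None)
        for line_number, (line1, line2) in enumerate(pairs, start=1):
            if line1 is None or line2 is None:
                return line_number  # one file ran out of lines
            if line1[0] != line2[0]:
                return line_number
    return None
```

With the two sample files from the question, this should return 3, since the third lines start with 2 and 4 respectively.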