The full.txt contains:
www.example.com/a.jpg
www.example.com/b.jpg
www.example.com/k.jpg
www.example.com/n.jpg
www.example.com/x.jpg
The partial.txt contains:
a.jpg
k.jpg
Why does the following code not produce the desired result?
with open('full.txt', 'r') as infile:
    lines_full = [line for line in infile]
with open('partial.txt', 'r') as infile:
    lines_partial = [line for line in infile]
with open('remaining.txt', 'w') as outfile:
    for element in lines_full:
        if element[16:21] not in lines_partial:  # element[16:21] is the file name, e.g. a.jpg
            outfile.write(element)
The desired remaining.txt should contain the entries of full.txt whose file names are not listed in partial.txt, exactly as follows:
www.example.com/b.jpg
www.example.com/n.jpg
www.example.com/x.jpg
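Iterating over a file keeps the newline character at the end of each line, so the entries of lines_partial are "a.jpg\n" and "k.jpg\n". The slice element[16:21] is "a.jpg" or "k.jpg" without the newline, so it never matches them exactly. The problem is in this part: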
with open('partial.txt', 'r') as infile:
    lines_partial = [line for line in infile]
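Change it to
with open('partial.txt', 'r') as infile:
    lines_partial = [line.rstrip('\n') for line in infile]
to get rid of the newline characters. (line[:-1], which means "without the last character of the line", also works here, but it removes the last character unconditionally, so it would truncate a final line that has no trailing newline.)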
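A minimal sketch of the mismatch and the fix, using io.StringIO objects as in-memory stand-ins for full.txt and partial.txt so it runs without any files on disk. The variable names raw_partial, clean_partial and remaining are only illustrative; the contents are the ones from the question.

```python
import io

# In-memory stand-ins for full.txt and partial.txt (contents from the question).
full_text = (
    "www.example.com/a.jpg\n"
    "www.example.com/b.jpg\n"
    "www.example.com/k.jpg\n"
    "www.example.com/n.jpg\n"
    "www.example.com/x.jpg\n"
)
partial_text = "a.jpg\nk.jpg\n"

# Iterating over a file-like object yields lines with the newline still attached.
raw_partial = [line for line in io.StringIO(partial_text)]
print(raw_partial)             # ['a.jpg\n', 'k.jpg\n']
print("a.jpg" in raw_partial)  # False, because 'a.jpg' != 'a.jpg\n'

# Stripping the newline makes the membership test behave as intended.
clean_partial = [line.rstrip("\n") for line in io.StringIO(partial_text)]
remaining = [
    line for line in io.StringIO(full_text)
    if line[16:21] not in clean_partial  # line[16:21] is the file name, e.g. 'a.jpg'
]
print("".join(remaining), end="")
```

The slice [16:21] only works because every URL here has the same prefix length and a five-character file name; it is kept to stay close to the original code.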
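Alternatively, you can use the os.path module to extract the file name instead of relying on fixed slice positions:
from os import path

with open('full.txt', 'r') as f:
    lines_full = f.read().splitlines()  # splitlines() drops the newline characters
with open('partial.txt', 'r') as f:
    lines_partial = set(f.read().splitlines())  # create set for faster checking
# path.split(x)[1] is the part after the last '/', e.g. 'a.jpg'
lines_new = [x + '\n' for x in lines_full if path.split(x)[1] not in lines_partial]
with open('remaining.txt', 'w') as f:
    f.writelines(lines_new)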
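Since the entries are URLs rather than file system paths, the same idea can also be sketched with plain string splitting. The helper name filter_remaining below is hypothetical (it does not appear in the answers above), and it assumes the file name is everything after the last "/".

```python
def filter_remaining(full_lines, partial_lines):
    """Return entries of full_lines whose file name is not listed in partial_lines.

    Hypothetical helper for illustration; assumes the file name is the text
    after the last '/' in each entry.
    """
    # Strip whitespace and newlines, and ignore blank lines in the partial list.
    excluded = {name.strip() for name in partial_lines if name.strip()}
    result = []
    for line in full_lines:
        url = line.strip()
        if not url:
            continue  # skip blank lines
        filename = url.rsplit("/", 1)[-1]
        if filename not in excluded:
            result.append(url)
    return result


full = [
    "www.example.com/a.jpg\n",
    "www.example.com/b.jpg\n",
    "www.example.com/k.jpg\n",
    "www.example.com/n.jpg\n",
    "www.example.com/x.jpg\n",
]
partial = ["a.jpg\n", "k.jpg"]  # last line deliberately has no trailing newline

result = filter_remaining(full, partial)
print("\n".join(result))
```

With real files, the two lists could come from iterating over the opened full.txt and partial.txt, and the result could be written to remaining.txt with one entry per line.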