I have two text files, file1 and file2. File1 contains a bunch of random words, and file2 contains words that I want to remove from file1 when they occur. Is there a way of doing this?
I know I probably should include my own attempt at a script, to at least show effort, but to be honest it's laughable and wouldn't be of any help.
If someone could at least give a tip about where to start, it would be greatly appreciated.
Get the words from each file:
with open("/path/to/file1", "r") as f1:
    file1_raw = f1.read()
with open("/path/to/file2", "r") as f2:
    file2_raw = f2.read()
file1_words = file1_raw.split()
file2_words = file2_raw.split()
If you want the unique words from file1 that aren't in file2:
result = set(file1_words).difference(file2_words)
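If you care about removing the words from the text of file1 itself:
for w in file2_words:
    file1_raw = file1_raw.replace(w, "")
Note that str.replace() matches substrings, so this also removes a listed word wherever it appears inside a longer word.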
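A whole-word variant of the loop above can be sketched with a regular expression, so that removing "the" leaves "other" intact. This is only a minimal sketch: the function name remove_words is hypothetical, and it assumes the words are made of ordinary word characters (letters, digits, underscore) and that collapsing all leftover whitespace, including newlines, into single spaces is acceptable.

```python
import re

def remove_words(text, words):
    """Remove whole-word occurrences of each word in `words` from `text`."""
    if not words:
        return text
    # Escape each word so any special characters are matched literally,
    # and wrap the alternation in \b so only whole words match.
    pattern = re.compile(r"\b(?:" + "|".join(re.escape(w) for w in words) + r")\b")
    stripped = pattern.sub("", text)
    # Collapse the runs of whitespace left behind by the removed words.
    return " ".join(stripped.split())

print(remove_words("the other brother saw the fox", ["the", "fox"]))
```

With the variables from the answer above, the call would be remove_words(file1_raw, file2_words).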
If you read the words into a set (one for each file), you can use set.difference(). This works if you don't care about the order of the output.
If you care about the order, read the first file into a list, the second into a set, and remove all the elements in the list that are in the set.
a = ["a", "quick", "brown", "fox", "jumped", "over", "the", "lazy", "dog"]
b = {"quick", "brown"}
c = [x for x in a if x not in b]
print(c)
This prints: ['a', 'fox', 'jumped', 'over', 'the', 'lazy', 'dog']
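Applied to actual files, the list-plus-set approach might look like the sketch below. It assumes whitespace-separated words and exact, case-sensitive matching; the helper name filter_words and the throwaway demo files are illustrative stand-ins for file1 and file2, not part of the original question.

```python
import os
import tempfile

def filter_words(source_path, exclude_path):
    """Return the words of source_path, in order, minus those in exclude_path."""
    with open(exclude_path, "r") as f:
        exclude = set(f.read().split())   # a set gives fast membership tests
    with open(source_path, "r") as f:
        words = f.read().split()          # a list keeps the original order
    return [w for w in words if w not in exclude]

# Demo with temporary files standing in for file1 and file2.
tmpdir = tempfile.mkdtemp()
path1 = os.path.join(tmpdir, "file1")
path2 = os.path.join(tmpdir, "file2")
with open(path1, "w") as f:
    f.write("a quick brown fox jumped over the lazy dog")
with open(path2, "w") as f:
    f.write("quick\nbrown\n")

kept = filter_words(path1, path2)
print(" ".join(kept))
```

Unlike the set.difference() version, this keeps duplicates and the original word order of the first file.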