Hi, I am trying to build a filter for a Pirate Bay movie RSS feed that drops the movies I already have and keeps the ones I don't. Later on it will download the torrent from the magnet link provided. The problem is that I can't figure out how to separate the movies I have from the ones I don't: I am trying to check a list against a string and do not know a way around it. Here is a runnable example, with the code I want to add left in comments:
import feedparser
import ssl
if hasattr(ssl, '_create_unverified_context'):
    ssl._create_default_https_context = ssl._create_unverified_context
feed = feedparser.parse('https://thepiratebay.org/rss/top100/207')
feed_title = feed['feed']['title']
feed_entries = feed.entries
f = open("movies.txt", "r+")
fr = f.readlines()
print(fr)
for entry in feed.entries[:25]:
    el = entry.title.lower()
    # if fr in el:
    #     remove_from_titles()
    # else:
    article_title = el
    article_link = entry.link
    print(article_title)
    print(article_link)
movies.txt file:
aquaman
spiderman
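Can you try the following:
with open("movies.txt", "r") as f:
    # readlines() keeps the trailing '\n', so strip each line before comparing
    movies_list = [line.strip().lower() for line in f if line.strip()]
for entry in feed.entries[:25]:
    article_title = entry.title.lower()
    if article_title not in movies_list:
        print(article_title)
        # do your downloading stuff here
        # update your movies.txt file
        with open("movies.txt", "a") as f:
            # write the variable, not the literal string 'article_title'
            f.write('\n' + article_title)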
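One caveat: the `not in movies_list` check above only matches when a feed title equals a line of movies.txt exactly. Feed titles usually carry extra text such as year and quality, so a substring test may be closer to what the question asks for. Here is a minimal sketch of that idea; the function name `filter_new_titles` and the sample titles are made up for illustration and are not taken from the real feed.

```python
def filter_new_titles(feed_titles, owned_movies):
    """Keep only the feed titles that contain none of the owned movie names.

    Matching is a case-insensitive substring test, so "aquaman" matches
    "Aquaman.2018.1080p". Blank entries in owned_movies are ignored.
    """
    owned = [m.strip().lower() for m in owned_movies if m.strip()]
    return [
        title
        for title in feed_titles
        if not any(movie in title.lower() for movie in owned)
    ]


# Hypothetical sample data standing in for entry.title values and movies.txt lines.
sample_titles = ["Aquaman.2018.1080p", "Glass.2019.720p"]
sample_owned = ["aquaman\n", "spiderman\n", "\n"]
print(filter_new_titles(sample_titles, sample_owned))
```

A substring test can over-match (a short name such as "it" would hit many titles), so the names in movies.txt should be specific enough to avoid that.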
Try using a set instead of a list. If the feed titles are set A and the titles in the file are set B, then the titles in A that are not in B are A.difference(B).
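The set approach could look roughly like this. It is only a sketch: the helper name `titles_not_owned` is hypothetical, and it assumes both sides have already been reduced to plain movie names, because `difference` compares whole strings and will not match a name against a longer feed title.

```python
def titles_not_owned(feed_titles, file_titles):
    """Return the set of feed titles (A) that are absent from the file (B).

    Both inputs are lowercased and stripped first so that trailing
    newlines from readlines() and letter case do not prevent a match.
    """
    a = {t.strip().lower() for t in feed_titles if t.strip()}
    b = {t.strip().lower() for t in file_titles if t.strip()}
    return a.difference(b)


# Hypothetical, already-normalized names; real feed titles would need cleaning first.
print(titles_not_owned(["Aquaman", "Glass"], ["aquaman\n", "spiderman\n"]))
```

Sets do not keep the feed's order, so sort the result or use a list comprehension if the order of entries matters.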