I'm using PRAW, the Python Reddit API Wrapper, to collect post titles from a certain subreddit. I only want to take a title if the URL the post links to has the top-level domain .tv.
How can I extract the URLs whose top-level domain is .tv and append them (and their titles) to their own lists?
import praw

reddit = praw.Reddit(client_id='', client_secret='', user_agent='')
hot_p = reddit.subreddit('music').top(time_filter='week')

raw_titles = []
raw_url = []
for post in hot_p:
    # if post.url ends in .tv...
    raw_titles.append(post.title)
    raw_url.append(post.url)
I will assume that the URL can be http://ab.tv/etc or even http://ab.tv:80/etc, so any port has to be stripped from the host before comparing:
from urllib.parse import urlparse

for post in hot_p:
    o = urlparse(post.url)
    # netloc is e.g. 'ab.tv:80': take the last dot-separated label, then drop any ':port'
    top_level_domain = o.netloc.split('.')[-1].split(':')[0]
    if top_level_domain == 'tv':
        raw_titles.append(post.title)
        raw_url.append(post.url)
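The same check can also be pulled into a small helper so it can be tried without a Reddit connection. This is only a sketch: the name has_tld and the sample URLs are made up for illustration, and it assumes that urlparse(...).hostname (which lowercases the host and drops any port or user info) is an acceptable substitute for splitting netloc by hand.

```python
from urllib.parse import urlparse


def has_tld(url, tld="tv"):
    """Return True if the URL's host name ends in the given top-level domain.

    has_tld is a hypothetical helper name, not part of PRAW or urllib.
    """
    # .hostname is lowercased and excludes port and credentials; None if no host.
    host = urlparse(url).hostname
    if not host:
        return False
    # Drop a trailing dot (fully qualified form), then compare the last label.
    return host.rstrip(".").rsplit(".", 1)[-1] == tld.lower()


# Made-up sample URLs standing in for post.url values.
sample_urls = [
    "http://ab.tv/etc",
    "http://ab.tv:80/etc",
    "https://user@Twitch.TV/clip",
    "https://example.com/video.tv",  # ".tv" is in the path, not the host
    "not a url",
]
tv_urls = [u for u in sample_urls if has_tld(u)]
```

Inside the PRAW loop, the condition would then read "if has_tld(post.url):" in place of the manual netloc split.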