I currently have a list of URLs, for example:
list = ['https://finance.yahoo.com/', 'https://query1.finance.yahoo.com/', 'https://ad.doubleclick.net/ddm/trackclk/']
I want to isolate the "query1.finance" URL and delete the others. I would like to be able to do this across different lists with different elements, using only the criterion that a URL containing the text "query1" be kept in each list.
Is there an easy way to do this? I am using a selenium driver to pull hrefs off of websites, and the hrefs are all imported as URLs, but I only want one of the hrefs for my use.
If the only condition is that the URL contains the string 'query1', then the following code will work:
url_list = [
    'https://finance.yahoo.com/',
    'https://query1.finance.yahoo.com/',
    'https://ad.doubleclick.net/ddm/trackclk/'
]
filtered_list = [url for url in url_list if 'query1' in url]
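One caveat with a plain substring test is that 'query1' could also appear in a path or query string of an unrelated URL. If that matters for your lists, here is a minimal sketch that checks only the hostname, using the standard library's urllib.parse. The helper name keep_query1_hosts and the extra "?next=query1" on the last URL are my own, added for illustration.

```python
from urllib.parse import urlsplit

def keep_query1_hosts(urls, marker="query1"):
    """Return the URLs whose hostname contains `marker` (hypothetical helper)."""
    kept = []
    for url in urls:
        # hostname is None when the string has no network location
        host = urlsplit(url).hostname or ""
        if marker in host:
            kept.append(url)
    return kept

url_list = [
    'https://finance.yahoo.com/',
    'https://query1.finance.yahoo.com/',
    # illustrative URL: 'query1' appears only in the query string
    'https://ad.doubleclick.net/ddm/trackclk/?next=query1',
]
print(keep_query1_hosts(url_list))
```

With this check the last URL is dropped even though the text 'query1' occurs in it, because the match is made against the host part only.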
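You could simply use a for loop and check whether 'query1.' is a substring of each URL. If it is not, remove that URL from the list. Loop over a copy of the list (list[:]), because removing items from a list while iterating over it directly makes the loop skip elements.
for i in list[:]:
    # find() returns -1 when 'query1.' does not occur in the string
    if i.find('query1.') == -1:
        list.remove(i)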
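If the list really has to be modified in place (for example because other variables refer to the same list object), a sketch that avoids calling remove() inside a loop is to assign the filtered result back through a slice. The names urls and same_object are just for illustration.

```python
urls = [
    'https://finance.yahoo.com/',
    'https://ad.doubleclick.net/ddm/trackclk/',
    'https://query1.finance.yahoo.com/',
]
same_object = urls  # a second reference to the same list

# Slice assignment replaces the contents but keeps the same list object,
# so every reference to it sees the filtered result.
urls[:] = [url for url in urls if 'query1.' in url]

print(urls)
print(same_object is urls)
```

Because the filtered list is built first and only then written back, no element is skipped, whatever order the URLs arrive in.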
The code below does the trick and returns a new list.
def filterLinks(lyst):
    final_list = []
    # keep every URL that contains the text 'query1'
    for url in lyst:
        if 'query1' in url:
            final_list.append(url)
    return final_list
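Since the question says only one of the hrefs is actually needed, it may be handier to get that single URL as a string rather than a one-element list. A minimal sketch, assuming the first match is the one you want; first_match is a hypothetical helper name, and it returns None when nothing matches.

```python
def first_match(urls, marker='query1'):
    """Return the first URL containing `marker`, or None if there is none."""
    # next() stops at the first hit, so the rest of the list is not scanned
    return next((url for url in urls if marker in url), None)

hrefs = [
    'https://finance.yahoo.com/',
    'https://query1.finance.yahoo.com/',
    'https://ad.doubleclick.net/ddm/trackclk/',
]
print(first_match(hrefs))
```

Checking the result against None before using it covers pages where no such href was found.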