How can I make a list comprehension have "or"?

I am trying to get the list of links from a Google search:

import requests
from lxml import html

def google_word(word):
    headers = {'User-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/64.0.3282.140 Safari/537.36 Edge/18.17763'}
    url = 'https://google.com/search?q={}'.format(word)
    res = requests.get(url, headers=headers)
    tree = html.fromstring(res.text)
    li = tree.xpath("//a[@href]")  # all <a> elements that have an href attribute
    y = [link.get('href') for link in li if link.get('href').startswith("https://") if "google" not in link.get('href')]
    return y

This code correctly collects the links that start with "https://". I also want to include links that start with "http://". What do I need to add to the list comprehension to make that work? (I am trying to do it in one line.)

Pass a tuple to startswith:

y = [link.get('href') for link in li if link.get('href').startswith(("https://", "http://")) if "google" not in link.get('href')]
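As a quick check of the tuple form, here is a minimal sketch that runs without any network access. The hrefs list and the external_links helper are made up for illustration; they stand in for the link.get('href') values from the question.

```python
# Hypothetical sample values standing in for link.get('href') results.
hrefs = [
    "https://example.com/a",
    "http://example.org/b",
    "/search?q=python",
    "https://www.google.com/preferences",
    "ftp://example.net/c",
]

def external_links(hrefs):
    """Keep http/https links that do not mention google."""
    return [
        h for h in hrefs
        # str.startswith accepts a tuple and returns True if any prefix matches.
        if h.startswith(("https://", "http://"))
        if "google" not in h
    ]

result = external_links(hrefs)
print(result)
```

Only the two example.* links with an http or https scheme should survive; the relative link, the google link, and the ftp link are filtered out.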

This line:

y = [link.get('href') for link in li if link.get('href').startswith("https://") if "google" not in link.get('href')]

should be written like this instead:

y = [link.get('href') for link in li if link.get('href').startswith(("https://", "http://"))]
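Note that the line above no longer contains the "google" filter from the question. If you prefer a literal or in the condition and want to keep that filter, the following sketch shows one way to write it. The sample list and the variable names are assumptions for illustration only. The parentheses matter, because and binds more tightly than or.

```python
# Hypothetical sample values standing in for link.get('href') results.
sample = [
    "https://example.com/a",
    "http://example.org/b",
    "https://www.google.com/preferences",
    "/relative/path",
]

# Correct: the two prefix checks are grouped, then combined with the filter.
grouped = [
    h for h in sample
    if (h.startswith("https://") or h.startswith("http://")) and "google" not in h
]

# Pitfall: without parentheses this parses as
#   startswith("https://") or (startswith("http://") and "google" not in h)
# so an https link containing "google" slips through.
ungrouped = [
    h for h in sample
    if h.startswith("https://") or h.startswith("http://") and "google" not in h
]

print(grouped)
print(ungrouped)
```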

You can use a regex to do this (this needs import re). Here's how:

y = [link.get('href') for link in li if re.match("https*://", link.get('href')) if "google" not in link.get('href')]

This matches zero or more occurrences of s (in real URLs there will be 0 or 1). To match exactly zero or one s, use "https?://" instead.
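To make the difference between the two quantifiers concrete, here is a small sketch comparing the "https*://" pattern from the answer with the stricter "https?://". The names loose and strict and the test strings are chosen only for this example.

```python
import re

loose = re.compile(r"https*://")   # "http" followed by zero or more "s"
strict = re.compile(r"https?://")  # "http" followed by at most one "s"

candidates = ["http://a.example", "https://b.example", "httpsss://c.example", "/relative"]

# re.match only matches at the start of the string, like startswith.
loose_hits = [c for c in candidates if loose.match(c)]
strict_hits = [c for c in candidates if strict.match(c)]

print(loose_hits)
print(strict_hits)
```

Both patterns accept ordinary http and https URLs; only the loose one also accepts the malformed "httpsss://" prefix.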

If you are looking for a way to get search results from Google, I would suggest using the googlesearch library instead.

That makes it much easier to get the results: there is no need to scrape the entire results page and pick the links out yourself. It provides both http and https links. Here is an article that might be helpful:

https://www.geeksforgeeks.org/performing-google-search-using-python-code/
