Python: print only results that contain a specific string

Question

I am trying to get the description text of Google search results.

from selenium import webdriver
import re
chrome_path = r"C:\Users\xxxx\Downloads\Compressed\chromedriver_win32\chromedriver.exe"
driver = webdriver.Chrome(chrome_path)
driver.get("https://www.google.co.in/search?q=stackoverflow")
posts = driver.find_elements_by_class_name("st")
for post in posts:
    print(post.text)

Here I'm getting the correct results, but I only want to print the links that appear in the descriptions. I also want results from 5 Google search pages; at the moment I only get results from 1 page.

I have tried using

print(post.get_attribute('href'))

but the description links are not clickable, so this returns None.

Answer 1

Try the code below:

for i in range(1, 6):
    print("--------------------------------------------------------------------")
    print("Page " + str(i) + " Results : ")
    print("--------------------------------------------------------------------")
    # Static links: URLs that appear as plain text inside a description.
    staticLinks = driver.find_elements_by_xpath("//*[@class='st']")
    for desc in staticLinks:
        for c in desc.text.split():
            if c.startswith('http://') or c.startswith('https://'):
                print(c)

    # Dynamic links: anchor elements nested inside a description.
    dynamicLinks = driver.find_elements_by_xpath("//*[@class='st']//a")
    for desc in dynamicLinks:
        link = desc.get_attribute('href')
        if link is not None:
            print(link)

    # Move to the next results page, except after the last one.
    if i < 5:
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        nextPage = driver.find_element_by_xpath("//a[@aria-label='Page " + str(i + 1) + "']")
        nextPage.click()

This tries to fetch the static and dynamic links from the result descriptions on the first 5 Google search result pages.
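The static-link step above splits the description on whitespace, so a URL followed by a comma or period keeps that punctuation. As a rough alternative, here is a minimal sketch that uses the `re` module the question already imports. The names `URL_PATTERN` and `extract_urls` are hypothetical and not part of the original answer, and the pattern is a simple assumption that will not cover every URL form.

```python
import re

# Assumed pattern: "http://" or "https://" followed by non-whitespace characters.
URL_PATTERN = re.compile(r"https?://\S+")


def extract_urls(text):
    """Return the http(s) URLs found in text, with trailing punctuation stripped."""
    return [match.rstrip(".,;:!?)\"'") for match in URL_PATTERN.findall(text)]


# Sample description text standing in for post.text / desc.text.
sample = "Ask at https://stackoverflow.com/questions, or see http://example.com."
print(extract_urls(sample))
```

Inside the loop, something like `for url in extract_urls(desc.text): print(url)` could replace the whitespace-split check.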

Answer 1
0 Accepted 2019-01-23 18:52:43