How do I click on the "More" button when web scraping Tripadvisor using Selenium?

I'm trying to scrape a page of written reviews on Tripadvisor, but I'm having difficulty clicking the "More" button that expands all the written reviews on the page. I've looked at similar questions (thank you, Saurabh Gaur), but when the button is clicked using Selenium, this login page pops up:

[Image: login page]

Is there a way to click the "More" button without triggering this? Thank you! :)

import re

from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.common.by import By


def clicker(url):
    browser = webdriver.Firefox()
    browser.get(url)

    # Parse the rendered page so the button's class list can be found with a regex
    soup = BeautifulSoup(browser.page_source, 'html.parser')

    # Example: soup.find_all(True, {'class': re.compile(r'\bclass1\b')})
    regex = re.compile(r'.*\bmoreLink.ulBlueLinks.*')
    classes = soup.find('span', class_=regex)['class']

    # Build a CSS selector of the form span.<class1>.<class2> from the class list
    more_button = 'span.' + '.'.join(classes)
    print(more_button)

    button = browser.find_element(By.CSS_SELECTOR, more_button)
    print(button)

    # Click through JavaScript rather than a native WebDriver click
    browser.execute_script("arguments[0].click()", button)


clicker('https://www.tripadvisor.com.sg/Hotel_Review-g295424-d1209362-Reviews-Residence_Spa_at_One_Only_Royal_Mirage_Dubai-Dubai_Emirate_of_Dubai.html')
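
The selector-building step in the code above can be tried without a browser. The following is only a sketch using the standard library: `selector_for` and `_ClassFinder` are hypothetical helper names, and the sample HTML is made up for illustration, not copied from Tripadvisor.

```python
import re
from html.parser import HTMLParser


class _ClassFinder(HTMLParser):
    """Record the class list of the first <tag> whose class attribute matches."""

    def __init__(self, tag, pattern):
        super().__init__()
        self.tag = tag
        self.pattern = pattern
        self.classes = None

    def handle_starttag(self, tag, attrs):
        # Stop looking once a match has been recorded
        if self.classes is not None or tag != self.tag:
            return
        class_attr = dict(attrs).get("class") or ""
        if self.pattern.search(class_attr):
            self.classes = class_attr.split()


def selector_for(html, tag, pattern):
    """Return a CSS selector like 'span.a.b' for the first match, or None."""
    finder = _ClassFinder(tag, re.compile(pattern))
    finder.feed(html)
    if finder.classes is None:
        return None
    return tag + "." + ".".join(finder.classes)


# Hypothetical markup, for illustration only
sample = '<div><span class="taLnk moreLink ulBlueLinks">More</span></div>'
print(selector_for(sample, "span", r"\bmoreLink\b"))
# prints: span.taLnk.moreLink.ulBlueLinks
```

Returning None when nothing matches makes the "button not found" case explicit, instead of failing later with a TypeError when the class list is looked up.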
 

Here is some sample code for your reference: you can use Selenium with PhantomJS and click the button. I used the tag's name attribute, which the function "find_element_by_name" requires; you can modify this to suit your needs.

from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.common.exceptions import WebDriverException


def openUrl(link):
    # webdriver.PhantomJS is only available in Selenium 3.x and earlier
    driver = webdriver.PhantomJS(
        executable_path='../../phantomjs/bin/phantomjs')
    try:
        driver.get(link)
    except WebDriverException:
        print('Error opening ' + link)
        return None

    # Parse the page source (not used further in this sample)
    bsObj = BeautifulSoup(driver.page_source, 'html.parser')

    try:
        # Locate the button by its name attribute and click it
        elem1 = driver.find_element_by_name('checkAndShowAnswers')
        elem1.click()
    except WebDriverException:
        return None
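
When a page has several "More" links, the click step can be repeated for each one. This is a rough sketch, not a confirmed fix for the login pop-up: `click_all` is a hypothetical helper that accepts any object exposing `find_elements` and `execute_script`, as a Selenium WebDriver does, and the selector passed to it is up to you.

```python
def click_all(driver, css_selector):
    """Click every element matching css_selector through JavaScript.

    Returns the number of elements for which the click call succeeded.
    """
    clicked = 0
    # "css selector" is the string value of Selenium's By.CSS_SELECTOR
    for element in driver.find_elements("css selector", css_selector):
        try:
            driver.execute_script("arguments[0].click();", element)
            clicked += 1
        except Exception:
            # For example, the element went stale after an earlier click
            # re-rendered part of the page; skip it and carry on
            continue
    return clicked
```

With a real driver this would be called as something like `click_all(browser, 'span.moreLink')`, where the selector is an assumption about the page's markup.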
