Scrape Google Maps review text data for one company

I want to scrape the review text for one company from Google Maps so that I can run sentiment analysis on it. However, my code does not run; it raises the error shown below. Could you help me fix it? Thanks!

!pip install selenium
!apt-get update 
!apt install chromium-chromedriver

from selenium import webdriver
chrome_options = webdriver.ChromeOptions()
chrome_options.add_argument('--headless')
chrome_options.add_argument('--no-sandbox')
chrome_options.add_argument('--disable-dev-shm-usage')
# One browser instance is enough; the unused duplicate `wd` is dropped.
driver = webdriver.Chrome('chromedriver', chrome_options=chrome_options)

#add your google map link whose data you want to scrape
from selenium import webdriver
from bs4 import BeautifulSoup
import time
import io
import pandas as pd
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC  
from selenium.webdriver.common.by import By  

driver.get('https://www.google.com/maps/place/Embassy+of+Bangladesh/@38.9418017,-77.0679642,15z/data=!4m7!3m6!1s0x0:0x5621455e7625f36e!8m2!3d38.9418017!4d-77.0679642!9m1!1b1')

wait = WebDriverWait(driver, 10)
menu_bt = wait.until(EC.element_to_be_clickable(
                       (By.XPATH, '//button[@data-value=\'Sort\']'))
                   )  
menu_bt.click()
recent_rating_bt = driver.find_elements_by_xpath(
                                     '//div[@role=\'menuitem\']')[50]
recent_rating_bt.click()
time.sleep(5)

Error message:

---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-35-94b4c6e89470> in <module>()
      5 menu_bt.click()
      6 recent_rating_bt = driver.find_elements_by_xpath(
----> 7                                      '//div[@role=\'menuitem\']')[50]
      8 recent_rating_bt.click()
      9 time.sleep(5)

IndexError: list index out of range

You're accessing index 50 of the list returned by find_elements_by_xpath(). List indices start at 0, so [50] asks for the 51st matching element. The error message indicates that this index does not exist, i.e. the returned list has 50 or fewer elements.

You should check the length of the returned list before indexing into it.
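
The length check can be sketched as follows. This is only an illustration: the helper name `get_item_or_none` and the placeholder list `menu_items` are assumptions made for the example, and a plain Python list stands in for whatever find_elements_by_xpath() returns on the real page.

```python
def get_item_or_none(items, index):
    """Return items[index] if that position exists, otherwise None."""
    if 0 <= index < len(items):
        return items[index]
    return None


# Stand-in for the list returned by driver.find_elements_by_xpath(...).
# The four labels are placeholders, not the real menu contents.
menu_items = ["item 0", "item 1", "item 2", "item 3"]

# Inspect how many elements actually matched before choosing an index.
print(len(menu_items))

# A valid index returns the element at that position.
print(get_item_or_none(menu_items, 1))

# An index past the end returns None instead of raising IndexError.
print(get_item_or_none(menu_items, 50))
```

In the question's code, the same idea would mean storing the result of find_elements_by_xpath() in a variable, printing its length to see which indices exist, and calling click() only when the chosen element is not None.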
