I can't scrape the mobile number: the page source is unchanged after clicking "View Contact" with Python Selenium
from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver import ActionChains
from selenium.common.exceptions import TimeoutException
driver = webdriver.Chrome(r'C:\chromedriver.exe')
driver.get('https://www.gigadocs.com/hyderabad/dentist')
driver.find_element_by_xpath('//*[@id="listingTab"]/div[2]/div/div[1]/div[1]/div/div[2]/div[2]/ul/li[1]/span').click()
soup = BeautifulSoup(driver.page_source,'html.parser')
mobile = soup.find('ul',class_='detailsList')
print(mobile)
I am trying to click on "View Contact" to scrape the mobile number, but after clicking, the output still shows "View Contact" instead of the number.
Use WebDriverWait with element_to_be_clickable() to click "View Contact", then read the li tag under the ul tag.
from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
import time
driver = webdriver.Chrome(r'C:\chromedriver.exe')
driver.get('https://www.gigadocs.com/hyderabad/dentist')
WebDriverWait(driver,10).until(EC.element_to_be_clickable((By.XPATH,"//span[@data-source='mobile'][text()='View Contact']"))).click()
time.sleep(1)
soup = BeautifulSoup(driver.page_source,'html.parser')
mobile = soup.find('ul',class_='detailsList')
print(mobile.find('li').text)
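The fixed time.sleep(1) above only works if the number arrives within one second. A minimal sketch of the alternative, polling until the text is no longer "View Contact", is shown below. The helper name wait_for_change is hypothetical, and the browser is replaced by a stand-in iterator with a placeholder number so the block runs on its own.

```python
import time


def wait_for_change(read_value, old_value, timeout=10.0, poll=0.2):
    """Poll read_value() until it returns something other than old_value.

    Returns the new value, or raises TimeoutError after `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while True:
        value = read_value()
        if value != old_value:
            return value
        if time.monotonic() >= deadline:
            raise TimeoutError(f'value still {old_value!r} after {timeout} s')
        time.sleep(poll)


# Stand-in for the browser: the text changes on the third read.
# The number below is a placeholder, not real data.
reads = iter(['View Contact', 'View Contact', '0000000000'])
result = wait_for_change(lambda: next(reads), 'View Contact', timeout=2.0, poll=0.01)
print(result)
```

With Selenium the same idea can be expressed by passing a callable to WebDriverWait(driver, 10).until(...), which keeps calling it until it returns a truthy value. Whether the page replaces the "View Contact" text in place is an assumption here, so check the element in the browser's inspector first.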
You don't need the overhead of Selenium. The page makes POST requests, using ids for the doctor and the clinic, to retrieve telephone numbers. You can scrape these ids from the initial page and then mimic those requests to get the numbers. I use the doctor id as the key of a dictionary and update the values with the telephone number.
import requests
from bs4 import BeautifulSoup as bs
headers = {
'User-Agent': 'Mozilla/5.0',
'Content-Type': 'application/x-www-form-urlencoded; charset=UTF-8',
'X-Requested-With': 'XMLHttpRequest'
}
data = {
'doctorId': '3806', #[data-doctor]
'clinicId': '1519', #[data-clinic]
'clickSource': 'mobile'
}
with requests.Session() as s:
    s.headers = headers
    r = s.get('https://www.gigadocs.com/hyderabad/dentist')
    soup = bs(r.content, 'lxml')
    # Map each doctor id to its clinic id, read from the appointment buttons.
    tel_numbers = {i['data-doctor']: i['data-clinic'] for i in soup.select('.appointmentBtn')}
    for k, v in tel_numbers.items():
        data['doctorId'] = k
        data['clinicId'] = v
        r = s.post('https://www.gigadocs.com/search/getmobilenumbers', data=data).json()
        # Replace the clinic id with the number returned for this doctor.
        tel_numbers[k] = r['mobile']
print(tel_numbers)
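The id-collection step can be tried offline before sending any requests. The sketch below swaps BeautifulSoup for the standard library's html.parser so it runs without extra packages. The HTML fragment is made up, modelled on the .appointmentBtn elements and data-doctor / data-clinic attributes used above, and the names AppointmentIdParser and build_payload are hypothetical.

```python
from html.parser import HTMLParser


class AppointmentIdParser(HTMLParser):
    """Collect doctor id -> clinic id from elements with class 'appointmentBtn'."""

    def __init__(self):
        super().__init__()
        self.ids = {}

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        classes = (attr.get('class') or '').split()
        if 'appointmentBtn' in classes and 'data-doctor' in attr and 'data-clinic' in attr:
            self.ids[attr['data-doctor']] = attr['data-clinic']


def build_payload(doctor_id, clinic_id):
    """Form fields for one lookup; field names are those used in the answer above."""
    return {'doctorId': doctor_id, 'clinicId': clinic_id, 'clickSource': 'mobile'}


# Made-up fragment for illustration only; the real page markup may differ.
sample_html = '''
<a class="btn appointmentBtn" data-doctor="3806" data-clinic="1519">Book</a>
<a class="btn appointmentBtn" data-doctor="4001" data-clinic="1520">Book</a>
<a class="btn other" data-doctor="9999" data-clinic="1">Ignore</a>
'''

parser = AppointmentIdParser()
parser.feed(sample_html)
payloads = [build_payload(d, c) for d, c in parser.ids.items()]
print(payloads)
```

Each payload can then be passed as data= to s.post(...) as in the answer above. Building a fresh dictionary per request avoids reusing and mutating the shared data dict.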