Web scraping: unable to print phone numbers using Python and BeautifulSoup

Question

I can get every agent's name and role description, but only a few of the phone numbers.

Here is my code:

from urllib.request import urlopen as uReq
from bs4 import BeautifulSoup as soup

my_url = 'https://www.raywhite.com/contact/?type=People&target=people&suburb=Sydney%2C+NSW+2000&radius=5&firstname=&lastname=&_so=people'

# opening connection
uClient = uReq(my_url)
page_html = uClient.read()
uClient.close()

page_soup = soup(page_html, "html.parser")

containers = page_soup.findAll("div",{"class":"card horizontal-split vcard"})

for container in containers:
    agent_name = container.findAll("li", {"class":"agent-name"})
    name = agent_name[0].text

    agent_role = container.findAll("li", {"class":"agent-role"})
    role = agent_role[0].text

    phone = container.find("a").text

    print("name: " + name)
    print("role: " + role)
    print("phone: " + phone)

Here is a sample of the first few results printed. Only the first two agents have their phone numbers listed:

name: Mark Constantine
role: Principal
phone: 0418 222 643
name: Dawn Veloskey
role: Operations Manager
phone: 0418 449 600
name: Yvonne Lau
role: Sales
phone:

name: Anthony Cavallaro
role: Managing Director | Selling Principal
phone:

name: Ciara OConnor
role: Sales Executive
phone:

name: Michael Buium
role: Commercial Sales Manager and Auctioneer
phone:

name: Albert Hui
role: Senior Commercial Property Manager
phone:

name: Jessie Yee
role: Associate Director, Commercial Leasing & Management
phone:

I don't know why the other phone numbers are not printed. Any suggestions would be much appreciated.

Answer 1

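That is because the first two agents have no photo. For the others, the photo is the first "a" tag in the card, so container.find("a") returns the photo link, which has no text, instead of the phone link.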

Replace:

phone = container.find("a").text

with:

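# Keep only the <a> tags whose href starts with "tel" (the phone links)
filterfn = lambda x: 'href' in x.attrs and x['href'].startswith("tel")
phones = map(lambda x: x.text, filter(filterfn, container.findAll("a")))

# A card may hold zero, one, or several phone links
for phone in phones:
    print("phone number: " + phone)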
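
The same idea can be tried offline as a minimal sketch. The markup below is a made-up approximation of the agent cards (the real page may be structured differently), the names and phone numbers are placeholders, and extract_agents is a hypothetical helper rather than anything from the original post. It uses a CSS attribute selector instead of the lambda filter, and also records the text of the first "a" tag to show why the original lookup came back empty for cards with a photo.

```python
from bs4 import BeautifulSoup

# Hypothetical markup loosely modelled on the cards described above.
# The second card starts with a photo link, as the answer describes.
SAMPLE_HTML = """
<div class="card horizontal-split vcard">
  <ul>
    <li class="agent-name">Agent A</li>
    <li class="agent-role">Principal</li>
  </ul>
  <a href="tel:0400000001">0400 000 001</a>
</div>
<div class="card horizontal-split vcard">
  <a href="/agents/agent-b"><img src="b.jpg" alt=""></a>
  <ul>
    <li class="agent-name">Agent B</li>
    <li class="agent-role">Sales</li>
  </ul>
  <a href="tel:0400000002">0400 000 002</a>
</div>
"""


def extract_agents(html):
    """Return one dict per agent card found in the given HTML string."""
    page_soup = BeautifulSoup(html, "html.parser")
    agents = []
    for card in page_soup.select("div.card.vcard"):
        name = card.select_one("li.agent-name").get_text(strip=True)
        role = card.select_one("li.agent-role").get_text(strip=True)
        # Match only links whose href begins with "tel"; an empty list
        # simply means the card shows no phone number.
        phones = [a.get_text(strip=True) for a in card.select('a[href^="tel"]')]
        # What the original container.find("a").text lookup would see.
        first_link_text = card.find("a").get_text(strip=True)
        agents.append({
            "name": name,
            "role": role,
            "phones": phones,
            "first_link_text": first_link_text,
        })
    return agents


if __name__ == "__main__":
    for agent in extract_agents(SAMPLE_HTML):
        print("name: " + agent["name"])
        print("role: " + agent["role"])
        print("phones: " + ", ".join(agent["phones"]))
```

Selecting on the href prefix keeps the lookup independent of where the phone link sits inside the card, so adding or removing a photo does not change the result.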
