Beautiful Soup returns "[]"

Question

I'm attempting to pull company information from the Bloomberg company profile page using the code below:

import requests
from bs4 import BeautifulSoup

URL = 'https://www.bloomberg.com/profile/company/AAPL:US'

source = requests.get(URL)

soup = BeautifulSoup(source.content, 'lxml')

company_name = soup.findAll('h1', class_= 'companyName__9bd88132')

company_description = soup.findAll('div', class_ = 'description__ce057c5c')

print(company_name)
print(company_description)

But I only get two empty lists ("[]") back. In the responses I've seen to similar questions, people said it's because the wrong divs are being selected, but I don't think that is the case here. Does anyone know why it isn't working?

Edit: I've attached the section of HTML I am trying to pull from below:

<section class="companyProfileOverview__aa874298 up__e13cf193"><section class="info__d075c560"><h1 class="companyName__9bd88132">Apple Inc</h1><div class="description__ce057c5c">Apple Inc. designs, manufactures, and markets personal computers and related personal computing and mobile communication devices along with a variety of related software, services, peripherals, and networking solutions. Apple sells its products worldwide through its online stores, its retail stores, its direct sales force, third-party wholesalers, and resellers.</div></section><section class="currentPriceContainer"><p class="currentPriceLabel__f1524605">CURRENT PRICE</p><div><div class="inlineRow__7728fc34"><span class="tickerText__d2e1ee30">AAPL:US</span><span class="priceText__0feeaba3">343.99</span><span class="currency__bef924de">USD</span></div><span class="triangle__73a7d8b2 up__a3b61807"></span><div class="inlineRow__7728fc34"><span class="priceChange__5e691975">+10.53</span><span class="percentChange__3c14f7c4">+3.16%</span></div><div class="time__245ca7bb "><span>As of 08:00 PM EDT 06/09/2020 </span></div><a class="quoteLink__d3ac120b" href="/quote/AAPL:US">SEE QUOTE</a></div></section><div class="infoTable__96162ad6"><section class="infoTableItem__1003ce53"><h2 class="infoTableItemLabel__c9a5d511">SECTOR</h2><div class="infoTableItemValue__e188b0cb">Technology</div></section><section class="infoTableItem__1003ce53"><h2 class="infoTableItemLabel__c9a5d511">INDUSTRY</h2><div class="infoTableItemValue__e188b0cb">Hardware</div></section><section class="infoTableItem__1003ce53"><h2 class="infoTableItemLabel__c9a5d511">SUB-INDUSTRY</h2><div class="infoTableItemValue__e188b0cb">Communications Equipment</div></section><section class="infoTableItem__1003ce53"><h2 class="infoTableItemLabel__c9a5d511">FOUNDED</h2><div class="infoTableItemValue__e188b0cb">01/03/1977</div></section><section class="infoTableItem__1003ce53"><h2 class="infoTableItemLabel__c9a5d511">ADDRESS</h2><div 
class="infoTableItemValue__e188b0cb">1 Infinite Loop
Cupertino, CA 95014
United States</div></section><section class="infoTableItem__1003ce53"><h2 class="infoTableItemLabel__c9a5d511">PHONE</h2><div class="infoTableItemValue__e188b0cb">1-408-996-1010</div></section><section class="infoTableItem__1003ce53"><h2 class="infoTableItemLabel__c9a5d511">WEBSITE</h2><div class="infoTableItemValue__e188b0cb">www.apple.com</div></section><section class="infoTableItem__1003ce53"><h2 class="infoTableItemLabel__c9a5d511">NO. OF EMPLOYEES</h2><div class="infoTableItemValue__e188b0cb">100000</div></section></div></section>

I am trying to pull the company name (companyName__9bd88132) and the company description (description__ce057c5c). Eventually I would like to pull the sector information as well.
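For the sector lookup mentioned above, here is a minimal sketch. It assumes the page HTML has already been retrieved and matches the excerpt quoted in the question. The suffixes on the class names (such as `__1003ce53`) look auto-generated, so the sketch matches on the class prefix rather than the full name; that they change over time is an assumption, not something confirmed here. The helper name `info_table` is hypothetical, and the HTML string is a shortened copy of the excerpt.

```python
import re
from bs4 import BeautifulSoup

# Shortened copy of the markup quoted in the question (two table items only).
html = (
    '<div class="infoTable__96162ad6">'
    '<section class="infoTableItem__1003ce53">'
    '<h2 class="infoTableItemLabel__c9a5d511">SECTOR</h2>'
    '<div class="infoTableItemValue__e188b0cb">Technology</div></section>'
    '<section class="infoTableItem__1003ce53">'
    '<h2 class="infoTableItemLabel__c9a5d511">INDUSTRY</h2>'
    '<div class="infoTableItemValue__e188b0cb">Hardware</div></section>'
    '</div>'
)


def info_table(soup):
    """Map each label in the info table (e.g. 'SECTOR') to its value."""
    table = {}
    # Match the class by prefix so a changed suffix does not break the lookup.
    for item in soup.find_all("section", class_=re.compile(r"^infoTableItem__")):
        label = item.find("h2")
        value = item.find("div")
        if label and value:
            table[label.get_text(strip=True)] = value.get_text(strip=True)
    return table


soup = BeautifulSoup(html, "html.parser")
info = info_table(soup)
print(info.get("SECTOR"))  # prints: Technology
```

The same prefix match could be applied to the company name and description, for example `soup.find('h1', class_=re.compile(r'^companyName__'))`.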

Answer 1

Use this code:

import requests
from bs4 import BeautifulSoup
from fake_useragent import UserAgent  # third-party package: pip install fake-useragent

URL = 'https://www.bloomberg.com/profile/company/AAPL:US'

# Send browser-like headers; the default python-requests User-Agent may be rejected.
ua = UserAgent()
hdr = {'User-Agent': ua.random,
       'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
       'Accept-Charset': 'ISO-8859-1,utf-8;q=0.7,*;q=0.3',
       'Accept-Encoding': 'none',
       'Accept-Language': 'en-US,en;q=0.8',
       'Connection': 'keep-alive'}
source = requests.get(URL, headers=hdr)

soup = BeautifulSoup(source.content, features="html.parser")
# print(soup)  # uncomment to inspect the HTML that was actually returned

company_name = soup.find_all('h1', class_='companyName__9bd88132')
company_description = soup.find_all('div', class_='description__ce057c5c')

print(company_name)
print(company_description)
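
The code above does not say why the headers help. A likely explanation, offered as an assumption rather than something confirmed in this thread: the site answers the default python-requests User-Agent with a block or robot-check page that does not contain the profile markup, so find_all has nothing to match. The helper below is a small sketch for checking that before blaming the selectors. `explain_empty_result` is a hypothetical name, and the sample response bodies are made up for illustration, not real Bloomberg output.

```python
def explain_empty_result(status_code, html, marker):
    """Suggest why a selector matched nothing, given a response's status and body.

    `marker` is a substring the selector depends on, e.g. 'companyName__'.
    """
    if status_code != 200:
        return f"HTTP {status_code}: the request was probably rejected or redirected"
    if marker not in html:
        return ("marker missing: the server sent a different page, "
                "or the content is rendered by JavaScript")
    return "marker present: check the tag name and class in the selector"


# Made-up response bodies for illustration.
blocked_page = "<html><body>Are you a robot?</body></html>"
profile_page = '<h1 class="companyName__9bd88132">Apple Inc</h1>'

print(explain_empty_result(200, blocked_page, "companyName__"))
print(explain_empty_result(200, profile_page, "companyName__"))
print(explain_empty_result(403, "", "companyName__"))
```

With requests, this could be called as `explain_empty_result(source.status_code, source.text, 'companyName__')` right after the `requests.get` call.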

Answer 1 was accepted (2020-06-10 13:23:54).
