I am trying to get data from Yahoo Finance using Beautiful Soup. I want to get the sector(s) and industry information. However, my code outputs a lot of unnecessary data. I just want to get:
Sector(s): Healthcare Industry: Drug Manufacturers—General Full Time Employees: 11,800
Here is the code:
import urllib.request as req
import bs4

url = "https://finance.yahoo.com/quote/GILD/profile?p=GILD"
with req.urlopen(url) as response:
    data = response.read().decode("utf-8")
root = bs4.BeautifulSoup(data, "html.parser")
titles = root.find_all("span")  # matches every <span> on the page
The result is shown in the attached picture.
I would do it this way:
from bs4 import BeautifulSoup as bs
import requests
url = "https://finance.yahoo.com/quote/GILD/profile?p=GILD"
req = requests.get(url)
soup = bs(req.text,'lxml')
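# Parentheses in class names must be escaped in a CSS selector; a raw string keeps the backslashes intact
for p in soup.select(r"p.D\(ib\).Va\(t\)"):
    print(p.text)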
or even simpler:
soup.select_one(r"p.D\(ib\).Va\(t\)").text
Output:
Sector(s): HealthcareIndustry: Drug Manufacturers—GeneralFull Time Employees: 11,800
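The fields run together in that output because `.text` simply concatenates the text nodes. If you want them separately, one option is to turn the line breaks inside the paragraph into newlines before reading the text. This is only a minimal sketch: it assumes the fields are separated by `<br>` tags, `SAMPLE_HTML` is a simplified, hypothetical stand-in for the real page markup, and `profile_fields` is just an illustrative helper name.

```python
from bs4 import BeautifulSoup

# Hypothetical, simplified stand-in for the profile paragraph (assumption, not the real page)
SAMPLE_HTML = (
    '<div><p class="D(ib) Va(t)">'
    '<span>Sector(s)</span>: <span>Healthcare</span><br/>'
    '<span>Industry</span>: <span>Drug Manufacturers—General</span><br/>'
    '<span>Full Time Employees</span>: <span>11,800</span>'
    '</p></div>'
)


def profile_fields(html):
    """Return the label/value pairs of the profile paragraph as a dict."""
    soup = BeautifulSoup(html, "html.parser")
    p = soup.select_one(r"p.D\(ib\).Va\(t\)")
    if p is None:
        return {}  # paragraph not found, e.g. the page layout changed
    # Assumption: each field ends with a <br>; swap those for newlines
    for br in p.find_all("br"):
        br.replace_with("\n")
    fields = {}
    for line in p.get_text().splitlines():
        label, sep, value = line.partition(":")
        if sep:  # skip lines without a "label: value" shape
            fields[label.strip()] = value.strip()
    return fields


for label, value in profile_fields(SAMPLE_HTML).items():
    print(f"{label}: {value}")
```

With the real page you would pass `req.text` instead of `SAMPLE_HTML`; returning an empty dict when the selector finds nothing avoids an `AttributeError` on `None`.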