
Screen scraping the date and GDPNow number from the Atlanta Fed

https://www.atlantafed.org/cqer/research/gdpnow

I am trying to scrape the current GDPNow estimate and its date from the Atlanta Fed page above. The page currently reads "Latest estimate: 3.5 percent — January 20, 2023." I then want to add the GDP number and the date to my existing pandas DataFrame. This is what I have so far:

from bs4 import BeautifulSoup
from urllib.request import urlopen

url = "https://www.atlantafed.org/cqer/research/gdpnow"
page = urlopen(url)
html = page.read().decode("utf-8")
soup = BeautifulSoup(html, "html.parser")

Try selecting the element with id "Slot" and pulling the two values out with a regular expression:

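import re
import requests
from bs4 import BeautifulSoup

url = 'https://www.atlantafed.org/cqer/research/gdpnow'

# Download the page and parse the HTML
soup = BeautifulSoup(requests.get(url).content, 'html.parser')

# Text of the element with id="Slot", which holds the "Latest estimate" sentence
x = soup.select_one('#Slot').get_text()

# (?sm): "." also matches newlines, and "$" matches at the end of any line.
# Group 1 is the first decimal number; group 2 is the text that follows
# an em-dash, up to the end of that line (the date).
gdp, date = re.search(r'(?sm)(-?\d+\.\d*).*—\s*(.*?)\s*$', x).groups()
print(gdp)
print(date)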

Prints:

3.5
January 20, 2023
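Both values come back as strings. To add them to a DataFrame, as the question asks, they can be converted to a float and a timestamp first. The following is only a sketch: the helper parse_estimate, the column names "date" and "gdpnow", and the empty starting DataFrame are assumptions for illustration, and it parses the sentence quoted in the question instead of fetching the live page.

```python
import re
from datetime import datetime

import pandas as pd


def parse_estimate(text):
    """Hypothetical helper: pull (gdp, date) out of the 'Latest estimate' sentence."""
    # Number before "percent", then a "Month D, YYYY" date after a dash.
    m = re.search(
        r'(-?\d+(?:\.\d+)?)\s*percent\s*[—–-]+\s*([A-Za-z]+ \d{1,2}, \d{4})',
        text,
    )
    if m is None:
        raise ValueError("estimate sentence not found")
    gdp = float(m.group(1))
    # %B is the full month name; this assumes an English locale.
    date = datetime.strptime(m.group(2), "%B %d, %Y")
    return gdp, date


# Sentence quoted in the question; in practice this would be the scraped text.
sample = "Latest estimate: 3.5 percent — January 20, 2023"
gdp, date = parse_estimate(sample)

# Stand-in for the existing DataFrame (column names are assumptions).
df = pd.DataFrame(columns=["date", "gdpnow"])

# Append the new observation as the next row.
df.loc[len(df)] = [pd.Timestamp(date), gdp]

# Normalize the column types after appending.
df["date"] = pd.to_datetime(df["date"])
df["gdpnow"] = df["gdpnow"].astype(float)

print(df)
```

If the script runs repeatedly, something like df.drop_duplicates(subset="date", keep="last") afterwards would keep a single row per release date.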
