Scraping data from a website using Python 2

I am trying to scrape stock market data, but I keep getting nothing when I print the result. I want the price of Apple (AAPL).

import urllib
import re

htmlfile = urllib.urlopen("http://finance.yahoo.com/q?s=AAPL&q1=1")
htmltext = htmlfile.read()

regex = '<span class="Fw(b) Fz(36px) Mb(-4px)" data-reactid="270">(.+?)</span>'
pattern = re.compile(regex)
price = re.findall(pattern, htmltext)

print price

Can you elaborate on exactly what you are trying to pull from the page? I was able to pull your tag with the code below. Note that it uses Python 3, BeautifulSoup, and requests, all of which I recommend for web scraping. To find out what to put in the headers variable, I suggest http://www.whatsmyua.com/

import requests
from bs4 import BeautifulSoup

url = 'http://finance.yahoo.com/q?s=AAPL&q1=1'

# Send a browser-like User-Agent header with the request
headers = {'User-Agent':'Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; InfoPath.3; .NET4.0C; .NET4.0E; rv:11.0) like Gecko'}

r = requests.get(url, headers=headers)
soup = BeautifulSoup(r.text, "html.parser")

# Print every <span> whose class attribute is exactly this string
for item in soup.find_all('span', {"class":"Fw(500) Pstart(10px) Fz(24px) C($dataRed)"}):
    print(item)
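
As a side note on the regex in the question: parentheses are grouping metacharacters in a regular expression, so `Fw(b)` in the pattern matches the text `Fwb`, not the literal `Fw(b)`. That alone is enough for `re.findall` to return an empty list. Below is a minimal sketch of the effect in Python 3, assuming the page really contains the span from the question. The `sample_html` string and the price in it are made up for illustration, and the live page's markup may well differ.

```python
import re

# Hypothetical sample markup with a made-up price; not fetched from the real page.
sample_html = '<span class="Fw(b) Fz(36px) Mb(-4px)" data-reactid="270">153.18</span>'

# Pattern as written in the question: "(b)", "(36px)" and "(-4px)" are parsed
# as capture groups, so the literal parentheses in the HTML never match.
raw_pattern = '<span class="Fw(b) Fz(36px) Mb(-4px)" data-reactid="270">(.+?)</span>'
unescaped_matches = re.findall(raw_pattern, sample_html)

# re.escape() makes the opening tag match literally; only the final (.+?)
# stays a real capture group.
open_tag = re.escape('<span class="Fw(b) Fz(36px) Mb(-4px)" data-reactid="270">')
escaped_pattern = open_tag + r'(.+?)</span>'
escaped_matches = re.findall(escaped_pattern, sample_html)

print(unescaped_matches)
print(escaped_matches)
```

Even with the escaping fixed, matching generated class names and a `data-reactid` value is brittle, which is one reason an HTML parser such as BeautifulSoup is usually the safer choice.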
