Scraping headlines from Yahoo Finance using Python

Question

I am using Beautiful Soup to extract headlines from http://in.finance.yahoo.com/q?s=AAPL , but I need the headlines for the past 3 months, i.e. from 10 Dec 2013 to 10 March 2014. So far I am only able to extract the headlines that are on this specific page. How can I extract the required headlines for any specific company?

Code:

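import urllib2
from bs4 import BeautifulSoup  # assumption: BeautifulSoup 4; the original import was not shown

url = 'http://in.finance.yahoo.com/q?s=AAPL'
data = urllib2.urlopen(url)
soup = BeautifulSoup(data)

# Walk down to the headline list: div#yfi_headlines > div.bd > ul > li
divs = soup.find('div', attrs={'id': 'yfi_headlines'})
div = divs.find('div', attrs={'class': 'bd'})
ul = div.find('ul')
lis = ul.findAll('li')
hls = []
for li in lis:
    # The headline text is the first child of the link in each list item
    headlines = li.find('a').contents[0]
    hls.append(headlines)
    print headlines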

Answer 1

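I think your problem is about where to get the data from. If you need data for the last three months, you should query http://in.finance.yahoo.com/q/hp?s=AAPL ; all the data you are looking for is displayed in the table there.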

Answer 2

On http://in.finance.yahoo.com/q?s=AAPL , click on 'more headlines from AAPL'. From there you'll get a link that has a datetime field in it (the t parameter). Modify that value and you should be good, for example: http://in.finance.yahoo.com/q/h?s=AAPL&t=2014-02-08T15:06:40+05:30
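
The idea of modifying the datetime field can be sketched roughly as below. This is only an illustration in Python 3, not something from the answer itself: the helper names build_headlines_url and daily_headline_urls are made up here, the +05:30 offset and the URL shape are copied from the example link above, and stepping one day at a time is an assumption about how much each page covers. Each generated URL would then be fetched and parsed the same way as in the question.

```python
from datetime import date, datetime, time, timedelta, timezone

# Offset copied from the example link in the answer (+05:30).
IST = timezone(timedelta(hours=5, minutes=30))
BASE_URL = "http://in.finance.yahoo.com/q/h"


def build_headlines_url(symbol, moment):
    """Return the headlines URL for `symbol` with `t` set to `moment`.

    Hypothetical helper: it only reproduces the URL shape shown in the answer.
    """
    # isoformat() on a timezone-aware datetime gives e.g.
    # 2014-02-08T15:06:40+05:30, the same shape as the `t` value in the link.
    return "{}?s={}&t={}".format(BASE_URL, symbol, moment.isoformat())


def daily_headline_urls(symbol, start, end):
    """Yield one URL per calendar day, from `end` back to `start` inclusive.

    Assumption: one request per day is enough to cover the whole range.
    """
    day = end
    while day >= start:
        moment = datetime.combine(day, time(23, 59, 59), tzinfo=IST)
        yield build_headlines_url(symbol, moment)
        day -= timedelta(days=1)


# The range asked about in the question: 10 Dec 2013 to 10 March 2014.
urls = list(daily_headline_urls("AAPL", date(2013, 12, 10), date(2014, 3, 10)))
```

The URL is assembled with plain string formatting rather than urllib.parse.urlencode so that the + and : characters stay unescaped, as they appear in the example link. Whether the site accepts an escaped form as well is not something the answer states.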

2 answers

solution1
0 2014-03-11 16:39:43

solution2
0 2014-03-11 17:52:42