Using Python 3 to Retrieve Stock Information From Yahoo Finance Site

I have been trying to port a script that requests fundamental data from the Yahoo Finance site. Rather than pulling entire reports, I would like to look up specific items, such as the price-to-book ratio. I followed a tutorial from Sentdex on how to do that. The problem is that the example code is written for Python 2.7, and I am trying to make it work on Python 3 and then expand on it by adding more features.

Here is how it looks so far:

import time
import urllib
import urllib.request


sp500short = ['a', 'aa', 'aapl', 'abbv', 'abc', 'abt', 'ace', 'aci', 'acn', 'act', 'adbe', 'adi', 'adm', 'adp']


def yahooKeyStats(stock):

    try:
        sourceCode = urllib.request.urlopen('http://finance.yahoo.com/q/ks?s='+stock).read()
        pbr = sourceCode.split('Price/Book (mrq):</td><td class="yfnc_tabledata1">')[1].split('</td>')[0]       
        print ('price to book ratio:'),stock,pbr

    except Exception as e:
        print ('failed in the main loop'),str(e)


for eachStock in sp500short:
    yahooKeyStats(eachStock)
    time.sleep(1)

I'm almost sure the problem is in the definition of the pbr variable, specifically the splitting part. The strings:

 Price/Book (mrq):</td><td class="yfnc_tabledata1">

and:

</td>

are just delimiters: the value I'm actually looking for sits between them. So far, though, running the script only gives me the exception message.

Any help will be much appreciated. Cheers,

It looks like urllib.request.urlopen(...).read() is returning data of type bytes.

From the Python docs:

Note that urlopen returns a bytes object. This is because there is no way for urlopen to automatically determine the encoding of the byte stream it receives from the http server. In general, a program will decode the returned bytes object to string once it determines or guesses the appropriate encoding.
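
As a rough illustration of the "determines or guesses the appropriate encoding" step, one option is to read the charset from the response's Content-Type header and fall back to a default when none is declared. This is only a sketch under assumptions: decode_body is a hypothetical helper, the UTF-8 fallback is a guess rather than something the question specifies, and an email.message.Message object stands in for response.headers so the example runs without a network call.

```python
from email.message import Message


def decode_body(raw, headers, default="utf-8"):
    """Decode a response body using the charset named in its headers.

    `headers` is expected to behave like `response.headers` from
    urllib.request.urlopen(), which is an email.message.Message subclass.
    """
    # get_content_charset() returns None when no charset is declared.
    charset = headers.get_content_charset() or default
    return raw.decode(charset, errors="replace")


# Stand-in for response.headers (assumption: no real request is made here).
headers = Message()
headers["Content-Type"] = "text/html; charset=ISO-8859-1"

raw = "café".encode("latin-1")       # bytes as they might arrive on the wire
text = decode_body(raw, headers)      # decoded with the declared charset
print(type(raw).__name__, "->", type(text).__name__)
```

With a real response object, the call would look like decode_body(response.read(), response.headers).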

The split call is what fails here: sourceCode is of type bytes, but you are trying to split it by a str delimiter, which raises a TypeError in Python 3. Try appending .decode() after .read(); decoding converts sourceCode from bytes to str. Alternatively, you could .encode() both of your delimiters.
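
A minimal sketch of the failure and of both fixes follows. The HTML fragment is a made-up stand-in for what .read() would return, and extract_price_to_book is a hypothetical helper wrapping the split logic from the question; neither comes from the real Yahoo page.

```python
DELIMITER = 'Price/Book (mrq):</td><td class="yfnc_tabledata1">'


def extract_price_to_book(source):
    """Return the text between the Price/Book label cell and the next </td>."""
    return source.split(DELIMITER)[1].split('</td>')[0]


# Made-up sample standing in for urlopen(...).read(); note the b prefix.
raw = b'<tr><td>Price/Book (mrq):</td><td class="yfnc_tabledata1">3.42</td></tr>'

# Splitting bytes by a str delimiter is what the question's code does.
try:
    extract_price_to_book(raw)
    split_error = None
except TypeError as e:
    split_error = e

# Fix 1: decode the bytes first, then split with str delimiters.
pbr_decoded = extract_price_to_book(raw.decode())

# Fix 2: keep the bytes and encode the delimiters instead (result is bytes).
pbr_encoded = raw.split(DELIMITER.encode())[1].split(b'</td>')[0]

print('split on bytes failed:', split_error is not None)
print('decoded result:', pbr_decoded)
print('encoded-delimiter result:', pbr_encoded)
```

Decoding once right after .read() is usually the simpler choice, since every later string operation then works on str.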

Reference: bytes.decode in the Python documentation.
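
Putting the fix into the question's function could look like the sketch below. Two things in it go beyond what is stated above and are assumptions for illustration: the page download is moved into a separate fetch_page function (hypothetical name) so the parsing can be tried without a network call, and the function returns the value in addition to printing it. The print calls are also rewritten, because in Python 3 a line such as print ('failed in the main loop'),str(e) prints only the first string and silently drops the exception text. The URL and the markup are copied from the question and may no longer match what the site serves.

```python
import urllib.request


def fetch_page(stock):
    """Download the key-statistics page and return it as str."""
    # URL taken from the question; it may no longer be served.
    url = 'http://finance.yahoo.com/q/ks?s=' + stock
    with urllib.request.urlopen(url) as response:
        return response.read().decode()  # bytes -> str before any splitting


def yahooKeyStats(stock, fetch=fetch_page):
    try:
        sourceCode = fetch(stock)
        pbr = sourceCode.split(
            'Price/Book (mrq):</td><td class="yfnc_tabledata1">'
        )[1].split('</td>')[0]
        # All values go inside the parentheses of print() in Python 3.
        print('price to book ratio:', stock, pbr)
        return pbr
    except Exception as e:
        print('failed in the main loop', str(e))
        return None
```

Called as yahooKeyStats('aapl'), it uses the real download; passing a different fetch callable lets you feed it saved HTML instead.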
