I have this simple two-line script:
from pandas import read_html
print(read_html('http://money.cnn.com/data/hotstocks/', flavor='bs4'))
This works, but the column names are missing: the columns are labeled 0, 1, 2, 3 instead. Is there an easy way to tell pandas to use the first row as the column names? I know I could store the names in a list, set them myself, and skip the first row, but I am wondering if there is an easier or better way.
Currently it prints:
                           0       1       2         3
0                    Company   Price  Change  % Change
1             AAPL Apple Inc  115.31   +6.17    +5.65%
2   BAC Bank of America Corp   15.20   -0.43    -2.75%
3            YHOO Yahoo! Inc   46.46   -1.53    -3.19%
4        MSFT Microsoft Corp   41.19   -1.47    -3.45%
5            FB Facebook Inc   76.24   +0.46    +0.61%
6     GE General Electric Co   23.84   -0.54    -2.21%
7                 T AT&T Inc   32.68   -0.13    -0.40%
8            F Ford Motor Co   14.46   -0.24    -1.63%
9            INTC Intel Corp   33.78   -0.41    -1.20%
10    CSCO Cisco Systems Inc   26.80   -0.09    -0.35%
`read_html` takes a `header` parameter. You can pass the index of the row to use as the column names:
read_html('http://money.cnn.com/data/hotstocks/', header=0, flavor='bs4')
Worth noting this caveat in the docs:
For example, you might need to manually assign column names if the column names are converted to NaN when you pass the header=0 argument
http://pandas.pydata.org/pandas-docs/stable/generated/pandas.io.html.read_html.html
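If you do hit that caveat, the manual fallback mentioned in the question is only a couple of lines. Here is a minimal sketch, assuming the table came back with default integer column labels and the real names in the first row. The `raw` frame is a hand-built stand-in for one table from the list `read_html` returns, and `promote_first_row` is a hypothetical helper name, not part of pandas:

```python
import pandas as pd

# Hand-built stand-in (an assumption) for a table parsed without header=0:
# integer column labels, with the real names sitting in row 0.
raw = pd.DataFrame([
    ["Company", "Price", "Change", "% Change"],
    ["AAPL Apple Inc", "115.31", "+6.17", "+5.65%"],
    ["BAC Bank of America Corp", "15.20", "-0.43", "-2.75%"],
])


def promote_first_row(df):
    """Return a new frame whose column names are taken from df's first row."""
    # Drop the first row and renumber the remaining rows from 0.
    out = df.iloc[1:].reset_index(drop=True)
    # Use the values of the original first row as the column names.
    out.columns = df.iloc[0].tolist()
    return out


table = promote_first_row(raw)
print(table)
```

The helper leaves `raw` untouched and returns a new frame, so you can compare the two while debugging. The values stay as strings here, so convert the numeric columns separately if you need numbers.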