How do I use pd.read_html to loop through many different URLs and store each page's set of DataFrames in a master list?
I was wondering how to pull tickers from an Excel file, load a series of websites, and run pd.read_html on each one in order to build a big list of DataFrames containing the tables from each page.
This is my list of tickers: https://docs.google.com/spreadsheets/d/16kdjtOlV1M_rDnM73lPi6ZcMvowQPmtjKu6bYTXK588/edit?usp=sharing
This is my current code:
from six.moves import urllib
import pandas as pd

df = pd.read_excel('C:/Users/Jacob/Downloads/CEF Tickers.xlsx', sheet_name='Sheet1')
tickers_list = df['Ticker'].tolist()

df_list = []
for ticker in tickers_list:
    df_list[ticker] = pd.read_html(f'https://www.cefconnect.com/fund/{ticker}', header=0)
print(df_list)
And then when I do that, I get:
TypeError: list indices must be integers or slices, not str
Thank you for your time.
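The TypeError happens because `df_list` is a plain list, and lists can only be indexed by integers; `df_list[ticker]` tries to index it with a string. If you want to look tables up by ticker, a dict is the natural container. Here is a minimal sketch of that pattern; the `fetch` parameter (a hypothetical indirection, not in the original code) defaults to `pd.read_html` but lets you swap in a stub so the loop can be exercised without hitting cefconnect.com:

```python
import pandas as pd

def collect_tables(tickers, fetch=pd.read_html):
    """Return {ticker: list of DataFrames} for each ticker's page.

    pd.read_html returns a *list* of DataFrames, one per <table> on the
    page, so each dict value is itself a list.
    """
    tables = {}
    for ticker in tickers:
        # Key by the ticker string; dicts accept string keys, lists do not.
        tables[ticker] = fetch(f'https://www.cefconnect.com/fund/{ticker}', header=0)
    return tables
```

If you truly want a list rather than a dict, `df_list.append(...)` instead of `df_list[ticker] = ...` also avoids the error.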
from six.moves import urllib
import pandas as pd

df = pd.read_excel('C:/Users/Jacob/Downloads/CEF Tickers.xlsx', sheet_name='Sheet1')
tickers_list = df['Ticker'].tolist()

df_list = []
for i in range(len(tickers_list)):
    # Append to the list rather than indexing into it (it starts empty),
    # and put the ticker string, not the loop index, into the URL.
    df_list.append(pd.read_html(f'https://www.cefconnect.com/fund/{tickers_list[i]}', header=0))
print(df_list)
This is what I did.
df = pd.read_excel('C:/Users/Jacob/Downloads/CEF Tickers.xlsx', sheet_name='Sheet1')
tickers_list = df['Ticker'].tolist()

data = pd.DataFrame(columns=tickers_list)
for ticker in tickers_list:
    data[ticker] = pd.read_html(f'https://www.cefconnect.com/fund/{ticker}', header=0)
print(data)
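One caveat with the approach above: each column of `data` ends up holding whole DataFrames as cell values, which is awkward to work with afterward. If the per-ticker tables share a layout, a common alternative is to keep only the table you need from each page and stack the results with `pd.concat`. A sketch, using the same hypothetical `fetch` indirection so it can be tested without network access (the `[0]` assumes the table you want is the first one on the page):

```python
import pandas as pd

def first_table_per_ticker(tickers, fetch=pd.read_html):
    """Stack the first table from each ticker's page into one DataFrame."""
    frames = []
    for ticker in tickers:
        tables = fetch(f'https://www.cefconnect.com/fund/{ticker}', header=0)
        first = tables[0].copy()
        first['ticker'] = ticker  # tag each row with its source ticker
        frames.append(first)
    return pd.concat(frames, ignore_index=True)
```

This gives you one flat DataFrame you can filter or group by the `ticker` column.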