
How do I use pd.read_html and loop through many different urls and store each set of dfs into a master list of dfs?

I was wondering how to pull tickers from an Excel file, load a bunch of websites, and run pd.read_html on each website in order to get a big list of dfs containing the tables of each page.

This is my list of tickers: https://docs.google.com/spreadsheets/d/16kdjtOlV1M_rDnM73lPi6ZcMvowQPmtjKu6bYTXK588/edit?usp=sharing

This is my current code:

from six.moves import urllib
import pandas as pd

df = pd.read_excel('C:/Users/Jacob/Downloads/CEF Tickers.xlsx', sheet_name='Sheet1')

tickers_list = df['Ticker'].tolist()

df_list = []

for ticker in tickers_list:
    df_list[ticker] = pd.read_html(f'https://www.cefconnect.com/fund/{ticker}', header=0)

print(df_list)

When I run that, I get:

TypeError: list indices must be integers or slices, not str

Thank you for your time.
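The error comes from indexing a plain Python list with a string key: `df_list` is a list, and `df_list[ticker]` only works with integer indices. A minimal sketch of the two usual fixes, using a placeholder ticker `'ADX'` and a dummy value instead of the real read_html result:

```python
df_list = []

# a list only accepts integer indices; a string key raises the TypeError above
try:
    df_list['ADX'] = ['table']
except TypeError as e:
    print(e)  # list indices must be integers or slices, not str

# fix 1: append each result, keeping the list in ticker order
df_list.append(['table'])

# fix 2: use a dict instead, so each result stays keyed by its ticker
df_dict = {}
df_dict['ADX'] = ['table']
```

The dict variant is usually more convenient here, since you can later look up the tables for any ticker by name instead of remembering its position.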

import pandas as pd

df = pd.read_excel('C:/Users/Jacob/Downloads/CEF Tickers.xlsx', sheet_name='Sheet1')

tickers_list = df['Ticker'].tolist()

df_list = []

# append() grows the list; assigning to df_list[i] on an empty list raises IndexError
for ticker in tickers_list:
    df_list.append(pd.read_html(f'https://www.cefconnect.com/fund/{ticker}', header=0))

print(df_list)

This is what I did.


import pandas as pd

df = pd.read_excel('C:/Users/Jacob/Downloads/CEF Tickers.xlsx', sheet_name='Sheet1')

tickers_list = df['Ticker'].tolist()
# one column per ticker; each cell will hold one DataFrame returned by read_html
data = pd.DataFrame(columns=tickers_list)

# note: this only works if every page returns the same number of tables;
# assigning lists of different lengths to the columns raises a ValueError
for ticker in tickers_list:
    data[ticker] = pd.read_html(f'https://www.cefconnect.com/fund/{ticker}', header=0)

print(data)
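A dict keyed by ticker avoids that length-mismatch risk, since each call to pd.read_html can return a different number of tables. A sketch, with a hypothetical fetch_tables standing in for the network call so the structure is clear:

```python
import pandas as pd

def fetch_tables(ticker):
    # hypothetical stand-in for
    # pd.read_html(f'https://www.cefconnect.com/fund/{ticker}', header=0),
    # which returns a list of DataFrames, one per <table> on the page
    return [pd.DataFrame({'Ticker': [ticker]})]

tickers_list = ['ADX', 'ASA']  # would come from the Excel file in the real script
all_tables = {ticker: fetch_tables(ticker) for ticker in tickers_list}

# each value is a list of DataFrames, however many tables each page had
print(sorted(all_tables))
```

To get the tables for one fund later, `all_tables['ADX']` returns that page's list of DataFrames directly.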
