Saving multiple data frames from a loop
I have been searching for a solution to my problem, but every answer I find uses print() at the end, rather than saving the data frames as I would like.
Below is an (almost) working piece of code that prints 3 separate tables. How do I save these three tables in 3 separate data frames named matches_october, matches_november and matches_december?
The last line of my code does not work the way I want it to. I hope it is clear what I would like the code to do (save a data frame at the end of each of the 3 rounds of the loop).
import pandas as pd
import requests
from bs4 import BeautifulSoup

base_url = 'https://www.basketball-reference.com/leagues/NBA_2019_games-'
valid_pages = ['october','november','december']
end = '.html'

for i in valid_pages:
    url = '{}{}{}'.format(base_url, i, end)
    res = requests.get(url)
    soup = BeautifulSoup(res.content,'lxml')
    table = soup.find_all('table')[0]
    df = pd.read_html(str(table))
    print(df)
    matches + valid_pages = df[0]  # <-- this is the line that does not work
You could special-case each month, but that's not very robust (and it's rather ugly):

if i == 'october':
    matches_october = pd.read_html(str(table))
elif i == 'november':
    # ...and so on for each month
A more elegant solution is to use a dictionary. Before the loop, declare

matches = {}

Then, in each iteration:

matches[i] = pd.read_html(str(table))

Afterwards you can access the October matches DataFrame via matches['october'].
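To illustrate the idea, here is a minimal, self-contained sketch of the dictionary approach; the tiny hand-made DataFrames are hypothetical stand-ins for the scraped tables:

```python
import pandas as pd

matches = {}  # one entry per month, keyed by month name

# stand-in tables; in the real code each value comes from pd.read_html
for month in ['october', 'november', 'december']:
    matches[month] = pd.DataFrame({'home': ['Team A'], 'visitor': ['Team B']})

october_df = matches['october']   # look up a month's DataFrame by name
print(list(matches.keys()))       # the three month keys, in insertion order
```

Because dict keys are plain strings, the month names can come straight from the loop variable, with no need to construct variable names.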
You can't compose variable names using +; try using a dict instead:
import pandas as pd
import requests
from bs4 import BeautifulSoup

matches = {}  # create an empty dict

base_url = 'https://www.basketball-reference.com/leagues/NBA_2019_games-'
valid_pages = ['october','november','december']
end = '.html'

for i in valid_pages:
    url = '{}{}{}'.format(base_url, i, end)
    res = requests.get(url)
    soup = BeautifulSoup(res.content,'lxml')
    table = soup.find_all('table')[0]
    df = pd.read_html(str(table))
    print(df)
    matches[i] = df[0]  # store it in the dict
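As a side note, the reason the original line fails is that matches + valid_pages is an expression, and Python does not allow assigning to an expression. A quick check (the exact message wording varies by Python version):

```python
# Compiling the original line shows that Python rejects it outright,
# before the program even runs.
try:
    compile("matches + valid_pages = df[0]", "<example>", "exec")
except SyntaxError as err:
    print("SyntaxError:", err.msg)
```

This is why no amount of runtime trickery with + will produce a named variable; a dict keyed by string is the idiomatic replacement.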
Thanks guys. That worked! :)
import pandas as pd
import requests
from bs4 import BeautifulSoup

matches = {}  # create an empty dict

base_url = 'https://www.basketball-reference.com/leagues/NBA_2019_games-'
valid_pages = ['october','november','december']
end = '.html'

for i in valid_pages:
    url = '{}{}{}'.format(base_url, i, end)
    res = requests.get(url)
    soup = BeautifulSoup(res.content,'lxml')
    table = soup.find_all('table')[0]
    df = pd.read_html(str(table))
    matches[i] = df[0]  # store it in the dict

matches_october = matches['october']
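One detail worth noting in the code above: pd.read_html returns a list of every table it finds, which is why the result is indexed with df[0]. A minimal offline illustration (wrapping the literal HTML in StringIO, as newer pandas versions expect; requires an HTML parser such as lxml to be installed):

```python
from io import StringIO

import pandas as pd

# a tiny one-row table standing in for a scraped schedule page
html = ("<table>"
        "<tr><th>home</th><th>visitor</th></tr>"
        "<tr><td>Team A</td><td>Team B</td></tr>"
        "</table>")

dfs = pd.read_html(StringIO(html))  # always returns a list of DataFrames
df = dfs[0]                         # first (and here, only) table
print(df.columns.tolist())
```

So even when a page contains a single table, the [0] index is needed to get the DataFrame itself out of the list.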