How do I loop and append my code to a CSV file? - Python
I'm brand new to coding. I've been working on this for about a week and have hit a dead end, so please be gentle.
What I'm looking to do is get all the data from the URL, in the format the print statement displays, and put it into a CSV file.
I've managed to print a single line, but I have no idea how to loop through all the other lines and append them to a CSV file. Any hints or tips?
import io
import sys
sys.stdout = io.TextIOWrapper(sys.stdout.detach(), encoding = 'utf-8')
sys.stderr = io.TextIOWrapper(sys.stderr.detach(), encoding = 'utf-8')
import urllib.request
from bs4 import BeautifulSoup
url = "https://en.wikipedia.org/wiki/2022_FIFA_World_Cup_qualification_%E2%80%93_CAF_First_Round"
page = urllib.request.urlopen(url)
soup = BeautifulSoup(page, "lxml")
dateLists = soup.find_all(attrs={"class" : "bday dtstart published updated"})
timeLists = soup.find_all(attrs={"class" : "mobile-float-reset ftime"})
homeTeamLists = soup.find_all(attrs={"class" : "fhome"})
awayTeamLists = soup.find_all(attrs={"class" : "faway"})
scoreLists = soup.find_all(attrs={"class" : "fscore"})
venueLists = soup.find_all('span', attrs={"itemprop" : "name address"})
date = dateLists[0].text.strip()
time = timeLists[0].text.strip()
homeTeam = homeTeamLists[0].text.strip()
awayTeam = awayTeamLists[0].text.strip()
score = scoreLists[0].text.strip()
venue = venueLists[0].text.strip()
print(date, time, homeTeam, score, awayTeam, venue)
You just have to iterate through each item in the list. You can use enumerate to get the index position, then use it to append each item to a list and build the dataframe from those lists:
import urllib.request
from bs4 import BeautifulSoup
import pandas as pd
url = "https://en.wikipedia.org/wiki/2022_FIFA_World_Cup_qualification_%E2%80%93_CAF_First_Round"
page = urllib.request.urlopen(url)
soup = BeautifulSoup(page, "lxml")
dateLists = soup.find_all(attrs={"class" : "bday dtstart published updated"})
timeLists = soup.find_all(attrs={"class" : "mobile-float-reset ftime"})
homeTeamLists = soup.find_all(attrs={"class" : "fhome"})
awayTeamLists = soup.find_all(attrs={"class" : "faway"})
scoreLists = soup.find_all(attrs={"class" : "fscore"})
venueLists = soup.find_all('span', attrs={"itemprop" : "name address"})
dateList = []
timeList = []
homeTeamList = []
awayTeamList = []
scoreList = []
venueList = []
# Walk the matches by index and collect the cleaned text of each field
for idx, v in enumerate(dateLists):
    dateList.append(dateLists[idx].text.strip())
    timeList.append(timeLists[idx].text.strip())
    homeTeamList.append(homeTeamLists[idx].text.strip())
    awayTeamList.append(awayTeamLists[idx].text.strip())
    scoreList.append(scoreLists[idx].text.strip())
    venueList.append(venueLists[idx].text.strip())
# One column per field, one row per match
df = pd.DataFrame({'date':dateList,
                   'time':timeList,
                   'home':homeTeamList,
                   'away':awayTeamList,
                   'score':scoreList,
                   'venue':venueList})
Output:
print(df.to_string())
    date        time         home                   away                   score         venue
0   2019-09-04  16:00 UTC+3  Ethiopia               Lesotho                0–0           Bahir Dar Stadium, Bahir Dar
1   2019-09-08  15:00 UTC+2  Lesotho                Ethiopia               1–1           Setsoto Stadium, Maseru
2   2019-09-05  18:00 UTC+3  Somalia                Zimbabwe               1–0           El Hadj Hassan Gouled Aptidon Stadium, Djibout...
3   2019-09-10  15:00 UTC+2  Zimbabwe               Somalia                3–1           National Sports Stadium, Harare
4   2019-09-04  16:00 UTC+3  Eritrea                Namibia                1–2           Denden Stadium, Asmara
5   2019-09-10  19:00 UTC+2  Namibia                Eritrea                2–0           Sam Nujoma Stadium, Windhoek
6   2019-09-04  15:00 UTC+2  Burundi                Tanzania               1–1           Prince Louis Rwagasore Stadium, Bujumbura
7   2019-09-08  16:00 UTC+3  Tanzania               Burundi                1–1 (a.e.t.)  National Stadium, Dar es Salaam
8   2019-09-04  18:00 UTC+3  Djibouti               Eswatini               2–1           El Hadj Hassan Gouled Aptidon Stadium, Djibouti
9   2019-09-10  15:00 UTC+2  Eswatini               Djibouti               0–0           Mavuso Sports Centre, Manzini
10  2019-09-07  16:00 UTC+2  Botswana               Malawi                 0–0           Francistown Stadium, Francistown
11  2019-09-10  14:00 UTC+2  Malawi                 Botswana               1–0           Kamuzu Stadium, Blantyre
12  2019-09-06  17:00 UTC±0  Gambia                 Angola                 0–1           Independence Stadium, Bakau
13  2019-09-10  16:00 UTC+1  Angola                 Gambia                 2–1           Estádio 11 de Novembro, Luanda
14  2019-09-04  18:00 UTC±0  Liberia                Sierra Leone           3–1           Samuel Kanyon Doe Sports Complex, Paynesville
15  2019-09-08  16:30 UTC±0  Sierra Leone           Liberia                1–0           Siaka Stevens Stadium, Freetown
16  2019-09-04  18:30 UTC+4  Mauritius              Mozambique             0–1           Stade Anjalay, Belle Vue
17  2019-09-10  16:00 UTC+2  Mozambique             Mauritius              2–0           Estádio do Zimpeto, Maputo
18  2019-09-04  15:30 UTC±0  São Tomé and Príncipe  Guinea-Bissau          0–1           Estádio Nacional 12 de Julho, São Tomé
19  2019-09-10  16:30 UTC±0  Guinea-Bissau          São Tomé and Príncipe  2–1           Estádio 24 de Setembro, Bissau
20  2019-09-04  16:00 UTC+2  South Sudan            Equatorial Guinea      1–1           Al-Hilal Stadium, Omdurman (Sudan)[note 2]
21  2019-09-08  17:00 UTC+1  Equatorial Guinea      South Sudan            1–0           Nuevo Estadio de Malabo, Malabo
22  2019-09-06  15:00 UTC+3  Comoros                Togo                   1–1           Stade de Moroni, Moroni
23  2019-09-10  16:00 UTC±0  Togo                   Comoros                2–0           Stade de Kégué, Lomé
24  2019-09-05  15:30 UTC+1  Chad                   Sudan                  1–3           Stade Omnisports Idriss Mahamat Ouya, N'Djamena
25  2019-09-10  19:00 UTC+2  Sudan                  Chad                   0–0           Al-Merrikh Stadium, Omdurman
26  2019-09-05  16:00 UTC+4  Seychelles             Rwanda                 0–3           Stade Linité, Victoria
27  2019-09-10  18:00 UTC+2  Rwanda                 Seychelles             7–0           Stade Régional Nyamirambo, Kigali
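The answer stops at printing the dataframe, while the question asks for a CSV file. Below is a minimal sketch of that last step using only the standard library, assuming the six lists have already been filled as in the loop above. The sample values, the write_matches helper, and the matches.csv file name are hypothetical and only there for illustration. It uses zip instead of index lookups, so a list that comes back shorter than the others cuts the output short rather than raising an IndexError.

```python
import csv
import io

# Hypothetical sample standing in for the scraped lists; in the real script
# these would be dateList, timeList, homeTeamList, and so on.
dates = ["2019-09-04", "2019-09-08"]
times = ["16:00 UTC+3", "15:00 UTC+2"]
homes = ["Ethiopia", "Lesotho"]
aways = ["Lesotho", "Ethiopia"]
scores = ["0–0", "1–1"]
venues = ["Bahir Dar Stadium, Bahir Dar", "Setsoto Stadium, Maseru"]


def write_matches(out_file, *columns):
    """Write a header plus one CSV row per match and return the row count.

    zip() pairs the lists position by position and stops at the shortest one.
    """
    writer = csv.writer(out_file)
    writer.writerow(["date", "time", "home", "away", "score", "venue"])
    rows = list(zip(*columns))
    writer.writerows(rows)
    return len(rows)


# An in-memory buffer keeps the sketch free of side effects. For a real file,
# use open("matches.csv", "w", newline="", encoding="utf-8") instead.
buffer = io.StringIO()
row_count = write_matches(buffer, dates, times, homes, aways, scores, venues)
csv_text = buffer.getvalue()
print(csv_text)
```

The csv module quotes the venue field automatically because it contains a comma. If you would rather stay with the dataframe from the answer, df.to_csv("matches.csv", index=False) should write the same columns in a single call.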