Python CSV Writerow append with a for loop?

I'm making a web scraper to scrape box scores for MLB games from a certain site. I'm trying to create a header row in my output CSV file that consists of "Teams", then the text of each div of a certain class on the website (one per inning number in the game), followed by R, H, E. Normally, one could just call .writerow(['Teams', '1', '2', '3', '4', '5', '6', '7', '8', '9', 'R', 'H', 'E']) to make that row. However, games sometimes go to extra innings, so the divs holding the inning numbers and R/H/E change dynamically. I want the scraper to recognize that and adjust the row accordingly.

My code looks like this:

from bs4 import BeautifulSoup
import requests
import csv

with open('BoxScoreURLS.csv', newline='') as f_urls, open('IndividualBoxScoresOutput.csv', 'w', newline='') as f_output:
    csv_urls = csv.reader(f_urls)
    csv_output = csv.writer(f_output)
    #csv_output.writerow(['Teams', 'Box Scores'])


    for line in csv_urls:
        page = requests.get(line[0]).text
        soup = BeautifulSoup(page, 'html.parser')
        topline = soup.findAll('div', {'class' :'LineScoreCard__lineScoreColumnElement--1byQk LineScoreCard__header--3ZO_N'})

        for t in range(len(topline)):
            csv_output.writerow(['Teams', topline[t].text])

This is the URL I'm trying to scrape (my code will read from a list of URLs in a separate CSV):

https://www.thescore.com/mlb/events/63853

This is what it outputs to the CSV:

Teams   1
Teams   2
Teams   3
Teams   4
Teams   5
Teams   6
Teams   7
Teams   8
Teams   9
Teams   R
Teams   H
Teams   E

And when a game goes to extra innings, it looks like this:

Teams   1
Teams   2
Teams   3
Teams   4
Teams   5
Teams   6
Teams   7
Teams   8
Teams   9
Teams   10
Teams   11
Teams   R
Teams   H
Teams   E

And this is what I would like to see:

Teams   1   2   3   4   5   6   7   8   9   R   H   E

So finding all of that div class gathers the info correctly, but what this creates right now is two columns in the CSV. I tried several different row.append combinations too, but with no success. Once I have a solution for appending the div text to the same row where "Teams" is displayed, I will have my header row and can then put the scores underneath.

Is it possible to somehow loop to find all of the div class and then append it so that it appears horizontally across the same row as "Teams"? Let me know if I can answer any more questions.

Thanks!

@SergeBallesta provided me with a perfect solution.

writerow(['Teams'] + [topline[t].text for t in range(len(topline))])

Thanks a lot!
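
For anyone who wants to try the idea without hitting the site, here is a minimal sketch. It assumes the scraped header cells have already been reduced to plain strings (a stand-in for the .text of each element in topline), and it writes to an in-memory buffer instead of IndividualBoxScoresOutput.csv. The helper name build_header and the sample 11-inning data are made up for illustration. The point is that the whole list is built first and writerow is called once, so everything lands on one row.

```python
import csv
import io


def build_header(cell_texts):
    """Return a single header row: 'Teams' followed by every scraped label."""
    return ['Teams'] + list(cell_texts)


# Hypothetical stand-in for [t.text for t in topline] on an 11-inning game.
extra_innings = [str(i) for i in range(1, 12)] + ['R', 'H', 'E']

# Same result built with an explicit loop and append, then written once.
row = ['Teams']
for label in extra_innings:
    row.append(label)

buffer = io.StringIO()  # in-memory file, so no disk or network is needed
writer = csv.writer(buffer)
writer.writerow(build_header(extra_innings))  # one writerow call -> one CSV row

print(buffer.getvalue())
```

In the original script the same pattern would be a single csv_output.writerow call placed after the findAll line, replacing the inner for loop that called writerow once per div.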
