csv.writer not writing entire output to CSV file

I am trying to scrape artists' Spotify streaming rankings from Kworb.net into a CSV file. It nearly works, but I'm running into a weird issue.

The code below successfully prints all 10,000 of the listed artists to the console:

import requests
from bs4 import BeautifulSoup
import csv

URL = "https://kworb.net/spotify/artists.html"
result = requests.get(URL)
src = result.content
soup = BeautifulSoup(src, 'html.parser')

table = soup.find('table', id="spotifyartistindex")

header_tags = table.find_all('th')
headers = [header.text.strip() for header in header_tags]

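rows = []
data_rows = table.find_all('tr')

for row in data_rows:
    # Collect the stripped text of every <td> cell in this row
    value = row.find_all('td')
    beautified_value = [dp.text.strip() for dp in value]
    print(beautified_value)

    # Rows without <td> cells (such as the header row) yield an empty list; skip them
    if len(beautified_value) == 0:
        continue

    rows.append(beautified_value)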

The issue arises when I use the following code to save the output to a CSV file:

with open('artist_rankings.csv', 'w', newline="") as output:
    writer = csv.writer(output)
    writer.writerow(headers)
    writer.writerows(rows)

For whatever reason, only 738 of the artists are saved to the file. Does anyone know what could be causing this?

Thanks so much for any help!

As an alternative approach, you could make your life easier next time and use pandas.

Here's how:

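from io import StringIO

import pandas as pd
import requests

source = requests.get("https://kworb.net/spotify/artists.html")
# Recent pandas versions expect a file-like object rather than a raw HTML string
tables = pd.read_html(StringIO(source.text), flavor="bs4")
# read_html returns a list of DataFrames, one per <table>; combine them
df = pd.concat(tables)
df.to_csv("artists.csv", index=False)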

This outputs a .csv file with 10,000 artists.
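If you would rather stay with csv.writer, one thing worth checking is the file encoding. This is an assumption on my part, since the question does not show an error message: open() without an encoding argument falls back to the platform default, and if that default cannot represent a character in some artist's name, the write stops with a UnicodeEncodeError and leaves a partial file behind. Below is a minimal sketch that passes encoding="utf-8" explicitly and then reads the file back to count the rows. The helper names write_rows and count_data_rows and the sample data are made up for illustration; in the real script you would pass the scraped headers and rows instead.

```python
import csv
import os
import tempfile


def write_rows(path, headers, rows):
    """Write headers and rows to a CSV file using an explicit UTF-8 encoding."""
    # encoding="utf-8" is the assumed fix: without it, open() uses the
    # platform default, which may not be able to encode every artist name.
    with open(path, "w", newline="", encoding="utf-8") as output:
        writer = csv.writer(output)
        writer.writerow(headers)
        writer.writerows(rows)


def count_data_rows(path):
    """Read the file back and count the rows that follow the header."""
    with open(path, newline="", encoding="utf-8") as source:
        reader = csv.reader(source)
        next(reader, None)  # skip the header row
        return sum(1 for _ in reader)


# Made-up sample data standing in for the scraped `headers` and `rows`.
sample_headers = ["Artist", "Streams"]
sample_rows = [["Beyoncé", "1"], ["Рок-группа", "2"], ["宇多田ヒカル", "3"]]

sample_path = os.path.join(tempfile.mkdtemp(), "artist_rankings.csv")
write_rows(sample_path, sample_headers, sample_rows)
written = count_data_rows(sample_path)

# True when every row that was passed in made it into the file.
print(written == len(sample_rows))
```

Comparing the count read back from the file with len(rows) is a quick way to confirm whether rows are being lost at write time rather than during scraping.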
