Python code writes csv file, only prints two text entries

#Stuff needed to run
import requests
import urllib.request
import io
from bs4 import BeautifulSoup as soup

#Pick url, request it, save response, read response, soup it into variable
my_url = 'https://old.reddit.com/r/all/'
request = urllib.request.Request(my_url,headers={'User-Agent': 'your bot 0.1'})
response = urllib.request.urlopen(request)
page_html = response.read()
page_soup = soup(page_html, "html.parser")

#get all the posts, get one post, get all the authors, get one author
posts = page_soup.findAll("div", {"class": "top-matter"})
post = posts[0]
authors = page_soup.findAll("p", {"class":"tagline"})
author = authors[0]

#make filename, open to write, set the headers, write the headers,
filename = "redditAll.csv"
f = open(filename, "w")
headers = "Title of the post, Author of the post\n"
f.write(headers)

#for the post and author in posts and authors, get one of each, open the file & write it, repeat
for post, author in zip(posts, authors):
    post_text = post.p.a.text.replace(",", " -")
    username = author.a.text
    with open(filename, "w", encoding="utf-8") as f:
        f.write(post_text + "," + username + "\n")

#close the file
f.close()

After running this code and opening the csv file, there are only two cells that have text in them.

There should be more than two, as there are more than two posts on reddit.com/r/all.

Changed this:

for post, author in zip(posts, authors):
    post_text = post.p.a.text.replace(",", " -")
    username = author.a.text
    with open(filename, "w", encoding="utf-8") as f:
        f.write(post_text + "," + username + "\n")

To this:

with open(filename, "w", encoding="utf-8") as f:
    for post, author in zip(posts, authors):
        post_text = post.p.a.text.replace(",", " -")
        username = author.a.text
        f.write(post_text + "," + username + "\n")

Try this:

# for the post and author in posts and authors, get one of each, open the file & write it, repeat
def writer():
    with open(filename, "w", encoding="utf-8") as f:
        for post_, author_ in zip(posts, authors):
            post_text = post_.p.a.text.replace(",", " -")
            username = author_.a.text
            # with open(filename, "w", encoding="utf-8") as f:
            f.write(post_text + "," + username + "\n")

writer()

You could open the file in append mode by passing "a" as the mode the second time you open the file for writing; there is an SO thread on how to do this. Alternatively, move the with open(filename, "w", encoding="utf-8") as f: line outside the loop.
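
A minimal sketch of the append-mode variant. The rows list and the temporary file path are made-up stand-ins for the scraped posts and authors, not part of the original script. The header is written once in "w" mode, and every later open uses "a" so earlier lines are kept:

```python
import os
import tempfile

# Hypothetical stand-in data; the real script would use the scraped posts/authors.
rows = [("First post", "alice"), ("Second post", "bob"), ("Third post", "carol")]

# Temporary location so the sketch does not touch the working directory.
filename = os.path.join(tempfile.mkdtemp(), "redditAll.csv")

# Write the header once; "w" creates the file (or truncates an existing one).
with open(filename, "w", encoding="utf-8") as f:
    f.write("Title of the post,Author of the post\n")

# Re-opening in "a" mode appends to the end instead of truncating.
for post_text, username in rows:
    with open(filename, "a", encoding="utf-8") as f:
        f.write(post_text + "," + username + "\n")

# Read the file back to check that the header and all three records survived.
with open(filename, encoding="utf-8") as f:
    lines = f.read().splitlines()
```

Re-opening the file on every iteration works, but opening it once outside the loop is simpler and avoids the repeated open/close.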

The "w" mode truncates the file every time it is opened, so on each loop iteration the previous record is overwritten by the new one, leaving only the final record in the file. That single record (one title and one author) is why only two cells contain text.
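
The overwrite can be reproduced without any scraping. This is only an illustration with made-up rows and a temporary file (both are assumptions, not from the original script); it compares opening with "w" inside the loop against opening once outside it:

```python
import os
import tempfile

# Hypothetical stand-in data for the scraped titles and usernames.
rows = [("First post", "alice"), ("Second post", "bob"), ("Third post", "carol")]
path = os.path.join(tempfile.mkdtemp(), "demo.csv")

# Opening with "w" inside the loop truncates the file on every iteration.
for title, user in rows:
    with open(path, "w", encoding="utf-8") as f:
        f.write(title + "," + user + "\n")

with open(path, encoding="utf-8") as f:
    overwritten = f.read().splitlines()

# Opening once, outside the loop, keeps every record.
with open(path, "w", encoding="utf-8") as f:
    for title, user in rows:
        f.write(title + "," + user + "\n")

with open(path, encoding="utf-8") as f:
    kept = f.read().splitlines()

print(len(overwritten), len(kept))  # prints: 1 3
```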

Also, I would use the built-in csv library to read and write csv files, as one of the comments mentions. See the csv module documentation in the Python standard library docs.
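
A rough sketch of what that could look like with csv.writer. The rows and the temporary path are again hypothetical stand-ins for the scraped data. Because the writer quotes any field that contains a comma, the replace(",", " -") workaround should no longer be needed:

```python
import csv
import os
import tempfile

# Hypothetical stand-in data; note the comma inside the first title.
rows = [("Hello, world", "alice"), ("Second post", "bob")]
path = os.path.join(tempfile.mkdtemp(), "redditAll.csv")

# newline="" is recommended by the csv docs so the writer controls line endings.
with open(path, "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    writer.writerow(["Title of the post", "Author of the post"])
    writer.writerows(rows)

# Read the file back; the comma inside the title stays within a single field.
with open(path, newline="", encoding="utf-8") as f:
    read_back = list(csv.reader(f))
```

In the original script, the rows would come from zip(posts, authors), with post.p.a.text and author.a.text as the two fields.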
