Python code printing only one row in csv file

Recently I tried to write a yp.com listing scraper, but I could not figure out why the code writes only one row to the .csv file.

The URLs in yp_urls.txt are:

https://www.yellowpages.com/search-map?search_terms=restaurant&geo_location_terms=Boston
https://www.yellowpages.com/search-map?search_terms=restaurant&geo_location_terms=Boston&page=2

Here is the code:

from urllib.request import urlopen
from bs4 import BeautifulSoup as soup
with open('yp_urls.txt', 'r') as f:
    for url in f:
        print(url)        
        uClient = urlopen(url)
        page_html = uClient.read()        
        uClient.close()
        page_soup = soup(page_html, "html.parser")
        containers = page_soup.findAll("div",{"class":"v-card"})
        #container= containers[0]
        out_filename = "yp_listing.csv"
        headers = "URL \n"
        f = open(out_filename, "w")
f.write(headers)
for container in containers:
            business = container.a["href"].title()
print("business:" + business + "\n" )
f.write(business + "," + "\n")
f.close()  # Close the file

Issues:

  1. The code inside your for blocks wasn't properly indented.

  2. Open the output file handle outside the for loop.
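The second issue can be reproduced without any scraping. The following is a minimal sketch, assuming made-up href strings and a temporary file; the helper names write_each_in_loop, write_all_once and read_lines are hypothetical and are not part of the question's code. It compares reopening the file in "w" mode on every iteration with opening it once.

```python
import os
import tempfile


def write_each_in_loop(path, rows):
    """Reopen the file in "w" mode for every row, as the question's code does."""
    for row in rows:
        with open(path, "w") as fout:  # "w" truncates the file on every open
            fout.write(row + "\n")


def write_all_once(path, rows):
    """Open the file once, then write every row through the same handle."""
    with open(path, "w") as fout:
        for row in rows:
            fout.write(row + "\n")


def read_lines(path):
    """Return the file's lines without their trailing newlines."""
    with open(path) as fin:
        return fin.read().splitlines()


# Made-up hrefs, used only for illustration.
rows = ["/boston-ma/mip/a", "/boston-ma/mip/b", "/boston-ma/mip/c"]

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.csv")
    write_each_in_loop(path, rows)
    reopened = read_lines(path)
    write_all_once(path, rows)
    opened_once = read_lines(path)

print(reopened)     # only the last row survives
print(opened_once)  # all three rows are present
```

Each open in "w" mode discards what the previous iteration wrote, which is why only the data from the final iteration remains.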

Try:

from urllib.request import urlopen
from bs4 import BeautifulSoup as soup

out_filename = "yp_listing.csv"
# Open the input and output files once, outside the loop, so that "w"
# does not truncate the output again for every URL.
with open('yp_urls.txt', 'r') as f, open(out_filename, "w") as fout:
    headers = "URL \n"
    fout.write(headers)

    for url in f:
        url = url.strip()  # drop the trailing newline read from the file
        if not url:
            continue  # skip blank lines
        print(url)
        uClient = urlopen(url)
        page_html = uClient.read()
        uClient.close()
        page_soup = soup(page_html, "html.parser")
        containers = page_soup.findAll("div", {"class": "v-card"})
        for container in containers:
            business = container.a["href"].title()
            print("business:" + business + "\n")
            fout.write(business + "," + "\n")
# No explicit close() is needed: the with statement closes both files.
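The code above builds each row by string concatenation. As a possible alternative, the standard library's csv module can write the rows and will quote values that contain commas. This is only a sketch: write_listing is a hypothetical helper, the hrefs are made up, and an in-memory buffer stands in for the output file.

```python
import csv
import io


def write_listing(fout, hrefs):
    """Write a header row and one row per href using csv.writer."""
    writer = csv.writer(fout)
    writer.writerow(["URL"])
    for href in hrefs:
        writer.writerow([href])


# Made-up hrefs; the second one contains a comma to show the quoting.
hrefs = ["/boston-ma/mip/example-one", "/boston-ma/mip/example, two"]

buffer = io.StringIO()  # stands in for the real output file
write_listing(buffer, hrefs)
print(buffer.getvalue())
```

When writing to a real file instead of a buffer, the csv documentation recommends opening it with newline="".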

It appears that the f.write calls are outside of your loops, so they only run once the loops have completed.

For example, the code loops through the URLs, then exits the loop and executes f.write(headers); it then loops through the containers, exits that loop, and executes f.write(business:..)

You may also want to check whether the output file is being opened in the right mode: 'w' (write/overwrite) versus 'a' (append). Perhaps also consider renaming the file handles so that they are not both 'f'.
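To illustrate the append suggestion, here is a minimal sketch that writes the header once in 'w' mode and then adds rows in 'a' mode, using distinct handle names. The helper append_row, the temporary file, and the row values are assumptions made for the example.

```python
import os
import tempfile


def append_row(path, row):
    """Append one row; mode "a" keeps whatever the file already contains."""
    with open(path, "a") as fout:
        fout.write(row + "\n")


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.csv")

    # Write the header once; "w" starts the file from scratch.
    with open(path, "w") as fout:
        fout.write("URL\n")

    # Each later open uses "a", so earlier rows are preserved.
    for row in ["/a", "/b"]:  # made-up rows
        append_row(path, row)

    with open(path) as fin:
        appended = fin.read().splitlines()

print(appended)
```

Note that with 'a' alone, rows from earlier runs of the script would also remain in the file, which is why the header is written separately in 'w' mode here.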
