Python issues parsing through a list from an imported csv
I've pieced together a script that runs through an imported list of urls and grabs all the "p" tags from the html section that has the class "holder". It works, but it only looks at the first url in the imported CSV:
import csv
from urllib.request import urlopen
from bs4 import BeautifulSoup
contents = []

with open('list.csv','r') as csvf:  # Open file in read mode
    urls = csv.reader(csvf)
    for url in urls:
        contents.append(url)  # Add each url to list contents

for url in contents:  # Parse through each url in the list.
    page = urlopen(url[0]).read()
    soup = BeautifulSoup(page, "lxml")

n = 0
for container in soup.find_all("section", attrs={'class': 'holder'}):
    n += 1
    print('==', 'Section', n, '==')
    for paragraph in container.find_all("p"):
        print(paragraph)
Any ideas how I can get it to loop through each url instead of just one?
The problem is the indentation of your code. The correct version is:
contents = []

with open('list.csv','r') as csvf:  # Open file in read mode
    urls = csv.reader(csvf)
    for url in urls:
        contents.append(url)  # Add each url to list contents

for url in contents:  # Parse through each url in the list.
    page = urlopen(url[0]).read()
    soup = BeautifulSoup(page, "lxml")
    n = 0
    for container in soup.find_all("section", attrs={'class': 'holder'}):
        n += 1
        print('==', 'Section', n, '==')
        for paragraph in container.find_all("p"):
            print(paragraph)
Otherwise, you extract the "p" tags only from the last URL, because soup keeps the value assigned on the last iteration of the previous loop.
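The effect can be seen with a minimal sketch, where plain strings stand in for the fetched pages and `.upper()` stands in for BeautifulSoup parsing:

```python
# Minimal sketch of the bug: when parsing happens after the loop,
# only the value from the final iteration is still bound to the variable.
pages = ["<p>one</p>", "<p>two</p>", "<p>three</p>"]
for page in pages:
    parsed = page.upper()  # stand-in for soup = BeautifulSoup(page, "lxml")

# The loop has finished; `parsed` reflects only the last page.
print(parsed)  # -> <P>THREE</P>
```

Indenting the parsing step inside the loop makes it run once per page instead of once at the end.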
You have to indent the for container in soup.find_all(): block so it runs inside the loop over the urls. Try something like this:
import csv
from urllib.request import urlopen
from bs4 import BeautifulSoup

with open('list.csv','r') as csvf:  # Open file in read mode
    urls = csv.reader(csvf)
    for url in urls:
        page = urlopen(url[0]).read()  # each csv row is a list, so take its first field
        soup = BeautifulSoup(page, "lxml")
        n = 0
        for container in soup.find_all("section", attrs={'class': 'holder'}):
            n += 1
            print('==', 'Section', n, '==')
            for paragraph in container.find_all("p"):
                print(paragraph)
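Note that csv.reader yields each row as a list of strings, even for a one-column file, which is why the code indexes url[0] instead of passing the row itself to urlopen. A quick sketch, using an in-memory file in place of list.csv:

```python
import csv
import io

# csv.reader yields each row as a list of strings, even for a
# one-column file, so the URL is the first (and only) field.
fake_csv = io.StringIO("http://example.com/a\nhttp://example.com/b\n")
for row in csv.reader(fake_csv):
    print(type(row).__name__, row[0])  # -> list http://example.com/a  (etc.)
```

Passing the list itself to urlopen would raise an error, since urlopen expects a string URL (or a Request object).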