
How to Append Multiple Lists in a "For" Loop Vertically?

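My "for" loop scrapes multiple pages (in this case, three pages that I put in a list). However, the print output / CSV output does not pick up the earlier iterations of the loop (it only gives me the results of the last, third page). I think the term I'm looking for here is "array", since I want the results of each page appended vertically beneath one another. I seem to be misunderstanding how this function works: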

results.append(details)

This is all thanks to QHarr's excellent answer found here: How to Export Scraped Data to Excel Horizontally?

Here is the full working code I'm using:

import requests, re
from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
import pandas as pd
import time

examplelist = [['1'], ['2'], ['3']]

pages = [i for I in examplelist for i in I]

for key in pages:
    driver = webdriver.Chrome(executable_path=r"C:\Users\User\Downloads\chromedriver_win32\chromedriver.exe")
    driver.get('https://www.restaurant.com/listing?&&st=KS&p=KS&p=PA&page=' + str(key) + '&&searchradius=50&loc=10021')
    time.sleep(10)
    WebDriverWait(driver, 10).until(EC.presence_of_all_elements_located((By.CSS_SELECTOR, ".restaurants")))
    soup = BeautifulSoup(driver.page_source, 'html.parser')
    restaurants = soup.select('.restaurants')
    results = []
    for restaurant in restaurants:
        details = [re.sub(r'\s{2,}|[,]', '', i) for i in restaurant.select_one('h3 + p').text.strip().split('\n') if i != '']
        details.insert(0, restaurant.select_one('h3 a').text)
        results.append(details)
#print(results)

df = pd.DataFrame(results, columns= ['Name', 'Address', 'City', 'State', 'Zip', 'Phone', 'AdditionalInfo'])
df.to_csv(r'C:\Users\User\Documents\Restaurants.csv', sep=',', encoding='utf-8-sig', index = False)

driver.close()

Thanks

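I think you are emptying results with results = [] inside the loop, so on every pass you lose whatever was already in it. Initialize it outside the loop, like this: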

results=[]
for key in pages:
    driver = webdriver.Chrome(executable_path=r"C:\Users\User\Downloads\chromedriver_win32\chromedriver.exe")
    driver.get('https://www.restaurant.com/listing?&&st=KS&p=KS&p=PA&page=' + str(key) + '&&searchradius=50&loc=10021')
    time.sleep(10)
    WebDriverWait(driver, 10).until(EC.presence_of_all_elements_located((By.CSS_SELECTOR, ".restaurants")))
    soup = BeautifulSoup(driver.page_source, 'html.parser')
    restaurants = soup.select('.restaurants')
    for restaurant in restaurants:
        details = [re.sub(r'\s{2,}|[,]', '', i) for i in restaurant.select_one('h3 + p').text.strip().split('\n') if i != '']
        details.insert(0, restaurant.select_one('h3 a').text)
        results.append(details)
#print(results)

Then remove that initialization from inside the loop.
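To see the difference without launching a browser, here is a minimal sketch that swaps the Selenium/BeautifulSoup step for a made-up lookup. The names fake_pages, scrape_page, collect_reset_each_page and collect_all_pages are hypothetical and exist only for this illustration, and the restaurant rows are invented sample data rather than real scrape results.

```python
# Hypothetical stand-in for the scraped pages: each page key maps to its rows.
fake_pages = {
    "1": [["Cafe A", "1 Main St"], ["Cafe B", "2 Main St"]],
    "2": [["Cafe C", "3 Main St"]],
    "3": [["Cafe D", "4 Main St"], ["Cafe E", "5 Main St"]],
}


def scrape_page(key):
    """Hypothetical placeholder for the Selenium/BeautifulSoup step."""
    return fake_pages[key]


def collect_reset_each_page(pages):
    """Pattern from the question: the list is re-created on every pass."""
    for key in pages:
        results = []  # earlier pages are discarded here
        for details in scrape_page(key):
            results.append(details)
    return results


def collect_all_pages(pages):
    """Pattern from the answer: the list is created once, before the loop."""
    results = []  # rows from every page accumulate here
    for key in pages:
        for details in scrape_page(key):
            results.append(details)
    return results


pages = ["1", "2", "3"]
last_only = collect_reset_each_page(pages)
all_rows = collect_all_pages(pages)
print(len(last_only))  # 2 (only the rows from page "3")
print(len(all_rows))   # 5 (rows from all three pages, stacked in order)
```

Because all_rows is a flat list of rows, passing it to pd.DataFrame as in the question should give one row per restaurant across all pages, provided the columns list matches the length of each row.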

