Write cleaned BS4 data to csv file

from selenium import webdriver

from bs4 import BeautifulSoup

import csv

chrome_path = r"C:\Users\chromedriver_win32\chromedriver.exe"

driver = webdriver.Chrome(chrome_path)

driver.get('http://www.yell.com')

search = driver.find_element_by_id("search_keyword")

search.send_keys("plumbers")

place = driver.find_element_by_id("search_location")

place.send_keys("London")

driver.find_element_by_xpath("""//*[@id="searchBoxForm"]/fieldset/div[1]/div[3]/button""").click()

soup = BeautifulSoup(driver.page_source, 'html.parser')

for names in soup.find_all("span", {"class": "businessCapsule--name"}):
    print(names.text)

Output = soup.find_all("span", {"class": "businessCapsule--name"})

with open('comple16.csv', 'w') as csv_file:
    csv.register_dialect('custom', delimiter='\n', quoting=csv.QUOTE_NONE, escapechar='\\')
    writer = csv.writer(csv_file, 'custom')
    row = Output
    writer.writerow(row)

Currently the code produces this in the CSV file: class": "businessCapsule-- (scraped text)

I only want to write the scraped text to the CSV file, without the tags.

Please help.

from selenium import webdriver

from bs4 import BeautifulSoup

import csv

chrome_path = r"C:\Users\chromedriver_win32\chromedriver.exe"

driver = webdriver.Chrome(chrome_path)

driver.get('http://www.yell.com')

search = driver.find_element_by_id("search_keyword")

search.send_keys("plumbers")

place = driver.find_element_by_id("search_location")

place.send_keys("London")

driver.find_element_by_xpath("""//*[@id="searchBoxForm"]/fieldset/div[1]/div[3]/button""").click()

soup = BeautifulSoup(driver.page_source, 'html.parser')

Output = []
for names in soup.find_all("span", {"class": "businessCapsule--name"}):
    Output.append(names.text)

with open('comple16.csv', 'w') as csv_file:
    csv.register_dialect('custom', delimiter='\n', quoting=csv.QUOTE_NONE, escapechar='\\')
    writer = csv.writer(csv_file, 'custom')
    row = Output
    writer.writerow(row)
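
The code above gets one name per line by using a newline as the field delimiter. A more conventional alternative, sketched here with only the standard `csv` module, is to write each name as its own single-column row. The helper name `write_names`, the output file name, and the sample names are made up for illustration; they are not part of the original code.

```python
import csv


def write_names(names, path):
    """Write each name as its own single-column CSV row."""
    # newline="" lets the csv module control line endings itself,
    # which avoids stray blank lines on Windows.
    with open(path, "w", newline="", encoding="utf-8") as csv_file:
        writer = csv.writer(csv_file)
        # Each row is a one-element list; names containing commas get quoted.
        writer.writerows([name] for name in names)


if __name__ == "__main__":
    # Hypothetical sample data standing in for the scraped business names
    write_names(["Acme Plumbing", "Smith and Sons, Ltd"], "names_example.csv")
```

With the default dialect, a name that contains a comma is quoted rather than escaped, so the file reads back cleanly with `csv.reader`.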

After:

Output = soup.find_all("span", {"class": "businessCapsule--name"})

add:

Output = [row.text for row in Output]

to extract the text from the span elements.
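
To see why the tags ended up in the file, here is a small sketch that runs without Selenium. The HTML snippet is invented to imitate the structure of the results page; only the class name comes from the question. `find_all` returns tag objects, and as far as I know `csv.writer` converts non-string fields with `str()`, which for a tag is its full markup.

```python
from bs4 import BeautifulSoup

# Hypothetical markup imitating the structure of the results page
html = """
<div><span class="businessCapsule--name"> Acme Plumbing </span></div>
<div><span class="businessCapsule--name">Smith and Sons</span></div>
"""

soup = BeautifulSoup(html, "html.parser")
tags = soup.find_all("span", {"class": "businessCapsule--name"})

# str(tag) keeps the markup; this is the form that leaked into the CSV file
as_markup = [str(tag) for tag in tags]

# get_text(strip=True) returns only the text, with surrounding whitespace removed
names = [tag.get_text(strip=True) for tag in tags]

print(as_markup[0])
print(names)
```

`tag.text` works as well; `get_text(strip=True)` additionally trims leading and trailing whitespace, which is often present in scraped pages.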

