
Python not creating csv file after parsing - using selenium and beautifulsoup

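My code has two parts:

  1. Use selenium to open the browser and enter the details needed to get the results from the page.

  2. Parse the HTML of the results page and write it to a csv file.

Problem: the second part only works if I download the page and manually add the local URL (on my computer). If I add the first part of the code, selenium opens the browser but no csv file is exported.

What I used to write this: Ubuntu Mate 18.04, PyCharm editor, Firefox browser.

I printed the output at each level of the code and got the right result. However, the output stops after the for loop.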

from bs4 import BeautifulSoup as soup
from urllib.request import urlopen as uReq
import pandas as pd
import csv
import os
from selenium import webdriver
from selenium.webdriver.common.keys import Keys

os.environ["PATH"] += os.pathsep + r'/home/pierre/PycharmProjects/scraping/venv'
browser = webdriver.Firefox()
browser.get('http://karresults.nic.in/indexPUC_2019.asp')

reg = browser.find_element_by_id('reg')
reg.send_keys('738286')

sub = browser.find_element_by_class_name('btn-default')

sub.click()

url = browser.current_url

my_url = url

uClient = uReq(my_url)
page_html = uClient.read()
uClient.close()
page_soup = soup(page_html, "html.parser")

results = []

for record in page_soup.findAll('tr'):
    for data in record.findAll('td'):
        results = results + [data.text.replace(u'\xa0', u'').strip()]

        print(results)

        with open('myfile.csv', 'w') as f:
            for item in results:
                f.write(item + ',')

There are no errors in the PyCharm console.

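There is no need to request the updated URL again. urlopen sends a separate request that does not carry the form submitted in the browser, so it most likely never receives the results table. Instead, wait a few seconds with the time module for the results page to load, then parse browser.page_source.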

import os
import time

from bs4 import BeautifulSoup
from selenium import webdriver

os.environ["PATH"] += os.pathsep + r'/home/pierre/PycharmProjects/scraping/venv'
browser = webdriver.Firefox()
browser.get('http://karresults.nic.in/indexPUC_2019.asp')

# Fill in the registration number and submit the form.
# Note: Selenium 4 removed find_element_by_*; use find_element(By.ID, 'reg') there.
reg = browser.find_element_by_id('reg')
reg.send_keys('738286')

sub = browser.find_element_by_class_name('btn-default')
sub.click()

# Give the results page a few seconds to load.
time.sleep(3)

# Parse the HTML the browser is showing instead of re-requesting the URL.
soup = BeautifulSoup(browser.page_source, 'lxml')
results = []
for record in soup.find_all('tr'):
    for data in record.find_all('td'):
        results.append(data.text.replace(u'\xa0', u'').strip())

# Write the file once, after all cells have been collected.
with open('myfile.csv', 'w') as f:
    for item in results:
        f.write(item + ',')
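
A fixed time.sleep(3) can be too short on a slow connection and longer than needed on a fast one. One possible alternative is to poll until the results are actually there. The sketch below is a generic standard-library helper; the name wait_until and its parameters are illustrative and not part of Selenium (Selenium ships its own WebDriverWait class for the same purpose). The commented usage line assumes the results page contains the text 'FINAL RESULT'.

```python
import time


def wait_until(condition, timeout=10.0, poll_interval=0.5):
    """Call condition() repeatedly until it returns a truthy value.

    Returns that value, or raises TimeoutError once `timeout` seconds pass.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError("condition not met within %.1f s" % timeout)
        time.sleep(poll_interval)


# Hypothetical usage with the browser object from the answer above
# (assumes the loaded results page contains the text 'FINAL RESULT'):
# wait_until(lambda: 'FINAL RESULT' in browser.page_source, timeout=15)
```

With something like this in place of time.sleep(3), the script continues as soon as the page is ready and fails with a clear error if it never loads.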

CSV file output:

Name,ANIKET ANIL BALEKUNDRI,Reg. No.,738286,ENGLISH,76,,76P,HINDI,76,,76P,Part A - TOTAL,152,PHYSICS,44,30,74P,CHEMISTRY,46,30,76P,MATHEMATICS,73,,73P,BIOLOGY,55,29,84P,Part B - TOTAL,307,GRAND TOTAL MARKS,459,FINAL RESULT,First Class,
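
The output above is a single line with a trailing comma, and a cell that itself contained a comma would break it. If one CSV line per table row is preferred, the csv module can take care of quoting and line endings. This is a minimal sketch, assuming rows is a list of lists of cell strings (for example, the td texts collected per tr); the function name write_rows, the file name, and the grouping of the sample values are illustrative.

```python
import csv


def write_rows(rows, path):
    """Write each list of cell strings as one CSV line."""
    # newline='' lets the csv module control line endings itself.
    with open(path, 'w', newline='', encoding='utf-8') as f:
        csv.writer(f).writerows(rows)


# Sample values taken from the output shown above; grouping them by
# table row like this is an assumption about the page's layout.
rows = [
    ['Name', 'ANIKET ANIL BALEKUNDRI', 'Reg. No.', '738286'],
    ['ENGLISH', '76', '', '76P'],
    ['PHYSICS', '44', '30', '74P'],
    ['FINAL RESULT', 'First Class'],
]
write_rows(rows, 'myfile_rows.csv')
```

In the scraping loop, this would mean building one inner list per tr instead of appending every cell to a single flat list.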

