Python not creating csv file after parsing - using selenium and beautifulsoup
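My code has two parts:
1. Use Selenium to open the browser and enter the details needed to get the results page.
2. Parse the HTML of the results page and write it to a CSV file.
The problem: the second part only works if I download the page and manually supply the local URL (on my computer). If I add the first part of the code, Selenium opens the browser but no CSV file is exported.
What I am working with: Ubuntu MATE 18.04, the PyCharm editor, and the Firefox browser.
I printed the output at each stage of the code and got the correct results. However, the output stops after the for loop.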
from bs4 import BeautifulSoup as soup
from urllib.request import urlopen as uReq
import pandas as pd
import csv
import os
from selenium import webdriver
from selenium.webdriver.common.keys import Keys

os.environ["PATH"] += os.pathsep + r'/home/pierre/PycharmProjects/scraping/venv'
browser = webdriver.Firefox()
browser.get('http://karresults.nic.in/indexPUC_2019.asp')
reg = browser.find_element_by_id('reg')
reg.send_keys('738286')
sub = browser.find_element_by_class_name('btn-default')
sub.click()

url = browser.current_url
my_url = url
uClient = uReq(my_url)
page_html = uClient.read()
uClient.close()
page_soup = soup(page_html, "html.parser")

results = []
for record in page_soup.findAll('tr'):
    for data in record.findAll('td'):
        results = results + [data.text.replace(u'\xa0', u'').strip()]
print(results)

with open('myfile.csv', 'w') as f:
    for item in results:
        f.write(item + ',')
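No errors appear in the PyCharm console.
There is no need to request the updated URL again. Instead, wait a few seconds (using the time module) for the results page to load, then parse the page source directly from the browser.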
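import os
import time

from bs4 import BeautifulSoup
from selenium import webdriver

# Add the directory that holds the Firefox driver to PATH
os.environ["PATH"] += os.pathsep + r'/home/pierre/PycharmProjects/scraping/venv'
browser = webdriver.Firefox()
browser.get('http://karresults.nic.in/indexPUC_2019.asp')

# Selenium 3 API; in Selenium 4 use browser.find_element(By.ID, 'reg') and so on
reg = browser.find_element_by_id('reg')
reg.send_keys('738286')
sub = browser.find_element_by_class_name('btn-default')
sub.click()

# Give the results page time to load before reading it
time.sleep(3)

# Parse the page already loaded in the browser instead of fetching the URL again
soup = BeautifulSoup(browser.page_source, 'lxml')

results = []
for record in soup.find_all('tr'):
    for data in record.find_all('td'):
        # Drop non-breaking spaces and surrounding whitespace
        results.append(data.text.replace(u'\xa0', u'').strip())

with open('myfile.csv', 'w') as f:
    for item in results:
        f.write(item + ',')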
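A fixed time.sleep(3) can be too short on a slow connection and wasteful on a fast one. As a rough sketch of the alternative, the helper below polls a condition until it becomes truthy or a timeout expires. Note that wait_until, its parameters, and its defaults are hypothetical names made up for this illustration and are not part of Selenium; Selenium ships its own WebDriverWait class (in selenium.webdriver.support.ui) that serves the same purpose.

```python
import time


def wait_until(condition, timeout=10.0, poll=0.5):
    """Call condition() repeatedly until it returns a truthy value.

    Hypothetical helper for illustration only. Returns the truthy value,
    or raises TimeoutError once `timeout` seconds have passed.
    """
    deadline = time.monotonic() + timeout
    while True:
        value = condition()
        if value:
            return value
        if time.monotonic() >= deadline:
            raise TimeoutError('condition not met within %.1f seconds' % timeout)
        time.sleep(poll)
```

With the Selenium 3 API used in the answer, one possible call would be wait_until(lambda: browser.find_elements_by_tag_name('td')), which returns as soon as at least one table cell is present instead of always sleeping for three seconds.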
CSV file output:
Name,ANIKET ANIL BALEKUNDRI,Reg. No.,738286,ENGLISH,76,,76P,HINDI,76,,76P,Part A - TOTAL,152,PHYSICS,44,30,74P,CHEMISTRY,46,30,76P,MATHEMATICS,73,,73P,BIOLOGY,55,29,84P,Part B - TOTAL,307,GRAND TOTAL MARKS,459,FINAL RESULT,First Class,
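The output above is a single line that ends with a trailing comma, and it would be ambiguous if any cell contained a comma itself. A minimal sketch of one alternative, assuming you collect one list of cell strings per table row: the standard csv module writes one line per row and quotes fields where needed. The function name write_rows, the file name myfile_rows.csv, and the grouping of the sample cells into rows are assumptions for illustration; the cell values are taken from the output above.

```python
import csv


def write_rows(path, rows):
    """Write a list of rows (each a list of strings) as CSV, one line per row."""
    # newline='' lets the csv module control line endings itself
    with open(path, 'w', newline='', encoding='utf-8') as f:
        writer = csv.writer(f)
        writer.writerows(rows)


# Sample rows, assuming each <tr> yields one list of cell texts
rows = [
    ['ENGLISH', '76', '', '76P'],
    ['PHYSICS', '44', '30', '74P'],
    ['FINAL RESULT', 'First Class'],
]
write_rows('myfile_rows.csv', rows)
```

In the scraping loop, this would mean building one inner list per tr element rather than appending every td to a single flat list.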