How can I export scraped data to CSV - Selenium
I have 2 functions that scrape data from Google, but I'm not sure how to export my results to a CSV file with header and link columns. Can you help me with this?
def get_search_attributes(driver):
    headers = driver.find_elements_by_xpath('//*[@id="rso"]/div/div/div/div/a/h3')
    headers = [header.text for header in headers]
    # print(headers)
    links = driver.find_elements_by_xpath('//*[@id="rso"]/div/div/div/div/a')
    links = [link.get_attribute('href') for link in links]
    # print(links)
    headers_df = pd.DataFrame(headers, columns=["headers"])
    links_df = pd.DataFrame(links, columns=["links"])
    return headers_df, links_df
def search_multiple_pages(driver, page_limit = 5):
    insert_search_value(driver)
    pagecounter = 0
    while pagecounter <= page_limit:
        get_search_attributes(driver)
        next_page_btn = driver.find_elements_by_xpath("//a[@id='pnnext']")
        if len(next_page_btn) < 1:
            print('no more pages')
            break
        else:
            element = WebDriverWait(driver, 5).until(expected_conditions.element_to_be_clickable((By.ID, 'pnnext')))
            driver.execute_script("return arguments[0].scrollIntoView();", element)
            element.click()
            pagecounter += 1
    return
header_csv = headers_df.to_csv()  # optional arguments can be passed here
links_csv = links_df.to_csv()  # optional arguments can be passed here
with open("filename.csv", "a") as f:
    f.write(header_csv)
    f.write(links_csv)  # or in any order
For details, see the pandas to_csv documentation and Python's file write function.
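One thing to keep in mind with this approach: writing the two CSV strings one after the other stacks them vertically in the file rather than producing two columns. A minimal sketch of a side-by-side alternative, assuming both frames have the same number of rows (the sample values here are hypothetical stand-ins for the scraped data):

```python
import pandas as pd

# Hypothetical sample data standing in for the scraped results.
headers_df = pd.DataFrame(["A", "B"], columns=["headers"])
links_df = pd.DataFrame(["https://A", "https://B"], columns=["links"])

# With no path argument, to_csv returns the CSV text as a string.
header_csv = headers_df.to_csv(index=False)

# Concatenating along axis=1 places the two frames side by side,
# so each row holds one header and its link.
combined = pd.concat([headers_df, links_df], axis=1)
combined_csv = combined.to_csv(index=False)
print(combined_csv)
```

If the two frames can differ in length, the shorter column is padded with missing values, so it may be worth checking the lengths first.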
You should use a dictionary to put everything in a single DataFrame.
#headers_df = pd.DataFrame(headers, columns=["headers"])
#links_df = pd.DataFrame(links, columns=["links"])
df = pd.DataFrame({"headers": headers, "links": links})
df.to_csv(filename)
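Because the question loops over several result pages, the rows from each page would need to be collected before writing. A rough sketch of that idea, without Selenium: the `pages` list and the `rows_from_page` helper are hypothetical and only stand in for what `get_search_attributes(driver)` would return on each page, and `results.csv` is an assumed file name.

```python
import pandas as pd

def rows_from_page(headers, links):
    """Pair each title with its link for one results page (hypothetical helper)."""
    # zip stops at the shorter list, so unmatched extras are dropped.
    return [{"headers": h, "links": l} for h, l in zip(headers, links)]

# Hypothetical per-page results standing in for the scraped data.
pages = [
    (["A", "B"], ["https://A", "https://B"]),
    (["C"], ["https://C"]),
]

all_rows = []
for headers, links in pages:
    all_rows.extend(rows_from_page(headers, links))

# Build one DataFrame from all pages and write it once.
df = pd.DataFrame(all_rows, columns=["headers", "links"])
df.to_csv("results.csv", index=False)
```

Pairing with `zip` is a design choice here: the dictionary form `pd.DataFrame({"headers": headers, "links": links})` raises an error when the two lists differ in length, which can happen if some matched links have no title element.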
Example
import pandas as pd
df = pd.DataFrame({
"Headers": ['A', 'B', 'C'],
"Links": ['https://A', 'https://B', 'https://C']
})
print(df)
df.to_csv('data.csv')
Result:
  Headers      Links
0       A  https://A
1       B  https://B
2       C  https://C
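Note that the row numbers printed on the left are the DataFrame index, and by default `to_csv` writes them as an extra first column. If only the header and link columns are wanted in the file, passing `index=False` leaves the index out. A small sketch comparing the two, using the same sample data:

```python
import pandas as pd

df = pd.DataFrame({
    "Headers": ["A", "B", "C"],
    "Links": ["https://A", "https://B", "https://C"],
})

# Default: the index is written as an unnamed first column.
with_index = df.to_csv()
# index=False: only the data columns are written.
without_index = df.to_csv(index=False)

print(with_index.splitlines()[0])     # ,Headers,Links
print(without_index.splitlines()[0])  # Headers,Links
```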