
How to convert for loop generated data into a DataFrame?

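I am using Selenium WebDriver to extract data from a table on a website with a for loop. How can I convert that data into a DataFrame and export it to a CSV file? I tried to assign 'value' to a pandas DataFrame, but it throws an error.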

from selenium import webdriver

url = "https://www.jambalakadi.info/status/"

driver = webdriver.Chrome(executable_path="chromedriver.exe")

driver.get(url)

row_count = len(driver.find_elements_by_xpath(" //*[@id='main_table_countries_today']/tbody[1]/tr "))
col_count = len(driver.find_elements_by_xpath(" //*[@id='main_table_countries_today']/tbody[1]/tr[1]/td "))

print('Number of row counts:', row_count)
print("Number of column counts:", col_count)


for r in range(2, row_count+1):
    for c in range(1, col_count+1):
        value = driver.find_element_by_xpath(" //*[@id='main_table_countries_today']/tbody[1]/tr["+str(r)+"]/td["+str(c)+"] ").text
        print(value, end=" ")

    print(" ")


When I run the for loop, the 'value' variable prints the data, but I cannot create a DataFrame and export it to a CSV file using pandas.

I updated the code. Is the format correct?

my_data = []
for r in range(2, row_count+1):
    for c in range(1, col_count+1):
        value = driver.find_element_by_xpath(" //*[@id='main_table_countries_today']/tbody[1]/tr["+str(r)+"]/td["+str(c)+"] ").text
        print(value, end=" ")
        for line in value:
            my_data.append(line[0],line[1],line[2])
        pd.DataFrame.from_records(my_data, columns=column).to_csv('output.csv')

    print(" ")

You need to use the function pd.DataFrame.from_records().

Example usage:

import pandas as pd

# Read the data. my_database is a placeholder for any iterable of text lines;
# the two sample lines below are made up for illustration.
my_database = ["2020-01-01 alice 9.99", "2020-01-02 bob 4.50"]

my_data = []
for line in my_database:
    # Preprocess the line (say it holds 3 columns: date, customer, price).
    line = line.split(" ")  # line is now a list of values
    # Each index corresponds to date, customer and price respectively.
    my_data.append([line[0], line[1], line[2]])

pd.DataFrame.from_records(my_data, columns=['date', 'customer', 'price']).to_csv('output.csv')
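
Applied to the nested loop from the question, the same idea is to build one list per table row, append that list once the inner loop has finished, and create the DataFrame a single time after both loops. The sketch below is only an illustration: get_cell is a hypothetical callable standing in for the driver.find_element_by_xpath(...).text lookup, and the helper names are not from the original post.

```python
import pandas as pd


def collect_rows(get_cell, row_count, col_count, first_row=2):
    """Return a list of rows, each row being a list of cell texts.

    get_cell is a hypothetical callable (r, c) -> str that stands in for the
    Selenium XPath lookup in the question. Rows and columns are 1-based,
    matching XPath indexing; first_row=2 mirrors the question's range(2, ...).
    """
    rows = []
    for r in range(first_row, row_count + 1):
        # Build the whole row first, then append it once.
        row = [get_cell(r, c) for c in range(1, col_count + 1)]
        rows.append(row)
    return rows


def rows_to_frame(rows, columns):
    """Create the DataFrame once, after all rows have been collected."""
    return pd.DataFrame.from_records(rows, columns=columns)
```

With Selenium, get_cell could be a small function wrapping the XPath lookup from the question, and calling to_csv('output.csv') on the returned DataFrame writes the file. The column names have to be supplied by the caller, since the question does not list them.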

Here is code that uses pandas to load the data into a DataFrame and then export it to CSV.

from io import StringIO

from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
import pandas as pd
from bs4 import BeautifulSoup


# On Selenium 4+, pass a Service object instead of executable_path.
driver = webdriver.Chrome(executable_path="chromedriver.exe")
driver.get("https://yourwebsitename.com")
# Wait up to 20 seconds for the table to become visible.
WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "table#main_table_countries_today")))
soup = BeautifulSoup(driver.page_source, 'html.parser')
table = soup.find('table', attrs={"id": "main_table_countries_today"})
# read_html returns a list of DataFrames. Wrapping the markup in StringIO
# avoids the literal-HTML deprecation warning in newer pandas versions.
df = pd.read_html(StringIO(str(table)))
print(df[0])
df[0].to_csv('output.csv', index=False)

Update

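from io import StringIO

from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
import pandas as pd


# On Selenium 4+, pass a Service object instead of executable_path.
driver = webdriver.Chrome(executable_path="chromedriver.exe")
driver.get("https://yourwebsitename.com")
# Wait up to 20 seconds for the table, then keep the located element.
element = WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "table#main_table_countries_today")))
# Fetch the table's markup directly from the browser, without BeautifulSoup.
table = driver.execute_script("return arguments[0].outerHTML;", element)
# Wrapping the markup in StringIO avoids the literal-HTML deprecation
# warning in newer pandas versions.
df = pd.read_html(StringIO(table))
print(df[0])
df[0].to_csv('output.csv', index=False)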
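
Both versions above pass index=False to to_csv. A small sketch of what that flag changes, using made-up data and an in-memory buffer instead of a file (the helper name to_csv_text is only for illustration):

```python
import io

import pandas as pd


def to_csv_text(df, index):
    """Serialize df to CSV text in memory instead of writing a file."""
    buffer = io.StringIO()
    df.to_csv(buffer, index=index)
    return buffer.getvalue()


# Made-up data for illustration only.
df = pd.DataFrame({"country": ["A", "B"], "cases": [10, 20]})

# With the index, every line gains a leading field holding the row label.
with_index = to_csv_text(df, index=True)
# Without it, the header is exactly the column names.
without_index = to_csv_text(df, index=False)
```

Leaving index at its default of True is harmless, but the extra unnamed first column shows up again when the file is read back, so index=False is usually what is wanted for a scraped table.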
