How do I convert data generated in a loop into a DataFrame?

Question

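I am using the Selenium WebDriver with a for loop to extract data from a table on a website. How can I turn that data into a DataFrame and export it to a CSV file? I tried assigning 'value' to a pandas DataFrame, but it throws an error.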

from selenium import webdriver

url = "https://www.jambalakadi.info/status/"

driver = webdriver.Chrome(executable_path="chromedriver.exe")

driver.get(url)

row_count = len(driver.find_elements_by_xpath(" //*[@id='main_table_countries_today']/tbody[1]/tr "))
col_count = len(driver.find_elements_by_xpath(" //*[@id='main_table_countries_today']/tbody[1]/tr[1]/td "))

print('Number of row counts:', row_count)
print("Number of column counts:", col_count)


for r in range(2, row_count+1):
    for c in range(1, col_count+1):
        value = driver.find_element_by_xpath(" //*[@id='main_table_countries_today']/tbody[1]/tr["+str(r)+"]/td["+str(c)+"] ").text
        print(value, end=" ")

    print(" ")

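When I run the for loop, the 'value' variable prints the data, but I cannot create a DataFrame from it and export it to a CSV file with pandas.

I have updated the code. Is this the right way to do it?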

my_data = []
for r in range(2, row_count+1):
    for c in range(1, col_count+1):
        value = driver.find_element_by_xpath(" //*[@id='main_table_countries_today']/tbody[1]/tr["+str(r)+"]/td["+str(c)+"] ").text
        print(value, end=" ")
        for line in value:
            my_data.append(line[0],line[1],line[2])
        pd.DataFrame.from_records(my_data, columns=column).to_csv('output.csv')

    print(" ")

Answer 1

You need to use the function pd.DataFrame.from_records().

Example usage:

import pandas as pd
# Read the data (my_database is a placeholder for your own iterable of lines)
my_data = []
for line in my_database:
    # Preprocess the line so that it holds three fields: date, customer, price.
    # For example, line = line.split(" ") turns the line into a list of values.
    my_data.append([line[0], line[1], line[2]])  # date, customer and price, in that order

pd.DataFrame.from_records(my_data, columns=['date','customer','price']).to_csv('output.csv')
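Applied to the loop in the question, the same idea might look like the sketch below. It assumes the cell text can be fetched through a small callable; `rows_to_dataframe`, `get_cell` and the placeholder column names are hypothetical names introduced here for illustration, not part of the original code. Each table row becomes one list, `append` receives that single list, and the DataFrame is built once after the loops finish rather than inside them.

```python
import pandas as pd


def rows_to_dataframe(get_cell, row_count, col_count, columns):
    """Collect one list per table row, then build the DataFrame once."""
    rows = []
    for r in range(2, row_count + 1):  # starts at row 2, as in the question
        # Gather every cell of this row into a single list.
        row = [get_cell(r, c) for c in range(1, col_count + 1)]
        rows.append(row)  # list.append takes exactly one argument
    return pd.DataFrame.from_records(rows, columns=columns)


# Hypothetical usage with the driver from the question:
#
# def get_cell(r, c):
#     xpath = f"//*[@id='main_table_countries_today']/tbody[1]/tr[{r}]/td[{c}]"
#     return driver.find_element_by_xpath(xpath).text
#
# columns = [f"col_{i}" for i in range(1, col_count + 1)]  # placeholder names
# df = rows_to_dataframe(get_cell, row_count, col_count, columns)
# df.to_csv("output.csv", index=False)
```

Passing the cell lookup in as a function keeps the row-collecting logic separate from the browser, so it can be tried out without a running driver.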

Answer 2

This code uses pandas to load the data into a DataFrame and then export it to a CSV file.

from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
import pandas as pd
from bs4 import BeautifulSoup


driver=webdriver.Chrome(executable_path="chromedriver.exe")
driver.get("https://yourwebsitename.com")
WebDriverWait(driver,20).until(EC.visibility_of_element_located((By.CSS_SELECTOR,"table#main_table_countries_today")))
soup=BeautifulSoup(driver.page_source,'html.parser')
table=soup.find('table',attrs={"id":"main_table_countries_today"})
df=pd.read_html(str(table))
print(df[0])
df[0].to_csv('output.csv',index=False)

Update:

from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
import pandas as pd


driver=webdriver.Chrome(executable_path = "chromedriver.exe")
driver.get("https://yourwebsitename.com")
element=WebDriverWait(driver,20).until(EC.visibility_of_element_located((By.CSS_SELECTOR,"table#main_table_countries_today")))
table=driver.execute_script("return arguments[0].outerHTML;",element)
df=pd.read_html(str(table))
print(df[0])
df[0].to_csv('output.csv',index=False)
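One caveat with newer pandas releases: passing a literal HTML string straight to pd.read_html() raises a deprecation warning, and wrapping the string in io.StringIO avoids it. A minimal sketch of that step follows, assuming an HTML parser backend such as lxml is installed; `table_html_to_dataframe` and `SAMPLE_HTML` are hypothetical names, and the sample table is made-up data used only to show the call.

```python
from io import StringIO

import pandas as pd

# Made-up sample standing in for the outerHTML returned by the driver.
SAMPLE_HTML = """
<table id="main_table_countries_today">
  <thead><tr><th>Country</th><th>Cases</th></tr></thead>
  <tbody>
    <tr><td>A</td><td>10</td></tr>
    <tr><td>B</td><td>20</td></tr>
  </tbody>
</table>
"""


def table_html_to_dataframe(table_html):
    """Parse one <table> element, given as an HTML string, into a DataFrame."""
    # read_html returns a list of DataFrames, one per table it finds.
    tables = pd.read_html(StringIO(table_html))
    return tables[0]


# Hypothetical usage with the outerHTML string from the code above:
# df = table_html_to_dataframe(table)
# df.to_csv("output.csv", index=False)
```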


Answer 1 · 1 · 2020-03-30 13:41:50
Answer 2 · 1 · Accepted · 2020-03-30 14:16:56
