How do I get the contents of a table into an array using Selenium?

Question

Hey guys, I want to get the contents of a table on a website using Selenium.

This is my first attempt:

from selenium import webdriver

driver = webdriver.Chrome()
driver.get('https://www.javatpoint.com/html-table')
texts = driver.find_elements_by_xpath("//td/table")
print(texts)

After getting a result, I wanted to convert it to its text content using

get_property('textContent')

but I got the error AttributeError: 'NoneType' object has no attribute 'get_property'

So I would like to know how to get the table contents and convert them to an array.

P.S. Python version 3.7.4

Answer 1

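If you want all the text in each table, just use .text, like this:

from pprint import pprint
from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()
driver.get('https://www.javatpoint.com/html-table')
# find_elements(By.XPATH, ...) replaces find_elements_by_xpath, which Selenium 4 removed
texts = [i.text for i in driver.find_elements(By.XPATH, "//td/table")]
pprint(texts)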

Output:

['Tag Description\n'
 '<table> It defines a table.\n'
 '<tr> It defines a row in a table.\n'
 '<th> It defines a header cell in a table.\n'
 '<td> It defines a cell in a table.\n'
 '<caption> It defines the table caption.\n'
 '<colgroup> It specifies a group of one or more columns in a table for '
 'formatting.\n'
 '<col> It is used with <colgroup> element to specify column properties for '
 'each column.\n'
 '<tbody> It is used to group the body content in a table.\n'
 '<thead> It is used to group the header content in a table.\n'
 '<tfooter> It is used to group the footer content in a table.',
 'First_Name Last_Name Marks\n'
 'Sonoo Jaiswal 60\n'
 'James William 80\n'
 'Swati Sironi 82\n'
 'Chetna Singh 72',
 'First_Name Last_Name Marks\n'
 'Sonoo Jaiswal 60\n'
 'James William 80\n'
 'Swati Sironi 82\n'
 'Chetna Singh 72',
 'Name Last Name Marks\n'
 'Sonoo Jaiswal 60\n'
 'James William 80\n'
 'Swati Sironi 82\n'
 'Chetna Singh 72',
 'Name Last Name Marks\n'
 'Sonoo Jaiswal 60\n'
 'James William 80\n'
 'Swati Sironi 82\n'
 'Chetna Singh 72',
 'Name Mobile No.\nAjeet Maurya 7503520801 9555879135',
 'Name Ajeet Maurya\nMobile No. 7503520801\n9555879135',
 'Element Chrome IE Firefox Opera Safari\n<table> Yes Yes Yes Yes Yes']
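Each string returned by .text holds one table row per line, so a list of row strings can be had by splitting on line breaks. A minimal sketch, using two shortened strings copied from the output above in place of a live browser session (the variable name tables is only illustrative):

```python
# Sample values shortened from the .text output shown above.
texts = [
    "First_Name Last_Name Marks\nSonoo Jaiswal 60\nJames William 80",
    "Name Mobile No.\nAjeet Maurya 7503520801 9555879135",
]

# One inner list per table, one string per table row.
tables = [text.splitlines() for text in texts]
print(tables[0])
```

This only separates rows. Cells inside a row stay joined by spaces, and because some cells contain spaces themselves, splitting a row on whitespace would not reliably recover the individual cells.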

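If you want to use get_property('textContent'), you can do it like this:

from pprint import pprint
from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()
driver.get('https://www.javatpoint.com/html-table')
# find_elements(By.XPATH, ...) replaces find_elements_by_xpath, which Selenium 4 removed
texts = [i.get_property('textContent') for i in driver.find_elements(By.XPATH, "//td/table")]
pprint(texts)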

Output:

['\n'
 'TagDescription\n'
 '<table>It defines a table.\n'
 '<tr>It defines a row in a table.\n'
 '<th>It defines a header cell in a table.\n'
 '<td>It defines a cell in a table.\n'
 '<caption>It defines the table caption.\n'
 '<colgroup>It specifies a group of one or more columns in a table for '
 'formatting.\n'
 '<col>It is used with <colgroup> element to specify column properties for '
 'each\n'
 'column.\n'
 '<tbody>It is used to group the body content in a table.\n'
 '<thead>It is used to group the header content in a table.\n'
 '<tfooter>It is used to group the footer content in a table.\n',
 '\n'
 'First_NameLast_NameMarks\n'
 'SonooJaiswal60\n'
 'JamesWilliam80\n'
 'SwatiSironi82\n'
 'ChetnaSingh72\n',
 '\n'
 'First_NameLast_NameMarks\n'
 'SonooJaiswal60\n'
 'JamesWilliam80\n'
 'SwatiSironi82\n'
 'ChetnaSingh72\n',
 '\n'
 '  \n'
 '  Name\n'
 '  Last Name\n'
 '  Marks\n'
 '  \n'
 ' SonooJaiswal60\n'
 'JamesWilliam80\n'
 'SwatiSironi82\n'
 'ChetnaSingh72\n',
 '\n'
 '  \n'
 '  Name\n'
 '  Last Name\n'
 '  Marks\n'
 '  \n'
 ' SonooJaiswal60\n'
 'JamesWilliam80\n'
 'SwatiSironi82\n'
 'ChetnaSingh72\n',
 '\n'
 '  \n'
 '  Name\n'
 '  Mobile No.\n'
 '  \n'
 '  \n'
 '  Ajeet Maurya\n'
 '  7503520801\n'
 '  9555879135\n'
 '  \n',
 '  \nNameAjeet Maurya  \nMobile No.7503520801  \n9555879135  \n',
 '\nElement Chrome IE Firefox Opera Safari\n<table>YesYesYesYesYes\n']
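Both variants above return one string per table. If a nested array (rows of cells) is wanted instead, one possible approach is to walk each table's rows and read every cell separately. The sketch below is an assumption-laden illustration, not part of the original answer: table_to_rows and the XPATH constant are hypothetical names, and the function only relies on the element offering find_elements(by, value) and on cells offering .text, as Selenium's WebElement does.

```python
# Locator strategy string; this is the value that selenium's By.XPATH holds.
XPATH = "xpath"


def table_to_rows(table):
    """Return a table element as a list of rows, each a list of cell texts."""
    rows = []
    # ".//tr" also matches rows of any table nested inside this one.
    for tr in table.find_elements(XPATH, ".//tr"):
        # Collect header and data cells that are direct children of the row.
        cells = tr.find_elements(XPATH, "./th | ./td")
        rows.append([cell.text for cell in cells])
    return rows
```

With the driver from the examples above, calling table_to_rows on each element returned by driver.find_elements(By.XPATH, "//td/table") should give one nested list per table. Reading cell by cell avoids guessing where one cell ends and the next begins, at the cost of more round trips to the browser.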

Answer 2

This stores the table's cell contents in an array:

from selenium import webdriver
from selenium.webdriver.common.by import By

# chromedriver is looked up on PATH; the original passed '/usr/local/bin/chromedriver' explicitly.
driver = webdriver.Chrome()
driver.implicitly_wait(15)  # wait up to 15 seconds for elements to appear

driver.get("https://www.javatpoint.com/html-table")

# Every <td> of the first table with class "alt" that follows the "HTML table" paragraph.
cells = driver.find_elements(By.XPATH, "(//p[contains(text(),'HTML table')]/following-sibling::table[@class='alt'])[1]//tr/td")
x = []
for cell in cells:
    x.append(cell.text)
print(x)  # print() call, since the question uses Python 3
driver.quit()

Output (the cell texts, shown one per line):

<table>
It defines a table.
<tr>
It defines a row in a table.
<th>
It defines a header cell in a table.
<td>
It defines a cell in a table.
<caption>
It defines the table caption.
<colgroup>
It specifies a group of one or more columns in a table for formatting.
<col>
It is used with <colgroup> element to specify column properties for each column.
<tbody>
It is used to group the body content in a table.
<thead>
It is used to group the header content in a table.
<tfooter>
It is used to group the footer content in a table.
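The result above is a flat list in which tag names and descriptions alternate. If rows are needed, the flat list can be regrouped afterwards. A small sketch, assuming the table has exactly two columns as the output suggests (group_into_rows is a hypothetical helper, and the sample list is shortened from the output above):

```python
def group_into_rows(cells, columns):
    """Split a flat list of cell texts into rows of `columns` cells each."""
    return [cells[i:i + columns] for i in range(0, len(cells), columns)]


# First four cell texts from the output above.
flat = ["<table>", "It defines a table.", "<tr>", "It defines a row in a table."]
rows = group_into_rows(flat, 2)
print(rows)
# [['<table>', 'It defines a table.'], ['<tr>', 'It defines a row in a table.']]
```

This works only when every row has the same number of cells. A table with merged cells would need its rows read one at a time instead.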


Answer 1
0 Accepted 2019-10-06 10:37:52

Answer 2
0 2019-10-06 11:28:47
