
How do I get the content of a table into an array using Selenium?

Hi everyone, I want to get the content of a table on a website using Selenium.

Here is my first try:

from selenium import webdriver

driver = webdriver.Chrome()
driver.get('https://www.javatpoint.com/html-table')
texts = driver.find_elements_by_xpath("//td/table")
print(texts)

The result is a list of WebElement objects, not their text.

Then I tried to convert it to text content using

get_property('textContent')

and got the error AttributeError: 'NoneType' object has no attribute 'get_property'

So I would like to know how to get the table content and turn it into an array.

P.S. Python version 3.7.4

If you are looking for all the text in the table, just use .text, like this:

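from pprint import pprint
from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()
driver.get('https://www.javatpoint.com/html-table')
# find_elements(By.XPATH, ...) replaces find_elements_by_xpath, which was removed in Selenium 4
texts = [i.text for i in driver.find_elements(By.XPATH, "//td/table")]
pprint(texts)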

Output:

['Tag Description\n'
 '<table> It defines a table.\n'
 '<tr> It defines a row in a table.\n'
 '<th> It defines a header cell in a table.\n'
 '<td> It defines a cell in a table.\n'
 '<caption> It defines the table caption.\n'
 '<colgroup> It specifies a group of one or more columns in a table for '
 'formatting.\n'
 '<col> It is used with <colgroup> element to specify column properties for '
 'each column.\n'
 '<tbody> It is used to group the body content in a table.\n'
 '<thead> It is used to group the header content in a table.\n'
 '<tfooter> It is used to group the footer content in a table.',
 'First_Name Last_Name Marks\n'
 'Sonoo Jaiswal 60\n'
 'James William 80\n'
 'Swati Sironi 82\n'
 'Chetna Singh 72',
 'First_Name Last_Name Marks\n'
 'Sonoo Jaiswal 60\n'
 'James William 80\n'
 'Swati Sironi 82\n'
 'Chetna Singh 72',
 'Name Last Name Marks\n'
 'Sonoo Jaiswal 60\n'
 'James William 80\n'
 'Swati Sironi 82\n'
 'Chetna Singh 72',
 'Name Last Name Marks\n'
 'Sonoo Jaiswal 60\n'
 'James William 80\n'
 'Swati Sironi 82\n'
 'Chetna Singh 72',
 'Name Mobile No.\nAjeet Maurya 7503520801 9555879135',
 'Name Ajeet Maurya\nMobile No. 7503520801\n9555879135',
 'Element Chrome IE Firefox Opera Safari\n<table> Yes Yes Yes Yes Yes']
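
Each string above is one whole table, with rows separated by '\n'. As a minimal sketch (the helper name text_to_rows is hypothetical, not part of Selenium), such a string could be split into a list of rows like this. It assumes every cell is a single word, so it would mis-split cells such as "Last Name" or "It defines a table.":

```python
def text_to_rows(table_text):
    """Split one table's .text string into rows of whitespace-separated tokens."""
    rows = []
    for line in table_text.split("\n"):
        if line.strip():              # skip empty lines
            rows.append(line.split())  # assumption: no cell contains a space
    return rows


# Sample taken from the output above (second table, first three rows).
sample = "First_Name Last_Name Marks\nSonoo Jaiswal 60\nJames William 80"
rows = text_to_rows(sample)
print(rows)
# [['First_Name', 'Last_Name', 'Marks'], ['Sonoo', 'Jaiswal', '60'], ['James', 'William', '80']]
```

For tables whose cells contain spaces, reading each cell element separately is likely more reliable than splitting the text.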

If you want to use get_property('textContent') instead, you can do:

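from pprint import pprint
from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()
driver.get('https://www.javatpoint.com/html-table')
# find_elements(By.XPATH, ...) replaces find_elements_by_xpath, which was removed in Selenium 4
texts = [i.get_property('textContent') for i in driver.find_elements(By.XPATH, "//td/table")]
pprint(texts)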

Output:

['\n'
 'TagDescription\n'
 '<table>It defines a table.\n'
 '<tr>It defines a row in a table.\n'
 '<th>It defines a header cell in a table.\n'
 '<td>It defines a cell in a table.\n'
 '<caption>It defines the table caption.\n'
 '<colgroup>It specifies a group of one or more columns in a table for '
 'formatting.\n'
 '<col>It is used with <colgroup> element to specify column properties for '
 'each\n'
 'column.\n'
 '<tbody>It is used to group the body content in a table.\n'
 '<thead>It is used to group the header content in a table.\n'
 '<tfooter>It is used to group the footer content in a table.\n',
 '\n'
 'First_NameLast_NameMarks\n'
 'SonooJaiswal60\n'
 'JamesWilliam80\n'
 'SwatiSironi82\n'
 'ChetnaSingh72\n',
 '\n'
 'First_NameLast_NameMarks\n'
 'SonooJaiswal60\n'
 'JamesWilliam80\n'
 'SwatiSironi82\n'
 'ChetnaSingh72\n',
 '\n'
 '  \n'
 '  Name\n'
 '  Last Name\n'
 '  Marks\n'
 '  \n'
 ' SonooJaiswal60\n'
 'JamesWilliam80\n'
 'SwatiSironi82\n'
 'ChetnaSingh72\n',
 '\n'
 '  \n'
 '  Name\n'
 '  Last Name\n'
 '  Marks\n'
 '  \n'
 ' SonooJaiswal60\n'
 'JamesWilliam80\n'
 'SwatiSironi82\n'
 'ChetnaSingh72\n',
 '\n'
 '  \n'
 '  Name\n'
 '  Mobile No.\n'
 '  \n'
 '  \n'
 '  Ajeet Maurya\n'
 '  7503520801\n'
 '  9555879135\n'
 '  \n',
 '  \nNameAjeet Maurya  \nMobile No.7503520801  \n9555879135  \n',
 '\nElement Chrome IE Firefox Opera Safari\n<table>YesYesYesYesYes\n']
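
The textContent strings keep the indentation and blank lines of the page source. A rough sketch of tidying one such string follows (clean_text_content is a hypothetical helper, not a Selenium call). It only strips padding and drops empty lines, so it cannot restore cell boundaries where the page put nothing between cells, as in 'SonooJaiswal60':

```python
def clean_text_content(raw):
    """Return the non-empty lines of a textContent string, stripped of padding."""
    return [line.strip() for line in raw.splitlines() if line.strip()]


# Sample taken from the output above (the "Name / Mobile No." table).
raw = ("\n  \n  Name\n  Mobile No.\n  \n  \n"
       "  Ajeet Maurya\n  7503520801\n  9555879135\n  \n")
cleaned = clean_text_content(raw)
print(cleaned)
# ['Name', 'Mobile No.', 'Ajeet Maurya', '7503520801', '9555879135']
```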

This stores the text of each table cell in an array:

from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()  # searches PATH for chromedriver
driver.implicitly_wait(15)   # wait up to 15 seconds for elements to appear

driver.get("https://www.javatpoint.com/html-table")

# All <td> cells of the first table with class 'alt' that follows the 'HTML table' paragraph
cells = driver.find_elements(By.XPATH, "(//p[contains(text(),'HTML table')]/following-sibling::table[@class='alt'])[1]//tr/td")
x = []
for cell in cells:
    x.append(cell.text)
print(x)  # print is a function in Python 3
driver.quit()

Output (cell texts, shown one per line):

<table>
It defines a table.
<tr>
It defines a row in a table.
<th>
It defines a header cell in a table.
<td>
It defines a cell in a table.
<caption>
It defines the table caption.
<colgroup>
It specifies a group of one or more columns in a table for formatting.
<col>
It is used with <colgroup> element to specify column properties for each column.
<tbody>
It is used to group the body content in a table.
<thead>
It is used to group the header content in a table.
<tfooter>
It is used to group the footer content in a table.
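
The list above is flat, so the pairing of tag and description is lost. If a two-dimensional array is wanted (one inner list per row), a sketch along the following lines might work. The names table_to_rows and demo are hypothetical, the XPath expressions are assumptions about the page layout, and rows of any nested tables would be included as well:

```python
XPATH = "xpath"  # same string value as selenium.webdriver.common.by.By.XPATH


def table_to_rows(table):
    """Return a list of rows, each a list of cell texts, for one table element."""
    rows = []
    for tr in table.find_elements(XPATH, ".//tr"):
        # Header cells (<th>) and data cells (<td>) that are direct children of the row
        cells = tr.find_elements(XPATH, "./th | ./td")
        rows.append([cell.text for cell in cells])
    return rows


def demo():
    """Hypothetical usage; needs Selenium, Chrome and a matching chromedriver."""
    from selenium import webdriver  # imported here so the helper above has no hard dependency

    driver = webdriver.Chrome()
    try:
        driver.get("https://www.javatpoint.com/html-table")
        for table in driver.find_elements(XPATH, "//td/table"):
            print(table_to_rows(table))
    finally:
        driver.quit()  # always close the browser, even if a lookup fails
```

Calling demo() would print one nested list per table. table_to_rows only relies on find_elements and .text, so it should accept any WebElement that wraps a table.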
