
How do I get the text and not the webelement object?

I am able to go through the loop and print the right result, but when I try to save the same data to a text file, the file does not contain that data. I am probably missing something very simple, or making a mistake in how I use the pandas library. Any help would be great.

from time import sleep
import pandas as pd

from selenium import webdriver
from selenium.webdriver.common.action_chains import ActionChains
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

term = driver.find_elements_by_xpath("//tbody/tr")

for row in term:
    columns = row.find_elements_by_xpath("./td")
    termres = []
    for i in columns:
        termres.append(i.text)
    if len(termres) == 7:
        print(termres[0] + '\t' + termres[1] + '\t' + termres[2] + '\t' + termres[3] + '\t' + termres[4])
    elif len(termres) == 10:
        print(termres[0] + '\t' + termres[1] + '\t' + termres[3] + '\t' + termres[4] + '\t' + termres[5])
    elif len(termres) == 1 and termres[0] == 'Unofficial Transcript':
        print('-')
    elif len(termres) == 6 and termres[0].isalpha():
        print(termres[0] + '\t' + termres[1] + '\t' + termres[3] + '\t' + termres[4] )
    #"""
df = pd.DataFrame({'':term})
df.to_csv('term.txt', index= False)
print('downloaded')

The print statements produce a very long list, so here is part of the output as a sample:

CHEM    101     General Chemistry I     TR      3.000
CHEM    103     General Chemistry Lab I TR      1.000
CHEM    151     General Chemistry I     TR      3.000
CHEM    153     General Chemistry I Laboratory  TR      1.000
-
-

Then this is the code that writes to the text file:

df = pd.DataFrame({'':term})
df.to_csv('term.txt', index= False)
print('downloaded')

Result from the above code in the text file:

"<selenium.webdriver.remote.webelement.WebElement (session=""0c17f0126422f2144127b971ad19e1f6"", element=""65e1e68a-e2c3-48e9-a89d-72c7c1d2811e"")>"
"<selenium.webdriver.remote.webelement.WebElement (session=""0c17f0126422f2144127b971ad19e1f6"", element=""d6f0374b-fe70-4499-abb8-75275f92fc59"")>"

So the question is how do I get the text and not the webelement object?

As per the line of code:

term = driver.find_elements_by_xpath("//tbody/tr")

term is a list of <tr> elements and each WebElement is represented as:

<selenium.webdriver.remote.webelement.WebElement (session="d4f20fd17bf4037ed8cf50b00e844a7f", element="f12cf837-6c77-4c90-9da2-7b5fb9da9e5d")>

Later in your program you traverse down to the descendant <td> elements of each <tr> and even print the desired texts. However, when constructing the DataFrame you pass term, which is still the list of <tr> WebElements, instead of the texts you extracted.

Hence the string representations of those WebElements, not their text, are written to the text file term.txt.
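One way to fix this is to keep the selected texts in a list while looping and write that list instead of term. The following is only a minimal sketch: the helper names select_columns, collect_records and write_records are made up for illustration, the column indices simply mirror the print statements in the question, and plain file writing stands in for pandas so the example runs without a browser.

```python
def select_columns(cells):
    """Return the cell texts to keep for one table row, or None to skip it.

    `cells` is a list of strings (the .text of each <td>), not WebElements.
    The index choices mirror the print statements in the question.
    """
    n = len(cells)
    if n == 7:
        return cells[0:5]
    if n == 10:
        return [cells[0], cells[1], cells[3], cells[4], cells[5]]
    if n == 1 and cells[0] == 'Unofficial Transcript':
        return ['-']
    if n == 6 and cells[0].isalpha():
        return [cells[0], cells[1], cells[3], cells[4]]
    return None


def collect_records(rows_of_cells):
    """Build the list of selected texts, one entry per row that matched."""
    records = []
    for cells in rows_of_cells:
        selected = select_columns(cells)
        if selected is not None:
            records.append(selected)
    return records


def write_records(records, path):
    """Write one tab-separated line per record."""
    with open(path, 'w', encoding='utf-8') as f:
        for record in records:
            f.write('\t'.join(record) + '\n')


# With the question's `term` list this would be used roughly as follows
# (assumes a live `driver`, so it is left as a comment here):
#
#   rows_of_cells = [[td.text for td in row.find_elements_by_xpath("./td")]
#                    for row in term]
#   write_records(collect_records(rows_of_cells), 'term.txt')
```

If you would rather stay with pandas, passing the same records list to pd.DataFrame and calling to_csv with sep='\t', index=False and header=False should give a similar file, with shorter rows padded by empty fields.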
