How do I get the text and not the WebElement object?
I am able to go through the loop and print the right result, but when I try to save it I am unable to write the same data to a text file. I know I am missing something very simple, or maybe I am making a mistake in how I use the pandas library. If anyone can help out, that would be great.
from time import sleep
import pandas as pd
from selenium import webdriver
from selenium.webdriver.common.action_chains import ActionChains
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait
term = driver.find_elements_by_xpath("//tbody/tr")
for row in term:
    columns = row.find_elements_by_xpath("./td")
    termres = []
    for i in columns:
        termres.append(i.text)
    if len(termres) == 7:
        print(termres[0] + '\t' + termres[1] + '\t' + termres[2] + '\t' + termres[3] + '\t' + termres[4])
    elif len(termres) == 10:
        print(termres[0] + '\t' + termres[1] + '\t' + termres[3] + '\t' + termres[4] + '\t' + termres[5])
    elif len(termres) == 1 and termres[0] == 'Unofficial Transcript':
        print('-')
    elif len(termres) == 6 and termres[0].isalpha():
        print(termres[0] + '\t' + termres[1] + '\t' + termres[3] + '\t' + termres[4])
#"""
df = pd.DataFrame({'':term})
df.to_csv('term.txt', index= False)
print('downloaded')
The output from the print statements is a huge list, so here is part of it as a sample:
CHEM 101 General Chemistry I TR 3.000
CHEM 103 General Chemistry Lab I TR 1.000
CHEM 151 General Chemistry I TR 3.000
CHEM 153 General Chemistry I Laboratory TR 1.000
-
-
This is what gets written to the text file instead:
df = pd.DataFrame({'':term})
df.to_csv('term.txt', index= False)
print('downloaded')
#result from the above code in a text file.
"<selenium.webdriver.remote.webelement.WebElement (session=""0c17f0126422f2144127b971ad19e1f6"", element=""65e1e68a-e2c3-48e9-a89d-72c7c1d2811e"")>"
"<selenium.webdriver.remote.webelement.WebElement (session=""0c17f0126422f2144127b971ad19e1f6"", element=""d6f0374b-fe70-4499-abb8-75275f92fc59"")>"
So the question is: how do I get the text and not the WebElement object?
As per the line of code:
term = driver.find_elements_by_xpath("//tbody/tr")
term is a list of <tr> elements, and each WebElement is represented as:
<selenium.webdriver.remote.webelement.WebElement (session="d4f20fd17bf4037ed8cf50b00e844a7f", element="f12cf837-6c77-4c90-9da2-7b5fb9da9e5d")>
Further along in your program you traverse down to the descendants of each <tr> element and even print the desired text. However, when constructing the DataFrame you pass the term list itself, which is a list of WebElements, rather than the text you extracted from it.
Hence, the string representations of those WebElements are what gets written to the text file term.txt.
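One way to address this is to collect the extracted strings in a list while looping and write that list out, instead of writing term. The following is only a minimal sketch under a few assumptions: the helper names select_fields and write_rows are hypothetical, the column choices simply mirror the print statements in the question, and the standard-library csv module is swapped in for pandas so that the example runs without a browser or third-party packages.

```python
import csv


def select_fields(cells):
    """Return the columns the question prints for one row, or None to skip it.

    `cells` is a list of plain strings, one per <td> in the row.
    """
    if len(cells) == 7:
        return cells[0:5]
    if len(cells) == 10:
        return [cells[0], cells[1], cells[3], cells[4], cells[5]]
    if len(cells) == 1 and cells[0] == 'Unofficial Transcript':
        return ['-']
    if len(cells) == 6 and cells[0].isalpha():
        return [cells[0], cells[1], cells[3], cells[4]]
    return None


def write_rows(rows_of_cells, path):
    """Write the selected text fields to `path`, tab-separated, and return them."""
    selected = []
    for cells in rows_of_cells:
        fields = select_fields(cells)
        if fields is not None:
            selected.append(fields)
    # newline='' lets the csv module control line endings itself.
    with open(path, 'w', newline='') as handle:
        csv.writer(handle, delimiter='\t').writerows(selected)
    return selected
```

With Selenium, rows_of_cells could be built from the existing loop as [[td.text for td in row.find_elements_by_xpath("./td")] for row in term]. Because the collected values are now plain strings, passing the list returned by write_rows to pd.DataFrame(...) and calling to_csv('term.txt', sep='\t', index=False, header=False) should also write text rather than WebElement representations.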