How to convert replacement characters into UTF-8 characters with Python after grabbing text using Selenium

I grabbed text from the child nodes of a specific class using this subroutine:

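from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait
elements = WebDriverWait(self.driver, 10).until(EC.presence_of_all_elements_located((By.CLASS_NAME, "gamesRow")))
for a in elements:
    self.formatBets(a.text)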

The string returned by the .text attribute looks something like this:

May 05 1
NASHVILLE PREDATORS
-1�+150
o5�+110
-185
More +
4:08 PM
  2 COLUMBUS BLUE JACKETS
+1�-170
u5�-130
+165

How would I go about converting those replacement characters (�) back into the characters that appear in the original HTML?

In the original HTML, the characters shown here as replacement characters are fractions, specifically 1/2.

Sorry if this is a bit confusing. I'm new to web scraping and HTML, so if this needs more clarification please let me know and I'll edit the question!

Try this change to your code: just encode the text. Are you using Python 2 or Python 3?

self.formatBets(a.text.encode("utf-8"))
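
Note that in Python 3, .encode("utf-8") returns a bytes object and leaves U+FFFD in place (it simply becomes the bytes EF BF BD), so encoding alone may not bring the fraction back. Below is a minimal sketch of a workaround, assuming every replacement character in the scraped text stood for ½ (U+00BD), as the question describes. The name restore_half is a hypothetical helper, not part of Selenium.

```python
REPLACEMENT_CHAR = "\ufffd"  # U+FFFD, rendered as "�"
HALF = "\u00bd"              # "½" (assumption: this is the character that was lost)


def restore_half(text: str) -> str:
    """Replace every U+FFFD with "½".

    Hypothetical helper: only valid if all lost characters were "½".
    """
    return text.replace(REPLACEMENT_CHAR, HALF)


sample = "-1\ufffd+150"  # one line of the scraped output from the question

# Encoding does not recover the original character; it only yields bytes.
encoded = sample.encode("utf-8")
print(encoded)               # b'-1\xef\xbf\xbd+150'

# Substituting the replacement character restores the expected text.
print(restore_half(sample))  # -1½+150
```

With this in place, the loop body could call self.formatBets(restore_half(a.text)) instead. This only masks the symptom; if other characters are also being lost, the substitution would turn them into ½ as well.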

