
Python 3 requests.get().text returns unencoded string

In Python 3, requests.get().text returns an unencoded string. If I execute:

import requests
request = requests.get('https://google.com/search?q=Кто является президентом России?').text.lower()
print(request)

I get something like this:

Кто является презид

I tried changing google.com to google.ru.

If I execute:

import requests
request = requests.get('https://google.ru/search?q=Кто является президентом России?').text.lower()
print(request)

I get something like this:

d0%9a%d1%82%d0%be+%d1%8f%d0%b2%d0%bb%d1%8f%d0%b5%d1%82%d1%81%d1%8f+%d0%bf%d1%80%d0%b5%d0%b7%d0%b8%d0%b4%d0%b5%d0%bd%d1%82%d0%be%d0%bc+%d0%a0%d0%be%d1%81%d1%81%d0%b8%d0

I need to get a normal, readable (correctly decoded) string.

You got this output because requests could not identify the correct encoding of the response. If you are sure about the response encoding, you can set it as follows:

response = requests.get(url)
response.encoding            # check the encoding requests detected
response.encoding = "utf-8"  # or any other encoding

Then read the content through the .text attribute.
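To see why the chosen encoding matters, here is a minimal sketch that needs no network access. The bytes object `body` stands in for `response.content`, and `decode_body` is a hypothetical helper, not part of requests. Decoding UTF-8 bytes as ISO-8859-1 produces the same kind of garbled text shown in the question, while decoding them as UTF-8 gives the readable string. With a real response, setting `response.encoding = "utf-8"` before reading `response.text` should have the same effect; `response.apparent_encoding` can suggest an encoding if you are unsure.

```python
def decode_body(body: bytes, encoding: str = "utf-8") -> str:
    """Decode raw response bytes with an explicitly chosen encoding."""
    return body.decode(encoding, errors="replace")


# Stand-in for response.content: the UTF-8 bytes of a Cyrillic string.
body = "Кто является президентом России?".encode("utf-8")

garbled = decode_body(body, "iso-8859-1")  # wrong guess: mojibake
readable = decode_body(body, "utf-8")      # correct encoding: readable text

print(garbled)
print(readable)
```

Because ISO-8859-1 maps every byte to a character, the garbled string here can still be repaired with `garbled.encode("iso-8859-1").decode("utf-8")`, but setting the right encoding up front is simpler.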

I fixed it with the urllib.parse.unquote() function:

import requests
from urllib.parse import unquote

request = unquote(requests.get('https://google.ru/search?q=Кто является президентом России?').text.lower())
print(request)
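The second output in the question looks like URL percent-encoding of the query rather than a charset problem, which is presumably why unquote() helps. The following sketch, using only the standard library, round-trips the same query string. The variable names are illustrative, and it assumes the page echoes the query in the form produced by `quote_plus` (spaces as `+`).

```python
from urllib.parse import quote_plus, unquote, unquote_plus

query = "Кто является президентом России?"

# Percent-encode the query the way it appears inside a URL (spaces become '+').
encoded = quote_plus(query)

# Hex digits in percent-escapes are case-insensitive, so a lowercased
# copy (as produced by .text.lower()) still decodes correctly.
decoded = unquote_plus(encoded.lower())

# Plain unquote() decodes the escapes but leaves '+' characters in place.
decoded_keep_plus = unquote(encoded)

print(encoded)
print(decoded)
print(decoded_keep_plus)
```

If the spaces should come back as spaces, `unquote_plus` is the closer match; `unquote` alone leaves the `+` separators in the result.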


 