How to retrieve text from the response of a Python request

Question

I am trying to search inside the response of a request (I am using Requests with Python). I get the response and check its type, which is unicode.

I want to retrieve a specific link that is located between two other strings. I have tried different approaches found online, such as:

result = re.search('Currently: <a ', s)
url_file = response.find('Currently: <a ', beg=0, end=len(response))

I also tried converting the unicode string to a normal string:

s = unicodedata.normalize(response, title).encode('ascii','ignore')

I get an error.

EDITED

For example:

This works:

    s = 'asdf=5;iwantthis123jasd'
    result = re.search('asdf=5;(.*)123jasd', s)
    print result.group(1)

This doesn't work (returns an error):

    s = 'Currently: <a '
    result = re.search(r.text, s)
    print result.group(1)

Answer 1

You can access the response body as text through the response object's text attribute.

import re
import requests
res = requests.get("http://google.com")
re.search('pattern', res.text)

Then use a regular expression to search or match against the entire response.
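
As a minimal sketch of that idea, assuming the page contains a fragment shaped like `Currently: <a href="...">`: the sample HTML and the helper name `extract_link` below are made up for illustration, and in practice the string passed in would be `res.text`. The real markup may differ, so the pattern would need adjusting.

```python
import re

# Hypothetical sample standing in for res.text; the real page markup may differ.
html = u'<p>Currently: <a href="http://example.com/file.zip">file</a></p>'

def extract_link(text):
    """Return the href that follows 'Currently: <a ', or None if it is absent."""
    # The parentheses capture only the URL between the quotes.
    match = re.search(r'Currently: <a [^>]*href="([^"]+)"', text)
    return match.group(1) if match else None

print(extract_link(html))
```

Checking the match before calling group() avoids an AttributeError when the marker is not on the page, since re.search returns None in that case.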

Answer 2

You are using re.search incorrectly. The first argument of the function is the pattern and the second is the source string:

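import re
import requests

# Pattern first, then the string to search in.
s = r'<a class=gb1 href=[^>]+>'
r = requests.get('https://www.google.com/?q=python')
result = re.search(s, r.text)

# re.search returns None when nothing matches, so check before using the result.
if result:
    print(result.group(0))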

If you simply need a list of all matches, you can use re.findall(s, r.text).
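
To pull out only the part between two strings, one option is to add a capture group to the pattern. The sketch below runs on a made-up HTML sample instead of a live response (the URLs and the `gb2` class are assumptions for illustration), so it shows the mechanics only:

```python
import re

# Hypothetical sample standing in for r.text.
html = (u'<a class=gb1 href=http://a.example/>A</a> '
        u'<a class=gb1 href=http://b.example/>B</a>')

# With exactly one capture group, findall returns only the captured parts.
links = re.findall(r'<a class=gb1 href=([^>]+)>', html)
print(links)

# group(1) is only available when the pattern has a group and a match was found.
match = re.search(r'<a class=gb2 href=([^>]+)>', html)
first = match.group(1) if match else None
print(first)
```

This is also why the second example in the question fails: the arguments are swapped, and 'Currently: <a ' contains no group, so group(1) cannot succeed even when there is a match.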

Answer 1
3 2016-10-20 12:39:23

Answer 2
2 Accepted 2016-10-20 13:03:46
