Fetch list of friends in Python

I have been given a task to fetch the list of friends of a test user from the site LiveJournal, but I have never done anything like this. How do we work with an API and that kind of thing in Python?

livejournal.com/bots

Generally, avoid web scraping, where you take data directly from the HTML code of the website, because websites tend to be dynamic and their markup can change. Use web scraping as a last resort!

So always check first whether the website provides an API. As far as I can see, LiveJournal does provide a kind of API, but it may not give you the information you are looking for.

Nevertheless, working with an API is straightforward. First you must find the endpoint you want to reach, which is often a link like https://exampleusername.livejournal.com/data/rss. According to the page you linked, this endpoint returns a user's recent entries syndicated using the Really Simple Syndication (RSS) XML format.

After you find your endpoint, you can use the requests module in Python, which in my experience is really good. With this module you can send a request to the service endpoint to return the data that you queried. You can do that with the .get() method:

response = requests.get(API_ENDPOINT_URL)

Then you need to check whether the service failed to return the data, which is the case when the status code of the response is not 200:

# raise an exception if the response code is anything other than 200
if response.status_code != 200:
    raise RuntimeError(f"Request failed with status code {response.status_code}")
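The same check can be wrapped in a small reusable helper so that later code never runs on a failed response. This is only a sketch: `ensure_ok` is a hypothetical name, and the stand-in objects built with `SimpleNamespace` merely imitate the `status_code` and `content` attributes of a real requests response so that the example runs without network access. With a real response, `response.raise_for_status()` from requests serves a similar purpose, although it only raises for 4xx and 5xx status codes.

```python
from types import SimpleNamespace


def ensure_ok(response):
    """Return the response unchanged, or raise if its status code is not 200."""
    if response.status_code != 200:
        raise RuntimeError(f"Request failed with status code {response.status_code}")
    return response


# Stand-ins for real responses (assumption: only these attributes are used).
ok_response = SimpleNamespace(status_code=200, content=b"<rss/>")
missing_response = SimpleNamespace(status_code=404, content=b"")

print(ensure_ok(ok_response).content)

try:
    ensure_ok(missing_response)
except RuntimeError as error:
    print(error)
```

In real use you would pass the object returned by requests.get() straight into the helper.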

If everything went all right and the response code was 200, then you (most likely) have the data you wanted. Now you only need to handle it as you wish.

Note that requests does not provide support for parsing XML data, but you can use the built-in XML parsers in Python, as explained in this post. So after you get the data, you can use something like this to handle the XML:

from xml.etree import ElementTree
tree = ElementTree.fromstring(response.content)
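To illustrate what handling the parsed tree might look like, here is a minimal sketch that extracts entry titles and links from an RSS document. The sample feed is invented for the example and only follows the general RSS 2.0 layout (a `channel` containing `item` elements with `title` and `link`); a real LiveJournal feed will contain more fields. `extract_entries` is a hypothetical helper name.

```python
from xml.etree import ElementTree

# Invented sample data in the general RSS 2.0 shape (not real LiveJournal output).
SAMPLE_RSS = b"""<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0">
  <channel>
    <title>exampleusername</title>
    <item>
      <title>First entry</title>
      <link>https://exampleusername.livejournal.com/1.html</link>
    </item>
    <item>
      <title>Second entry</title>
      <link>https://exampleusername.livejournal.com/2.html</link>
    </item>
  </channel>
</rss>
"""


def extract_entries(xml_bytes):
    """Return a list of (title, link) tuples, one per <item> in the feed."""
    root = ElementTree.fromstring(xml_bytes)
    entries = []
    for item in root.iter("item"):
        # findtext returns the default when the child element is absent
        title = item.findtext("title", default="")
        link = item.findtext("link", default="")
        entries.append((title, link))
    return entries


for title, link in extract_entries(SAMPLE_RSS):
    print(title, link)
```

With a real response you would call extract_entries(response.content) instead of passing the sample.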

So a complete approach would look like this:

import requests
from xml.etree import ElementTree

# 'ohnotheydidnt' is the name of the user whose data you want to get
API_ENDPOINT_URL = "https://ohnotheydidnt.livejournal.com/data/rss"

# send the request and wait for the response
response = requests.get(API_ENDPOINT_URL)
# raise an exception if the response code is anything other than 200
if response.status_code != 200:
    raise RuntimeError(f"Request failed with status code {response.status_code}")

# get the XML data from the response
tree = ElementTree.fromstring(response.content)

# parse the tree and handle the data
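
Since the original goal was a list of friends, which the RSS feed does not contain, here is a hedged sketch of one possible route. It assumes that LiveJournal still offers the plain-text friends export that its bots page has historically described (https://www.livejournal.com/misc/fdata.bml?user=USERNAME), and that the export uses lines beginning with `> ` for users the account lists as friends, `< ` for users who list the account, and `#` for comment lines. None of this is confirmed by the answer above, so check livejournal.com/bots before relying on it. The sample text and the name `parse_friend_lines` are made up for illustration.

```python
def parse_friend_lines(text):
    """Split an fdata-style listing into (friends, friend_of) lists.

    Assumed format: '> name' marks a friend, '< name' marks a friend-of,
    and lines starting with '#' are comments.
    """
    friends, friend_of = [], []
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        marker, _, name = line.partition(" ")
        name = name.strip()
        if marker == ">" and name:
            friends.append(name)
        elif marker == "<" and name:
            friend_of.append(name)
    return friends, friend_of


# Invented sample in the assumed format (not real LiveJournal output).
SAMPLE_LISTING = """\
# comment line
> alice
> bob
< carol
"""

friends, friend_of = parse_friend_lines(SAMPLE_LISTING)
print("friends:", friends)
print("friend of:", friend_of)
```

If the endpoint behaves as assumed, you would fetch it with requests.get() as shown earlier and pass response.text to the parser.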
