Access LinkedIn Profile with Python

I am trying to access my own LinkedIn profile programmatically through the API in order to download my own posts. There are three recent Python wrappers that should be able to access my profile, e.g. linkedin-sdk, pawl and LinkedIn V2. However, I have been unable to make them work; the problem is authentication. I have also seen the well-known LinkedIn-API wrapper, but its authentication process is complex and difficult, probably because LinkedIn has changed its authentication process and access scopes.

Based on this tutorial from last year, I have been able to access my own profile and view my name, country, language and ID.

import requests

# Step 1 - build the authorization URL; opening it in a browser and logging in
# redirects to redirect_uri with an authorization code in the query string
def get_auth_link():
    URL = "https://www.linkedin.com/oauth/v2/authorization"
    client_id= 'XXXX'
    redirect_uri = 'http://localhost:8080/login'
    scope='r_liteprofile'
    PARAMS = {'response_type':'code', 'client_id':client_id,  'redirect_uri':redirect_uri, 'scope':scope}
    r = requests.get(url = URL, params = PARAMS)
    return_url = r.url
    print('Please copy the URL and paste it in browser for getting authentication code')
    print('')
    print(return_url)

get_auth_link()

# Step 2 - make a POST request to exchange the authorization code for an access token

def get_access_token():
    headers = {'Content-Type': 'application/x-www-form-urlencoded', 'User-Agent': 'OAuth gem v0.4.4'}
    AUTH_CODE = 'XXXX'
    ACCESS_TOKEN_URL = 'https://www.linkedin.com/oauth/v2/accessToken'
    client_id= 'XXXX'
    client_secret= 'XXXX'
    redirect_uri = 'http://localhost:8080/login'
    PARAM = {'grant_type': 'authorization_code',
      'code': AUTH_CODE,
      'redirect_uri': redirect_uri,
      'client_id': client_id,
      'client_secret': client_secret}
    response = requests.post(ACCESS_TOKEN_URL, data=PARAM, headers=headers, timeout=600)
    data = response.json()
    print(data)
    access_token = data['access_token']
    return access_token

get_access_token()

access_token = 'XXXX'

def get_profile(access_token):
    URL = "https://api.linkedin.com/v2/me"
    headers = {'Content-Type': 'application/x-www-form-urlencoded','Authorization':'Bearer {}'.format(access_token),'X-Restli-Protocol-Version':'2.0.0'}
    response = requests.get(url=URL, headers=headers)
    print(response.json())

get_profile(access_token)
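
Between step 1 and step 2 above, the authorization code has to be copied out of the redirect URL by hand. A minimal sketch of doing that with the standard library is shown here. The function name `extract_auth_code`, the example URL and the `state` check are illustrative assumptions, not part of the original code; `error` and `error_description` are the standard OAuth 2.0 error parameters.

```python
from urllib.parse import urlparse, parse_qs


def extract_auth_code(redirect_url, expected_state=None):
    """Return the `code` query parameter from an OAuth 2.0 redirect URL.

    Raises ValueError if the redirect carries an error, the state does
    not match, or no code is present.
    """
    query = parse_qs(urlparse(redirect_url).query)
    if "error" in query:
        description = query.get("error_description", [""])[0]
        raise ValueError(f"{query['error'][0]}: {description}")
    if expected_state is not None and query.get("state", [None])[0] != expected_state:
        raise ValueError("state mismatch")
    if "code" not in query:
        raise ValueError("no authorization code in redirect URL")
    return query["code"][0]


# Made-up redirect URL, as it would appear in the browser address bar
example = "http://localhost:8080/login?code=AQT123&state=xyz"
print(extract_auth_code(example, expected_state="xyz"))
```

The returned value could then be used as AUTH_CODE in get_access_token().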

As soon as I change the scope from r_liteprofile to r_basicprofile, I get an unauthorized_scope_error: r_basicprofile is not authorised for your application. On my developer page, the scopes r_emailaddress, r_liteprofile and w_member_social are authorised, but only r_liteprofile works. From what I understand of the LinkedIn documentation, comments cannot be downloaded?
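
Since the error is tied to the scope parameter, it may help to build the authorization URL locally, without sending a request, and to ask for several authorised scopes at once. The sketch below assumes the standard OAuth 2.0 convention that scopes are sent as one space-delimited string; `build_auth_url` and the `state` value are hypothetical additions that are not in the code above.

```python
import secrets
from urllib.parse import urlencode, quote

AUTH_URL = "https://www.linkedin.com/oauth/v2/authorization"


def build_auth_url(client_id, redirect_uri, scopes, state=None):
    """Build the authorization URL locally; no HTTP request is made."""
    # A random state value lets the caller check the redirect later
    state = state or secrets.token_urlsafe(16)
    params = {
        "response_type": "code",
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "scope": " ".join(scopes),  # space-delimited scope list
        "state": state,
    }
    # quote_via=quote encodes spaces as %20 rather than '+'
    return f"{AUTH_URL}?{urlencode(params, quote_via=quote)}", state


url, state = build_auth_url(
    "XXXX",
    "http://localhost:8080/login",
    ["r_liteprofile", "r_emailaddress", "w_member_social"],
)
print(url)
```

Requesting a scope that is not authorised for the application, such as r_basicprofile here, would still be expected to fail with the same unauthorized_scope_error.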

So the big question really is: can comments be downloaded via the API?

Bots or scrapers are not an option, as they require explicit permission from LinkedIn, which I do not have.

Update: so no illegal solutions, please. I knew they existed before I wrote this post.

Thanks for your help!

I found that logging in with the linkedin-api by tomquirk was really easy. However, a KeyError was raised when a post did not have any comments. I fixed this in a fork and have submitted a pull request. If you install the fork with python setup.py install, the following code will get all your posts with their comments:

from linkedin_api import Linkedin
import getpass

print("Please enter your LinkedIn credentials first (2FA must be disabled)")
username = input("user: ")
password = getpass.getpass('password: ')

api = Linkedin(username, password)

my_public_id = api.get_user_profile()['miniProfile']['publicIdentifier']

my_posts = api.get_profile_posts(public_id=my_public_id)
for post in my_posts:
    post_urn = post['socialDetail']['urn'].rsplit(':', 1)[1]
    print('POST:' + post_urn + '\n')
    comments = api.get_post_comments(post_urn, comment_count=100)
    for comment in comments:
        commenter = comment['commenter']['com.linkedin.voyager.feed.MemberActor']['miniProfile']
        print(f"\t{commenter['firstName']} {commenter['lastName']}: {comment['comment']['values'][0]['value']}\n")
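
The nested lookups in the loop above raise a KeyError or IndexError whenever a field is missing from the response. A defensive variant could look like the sketch below. The helpers `dig` and `format_comment` are hypothetical, and the response shape is assumed from the code above rather than taken from any documented schema.

```python
def dig(data, *keys, default=None):
    """Walk nested dicts/lists; return `default` if any step is missing."""
    current = data
    for key in keys:
        try:
            current = current[key]
        except (KeyError, IndexError, TypeError):
            return default
    return current


def format_comment(comment):
    """Format one comment dict as 'First Last: text', tolerating gaps."""
    actor = dig(
        comment,
        "commenter",
        "com.linkedin.voyager.feed.MemberActor",
        "miniProfile",
    ) or {}
    name = f"{actor.get('firstName', '?')} {actor.get('lastName', '?')}"
    text = dig(comment, "comment", "values", 0, "value", default="")
    return f"{name}: {text}"


# Made-up comment in the shape assumed above
sample = {
    "commenter": {
        "com.linkedin.voyager.feed.MemberActor": {
            "miniProfile": {"firstName": "Ada", "lastName": "Lovelace"}
        }
    },
    "comment": {"values": [{"value": "Nice post"}]},
}
print(format_comment(sample))
print(format_comment({}))
```

Inside the loop, print('\t' + format_comment(comment)) would then replace the f-string with the chained lookups.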

Note: this does not use the official API, and according to the README.md:

This project violates Linkedin's User Agreement Section 8.2, and because of this, Linkedin may (and will) temporarily or permanently ban your account.

However, as long as you scrape comments only from your own account, you should be fine.

There are two legal options for downloading comments that do not breach LinkedIn's terms and conditions. Both require LinkedIn's permission.

Option A: Comment API

The Comment API is part of the Page Management APIs, which in turn are part of the Marketing Developer Program (MDP). LinkedIn describes the application process for its Marketing Developer Program here. It requires filling out a form specifying the use case; LinkedIn then decides whether to grant access. Certain use cases will be restricted or not approved.
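
Once access has been granted, a request for the comments on a post might be prepared roughly as follows. This is a sketch under assumptions: the `socialActions/{urn}/comments` path is taken from my reading of LinkedIn's Marketing API documentation and should be checked against the current version, and `build_comments_request` and the example URN are made up for illustration.

```python
from urllib.parse import quote

API_BASE = "https://api.linkedin.com/v2"


def build_comments_request(post_urn, access_token):
    """Return (url, headers) for fetching the comments on one post.

    The endpoint path is an assumption; verify it against LinkedIn's
    current documentation before relying on it.
    """
    # URNs contain ':' characters, so they are percent-encoded in the path
    url = f"{API_BASE}/socialActions/{quote(post_urn, safe='')}/comments"
    headers = {
        "Authorization": f"Bearer {access_token}",
        "X-Restli-Protocol-Version": "2.0.0",
    }
    return url, headers


url, headers = build_comments_request("urn:li:share:123", "XXXX")
print(url)
```

The pair can then be passed to requests.get(url, headers=headers), in the same way as in get_profile() from the question.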



Option B: Web crawling and scraping LinkedIn with an exemption (whitelist)
The exemption process is described here.

I am going for option A. Let's see if they give me access. I'll update the post accordingly.

Update 19/05/2022
LinkedIn has granted the permissions for the MDP. It took about 2 weeks.

Update 27/05/2022
Here is a great tutorial on getting individual posts. Getting company page posts is another story entirely, so I opened a new question.
