
Access LinkedIn Profile with Python

I am trying to access my own LinkedIn profile programmatically via the API in order to download my own posts. There are three recent Python wrappers for this, e.g. linkedin-sdk, pawl and LinkedIn V2, but I have been unable to make any of them work. The problem is the authentication. I have also seen the well-known LinkedIn-API wrapper, but its authentication process is complex and difficult, probably because LinkedIn has changed its authentication process and access scopes.

Based on this tutorial from last year, I have been able to access my own profile and view my name, country, language and ID.

import requests

# OAuth 2.0 authorization code flow
# Step 1 - build the authorization URL to open in a browser
def get_auth_link():
    URL = "https://www.linkedin.com/oauth/v2/authorization"
    client_id= 'XXXX'
    redirect_uri = 'http://localhost:8080/login'
    scope='r_liteprofile'
    PARAMS = {'response_type':'code', 'client_id':client_id,  'redirect_uri':redirect_uri, 'scope':scope}
    r = requests.get(url = URL, params = PARAMS)
    return_url = r.url
    print('Please copy the URL and paste it in browser for getting authentication code')
    print('')
    print(return_url)

get_auth_link()

# Step 2 - POST to exchange the authorization code for an access token

def get_access_token():
    headers = {'Content-Type': 'application/x-www-form-urlencoded', 'User-Agent': 'OAuth gem v0.4.4'}
    AUTH_CODE = 'XXXX'
    ACCESS_TOKEN_URL = 'https://www.linkedin.com/oauth/v2/accessToken'
    client_id= 'XXXX'
    client_secret= 'XXXX'
    redirect_uri = 'http://localhost:8080/login'
    PARAM = {'grant_type': 'authorization_code',
      'code': AUTH_CODE,
      'redirect_uri': redirect_uri,
      'client_id': client_id,
      'client_secret': client_secret}
    response = requests.post(ACCESS_TOKEN_URL, data=PARAM, headers=headers, timeout=600)
    data = response.json()
    print(data)
    access_token = data['access_token']
    return access_token

get_access_token()

access_token = 'XXXX'

def get_profile(access_token):
    URL = "https://api.linkedin.com/v2/me"
    headers = {'Content-Type': 'application/x-www-form-urlencoded','Authorization':'Bearer {}'.format(access_token),'X-Restli-Protocol-Version':'2.0.0'}
    response = requests.get(url=URL, headers=headers)
    print(response.json())

get_profile(access_token)
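
As a side note on step 1: the authorization URL only needs to be assembled, not fetched. Here is a minimal sketch that builds it with the standard library instead of sending a request. The function name build_auth_link is made up for illustration, and I am assuming that multiple scopes are passed as a space-delimited list.

```python
from urllib.parse import urlencode

AUTHORIZATION_URL = "https://www.linkedin.com/oauth/v2/authorization"

def build_auth_link(client_id, redirect_uri, scopes):
    """Return the URL to open in a browser to authorize the app.

    Hypothetical helper: it only assembles the query string and does
    not contact LinkedIn.
    """
    params = {
        "response_type": "code",
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        # Assumption: scopes are sent as one space-delimited value.
        "scope": " ".join(scopes),
    }
    return f"{AUTHORIZATION_URL}?{urlencode(params)}"

link = build_auth_link("XXXX", "http://localhost:8080/login", ["r_liteprofile"])
print(link)
```

The printed link can be pasted into the browser exactly like the one produced by get_auth_link().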

As soon as I change the scope from r_liteprofile to r_basicprofile, I get an unauthorized_scope_error: r_basicprofile is not authorised for your application. On my developer page, the scopes r_emailaddress, r_liteprofile and w_member_social are authorised, but only r_liteprofile works. From what I understand from the LinkedIn documentation, comments cannot be downloaded?
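
To tell a successful authorization apart from a scope error, the URL that the browser lands on after the redirect can be inspected. This is only a sketch: it assumes the result comes back as code or error / error_description query parameters, as in the standard OAuth 2.0 authorization code flow. The helper name and both sample URLs are hypothetical.

```python
from urllib.parse import urlsplit, parse_qs

def extract_auth_code(redirected_url):
    """Return the authorization code found in the redirect URL.

    Raises RuntimeError if the redirect carries an OAuth error instead,
    and ValueError if neither a code nor an error is present.
    """
    query = parse_qs(urlsplit(redirected_url).query)
    if "error" in query:
        description = query.get("error_description", [""])[0]
        raise RuntimeError(f"{query['error'][0]}: {description}")
    if "code" not in query:
        raise ValueError("no authorization code in the redirect URL")
    return query["code"][0]

# Hypothetical redirect URLs, shaped after the redirect_uri used above.
ok_url = "http://localhost:8080/login?code=ABC123"
denied_url = (
    "http://localhost:8080/login?error=unauthorized_scope_error"
    "&error_description=Scope+%22r_basicprofile%22+is+not+authorized+for+your+application"
)

print(extract_auth_code(ok_url))
try:
    extract_auth_code(denied_url)
except RuntimeError as exc:
    print(exc)
```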

So the big question really is, can comments be downloaded via API?

Bots or scrapers are not an option as they require explicit permission from LinkedIn, which I do not have.

Update: no illegal solutions, please. I knew they existed before I wrote this post.

Thanks for your help!

I found that logging in with the linkedin-api by tomquirk was really easy. However, a KeyError was raised when a post did not have any comments. I fixed it in a fork and have just submitted a pull request. If you install the fork with python setup.py install, the following code will get all your posts with their comments:

from linkedin_api import Linkedin
import getpass

print("Please enter your LinkedIn credentials first (2FA must be disabled)")
username = input("user: ")
password = getpass.getpass('password: ')

api = Linkedin(username, password)

my_public_id = api.get_user_profile()['miniProfile']['publicIdentifier']

my_posts = api.get_profile_posts(public_id=my_public_id)
for post in my_posts:
    post_urn = post['socialDetail']['urn'].rsplit(':', 1)[1]
    print('POST:' + post_urn + '\n')
    comments = api.get_post_comments(post_urn, comment_count=100)
    for comment in comments:
        commenter = comment['commenter']['com.linkedin.voyager.feed.MemberActor']['miniProfile']
        print(f"\t{commenter['firstName']} {commenter['lastName']}: {comment['comment']['values'][0]['value']}\n")

Note: this does not use the official API, and according to the README.md:

This project violates Linkedin's User Agreement Section 8.2, and because of this, Linkedin may (and will) temporarily or permanently ban your account.

However, as long as you scrape comments only from your own account you should be fine.
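
The loop above indexes deep into nested dictionaries, so a comment that lacks one of those keys (for instance, if the commenter is not represented as a MemberActor) would raise a KeyError. Here is a sketch of a more defensive formatter. The function name format_comment and the sample dictionary are made up for illustration and only mirror the keys used in the code above. They are not an actual API response.

```python
MEMBER_ACTOR = "com.linkedin.voyager.feed.MemberActor"

def format_comment(comment):
    """Return 'First Last: text' for one comment dict, or None if fields are missing."""
    actor = comment.get("commenter", {}).get(MEMBER_ACTOR, {})
    profile = actor.get("miniProfile", {})
    values = comment.get("comment", {}).get("values", [])
    if not profile or not values:
        # Skip comments that do not have the expected shape.
        return None
    name = f"{profile.get('firstName', '')} {profile.get('lastName', '')}".strip()
    return f"{name}: {values[0].get('value', '')}"

# Hypothetical sample shaped after the keys accessed in the loop above.
sample = {
    "commenter": {MEMBER_ACTOR: {"miniProfile": {"firstName": "Ada", "lastName": "Lovelace"}}},
    "comment": {"values": [{"value": "Nice post!"}]},
}

print(format_comment(sample))
print(format_comment({"commenter": {}}))
```

Inside the loop, comments for which format_comment returns None can simply be skipped.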

There are two legal options to download comments that do not breach LinkedIn's terms and conditions. Both require LinkedIn's permission.

Option A: Comment API

The Comment API is part of the Page Management APIs, which in turn are part of the Marketing Developer Program (MDP). LinkedIn describes the application process for its Marketing Developer Program here. It requires filling out a form specifying the use case, after which LinkedIn decides whether to grant access. LinkedIn states that certain use cases will be restricted or not approved.
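
Once access is granted, a request for the comments on a post might be put together roughly as below. This is a sketch under assumptions: the socialActions/{urn}/comments path reflects my reading of LinkedIn's Social Actions documentation and should be checked against the current docs, and the helper name and the example URN are hypothetical. The code only builds the URL and headers, and does not send anything.

```python
from urllib.parse import quote

API_BASE = "https://api.linkedin.com/v2"

def build_comments_request(post_urn, access_token):
    """Return (url, headers) for fetching the comments on a post.

    Assumption: the endpoint is socialActions/{urn}/comments and the
    URN must be URL-encoded when placed in the path.
    """
    url = f"{API_BASE}/socialActions/{quote(post_urn, safe='')}/comments"
    headers = {
        "Authorization": f"Bearer {access_token}",
        "X-Restli-Protocol-Version": "2.0.0",
    }
    return url, headers

# Hypothetical share URN and placeholder token.
url, headers = build_comments_request("urn:li:share:1234567890", "XXXX")
print(url)
```

The pair could then be passed to requests.get(url, headers=headers, timeout=30), in the same way as get_profile() in the question.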



Option B: Web crawling and scraping LinkedIn with an exemption (whitelist)

The exemption process is described here.

I am going for option A. Let's see if they give me access. I'll update the post accordingly.

Update 19/05/2022
LinkedIn has granted the permissions for the MDP. It took about 2 weeks.

Update 27/05/2022
Here is a great tutorial on getting individual posts. Getting company page posts is another story entirely, so I opened a new question.
