
Downloading a Text File from a Public SharePoint Link using Requests in Python

I'm trying to automate downloading a text file from a publicly shared SharePoint link. The original link points to a folder containing two files, but I obtained the direct download link for the file I need:

https://<abc>-my.sharepoint.com/personal/gamma_<abc>/_layouts/15/download.aspx?UniqueId=a0db276e%2Ddf75%2D49b7%2Db671%2D1c49e365ef3f
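As a side note on reading that link: `%2D` is just a percent-encoded hyphen, so the `UniqueId` parameter is an ordinary GUID-style identifier. A minimal sketch using only the standard library, with the `<abc>` placeholders left exactly as they are in the question:

```python
from urllib.parse import urlsplit, parse_qs

# The <abc> parts are placeholders from the question, not real host names.
url = ("https://<abc>-my.sharepoint.com/personal/gamma_<abc>/_layouts/15"
       "/download.aspx?UniqueId=a0db276e%2Ddf75%2D49b7%2Db671%2D1c49e365ef3f")

# parse_qs decodes percent-escapes and returns a list of values per key.
query = parse_qs(urlsplit(url).query)
unique_id = query["UniqueId"][0]
print(unique_id)  # a0db276e-df75-49b7-b671-1c49e365ef3f
```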

When I enter the above URL into a browser, I get a popup offering to open or download the file. I'm trying to write some Python code to fetch it automatically, and this is what I've come up with so far:

import requests

# No trailing whitespace inside the string; a stray space would become part of the URL.
url = "https://<abc>-my.sharepoint.com/personal/gamma_<abc>/_layouts/15" \
      "/download.aspx?UniqueId=a0db276e%2Ddf75%2D49b7%2Db671%2D1c49e365ef3f"

hdr = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:94.0) Gecko/20100101 Firefox/94.0',
       'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,*/*;q=0.8',
       'Accept-Encoding': 'gzip, deflate, br',
       'Accept-Language': 'en-US,en;q=0.5',
       'Upgrade-Insecure-Requests': '1',
       'Sec-Fetch-Dest': 'document',
       'Sec-Fetch-Mode': 'navigate',
       'Sec-Fetch-Site': 'none',
       'Sec-Fetch-User': '?1',
       'Connection': 'keep-alive'}

myfile = requests.get(url, headers=hdr)

# Write the raw response body to disk; the with-block closes the file handle.
with open('c:/users/scott/onedrive/desktop/gamma.las', 'wb') as f:
    f.write(myfile.content)

I originally tried without the User-Agent header, and when I opened gamma.las it contained only "403". With the headers, the request now connects and the file is created, but it contains a whole bunch of HTML for what looks like a Microsoft login page, so I assume I'm missing an authentication step. I don't have a user ID or password for this company's SharePoint, so I'm not sure whether I can use the REST API, since the examples I've seen all seem to require credentials.
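One way to confirm that diagnosis before anything is written to disk is to look at the status code and Content-Type of the response. The sketch below is only an illustration: `describe_response` is a hypothetical helper, and treating `text/html` (or a body that starts like an HTML document) as "probably a login page" is a heuristic, not documented SharePoint behaviour.

```python
def describe_response(status_code, content_type, body):
    """Classify a download response as 'error', 'html', or 'file'.

    Hypothetical helper; the HTML check is a heuristic only.
    """
    if status_code >= 400:
        return "error"
    # Content-Type may carry parameters such as "; charset=utf-8".
    media_type = (content_type or "").split(";")[0].strip().lower()
    # Fall back to sniffing the first bytes if the header is missing or wrong.
    head = body.lstrip()[:64].lower()
    if media_type == "text/html" or head.startswith((b"<!doctype html", b"<html")):
        return "html"
    return "file"
```

With the code from the question, this could be called as `describe_response(myfile.status_code, myfile.headers.get("Content-Type"), myfile.content)`; a result of `"html"` would suggest the server sent a web page rather than the .las file.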

Can I do this using Requests? If not, can I use the REST API without user credentials for this company's SharePoint?

If you are not supplying credentials in the browser, you are most probably (implicitly) using built-in Windows authentication in your organization. See whether this helps: "Handling windows authentication while accessing url using requests".

The Python library mentioned there for handling built-in Windows authentication is requests-negotiate-sspi. I'm not sure whether it will work with federation (your site's address ends with ".sharepoint.com", so you are probably using federation as well), but it may be worth trying.

So I would try something like this (I doubt the headers really matter in your case, but you could try adding them as well):

import requests
from requests_negotiate_sspi import HttpNegotiateAuth

url = ...

myfile = requests.get(url, auth=HttpNegotiateAuth())
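The snippet above stops at fetching the response. A minimal sketch of the remaining step, assuming `response` is the object returned by `requests.get`, could check the HTTP status before writing the file. `save_response` is a hypothetical helper, not part of Requests or requests-negotiate-sspi.

```python
from pathlib import Path


def save_response(response, destination):
    """Write response.content to destination and return the path.

    Hypothetical helper; `response` is assumed to be a requests.Response.
    """
    # raise_for_status() raises an exception for 4xx/5xx responses,
    # so an error body such as "403" is never written to disk.
    response.raise_for_status()
    path = Path(destination)
    path.write_bytes(response.content)
    return path
```

Usage would be along the lines of `save_response(myfile, 'c:/users/scott/onedrive/desktop/gamma.las')`. A 200 response carrying a login page would still be saved, so this does not replace checking what actually came back.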


 