
How to read SharePoint Online (Office365) Excel files into Python specifically pandas with Work or School Account?

This question is very similar to the one linked below: How to read SharePoint Online (Office365) Excel files in Python with Work or School Account?

Essentially, I want to import an Excel file from SharePoint into pandas for further analysis.

The problem is that when I run the code below, I get the following error.

XLRDError: Unsupported format, or corrupt file: Expected BOF record; found b'\r\n<!DOCT'

My code:

from office365.runtime.auth.authentication_context import AuthenticationContext
from office365.sharepoint.client_context import ClientContext
from office365.sharepoint.file import File 

url = 'https://companyname.sharepoint.com/SitePages/Home.aspx'
username = 'fakeaccount@company.com'
password = 'password!'
relative_url = '/Shared%20Documents/Folder%20Number1/Folder%20Number2/Folder3/Folder%20Number%Four/Target_Excel_File_v4.xlsx?d=w8f97c2341898_random_numbers_and_letters_a065c12cbcsf=1&e=KXoU4s'


ctx_auth = AuthenticationContext(url)
if ctx_auth.acquire_token_for_user(username, password):
  ctx = ClientContext(url, ctx_auth)
  web = ctx.web
  ctx.load(web)
  ctx.execute_query()
  #this gives me a KeyError: 'Title'
  #print("Web title: {0}".format(web.properties['Title']))
  print('Authentication Successful')
else:
  print(ctx_auth.get_last_error())


import io
import pandas as pd

response = File.open_binary(ctx, relative_url)

#save data to BytesIO stream
bytes_file_obj = io.BytesIO()
bytes_file_obj.write(response.content)
bytes_file_obj.seek(0) #set file object to start

#read file into pandas dataframe
df = pd.read_excel(bytes_file_obj)

print(df)
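
A note on reading the error: the `b'\r\n<!DOCT'` in the message suggests that the bytes handed to pandas were the start of an HTML page rather than a workbook. A minimal sketch for checking what actually came back before calling `pd.read_excel` is below. The function name `sniff_payload` and its labels are made up for illustration; the byte signatures are the standard ZIP and OLE2 headers.

```python
def sniff_payload(data: bytes) -> str:
    """Guess what a downloaded payload is from its leading bytes."""
    # .xlsx files are ZIP archives and start with the ZIP local-file header.
    if data.startswith(b"PK\x03\x04"):
        return "xlsx (zip container)"
    # Legacy .xls files are OLE2 compound documents.
    if data.startswith(b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1"):
        return "xls (OLE2 container)"
    # HTML pages may be preceded by whitespace, as in the error above.
    head = data.lstrip()[:16].lower()
    if head.startswith((b"<!doctype", b"<html")):
        return "html page"
    return "unknown"


# The prefix from the XLRDError message is classified as HTML.
kind = sniff_payload(b"\r\n<!DOCTYPE html><html></html>")
print(kind)
```

In the code above this could be called as `sniff_payload(response.content)`; an "html page" result would mean the request did not return the file itself.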

For those who end up at this question like I did: I found that you have to pass the full URL to File, not just the path:

#import all the libraries
from office365.runtime.auth.authentication_context import AuthenticationContext
from office365.sharepoint.client_context import ClientContext
from office365.sharepoint.files.file import File 
import io
import pandas as pd

#target url taken from sharepoint and credentials
url = 'https://company.sharepoint.com/Shared%20Documents/Folder%20Number1/Folder%20Number2/Folder3/Folder%20Number4/Target_Excel_File_v4.xlsx?cid=_Random_letters_and_numbers-21dbf74c'
username = 'Dumby_account@company.com'
password = 'Password!'

ctx_auth = AuthenticationContext(url)
if ctx_auth.acquire_token_for_user(username, password):
  ctx = ClientContext(url, ctx_auth)
  web = ctx.web
  ctx.load(web)
  ctx.execute_query()
  print("Authentication successful")

response = File.open_binary(ctx, url)

#save data to BytesIO stream
bytes_file_obj = io.BytesIO()
bytes_file_obj.write(response.content)
bytes_file_obj.seek(0) #set file object to start

#read excel file and each sheet into pandas dataframe 
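#sheet_name=None returns a dict of DataFrames keyed by sheet name
#(the older spelling "sheetname" is no longer accepted by current pandas)
df = pd.read_excel(bytes_file_obj, sheet_name=None)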
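
Because reading every sheet gives back a dict of DataFrames rather than a single DataFrame, a follow-up step is usually needed. The sketch below shows one way to stack such a dict into one frame. The helper name `combine_sheets` and the sample sheet names and column are invented for illustration, and the dict is built by hand as a stand-in for the `read_excel` result.

```python
import pandas as pd


def combine_sheets(sheets):
    """Stack a {sheet_name: DataFrame} mapping into one DataFrame with a 'sheet' column."""
    # The dict keys become the outer index level, named "sheet".
    combined = pd.concat(sheets, names=["sheet", "row"])
    # Move "sheet" into a regular column and discard the per-sheet row numbers.
    return combined.reset_index(level="sheet").reset_index(drop=True)


# Hand-made stand-in for the dict returned when all sheets are read.
sheets = {
    "Jan": pd.DataFrame({"amount": [1, 2]}),
    "Feb": pd.DataFrame({"amount": [3]}),
}
combined = combine_sheets(sheets)
print(combined)
```

If only one sheet is needed, indexing the dict by sheet name is simpler than combining.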

It may also be worth noting that the official repository contains many examples of common operations for SharePoint, Drive, and Teams.

One note on installation:

pip install Office365-REST-Python-Client

There is also an office365 package, but the one above seems to be the correct one.
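
Since the import name alone may not make it obvious which distribution is present in an environment, a quick check like the following might help. This is only a sketch using the standard library's `importlib.metadata` (Python 3.8+); the helper name `installed_version` is made up here.

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


# Prints a version string if the client is installed, otherwise None.
print(installed_version("Office365-REST-Python-Client"))
```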

I know this is 2-3 years later, but maybe someone will know. I am using this code:

from office365.runtime.auth.authentication_context import AuthenticationContext
from office365.sharepoint.client_context import ClientContext
from office365.sharepoint.files.file import File
import pandas as pd
import io
url = 'some link to sharepoint'
username = 'mail'
password = 'password'
ctx_auth = AuthenticationContext(url)
if ctx_auth.acquire_token_for_user(username, password):
 ctx = ClientContext(url, ctx_auth)
 web = ctx.web
 ctx.load(web)
 ctx.execute_query()
 print("Authentication successful")
response = File.open_binary(ctx, url)
#save data to BytesIO stream
bytes_file_obj = io.BytesIO()
bytes_file_obj.write(response.content)
bytes_file_obj.seek(0) #set file object to start
#read excel file and each sheet into pandas dataframe
df = pd.read_excel(bytes_file_obj, sheet_name = None, engine='openpyxl')
print(df)

I get the error

zipfile.BadZipFile: File is not a zip file

I can't find any way to solve this. Maybe one of you knows how to figure it out?
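
One way to narrow this down: openpyxl reads .xlsx files as ZIP archives, so BadZipFile means the downloaded bytes are not a ZIP archive at all. A likely cause, though not a certain one, is that the response body is an error or sign-in page instead of the workbook. The sketch below checks the bytes before pandas sees them; `explain_download` and `make_zip_bytes` are hypothetical helper names, and the in-memory archive only stands in for a real workbook.

```python
import io
import zipfile
from typing import Optional


def explain_download(content: bytes) -> Optional[str]:
    """Return None if content is a ZIP archive, else a short diagnostic message."""
    if zipfile.is_zipfile(io.BytesIO(content)):
        return None
    # Show the first bytes so an HTML or JSON error body is easy to recognise.
    return "not a zip archive; payload starts with %r" % content[:40]


def make_zip_bytes() -> bytes:
    """Build a tiny in-memory ZIP archive as a stand-in for a real .xlsx file."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("hello.txt", "hi")
    return buf.getvalue()


print(explain_download(make_zip_bytes()))
print(explain_download(b"<!DOCTYPE html><html></html>"))
```

With the code above, calling `explain_download(response.content)` right after `File.open_binary` would show whether the download itself is the problem rather than the pandas call.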

The following works with a client ID and secret (library: Office365)

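# Imports needed by this snippet
import io
import pandas as pd
from office365.runtime.auth.client_credential import ClientCredential
from office365.sharepoint.client_context import ClientContext
from office365.sharepoint.files.file import File

# Credentials to connect to your SP site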
SITE_URL ='https://XXXXXX.sharepoint.com/sites/yoursitename'
CLIENT_ID = 'xxxxxxxx-xxx-xxxx-xxxxxxxxxxxxxxxxx'
CLIENT_SECRET= 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'

# Establish the connection
context = ClientContext(SITE_URL).with_credentials(ClientCredential(CLIENT_ID, CLIENT_SECRET))

response = File.open_binary(context, '/'.join(['/sites/yoursitename/Shared Documents/Work/OnlyFolderName', 
                                           'yourfilename.xlsx']))


df = pd.read_excel(io.BytesIO(response.content))
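
Note that the path given to `File.open_binary` in this answer is server-relative, has plain spaces, and carries no query string, unlike a URL copied from the browser. Assuming that is the form you need, a sketch for deriving it from a full URL with only the standard library could look like this. The helper name `server_relative_path` and the sample URL are made up, and whether your library version expects exactly this form is worth verifying.

```python
from urllib.parse import unquote, urlsplit


def server_relative_path(full_url: str) -> str:
    """Drop scheme, host and query string, and decode %20-style escapes."""
    parts = urlsplit(full_url)
    return unquote(parts.path)


# Hypothetical browser URL with percent-encoding and sharing parameters.
sample = (
    "https://XXXXXX.sharepoint.com/sites/yoursitename/"
    "Shared%20Documents/Work/OnlyFolderName/yourfilename.xlsx?d=abc&e=KXoU4s"
)
path = server_relative_path(sample)
print(path)
```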

