
I'm having trouble figuring out how to automate OAuth authentication to access Google Drive. (Python)

The problem we want to solve

  1. Access a shared drive with a user account via OAuth authentication
  2. Retrieve a spreadsheet -> convert it to Parquet format
  3. Save it to GCS

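These steps are written in the main() function below, and I want to run them as a daily scheduled job using Cloud Functions and Cloud Scheduler.

However, the code below currently requires the user to log in to their Google account manually through a browser. I would like to rewrite it so that this login happens automatically, but I can't figure out how. I would be grateful for any help.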


### ※※Authentication is required by browser※※
creds = flow.run_local_server(port=0)
### Result
Please visit this URL to authorize this application:
https://accounts.google.com/o/oauth2/auth?response_type=code&client_id=132987612861-4j24afrouontpeiv5ryy7sn64inhr.apps.googleusercontent.com&redirect_uri=http%yyy%2Flocalhost%3yy6%2F&scope=httpsyyF%2Fwww.googleapis.com%2Fauth%2Fdrive.readonly&state=XXXXXXXXXXXXXXXXXXXXXXXXXXX&access_type=offline

The readonly&state=XXXXXXXXXXXXXXXXXXXXX part changes with each execution.
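The changing value is expected: in OAuth 2.0 the state parameter is a random anti-forgery token generated for each authorization request, and with port=0 the local redirect port is also picked by the operating system, so the URL is not meant to be stable between runs. The sketch below only illustrates that idea with the standard library. It is not the google_auth_oauthlib implementation, and build_authorization_url, the client ID, and the redirect URI are hypothetical placeholders.

```python
import secrets
from urllib.parse import parse_qs, urlencode, urlparse

AUTH_ENDPOINT = "https://accounts.google.com/o/oauth2/auth"


def build_authorization_url(client_id, redirect_uri, scope):
    """Return (url, state) for a single authorization request (illustrative only)."""
    # A fresh, unguessable state is created for every request, so the
    # resulting URL differs on every run.
    state = secrets.token_urlsafe(22)
    params = {
        "response_type": "code",
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "scope": scope,
        "state": state,
        "access_type": "offline",
    }
    return f"{AUTH_ENDPOINT}?{urlencode(params)}", state


# Placeholder values, not real credentials.
CLIENT_ID = "example-client-id.apps.googleusercontent.com"
REDIRECT_URI = "http://localhost:8080/"
SCOPE = "https://www.googleapis.com/auth/drive.readonly"

url_a, state_a = build_authorization_url(CLIENT_ID, REDIRECT_URI, SCOPE)
url_b, state_b = build_authorization_url(CLIENT_ID, REDIRECT_URI, SCOPE)

# The state embedded in each URL is the one generated for that request.
print(parse_qs(urlparse(url_a).query)["state"][0] == state_a)
print(state_a != state_b)
```

Because the URL is regenerated on each run, automation that depends on a fixed login URL is fragile by design.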

Browser screen that opens when the code above is executed

Full relevant source code

from __future__ import print_function
import io
import os
import key
import json
import os.path
import pandas as pd
import pyarrow as pa
import pyarrow.parquet as pq
from pprint import pprint
from webbrowser import Konqueror
from google.cloud import storage as gcs
from google.oauth2 import service_account
from google.auth.transport.requests import Request
from google.oauth2.credentials import Credentials
from google_auth_oauthlib.flow import InstalledAppFlow
from googleapiclient.http import MediaIoBaseDownload, MediaIoBaseUpload, MediaFileUpload
from googleapiclient.discovery import build
from googleapiclient.errors import HttpError

SCOPES = ['https://www.googleapis.com/auth/drive.readonly']

def main(event, context):
    """Drive v3 API
    Function to access shared Drive→get Spreadsheet→convert to parquet→upload to GCS
    """
    creds = None
    file_id = 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxx' #Unedited data in shared drive
    mime_type = 'text/csv'

    # OAuth authentication to access shared drives
    if os.path.exists('token.json'):
        creds = Credentials.from_authorized_user_file('token.json', SCOPES)
    # Allow users to log in if there are no (valid) credentials available
    if not creds or not creds.valid:
        if creds and creds.expired and creds.refresh_token:
            creds.refresh(Request())
        else:
            flow = InstalledAppFlow.from_client_secrets_file(
                'credentials.json', SCOPES)
            ### ※※Browser authentication required※※
            creds = flow.run_local_server(port=0)##Currently, we need a manual login here!
        with open('token.json', 'w') as token:
            token.write(creds.to_json())
    try:
        # Retrieve spreadsheets from shared drives
        service = build('drive', 'v3', credentials=creds)
        request = service.files().export_media(fileId=file_id, mimeType=mime_type)
        fh = io.BytesIO()
        downloader = MediaIoBaseDownload(fh, request)
        done = False
        print(io.StringIO(fh.getvalue().decode()))

        while done is False:
            status, done = downloader.next_chunk()
        # Read "Shared Drive/SpreadSheet" -> convert to parquet
        df = pd.read_csv(io.StringIO(fh.getvalue().decode()))
        table = pa.Table.from_pandas(df)
        buf = pa.BufferOutputStream()
        pq.write_table(table, buf,compression=None)

        # service_account for save to GCS
        key_path = 'service_account_file.json'
        service_account_info = json.load(open(key_path))
        credentials = service_account.Credentials.from_service_account_info(service_account_info)
        client = gcs.Client(
            credentials=credentials,
            project=credentials.project_id,
        )

        # GCS information to be saved 
        bucket_name = 'bucket-name'
        blob_name = 'sample-folder/daily-data.parquet'#save_path
        bucket = client.get_bucket(bucket_name)
        blob = bucket.blob(blob_name)

        # parquet save to GCS
        blob.upload_from_string(data=buf.getvalue().to_pybytes())
        # ↓If a print appears, the data has been saved.
        print("Blob '{}' created to '{}'!".format(blob_name, bucket_name))

    except HttpError as error:
        # TODO(developer) - Handle errors from drive API.
        print(f'An error occurred: {error}')

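What I tried myself

I tried driving the browser with Selenium, but because the browser login URL is different every time, I could not get it to work well. ← I may still be able to find a way.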
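One alternative to driving the browser, sketched here under assumptions: authorize once on a local machine, keep the resulting authorized-user JSON (the content that creds.to_json() writes to token.json), and let the scheduled job rebuild the credentials from it and refresh them without any login. The environment variable name DRIVE_TOKEN_JSON and both helper functions are hypothetical; Credentials.from_authorized_user_info and creds.refresh(Request()) are the same library calls the posted code already relies on. The Google libraries are imported inside the function so the validation helper can be used on its own.

```python
import json
import os

SCOPES = ["https://www.googleapis.com/auth/drive.readonly"]
TOKEN_ENV_VAR = "DRIVE_TOKEN_JSON"  # hypothetical name for the stored token JSON


def load_token_info(env=os.environ):
    """Parse the stored authorized-user JSON and check it can be refreshed unattended."""
    info = json.loads(env[TOKEN_ENV_VAR])
    required = ("refresh_token", "client_id", "client_secret")
    missing = [key for key in required if not info.get(key)]
    if missing:
        raise ValueError(f"token JSON is missing: {', '.join(missing)}")
    return info


def load_user_credentials(env=os.environ):
    """Rebuild user credentials and refresh the access token without a browser."""
    # Imported lazily; requires google-auth to be installed.
    from google.auth.transport.requests import Request
    from google.oauth2.credentials import Credentials

    creds = Credentials.from_authorized_user_info(load_token_info(env), SCOPES)
    if not creds.valid:
        # Uses the stored refresh token; no interactive login is involved.
        creds.refresh(Request())
    return creds
```

This only works while the refresh token remains valid. Refresh tokens can expire or be revoked, in which case the one-time browser authorization has to be repeated.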

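Try this approach. It worked for me!

The solution is to create a service account and share your data folder with the service account's email address.

Drive API Service Account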
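The service-account approach described above could look roughly like the following. This is a minimal sketch, not the answerer's actual code: the function names and the key file path are assumptions, and it presumes the spreadsheet's folder has already been shared with the address stored in the key file's client_email field. The export loop mirrors the one in the question.

```python
import io
import json

SCOPES = ["https://www.googleapis.com/auth/drive.readonly"]


def service_account_email(key_path):
    """Return the address that the Drive folder needs to be shared with."""
    with open(key_path, encoding="utf-8") as key_file:
        return json.load(key_file)["client_email"]


def build_drive_service(key_path):
    """Build a Drive v3 client from a service-account key file (no browser login)."""
    # Imported lazily; requires google-auth and google-api-python-client.
    from google.oauth2 import service_account
    from googleapiclient.discovery import build

    creds = service_account.Credentials.from_service_account_file(
        key_path, scopes=SCOPES
    )
    return build("drive", "v3", credentials=creds)


def export_sheet_as_csv(service, file_id):
    """Export a Google Sheet as CSV bytes, as in the question's download loop."""
    from googleapiclient.http import MediaIoBaseDownload

    request = service.files().export_media(fileId=file_id, mimeType="text/csv")
    buffer = io.BytesIO()
    downloader = MediaIoBaseDownload(buffer, request)
    done = False
    while not done:
        _status, done = downloader.next_chunk()
    return buffer.getvalue()
```

In the question's main(), the whole token.json / InstalledAppFlow block would then be replaced by a single call such as build_drive_service('service_account_file.json'), and the same key could serve for the GCS upload if that account has access to the bucket.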
