I'm having trouble automating OAuth authentication to access Google Drive (Python)
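The processing is written in the main() function below, and I want to run it daily as a scheduled job using Cloud Functions and Cloud Scheduler.
However, as it stands, the code below requires the user to log in to their Google account manually through a browser. I would like to rewrite the code so that this login happens automatically, but I can't figure out how. I would appreciate any help.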
### ※※Authentication is required by browser※※
creds = flow.run_local_server(port=0)
### Result
Please visit this URL to authorize this application:
https://accounts.google.com/o/oauth2/auth?response_type=code&client_id=132987612861-
4j24afrouontpeiv5ryy7sn64inhr.apps.googleusercontent.com&redirect_uri=
http%yyy%2Flocalhost%3yy6%2F&scope=httpsyyF%2Fwww.googleapis.com%2Fauth%2Fdrive.
readonly&state=XXXXXXXXXXXXXXXXXXXXXXXXXXX&access_type=offline
The readonly&state=XXXXXXXXXXXXXXXXXXXXX part changes with every execution.
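The changing `state` value is expected. In OAuth 2.0, `state` is normally an opaque random value generated for each authorization request to guard against CSRF, so the URL cannot be predicted ahead of time. The sketch below only illustrates that behaviour with the standard library; it is not the code `google_auth_oauthlib` runs internally, and `build_auth_url`, `query_without_state`, and the placeholder client ID are hypothetical names introduced here.

```python
import secrets
from urllib.parse import parse_qs, urlencode, urlparse

AUTH_ENDPOINT = "https://accounts.google.com/o/oauth2/auth"


def build_auth_url(client_id, redirect_uri, scope):
    """Build an authorization URL with a fresh random state value."""
    params = {
        "response_type": "code",
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "scope": scope,
        # A new unpredictable value on every call, as an OAuth client would do.
        "state": secrets.token_urlsafe(22),
        "access_type": "offline",
    }
    return AUTH_ENDPOINT + "?" + urlencode(params)


def query_without_state(url):
    """Return the parsed query parameters of url with state removed."""
    query = parse_qs(urlparse(url).query)
    query.pop("state", None)
    return query


if __name__ == "__main__":
    scope = "https://www.googleapis.com/auth/drive.readonly"
    first = build_auth_url("my-client-id", "http://localhost:8080/", scope)
    second = build_auth_url("my-client-id", "http://localhost:8080/", scope)
    # The two URLs differ, but only in their state parameter.
    print(first != second)
    print(query_without_state(first) == query_without_state(second))
```

With `run_local_server(port=0)` the redirect URI is also likely to differ between runs, because port 0 asks the operating system for any free port.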
[Image: browser screen shown when the code above is executed]
from __future__ import print_function
import io
import os
import key
import json
import os.path
import pandas as pd
import pyarrow as pa
import pyarrow.parquet as pq
from pprint import pprint
from webbrowser import Konqueror
from google.cloud import storage as gcs
from google.oauth2 import service_account
from google.auth.transport.requests import Request
from google.oauth2.credentials import Credentials
from google_auth_oauthlib.flow import InstalledAppFlow
from googleapiclient.http import MediaIoBaseDownload, MediaIoBaseUpload, MediaFileUpload
from googleapiclient.discovery import build
from googleapiclient.errors import HttpError
SCOPES = ['https://www.googleapis.com/auth/drive.readonly']
def main(event, context):
    """Drive v3 API
    Function to access shared Drive→get Spreadsheet→convert to parquet→upload to GCS """
    creds = None
    file_id = 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxx'  # Unedited data in shared drive
    mime_type = 'text/csv'
    # OAuth authentication to access shared drives
    if os.path.exists('token.json'):
        creds = Credentials.from_authorized_user_file('token.json', SCOPES)
    # Allow users to log in if there are no (valid) credentials available
    if not creds or not creds.valid:
        if creds and creds.expired and creds.refresh_token:
            creds.refresh(Request())
        else:
            flow = InstalledAppFlow.from_client_secrets_file(
                'credentials.json', SCOPES)
            ### ※※Browser authentication required※※
            creds = flow.run_local_server(port=0)  ## Currently, we need a manual login here!
        with open('token.json', 'w') as token:
            token.write(creds.to_json())
    try:
        # Retrieve spreadsheets from shared drives
        service = build('drive', 'v3', credentials=creds)
        request = service.files().export_media(fileId=file_id, mimeType=mime_type)
        fh = io.BytesIO()
        downloader = MediaIoBaseDownload(fh, request)
        done = False
        print(io.StringIO(fh.getvalue().decode()))
        while done is False:
            status, done = downloader.next_chunk()
        # Read "Shared Drive/SpreadSheet" -> convert to parquet
        df = pd.read_csv(io.StringIO(fh.getvalue().decode()))
        table = pa.Table.from_pandas(df)
        buf = pa.BufferOutputStream()
        pq.write_table(table, buf, compression=None)
        # service_account for save to GCS
        key_path = 'service_account_file.json'
        with open(key_path) as key_file:
            service_account_info = json.load(key_file)
        credentials = service_account.Credentials.from_service_account_info(service_account_info)
        client = gcs.Client(
            credentials=credentials,
            project=credentials.project_id,
        )
        # GCS information to be saved
        bucket_name = 'bucket-name'
        blob_name = 'sample-folder/daily-data.parquet'  # save_path
        bucket = client.get_bucket(bucket_name)
        blob = bucket.blob(blob_name)
        # parquet save to GCS
        blob.upload_from_string(data=buf.getvalue().to_pybytes())
        # ↓If a print appears, the data has been saved.
        print("Blob '{}' created to '{}'!".format(blob_name, bucket_name))
    except HttpError as error:
        # TODO(developer) - Handle errors from drive API.
        print(f'An error occurred: {error}')
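One possible way to keep the browser step out of the function is to run the flow once on a local machine and store the contents of the resulting token.json somewhere the function can read it. The function can then refresh the access token from the stored refresh token. The following is a minimal sketch under those assumptions, not a verified deployment recipe: the environment variable name `DRIVE_TOKEN_JSON` and both helper names are hypothetical, and a secret store may be a better home for the token than an environment variable.

```python
import json
import os

SCOPES = ["https://www.googleapis.com/auth/drive.readonly"]
# Fields that must be present to refresh an access token without a browser.
REQUIRED_KEYS = ("refresh_token", "client_id", "client_secret")


def can_refresh_headlessly(token_info):
    """Return True if the stored token data is enough to refresh unattended."""
    return all(token_info.get(key) for key in REQUIRED_KEYS)


def load_credentials_from_env(var_name="DRIVE_TOKEN_JSON"):
    """Rebuild user credentials from token.json contents held in an env var."""
    # Imported lazily so the helper above works without the Google libraries.
    from google.auth.transport.requests import Request
    from google.oauth2.credentials import Credentials

    token_info = json.loads(os.environ[var_name])
    if not can_refresh_headlessly(token_info):
        raise RuntimeError(
            "Stored token lacks a refresh token; run the browser flow once locally."
        )
    creds = Credentials.from_authorized_user_info(token_info, SCOPES)
    if not creds.valid:
        # Uses the refresh token; no browser interaction is involved.
        creds.refresh(Request())
    return creds
```

Inside `main()`, `creds = load_credentials_from_env()` would then replace the whole `token.json` / `run_local_server` section. This approach depends on the refresh token staying valid, which is worth checking against the OAuth consent screen settings of the project.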
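I tried driving the browser with Selenium, but because the browser login URL is different every time, I could not get it to work reliably. ← I may still be able to find a way.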
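Rather than scripting the browser, an alternative often used for unattended jobs is to authenticate to Drive with a service account, much as the code above already does for GCS. The sketch below assumes the spreadsheet (or the shared drive that contains it) has been shared with the service account's email address, which may not be permitted in every organization. The helper names `service_account_email` and `export_sheet_as_csv` are hypothetical.

```python
import io
import json

SCOPES = ["https://www.googleapis.com/auth/drive.readonly"]


def service_account_email(key_info):
    """Return the address the spreadsheet or shared drive must be shared with."""
    return key_info["client_email"]


def export_sheet_as_csv(key_path, file_id):
    """Export a Google Sheet as CSV bytes using a service account (no browser)."""
    # Imported lazily so the helper above works without the Google libraries.
    from google.oauth2 import service_account
    from googleapiclient.discovery import build
    from googleapiclient.http import MediaIoBaseDownload

    with open(key_path) as key_file:
        key_info = json.load(key_file)
    creds = service_account.Credentials.from_service_account_info(
        key_info, scopes=SCOPES
    )
    service = build("drive", "v3", credentials=creds)
    request = service.files().export_media(fileId=file_id, mimeType="text/csv")
    buffer = io.BytesIO()
    downloader = MediaIoBaseDownload(buffer, request)
    done = False
    while not done:
        _, done = downloader.next_chunk()
    return buffer.getvalue()
```

The bytes returned by `export_sheet_as_csv('service_account_file.json', file_id)` could be passed to `pd.read_csv(io.BytesIO(...))` in place of the current download loop, which would remove `token.json`, `credentials.json`, and the login prompt from the function.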