Google Cloud Storage File System, Python Package Error: AttributeError: 'GCSFile' object has no attribute 'gcsfs'

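I am trying to run Python code that downloads data from a source URL and streams it in chunks to a destination Cloud Storage blob. It works fine on a standalone machine, as a local function, and so on. But when I run it on GCP Cloud Run, it throws a strange error.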

AttributeError: 'GCSFile' object has no attribute 'gcsfs'

Full error:

Traceback (most recent call last):
  File "/home/<user>/.local/lib/python3.9/site-packages/fsspec/spec.py", line 1683, in __del__
    self.close()
  File "/home/<user>/.local/lib/python3.9/site-packages/fsspec/spec.py", line 1661, in close
    self.flush(force=True)
  File "/home/<user>/.local/lib/python3.9/site-packages/fsspec/spec.py", line 1527, in flush
    self._initiate_upload()
  File "/home/<user>/.local/lib/python3.9/site-packages/gcsfs/core.py", line 1443, in _initiate_upload
    self.gcsfs.loop,
AttributeError: 'GCSFile' object has no attribute 'gcsfs'

This has cost me a week. Any help or direction is much appreciated; thanks in advance.

The actual code used:

from flask import Flask, request
import os
import gcsfs
import requests

app = Flask(__name__)


@app.route('/urltogcs')
def urltogcs():
    try:
        os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = "secret.json"
        gcp_file_system = gcsfs.GCSFileSystem(project='<project_id>')
        session = requests.Session()
        url = request.args.get('source', 'temp')
        blob_path = request.args.get('destination', 'temp')
        with session.get(url, stream=True) as r:
            r.raise_for_status()
            with gcp_file_system.open(blob_path, 'wb') as f_obj:
                for chunk in r.iter_content(chunk_size=1024 * 1024):
                    f_obj.write(chunk)
        return f'Successfully downloaded from {url} to {blob_path} :)'
    except Exception as e:
        print("Failure")
        print(e)
        return f'download failed for  {url} :('


if __name__ == "__main__":
    app.run(debug=True, host="0.0.0.0", port=int(os.environ.get("PORT", 8080)))
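
One plausible reading of the traceback above, offered as an assumption rather than a confirmed diagnosis: the AttributeError is raised from __del__, which suggests the GCSFile constructor failed before it assigned self.gcsfs, and the finalizer then called close() on a half-built object. If so, the AttributeError hides an earlier, more informative exception. The sketch below reproduces that general Python pattern with a hypothetical HalfBuiltFile class. It does not use gcsfs, and the "no object name" failure is purely illustrative.

```python
class HalfBuiltFile:
    """Hypothetical stand-in for a file object; not part of gcsfs."""

    def __init__(self, path):
        bucket, _, key = path.partition("/")
        if not key:
            # Illustrative failure that happens BEFORE self.backend is set.
            raise OSError(f"no object name in {path!r}")
        self.backend = f"uploader for {bucket}/{key}"

    def close(self):
        # Raises AttributeError if __init__ never finished.
        return self.backend


def errors_for(path):
    """Return the names of the errors raised by __init__ and then by close()."""
    obj = HalfBuiltFile.__new__(HalfBuiltFile)  # allocate without running __init__
    seen = []
    try:
        obj.__init__(path)
    except OSError as exc:
        seen.append(type(exc).__name__)
    try:
        obj.close()
    except AttributeError as exc:
        seen.append(type(exc).__name__)
    return seen


print(errors_for("temp"))           # ['OSError', 'AttributeError']
print(errors_for("bucket/object"))  # []
```

If this reading applies, the first exception is the one worth logging. Checking that destination names both a bucket and an object before opening the file would surface it directly.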

Your code (with proposed changes) works for me:

main.py

from flask import Flask, request
import os
import gcsfs
import requests

app = Flask(__name__)

project = os.getenv("PROJECT")
port = os.getenv("PORT", 8080)

@app.route('/urltogcs')
def urltogcs():
    try:
        gcp_file_system = gcsfs.GCSFileSystem(project=project)
        session = requests.Session()
        url = request.args.get('source', 'temp')
        blob_path = request.args.get('destination', 'temp')
        with session.get(url, stream=True) as r:
            r.raise_for_status()
            with gcp_file_system.open(blob_path, 'wb') as f_obj:
                for chunk in r.iter_content(chunk_size=1024 * 1024):
                    f_obj.write(chunk)
        return f'Successfully downloaded from {url} to {blob_path} :)'
    except Exception as e:
        print("Failure")
        print(e)
        return f'download failed for  {url} :('


if __name__ == "__main__":
    app.run(debug=True, host="0.0.0.0", port=int(port))

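Note that the code reads project from the environment, which is not ideal. It would be better if gcsfs.GCSFileSystem did not require project. Alternatively, project could be obtained from Google's Metadata service. For convenience, I am setting it from the environment.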
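
As a rough illustration of the Metadata service alternative, the sketch below prefers the PROJECT environment variable and otherwise asks the metadata server for the project ID. The helper names resolve_project and fetch_project_from_metadata are hypothetical, the timeout is an arbitrary choice, and the lookup only succeeds when the code runs on Google Cloud (for example on Cloud Run).

```python
import os
import urllib.request

# Metadata-server path for the project ID. It is only reachable from
# inside Google Cloud and requires the Metadata-Flavor header.
METADATA_URL = (
    "http://metadata.google.internal/computeMetadata/v1/project/project-id"
)


def fetch_project_from_metadata(timeout=2.0):
    """Ask the metadata server for the project ID (hypothetical helper)."""
    req = urllib.request.Request(
        METADATA_URL, headers={"Metadata-Flavor": "Google"}
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.read().decode("utf-8").strip()


def resolve_project(env=None, fetch=fetch_project_from_metadata):
    """Prefer the PROJECT environment variable, else fall back to metadata."""
    env = os.environ if env is None else env
    project = env.get("PROJECT")
    if project:
        return project
    try:
        return fetch()
    except OSError:
        # urllib.error.URLError subclasses OSError; off-cloud the lookup fails.
        return None
```

In main.py this could replace the os.getenv("PROJECT") line with project = resolve_project(), leaving the rest of the handler unchanged.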

requirements.txt

Flask==2.2.2
gcsfs==2022.7.1
gunicorn==20.1.0

Dockerfile

FROM python:3.10-slim

ENV PYTHONUNBUFFERED True

ENV APP_HOME /app
WORKDIR $APP_HOME
COPY . ./

RUN pip install --no-cache-dir -r requirements.txt

CMD exec gunicorn --bind :$PORT --workers 1 --threads 8 --timeout 0 main:app

Bash script:

BILLING="[YOUR-BILLING]"
PROJECT="[YOUR-PROJECT]"
REGION="[YOUR-REGION]"
BUCKET="[YOUR-BUCKET]"

# Create Project
gcloud projects create ${PROJECT}

# Associate with Billing Account
gcloud beta billing projects link ${PROJECT} \
--billing-account=${BILLING}

# Enabled services
SERVICES=(
  "artifactregistry"
  "cloudbuild"
  "run"
)
for SERVICE in ${SERVICES[@]}
do
  gcloud services enable ${SERVICE}.googleapis.com \
  --project=${PROJECT}
done

# Create Bucket
gsutil mb -p ${PROJECT} gs://${BUCKET}

# Service Account
ACCOUNT=tester
EMAIL=${ACCOUNT}@${PROJECT}.iam.gserviceaccount.com

# Create Service Account
gcloud iam service-accounts create ${ACCOUNT} \
--project=${PROJECT}

# Create Service Account key
gcloud iam service-accounts keys create ${PWD}/${ACCOUNT}.json \
--iam-account=${EMAIL} \
--project=${PROJECT}

# Ensure Service Account can write to storage
gcloud projects add-iam-policy-binding ${PROJECT} \
--role=roles/storage.admin \
--member=serviceAccount:${EMAIL}

# Only needed for local testing
export GOOGLE_APPLICATION_CREDENTIALS=${PWD}/${ACCOUNT}.json

# Deploy Cloud Run service
# Run service as Service Account
NAME="urltogcs"
gcloud run deploy ${NAME} \
--source=${PWD}  \
--set-env-vars=PROJECT=${PROJECT} \
--no-allow-unauthenticated \
--service-account=${EMAIL} \
--region=${REGION} \
--project=${PROJECT}

# Grab the Cloud Run service's endpoint
ENDPOINT=$(gcloud run services describe ${NAME} \
--region=${REGION} \
--project=${PROJECT} \
--format="value(status.url)")

# Cloud Run service requires auth
TOKEN=$(gcloud auth print-identity-token)

# This page
SRC="https://stackoverflow.com/questions/73393808/"

# Generate a GCS Object name by epoch
DST="gs://${BUCKET}/$(date +%s)"

curl \
--silent \
--get \
--header "Authorization: Bearer ${TOKEN}" \
--data-urlencode "source=${SRC}" \
--data-urlencode "destination=${DST}" \
--write-out '%{response_code}' \
--output /dev/null \
${ENDPOINT}/urltogcs

Yields OK:

200

And:

gsutil ls gs://${BUCKET}

gs://${BUCKET}/1660780270
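
For callers that prefer Python over curl, the request from the script above can be sketched with the standard library. This is a minimal sketch: build_request is a hypothetical helper, and the endpoint, token, and bucket values are placeholders rather than values from the original answer. The request is built but not sent.

```python
import urllib.parse
import urllib.request


def build_request(endpoint, token, source, destination):
    """Build (but do not send) the GET request that the curl command issues."""
    # urlencode percent-encodes the values, like curl's --data-urlencode.
    query = urllib.parse.urlencode(
        {"source": source, "destination": destination}
    )
    url = f"{endpoint.rstrip('/')}/urltogcs?{query}"
    return urllib.request.Request(
        url, headers={"Authorization": f"Bearer {token}"}
    )


req = build_request(
    endpoint="https://example-service.a.run.app",  # hypothetical endpoint
    token="ID_TOKEN",  # placeholder for `gcloud auth print-identity-token`
    source="https://stackoverflow.com/questions/73393808/",
    destination="gs://my-bucket/1660780270",  # hypothetical bucket and object
)
print(req.full_url)
```

Passing req to urllib.request.urlopen would send it; the response's status attribute then corresponds to the code that curl prints via --write-out.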
