
httplib2.socks.HTTPError: (403, b'Forbidden') python apache-beam dataflow

I work in a Google Cloud environment where I don't have internet access, so I'm using a proxy to reach the internet. When I try to launch a Dataflow job by running a simple wordcount.py, I get this error:

WARNING:apache_beam.utils.retry:Retry with exponential backoff: waiting for 4.750968074377858 seconds before retrying _uncached_gcs_file_copy because we caught exception: httplib2.socks.HTTPError: (403, b'Forbidden')
 Traceback for above exception (most recent call last):
  File "/opt/py38/lib64/python3.8/site-packages/apache_beam/utils/retry.py", line 275, in wrapper
    return fun(*args, **kwargs)
  File "/opt/py38/lib64/python3.8/site-packages/apache_beam/runners/dataflow/internal/apiclient.py", line 631, in _uncached_gcs_file_copy
    self.stage_file(to_folder, to_name, f, total_size=total_size)
  File "/opt/py38/lib64/python3.8/site-packages/apache_beam/runners/dataflow/internal/apiclient.py", line 735, in stage_file
    response = self._storage_client.objects.Insert(request, upload=upload)
  File "/opt/py38/lib64/python3.8/site-packages/apache_beam/io/gcp/internal/clients/storage/storage_v1_client.py", line 1152, in Insert
    return self._RunMethod(
  File "/opt/py38/lib64/python3.8/site-packages/apitools/base/py/base_api.py", line 728, in _RunMethod
    http_response = http_wrapper.MakeRequest(
  File "/opt/py38/lib64/python3.8/site-packages/apitools/base/py/http_wrapper.py", line 359, in MakeRequest
    retry_func(ExceptionRetryArgs(http, http_request, e, retry,
  File "/opt/py38/lib64/python3.8/site-packages/apache_beam/io/gcp/gcsio_overrides.py", line 45, in retry_func
    return http_wrapper.HandleExceptionsAndRebuildHttpConnections(retry_args)
  File "/opt/py38/lib64/python3.8/site-packages/apitools/base/py/http_wrapper.py", line 304, in HandleExceptionsAndRebuildHttpConnections
    raise retry_args.exc
  File "/opt/py38/lib64/python3.8/site-packages/apitools/base/py/http_wrapper.py", line 348, in MakeRequest
    return _MakeRequestNoRetry(
  File "/opt/py38/lib64/python3.8/site-packages/apitools/base/py/http_wrapper.py", line 397, in _MakeRequestNoRetry
    info, content = http.request(
  File "/opt/py38/lib64/python3.8/site-packages/google_auth_httplib2.py", line 209, in request
    self.credentials.before_request(self._request, method, uri, request_headers)
  File "/opt/py38/lib64/python3.8/site-packages/google/auth/credentials.py", line 134, in before_request
    self.refresh(request)
  File "/opt/py38/lib64/python3.8/site-packages/google/auth/compute_engine/credentials.py", line 111, in refresh
    self._retrieve_info(request)
  File "/opt/py38/lib64/python3.8/site-packages/google/auth/compute_engine/credentials.py", line 87, in _retrieve_info
    info = _metadata.get_service_account_info(
  File "/opt/py38/lib64/python3.8/site-packages/google/auth/compute_engine/_metadata.py", line 234, in get_service_account_info
    return get(request, path, params={"recursive": "true"})
  File "/opt/py38/lib64/python3.8/site-packages/google/auth/compute_engine/_metadata.py", line 150, in get
    response = request(url=url, method="GET", headers=_METADATA_HEADERS)
  File "/opt/py38/lib64/python3.8/site-packages/google_auth_httplib2.py", line 119, in __call__
    response, data = self.http.request(
  File "/opt/py38/lib64/python3.8/site-packages/httplib2/__init__.py", line 1701, in request
    (response, content) = self._request(
  File "/opt/py38/lib64/python3.8/site-packages/httplib2/__init__.py", line 1421, in _request
    (response, content) = self._conn_request(conn, request_uri, method, body, headers)
  File "/opt/py38/lib64/python3.8/site-packages/httplib2/__init__.py", line 1343, in _conn_request
    conn.connect()
  File "/opt/py38/lib64/python3.8/site-packages/httplib2/__init__.py", line 1026, in connect
    self.sock.connect((self.host, self.port) + sa[2:])
  File "/opt/py38/lib64/python3.8/site-packages/httplib2/socks.py", line 504, in connect
    self.__negotiatehttp(destpair[0], destpair[1])
  File "/opt/py38/lib64/python3.8/site-packages/httplib2/socks.py", line 465, in __negotiatehttp
    raise HTTPError((statuscode, statusline[2]))

My service account has these roles:

BigQuery Data Editor, BigQuery User, Dataflow Developer, Dataflow Worker, Service Account User, Storage Admin

The instance has the Cloud API access scope "Allow full access to all Cloud APIs".

What is the problem?
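The traceback itself gives a hint: the failing frames are in google.auth.compute_engine, i.e. the credential refresh is calling the GCE metadata server (metadata.google.internal), and that request is being routed through the proxy, which answers 403. As a stdlib-only sketch (the URL is the standard GCE metadata endpoint; it only resolves from inside a GCE VM), you can check whether the metadata server is reachable when the proxy is bypassed:

```python
import urllib.request
import urllib.error

# Standard GCE metadata endpoint; only resolvable from inside a GCE VM.
URL = ("http://metadata.google.internal/computeMetadata/v1/"
       "instance/service-accounts/default/email")

def metadata_reachable(timeout=2):
    """Return True if the metadata server answers a direct (proxy-free) request."""
    req = urllib.request.Request(URL, headers={"Metadata-Flavor": "Google"})
    # An empty ProxyHandler makes urllib ignore http_proxy/https_proxy
    # environment variables, so the request goes out directly.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        opener.open(req, timeout=timeout)
        return True
    except OSError:  # covers URLError, HTTPError, socket errors
        return False

print(metadata_reachable())
```

If this prints True while the job launch still fails with a 403, the problem is that the Google auth libraries are sending the same request through the proxy instead of directly.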

Based on the comment from @luca, the above error was solved by using an internal proxy that allows access to the internet. Add --no_use_public_ip to the command and set no_proxy="metadata.google.internal,www.googleapis.com,dataflow.googleapis.com,bigquery.googleapis.com".
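Putting the fix together, a minimal launch sketch might look like the following. The proxy address, project, region, and bucket names are placeholders, not values from the original post:

```python
import os

# Hypothetical internal proxy address -- replace with your environment's proxy.
os.environ["https_proxy"] = "http://proxy.internal.example:3128"
# Hosts that must BYPASS the proxy: the metadata server, plus Google APIs
# that are reachable internally (e.g. via Private Google Access).
os.environ["no_proxy"] = ",".join([
    "metadata.google.internal",
    "www.googleapis.com",
    "dataflow.googleapis.com",
    "bigquery.googleapis.com",
])

# Launch command for the wordcount example; --no_use_public_ip keeps the
# Dataflow workers on private IPs only. Printed here rather than executed.
cmd = [
    "python", "-m", "apache_beam.examples.wordcount",
    "--runner", "DataflowRunner",
    "--project", "my-project",
    "--region", "europe-west1",
    "--temp_location", "gs://my-bucket/tmp",
    "--output", "gs://my-bucket/out",
    "--no_use_public_ip",
]
print(" ".join(cmd))
```

The key point is that metadata.google.internal is a link-local service inside the VM: a proxy outside the VM cannot reach it, so routing the credential-refresh request through the proxy is what produced the 403 in the traceback above.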
