
How to authenticate a local PostgreSQL server to access Google Cloud Storage

I am new to the cloud and to data engineering as well.

I have a large CSV file stored in a GCS bucket. I would like to write a Python script to bulk-insert the data into a PostgreSQL database on my local machine using a COPY statement. I cannot figure out the authentication though.

I would like to do something like this:

import psycopg2

conn = psycopg2.connect(database=database,
                        user=user,
                        password=password,
                        host=host,
                        port=port)

cursor = conn.cursor()
file = 'https://storage.cloud.google.com/<my_project>/<my_file.csv>'
sql_query = f"COPY <MY_TABLE> FROM '{file}' WITH CSV"
cursor.execute(sql_query)
conn.commit()
conn.close()

I get this error message:

psycopg2.errors.UndefinedFile: could not open file "https://storage.cloud.google.com/<my_project>/<my_file.csv>" for reading: No such file or directory
HINT: COPY FROM instructs the PostgreSQL server process to read a file. You may want a client-side facility such as psql's \copy.

The same happens when I run the query in psql.

I assume the problem is in authentication. I have set up Application Default Credentials (ADC) with the Google Cloud CLI, and when acting as the authenticated user I can easily download the file using wget. When I switch to the postgres user, I get an "access denied" error.

ADC seems to work only with client libraries and command-line tools.

I use Ubuntu 22.04.1 LTS.

Thanks for any help.

This is not going to work for you. The file needs to be in a location the server process is permitted to read, and it cannot be fetched over HTTP: COPY FROM expects a local file path on the server.

You can instead supply a program or script that fetches the file and prints it to STDOUT, which the server can consume through COPY ... FROM PROGRAM.
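A minimal sketch of the program-based option, assuming `gsutil` is installed on the database host and that the OS account running PostgreSQL has its own Google Cloud credentials (ADC set up for your login user will not be visible to it). The helper names `build_copy_from_program` and `load_via_server`, and the bucket, object, and table names, are placeholders rather than anything from the question. The database role also has to be a superuser or a member of `pg_execute_server_program` for `COPY ... FROM PROGRAM` to be allowed.

```python
import shlex


def build_copy_from_program(table: str, bucket: str, obj: str) -> str:
    """Build a COPY ... FROM PROGRAM statement that streams a GCS object.

    `table` is assumed to be a trusted identifier; it is interpolated as-is.
    """
    uri = f"gs://{bucket}/{obj}"
    # `gsutil cat` writes the object to STDOUT; shlex.quote protects the
    # URI when the server hands the command to the shell.
    command = f"gsutil cat {shlex.quote(uri)}"
    # Double single quotes so the command is a valid SQL string literal.
    sql_literal = command.replace("'", "''")
    return f"COPY {table} FROM PROGRAM '{sql_literal}' WITH (FORMAT csv)"


def load_via_server(conn, table: str, bucket: str, obj: str) -> None:
    """Run the statement on an open psycopg2 connection and commit."""
    with conn.cursor() as cursor:
        cursor.execute(build_copy_from_program(table, bucket, obj))
    conn.commit()
```

For example, `build_copy_from_program("my_table", "my-bucket", "my_file.csv")` yields `COPY my_table FROM PROGRAM 'gsutil cat gs://my-bucket/my_file.csv' WITH (FORMAT csv)`. The command runs on the server, as the server's OS user, so the authentication problem moves to that account rather than going away.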

Or do what the error message suggests and handle it on the client side with psycopg's copy support.
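A sketch of the client-side route, assuming psycopg2 and the `google-cloud-storage` package are installed and that the script runs as the user whose Application Default Credentials are already configured. The function names and arguments here are illustrative. The object is read by the Python process and streamed to the server over the existing connection with `COPY ... FROM STDIN`, so the postgres OS user never needs access to the bucket.

```python
def copy_stream_to_table(cursor, table: str, stream) -> None:
    """Send a CSV file-like object to PostgreSQL via COPY ... FROM STDIN.

    `table` is assumed to be a trusted identifier; it is interpolated as-is.
    """
    cursor.copy_expert(f"COPY {table} FROM STDIN WITH (FORMAT csv)", stream)


def load_gcs_csv(conn, table: str, bucket_name: str, object_name: str) -> None:
    """Stream a GCS object into a table on an open psycopg2 connection."""
    # Imported here so the helper above can be used without the GCS library.
    from google.cloud import storage  # pip install google-cloud-storage

    # storage.Client() picks up Application Default Credentials by default.
    blob = storage.Client().bucket(bucket_name).blob(object_name)
    # blob.open("rb") returns a file-like reader that fetches data in chunks,
    # so a large file does not have to fit in memory or be saved to disk.
    with blob.open("rb") as stream, conn.cursor() as cursor:
        copy_stream_to_table(cursor, table, stream)
    conn.commit()
```

With the connection from the question, a call would look like `load_gcs_csv(conn, "my_table", "my-bucket", "my_file.csv")`. Note that the bucket and object names are passed separately instead of as a `storage.cloud.google.com` URL.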

