
How do I batch upsert data into Google Cloud Spanner using the Python client library?

I would like to upsert the contents of a pandas DataFrame into a table in a Google Cloud Spanner database. The documentation recommends using the insert_or_update() method of the batch object.

If I create the batch object by running this:

from google.cloud import spanner_v1
client = spanner_v1.Client()
batch = client.batch()

Then this object does not have that method available. Running dir(client) gives me these results:

['SCOPE', 
'_SET_PROJECT', 
'__class__', 
'__delattr__', 
'__dict__', 
'__dir__', 
'__doc__', 
'__eq__', 
'__format__', 
'__ge__', 
'__getattribute__', 
'__getstate__', 
'__gt__', 
'__hash__', 
'__init__', 
'__init_subclass__', 
'__le__', 
'__lt__', 
'__module__', 
'__ne__', 
'__new__', 
'__reduce__', 
'__reduce_ex__', 
'__repr__', 
'__setattr__', 
'__sizeof__', 
'__str__', 
'__subclasshook__', 
'__weakref__', 
'_credentials', 
'_database_admin_api', 
'_determine_default', 
'_http', 
'_http_internal', 
'_instance_admin_api', 
'_item_to_instance', 
'copy', 
'credentials', 
'database_admin_api', 
'from_service_account_json', 
'instance', 
'instance_admin_api', 
'list_instance_configs', 
'list_instances', 
'project', 
'project_name', 
'user_agent']

How do I do a batch upsert in Spanner?

The snippets have an example of a batch insert. I checked that the batch object created in the snippet also has an insert_or_update method.

https://github.com/GoogleCloudPlatform/python-docs-samples/blob/master/spanner/cloud-client/snippets.py#L72

['__class__', '__delattr__', '__dict__', '__doc__', '__enter__', '__exit__', '__format__', '__getattribute__', '__hash__', '__init__', '__module__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', '__weakref__', '_check_state', '_mutations', '_session', 'commit', 'committed', 'delete', 'insert', 'insert_or_update', 'replace', 'update']

Can you try that out?

If you have a pandas DataFrame, here a random 5 x 3 one with columns a, b, c, you can split it into column names and rows and then batch insert them.

import pandas as pd
import numpy as np

df = pd.DataFrame(np.random.randint(low=0, high=10, size=(5, 3)),
                  columns=['a', 'b', 'c'])

You can insert this into Google Cloud Spanner by extracting the columns and values from df and batch inserting them.

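from google.cloud import spanner

# instance_id and database_id must be set to your own Spanner instance and database IDs.
spanner_client = spanner.Client()
instance = spanner_client.instance(instance_id)
database = instance.database(database_id)

# Column names and rows as plain Python lists.
columns = df.columns.tolist()
values = df.values.tolist()

# The batch is committed when the with block exits.
with database.batch() as batch:
    batch.insert(
        table='table',
        columns=columns,
        values=values,
    )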
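
Since the question asks for an upsert rather than a plain insert, the same pattern can be sketched with insert_or_update, which the batch object listed above exposes. This is a minimal sketch, not the library's prescribed approach: the helper names chunked and batch_upsert and the chunk_size default of 1000 are assumptions made for illustration, and splitting the rows into several commits is only a precaution against sending one very large commit.

```python
from itertools import islice


def chunked(rows, size):
    """Yield successive lists of at most `size` rows (hypothetical helper)."""
    it = iter(rows)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def batch_upsert(database, table, columns, rows, chunk_size=1000):
    """Upsert `rows` into `table`, one commit per chunk (hypothetical helper).

    `database` is expected to behave like the object returned by
    instance.database(database_id): database.batch() must be a context
    manager that yields an object with insert_or_update().
    Returns the number of rows submitted.
    """
    written = 0
    for chunk in chunked(rows, chunk_size):
        # Each with block commits its mutations when it exits.
        with database.batch() as batch:
            batch.insert_or_update(table=table, columns=columns, values=chunk)
        written += len(chunk)
    return written
```

With the DataFrame from above this would be called as batch_upsert(database, 'table', df.columns.tolist(), df.values.tolist()). For the upsert to match existing rows, the columns passed in have to include the table's primary key columns. Note that each chunk is committed separately, so a failure partway through leaves the earlier chunks written.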

