Appending pandas Data Frame to Google spreadsheet
Case: My script returns a data frame that has to be appended to an existing Google spreadsheet as new rows of data. As of now, I'm appending the data frame as multiple single rows through gspread.
My code:
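import gspread
import pandas as pd

# gc is an authorized gspread client, e.g. gc = gspread.service_account(filename=...)
df = pd.DataFrame()
# After some processing a non-empty data frame has been created.
output_conn = gc.open("SheetName").worksheet("xyz")
# Here 'SheetName' is the Google spreadsheet and 'xyz' is a sheet in the workbook
for i, row in df.iterrows():
    # append_row expects a plain list of cell values, not a pandas Series
    output_conn.append_row(row.tolist())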
Is there a way to append the entire data frame at once rather than multiple single rows?
I can recommend gspread-dataframe:
import pandas as pd
import gspread_dataframe as gd
# Connecting with `gspread` here
ws = gc.open("SheetName").worksheet("xyz")
existing = gd.get_as_dataframe(ws)
# DataFrame.append was removed in pandas 2.0; pd.concat does the same job
updated = pd.concat([existing, your_new_data])
gd.set_with_dataframe(ws, updated)
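One caveat with this read-then-rewrite approach: depending on the gspread-dataframe version and the sheet's grid size, get_as_dataframe may return blank rows or columns as all-NaN entries. Below is a minimal sketch of cleaning those up before combining, using plain pandas only. The helper name combine_frames and the sample frames are hypothetical, not part of either library.

```python
import numpy as np
import pandas as pd

def combine_frames(existing: pd.DataFrame, new: pd.DataFrame) -> pd.DataFrame:
    """Drop fully empty rows/columns from `existing`, then stack `new` below it."""
    cleaned = existing.dropna(how="all").dropna(axis=1, how="all")
    return pd.concat([cleaned, new], ignore_index=True)

# Stand-in for a sheet holding two data rows plus empty padding (made-up data).
existing = pd.DataFrame({
    "a": [1, 2, np.nan],
    "b": [10, 20, np.nan],
    "Unnamed: 2": [np.nan, np.nan, np.nan],
})
new = pd.DataFrame({"a": [3], "b": [30]})
updated = combine_frames(existing, new)
```

The result could then be passed to gd.set_with_dataframe(ws, updated) as in the answer above.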
Here is the code to write to, append to (without loading the existing sheet into memory), and read from Google Sheets.
import gspread as gs
import gspread_dataframe as gd
import pandas as pd

gc = gs.service_account(filename="your/cred/file.json")

def export_to_sheets(worksheet_name, df, mode='r'):
    ws = gc.open("SHEET_NAME").worksheet(worksheet_name)
    if mode == 'w':
        # Write: clear the sheet, then resize it to exactly fit df
        ws.clear()
        gd.set_with_dataframe(worksheet=ws, dataframe=df, include_index=False,
                              include_column_header=True, resize=True)
        return True
    elif mode == 'a':
        # Append: note the first row past the current grid, then grow the grid
        first_new_row = ws.row_count + 1
        ws.add_rows(df.shape[0])
        gd.set_with_dataframe(worksheet=ws, dataframe=df, include_index=False,
                              include_column_header=False, row=first_new_row, resize=False)
        return True
    else:
        # Read: return the sheet contents as a DataFrame
        return gd.get_as_dataframe(worksheet=ws)

df = pd.DataFrame.from_records([{'a': i, 'b': i * 2} for i in range(100)])
export_to_sheets("worksheet_name", df, 'a')
Write mode: first, ws.clear() clears the existing worksheet. Second, set_with_dataframe() uploads the dataframe; note that resize=True here, which strictly sets the worksheet's rows and columns to df.shape. This will help later in the append method.
Append mode: resize=False because we are adding the rows ourselves, and row is set to ws.row_count+1 (one past the last existing row) to anchor where the appended data starts.
I was facing the same problem. Here's what I did: I converted the dataframe into a list and used gspread's append_rows().
import gspread
gc = gspread.service_account(filename="credentials.json")
sh = gc.open_by_key('<your_key>')
ws = sh.sheet1
# data is the original data frame
data_list = data.values.tolist()
ws.append_rows(data_list)
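The list handed to append_rows() ends up in a JSON request, so NaN values and timestamps in the dataframe can cause trouble. Here is a minimal sketch of a conversion step that could run before the upload. The helper name df_to_rows is hypothetical, and turning missing values into empty strings is an assumption about what you want to see in the sheet.

```python
import pandas as pd

def df_to_rows(df: pd.DataFrame) -> list:
    """Convert a DataFrame to a list of rows made of JSON-friendly values."""
    rows = []
    for record in df.itertuples(index=False, name=None):
        row = []
        for value in record:
            if pd.isna(value):
                row.append("")                 # NaN/None/NaT -> empty cell
            elif isinstance(value, pd.Timestamp):
                row.append(value.isoformat())  # timestamp -> ISO 8601 string
            elif hasattr(value, "item"):
                row.append(value.item())       # numpy scalar -> Python scalar
            else:
                row.append(value)
        rows.append(row)
    return rows

# Made-up sample data with a missing value.
data = pd.DataFrame({"a": [1, 2], "b": [1.5, float("nan")], "c": ["x", "y"]})
data_list = df_to_rows(data)
```

With this in place, the upload would read ws.append_rows(df_to_rows(data)) instead of passing data.values.tolist() directly.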
The following approach, using gspread, may help one understand the procedure and solve the problem.
Install the libraries in your environment.
Import the libraries in the script:
import pandas as pd
import gspread
from gspread_dataframe import set_with_dataframe
Create credentials in the Google API console.
Add the following to the script to access the Google Sheet:
gc = gspread.service_account(filename='GoogleAPICredentials.json')
sh = gc.open_by_key('GoogleSheetID')
Assuming one wants to add to the first sheet, use 0 in get_worksheet (for the second sheet use 1, and so on):
worksheet = sh.get_worksheet(0)
Then, to export the dataframe (assuming it is named df) to the Google Sheet:
set_with_dataframe(worksheet, df)
I came up with the following solution. It does not overwrite current data but just appends the entire pandas DataFrame df to the end of the sheet named sheet_name in the spreadsheet named spread_sheet.
import gspread
from oauth2client.service_account import ServiceAccountCredentials

def append_df_to_gs(df, spread_sheet: str, sheet_name: str):
    scopes = [
        'https://spreadsheets.google.com/feeds',
        'https://www.googleapis.com/auth/drive',
    ]
    # path_to_credentials must point to your service-account JSON key file
    credentials = ServiceAccountCredentials.from_json_keyfile_name(
        path_to_credentials,
        scopes=scopes
    )
    gsc = gspread.authorize(credentials)
    sheet = gsc.open(spread_sheet)
    params = {'valueInputOption': 'USER_ENTERED'}
    body = {'values': df.values.tolist()}
    # Range in A1 notation on the target sheet; rows are appended below existing data
    sheet.values_append(f'{sheet_name}!A1:G1', params, body)
For the valueInputOption parameter, please consult the Google Sheets API documentation. I used USER_ENTERED here as I needed some formulas to be valid once I append the data to Google Sheets.
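To make the difference concrete: with USER_ENTERED the Sheets API parses values as if they were typed into the UI, so a string starting with "=" becomes a formula, while RAW stores it as literal text. Below is a small sketch that only builds the arguments for values_append without calling the API. The helper name build_append_request, the sample data, and the "Sheet1!A1" range are illustrative assumptions.

```python
import pandas as pd

def build_append_request(df: pd.DataFrame, sheet_name: str, parse_input: bool = True):
    """Return (range, params, body) in the shape passed to values_append."""
    option = "USER_ENTERED" if parse_input else "RAW"
    cell_range = f"{sheet_name}!A1"
    params = {"valueInputOption": option}
    body = {"values": df.values.tolist()}
    return cell_range, params, body

# Made-up data: the 'total' column holds formulas as strings.
df = pd.DataFrame({
    "qty": [2, 3],
    "price": [1.5, 4.0],
    "total": ["=A2*B2", "=A3*B3"],
})
cell_range, params, body = build_append_request(df, "Sheet1")
raw_range, raw_params, raw_body = build_append_request(df, "Sheet1", parse_input=False)
```

The three values would then be passed as sheet.values_append(cell_range, params, body). Note that the cell references inside the formulas have to match the rows where the data actually lands.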
ws = gc.open("sheet title").worksheet("Sheet1")
gd.set_with_dataframe(ws, dataframe)
# Simply transfer your dataframe to the Google sheet
I came up with the following solution using a try/except statement: if the worksheet doesn't exist, it creates it for you and writes the dataframe; otherwise it appends the dataframe to it.
import gspread
from gspread_dataframe import set_with_dataframe

def load_to_sheet(conn_sheet, spreadsheet_name, df):
    try:
        worksheet = conn_sheet.worksheet(spreadsheet_name)
        # Worksheet exists: append below the current grid
        first_new_row = worksheet.row_count + 1
        worksheet.add_rows(df.shape[0])
        set_with_dataframe(worksheet=worksheet, row=first_new_row, dataframe=df, include_index=False,
                           include_column_header=False,
                           resize=False)
    except gspread.exceptions.WorksheetNotFound:
        # Worksheet is missing: create it and write the dataframe with headers
        worksheet = conn_sheet.add_worksheet(title=spreadsheet_name, rows=100, cols=100)
        set_with_dataframe(worksheet=worksheet, dataframe=df, include_index=False, include_column_header=True,
                           resize=True)
The following does not require any external library other than gspread:
worksheet.update([dataframe.columns.values.tolist()] + dataframe.values.tolist())
If the Google spreadsheet is in .csv format, you can use df.to_csv() to convert the pandas dataframe to CSV and save it in that format.
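For the CSV route, appending can be done with pandas alone. This is a minimal sketch assuming a local file; the temporary path and the sample frames are made up for illustration.

```python
import os
import tempfile

import pandas as pd

first = pd.DataFrame({"a": [1, 2], "b": [3, 4]})
extra = pd.DataFrame({"a": [5], "b": [6]})

# Hypothetical location; in practice this would be your existing CSV file.
path = os.path.join(tempfile.mkdtemp(), "data.csv")

# Initial write includes the header row.
first.to_csv(path, index=False)
# mode="a" appends; header=False avoids repeating the column names.
extra.to_csv(path, mode="a", header=False, index=False)

combined = pd.read_csv(path)
```

This only works as intended when the appended frame has the same columns in the same order as the existing file.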