
How to import data into google colab from google drive?

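I have some data files uploaded to my Google Drive. I want to import those files into Google Colab.

The REST API method and the PyDrive method show how to create a new file and upload it to Drive and Colab. Using those, I can't figure out how to read data files that already exist on my Drive from my Python code.

I am a total newbie at this. Can someone help me?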

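(Update, April 15, 2018: gspread is updated frequently, so to keep the workflow stable I pin the package versions.)

For spreadsheet files, the basic idea is to use the gspread and pandas packages to read spreadsheets stored in Drive and convert them to pandas DataFrames.

In the Colab notebook: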

#install packages
!pip install gspread==2.1.1
!pip install gspread-dataframe==2.1.0
!pip install pandas==0.22.0


#import packages and authorize connection to Google account:
import pandas as pd
import gspread
from gspread_dataframe import get_as_dataframe, set_with_dataframe
from google.colab import auth
auth.authenticate_user()  # verify your account to read files which you have access to. Make sure you have permission to read the file!
from oauth2client.client import GoogleCredentials
gc = gspread.authorize(GoogleCredentials.get_application_default()) 

After that, I know of 3 ways to read a Google spreadsheet.

By file name:

spreadsheet = gc.open("goal.csv") # Open file using its name. Use this if the file is already anywhere in your drive
sheet =  spreadsheet.get_worksheet(0)  # 0 means the first sheet in the file
df2 = pd.DataFrame(sheet.get_all_records())
df2.head()

By URL:

spreadsheet = gc.open_by_url('https://docs.google.com/spreadsheets/d/1LCCzsUTqBEq5pemRNA9EGy62aaeIgye4XxwReYg1Pe4/edit#gid=509368585') # use this when you have the complete url (the #gid part identifies a worksheet tab)
sheet =  spreadsheet.get_worksheet(0)  # 0 means the first sheet in the file
df2 = pd.DataFrame(sheet.get_all_records())
df2.head()

By file key/ID:

spreadsheet = gc.open_by_key('1vpukIbGZfK1IhCLFalBI3JT3aobySanJysv0k5A4oMg') # use this when you have the key (the string in the url following spreadsheets/d/)
sheet =  spreadsheet.get_worksheet(0)  # 0 means the first sheet in the file
df2 = pd.DataFrame(sheet.get_all_records())
df2.head()
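
The key passed to `open_by_key` is the path segment that follows `/spreadsheets/d/` in a sheet URL. Below is a minimal sketch of pulling it out with a regular expression. `extract_sheet_key` is a hypothetical helper name, not part of gspread, and the pattern assumes the usual `docs.google.com/spreadsheets/d/<key>/...` URL shape.

```python
import re

# Hypothetical helper (not part of gspread).
# Assumes the URL looks like .../spreadsheets/d/<key>/...
_KEY_PATTERN = re.compile(r"/spreadsheets/d/([a-zA-Z0-9_-]+)")


def extract_sheet_key(url):
    """Return the spreadsheet key embedded in a Google Sheets URL."""
    match = _KEY_PATTERN.search(url)
    if match is None:
        raise ValueError("no spreadsheet key found in URL: %r" % url)
    return match.group(1)


url = "https://docs.google.com/spreadsheets/d/1LCCzsUTqBEq5pemRNA9EGy62aaeIgye4XxwReYg1Pe4/edit#gid=509368585"
key = extract_sheet_key(url)
print(key)
```

The resulting `key` could then be handed to `gc.open_by_key(key)` in an authorized Colab session.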

I shared the code above in a Colab notebook: https://drive.google.com/file/d/1cvur-jpIpoEN3vAO8Fd_yVAT5Qgbr4GV/view?usp=sharing

Source: https://github.com/burnash/gspread

Make your data publicly available; then, for public spreadsheets:

from io import StringIO  # in Python 2 this was: from StringIO import StringIO

import pandas as pd
import requests

r = requests.get('https://docs.google.com/spreadsheet/ccc?key=0Ak1ecr7i0wotdGJmTURJRnZLYlV3M2daNTRubTdwTXc&output=csv')
data = r.text  # decoded text; r.content is bytes in Python 3 and cannot go into StringIO

df = pd.read_csv(StringIO(data), index_col=0, parse_dates=['Quradate'])
df.head()

More info: Importing a Google Spreadsheet CSV into a pandas DataFrame
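
As a rough sketch of the same public-CSV idea: the helper below builds a CSV export URL from a spreadsheet key, and the parsing step is shown on an inline sample so it runs without network access. The `export?format=csv&gid=...` URL pattern is an assumption about how publicly shared sheets are commonly exported, and `csv_export_url`, `sample_csv`, and the sample column names other than `Quradate` are made up for illustration.

```python
from io import StringIO

import pandas as pd


def csv_export_url(key, gid=0):
    """Build a CSV export URL for a publicly shared sheet.

    Hypothetical helper; the URL pattern is an assumption, not taken
    from the answer above.
    """
    return "https://docs.google.com/spreadsheets/d/{}/export?format=csv&gid={}".format(key, gid)


# Offline stand-in for the text that requests.get(...).text would return.
sample_csv = "id,Quradate,value\n1,2013-01-01,10\n2,2013-01-02,20\n"

# Same read_csv arguments as in the answer: first column becomes the index,
# and the 'Quradate' column is parsed as dates.
df = pd.read_csv(StringIO(sample_csv), index_col=0, parse_dates=["Quradate"])
print(df.dtypes)
```

Since `pd.read_csv` also accepts a URL directly, something like `pd.read_csv(csv_export_url(key))` may work for a sheet that is shared publicly, without going through `requests`.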

For private data the approach is much the same, but you will have to do some authentication...

From the Google Colab snippets:

from google.colab import auth
auth.authenticate_user()

import gspread
from oauth2client.client import GoogleCredentials

gc = gspread.authorize(GoogleCredentials.get_application_default())

worksheet = gc.open('Your spreadsheet name').sheet1

# get_all_values gives a list of rows.
rows = worksheet.get_all_values()
print(rows)

# Convert to a DataFrame and render.
import pandas as pd
pd.DataFrame.from_records(rows)
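
Because `get_all_values` returns plain rows, building a DataFrame from them directly leaves the header row in as data. A minimal sketch of promoting the first row to column names follows. The `rows` list here is a made-up stand-in for what the worksheet would return, and the sketch assumes the first row holds the column names and that cell values arrive as strings.

```python
import pandas as pd

# Hypothetical stand-in for worksheet.get_all_values().
rows = [
    ["name", "score"],
    ["alice", "90"],
    ["bob", "85"],
]

# Split off the header row and use it as the column names.
header, *body = rows
df = pd.DataFrame(body, columns=header)

# Cells are strings here, so convert numeric columns explicitly.
df["score"] = pd.to_numeric(df["score"])
print(df)
```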

