Retrieve data from public Google Spreadsheet using gdata library?

I'm working in Python and trying to retrieve data from a public Google Spreadsheet (this one), but I'm struggling a bit with the developer documentation.

I'd like to avoid client authentication if possible, as it's a public spreadsheet.

Here's my current code, with the gdata library:

import gdata.spreadsheet.service

client = gdata.spreadsheet.service.SpreadsheetsService()
key = '0Atncguwd4yTedEx3Nzd2aUZyNmVmZGRHY3Nmb3I2ZXc'
worksheets_feed = client.GetWorksheetsFeed(key)

This fails on the last line, the GetWorksheetsFeed call, with BadStatusLine.

How can I read in the data from the spreadsheet?

I want to start out by echoing your sentiment that the documentation is really poor. But here's what I've been able to figure out so far.

Published or Public

It is very important that your spreadsheet be "Published to The Web" as opposed to just being "Public on the web." The first is achieved by going to the "File -> Publish to The Web ..." menu item. The second is achieved by clicking the "Share" button in the upper right-hand corner of the spreadsheet.

I checked, and your spreadsheet with key = '0Atncguwd4yTedEx3Nzd2aUZyNmVmZGRHY3Nmb3I2ZXc' is only "Public on the web." I made a copy of it to play around with for my example code. My copy has key = '0Aip8Kl9b7wdidFBzRGpEZkhoUlVPaEg2X0F2YWtwYkE', which you will see in my sample code later.

This "Public on the web" vs. "Published to The Web" nonsense is obviously a common point of confusion. It is actually documented in a red box in the "Visibilities and Projections" section of the main API documentation. However, that document is really hard to read.

Visibility and Projections

As that same document says, there are projections other than "full." In fact (and this is undocumented), "full" doesn't seem to play nicely with a visibility of "public," which is also important to set when making unauthenticated calls.

You can kind of glean from the pydocs that many of the methods on the SpreadsheetsService object can take "visibility" and "projection" parameters. I know only of "public" and "private" visibilities. If you learn of any others, I'd like to know about them too. It seems that "public" is what you should use when making unauthenticated calls.

As for projections, it is even more complicated. I know of "full", "basic", and "values" projections. I only got lucky and found the "values" projection by reading the source code of the excellent Tabletop JavaScript library. And, guess what, that's the secret missing ingredient to make things work.
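To make the two parameters a bit more concrete, here is a minimal sketch of where they end up in the feed URL. It assumes the legacy feed layout of https://spreadsheets.google.com/feeds/<feed>/<key>/[<worksheet id>/]<visibility>/<projection>; the helper name feed_url and the BASE constant are made up for illustration and are not part of gdata.

```python
# Assumed root of the legacy spreadsheet feeds (an assumption, not taken from gdata).
BASE = "https://spreadsheets.google.com/feeds"


def feed_url(feed, key, visibility="public", projection="values", worksheet_id=None):
    """Build a feed URL from its path segments (hypothetical helper).

    feed is the feed type, e.g. "worksheets" or "list".
    worksheet_id is only needed for per-worksheet feeds such as "list".
    """
    parts = [BASE, feed, key]
    if worksheet_id is not None:
        parts.append(worksheet_id)
    # Visibility and projection are always the last two path segments.
    parts.extend([visibility, projection])
    return "/".join(parts)


print(feed_url("worksheets", "0Aip8Kl9b7wdidFBzRGpEZkhoUlVPaEg2X0F2YWtwYkE",
               projection="basic"))
```

Pasting a URL built this way into a browser is a quick way to check whether a given visibility and projection pair is accepted before involving the client library at all.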

Working Code

Here is some code you can use to query the worksheets from my copy of your spreadsheet.

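#!/usr/bin/python
from gdata.spreadsheet.service import SpreadsheetsService

# Key of my "Published to The Web" copy of the spreadsheet
key = '0Aip8Kl9b7wdidFBzRGpEZkhoUlVPaEg2X0F2YWtwYkE'

# No login is needed for a published sheet when visibility is 'public'
client = SpreadsheetsService()
feed = client.GetWorksheetsFeed(key, visibility='public', projection='basic')

# Print the title of every worksheet in the spreadsheet
for sheet in feed.entry:
    print(sheet.title.text)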

Tips: When working with terribly documented Python APIs, I find it really helpful to use the dir() function in a running Python interpreter to find out more about the kind of information I can get from the Python objects. In this case, it doesn't help too much because the abstraction above the XML and URL based API is pretty poor.
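As a rough illustration of that dir() trick, the sketch below filters out the underscore-prefixed names so only the interesting attributes are left. FakeEntry is a hypothetical stand-in so the snippet runs without gdata or a network connection; in practice you would pass in a real feed or entry object.

```python
def public_names(obj):
    """Return the attribute names of obj that do not start with an underscore."""
    return [name for name in dir(obj) if not name.startswith("_")]


class FakeEntry(object):
    """Hypothetical stand-in for a feed entry; not a gdata class."""

    def __init__(self):
        self.title = "Sheet1"
        self.custom = {}


# dir() returns names in sorted order, so the result is stable.
print(public_names(FakeEntry()))
```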

By the way, I'm sure you are going to want to start dealing with the actual data in the spreadsheet, so I'll go ahead and toss in one more pointer. The data for each row, organized as a dictionary, can be found using GetListFeed(key, sheet_key, visibility='public', projection='values').entry[0].custom
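If you want plain Python dictionaries for every row, something along these lines might work. This is only a sketch: it assumes each value in entry.custom exposes the cell contents as a .text attribute, and row_to_dict, StubCell, and StubRow are hypothetical names used so the snippet runs without gdata or a network connection.

```python
def row_to_dict(entry):
    """Flatten entry.custom into {column name: cell text}.

    Assumption: each value in entry.custom has a .text attribute.
    """
    return dict((name, cell.text) for name, cell in entry.custom.items())


class StubCell(object):
    """Hypothetical stand-in for one cell of a list-feed row."""

    def __init__(self, text):
        self.text = text


class StubRow(object):
    """Hypothetical stand-in for one list-feed entry."""

    def __init__(self, **columns):
        self.custom = dict((k, StubCell(v)) for k, v in columns.items())


print(row_to_dict(StubRow(name="Ada", city="London")))

# With a real client, the same helper would be applied to each entry:
# feed = client.GetListFeed(key, sheet_key, visibility='public', projection='values')
# rows = [row_to_dict(entry) for entry in feed.entry]
```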
