Creating a CKAN dataset with the CKAN API and the Python requests library

Question

I am using CKAN 2.2 and trying to automate dataset creation and resource upload. I can't seem to create a dataset with the Python requests library; I get a 400 error code. Code:

import requests, json

dataset_dict = {
    'name': 'testdataset',
    'notes': 'A long description of my dataset',
}

d_url = 'https://mywebsite.ca/api/action/package_create'
auth = {'Authorization': 'myKeyHere'}
f = [('upload', file('PathToMyFile'))]

r = requests.post(d_url, data=dataset_dict, headers=auth)

Oddly, I am able to create a new resource and upload a file with the Python requests library. The code is based on the documentation. Code:

import requests, json

res_dict = {
    'package_id':'testpackage',
    'name': 'testresource',
    'description': 'A long description of my resource!',
    'format':'CSV'
}

res_url = 'https://mywebsite.ca/api/action/resource_create'
auth = {'Authorization': 'myKey'}
f = [('upload', file('pathToMyFile'))]

r = requests.post(res_url, data=res_dict, headers=auth, files=f)

I can also create a dataset with Python's built-in libraries, using the method from the CKAN documentation. Documentation: CKAN 2.2

Code:

#!/usr/bin/env python
import urllib2
import urllib
import json
import pprint

# Put the details of the dataset we're going to create into a dict.
dataset_dict = {
    'name': 'test1',
    'notes': 'A long description of my dataset',
}

# Use the json module to dump the dictionary to a string for posting.
data_string = urllib.quote(json.dumps(dataset_dict))

# We'll use the package_create function to create a new dataset.
request = urllib2.Request('https://myserver.ca/api/action/package_create')

# Creating a dataset requires an authorization header.
request.add_header('Authorization', 'myKey')

# Make the HTTP request.
response = urllib2.urlopen(request, data_string)
assert response.code == 200

# Use the json module to load CKAN's response into a dictionary.
response_dict = json.loads(response.read())
assert response_dict['success'] is True

# package_create returns the created package as its result.
created_package = response_dict['result']
pprint.pprint(created_package)

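I'm not sure why my method of creating a dataset doesn't work. The documentation for the package_create and resource_create functions is very similar, so I would expect the same technique to work for both. I would prefer to use the requests package for all of my interactions with CKAN. Has anyone managed to create a dataset with the requests library?

Any help is greatly appreciated.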

Answer 1

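I finally came back to this and figured it out. Alice's suggestion to check the encoding was very close. While requests does do the encoding for you, it also decides on its own which kind of encoding is appropriate, depending on the inputs. If a file is passed in along with the dictionary, requests automatically uses multipart/form-data encoding, which CKAN accepts, so the request succeeds.

However, if we pass only a dictionary, the default is form encoding: requests turns the dict into key=value pairs. For requests without files, CKAN needs the JSON string itself, URL-encoded (application/x-www-form-urlencoded). To stop requests from doing any encoding, we can pass the parameters in as a string; requests will then just perform the POST. That means we have to URL-encode the string ourselves.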
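To see which encoding requests picks for each kind of input, the request can be prepared without being sent. The following is a minimal Python 3 sketch, not part of the original answer; the URL and payload are the placeholders from the question, and `content_type_and_body` is a hypothetical helper name.

```python
import io
import json

import requests

# Placeholder URL and payload taken from the question.
url = 'https://mywebsite.ca/api/action/package_create'
payload = {'name': 'testdataset', 'notes': 'A long description of my dataset'}


def content_type_and_body(**kwargs):
    """Prepare (but do not send) a POST; return its Content-Type and body."""
    prepared = requests.Request('POST', url, **kwargs).prepare()
    return prepared.headers.get('Content-Type'), prepared.body


# Dict only: requests form-encodes it into key=value pairs.
form_type, form_body = content_type_and_body(data=payload)

# Dict plus a file: requests switches to multipart/form-data.
fake_file = io.BytesIO(b'a,b\n1,2\n')  # stand-in for a real CSV file
multi_type, multi_body = content_type_and_body(
    data=payload, files=[('upload', fake_file)])

# String: requests sends it untouched and sets no Content-Type itself.
raw_type, raw_body = content_type_and_body(data=json.dumps(payload))

print(form_type, multi_type, raw_type, sep='\n')
```

Nothing is sent over the network here, so the sketch can be run without a CKAN server or an API key.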

So I specify the content type, convert the parameters to a string, URL-encode that string with urllib, and then pass it to requests:

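# Assumes head is the headers dict (e.g. the Authorization dict from the question)
# and in_dict is the dataset dict.
head['Content-Type'] = 'application/x-www-form-urlencoded'
in_dict = urllib.quote(json.dumps(in_dict))  # Python 2; urllib.parse.quote in Python 3
r = requests.post(url, data=in_dict, headers=head)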

Then the request succeeds.
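For readers on Python 3, where `urllib.quote` has moved to `urllib.parse.quote`, the same approach might look like the sketch below. It is not from the original answer: `build_ckan_request` is a hypothetical helper name, the URL and key are the placeholders from the question, and the request is only built, not sent, so it can be inspected offline.

```python
import json
from urllib.parse import quote, unquote

import requests


def build_ckan_request(url, api_key, params):
    """Build (without sending) a POST whose body is URL-quoted JSON."""
    headers = {
        'Authorization': api_key,
        'Content-Type': 'application/x-www-form-urlencoded',
    }
    # The body is a str, so requests passes it through without re-encoding.
    body = quote(json.dumps(params))
    return requests.Request('POST', url, data=body, headers=headers).prepare()


prepared = build_ckan_request(
    'https://mywebsite.ca/api/action/package_create',  # placeholder URL
    'myKeyHere',                                       # placeholder API key
    {'name': 'testdataset', 'notes': 'A long description of my dataset'},
)

# Decoding the body again recovers the original JSON dictionary.
print(json.loads(unquote(prepared.body)))

# To actually send it against a real server:
# response = requests.Session().send(prepared)
```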

Answer 2

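The data you send must be JSON-encoded.

From the documentation (the page you linked to):

To call the CKAN API, post a JSON dictionary in an HTTP POST request to one of CKAN's API URLs.

In the urllib example, this is done by the following line of code: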

data_string = urllib.quote(json.dumps(dataset_dict))

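I think (though you should check) that the requests library will do the quoting for you, so you just need to convert your dict to JSON. Something like this should work: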

r = requests.post(d_url, data=json.dumps(dataset_dict), headers=auth)
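As a side note, requests 2.4.2 and later also accept a `json=` argument, which serializes the dict and sets `Content-Type: application/json`. Whether a CKAN 2.2 server accepts that content type is an assumption that is not verified in this thread, so treat the sketch below as something to try rather than a confirmed fix. The URL and key are the placeholders from the question, and the request is only prepared, not sent.

```python
import json

import requests

d_url = 'https://mywebsite.ca/api/action/package_create'  # placeholder URL
auth = {'Authorization': 'myKeyHere'}                     # placeholder API key
dataset_dict = {
    'name': 'testdataset',
    'notes': 'A long description of my dataset',
}

# json= makes requests serialize the dict and set the JSON Content-Type.
prepared = requests.Request(
    'POST', d_url, json=dataset_dict, headers=auth).prepare()

print(prepared.headers['Content-Type'])
print(json.loads(prepared.body))

# To actually send it against a real server:
# r = requests.post(d_url, json=dataset_dict, headers=auth)
```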

Answer 1: 6 votes, accepted, 2014-08-11 23:47:22

Answer 2: 3 votes, 2014-07-09 10:35:28
