
Mechanical Turk - Fetch results for a batch via API

We've created batches of HITs using the Mechanical Turk web interface. Now all we want to do is download the results for a batch using the API, the same way you can download them in the web interface with "Download CSV".

Amazon's documentation says that downloading results through the API is possible, and I would be surprised if it weren't. But after many hours of programming and testing, I have not been able to get the results of a batch.

http://docs.aws.amazon.com/AWSMechTurk/latest/AWSMturkAPI/ApiReference_OperationsArticle.html

Our problem is not getting the HIT data; that is easy with GetHIT. Nor is it getting the assignment data; that is easily done with GetAssignmentsForHIT. Our problem is figuring out the HIT IDs of a batch so that we fetch only the results of that batch.

We thought we would be able to do this with GetHITsForQualificationType, but since we use the same HIT type ID for all batches this isn't possible. The only other operation I can see is SearchHITs, but it only lets you "sort" values, not "filter" by, e.g., batch ID.

If Amazon is an SOA company and they follow the "eat your own dog food" concept, then I wonder how they generate the results in "Download CSV" using their own API?

Any hints would be greatly appreciated. Thank you!

UPDATE #1

I believe you could use SearchHITs to pull out all HITs, then grab the details for each HIT using GetHIT, then filter the HITs by "RequesterAnnotation", which actually contains the batch ID, e.g. "BatchId:1234567;". This might be the only solution. It sounds a bit far-fetched, though.
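The filtering step in this update can be sketched in a few lines of Python. This is only an illustration: it assumes each HIT has already been fetched and is represented as a dict with a "RequesterAnnotation" key, and that the annotation follows the "BatchId:1234567;" format quoted above. The helper names `batch_id_of` and `hits_in_batch` are hypothetical.

```python
import re

# Matches the annotation format quoted above, e.g. "BatchId:1234567;".
BATCH_ID_PATTERN = re.compile(r"BatchId:(\d+);")


def batch_id_of(annotation):
    """Return the batch ID embedded in an annotation string, or None."""
    match = BATCH_ID_PATTERN.search(annotation or "")
    return match.group(1) if match else None


def hits_in_batch(hits, batch_id):
    """Keep only the HIT records whose annotation names the given batch."""
    wanted = str(batch_id)
    return [
        hit for hit in hits
        if batch_id_of(hit.get("RequesterAnnotation")) == wanted
    ]
```

HITs with no annotation, or with an annotation for another batch, are simply dropped.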

The workflow is exactly as you describe in your Update #1:
(1) Use SearchHITs to get all of your HITs.
(2) Get details with GetHIT. (You can actually skip this step, because the RequesterAnnotation field comes with SearchHITs if you include the HITDetail response group.)
(3) Filter the results by the annotation field to get the HITs you want.
(4) Use GetAssignmentsForHIT to retrieve assignments.
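A rough sketch of that workflow in Python follows. It is written against the method names of the boto3 MTurk client, which is an assumption on my part and not something the answer uses: there, `list_hits` and `list_assignments_for_hit` play the roles of SearchHITs and GetAssignmentsForHIT. The client is passed in as a parameter (for example one created with `boto3.client("mturk")`), and `iter_pages` and `assignments_for_batch` are hypothetical helper names.

```python
def iter_pages(operation, result_key, **kwargs):
    """Yield every item from an operation that pages with NextToken."""
    while True:
        page = operation(**kwargs)
        yield from page.get(result_key, [])
        token = page.get("NextToken")
        if not token:
            return
        kwargs["NextToken"] = token


def assignments_for_batch(client, batch_id):
    """Collect the assignments of all HITs annotated with the given batch ID.

    Assumes the annotation format "BatchId:<id>;" described in the question.
    """
    marker = f"BatchId:{batch_id};"
    assignments = []
    # Steps 1-3: list all HITs and keep those whose annotation names the batch.
    for hit in iter_pages(client.list_hits, "HITs", MaxResults=100):
        if marker not in (hit.get("RequesterAnnotation") or ""):
            continue
        # Step 4: fetch the assignments for each matching HIT.
        assignments.extend(
            iter_pages(
                client.list_assignments_for_hit,
                "Assignments",
                HITId=hit["HITId"],
                MaxResults=100,
            )
        )
    return assignments
```

Note that this still lists every HIT on the account and filters client-side, so it does not avoid the cost the question complains about; it only automates it.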

The "batch id" is something that appears to be accessible only to Amazon, for use in the Requester User Interface. (See some discussion on the MTurk Developer Forum.)

And, of course, the API is going to give you results in XML, which you'll need to parse to turn them into a CSV.
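As a minimal sketch of that last step, the following uses only the Python standard library. It assumes each assignment is a dict with "HITId", "AssignmentId", "WorkerId" and an "Answer" key holding a QuestionFormAnswers XML string, where each Answer element has a QuestionIdentifier child plus one or more value children such as FreeText. The function names and the "Answer." column prefix are illustrative choices, not something the API prescribes.

```python
import csv
import io
import xml.etree.ElementTree as ET


def local_name(tag):
    """Drop the '{namespace}' prefix ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def parse_answers(answer_xml):
    """Map each QuestionIdentifier to its answer text (values joined by '|')."""
    answers = {}
    for element in ET.fromstring(answer_xml).iter():
        if local_name(element.tag) != "Answer":
            continue
        question, values = None, []
        for child in element:
            text = (child.text or "").strip()
            if local_name(child.tag) == "QuestionIdentifier":
                question = text
            else:
                values.append(text)
        if question is not None:
            answers[question] = "|".join(values)
    return answers


def assignments_to_csv(assignments):
    """Render assignments as CSV text, one row per assignment."""
    rows = []
    for assignment in assignments:
        row = {key: assignment[key] for key in ("HITId", "AssignmentId", "WorkerId")}
        for question, value in parse_answers(assignment["Answer"]).items():
            row["Answer." + question] = value
        rows.append(row)
    # Build the header from the keys seen, in first-seen order.
    columns = []
    for row in rows:
        for key in row:
            if key not in columns:
                columns.append(key)
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=columns, restval="", lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()
```

Assignments that answered different questions still line up, because missing columns are filled with empty strings.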
