
Amazon CloudSearch throws HTTP 403 on document upload

I am trying to integrate Amazon CloudSearch into SilverStripe. When a page is published, I want a cURL request to send the page's data as a JSON string to the search domain.

I am using http://docs.aws.amazon.com/cloudsearch/latest/developerguide/uploading-data.html#uploading-data-api as a reference.

Every time I try to upload, it returns a 403. I have also allowed my IP address in the access policies for the search domain.

I am using this as a code reference: https://github.com/markwilson/AwsCloudSearchPhp

I think the problem is that the request does not authenticate correctly with AWS. How do I authenticate it correctly?

If you are getting the following error:

403 Forbidden, Request forbidden by administrative rules.

and you are sure you have appropriate access rules in effect, I would check the API URL you are using. Make sure you are using the correct endpoint. For a batch upload, the API endpoint should look like this:

your-search-doc-endpoint/2013-01-01/documents/batch

Notice 2013-01-01: it is a required part of the URL and is the API version you will be using. You cannot do the following, even though it might seem to make sense:

your-search-doc-endpoint/documents/batch <- Won't work
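As a minimal sketch of the point above, the helper below appends the versioned batch path to a document endpoint. The function name `cloudsearch_batch_url` is hypothetical, and defaulting to `https://` when no scheme is given is an assumption, not something stated in this thread.

```php
<?php
// Hypothetical helper (not part of any AWS library): builds the batch
// upload URL from a CloudSearch document endpoint.
function cloudsearch_batch_url(string $docEndpoint, string $apiVersion = '2013-01-01'): string
{
    // Accept the endpoint with or without a trailing slash.
    $endpoint = rtrim(trim($docEndpoint), '/');

    // Assumption: default to HTTPS when the caller passes a bare host name.
    if (!preg_match('#^https?://#i', $endpoint)) {
        $endpoint = 'https://' . $endpoint;
    }

    // The API version segment is the part that is easy to forget.
    return $endpoint . '/' . $apiVersion . '/documents/batch';
}

echo cloudsearch_batch_url('doc-YOURDOMAIN-RANDOMID.REGION.cloudsearch.amazonaws.com'), "\n";
```

Building the URL in one place makes it harder to post to the bare endpoint by accident, which is the case that produces the 403 described here.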

To search, you would need to hit the following API:

your-search-endpoint/2013-01-01/search?your-search-params
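The search URL can be assembled the same way. This is only a sketch: `cloudsearch_search_url` is a hypothetical name, the `search-...` host is a placeholder, and the parameters shown (`q` and `size`) are just examples of search parameters.

```php
<?php
// Hypothetical helper: builds a search URL for the 2013-01-01 API.
function cloudsearch_search_url(string $searchEndpoint, array $params): string
{
    $endpoint = rtrim(trim($searchEndpoint), '/');

    // Assumption: default to HTTPS when the caller passes a bare host name.
    if (!preg_match('#^https?://#i', $endpoint)) {
        $endpoint = 'https://' . $endpoint;
    }

    // http_build_query() URL-encodes the keys and values for us.
    return $endpoint . '/2013-01-01/search?' . http_build_query($params, '', '&');
}

echo cloudsearch_search_url(
    'search-YOURDOMAIN-RANDOMID.REGION.cloudsearch.amazonaws.com',
    array('q' => 'silverstripe cms', 'size' => 10)
), "\n";
```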

To diagnose whether it's an access policy issue, have you tried a policy that allows all access to the upload? Something like the following opens it up to everything:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "",
      "Effect": "Allow",
      "Principal": {
        "AWS": "*"
      },
      "Action": "cloudsearch:*"
    }
  ]
}

I noticed that if you just go to the document upload endpoint in a browser (mine looks like "doc-YOURDOMAIN-RANDOMID.REGION.cloudsearch.amazonaws.com") you'll get the 403 "Request forbidden by administrative rules" error, even with open access. So, as @dminer said, you'll need to make sure you're posting to the correct full URL.

Have you considered using a PHP SDK, such as http://docs.aws.amazon.com/aws-sdk-php/guide/latest/service-cloudsearchdomain.html ? It should take care of making correct requests, in which case you could rule out transport errors.

After a lot of searching and trial and error, I was able to put together a small code block, from bits of code found in various places, that uploads a "file" to AWS CloudSearch using cURL and PHP.

The single most important thing is to make sure that your data is prepared correctly to be sent in JSON format.

Note: For CloudSearch you're not uploading a file, you're posting a stream of JSON data. That is why many of us have a problem uploading the data.
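To illustrate what "a stream of JSON data" can look like, here is a minimal sketch that builds the batch body in memory. The function name `build_batch_json` is hypothetical, and the row properties (`id`, `title`, `content`) are assumed to match the ones used in this answer's own code.

```php
<?php
// Hypothetical helper: turns database rows into a CloudSearch batch JSON string.
function build_batch_json(array $rows): string
{
    $batch = array();
    foreach ($rows as $row) {
        // One "add" operation per row.
        $batch[] = array(
            'type'   => 'add',
            'id'     => (string) $row->id,
            'fields' => array(
                'title'   => $row->title,
                'content' => $row->content,
            ),
        );
    }

    $json = json_encode($batch);
    if ($json === false) {
        // Fail loudly instead of posting an empty or broken body.
        throw new RuntimeException('JSON encoding failed: ' . json_last_error_msg());
    }
    return $json;
}

// Example row, shaped like a database result object.
$rows = array((object) array('id' => 1, 'title' => 'Home', 'content' => 'Welcome'));
echo build_batch_json($rows), "\n";
```

The resulting string is what goes into the POST body; whether it passes through a file first is a separate choice.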

In my case I wanted to upload the data for my search engine to CloudSearch. It seems simple, and it is, but there is a lack of example code for it. Most people tell you to go to the documentation, which usually has examples, but only for the AWS CLI. The PHP SDK comes with a learning curve, and instead of keeping things simple you end up doing 20 steps to complete one task. On top of that, you are required to pull in other libraries that are just wrappers for native PHP functions, which sometimes makes things more complicated rather than simpler.

So, back to how I did it. First I pull the data from my database as an array and serialize it to save it to a file.

$rows = $database_data;
$data2 = array();

foreach ($rows as $key => $row) {
  // Start from a fresh array so fields never leak between documents
  $data = array();
  $data['type'] = 'add';
  $data['id'] = $row->id;
  $data['fields']['title'] = $row->title;
  $data['fields']['content'] = $row->content;
  $data2[] = $data;
}

// now save your data to a file and make sure
// to serialize() it
$fp = fopen($path_to_file, $mode); // $mode must allow writing, e.g. 'w'
flock($fp, LOCK_EX);
fwrite($fp, serialize($data2));
flock($fp, LOCK_UN);
fclose($fp);

Now that you have your data saved, we can work with it:

// The URL must include the API version path,
// i.e. it should end in /2013-01-01/documents/batch
$aws_doc_endpoint = '{Your AWS CloudSearch Document Endpoint URL}';

// Let's read the data
$data = file_get_contents($path_to_file);
// Now let's unserialize() it and encode it in JSON format
$data = json_encode(unserialize($data));

// finally let's use cURL
$ch   = curl_init($aws_doc_endpoint);

curl_setopt($ch, CURLOPT_CUSTOMREQUEST, "POST");
// Set both headers in one call; a second CURLOPT_HTTPHEADER call would replace the first
curl_setopt($ch, CURLOPT_HTTPHEADER, array(
    'Content-Type: application/json',
    'Content-Length: ' . strlen($data),
));
curl_setopt($ch, CURLOPT_POSTFIELDS, $data);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE);

$response = curl_exec($ch);
curl_close($ch);

// json_decode() returns NULL when the body is not JSON (e.g. an HTML 403 page)
$response = json_decode($response);

if (is_object($response) && isset($response->status) && $response->status == 'success')
{
    return TRUE;
}
return FALSE;

And like I said, there is nothing to it. Most answers I came across say to use Guzzle because it's really easy. Well, yes, it is, but for a simple task like this you don't need it.

Aside from that, if you still get an error, make sure to check the following:

Make sure your JSON data is well formatted. Make sure you have access to the endpoint.
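One way to tell these two cases apart is to look at the HTTP status code and the response body together. The sketch below is an assumption-laden illustration: `describe_upload_result` is a hypothetical name, and it relies only on the `status` field that the code in this answer already checks.

```php
<?php
// Hypothetical helper: classifies an upload response from its HTTP status and body.
function describe_upload_result(int $httpCode, string $body): string
{
    if ($httpCode === 403) {
        // Access problem, or a request to the wrong URL.
        return 'forbidden: check the access policy and the full /2013-01-01/documents/batch URL';
    }

    $decoded = json_decode($body);
    if (!is_object($decoded) || !isset($decoded->status)) {
        // The body is not the JSON object we expected (for example an HTML error page).
        return 'unexpected response (HTTP ' . $httpCode . '): body is not the expected JSON';
    }

    return $decoded->status === 'success' ? 'success' : 'error reported by CloudSearch';
}

echo describe_upload_result(403, '<html>Request forbidden by administrative rules.</html>'), "\n";
echo describe_upload_result(200, '{"status":"success"}'), "\n";
```

With cURL, the status code can be read with `curl_getinfo($ch, CURLINFO_HTTP_CODE)` after `curl_exec()` and before `curl_close()`.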

Well, I hope someone finds this code helpful.

This never worked for me. I used the CloudSearch terminal to upload files, and PHP cURL to search them.

Try adding "cloudsearch:document" under "Action" in the CloudSearch access policy.
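As a rough sketch of that suggestion combined with the IP-based access mentioned in the question, the snippet below generates a policy that allows only the "cloudsearch:document" action from one address range. The helper name `build_upload_policy`, the `Sid` value, and the example address are all placeholders, and the `IpAddress` / `aws:SourceIp` condition is the standard IAM form rather than something quoted from this thread.

```php
<?php
// Hypothetical helper: builds an access policy limited to document uploads
// from a single IP range (CIDR notation).
function build_upload_policy(string $cidr): string
{
    $policy = array(
        'Version'   => '2012-10-17',
        'Statement' => array(
            array(
                'Sid'       => 'upload_only', // placeholder statement ID
                'Effect'    => 'Allow',
                'Principal' => array('AWS' => '*'),
                'Action'    => 'cloudsearch:document',
                'Condition' => array(
                    'IpAddress' => array('aws:SourceIp' => $cidr),
                ),
            ),
        ),
    );

    return json_encode($policy, JSON_PRETTY_PRINT | JSON_UNESCAPED_SLASHES);
}

// 192.0.2.10/32 is a documentation-only example address.
echo build_upload_policy('192.0.2.10/32'), "\n";
```

The printed JSON would be pasted into the domain's access policy; the server's public IP has to be the one the requests actually leave from.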
