CloudSearch deleteByQuery

The official Solr Java API has a deleteByQuery operation that deletes the documents matching a query. The AWS CloudSearch SDK doesn't seem to have matching functionality. Am I just not seeing the deleteByQuery equivalent, or will we need to roll our own?

Something like this:

SearchRequest searchRequest = new SearchRequest();
searchRequest.setQuery(queryString);
searchRequest.setReturn("id,version");
SearchResult searchResult = awsCloudSearch.search(searchRequest);
JSONArray docs = new JSONArray();
for (Hit hit : searchResult.getHits().getHit()) {
    JSONObject doc = new JSONObject();
    doc.put("id", hit.getId());
    // is version necessary?
    doc.put("version", hit.getFields().get("version").get(0));
    doc.put("type", "delete");
    docs.put(doc);
}
UploadDocumentsRequest uploadDocumentsRequest = new UploadDocumentsRequest();
StringInputStream documents = new StringInputStream(docs.toString());
uploadDocumentsRequest.setDocuments(documents);
UploadDocumentsResult uploadResult = awsCloudSearch.uploadDocuments(uploadDocumentsRequest);

Is this reasonable? Is there an easier way?

You're correct that CloudSearch doesn't have an equivalent to deleteByQuery. Your approach looks like the next best thing.

And no, version is not necessary: it was removed in the CloudSearch 2013-01-01 API (aka v2).
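
One caveat with the search-then-delete approach: a single search request returns only one page of hits (10 by default, adjustable with the size parameter), so the loop in the question would delete just the first page. Below is a minimal sketch of collecting every matching id by following CloudSearch's cursor paging, which starts from the cursor value initial. HitPager, Page, and the fetchPage adapter are hypothetical names used for illustration; in real code fetchPage would wrap awsCloudSearch.search(...) and copy the hit ids and the next cursor out of the response.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

final class HitPager {
    /** One page of search results: the ids found plus the cursor for the next page. */
    static final class Page {
        final List<String> ids;
        final String nextCursor;

        Page(List<String> ids, String nextCursor) {
            this.ids = ids;
            this.nextCursor = nextCursor;
        }
    }

    /**
     * Collects the ids from every page of a query.
     * fetchPage is a hypothetical adapter: given a cursor, it runs the search
     * and returns that page's ids together with the cursor for the next page.
     */
    static List<String> collectAllIds(Function<String, Page> fetchPage) {
        List<String> all = new ArrayList<>();
        String cursor = "initial"; // CloudSearch's starting cursor value
        while (true) {
            Page page = fetchPage.apply(cursor);
            if (page.ids.isEmpty()) {
                break; // an empty page means the result set is exhausted
            }
            all.addAll(page.ids);
            cursor = page.nextCursor;
        }
        return all;
    }
}
```

Collecting all ids first and deleting afterwards keeps the paging loop independent of the uploads, at the cost of holding the full id list in memory.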

CloudSearch doesn't provide delete-by-query. It supports deletion in a slightly different way: build a JSON object that contains only the id of the document to delete, with the operation type set to delete. These JSON objects can be batched together, but the batch size has to be less than 5 MB.

The following class supports this functionality; just pass its deleteDocs method the array of ids to be deleted:

class AWS_CS
{
    protected $client;

    function connect($domain)
    {
        try{
            $csClient = CloudSearchClient::factory(array(
                            'key'          => 'YOUR_KEY',
                            'secret'      => 'YOUR_SECRET',
                            'region'     =>  'us-east-1'

                        ));

            $this->client = $csClient->getDomainClient(
                        $domain,
                        array(
                            'credentials' => $csClient->getCredentials(),
                            'scheme' => 'HTTPS'
                        )
                    );
        }
        catch(Exception $ex){
            echo "Exception: ";
            echo $ex->getMessage();
        }
        //$this->client->addSubscriber(LogPlugin::getDebugPlugin());        
    }
    function search($queryStr, $domain){

        $this->connect($domain);

        $result = $this->client->search(array(
            'query' => $queryStr,
            'queryParser' => 'lucene',
            'size' => 100,
            'return' => '_score,_all_fields'
            ))->toArray();

        return json_encode($result['hits']);
        //$hitCount = $result->getPath('hits/found');
        //echo "Number of Hits: {$hitCount}\n";
    }

    // connect($domain) must have been called first so that $this->client is set.
    function deleteDocs($idArray, $operation = 'delete'){

        // Build one operation per id; a delete needs only the type and the id.
        $batch = array();

        foreach($idArray as $id){
            $batch[] = array(
                        'type'        => $operation,
                        'id'        => $id);
        }
        $jsonObj = json_encode($batch, JSON_HEX_TAG | JSON_HEX_APOS | JSON_HEX_QUOT | JSON_HEX_AMP);

        // Keep the response so that its status can be checked below.
        $result = $this->client->uploadDocuments(array(
                        'documents'     => $jsonObj,
                        'contentType'     =>'application/json'
                    ));

        // Return the length of the uploaded batch on success, 0 otherwise.
        return $result['status'] == 'success' ? mb_strlen($jsonObj) : 0;
    }
}
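
The class above sends every id in a single upload, so a long id list could exceed the 5 MB batch limit mentioned earlier. As a rough sketch in Java (the language of the question), the ids could be split into several delete batches that each stay under a byte budget. DeleteBatcher, jsonEscape, and buildDeleteBatches are hypothetical names, not part of any AWS SDK, and the JSON is assembled by hand only to keep the example free of dependencies.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

final class DeleteBatcher {
    /** Escapes quotes, backslashes, and control characters for use inside a JSON string. */
    static String jsonEscape(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            if (c == '"' || c == '\\') {
                sb.append('\\').append(c);
            } else if (c < 0x20) {
                sb.append(String.format("\\u%04x", (int) c));
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    /**
     * Splits ids into JSON arrays of delete operations, each at most maxBytes
     * long when encoded as UTF-8. A single operation larger than maxBytes
     * still gets a batch of its own.
     */
    static List<String> buildDeleteBatches(List<String> ids, int maxBytes) {
        List<String> batches = new ArrayList<>();
        StringBuilder current = new StringBuilder("[");
        int currentBytes = 2; // the enclosing '[' and ']'
        for (String id : ids) {
            String op = "{\"type\":\"delete\",\"id\":\"" + jsonEscape(id) + "\"}";
            int opBytes = op.getBytes(StandardCharsets.UTF_8).length;
            boolean empty = current.length() == 1;
            int added = opBytes + (empty ? 0 : 1); // +1 for the separating comma
            if (!empty && currentBytes + added > maxBytes) {
                // The current batch is full: close it and start a new one.
                batches.add(current.append(']').toString());
                current = new StringBuilder("[");
                currentBytes = 2;
                empty = true;
                added = opBytes;
            }
            if (!empty) {
                current.append(',');
            }
            current.append(op);
            currentBytes += added;
        }
        if (current.length() > 1) {
            batches.add(current.append(']').toString());
        }
        return batches;
    }
}
```

Each returned string could then be uploaded as one document batch, for example with a budget a little under 5 * 1024 * 1024 bytes.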
