
Elasticsearch Bulk File indexing with FSriver

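I am new to Elasticsearch, so everything I know comes from the Elasticsearch site, and I need some help. My task is to index all the files in my application. What is the easiest way to index all of these files into Elasticsearch? Should I send a separate Elasticsearch request for each file by hand? The file details array ($filesdata) looks like this: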

Array
(
    [0] => Array
        (
            [Library] => Array
                (
                    [id] => 6
                    [org_id] => 5
                    [name] => nagesh1.doc
                    [version] => 1
                    [description] => 
                    [type] => txt
                    [category_id] => 
                    [status] => Active
                    [parent_type] => 25
                    [parent_id] => 4
                    [parent_status] => 
                    [assigned_user_id] => 21
                    [assigned_user_group_id] => 
                    [permissions] => 261510
                    [deleted] => 0
                    [created_id] => 21
                    [created] => 2014-02-19 07:08:38
                    [modified_id] => 
                    [modified] => 2014-02-19 07:08:38
                )
            [AssignedUser] => Array
                (
                    [first_name] => abc
                    [middle_name] => 
                    [last_name] => def
                )
            [CreatedBy] => Array
                (
                    [first_name] => abc
                    [middle_name] => 
                    [last_name] => def
                )
            [ModifiedBy] => Array
                (
                    [first_name] => 
                    [middle_name] => 
                    [last_name] => 
                )
            [RelatedTo] => Array
                (
                    [name] => E-Mails
                )
            [PickListValue] => Array
                (
                    [value] => 
                )
            [LibraryVersion] => Array
                (
                    [0] => Array
                        (
                            [version] => 1
                            [file_name] => nagesh1.doc
                            [file_ext] => txt
                            [file_mime_type] => 
                            [file_url] => \files\test\documents\emails\
                            [file_uuid] => 13927937171.txt
                            [library_id] => 6
                        )

                )

        )
    [1] => Array
        (
            [Library] => Array
                (
                    [id] => 7
                    [org_id] => 5
                    [name] => Resume.doc
                    [version] => 1
                    [description] => 
                    [type] => txt
                    [category_id] => 
                    [status] => Active
                    [parent_type] => 25
                    [parent_id] => 4
                    [parent_status] => 
                    [assigned_user_id] => 21
                    [assigned_user_group_id] => 
                    [permissions] => 261510
                    [deleted] => 0
                    [created_id] => 21
                    [created] => 2014-02-19 07:08:38
                    [modified_id] => 
                    [modified] => 2014-02-19 07:08:38
                )
            [AssignedUser] => Array
                (
                    [first_name] => abc
                    [middle_name] => 
                    [last_name] => def
                )
            [CreatedBy] => Array
                (
                    [first_name] => abc
                    [middle_name] => 
                    [last_name] => def
                )
            [ModifiedBy] => Array
                (
                    [first_name] => 
                    [middle_name] => 
                    [last_name] => 
                )

            [RelatedTo] => Array
                (
                    [name] => E-Mails
                )
            [PickListValue] => Array
                (
                    [value] => 
                )
            [LibraryVersion] => Array
                (
                    [0] => Array
                        (
                            [version] => 1
                            [file_name] => Resume.doc
                            [file_ext] => txt
                            [file_mime_type] => 
                            [file_url] => \files\test\documents\emails\
                            [file_uuid] => 13927937172.txt
                            [library_id] => 7
                        )

                )

        )
)
By looping over this array, I am currently indexing each file individually, as shown below:
$post_data = "";
$l = 0; // index into $multiple; it was never initialised in the original snippet
// $librariesdata holds the same structure as the $filesdata array printed above.
foreach ($librariesdata as $libdata) {
    foreach ($libdata['LibraryVersion'] as $librarydata) {
        $filepath = 'http://' . $_SERVER['HTTP_HOST'] . $librarydata['file_url'] . $librarydata['file_uuid'];
        if ($action != 'DELETE') {
            // Fetch the file and base64-encode it so it can travel inside JSON.
            $fileContent = $this->curlFileGetContents($filepath);
            $fileContentencode = base64_encode($fileContent);
            if (!empty($bulk)) {
                // Bulk action line; json_encode takes care of quoting the id.
                $post_data .= json_encode(array('index' => array(
                    '_index' => $this->elastic_index_id,
                    '_type'  => 'library',
                    '_id'    => $librarydata['file_uuid'],
                ))) . "\n";
                unset($type);
            }
            // Copy the fields of interest from the parent Library record.
            $librarydata['parent_type'] = $libdata['Library']['parent_type'];
            $librarydata['parent_id'] = $libdata['Library']['parent_id'];
            $librarydata['assigned_user_id'] = $libdata['Library']['assigned_user_id'];
            $librarydata['modified_id'] = $libdata['Library']['modified_id'];
            $librarydata['AssignedUser'] = $libdata['AssignedUser']['first_name'];
            $librarydata['Created'] = $libdata['Library']['created'];
            $librarydata['CreatedBy'] = $libdata['CreatedBy']['first_name'];
            $librarydata['Modified'] = $libdata['Library']['modified'];
            $librarydata['ModifiedBy'] = $libdata['ModifiedBy']['first_name'];
            $librarydata['RelatedTo'] = $libdata['RelatedTo']['name'];
            $librarydata['file'] = $fileContentencode;
            $multiple[$l] = $librarydata;
            // Source line; every line of a bulk body must end with a newline.
            $post_data .= json_encode($multiple[$l]) . "\n";
        }
        $l++;
    }
} // closes the outer foreach, which was left open in the original snippet
My question is: how can I index these files in bulk using the FSriver plugin in the scenario above?

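Typically, if you are indexing many documents at any given time, it is common practice to use the Bulk API. I have also written up some tips for improving your scripts to increase indexing throughput. Elasticsearch has developed a module that makes communicating with Elasticsearch from PHP easier (linked in the original answer).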
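As a rough illustration of the Bulk API approach, here is a minimal sketch that builds one newline-delimited request body for several files and posts it in a single call. The helper names `buildBulkBody` and `sendBulk`, the index name `files`, and the host `http://localhost:9200` are assumptions made for this example; they do not come from the question. The `_type` field mirrors the question's code and applies only to older Elasticsearch versions.

```php
<?php
// Hypothetical helper: build a bulk body from documents keyed by their id.
function buildBulkBody(array $docs, $index, $type)
{
    $body = '';
    foreach ($docs as $id => $doc) {
        // Action line: says where the following source line should be indexed.
        $body .= json_encode(array('index' => array(
            '_index' => $index,
            '_type'  => $type, // assumption: an Elasticsearch version that still has types
            '_id'    => (string) $id,
        ))) . "\n";
        // Source line: the document itself, on exactly one line.
        $body .= json_encode($doc) . "\n";
    }
    return $body;
}

// Hypothetical helper: POST the body to the _bulk endpoint using the curl extension.
function sendBulk($host, $body)
{
    $ch = curl_init(rtrim($host, '/') . '/_bulk');
    curl_setopt($ch, CURLOPT_POST, true);
    curl_setopt($ch, CURLOPT_POSTFIELDS, $body);
    curl_setopt($ch, CURLOPT_HTTPHEADER, array('Content-Type: application/x-ndjson'));
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    $response = curl_exec($ch);
    curl_close($ch);
    return $response;
}

// Two sample documents shaped like the LibraryVersion entries in the question.
$docs = array(
    '13927937171.txt' => array('file_name' => 'nagesh1.doc', 'file' => base64_encode('first file')),
    '13927937172.txt' => array('file_name' => 'Resume.doc', 'file' => base64_encode('second file')),
);

$body = buildBulkBody($docs, 'files', 'library');
echo $body;

// Uncomment to send everything in one request (assumed local node):
// echo sendBulk('http://localhost:9200', $body);
```

Each document contributes two lines, an action line and a source line, so one HTTP request replaces one request per file. For large files it is probably worth sending the documents in several smaller batches rather than one very large body.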
