
Apache Solr for indexing translated documents

Original post: 2020-06-27 20:27:43 9 1 solr

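Does Apache Solr allow the following:

When a document translated into French is returned to the user, can Solr also return the original document and the context in which the matched terms are used in the original?

The documents to be indexed are PDF files.

Edit: adding an example

I have an original file doc_eng.pdf and its translation doc_fr.pdf.

When doc_fr.pdf is returned in a query response, I would like to be able to get doc_eng.pdf as well, along with the context (highlighting), if possible.

My ideas

1- Map doc_fr.pdf and doc_eng.pdf to the same id (if this can be done) and add a boolean field isOriginal = true|false.

2- Use nested documents (but I don't understand how this would work with PDF files).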

1 answer

Yes, Solr can do this. I suggest using the Apache Tika mechanism.
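As a rough illustration of the Tika route, the sketch below builds request URLs for Solr's extracting request handler (`/update/extract`), which runs Tika on an uploaded PDF. It is a minimal sketch under several assumptions: the core name `docs` and the field `translation_group` are hypothetical, and `isOriginal` is the boolean field proposed in the question. Because adding two documents with the same unique key makes the second replace the first, each PDF gets its own id here and the pair is linked through the shared group field instead.

```python
from urllib.parse import urlencode


def build_extract_url(base_url, doc_id, group_id, is_original):
    """Build a URL for Solr's extracting request handler.

    `literal.<field>` parameters attach fixed field values to the
    document that Tika extracts from the uploaded file.
    """
    params = {
        "literal.id": doc_id,
        # Hypothetical field linking a translation to its original.
        "literal.translation_group": group_id,
        # Boolean field suggested in the question.
        "literal.isOriginal": "true" if is_original else "false",
        "commit": "true",
    }
    return f"{base_url}/update/extract?{urlencode(params)}"


# Hypothetical core URL; adjust to the actual Solr installation.
BASE = "http://localhost:8983/solr/docs"

url_en = build_extract_url(BASE, "doc_eng.pdf", "doc", True)
url_fr = build_extract_url(BASE, "doc_fr.pdf", "doc", False)

print(url_en)
print(url_fr)
```

The PDF itself would then be sent as the body of a POST to each URL, for example as a multipart file upload.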

During indexing, Solr can identify the language of a document and map its text to language-specific fields using the langid UpdateRequestProcessor.
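To make the field mapping concrete, here is a small Python illustration of the naming convention that langid applies by default when mapping is enabled: a field such as `content` becomes `content_<language>`. This is not Solr code and performs no detection; the detected language is passed in by hand, and the field names `content` and `language_s` are assumptions chosen for the example.

```python
def map_to_language_fields(doc, text_fields, detected_lang, lang_field="language_s"):
    """Mimic the default <field>_<language> renaming done by langid mapping.

    doc           -- dict of field name to value
    text_fields   -- names of the fields that should be renamed
    detected_lang -- language code, supplied by hand in this sketch
    lang_field    -- field that stores the detected language (assumed name)
    """
    mapped = {}
    for name, value in doc.items():
        if name in text_fields:
            mapped[f"{name}_{detected_lang}"] = value
        else:
            mapped[name] = value
    mapped[lang_field] = detected_lang
    return mapped


french_doc = {"id": "doc_fr.pdf", "content": "Bonjour le monde"}
result = map_to_language_fields(french_doc, {"content"}, "fr")
print(result)
```

With this convention the French and English texts end up in separate fields (`content_fr`, `content_en`), so each can be analyzed with a language-appropriate field type.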

Solr supports two implementations of this feature:

Tika's language detection feature

[LangDetect language detection](https://github.com/shuyo/language-detection)

https://lucene.apache.org/solr/guide/7_2/language-analysis.html
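For the retrieval side of the question (getting doc_eng.pdf with highlighted context once doc_fr.pdf has matched), one possible approach is a follow-up query for the original in the same group. The sketch below only builds the query URL. It assumes each PDF was indexed as its own document with a hypothetical `translation_group` field and the `isOriginal` field from the question, and that the English text lives in a field named `content_en`. Highlighting only marks terms that occur in the highlighted field, so the caller has to supply the terms in the original language; they are passed through `hl.q`.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def build_original_query(base_url, group_id, english_terms):
    """Build a /select URL that fetches the original document of a group
    and asks Solr to highlight the given terms in its English text."""
    params = {
        "q": f'translation_group:"{group_id}"',  # hypothetical linking field
        "fq": "isOriginal:true",                 # field from the question
        "hl": "true",
        "hl.fl": "content_en",                   # assumed English text field
        "hl.q": f"content_en:({english_terms})",
        "wt": "json",
    }
    return f"{base_url}/select?{urlencode(params)}"


# Hypothetical core URL and search terms.
query_url = build_original_query(
    "http://localhost:8983/solr/docs", "doc", "invoice payment"
)
parsed = parse_qs(urlsplit(query_url).query)
print(query_url)
```

The application would run the user's search first, read `translation_group` from the French hit, and then issue this second request to obtain the original and its highlighted snippets.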

