
Can I use Lucene Search for indexing and searching Tamil documents?

I need to search a Tamil document based on certain rules. Will I be able to use Lucene search? Does it support the Tamil language?

While I'm not familiar with Tamil specifically, my understanding is that StandardAnalyzer should support it reasonably well. It is multilingual and implements the word-break rules of UAX #29 (Unicode Text Segmentation), which should give good text segmentation for all Indic languages. The only normalization it applies is lowercasing, which has no effect on Tamil text.
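A quick way to check this claim is to run some Tamil text through StandardAnalyzer and look at the tokens it emits. The following is a minimal sketch, assuming Lucene 5 or later on the classpath (for the no-argument StandardAnalyzer constructor); the class name TamilTokenDemo, the field name "body", and the sample text are illustrative choices, not from the original question.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;

// Hypothetical demo class: prints the tokens StandardAnalyzer produces for Tamil text.
public class TamilTokenDemo {

    // Runs the analyzer over the text and collects each emitted term as a string.
    static List<String> tokenize(Analyzer analyzer, String text) throws IOException {
        List<String> tokens = new ArrayList<>();
        // "body" is an arbitrary field name; StandardAnalyzer ignores it.
        try (TokenStream stream = analyzer.tokenStream("body", text)) {
            CharTermAttribute term = stream.addAttribute(CharTermAttribute.class);
            stream.reset();                      // required before the first incrementToken()
            while (stream.incrementToken()) {
                tokens.add(term.toString());
            }
            stream.end();                        // required after the last token
        }
        return tokens;
    }

    public static void main(String[] args) throws IOException {
        try (Analyzer analyzer = new StandardAnalyzer()) {
            // Two Tamil words separated by a space ("Tamil language").
            System.out.println(tokenize(analyzer, "தமிழ் மொழி"));
        }
    }
}
```

If the tokens come out as whole words with their vowel signs and pulli marks still attached, segmentation is working as expected. Use the same analyzer at index time and at query time so that both sides produce matching terms.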

I'm not aware of any Tamil-specific analysis package that provides stemming and the like, though there may be some useful components in org.apache.lucene.analysis.in.
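One component in that package that may be worth trying is IndicNormalizationFilter, which normalizes alternative Unicode representations of Indic text. Below is a rough sketch of a custom analyzer that chains it after StandardTokenizer. It assumes Lucene 5 or later with the common analyzers module on the classpath; the class name IndicNormalizingAnalyzer is hypothetical, and whether this filter actually improves matching for a given Tamil corpus is something to verify against real data.

```java
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.Tokenizer;
import org.apache.lucene.analysis.in.IndicNormalizationFilter;
import org.apache.lucene.analysis.standard.StandardTokenizer;

// Hypothetical analyzer: UAX #29 tokenization followed by Indic normalization.
public class IndicNormalizingAnalyzer extends Analyzer {

    @Override
    protected TokenStreamComponents createComponents(String fieldName) {
        // Split text into words using the UAX #29 word-break rules.
        Tokenizer source = new StandardTokenizer();
        // Normalize alternative Unicode encodings of Indic characters.
        TokenStream result = new IndicNormalizationFilter(source);
        return new TokenStreamComponents(source, result);
    }
}
```

This analyzer does no stemming and removes no stop words, so it only addresses encoding variation, not inflected word forms.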
