Solr indexing parquet file

I have a Solr instance up and running, and it needs to index data stored in Parquet files. Right now I convert the Parquet files to flat text files and have Solr index those. Is it possible for Solr to read the Parquet files directly?

Thanks

Directly: no, not possible.

If you want something more integrated than what you are doing now (converting to text and indexing may already be good enough), there are two options:

  1. Write custom code around DIH (DataImportHandler). You can probably write a specialized DataSource, so that DIH can do the indexing.
  2. Write some Java code using SolrJ that reads your file and indexes its contents into Solr.
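
As a rough illustration of the second option, here is a minimal sketch of the batching and upload step. To stay dependency-free it does not use SolrJ; instead it posts JSON to Solr's `/update` handler with the JDK's built-in HTTP client, which is a swapped-in technique, not what the answer prescribes. Reading the Parquet file is left out: the sketch assumes some Parquet reader (for example parquet-avro's `AvroParquetReader`) hands you one row at a time, already flattened into a `Map<String, Object>`. The class name `ParquetRowsToSolr`, its method names, and the field names in `main` are all hypothetical.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: turns rows (assumed to come from a Parquet reader)
// into Solr JSON update requests and sends them in batches.
class ParquetRowsToSolr {

    // Escape a string so it is safe inside a JSON string literal.
    static String escape(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n");  break;
                case '\r': sb.append("\\r");  break;
                case '\t': sb.append("\\t");  break;
                default:
                    if (c < 0x20) sb.append(String.format("\\u%04x", (int) c));
                    else sb.append(c);
            }
        }
        return sb.toString();
    }

    // Numbers and booleans are written bare; everything else becomes a string.
    static String toJsonValue(Object v) {
        if (v == null) return "null";
        if (v instanceof Number || v instanceof Boolean) return v.toString();
        return "\"" + escape(v.toString()) + "\"";
    }

    // Serialize a batch as a JSON array of documents, the shape /update accepts.
    static String toJsonArray(List<Map<String, Object>> rows) {
        StringBuilder sb = new StringBuilder("[");
        for (int i = 0; i < rows.size(); i++) {
            if (i > 0) sb.append(',');
            sb.append('{');
            boolean first = true;
            for (Map.Entry<String, Object> e : rows.get(i).entrySet()) {
                if (!first) sb.append(',');
                first = false;
                sb.append('"').append(escape(e.getKey())).append("\":")
                  .append(toJsonValue(e.getValue()));
            }
            sb.append('}');
        }
        return sb.append(']').toString();
    }

    // POST one batch to the update URL; fail loudly on a non-200 response.
    static void postBatch(HttpClient client, String updateUrl,
                          List<Map<String, Object>> batch) throws Exception {
        HttpRequest request = HttpRequest.newBuilder(URI.create(updateUrl))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(toJsonArray(batch)))
                .build();
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        if (response.statusCode() != 200) {
            throw new IllegalStateException(
                    "Solr returned " + response.statusCode() + ": " + response.body());
        }
    }

    // Drain the row iterator, sending batchSize rows per request.
    // Returns the number of rows sent.
    static int indexAll(Iterator<Map<String, Object>> rows, String updateUrl,
                        int batchSize) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        List<Map<String, Object>> batch = new ArrayList<>();
        int total = 0;
        while (rows.hasNext()) {
            batch.add(rows.next());
            if (batch.size() == batchSize) {
                postBatch(client, updateUrl, batch);
                total += batch.size();
                batch.clear();
            }
        }
        if (!batch.isEmpty()) {
            postBatch(client, updateUrl, batch);
            total += batch.size();
        }
        return total;
    }

    // Demo without network access: print the payload for one made-up row.
    public static void main(String[] args) {
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("id", "row-1");            // hypothetical field names
        row.put("title", "hello \"solr\"");
        row.put("views", 42);
        System.out.println(toJsonArray(List.of(row)));
    }
}
```

To actually index, you would call something like `indexAll(rows, "http://localhost:8983/solr/mycollection/update?commitWithin=10000", 500)`, where the host, the collection name `mycollection`, and the batch size are placeholders for your own setup. With SolrJ the same loop would build `SolrInputDocument` objects and pass each batch to the client's `add` method instead of serializing JSON by hand.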
