简体   繁体   中英

Read documents with Elastic Search

I have a information retrieval assignment where I have to use elasticSearch to generate some indexing/ranking. I was able to download elasticSearch and it's now running on http://localhost:9200/ but how do I read every documents stored in my folder called ' data '?

Elasticsearch is just a search engine. In order to get your docs and files searchable, you need to load them, extract all relevant data and load into elasticsearch.

Apache Tika is a solution for extracting the data out of the files. Write a file system crawler using Tika. Then use the Rest API to index the data.

If you don't want to reinvent the wheel, have a look on the FSCrawler project. Here is a blogpost describing how to solve a task you are facing.

Good luck!

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM