
Text mining using Solr and Hadoop

Original post 2013-10-18 08:10:06 · Tags: hadoop, solr, bigdata, text-mining

I have a Solr database containing about 100 million documents, and I would like to text-mine them.

I'm thinking of writing the text-mining modules in Java and then running the JARs on a Hadoop cluster. (The output of the modules could be stored back in Solr.)

I'm new to Hadoop and Solr, so I would like to know: is this possible? And is there a better way to text-mine the documents?

Any ideas regarding this situation would really help me a lot.

2 answers

Do you need to access the documents frequently?

You can use SolrCloud if you need to access a large document set. Its sharding and replica structure can handle high load.

JSON and XML documents can also be stored in Solr easily.
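As a rough illustration of storing JSON in Solr: Solr's update handler accepts a JSON array of documents posted to a collection's `/update` path with the `application/json` content type. The sketch below only builds such a request body in plain Java. The class name `SolrJsonSketch` and the field names `id` and `sentiment` are made up for this example, and in practice a client library such as SolrJ would normally do this work for you.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.StringJoiner;

/** Illustrative only: builds a JSON array of flat string-valued documents for a Solr update. */
final class SolrJsonSketch {

    /** Escapes the characters that must be escaped inside a JSON string literal. */
    static String escape(String value) {
        StringBuilder out = new StringBuilder();
        for (char c : value.toCharArray()) {
            switch (c) {
                case '"':  out.append("\\\""); break;
                case '\\': out.append("\\\\"); break;
                case '\n': out.append("\\n");  break;
                case '\r': out.append("\\r");  break;
                case '\t': out.append("\\t");  break;
                default:
                    if (c < 0x20) {
                        // Remaining control characters become four-digit hex escapes.
                        out.append(String.format("\\u%04x", (int) c));
                    } else {
                        out.append(c);
                    }
            }
        }
        return out.toString();
    }

    /** Serialises one document (field name to string value) as a JSON object. */
    static String toJsonObject(Map<String, String> doc) {
        StringJoiner fields = new StringJoiner(",", "{", "}");
        for (Map.Entry<String, String> e : doc.entrySet()) {
            fields.add("\"" + escape(e.getKey()) + "\":\"" + escape(e.getValue()) + "\"");
        }
        return fields.toString();
    }

    /** Serialises a batch of documents as the JSON array used as the request body. */
    static String toUpdateBody(List<Map<String, String>> docs) {
        StringJoiner array = new StringJoiner(",", "[", "]");
        for (Map<String, String> doc : docs) {
            array.add(toJsonObject(doc));
        }
        return array.toString();
    }

    public static void main(String[] args) {
        Map<String, String> doc = new LinkedHashMap<>();
        doc.put("id", "doc-1");            // hypothetical unique key
        doc.put("sentiment", "positive");  // hypothetical text-mining result field
        System.out.println(toUpdateBody(List.of(doc)));
    }
}
```

The output of a text-mining job could be written back the same way, one batch of documents per request, assuming the target fields exist in the collection's schema.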

Check the Mahout library before you go with completely custom code; it has a Lucene driver, and it is integrated with Hadoop for most purposes. Mostly, you need term vectors in order to do mining with Mahout. Once you have them, it's a rather seamless setup.
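To make "term vectors" concrete, here is a minimal plain-Java sketch of what such a vector is: a count of each term in a document, projected onto a fixed vocabulary. This is not Mahout's or Lucene's API; the class `TermVectorSketch`, its methods, and the simple tokenisation rule are assumptions made for illustration. In a real setup the vectors would come from the index itself (Solr fields can be declared with `termVectors="true"`) rather than from re-tokenising the text.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

/** Illustrative only: plain-Java term-frequency vectors, not Mahout's or Lucene's API. */
final class TermVectorSketch {

    /** Lowercases the text, splits on runs of non-letter, non-digit characters, and counts each term. */
    static Map<String, Integer> termFrequencies(String text) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{Nd}]+")) {
            if (!token.isEmpty()) {
                counts.merge(token, 1, Integer::sum);
            }
        }
        return counts;
    }

    /** Projects the counts onto a fixed, ordered vocabulary to get a dense vector. */
    static int[] toVector(Map<String, Integer> counts, List<String> vocabulary) {
        int[] vector = new int[vocabulary.size()];
        for (int i = 0; i < vocabulary.size(); i++) {
            vector[i] = counts.getOrDefault(vocabulary.get(i), 0);
        }
        return vector;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = termFrequencies("Solr indexes text; Hadoop processes text.");
        // The vocabulary here is a hypothetical, hand-picked list for the demo.
        List<String> vocabulary = List.of("hadoop", "solr", "text", "mahout");
        System.out.println(counts);
        System.out.println(Arrays.toString(toVector(counts, vocabulary)));
    }
}
```

Vectors of this shape, one per document over a shared vocabulary, are the kind of input that clustering and classification algorithms typically work on.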

Knowledge mining using Hadoop

Data mining library for hadoop

How to search on databases in a hadoop cluster using Solr

Implementing sampling & data mining algorithms in Hadoop

how to load data from hadoop to solr using sqoop?

Integration of Hadoop and Solr

integration of solr on hadoop

Using Hadoop Text Object toString() Method

How to integrate Hadoop, SOLR and Impala?

Sorting a huge text file using hadoop


The technical posts on this site are licensed under CC BY-SA 4.0. If you reprint them, please cite the site URL or the original address. For any questions, please contact: yoyou2525@163.com.


solution1
0 2013-10-18 08:49:05

solution2
0 2013-10-19 17:49:01
