Nutch 1.6 doesn't search new entries in seed.txt

I set up Solr 7.7.1 and Nutch 1.6 and ran a test search. For that I put one URL in seed.txt, and everything worked fine. After this test I removed the old core in Solr, created a new core, put multiple URLs in seed.txt, and started Nutch again for a new crawl. But on every attempt I got the results of the previous test run. How can I remove the previous crawl data and get Nutch to crawl the new URLs I put in seed.txt?

Thanks in advance for your answers.

You should remove the crawl/ directory (or whatever your crawl directory is named). It holds the previously crawled data that Nutch stores locally before sending it to Solr. Most likely the crawl command fetches no new content when you run it, so Nutch just sends the already stored data to Solr again.
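As a rough sketch of that cleanup step, the Python helper below deletes the crawl directory and then builds (but does not run) the command line for a fresh crawl. The function names `reset_crawl_dir` and `build_crawl_command`, the directory names, and the Solr URL are all hypothetical, and the `bin/nutch crawl` arguments assume the one-step crawl command as shipped with Nutch 1.6; adjust them to match your own setup.

```python
import shutil
from pathlib import Path


def reset_crawl_dir(crawl_dir="crawl"):
    """Delete Nutch's local crawl data so the next run starts from seed.txt.

    Returns True if a directory was removed, False if none existed.
    The default name "crawl" is an assumption; use your own -dir value.
    """
    path = Path(crawl_dir)
    if path.is_dir():
        # Removes everything Nutch stored from the earlier crawl.
        shutil.rmtree(path)
        return True
    return False


def build_crawl_command(seed_dir, crawl_dir, solr_url, depth=3, top_n=50):
    """Return the argument list for a fresh one-step crawl.

    Assumes the Nutch 1.6 usage:
    crawl <urlDir> -solr <solrURL> [-dir d] [-depth i] [-topN N]
    The depth and topN defaults here are illustrative only.
    """
    return [
        "bin/nutch", "crawl", seed_dir,
        "-solr", solr_url,
        "-dir", crawl_dir,
        "-depth", str(depth),
        "-topN", str(top_n),
    ]
```

For example, calling `reset_crawl_dir("crawl")` from the Nutch home directory and then passing `build_crawl_command("urls", "crawl", "http://localhost:8983/solr/mycore")` to `subprocess.run` would start a crawl from the current seed list, where `urls` is the folder containing seed.txt and `mycore` is a placeholder core name.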

The technical posts on this site are licensed under CC BY-SA 4.0. If you reprint, please cite the site URL or the original address.

 