Nutch REST API results (limited)

I've just figured out how to complete a Nutch crawl via the REST API for Nutch 2.3. You can see my post here. After running the crawl, I checked the results in MongoVue, and the "status" and "baseUrl" fields, along with others, are missing. If I run a normal crawl through Cygwin, I get all the fields. Is there some parameter I'm missing from the POST request for the UPDATEDB call?

Here is the last call I make, for UPDATEDB:

{
  "args":{
    "crawlId":"crawl-01",
    "batch":"1428526896161-4430"
  },
  "confId":"default",
  "crawlId":"crawl-01",
  "type":"UPDATEDB"
}

I figured it out. The timestamp used in the GenerateJob step was wrong. It needed to be in a particular format, and my code wasn't supporting it. I found a workaround.
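For anyone hitting the same thing, here is a rough sketch of generating the batch value in one place and reusing it when building the request body. The post does not say what the required format is, so the `<epoch millis>-<random int>` shape below is only an assumption copied from the `batch` value in the request above. The helper names `make_batch_id` and `job_payload` are hypothetical and are not part of Nutch.

```python
import json
import random
import time


def make_batch_id(now_ms=None, rand=None):
    """Return an ID shaped like '<epoch millis>-<random int>'.

    Assumption: this only mirrors the shape of the batch value shown in the
    question; it is not confirmed to be the format Nutch requires.
    """
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    if rand is None:
        rand = random.randint(0, 9999)
    return f"{now_ms}-{rand}"


def job_payload(job_type, crawl_id, batch_id, conf_id="default"):
    """Build a request body with the same fields as the UPDATEDB call above."""
    return {
        "args": {"crawlId": crawl_id, "batch": batch_id},
        "confId": conf_id,
        "crawlId": crawl_id,
        "type": job_type,
    }


if __name__ == "__main__":
    # Generate the ID once and pass the same value to every step that needs it.
    batch_id = make_batch_id()
    print(json.dumps(job_payload("UPDATEDB", "crawl-01", batch_id), indent=2))
```

Keeping the ID generation in a single function means that if the format turns out to be different, only that function has to change.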
