Set values from src/groovy classes to domain class properties

I'm working with crawler4j using Groovy and Grails.

I have a BasicCrawler.groovy class in src/groovy, a domain class Crawler.groovy, and a controller called CrawlerController.groovy.

I have a few properties in the BasicCrawler.groovy class, such as url, parentUrl, domain, etc.

I want to persist these values to the database by passing them to the domain class while crawling is in progress.

I tried doing this in my BasicCrawler class under src/groovy:

class BasicCrawler extends WebCrawler {
   Crawler obj = new Crawler()
   //crawling code 
   @Override
   void visit(Page page) {
      //crawling code
      obj.url = page.getWebURL().getURL()
      obj.parentUrl = page.getWebURL().getParentUrl()
   }

   @Override
   protected void handlePageStatusCode(WebURL webUrl, int statusCode, String   statusDescription) {
      //crawling code
      obj.httpstatus = "not found"
   }
}

And my domain class is as follows:

class Crawler extends BasicCrawler {
   String url
   String parentUrl
   String httpstatus
   static constraints = {}
}

But I got the following error:

ERROR crawler.WebCrawler  - Exception while running the visit method. Message: 'No such property: url for class: mypackage.BasicCrawler
Possible solutions: obj' at org.codehaus.groovy.runtime.ScriptBytecodeAdapter.unwrap(ScriptBytecodeAdapter.java:50)

After this I tried another approach. In my src/groovy/BasicCrawler.groovy class, I declared the url and parentUrl properties at the top and then used data binding (I might be wrong, since I am just a beginner):

class BasicCrawler extends WebCrawler {
   String url
   String parentUrl

   @Override
   boolean shouldVisit(WebURL url) { //code
   }

   @Override
   void visit(Page page) { //code
   }

   @Override
   protected void handlePageStatusCode(WebURL webUrl, int statusCode, String statusDescription) {
      //code}
   }
   def bindingMap = [url: url , parentUrl: parentUrl]
   def Crawler = new Crawler(bindingMap)
}

And my Crawler.groovy domain class is as follows:

class Crawler {
   String url
   String parentUrl
   static constraints = {}
}

Now it doesn't show any error, but the values are not being persisted in the database. I am using MongoDB for the backend.

I think this example is a bit contrived, but here is one way you might solve the problem in the current situation:

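import edu.uci.ics.crawler4j.crawler.Page
import edu.uci.ics.crawler4j.crawler.WebCrawler
import edu.uci.ics.crawler4j.url.WebURL

class BasicCrawler extends WebCrawler {
   @Override
   void visit(Page page) {
      // Create the record locally instead of sharing one instance across pages
      Crawler obj = new Crawler()
      obj.url = page.getWebURL().getURL()
      obj.parentUrl = page.getWebURL().getParentUrl()
      obj.save()
   }

   @Override
   protected void handlePageStatusCode(WebURL webUrl, int statusCode, String statusDescription) {
      // Look the record up by its URL string; findByUrl returns null when no matching record exists
      Crawler obj = Crawler.findByUrl(webUrl.getURL())
      if (obj) {
         obj.httpstatus = "not found"
         obj.save()
      }
   }
}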

The key here is to avoid a member instance variable and instead use the URL to re-fetch and update the record for the originally visited site, since I'm assuming the URL will have a unique constraint on each row.
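To illustrate the "no shared member variable, look the row up by URL" idea in isolation, here is a minimal sketch that runs as a plain Groovy script without Grails. CrawlRecord, CrawlStore and findOrCreateByUrl are hypothetical stand-ins made up for this example (they are not GORM or crawler4j classes); the in-memory map only mimics a table with a unique constraint on url. In a real domain class, the rough analogue of the lookup would be something like `Crawler.findByUrl(url) ?: new Crawler(url: url)` followed by `save()`.

```groovy
import java.util.concurrent.ConcurrentHashMap

// Hypothetical stand-in for the Crawler domain class
class CrawlRecord {
    String url
    String parentUrl
    String httpstatus
}

// Hypothetical in-memory store; keyed by URL to mimic the assumed unique constraint
class CrawlStore {
    private final Map<String, CrawlRecord> rows = new ConcurrentHashMap<>()

    // Return the existing record for this URL, or create one if none exists yet
    CrawlRecord findOrCreateByUrl(String url) {
        rows.computeIfAbsent(url) { String key -> new CrawlRecord(url: key) }
    }

    int count() { rows.size() }
}

def store = new CrawlStore()

// Analogue of handlePageStatusCode: works whether or not the page was visited first
def onStatus = { String url, int statusCode ->
    CrawlRecord row = store.findOrCreateByUrl(url)
    row.httpstatus = statusCode as String
}

// Analogue of visit: each call fetches its own record, nothing is shared between pages
def onVisit = { String url, String parentUrl ->
    CrawlRecord row = store.findOrCreateByUrl(url)
    row.parentUrl = parentUrl
}

onStatus('http://example.com/a', 200)
onVisit('http://example.com/a', 'http://example.com/')
onStatus('http://example.com/missing', 404)

println "records stored: ${store.count()}"
```

Because every callback resolves its record from the URL, the order in which the crawler invokes the callbacks does not matter, and two pages can never overwrite each other's fields the way they could with a single shared `obj`.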
