Set values from src/groovy classes to domain class properties
I'm working with crawler4j using Groovy and Grails.
I have a BasicCrawler.groovy class in src/groovy, a domain class Crawler.groovy, and a controller called CrawlerController.groovy.
BasicCrawler.groovy has a few properties such as url, parentUrl, domain, etc.
I want to persist these values to the database by passing them to the domain class while the crawl is running.
I tried doing this in my BasicCrawler class under src/groovy:
class BasicCrawler extends WebCrawler {
    Crawler obj = new Crawler()
    // crawling code

    @Override
    void visit(Page page) {
        // crawling code
        obj.url = page.getWebURL().getURL()
        obj.parentUrl = page.getWebURL().getParentUrl()
    }

    @Override
    protected void handlePageStatusCode(WebURL webUrl, int statusCode, String statusDescription) {
        // crawling code
        obj.httpstatus = "not found"
    }
}
And my domain class is as follows:
class Crawler extends BasicCrawler {
    String url
    String parentUrl
    String httpstatus
    static constraints = {}
}
But I got the following error:
ERROR crawler.WebCrawler - Exception while running the visit method. Message: 'No such property: url for class: mypackage.BasicCrawler
Possible solutions: obj' at org.codehaus.groovy.runtime.ScriptBytecodeAdapter.unwrap(ScriptBytecodeAdapter.java:50)
After this I tried another approach. In my src/groovy/BasicCrawler.groovy class, I declared the url and parentUrl properties at the top and then used data binding (I might be wrong, since I am just a beginner):
class BasicCrawler extends WebCrawler {
    String url
    String parentUrl

    @Override
    boolean shouldVisit(WebURL url) {
        // code
    }

    @Override
    void visit(Page page) {
        // code
    }

    @Override
    protected void handlePageStatusCode(WebURL webUrl, int statusCode, String statusDescription) {
        // code
    }

    def bindingMap = [url: url , parentUrl: parentUrl]
    def Crawler = new Crawler(bindingMap)
}
And my Crawler.groovy domain class is as follows:
class Crawler {
    String url
    String parentUrl
    static constraints = {}
}
Now it doesn't show any error, but the values are not being persisted to the database. I am using MongoDB as the backend.
I think this example is a bit contrived, but here is one way you might solve the problem in your current situation:
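import edu.uci.ics.crawler4j.crawler.Page
import edu.uci.ics.crawler4j.crawler.WebCrawler
import edu.uci.ics.crawler4j.url.WebURL

class BasicCrawler extends WebCrawler {
    @Override
    void visit(Page page) {
        // Local variable, not a field: each visited page gets its own record
        String pageUrl = page.getWebURL().getURL()
        // Reuse the record if handlePageStatusCode already created one for this URL
        Crawler obj = Crawler.findByUrl(pageUrl) ?: new Crawler(url: pageUrl)
        obj.parentUrl = page.getWebURL().getParentUrl()
        obj.save()
    }

    @Override
    protected void handlePageStatusCode(WebURL webUrl, int statusCode, String statusDescription) {
        // findByUrl expects the String stored in Crawler.url, not the WebURL object
        String pageUrl = webUrl.getURL()
        Crawler obj = Crawler.findByUrl(pageUrl) ?: new Crawler(url: pageUrl)
        obj.httpstatus = "not found" // as in the question; statusDescription holds the actual status text
        obj.save()
    }
}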
The key is to avoid a member instance variable and instead use the URL to re-fetch and update the record for the originally visited site; I'm assuming url will carry a unique constraint on each row.
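For the lookup by URL to be reliable, the domain class needs to allow records that are only partly filled in. The following is a minimal sketch of what that could look like; the constraint choices are my assumptions and not part of the original post, and whether the unique constraint is enforced may depend on the GORM implementation in use (here MongoDB).

```groovy
// grails-app/domain/.../Crawler.groovy
// Sketch only: the constraints below are assumptions for illustration.
// Note the domain class does not extend BasicCrawler.
class Crawler {
    String url
    String parentUrl
    String httpstatus

    static constraints = {
        url unique: true           // assumed: one record per crawled URL
        parentUrl nullable: true   // a page may have no parent URL
        httpstatus nullable: true  // not known until a status has been recorded
    }
}
```

Grails domain properties are non-nullable by default, and save() returns null on a validation failure instead of throwing. If nothing is being persisted and no error appears, calling obj.save(failOnError: true) is a quick way to surface the validation problem.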