Which JBoss and EJB3 features should I use for a web crawler?

Question

Happy New Year, everybody. I am trying to develop my own bot (web crawler) that will walk the Internet for a search engine. I am thinking of using the JBoss scheduler-service to schedule the bot, and something like this to get content:

URL u = new URL("http://www.google.kz");
InputStream in = u.openStream();
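
A slightly fuller sketch of the fetch step above, for illustration only. It assumes UTF-8 pages and five-second timeouts (both assumptions, not from the original post), and `PageFetcher`, `readAll`, and `fetch` are hypothetical names. The main points are that the stream gets closed and that a slow host cannot hang the bot forever:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of JBoss or EJB3.
final class PageFetcher {

    // Drains the stream into a String, decoding as UTF-8 (an assumption;
    // a real crawler would look at the Content-Type header instead).
    static String readAll(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[8192];
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    // Opens the URL with timeouts and always closes the stream.
    static String fetch(String address) throws IOException {
        URLConnection conn = new URL(address).openConnection();
        conn.setConnectTimeout(5000); // milliseconds, arbitrary choice
        conn.setReadTimeout(5000);    // milliseconds, arbitrary choice
        try (InputStream in = conn.getInputStream()) {
            return readAll(in);
        }
    }

    public static void main(String[] args) throws IOException {
        String address = args.length > 0 ? args[0] : "http://www.google.kz";
        String html = fetch(address);
        System.out.println(html.length() + " characters fetched from " + address);
    }
}
```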

Which EJB3 or JBoss features should I use to develop my bot effectively (in the right way)? I am new to EJB3 and JBoss.

If you have better ideas, you can write them here. I am developing a search engine to practice my Java skills and for academic purposes; I am not going to compete with Google :)

jboss-5.1.0.GA
Windows XP
EJB3
Eclipse Helios

PS: I haven't decided yet how I will parse HTML; I am thinking about something like Parse HTML. What can you recommend?

Answer 1

You don't need EJB or JBoss at all. In fact, I can hardly think of a use for them in a web crawler. The only exception is perhaps if you use JPA to store the crawl results: then you can make use of container-managed transactions and automatic injection of the JPA entity manager. Apart from that, no.
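
To illustrate the point, here is a minimal sketch of scheduling a crawl with nothing but the JDK, assuming that periodic execution is all that is wanted from the JBoss scheduler-service. `CrawlScheduler` and the crawl task are hypothetical names; `ScheduledExecutorService` is the standard `java.util.concurrent` API.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

// Hypothetical wrapper: runs a crawl pass repeatedly without any container.
final class CrawlScheduler {
    private final ScheduledExecutorService executor =
            Executors.newSingleThreadScheduledExecutor();

    // Starts immediately, then waits `delay` between the end of one pass
    // and the start of the next.
    ScheduledFuture<?> start(Runnable crawlTask, long delay, TimeUnit unit) {
        Runnable guarded = () -> {
            try {
                crawlTask.run();
            } catch (RuntimeException e) {
                // An uncaught exception would cancel all future runs,
                // so log it and let the next pass proceed.
                System.err.println("Crawl pass failed: " + e);
            }
        };
        return executor.scheduleWithFixedDelay(guarded, 0, delay, unit);
    }

    // Interrupts the running pass (if any) and stops scheduling new ones.
    void stop() {
        executor.shutdownNow();
    }

    public static void main(String[] args) throws InterruptedException {
        CrawlScheduler scheduler = new CrawlScheduler();
        // Placeholder task; a real bot would fetch and parse pages here.
        scheduler.start(
                () -> System.out.println("crawl pass at " + System.currentTimeMillis()),
                1, TimeUnit.SECONDS);
        Thread.sleep(3500);
        scheduler.stop();
    }
}
```

A fixed delay rather than a fixed rate is used here so that a slow pass never overlaps with the next one; that is a design choice of this sketch, not something the answer prescribes.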

Answer 1: score 2, accepted, 2011-01-08 23:31:30
