Which JBoss and EJB3 features should I use for a web crawler?

Question

Happy New Year, everybody. I am trying to develop my own bot (web crawler) that will walk through the Internet for a search engine. I am thinking of using the JBoss scheduler-service to schedule the bot, and something like this to get the content:

URL u = new URL("http://www.google.kz");
InputStream in = u.openStream();

I want to ask which EJB3 or JBoss features I should use to develop my bot effectively (in the right way). I am new to EJB3 and JBoss.

If you have better ideas, you can write them here. I am developing a search engine to practice my Java skills and for academic purposes; I am not going to compete with Google :)

jboss-5.1.0.GA
XP
EJB3
Eclipse Helios

PS: I haven't decided yet how I will parse HTML. I am thinking about something like this: Parse HTML. What can you recommend?

Answer 1

You don't need EJB or JBoss at all. In fact, I can hardly think of a use for them in a web crawler. The only exception might be if you use JPA to store the crawl results: then you can make use of container-managed transactions and automatic injection of the JPA entity manager. Apart from that, no.
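To illustrate the point, here is a minimal sketch of the fetch-and-schedule part in plain Java SE, with no application server involved. It assumes Java 8 or later. The class name `SimpleCrawler`, the helper names, the UTF-8 charset, and the delay parameter are all placeholders chosen for the example, not anything from the question. `ScheduledExecutorService` from `java.util.concurrent` stands in for the JBoss scheduler-service here.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.URL;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

// Hypothetical class name; a sketch, not a complete crawler.
final class SimpleCrawler {

    // Reads the whole stream into a String. The caller closes the stream.
    static String readAll(InputStream in, Charset charset) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            byte[] chunk = new byte[8192];
            int n;
            while ((n = in.read(chunk)) != -1) {
                buffer.write(chunk, 0, n);
            }
            return new String(buffer.toByteArray(), charset);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Fetches one page with URL.openStream(), as in the question.
    // Assumption: the page is decoded as UTF-8; a real crawler would
    // look at the Content-Type header instead.
    static String fetch(String address) {
        try (InputStream in = new URL(address).openStream()) {
            return readAll(in, StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Schedules a repeated fetch of one address, starting immediately.
    // Exceptions are caught inside the task because an uncaught exception
    // would cancel all further runs of a scheduled task.
    static ScheduledFuture<?> schedule(ScheduledExecutorService scheduler,
                                       String address, long delaySeconds) {
        return scheduler.scheduleWithFixedDelay(() -> {
            try {
                String html = fetch(address);
                System.out.println("Fetched " + html.length() + " characters from " + address);
            } catch (RuntimeException e) {
                System.err.println("Fetch failed for " + address + ": " + e.getMessage());
            }
        }, 0, delaySeconds, TimeUnit.SECONDS);
    }
}
```

You would start it with something like `SimpleCrawler.schedule(Executors.newSingleThreadScheduledExecutor(), "http://www.google.kz", 60)` and call `shutdown()` on the executor when you are done. Storing the results and parsing the HTML are left out on purpose.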

1 answer

solution1
2 ACCEPTED 2011-01-08 23:31:30
