I need to extract the main news content from a web page. I searched the internet and found a freely available API named Boilerpipe for that purpose: http://boilerpipe-web.appspot.com/ However, I have not been able to find any Java implementations that use Boilerpipe. Can anyone tell me how to use Boilerpipe in Java to extract news content, or point me to Java implementations that use it to extract content from a news web page?
Maybe my answer is too late, but it's pretty simple.
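import java.net.URL;
import de.l3s.boilerpipe.extractors.ArticleExtractor;

URL url = new URL("http://www.nydailynews.com/sports/baseball");
// getText(URL) throws BoilerpipeProcessingException if the page cannot be fetched or parsed
String content = ArticleExtractor.INSTANCE.getText(url); // this contains the final text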
Simple, huh? The lines above are all you need to extract the article text from that URL.
You can also use my alternative Boilerpipe web API. The service is based on Boilerpipe; I developed it because I kept getting over-quota errors from the original application. You have the option to get the result back as JSON, so you can simply consume it in your application.
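As a rough sketch of what consuming such a JSON result from Java might look like, the class below builds a request URI and fetches the response body with the JDK's built-in HTTP client (Java 11 or later). The endpoint address and the `url` and `output` query parameter names are assumptions made for illustration only, not the documented interface of this or any other service, so adjust them to whatever the service you use actually expects.

```java
import java.io.IOException;
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

public class ExtractionClient {
    // Hypothetical endpoint: replace with the real address of the service you use.
    static final String ENDPOINT = "https://example.com/extract";

    // Builds the request URI. The "url" and "output" parameter names are assumptions.
    static URI buildRequestUri(String endpoint, String pageUrl) {
        // Percent-encode the page address so it can be passed safely as a query value.
        String encoded = URLEncoder.encode(pageUrl, StandardCharsets.UTF_8);
        return URI.create(endpoint + "?url=" + encoded + "&output=json");
    }

    // Sends a GET request and returns the raw response body (expected to be JSON).
    static String fetchJson(URI uri) throws IOException, InterruptedException {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(uri).GET().build();
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        if (response.statusCode() != 200) {
            throw new IOException("Unexpected HTTP status " + response.statusCode());
        }
        return response.body();
    }

    public static void main(String[] args) throws Exception {
        URI uri = buildRequestUri(ENDPOINT, "http://www.nydailynews.com/sports/baseball");
        System.out.println(fetchJson(uri));
    }
}
```

The returned string can then be handed to whichever JSON library your application already uses. Keeping the URI construction in its own method makes it easy to check the encoding without touching the network.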
Best Regards