Completely new programmer here, having trouble with regular expressions despite trying various online regex testers. I'm working in Eclipse on an Android project. I'm querying an OpenX ad server for a text ad and getting this in return:
var OX_abced445 = '';
OX_abced445 += "<"+"a href=\'http://the.server.url/openx/www/delivery/ck.php?oaparams=2__bannerid=29__zoneid=3__cb=e3efa8b703__oadest=http%3A%2F%2Fsomesite.com\'target=\'_blank\'>This is some sample text to test with!<"+"/a><"+"div id=\'beacon_e3efa8b703\'style=\'position: absolute; left: 0px; top: 0px; visibility:hidden;\'><"+"img src=\'http://the.server.url/openx/www/delivery/lg.php?bannerid=29&campaignid=23&zoneid=3&loc=1&cb=e3efa8b703\' width=\'0\'height=\'0\' alt=\'\' style=\'width: 0px; height: 0px;\' /><"+"/div>\n";
document.write(OX_abced445);
I need to extract the first href URL but not the img src URL, so I figure I should have a regex that looks for everything between href=\\' and '. I also need to extract the link text, i.e. This is some sample text to test with!, which sits between _blank\\'> and <"+"/a>. I've found plenty of regexes that deal with extracting URLs and such, but I have struggled to get one working in Eclipse for this particular case. Any assistance would be appreciated.
It is a very bad idea to try to parse JavaScript that generates HTML with regex. Use something like JSoup or Validator.nu for Java or Nokogiri for Ruby instead. If you must use a regex:
Plain regex:
^.*? href=\\'([^'\\]+)\\'[^>]*>([^<]*)<
or, in Java:
import java.util.regex.Matcher;
import java.util.regex.Pattern;

Pattern p = Pattern.compile("^.*? href=\\\\'([^'\\\\]+)\\\\'[^>]*>([^<]*)<",
        Pattern.MULTILINE);
Matcher m = p.matcher(hideousString);
if (m.find()) {
    // m.group(1) is the URL and m.group(2) is the link text
}
This captures the href URL in capture group 1 and the link text in capture group 2, but it will break quickly if the site changes its response format.
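For anyone who wants to try this end to end, here is a minimal, self-contained sketch of the regex approach. The class name AdLinkExtractor and the method extract are made up for illustration, and the sample string is a shortened copy of the response from the question. It assumes the raw response really contains backslash-apostrophe pairs around attribute values, and it keeps the backslash out of the URL group so the captured URL does not end in a stray \.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, named here only for illustration.
final class AdLinkExtractor {
    // Unescaped regex: href=\\'([^'\\]+)\\'[^>]*>([^<]*)<
    // Group 1: the URL between the escaped quotes; group 2: the link text.
    private static final Pattern AD_LINK =
            Pattern.compile("href=\\\\'([^'\\\\]+)\\\\'[^>]*>([^<]*)<");

    /** Returns {url, text} for the first href found, or null if nothing matches. */
    static String[] extract(String response) {
        Matcher m = AD_LINK.matcher(response);
        if (!m.find()) {
            return null; // the response format is not what we assumed
        }
        return new String[] { m.group(1), m.group(2) };
    }

    public static void main(String[] args) {
        // Shortened sample; the Java escapes reproduce the raw \' sequences.
        String sample = "OX_abced445 += \"<\"+\"a href=\\'http://the.server.url/openx/www/delivery/ck.php"
                + "?oaparams=2__bannerid=29__zoneid=3__cb=e3efa8b703__oadest=http%3A%2F%2Fsomesite.com\\'"
                + "target=\\'_blank\\'>This is some sample text to test with!<\"+\"/a>\";";

        String[] link = extract(sample);
        if (link == null) {
            System.out.println("No link found");
        } else {
            System.out.println("URL:  " + link[0]);
            System.out.println("Text: " + link[1]);
        }
    }
}
```

Because only the first match is taken and the img tag uses src rather than href, the tracking image URL is never captured. Checking the result of find() before reading the groups avoids an IllegalStateException when the server returns something unexpected.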