
Regular expression for retrieving tags

In my project I want to retrieve the tags from a web page, and for that I used DOM methods.

But tags can also be created dynamically, like document.write("<a href=\"http://somedomain.com\">");

Here the tags are given as a string, so I am trying to use regular expressions.

I want a regular expression that matches all tags and their attributes, and that can also extract a specific attribute.

It is very hard to understand what you are asking.

First off: never use regex to parse HTML if you have another option. It looks simple, right? It isn't. You'll run into a problem sooner or later.

Second: what David said.

Now here's a regex to match any HTML tag (I have not tested it, so try it out first if you must):

\<[^>]*\>
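As a rough illustration of how that pattern behaves in JavaScript, here is a minimal sketch. The helper name `findTags` and the sample strings are made up for this example, and the second call shows one of the problems the warning above refers to.

```javascript
// Hypothetical helper: return every tag-like substring found in `source`.
function findTags(source) {
  // Same pattern as above; JavaScript does not need < or > escaped.
  const tagPattern = /<[^>]*>/g;
  // String.prototype.match returns null when nothing matches.
  return source.match(tagPattern) || [];
}

// A string like the one passed to document.write in the question, plus extra tags.
const written = '<a href="http://somedomain.com">link</a><br>';
console.log(findTags(written));
// [ '<a href="http://somedomain.com">', '</a>', '<br>' ]

// Limitation: a ">" inside a quoted attribute value ends the match too early.
console.log(findTags('<a title="a>b">'));
// [ '<a title="a>' ]
```

The truncated second result is why a real HTML parser is the safer choice whenever one is available.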

Be warned that it will match a script tag too. (Do not let users write arbitrary tags to your page; whitelist a few if you must, and be prepared for trouble if you don't use a library.)
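If you go the whitelist route, and also want the specific attribute the question asks about, something along these lines could work on a single matched tag. This is only a sketch: `ALLOWED_TAGS`, `tagName`, `isAllowed`, and `getAttribute` are hypothetical names, only quoted attribute values are handled, and checking tag names alone is not a full sanitizer (attributes such as onclick would still get through).

```javascript
// Hypothetical whitelist; pick the tag names your page actually needs.
const ALLOWED_TAGS = new Set(["a", "b", "i", "p"]);

// Return the lowercased name of a tag-like string, or null if there is none.
function tagName(tag) {
  const m = /^<\/?([a-zA-Z][a-zA-Z0-9]*)/.exec(tag);
  return m ? m[1].toLowerCase() : null;
}

// True only when the tag's name is on the whitelist.
function isAllowed(tag) {
  const name = tagName(tag);
  return name !== null && ALLOWED_TAGS.has(name);
}

// Return the value of a quoted attribute, or null if it is absent.
// Assumes attrName contains only letters, digits, and hyphens.
function getAttribute(tag, attrName) {
  const pattern = new RegExp(
    "\\s" + attrName + "\\s*=\\s*(?:\"([^\"]*)\"|'([^']*)')",
    "i"
  );
  const m = pattern.exec(tag);
  if (!m) return null;
  // Group 1 holds a double-quoted value, group 2 a single-quoted one.
  return m[1] !== undefined ? m[1] : m[2];
}

const tag = '<a href="http://somedomain.com">';
console.log(isAllowed(tag));            // true
console.log(getAttribute(tag, "href")); // http://somedomain.com
console.log(isAllowed("<script>"));     // false
```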

Try these out at RegExr, for example (but keep in mind that it uses ActionScript regexes, which may sometimes differ from JavaScript ones; for example, JavaScript supports lookahead, but lookbehind was only added in ES2018).
