
JavaScript RegExp: get the total character count of all HTML tags

I am trying to get the total length of all HTML tag characters in a string, counting both opening ( <tag> ) and closing ( </tag> ) tags, including any attributes.

Consider the following HTML:

<div>
    <a href="#">link</a>
    <span>some text</span>
</div>

The tag character count should be 40, because it counts only <div><a href="#"></a><span></span></div> and ignores the text content and whitespace.

This is the working RegExp (on gskinner.com).

But when I use it in JavaScript, there is an error (see jsfiddle).

The reason for the error is that your regex includes a positive lookbehind (?<=\\s) , a feature that JavaScript regular expressions did not provide when this was written (see Mimicking Lookbehinds in Javascript ). Lookbehind assertions were only added to the language in ES2018, so older engines still reject them. More precisely, the error is raised because in those engines a ? directly after an unescaped ( is only valid when followed by ! , = or : , and here it is followed by < .

The working example you linked to is a Flex application written in ActionScript 3, whose regex engine does support positive lookbehinds.
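Whether a given engine accepts lookbehind can be checked at runtime. The sketch below is only an illustration: supportsLookbehind is a hypothetical helper name, and it relies on the RegExp constructor throwing a SyntaxError for an unsupported pattern. The constructor is used instead of a regex literal because an invalid literal would stop the whole script from parsing.

```javascript
// Hypothetical helper (not part of the question): reports whether this
// JavaScript engine can compile a positive lookbehind such as (?<=\s).
function supportsLookbehind() {
  try {
    // Built from a string so an unsupported pattern fails here, at runtime,
    // instead of breaking the parse of the entire script.
    new RegExp("(?<=\\s)x");
    return true;
  } catch (err) {
    // Engines without lookbehind support throw a SyntaxError.
    return false;
  }
}

console.log("lookbehind supported:", supportsLookbehind());
```

The result depends on the engine: ES2018-capable engines accept the pattern, older ones do not.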

You also need to add the g flag to the end of your regex literal so that match returns an array of all the matches; you can then sum their lengths.
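The match-and-sum step can be sketched as follows. Note that tagPattern is a deliberately simplified tag regex assumed for this illustration, not the regex from the question, and it will not handle every edge case (comments, CDATA, and so on).

```javascript
// The sample HTML from the question.
const html = `<div>
    <a href="#">link</a>
    <span>some text</span>
</div>`;

// Simplified, assumed pattern: "<", optional "/", a tag name, then any mix of
// quoted attribute values or other non-">" characters, then ">".
// The g flag makes match() return every match instead of only the first.
const tagPattern = /<\/?[A-Za-z_:][\w:.-]*(?:"[^"]*"|'[^']*'|[^>"'])*>/g;

// match() returns null when nothing matches, so fall back to an empty array.
const tags = html.match(tagPattern) || [];

// Sum the length of every matched tag.
const total = tags.reduce((sum, tag) => sum + tag.length, 0);

console.log(tags);  // the six tags: <div>, <a href="#">, </a>, <span>, </span>, </div>
console.log(total); // 40
```

Without the g flag, match would return only the first match (plus capture groups), so the sum would cover just the first tag.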

Here is a working example with the positive lookbehind removed and the g flag added: jsfiddle .

It shows a length of 163 which looks about right, but I'll leave the counting to you.
You may need to add something in place of the lookbehind or otherwise edit the regex - I'll also leave you to work that out.

There is a syntax error.

You have to escape the forward slashes ( / ) in your pattern, because / is also the delimiter of a regex literal.

/(<(?:[A-Za-z_:][\w:.-]*(?=\s)(?!(?:[^>"\']|"[^"]*"|\'[^\']*\')*?(?<=\s)\s*=)(?!\s*\/?>)\s+(?:".*?"|\'.*?\'|[^>]*?)+|\/?[A-Za-z_:][\w:.-]*\s*\/?)>)/
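A small illustration of the escaping rule, using a short pattern chosen for this example rather than the full regex above: inside a regex literal the / must be written as \/ , while a pattern passed to the RegExp constructor as a string has no delimiter and needs no such escape.

```javascript
const sample = "<div>text</div>";

// Regex literal: the slash in "</div>" must be escaped, otherwise the
// literal would end right after "<".
const closingDivLiteral = /<\/div>/;

// RegExp constructor: the pattern is a plain string, so "/" is not special.
const closingDivCtor = new RegExp("</div>");

console.log(closingDivLiteral.test(sample)); // true
console.log(closingDivCtor.test(sample));    // true
```

With the constructor, backslashes in the string have to be doubled instead (for example "\\s" for \s ), so each form has its own escaping cost.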
