
Get caret [start, end] position in HTML source code for an element that was clicked

This is quite a challenging problem. I haven't seen it solved anywhere on Stack Overflow. So I decided to post it.

0      ----17----+           +---30---
|                |           |                +----47
|                |           |                |
<div>ABC<b>B Elem<i>Italic</i>ent</b> DEF</div>
        |                           |
        +---8---             ---37--+

Action: Let's say the <i> element is clicked.

Problem: Create a function that returns the coordinates [17,30].

Note: The coordinates are the start and end caret positions, given as 0-based indices into the original HTML source code, encompassing only the element that was clicked. You may assume normalized HTML, as in id = "" becoming id="". (Extra credit if your solution does not rely on that.)

Example 2: If the <b> tag was clicked, the script should return [8, 37], because those are the start and end caret positions encompassing the <b> element.

Example 3: If the ABC text or the DEF text was clicked, the return value is [0,47].
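
To make the coordinate convention concrete, here is a minimal sketch that checks the three examples with plain string operations. It assumes the source string from the diagram and an end index that points just past the closing tag, which is what [0,47] on a 47-character string implies. The helper name rangeOf is made up for this illustration and only handles the first occurrence of an attribute-less tag, so it is not a general solution.

```javascript
// Source string from the diagram above.
const source = '<div>ABC<b>B Elem<i>Italic</i>ent</b> DEF</div>';

// Hypothetical helper: [start, end) range of the first <name>...</name> pair.
// Only meant for this flat example: no attributes, no nested tags of the same name.
function rangeOf(html, name) {
  const open = '<' + name + '>';
  const close = '</' + name + '>';
  const start = html.indexOf(open);
  const end = html.indexOf(close, start) + close.length;
  return [start, end];
}

console.log(rangeOf(source, 'i'));   // start 17, end 30
console.log(rangeOf(source, 'b'));   // start 8, end 37
console.log(rangeOf(source, 'div')); // start 0, end 47
console.log(source.slice(17, 30));   // <i>Italic</i>
```

Because the end index is exclusive, each pair can be passed straight to slice to recover the element's markup.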

Walk the parent chain until you hit whatever tag you consider to be a container (<div> in your case, apparently).

Use the parent's children to locate the particular child you're coming from, in case you have two or more identical children, as in from <i>two</i> to <i>two</i>.
That should give you the child's offset within the parent. You can then accumulate the offsets until you hit the div tag or whatever other container element.
The ending position is just this offset plus the length of the clicked element.
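
The walk described above could be sketched roughly as follows. This is only an illustration under a few assumptions: the function names sourceRange and serializedLength are invented here, offsets are measured against the container's serialized outerHTML (the browser's normalized markup, not necessarily the text the author typed), and text nodes are assumed to be escaped the way innerHTML escapes them outside of script and style elements.

```javascript
// Length of a child node as it appears inside its parent's innerHTML.
function serializedLength(node) {
  if (node.nodeType === 1) {            // element
    return node.outerHTML.length;
  }
  if (node.nodeType === 3) {            // text: innerHTML escapes these characters
    return node.nodeValue
      .replace(/&/g, '&amp;')
      .replace(/\u00a0/g, '&nbsp;')
      .replace(/</g, '&lt;')
      .replace(/>/g, '&gt;').length;
  }
  if (node.nodeType === 8) {            // comment
    return ('<!--' + node.nodeValue + '-->').length;
  }
  return 0;
}

// Returns [start, end) of target within container.outerHTML.
function sourceRange(container, target) {
  let start = 0;
  for (let node = target; node !== container; node = node.parentNode) {
    const parent = node.parentNode;
    if (!parent) {
      throw new Error('target is not inside container');
    }
    // Opening tag length = outerHTML minus content minus "</name>".
    const closeLength = parent.tagName.length + 3;
    start += parent.outerHTML.length - parent.innerHTML.length - closeLength;
    // Add every sibling that precedes this node. Siblings are compared by
    // identity, so identical-looking children do not confuse the count.
    for (let sib = node.previousSibling; sib; sib = sib.previousSibling) {
      start += serializedLength(sib);
    }
  }
  return [start, start + target.outerHTML.length];
}
```

In a page this might be called from a click handler as sourceRange(document.getElementById("preview"), event.target), where the #preview id is borrowed from the solution further down.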

After two days of working on this, I am posting my own solution.

At first I tried to parse the DOM and count characters manually, but that was more complicated than it had to be.

Credit: Thanks to kuroi neko, who suggested that the end caret position is just the start position plus the length of the HTML encompassing the clicked tag.

Note: I am manually removing <tbody> tags before calculating the caret values. Even if the original HTML does not contain them, the browser's HTML parser inserts them into tables automatically, so they show up in the innerHTML and outerHTML output. If you are building a text editor that needs this functionality, it is a matter of personal preference whether to leave them in place and update the original HTML instead.

On the other hand, if you prefer the purist approach and want to keep the original HTML intact, exactly as its author wrote it, then you may want to remove <tbody> manually. This also means you take responsibility for handling every other case similar to this one, whatever it might be. (Those cases are not covered by the solution below.)
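
As a rough illustration of why the auto-inserted tags matter, the sketch below compares offsets before and after stripping them. The two strings are a made-up example, and stripTbody is a hypothetical helper. It removes only attribute-less tags, and it would also remove a <tbody> that the author really did write, so treat it as an assumption-laden shortcut.

```javascript
// What the author might have typed (no <tbody>).
const authored = '<table><tr><td>x</td></tr></table>';

// What innerHTML reports once the parser has added the implied <tbody>.
const serialized = '<table><tbody><tr><td>x</td></tr></tbody></table>';

// Hypothetical helper: drop both the opening and the closing tag,
// otherwise every offset after </tbody> is still shifted.
function stripTbody(html) {
  return html.replace(/<\/?tbody>/g, '');
}

console.log(serialized.indexOf('<td>'));             // 18
console.log(stripTbody(serialized).indexOf('<td>')); // 11
console.log(stripTbody(serialized) === authored);    // true
```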

Solution: This assumes that the textarea (the HTML source editor) and #preview are two separate elements representing the same HTML.

// Strip what differs between the serialized preview and the source:
// line breaks and auto-inserted <tbody> / </tbody> tags.
function normalize(html) {
    return html.replace(/\r|\n/g, '').replace(/<\/?tbody>/g, '');
}

$(document).ready(function() {

    var preview = document.getElementById("preview");

    // Show the normalized source in the editor; .val() keeps entities such as &amp; as typed
    $("#textarea").val(normalize(preview.innerHTML));

    $("#preview").on("click", function(event) {

        // A click on the container itself covers the whole source
        if (event.target === preview) {
            console.log("caret start = 0 end = " + normalize(preview.innerHTML).length);
            return;
        }

        // Get clicked tag HTML, normalized the same way as the source
        var tag = normalize(event.target.outerHTML);

        // Insert unique separator just before the tag that was clicked, to mark beginning
        var marker = document.createTextNode("[*-*]");
        event.target.parentNode.insertBefore(marker, event.target);

        // Get preview source code, now containing the separator
        var html = normalize(preview.innerHTML);

        // Remove the separator again; unlike resetting innerHTML,
        // this keeps the existing nodes and their event handlers
        marker.parentNode.removeChild(marker);

        // Everything before the separator precedes the clicked tag
        var caret_start = html.split("[*-*]")[0].length;
        var caret_end = caret_start + tag.length;

        console.log("caret start = " + caret_start + " end = " + caret_end);

    });

});

You can achieve that simply by using the Descop library.

// Get the source html code of target document
var html = yourFunctionToGetHTML();

// Get the target document itself
var dom = yourFunctionToGetDocument();

// Get the element you want to find in the source code
var element = document.getElementById("target-element");

// Create an instance of Descop
var descop = new Descop();

// Connect document
descop.connectDocument(dom);

// Connect source code
descop.connectSource(html);

// Get element position in source code
var position = descop.getElementPosition(element);
// eg. position => { start: 320, end: 480 }

