
Detect which word has been clicked on within a text

I am building a JS script that, at some point, lets the user click on any word on a given page and stores that word in a variable.

I have one solution which is pretty ugly and involves class-parsing using jQuery: I first parse the entire HTML, split everything on each space " ", and re-append everything wrapped in a <span class="word">word</span>. Then I add an event with jQuery to detect clicks on that class, and using $(this).html() I get the clicked word.

This is slow and ugly in so many ways, and I was hoping that someone knows of another way to achieve this.

PS: I might consider running it as a browser extension, so if it doesn't sound possible with mere JS, and you know a browser API that would allow it, feel free to mention it!

A possible workaround would be to get the user to highlight the word instead of clicking it, but I would really love to achieve the same thing with only a click!

Here's a solution that will work without adding tons of spans to the document (works on WebKit, Mozilla, and IE9+):

https://jsfiddle.net/Vap7C/15/

$(".clickable").click(function(e){
    s = window.getSelection();
    var range = s.getRangeAt(0);
    var node = s.anchorNode;

    // Find starting point
    while(range.toString().indexOf(' ') != 0) {
        range.setStart(node,(range.startOffset -1));
    }
    range.setStart(node, range.startOffset +1);

    // Find ending point
    do{
        range.setEnd(node,range.endOffset + 1);
    }while(range.toString().indexOf(' ') == -1 && range.toString().trim() != '');

    // Alert result
    var str = range.toString().trim();
    alert(str);
});

<script src="https://cdnjs.cloudflare.com/ajax/libs/jquery/3.3.1/jquery.min.js"></script>
<p class="clickable">
    Lorem ipsum dolor sit amet, consectetur adipiscing elit. Mauris rutrum ante nunc. Proin sit amet sem purus. Aliquam malesuada egestas metus, vel ornare purus sollicitudin at. Lorem ipsum dolor sit amet, consectetur adipiscing elit. Integer porta turpis ut mi pharetra rhoncus. Ut accumsan, leo quis hendrerit luctus, purus nunc suscipit libero, sit amet lacinia turpis neque gravida sapien. Nulla facilisis neque sit amet lacus ornare consectetur non ac massa. In purus quam, imperdiet eget tempor eu, consectetur eget turpis. Curabitur mauris neque, venenatis a sollicitudin consectetur, hendrerit in arcu.
</p>

In IE8, it has problems because of getSelection. This link (Is there a cross-browser solution for getSelection()?) may help with those issues. I haven't tested on Opera.

I used https://jsfiddle.net/Vap7C/1/ from a similar question as a starting point. It used the Selection.modify function:

s.modify('extend','forward','word');
s.modify('extend','backward','word');

Unfortunately, these calls don't always get the whole word. As a workaround, I got the Range for the selection and added two loops to find the word boundaries. The first loop extends the range backward, one character at a time, until it reaches a space. The second loop extends the range forward to the end of the word until it reaches a space.

This will also grab any punctuation at the end of the word, so make sure you trim that out if you need to.
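The trimming step mentioned above could look roughly like this. It is only a sketch: `stripPunctuation` is a hypothetical helper name, and it assumes that "punctuation" means any leading or trailing character that is not a letter or a digit, and that the browser supports Unicode property escapes in regular expressions.

```javascript
// Hypothetical helper (not part of the answer's code): strip everything that is
// not a letter or digit from both ends of the clicked word.
// Characters inside the word, such as the apostrophe in "don't", are left alone.
function stripPunctuation(word) {
  return word.replace(/^[^\p{L}\p{N}]+|[^\p{L}\p{N}]+$/gu, "");
}

console.log(stripPunctuation("elit."));
console.log(stripPunctuation("(amet,"));
console.log(stripPunctuation("don't"));
```

In the click handler it would be applied to the final string, for example `stripPunctuation(range.toString().trim())`.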

As far as I know, adding a span for each word is the only way to do this.

You might consider using Lettering.js, which handles the splitting for you. That won't really affect performance, though, unless your own "splitting code" is inefficient.

Then, instead of binding .click() to every span, it would be more efficient to bind a single .click() to the container of the spans and check event.target to see which span has been clicked.
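A minimal sketch of that single-handler idea, written with plain DOM calls rather than jQuery. The helper names (`wrapWords`, `escapeHtml`, `wordFromTarget`), the "word" class, and the ".clickable" container are assumptions made for illustration, and the input is assumed to be plain text without nested markup.

```javascript
// Hypothetical helper: wrap every whitespace-separated word of a plain-text
// string in <span class="word">, keeping the whitespace runs untouched.
function wrapWords(text) {
  return text
    .split(/(\s+)/) // the capture group keeps the separators in the result
    .map(function (part) {
      return /^\s*$/.test(part) ? part : '<span class="word">' + escapeHtml(part) + '</span>';
    })
    .join("");
}

// Minimal escaping so that characters like < in the text do not become markup.
function escapeHtml(s) {
  return s.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

// Returns the clicked word, or null when the click did not land on a word span.
function wordFromTarget(target) {
  return target && target.className === "word" ? target.textContent : null;
}

// Browser-only wiring: one listener on the container instead of one per span.
if (typeof document !== "undefined") {
  var container = document.querySelector(".clickable"); // assumed container
  if (container) {
    container.innerHTML = wrapWords(container.textContent);
    container.addEventListener("click", function (event) {
      var word = wordFromTarget(event.target);
      if (word !== null) console.log(word);
    });
  }
}
```

Because the listener lives on the container, spans can be added or removed later without rebinding anything.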

Here are improvements for the accepted answer:

$(".clickable").click(function (e) {
    var selection = window.getSelection();
    if (!selection || selection.rangeCount < 1) return true;
    var range = selection.getRangeAt(0);
    var node = selection.anchorNode;
    var word_regexp = /^\w*$/;

    // Extend the range backward until it matches word beginning
    while ((range.startOffset > 0) && range.toString().match(word_regexp)) {
      range.setStart(node, (range.startOffset - 1));
    }
    // Restore the valid word match after overshooting
    if (!range.toString().match(word_regexp)) {
      range.setStart(node, range.startOffset + 1);
    }

    // Extend the range forward until it matches word ending
    while ((range.endOffset < node.length) && range.toString().match(word_regexp)) {
      range.setEnd(node, range.endOffset + 1);
    }
    // Restore the valid word match after overshooting
    if (!range.toString().match(word_regexp)) {
      range.setEnd(node, range.endOffset - 1);
    }

    var word = range.toString();
});

And another take on @stevendaniel's answer:

$('.clickable').click(function(){
    var sel=window.getSelection();
    var str=sel.anchorNode.nodeValue, len=str.length, a=sel.anchorOffset, b=a;
    while(str[a]!=' '&&a--){};
    if (str[a]==' ') a++;            // start of word
    while(str[b]!=' '&&b++<len){};   // end of word+1
    console.log(str.substring(a,b));
});

<script src="https://ajax.googleapis.com/ajax/libs/jquery/2.1.1/jquery.min.js"></script>
<p class="clickable">The objective can also be achieved by simply analysing the string you get from <code>sel=window.getSelection()</code>. Two simple searches for the next blank before and after the word, pointed to by the current position (<code>sel.anchorOffset</code>) and the work is done:</p>
<p>This second paragraph is <em>not</em> clickable. I tested this on Chrome and Internet explorer (IE11)</p>

The only cross-browser (IE < 8) way that I know of is wrapping in span elements. It's ugly but not really that slow.

This example is straight from the jQuery .css() function documentation, but with a huge block of text to pre-process:

http://jsfiddle.net/kMvYy/

Here's another way of doing it (given here: jquery capture the word value) on the same block of text, one that doesn't require wrapping in span: http://jsfiddle.net/Vap7C/1

-EDIT- What about this? It uses getSelection() bound to mouseup:

<script type="text/javascript" src="jquery-1.6.3.min.js"></script>
<script>
$(document).ready(function(){
    words = [];
    $("#myId").bind("mouseup",function(){
        word = window.getSelection().toString();
        if(word != ''){
            if( confirm("Add *"+word+"* to array?") ){words.push(word);}
        }
    });
    //just to see what we've got
    $('button').click(function(){alert(words);});
});
</script>

<div id='myId'>
    Some random text in here with many words huh
</div>
<button>See content</button>

I can't think of a way besides splitting. This is what I'd do: a small plugin that splits the text into spans and, when a span is clicked, adds its content to an array for further use:

<script type="text/javascript" src="jquery-1.6.3.min.js"></script>
<script>
//plugin, take it to another file
(function( $ ){
$.fn.splitWords = function(ary) {
    this.html('<span>'+this.html().split(' ').join('</span> <span>')+'</span>');
    this.children('span').click(function(){
        $(this).css("background-color","#C0DEED");
        ary.push($(this).html());
    });
};
})( jQuery );
//plugin, take it to another file

$(document).ready(function(){
    var clicked_words = [];
    $('#myId').splitWords(clicked_words);
    //just to see what we've stored
    $('button').click(function(){alert(clicked_words);});
});
</script>

<div id='myId'>
    Some random text in here with many words huh
</div>
<button>See content</button>

This is a follow-up on my comment to stevendaniels' answer (above):

In the first code section above, range.setStart(node, (range.startOffset - 1)); crashes when run on the first word in a "node," because it attempts to set the range start to a negative value. I tried adding logic to prevent that, but then the subsequent range.setStart(node, range.startOffset + 1); returns all but the first letter of the first word. Also, when words are separated by a newline, the last word on the previous line is returned in addition to the clicked-on word. So, this needs some work.

Here is my code to make the range expansion code in that answer work reliably:

while (range.startOffset !== 0) {                   // start of node
    range.setStart(node, range.startOffset - 1)     // back up 1 char
    if (range.toString().search(/\s/) === 0) {      // space character
        range.setStart(node, range.startOffset + 1);// move forward 1 char
        break;
    }
}

while (range.endOffset < node.length) {         // end of node
    range.setEnd(node, range.endOffset + 1)     // forward 1 char
    if (range.toString().search(/\s/) !== -1) { // space character
        range.setEnd(node, range.endOffset - 1);// back 1 char
        break;
    }
}

Here is a completely different method. I am not sure about the practicality of it, but it may give you some different ideas. Here is what I am thinking: suppose you have a container tag with position: relative and just text in it. You could put a span around each word, record its offset height, width, left, and top, and then remove the span. Save those to an array; then, when there is a click in the area, do a search to find out which word was closest to the click. This obviously would be intensive at the beginning, so it would work best in a situation where the person will be spending some time perusing the article. The benefit is that you do not need to worry about possibly hundreds of extra elements, but that benefit may be marginal at best.

Note: I think you could remove the container element from the DOM to speed up the process and still get the offset distances, but I am not positive.
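The lookup half of that idea can be sketched as follows. The measuring step is left out; the record shape `{ word, left, top, width, height }`, the sample numbers, and the function names are all assumptions for illustration, not measurements from a real page.

```javascript
// Distance from a point to a rectangle (0 when the point lies inside it).
function distanceToBox(x, y, box) {
  var dx = Math.max(box.left - x, 0, x - (box.left + box.width));
  var dy = Math.max(box.top - y, 0, y - (box.top + box.height));
  return Math.sqrt(dx * dx + dy * dy);
}

// Linear search for the recorded word whose box is closest to the click.
function closestWord(boxes, x, y) {
  var best = null;
  var bestDistance = Infinity;
  for (var i = 0; i < boxes.length; i++) {
    var d = distanceToBox(x, y, boxes[i]);
    if (d < bestDistance) {
      bestDistance = d;
      best = boxes[i];
    }
  }
  return best ? best.word : null;
}

// Made-up geometry standing in for the offsets recorded up front.
var boxes = [
  { word: "Lorem", left: 0,  top: 0,  width: 40, height: 16 },
  { word: "ipsum", left: 45, top: 0,  width: 38, height: 16 },
  { word: "dolor", left: 0,  top: 18, width: 36, height: 16 }
];

console.log(closestWord(boxes, 50, 5));
```

For long articles a linear scan per click is probably still fine, since it only runs on user input; the expensive part remains the initial measuring pass.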

The selected solution sometimes does not work on Russian texts (it shows an error). I would suggest the following solution for Russian and English texts:

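function returnClickedWord(){
    let selection = window.getSelection(),
        text = selection.anchorNode.data,
        index = selection.anchorOffset,
        symbol = "a";
    // Latin letters, digits, and Cyrillic letters (ё/Ё lie outside а-я/А-Я)
    const wordChar = /[a-zA-Z0-9а-яА-ЯёЁ]/;
    // Walk backward until a non-word character or the start of the text
    while(wordChar.test(symbol)&&symbol!==undefined){
        symbol = text[index--];
    }
    index += 2;
    let word = "";
    // Walk forward, collecting characters until a non-word character
    while(index<text.length && wordChar.test(text[index])){
        word += text[index++];
    }
    alert(word);
}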
document.addEventListener("click", returnClickedWord);
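As a variation on the same idea, the character class can be replaced by Unicode property escapes so that the scan is not tied to Latin and Cyrillic. This is a sketch rather than the answer's method: `wordAround` and `WORD_CHAR` are hypothetical names, it assumes a JavaScript engine that supports `\p{L}` and `\p{N}` with the `u` flag, and it assumes the text contains no characters outside the Basic Multilingual Plane.

```javascript
// Any Unicode letter or digit counts as part of a word (assumption).
var WORD_CHAR = /[\p{L}\p{N}]/u;

// Hypothetical helper: return the run of letters/digits around `offset` in `text`.
function wordAround(text, offset) {
  var start = offset;
  var end = offset;
  while (start > 0 && WORD_CHAR.test(text[start - 1])) start--;
  while (end < text.length && WORD_CHAR.test(text[end])) end++;
  return text.slice(start, end);
}

console.log(wordAround("Привет, мир!", 9));
console.log(wordAround("Hello, world!", 2));

// Browser-only wiring, mirroring the click handler above.
if (typeof document !== "undefined") {
  document.addEventListener("click", function () {
    var sel = window.getSelection();
    var node = sel && sel.anchorNode;
    if (node && node.nodeType === 3) { // 3 === text node
      console.log(wordAround(node.data, sel.anchorOffset));
    }
  });
}
```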

For the sake of completeness alongside the rest of the answers, I am going to add an explanation of the main methods used:

  • window.getSelection() : This is the main method. It is used to get information about a selection you made in text (by pressing the mouse button, dragging and then releasing, not by doing a simple click). It returns a Selection object whose main properties are anchorOffset and focusOffset, which are the positions of the first and last characters selected, respectively. In case it doesn't make total sense, this is the description of anchor and focus that the MDN documentation offers:

    The anchor is where the user began the selection and the focus is where the user ends the selection

    • toString() : This method returns the selected text.

    • anchorOffset : Starting index of the selection in the text of the node where you made the selection.
      If you have this html:

       <div>aaaa<span>bbbb cccc dddd</span>eeee</div>

      and you select 'cccc', then anchorOffset == 5 because, inside the span's text node, the selection begins after the first 5 characters ('bbbb ').

    • focusOffset : Final index of the selection in the text of the node where you made the selection.
      Following the previous example, focusOffset == 9.

    • getRangeAt() : Returns a Range object. It receives an index as a parameter because (I suspect; I actually need confirmation of this) in some browsers, such as Firefox, you can select multiple independent texts at once.

      • startOffset : This Range property is analogous to anchorOffset.
      • endOffset : As expected, this one is analogous to focusOffset.
      • toString : Analogous to the toString() method of the Selection object.
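The offsets from the example above can be checked without a browser by treating the span's text as a plain string. This is only an illustration of the arithmetic; `sliceByOffsets` is a hypothetical helper that mimics what Selection.toString() returns when the selection starts and ends inside the same text node.

```javascript
// Text content of the <span> in the example above.
var spanText = "bbbb cccc dddd";

// Hypothetical helper: the text between anchor and focus within one text node.
function sliceByOffsets(text, anchorOffset, focusOffset) {
  // Dragging right-to-left makes anchorOffset larger than focusOffset,
  // so the two are ordered before slicing.
  var start = Math.min(anchorOffset, focusOffset);
  var end = Math.max(anchorOffset, focusOffset);
  return text.slice(start, end);
}

console.log(sliceByOffsets(spanText, 5, 9)); // 'cccc' selected left-to-right
console.log(sliceByOffsets(spanText, 9, 5)); // the same word selected right-to-left
```

A plain click produces a collapsed selection, where both offsets are equal and the slice is empty, which is why the answers in this thread expand the range themselves.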

Aside from the other solutions, there is another method nobody seems to have noticed: Document.caretRangeFromPoint()

The caretRangeFromPoint() method of the Document interface returns a Range object for the document fragment under the specified coordinates.

If you follow this link, you will see that the documentation provides an example that closely resembles what the OP was asking for. This example does not get the particular word the user clicked on, but instead adds a <br> right after the character the user clicked:

function insertBreakAtPoint(e) {
  let range;
  let textNode;
  let offset;

  if (document.caretPositionFromPoint) {
    range = document.caretPositionFromPoint(e.clientX, e.clientY);
    textNode = range.offsetNode;
    offset = range.offset;
  } else if (document.caretRangeFromPoint) {
    range = document.caretRangeFromPoint(e.clientX, e.clientY);
    textNode = range.startContainer;
    offset = range.startOffset;
  }

  // Only split TEXT_NODEs
  if (textNode && textNode.nodeType == 3) {
    let replacement = textNode.splitText(offset);
    let br = document.createElement('br');
    textNode.parentNode.insertBefore(br, replacement);
  }
}

let paragraphs = document.getElementsByTagName("p");
for (let i = 0; i < paragraphs.length; i++) {
  paragraphs[i].addEventListener('click', insertBreakAtPoint, false);
}

<p>Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua. At vero eos et accusam et justo duo dolores et ea rebum. Stet clita kasd gubergren, no sea takimata sanctus est Lorem ipsum dolor sit amet.</p>

From there, getting the word is just a matter of taking all the text after the previous blank character and before the next one.
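Put together, that could look like the sketch below. It is one possible reading of the approach, not code from the documentation: `wordBetweenBlanks` and `caretFromEvent` are hypothetical names, "blank" is assumed to mean any whitespace character, and the word is only looked up when the click lands in a text node.

```javascript
// Hypothetical helper: the text between the previous and the next whitespace
// character around `offset`.
function wordBetweenBlanks(text, offset) {
  var start = offset;
  var end = offset;
  while (start > 0 && !/\s/.test(text[start - 1])) start--;
  while (end < text.length && !/\s/.test(text[end])) end++;
  return text.slice(start, end);
}

// Resolve a click to { node, offset } with whichever caret API the browser has.
function caretFromEvent(e) {
  if (document.caretPositionFromPoint) {
    var pos = document.caretPositionFromPoint(e.clientX, e.clientY);
    return pos ? { node: pos.offsetNode, offset: pos.offset } : null;
  }
  if (document.caretRangeFromPoint) {
    var range = document.caretRangeFromPoint(e.clientX, e.clientY);
    return range ? { node: range.startContainer, offset: range.startOffset } : null;
  }
  return null;
}

// Browser-only wiring: log the word under every click.
if (typeof document !== "undefined") {
  document.addEventListener("click", function (e) {
    var caret = caretFromEvent(e);
    if (caret && caret.node.nodeType === 3) { // 3 === text node
      console.log(wordBetweenBlanks(caret.node.data, caret.offset));
    }
  });
}
```

Unlike the getSelection() approaches, this does not depend on the click having moved the selection, so it should also behave the same when the user already has text selected.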

Here's an alternative to the accepted answer that works with Cyrillic. I don't understand why checking the word boundaries is necessary, but by default the selection is collapsed for some reason for me.

function getClickedWord() {
  let selection = window.getSelection();
  if (!selection || selection.rangeCount < 1) return
  let node = selection.anchorNode
  let range = selection.getRangeAt(0)

  let text = selection.anchorNode.textContent

  let startIndex, endIndex
  startIndex = endIndex = selection.anchorOffset
  const expected = /[A-ZА-Я]*/i

  // True while text[startIndex..endIndex) consists of letters only
  function testSlice() {
    let slice = text.slice(startIndex, endIndex)
    return slice == slice.match(expected)[0]
  }

  // Extend backward while the slice is still a run of letters
  while(startIndex > 0 && testSlice()) {
    startIndex -= 1
  }
  // Step forward again only if the loop overshot onto a non-letter
  if (!testSlice()) startIndex += 1

  // Extend forward the same way
  while(endIndex < text.length && testSlice()){
    endIndex += 1
  }
  if (!testSlice()) endIndex -= 1

  range.setStart(node, startIndex)
  range.setEnd(node, endIndex)

  let word = range.toString()
  return word
}

Here is what looks like a slightly simpler solution.

document.addEventListener('selectionchange', () => {
  const selection = window.getSelection();
  const matchingRE = new RegExp(`^.{0,${selection.focusOffset}}\\s+(\\w+)`);
  const clickedWord = (matchingRE.exec(selection.focusNode.textContent) || ['']).pop();
});

I'm testing

As with the accepted answer, this solution uses window.getSelection to infer the cursor position within the text. It uses a regex to reliably find the word boundary, and does not require the starting node and ending node to be the same node.

This code has the following improvements over the accepted answer:

  • Works at the beginning of text.
  • Allows selection across multiple nodes.
  • Does not modify the selection range.
  • Allows the user to override the range with a custom selection.
  • Detects words even when surrounded by whitespace other than spaces (e.g. "\t\n")
  • Uses vanilla JavaScript only.
  • No alerts!

const getBoundaryPoints = (range) => ({ start: range.startOffset, end: range.endOffset })

function expandTextRange(range) {
    // expand to include a whole word
    const matchesStart = (r) => r.toString().match(/^\s/) // Alternative: /^\W/
    const matchesEnd = (r) => r.toString().match(/\s$/)   // Alternative: /\W$/

    // Find start of word
    while (!matchesStart(range) && range.startOffset > 0) {
        range.setStart(range.startContainer, range.startOffset - 1)
    }
    if (matchesStart(range)) range.setStart(range.startContainer, range.startOffset + 1)

    // Find end of word
    var length = range.endContainer.length || range.endContainer.childNodes.length
    while (!matchesEnd(range) && range.endOffset < length) {
        range.setEnd(range.endContainer, range.endOffset + 1)
    }
    if (matchesEnd(range) && range.endOffset > 0) range.setEnd(range.endContainer, range.endOffset - 1)

    //console.log(JSON.stringify(getBoundaryPoints(range)))
    //console.log('"' + range.toString() + '"')
    var str = range.toString()
}

function getTextSelectedOrUnderCursor() {
    var sel = window.getSelection()
    var range = sel.getRangeAt(0).cloneRange()
    if (range.startOffset == range.endOffset) expandTextRange(range)
    return range.toString()
}

function onClick() {
    console.info('"' + getTextSelectedOrUnderCursor() + '"')
}

var content = document.body
content.addEventListener("click", onClick)

<div id="text">
    <p>Vel consequatur incidunt voluptatem. Sapiente quod qui rem libero ut sunt ratione. Id qui id sit id alias rerum officia non. A rerum sunt repudiandae. Aliquam ut enim libero praesentium quia eum.</p>
    <p>Occaecati aut consequuntur voluptatem quae reiciendis et esse. Quis ut sunt quod consequatur quis recusandae voluptas. Quas ut in provident. Provident aut vel ea qui ipsum et nesciunt eum.</p>
</div>

Because it uses arrow functions, this code doesn't work in IE, but that is easy to adjust. Furthermore, because it allows the user selection to span across nodes, it may return text that is usually not visible to the user, such as the contents of a script tag that exists within the user's selection. (Triple-click the last paragraph to demonstrate this flaw.)

You should decide which kinds of nodes the user should see, and filter out the unneeded ones, which I felt was beyond the scope of the question.

An anonymous user suggested this edit: an improved solution that always gets the proper word, is simpler, and works in IE 4+:

http://jsfiddle.net/Vap7C/80/

document.body.addEventListener('click', function() {
 // Gets clicked on word (or selected text if text is selected)
 var t = '';
 if (window.getSelection && (sel = window.getSelection()).modify) {
    // Webkit, Gecko
    var s = window.getSelection();
    if (s.isCollapsed) {
        s.modify('move', 'forward', 'character');
        s.modify('move', 'backward', 'word');
        s.modify('extend', 'forward', 'word');
        t = s.toString();
        s.modify('move', 'forward', 'character'); //clear selection
    }
    else {
        t = s.toString();
    }
  } else if ((sel = document.selection) && sel.type != "Control") {
    // IE 4+
    var textRange = sel.createRange();
    if (!textRange.text) {
        textRange.expand("word");
    }
    // Remove trailing spaces
    while (/\s$/.test(textRange.text)) {
        textRange.moveEnd("character", -1);
    }
    t = textRange.text;
 }
 alert(t);
});

Here's an alternative that does not visually modify the selection range.

/**
 * Find a string from a selection
 */
export function findStrFromSelection(s: Selection): string {
  const range = s.getRangeAt(0);
  const node = s.anchorNode;
  // anchorNode and textContent may both be null, so fall back to an empty string
  const content = node?.textContent ?? "";

  let startOffset = range.startOffset;
  let endOffset = range.endOffset;
  // Find starting point
  // We move the cursor back until we find a space, a line break, or the start of the node
  while (startOffset > 0 && content[startOffset - 1] != " " && content[startOffset - 1] != '\n') {
    startOffset--;
  }

  // Find ending point
  // We move the cursor forward until we find a space, a line break, or the end of the node
  while (endOffset < content.length && content[endOffset] != " " && content[endOffset] != '\n') {
    endOffset++;
  }

  return content.substring(startOffset, endOffset);
}
