
Get text between two images

Is there a simpler way to get the text between two images that don't share a parent element? I'm making a userscript for a webpage. For example:

<div id="content">
     <div style="text-align:center"><img src="" alt=""></div>
     <a>some text</a>
     <img src="" alt="">
     <div style="text-align:left">more text</div>
</div>

How can I get the text between the 1st and the 2nd image inside the content div? I don't know the exact structure, because the text and the images could be nested inside div or a elements. I'd rather not use libraries.

Improving Mathew's answer with pure JavaScript:

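// Work on a detached copy so the page itself is left untouched.
var html = document.getElementById('content').cloneNode(true);
var imgs = html.getElementsByTagName('img');
for (var i = 0; i < imgs.length; i++) {
    // Mark each image's position with a token that survives textContent.
    var textSep = document.createTextNode('@img@');
    imgs[i].parentNode.insertBefore(textSep, imgs[i]);
}
// texts[1] is the text between the 1st and the 2nd image.
var texts = html.textContent.split('@img@');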

You basically want to treat the <img> tags as quotes around the text you want to extract.

The easiest way to do that is to replace each <img> tag with something unlikely to appear in the text, and use that character as a delimiter. I'll show you how using jQuery. If you need it done in pure JS then you'll have to convert this.

First, make a copy of the HTML.

var html = $('<div>').append($("#content").html());

Next, replace all <img> tags with a special character (or another token you know is unique).

html.find("img").replaceWith("<div>~</div>");

Once you've done that, you can match the text between those delimiters like this.

var str = html.text();
// The lookahead leaves the closing "~" in place, so it can also open the next segment.
var rx = /~([^~]+)(?=~)/g;
var match = rx.exec(str);

To find all matches, just repeat.

while(match != null)
{
    alert(match[1]);
    match = rx.exec(str);    
}
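
One caveat when repeating the match: a pattern that consumes the closing ~ cannot reuse it as the opening delimiter of the next segment, so with three or more images every other segment gets skipped. Here is a minimal sketch of the difference on a plain string. The sample string and the helper name allMatches are made up for illustration and are not part of the fiddle:

```javascript
var str = 'a~first~second~b';
var consuming = /~([^~]+)~/g;      // the closing "~" is consumed by the match
var lookahead = /~([^~]+)(?=~)/g;  // the closing "~" is only peeked at

// Hypothetical helper: collect capture group 1 of every match.
function allMatches(rx, s) {
    var out = [];
    var m;
    while ((m = rx.exec(s)) !== null) {
        out.push(m[1]);
    }
    return out;
}

var skipped = allMatches(consuming, str);  // misses "second"
var complete = allMatches(lookahead, str); // finds both segments
```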

It's possible to do the same with a unique phrase like @img@ instead of a single character, but a single character is way easier.

Here's a working fiddle.

http://jsfiddle.net/thinkingmedia/etx1z6ov/2/
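
A minimal sketch of the phrase variant, assuming the page text has already been flattened to a string with a token where each image was. With a multi-character token, split is simpler than a character-class regex. The helper name textBetweenTokens and the sample string are made up for illustration:

```javascript
// Hypothetical helper: returns the text found between each pair of consecutive tokens.
function textBetweenTokens(str, token) {
    var parts = str.split(token);
    // The first and last parts lie outside the outermost tokens, so drop them.
    return parts.slice(1, -1);
}

var sample = 'intro@img@some text@img@more text@img@outro';
var between = textBetweenTokens(sample, '@img@');
// between[0] is the text between the 1st and the 2nd token.
```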

This answer is not the best; I'm just putting it here for info.

1 - Go down to the deepest first child.

2 - Go to the next sibling.

3 - If there is no next sibling, go up until an ancestor has one, and go to that sibling.

Repeat.

It's like walking through a valley like this :D

\                 div#content                     /
 \txt/\  div   /\ div /\    div     /\txt/\  div /
       \  a   /  \txt/  \    a     /       \img2/
        \img1/           \txt/\txt/

Well, after several hours I worked it out and wrote the algorithm:

function textAfterElem(el, nextEl) {
    var txt = "";

    while (true) {
        // go down to the deepest first child
        while (el.firstChild) {
            el = el.firstChild;
            if (el == nextEl) { return txt; }
        }
        txt += el.textContent; // extract the text of this leaf node
        // go to the next sibling
        if (el.nextSibling) {
            el = el.nextSibling;
            if (el == nextEl) { return txt; }
        } else {
            // go up until an ancestor has a next sibling
            while (!el.nextSibling) {
                el = el.parentNode;
                // stop at the body, or at the root of a detached tree
                if (!el || el == document.body) { return txt; }
            }
            el = el.nextSibling; // go next
            if (el == nextEl) { return txt; }
        }
    }
}

And it can be used not only with images but with any element.

A simple erase and split works too:
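
The same document-order walk can also be sketched recursively. To keep the sketch runnable outside a browser, plain objects stand in for DOM nodes here; that shape ({ text } for text nodes, { tag, children } for elements) and the name textBetween are assumptions for illustration only, not a DOM API:

```javascript
// Collects the text of every text node visited after `start` and before `end`,
// walking the tree in document order (parent first, then children left to right).
function textBetween(root, start, end) {
    var collecting = false;
    var done = false;
    var txt = '';

    function visit(node) {
        if (done) { return; }
        if (node === end) { done = true; return; }
        if (node === start) {
            collecting = true;
        } else if (collecting && typeof node.text === 'string') {
            txt += node.text;
        }
        (node.children || []).forEach(visit);
    }

    visit(root);
    return txt;
}

// A stand-in for the structure from the question.
var img1 = { tag: 'img' };
var img2 = { tag: 'img' };
var content = { tag: 'div', children: [
    { tag: 'div', children: [img1] },
    { tag: 'a', children: [{ text: 'some text' }] },
    img2,
    { tag: 'div', children: [{ text: 'more text' }] }
] };

var result = textBetween(content, img1, img2);
```

The two images do not need a common parent; only their order in the tree matters.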

html.replace(/<(?!img\b)[^>]*>/g, '').split(/<img\b[^>]*>/)

With your example the result is:

["↵     ", "↵     some text↵     ", "↵     more text↵"]
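
For reference, a self-contained sketch of the same erase and split on a plain string. In a userscript the string would presumably come from document.getElementById('content').innerHTML; the function name splitOnImages and the compact sample markup are made up for illustration. Regexes over HTML are fragile (an attribute value containing > would break this), so treat it as a quick sketch rather than a robust parser:

```javascript
// Hypothetical helper: strip every tag except <img>, then split on the <img> tags.
function splitOnImages(html) {
    return html
        .replace(/<(?!img\b)[^>]*>/g, '') // erase all non-img tags
        .split(/<img\b[^>]*>/);           // the images become the separators
}

var sampleHtml =
    '<div style="text-align:center"><img src="" alt=""></div>' +
    '<a>some text</a>' +
    '<img src="" alt="">' +
    '<div style="text-align:left">more text</div>';

var parts = splitOnImages(sampleHtml);
// parts[1] is the text between the 1st and the 2nd image.
var betweenImages = parts[1].trim();
```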
