Zero-dependency stripping of HTML tags from a Java string

Question

I understand that this question is very similar to this one and others. I have the same question (how to strip out HTML tags from a Java string?) with the added constraint that I don't want to add any dependencies (Apache Commons, Spring, etc.) to my code.

So I'm looking for a "pure Java SE" flavor of the HTML tag-stripping algorithms used by a lot of these other frameworks, but I'm not sure exactly where to start. Thanks in advance.

Answer 1

Without using the HTMLEditorKit explicitly:

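    import java.util.logging.Level;
    import java.util.logging.Logger;
    import javax.swing.JTextPane;
    import javax.swing.text.BadLocationException;
    import javax.swing.text.StyledDocument;

    String html = "<html>...";
    // Let Swing parse the markup into a styled document, then read back its plain text.
    JTextPane pane = new JTextPane();
    pane.setContentType("text/html");
    pane.setText(html);
    StyledDocument doc = pane.getStyledDocument();
    try {
        System.out.println("Text: " + doc.getText(0, doc.getLength()));
    } catch (BadLocationException ex) {
        // NewJFrame is the enclosing class in this snippet; substitute your own class.
        Logger.getLogger(NewJFrame.class.getName()).log(Level.SEVERE, null, ex);
    }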
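
If you would rather not create a Swing component just to parse a string, the same JDK parser can be driven directly. The following is only a minimal sketch, not part of the original answer: it assumes that `ParserDelegator` with an `HTMLEditorKit.ParserCallback` is acceptable for your input, and the class name `HtmlStripper` and method name `strip` are made up for illustration. It still uses nothing outside Java SE.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

// Hypothetical helper class; the name is an assumption, not from the answer above.
final class HtmlStripper {

    // Returns the text content of the given HTML, with tags dropped.
    static String strip(String html) {
        final StringBuilder text = new StringBuilder();

        // The callback receives each run of text the parser finds between tags.
        HTMLEditorKit.ParserCallback callback = new HTMLEditorKit.ParserCallback() {
            @Override
            public void handleText(char[] data, int pos) {
                text.append(data);
            }
        };

        try {
            // 'true' tells the parser to ignore any charset declared in the markup.
            new ParserDelegator().parse(new StringReader(html), callback, true);
        } catch (IOException ex) {
            // Not expected when reading from an in-memory string.
            throw new UncheckedIOException(ex);
        }
        return text.toString();
    }

    public static void main(String[] args) {
        System.out.println(strip("<p>Hello <b>world</b></p>"));
    }
}
```

One caveat with this sketch: text from neighbouring elements is appended as-is, so words from adjacent blocks may run together. If that matters for your input, you could append a space or newline inside `handleText`.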

Answer 1
0 2013-03-19 10:35:15