How do I remove HTML tags from a string?

Question

When I search for the keyword "data", I get the abstract of a paper in a digital library:

Many organizations often underutilize their existing <span class='snippet'>data</span> warehouses. In this paper, we suggest a way of acquiring more information from corporate <span class='snippet'>data</span> warehouses without the complications and drawbacks of deploying additional software systems. Association-rule mining, which captures co-occurrence patterns within <span class='snippet'>data</span>, has attracted considerable efforts from <span class='snippet'>data</span> warehousing researchers and practitioners alike. Unfortunately, most <span class='snippet'>data</span> mining tools are loosely coupled, at best, with the <span class='snippet'>data</span> warehouse repository. Furthermore, these tools can often find association rules only within the main fact table of the <span class='snippet'>data</span> warehouse (thus ignoring the information-rich dimensions of the star schema) and are not easily applied on non-transaction level <span class='snippet'>data</span> often found in <span class='snippet'>data</span> warehouses

How can I remove all the <span class='snippet'>..</span> tags but keep the keyword "data", so that the abstract looks like this:

Many organizations often underutilize their existing data warehouses. In this paper, we suggest a way of acquiring more information from corporate data warehouses without the complications and drawbacks of deploying additional software systems. Association-rule mining, which captures co-occurrence patterns within data, has attracted considerable efforts from data warehousing researchers and practitioners alike. Unfortunately, most data mining tools are loosely coupled, at best, with the data warehouse repository. Furthermore, these tools can often find association rules only within the main fact table of the data warehouse (thus ignoring the information-rich dimensions of the star schema) and are not easily applied on non-transaction level data often found in data warehouses

Answer 1

strip_tags() is your friend. Code kindly copied from here.

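  import java.util.Arrays;
  import java.util.regex.Matcher;
  import java.util.regex.Pattern;

  // Removes every tag from text except those named in allowedTags (comma-separated, e.g. "b,i").
  public static String strip_tags(String text, String allowedTags) {
      // Sort the allowed names so Arrays.binarySearch can be used below.
      String[] tag_list = allowedTags.split(",");
      Arrays.sort(tag_list);

      // Group 1 captures the tag name of an opening, closing, or <!...> tag.
      final Pattern p = Pattern.compile("<[/!]?([^\\s>]*)\\s*[^>]*>",
              Pattern.CASE_INSENSITIVE);
      Matcher m = p.matcher(text);

      StringBuilder out = new StringBuilder();
      int lastPos = 0;
      while (m.find()) {
          String tag = m.group(1);
          if (Arrays.binarySearch(tag_list, tag) < 0) {
              // Tag not allowed: copy the text before it and put a space in its place.
              out.append(text.substring(lastPos, m.start())).append(" ");
          } else {
              // Tag allowed: copy the text up to and including the tag.
              out.append(text.substring(lastPos, m.end()));
          }
          lastPos = m.end();
      }
      if (lastPos > 0) {
          // At least one tag matched: append the remaining text and trim the ends.
          out.append(text.substring(lastPos));
          return out.toString().trim();
      } else {
          // No tags found: return the input unchanged.
          return text;
      }
  }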
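
Note that strip_tags() above puts a space where each removed tag was, so words next to a tag end up with doubled spaces. If the only markup in the abstract is the <span class='snippet'> highlighting from the question, a narrower regex may be enough. The following is a minimal sketch under that assumption, not a general HTML parser; the class name SnippetTagStripper and the method removeSpanTags are made up for illustration.

```java
import java.util.regex.Pattern;

public class SnippetTagStripper {
    // Matches an opening or closing span tag with any attributes.
    // Other tags are left untouched.
    private static final Pattern SPAN_TAG =
            Pattern.compile("</?span\\b[^>]*>", Pattern.CASE_INSENSITIVE);

    // Deletes the span tags themselves and keeps the text they wrap.
    public static String removeSpanTags(String html) {
        return SPAN_TAG.matcher(html).replaceAll("");
    }

    public static void main(String[] args) {
        String html = "underutilize their existing "
                + "<span class='snippet'>data</span> warehouses.";
        // prints: underutilize their existing data warehouses.
        System.out.println(removeSpanTags(html));
    }
}
```

Because nothing is inserted in place of the tags, the spacing of the original sentence is preserved. For arbitrary or malformed HTML, a real HTML parser is likely a safer choice than any regex.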

Answer 1
2 Accepted 2010-10-20 03:46:50
