Counting case-insensitive occurrences of a substring in Java

Question

I am trying to count the occurrences of the div tag in my HTML file. When I search for div, I get 2, and for DIV, I get 1650. So ideally, when I use sHtml.toUpperCase() and then search for DIV, I should get 1652. But I am getting 1656. What might be going wrong here?

        /********* Counting occurrences of div **************/
        String findString = "DIV";
        int lastIndex = 0;
        int count = 0;

        while (lastIndex != -1) {

            lastIndex = sHtml.indexOf(findString, lastIndex);

            if (lastIndex != -1) {
                count++;
                lastIndex += findString.length();
            }
        }
        System.out.println("Count of div = " + count);

Answer 1

You are picking up substrings that were mixed-case before the conversion, say, Div. Counting every "div" substring is not a good way to count div tags, though, because you would also pick up parts of longer words (say, Division or Divorce).

If you want a better count, you could use a simple regex to do the counting:

"[</]div[ />]"

This regular expression matches a div that is preceded by < or /, and followed by a space, /, or >:

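import java.util.regex.Matcher;
import java.util.regex.Pattern;

// CASE_INSENSITIVE lets the same pattern match div, DIV, Div, and so on.
Pattern countRx = Pattern.compile("[</]div[ />]", Pattern.CASE_INSENSITIVE);
Matcher m = countRx.matcher(sHtml);
int count = 0;
// Each successful find() is one opening or closing div tag.
while (m.find()) {
    count++;
}
System.out.println(count);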
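
To make the difference concrete, here is a minimal self-contained sketch that runs both approaches on a small sample. The class name DivTagCounter, the method names, and the sample markup are made up for illustration; only the regex comes from the answer above. The naive method upper-cases with Locale.ROOT, an assumption added here so the result does not depend on the default locale.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DivTagCounter {
    // "<" or "/" before div, and a space, "/", or ">" after it, in any letter case.
    private static final Pattern DIV_TAG =
            Pattern.compile("[</]div[ />]", Pattern.CASE_INSENSITIVE);

    // Plain substring count after upper-casing, as attempted in the question.
    static int countNaive(String html) {
        String upper = html.toUpperCase(Locale.ROOT);
        String needle = "DIV";
        int count = 0;
        int index = upper.indexOf(needle);
        while (index != -1) {
            count++;
            index = upper.indexOf(needle, index + needle.length());
        }
        return count;
    }

    // Counts only matches that look like an opening or closing div tag.
    static int countTags(String html) {
        Matcher m = DIV_TAG.matcher(html);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // Hypothetical sample: three opening tags, two closing tags, two ordinary words.
        String sample = "<div class=\"a\">Long division</div><DIV>Divide</DIV><Div/>";
        System.out.println("naive = " + countNaive(sample));
        System.out.println("tags  = " + countTags(sample));
    }
}
```

On this sample the naive count also includes "division" and "Divide", while the regex count covers only the five tags. Note that the pattern counts opening and closing tags alike, so if you want the number of div elements you would need a pattern that matches only the opening form.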

Answer 2

By the process of elimination, you must also have some mixed-case variants, such as Div, DIv, DiV or dIV. It is also possible that your text contains a word with div in it (like long division).
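
One way to check this is to tally each spelling separately. The sketch below is only an illustration: the class name DivVariantTally, the method tallyVariants, and the sample string are hypothetical. It finds every case-insensitive match of div and groups the matches by their exact spelling, so the variants behind the four extra hits (1656 versus the expected 1652) would show up as their own entries.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DivVariantTally {
    // Maps each exact spelling of "div" found in the text to how often it occurs.
    static Map<String, Integer> tallyVariants(String html) {
        Matcher m = Pattern.compile("div", Pattern.CASE_INSENSITIVE).matcher(html);
        Map<String, Integer> counts = new TreeMap<>();
        while (m.find()) {
            // m.group() is the text exactly as it appears in the input.
            counts.merge(m.group(), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Hypothetical sample containing lower-case, upper-case, and mixed-case hits.
        String sample = "<div>Long division</div><DIV>Divide</DIV>";
        System.out.println(tallyVariants(sample));
    }
}
```

Running it against the real sHtml would show directly which spellings account for the difference, and printing a little context around each mixed-case match would tell you whether it is a tag or part of a word.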

2 answers

Answer 1
2, accepted, 2014-11-05 04:16:51

Answer 2
1, 2014-11-05 04:15:43
