Counting words from array in a string
I have an array of strings, say
A=["hello", "you"]
I have a string, say
s="hello, hello you are so wonderful"
I need to count the number of occurrences of the strings from A in s. In this case, the number of occurrences is 3 (2 "hello", 1 "you").
How can I do this efficiently? (A might contain lots of words, and s might be long in practice.)
Try:
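import java.util.HashMap;
import java.util.Map;

// Start every dictionary word at zero so that only those words are counted.
Map<String, Integer> wordCount = new HashMap<>();
for (String a : dictionary) {
    wordCount.put(a, 0);
}
// Split on non-word characters so that "hello," is counted as "hello".
for (String s : text.split("\\W+")) {
    Integer count = wordCount.get(s);
    if (count != null) {
        wordCount.put(s, count + 1);
    }
}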
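The map-based idea above can be wrapped into a small runnable program that also produces the single total the question asks for. This is only a sketch: the class name DictionaryWordCount and the helper names countWords and total are made up for illustration, and it assumes that whole-word matching after splitting on non-word characters is what is wanted.

```java
import java.util.HashMap;
import java.util.Map;

class DictionaryWordCount {
    // Counts how often each dictionary word appears as a whole token in text.
    static Map<String, Integer> countWords(String[] dictionary, String text) {
        Map<String, Integer> wordCount = new HashMap<>();
        for (String word : dictionary) {
            wordCount.put(word, 0);
        }
        // Split on runs of non-word characters so "hello," becomes "hello".
        for (String token : text.split("\\W+")) {
            Integer count = wordCount.get(token);
            if (count != null) {
                wordCount.put(token, count + 1);
            }
        }
        return wordCount;
    }

    // Sums the per-word counts into one total.
    static int total(Map<String, Integer> wordCount) {
        int sum = 0;
        for (int c : wordCount.values()) {
            sum += c;
        }
        return sum;
    }

    public static void main(String[] args) {
        String[] A = {"hello", "you"};
        String s = "hello, hello you are so wonderful";
        Map<String, Integer> counts = countWords(A, s);
        System.out.println("hello: " + counts.get("hello")); // hello: 2
        System.out.println("you: " + counts.get("you"));     // you: 1
        System.out.println("total: " + total(counts));       // total: 3
    }
}
```

The text is scanned once and each lookup is a hash lookup, so the cost grows with the length of s plus the size of A rather than with their product.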
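// Alternative: split s on each word; n occurrences yield n + 1 pieces.
// Note that split treats A[i] as a regular expression.
int count = 0;
for (int i = 0; i < A.length; i++) {
    // The limit of -1 keeps trailing empty strings, so a match at the very end is counted.
    count = count + s.split(A[i], -1).length - 1;
}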
Working Ideone: http://ideone.com/Z9K3JX
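import java.util.regex.Matcher;
import java.util.regex.Pattern;

public void countMatches() {
    String[] A = {"hello", "you"};
    String s = "hello, hello you are so wonderful";
    // Build the alternation "(hello|you)" so one pass over s finds every word.
    String patternString = "(" + StringUtils.join(A, "|") + ")";
    Pattern pattern = Pattern.compile(patternString);
    Matcher matcher = pattern.matcher(s);
    int count = 0;
    // Each successful find() is one occurrence of some word from A.
    while (matcher.find()) {
        count++;
    }
    System.out.println(count);
}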
Note that StringUtils is from Apache Commons. If you don't want to include an additional jar, you can construct that string with a for loop.
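A minimal sketch of that for-loop construction, using only the JDK, might look like the following. The class name PatternWithoutCommons and the helpers buildAlternation and countMatches are hypothetical names chosen for this example. Wrapping each word in Pattern.quote is an addition that the answer above does not use; it is there so that words containing regex metacharacters are matched literally.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PatternWithoutCommons {
    // Builds "(w1|w2|...)" with a plain loop; Pattern.quote keeps each word literal.
    static String buildAlternation(String[] words) {
        StringBuilder sb = new StringBuilder("(");
        for (int i = 0; i < words.length; i++) {
            if (i > 0) {
                sb.append('|');
            }
            sb.append(Pattern.quote(words[i]));
        }
        return sb.append(')').toString();
    }

    // Counts non-overlapping matches of any of the words in text.
    static int countMatches(String[] words, String text) {
        if (words.length == 0) {
            return 0; // "()" would match the empty string at every position
        }
        Matcher matcher = Pattern.compile(buildAlternation(words)).matcher(text);
        int count = 0;
        while (matcher.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String[] A = {"hello", "you"};
        String s = "hello, hello you are so wonderful";
        System.out.println(countMatches(A, s)); // 3
    }
}
```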
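// Put the search words in a set so that each lookup takes constant time.
HashSet<String> searchWords = new HashSet<String>();
for (String a : dictionary) {
    searchWords.add(a);
}
int count = 0;
// Split on spaces and commas; the empty tokens this produces are never in the set.
for (String s : input.split("[ ,]")) {
    if (searchWords.contains(s)) {
        count++;
    }
}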
This is a fully working method with output :)
public static void main(String[] args) {
    String[] A = {"hello", "you"};
    String s = "hello, hello you are so wonderful";
    int[] count = new int[A.length];
    for (int i = 0; i < A.length; i++) {
        count[i] = (s.length() - s.replaceAll(A[i], "").length()) / A[i].length();
    }
    for (int i = 0; i < count.length; i++) {
        System.out.println(A[i] + ": " + count[i]);
    }
}
What does this line do?
count[i] = (s.length() - s.replaceAll(A[i], "").length())/A[i].length();
The part s.replaceAll(A[i], "") replaces every "hello" in the text with the empty string "".
So I take the length of the whole string, s.length(), subtract from it the length of the same string without that word, s.replaceAll(A[i], "").length(), and divide the result by the length of that word, A[i].length().
Sample output for this example:
hello: 2
you: 1
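To make the arithmetic concrete, here is a small sketch that prints the intermediate lengths for the example string. The class name ReplaceLengthCount and the method countByRemoval are invented for illustration. It deliberately swaps replaceAll for String.replace, which treats the word as literal text instead of a regular expression; that swap is a choice made for this sketch, not part of the answer above.

```java
class ReplaceLengthCount {
    // Occurrences = (number of characters removed) / (length of the word).
    static int countByRemoval(String text, String word) {
        if (word.isEmpty()) {
            return 0; // avoid dividing by zero
        }
        // String.replace takes the word literally, unlike replaceAll, which takes a regex.
        String without = text.replace(word, "");
        return (text.length() - without.length()) / word.length();
    }

    public static void main(String[] args) {
        String s = "hello, hello you are so wonderful";
        System.out.println(s.length());                      // 33
        System.out.println(s.replace("hello", "").length()); // 23
        System.out.println(countByRemoval(s, "hello"));      // (33 - 23) / 5 = 2
        System.out.println(countByRemoval(s, "you"));        // (33 - 30) / 3 = 1
    }
}
```

Like the original, this counts substrings rather than whole words, so a search word that appears inside a longer word is counted as well.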
You can use the StringTokenizer class.
Do something like this:
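String[] A = {"hello", "you"};
String s = "hello, hello you are so wonderful";
// Use space and comma as delimiters so that "hello," yields the token "hello".
StringTokenizer st = new StringTokenizer(s, " ,");
while (st.hasMoreTokens()) {
    String token = st.nextToken(); // read each token once, outside the inner loop
    for (String i : A) {
        if (token.equals(i)) { // compare contents with equals, not ==
            // You can keep going from here
        }
    }
}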
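One way to "keep going from here" is sketched below as a complete program. The class name TokenizerWordCount and the method countMatchingTokens are hypothetical. The sketch assumes that spaces and commas are the only separators that matter, reads each token exactly once per loop iteration, and compares strings with equals.

```java
import java.util.StringTokenizer;

class TokenizerWordCount {
    // Counts tokens of text that are equal to one of the given words.
    static int countMatchingTokens(String[] words, String text) {
        // Spaces and commas are delimiters, so "hello," yields the token "hello".
        StringTokenizer st = new StringTokenizer(text, " ,");
        int count = 0;
        while (st.hasMoreTokens()) {
            String token = st.nextToken(); // advance once per iteration
            for (String word : words) {
                if (token.equals(word)) { // compare contents, not references
                    count++;
                    break; // a token can match at most one distinct word
                }
            }
        }
        return count;
    }

    public static void main(String[] args) {
        String[] A = {"hello", "you"};
        String s = "hello, hello you are so wonderful";
        System.out.println(countMatchingTokens(A, s)); // 3
    }
}
```

The inner loop checks every word for every token, so for a large A a HashSet lookup would likely be a better fit than the linear scan.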
This is what I came up with:
It doesn't create any new objects. It uses String.indexOf(String, int), keeps track of the current index, and increments the occurrence count.
public class SearchWordCount {
    public static final void main(String[] ignored) {
        String[] searchWords = {"hello", "you"};
        String input = "hello, hello you are so wonderful";
        for(int i = 0; i < searchWords.length; i++) {
            String searchWord = searchWords[i];
            System.out.print(searchWord + ": ");
            int foundCount = 0;
            int currIdx = 0;
            while(currIdx != -1) {
                currIdx = input.indexOf(searchWord, currIdx);
                if(currIdx != -1) {
                    foundCount++;
                    currIdx += searchWord.length();
                } else {
                    currIdx = -1;
                }
            }
            System.out.println(foundCount);
        }
    }
}
Output:
hello: 2
you: 1