Java: how to parse a string for a specific word
How would I parse for the word "hi" in the sentence "hi, how are you?", or for the word "how" in "how are you?"?
Example of what I want in code:
String word = "hi";
String word2 = "how";
Scanner scan = new Scanner(System.in).useDelimiter("\n");
String s = scan.nextLine();
if(s.equals(word)) {
System.out.println("Hey");
}
if(s.equals(word2)) {
System.out.println("Hey");
}
To just find the substring, you can use contains or indexOf or any other variant:
http://java.sun.com/j2se/1.5.0/docs/api/java/lang/String.html
if( s.contains( word ) ) {
// ...
}
if( s.indexOf( word2 ) >=0 ) {
// ...
}
If you care about word boundaries, then StringTokenizer is probably a good approach.
https://docs.oracle.com/javase/1.5.0/docs/api/java/util/StringTokenizer.html
You can then perform a case-insensitive check (equalsIgnoreCase) on each word.
Looks like a job for regular expressions. contains would give a false positive on, say, "hire-purchase".
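// Pattern has no match() method, and Pattern.matches() only succeeds on a full-string match,
// so compile the pattern and use find() to look for the word anywhere in the string.
if (Pattern.compile("\\bhi\\b").matcher(stringToMatch).find()) { //...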
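To make the difference from contains concrete, here is a minimal sketch of a whole-word search using \b word boundaries. The class name WordBoundaryDemo and the helper containsWord are hypothetical names chosen for illustration, and the case-insensitive flag is an assumption about what the asker wants. Note that \b only behaves as expected when the search word starts and ends with a letter, digit, or underscore.

```java
import java.util.regex.Pattern;

class WordBoundaryDemo {
    // Hypothetical helper: true if `word` occurs in `text` as a whole word, ignoring case.
    static boolean containsWord(String text, String word) {
        // Pattern.quote keeps any regex metacharacters in `word` literal.
        Pattern p = Pattern.compile("\\b" + Pattern.quote(word) + "\\b",
                                    Pattern.CASE_INSENSITIVE);
        // find() looks for a match anywhere in the text, not a full-string match.
        return p.matcher(text).find();
    }

    public static void main(String[] args) {
        // Plain substring search matches the "hi" inside "hire".
        System.out.println("hire-purchase".contains("hi"));
        // The word-boundary search does not.
        System.out.println(containsWord("hire-purchase", "hi"));
        System.out.println(containsWord("Hi, how are you?", "hi"));
    }
}
```

If the same word is searched for many times, compiling the Pattern once and reusing it avoids recompiling on every call.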
I'd go for java.util.StringTokenizer: https://docs.oracle.com/javase/1.5.0/docs/api/java/util/StringTokenizer.html
StringTokenizer st = new StringTokenizer(
"Hi, how are you?",
",.:?! \t\n\r" // whitespace and punctuation as delimiters
);
while (st.hasMoreTokens()) {
if(st.nextToken().equals("Hi")){
//matches "Hi"
}
}
Alternatively, take a look at java.util.regex and use regular expressions.
I'd go for a tokenizer instead. Set spaces and other characters such as commas and full stops as delimiters, and remember to compare in case-insensitive mode.
This way you can find "hi" in "Hi, how is his test going" without getting a false positive on "his" or a false negative on "Hi" (which starts with an uppercase H).
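The tokenizer approach with a case-insensitive comparison could look roughly like this. It is a sketch, not a definitive implementation: the class name TokenizerWordSearch, the helper hasWord, and the exact delimiter set are assumptions made for illustration.

```java
import java.util.StringTokenizer;

class TokenizerWordSearch {
    // Hypothetical helper: true if any token of `sentence` equals `word`, ignoring case.
    static boolean hasWord(String sentence, String word) {
        // Assumed delimiter set: whitespace plus common punctuation.
        StringTokenizer st = new StringTokenizer(sentence, ",.:;?! \t\n\r");
        while (st.hasMoreTokens()) {
            if (st.nextToken().equalsIgnoreCase(word)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // "Hi" matches despite the uppercase H; "his" alone would not match "hi".
        System.out.println(hasWord("Hi, how is his test going", "hi"));
        System.out.println(hasWord("his test", "hi"));
    }
}
```

Any punctuation not listed in the delimiter string (an apostrophe or a hyphen, for example) stays attached to its token, so the set should be adjusted to the input you expect.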