
Check if a word contains a number or special character

I am writing a program to count the total number of valid English words in a text file. In this code, I want to ignore words that contain digits or special characters, e.g. "word123", "123word", "word&&", "$name". Currently my program detects words that start with a number, e.g. "123number". However, it cannot detect "number123". Can anyone tell me how I should move forward? Below is my code:

public int wordCounter(String filePath) throws FileNotFoundException{
    File f = new File(filePath);
    Scanner scanner = new Scanner(f);
    int nonWord = 0;
    int count = 0;
    String regex = "[a-zA-Z].*";

    while(scanner.hasNext()){
        String word = scanner.next();
        if(word.matches(regex)){
            count++;
        }
        else{
            nonWord++;
        }
    }
    return count;
}

Lose the dot:

String regex = "[a-zA-Z]*"; // more correctly "[a-zA-Z]+", but both will work here

The dot means "any character", so "[a-zA-Z].*" accepts anything that merely starts with a letter. What you want is a regex that means "only composed of letters".
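To make the difference concrete, here is a minimal sketch that runs both patterns over the sample tokens from the question. The class name RegexDotDemo and the constant names are made up for illustration. Note that String.matches() only succeeds when the whole token matches the pattern.

```java
public class RegexDotDemo {
    // Pattern from the question: one letter followed by any characters at all.
    static final String ORIGINAL = "[a-zA-Z].*";
    // Corrected pattern: one or more letters and nothing else.
    static final String LETTERS_ONLY = "[a-zA-Z]+";

    public static void main(String[] args) {
        String[] tokens = {"word", "number123", "123number", "word&&", "$name"};
        for (String token : tokens) {
            // matches() tests the entire token, not a substring of it.
            System.out.println(token
                    + " original=" + token.matches(ORIGINAL)
                    + " lettersOnly=" + token.matches(LETTERS_ONLY));
        }
    }
}
```

With the original pattern, "number123" and "word&&" are accepted because they start with a letter; with the letters-only pattern, only "word" is accepted.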

BTW, you can also express this with a Unicode property class (although it is arguably less readable). Unlike [a-zA-Z], it is not limited to ASCII letters:

String regex = "\\p{L}+";

The regex \p{L} (written "\\p{L}" in a Java string literal) means "any letter" in any script, so it also accepts accented and non-Latin letters.
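As a rough illustration of how the two character classes differ, the sketch below compares them on an accented word. The class and constant names are hypothetical, and the accented letter is written as a Unicode escape so the result does not depend on source file encoding.

```java
public class UnicodeLetterDemo {
    // ASCII letters only.
    static final String ASCII_LETTERS = "[a-zA-Z]+";
    // Any Unicode letter (general category L).
    static final String ANY_LETTERS = "\\p{L}+";

    public static void main(String[] args) {
        // "caf\u00e9" is "cafe" with an acute accent on the final e.
        String[] tokens = {"word", "caf\u00e9", "number123"};
        for (String token : tokens) {
            System.out.println(token
                    + " ascii=" + token.matches(ASCII_LETTERS)
                    + " anyLetter=" + token.matches(ANY_LETTERS));
        }
    }
}
```

If only plain English words should count, [a-zA-Z] may be the safer choice; \p{L} is useful when loanwords with accents should be accepted too.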


To extend the expression to allow an apostrophe, which can appear at the start (e.g. 'tis), in the middle (e.g. can't) or at the end (e.g. Jesus'), but not more than once:

String regex = "(?!([^']*'){2})['\\p{L}]+";
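The negative lookahead (?!([^']*'){2}) rejects any token in which two apostrophes can be found, and the rest of the pattern then requires letters and apostrophes only. A small sketch for checking this behaviour follows; the class name ApostropheDemo and the test words are assumptions for illustration.

```java
public class ApostropheDemo {
    // At most one apostrophe, otherwise only Unicode letters.
    static final String WORD_WITH_APOSTROPHE = "(?!([^']*'){2})['\\p{L}]+";

    static boolean isWord(String token) {
        return token.matches(WORD_WITH_APOSTROPHE);
    }

    public static void main(String[] args) {
        String[] tokens = {"'tis", "can't", "Jesus'", "'twasn't", "word123"};
        for (String token : tokens) {
            System.out.println(token + " -> " + isWord(token));
        }
    }
}
```

One edge case to be aware of: a token consisting of a single apostrophe also matches this pattern, so an extra check may be needed if that matters for the count.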

Use the regex ^[a-zA-Z-]+$ to match words. Note that it also accepts hyphens.

public int wordCounter(String filePath) throws FileNotFoundException {
    File f = new File(filePath);
    Scanner scanner = new Scanner(f);
    int nonWord = 0;
    int count = 0;
    // Letters and hyphens only. matches() already tests the whole token,
    // so the ^ and $ anchors are optional here.
    String regex = "^[a-zA-Z-]+$";

    while (scanner.hasNext()) {
        String word = scanner.next();
        if (word.matches(regex)) {
            count++;
        } else {
            nonWord++;
        }
    }
    scanner.close(); // release the file handle
    return count;
}
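For experimenting with different patterns without a file on disk, the counting loop can be separated from the file handling. The following is only a sketch under that assumption; WordCounterSketch and countMatching are hypothetical names, and the pattern is compiled once instead of on every call to String.matches().

```java
import java.util.Scanner;
import java.util.regex.Pattern;

public class WordCounterSketch {
    // Counts whitespace-separated tokens that match the pattern in full.
    static int countMatching(Scanner scanner, Pattern wordPattern) {
        int count = 0;
        while (scanner.hasNext()) {
            if (wordPattern.matcher(scanner.next()).matches()) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        Pattern lettersAndHyphens = Pattern.compile("[a-zA-Z-]+");
        String text = "well-known word word123 123word word&& $name";
        // A Scanner can read from a String as well as from a File.
        try (Scanner scanner = new Scanner(text)) {
            System.out.println(countMatching(scanner, lettersAndHyphens));
        }
    }
}
```

In the sample text only "well-known" and "word" match, so the count is 2. Passing new Scanner(new File(filePath)) instead would reproduce the file-based method above.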
