Given a string "this is high-tech job market in which? we make. careers", I have to remove all special characters except hyphens and count the number of words in the string, so the output should be 10 in this case. I have written the program below, but it did not pass the test cases.
public int countWords(String str) {
    if(str.isEmpty() || str==null)
        return 0;
    String replacedString = str.replaceAll("[^a-zA-Z0-9- ]","");
    String[] arrWords = replacedString.split("\\s+");
    return arrWords.length;
}
You can use the regex [\p{Punct}&&[^-]], where \p{Punct} stands for a punctuation character. If you want to replace everything other than letters, digits, hyphens and whitespace, you can use the regex [^\p{Alnum}\s-], where \p{Alnum} stands for an alphanumeric character.
Demo:
import java.util.Arrays;

public class Main {
    public static void main(String[] args) {
        String str = "this is high-tech job market in which? we make. careers";
        String[] arr = str.replaceAll("[\\p{Punct}&&[^-]]", "").split("\\s+");
        System.out.println(Arrays.toString(arr));
        int count = arr.length;
        System.out.println(count);
    }
}
Output:
[this, is, high-tech, job, market, in, which, we, make, careers]
10
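The demo above only exercises the first regex. As a rough sketch of the second one, [^\p{Alnum}\s-], the snippet below removes everything that is not an ASCII letter, a digit, whitespace or a hyphen. The class name AlnumWordCount and the helper words are made up for illustration and are not part of the original answer.

```java
class AlnumWordCount {
    // Hypothetical helper: drop every character that is not an ASCII letter,
    // digit, whitespace or hyphen, then split on runs of whitespace.
    static String[] words(String str) {
        return str.replaceAll("[^\\p{Alnum}\\s-]", "").split("\\s+");
    }

    public static void main(String[] args) {
        String str = "this is high-tech job market in which? we make. careers";
        String[] arr = words(str);
        System.out.println(String.join(", ", arr));
        System.out.println(arr.length);
    }
}
```

Unlike \p{Punct}, this pattern is a whitelist, so symbols that are not ASCII punctuation are removed as well.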
First, the null / empty condition should be in the opposite order:
if(str==null || str.isEmpty())
Do you understand why? (Hint: Java's || operator short-circuits and evaluates its operands from left to right.)
Additionally, should a standalone "-" (minus) be removed or not?
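To put the null-first check in context, here is a minimal sketch of the asker's method with the condition reordered. The class name WordCounter is hypothetical. Two further details are assumptions that the question does not confirm: the regex keeps all whitespace (\s) rather than only spaces, and input that contains no word characters after cleaning counts as zero words.

```java
class WordCounter {
    static int countWords(String str) {
        // || short-circuits: isEmpty() is only called when str is not null,
        // so a null argument no longer throws a NullPointerException.
        if (str == null || str.isEmpty()) {
            return 0;
        }
        // Keep ASCII letters, digits, whitespace and hyphens; trim so that
        // leading whitespace does not produce an empty first token in split().
        String cleaned = str.replaceAll("[^a-zA-Z0-9\\s-]", "").trim();
        // Assumption: nothing left after cleaning means zero words.
        if (cleaned.isEmpty()) {
            return 0;
        }
        return cleaned.split("\\s+").length;
    }

    public static void main(String[] args) {
        System.out.println(countWords(null));
        System.out.println(countWords("this is high-tech job market in which? we make. careers"));
        System.out.println(countWords("a - b"));
    }
}
```

In this sketch a standalone hyphen still counts as a word, so countWords("a - b") returns 3. Whether that is what the test cases expect is exactly the open question above.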
You can do this to count 10 words from your code: replace every non-word character (\W) with a space, making an exception for the hyphen.
public class Test {
    public static void main(String[] args) {
        String myString = "this is high-tech job market in which? we make. careers";
        // Replace every non-word character except the hyphen with a space.
        myString = myString.replaceAll("[\\W&&[^\\-]]", " ");
        String[] arrWords = myString.split("\\s+");
        System.out.println(arrWords.length);
    }
}
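The advantage of using \W is that it also matches non-ASCII punctuation, because it matches every character outside [a-zA-Z0-9_]. Note that this also covers non-ASCII letters such as é. For example, if you have the characters ‘ „, \p{Punct} won't work, because by default it only covers ASCII punctuation.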
Here is an alternative using Pattern.UNICODE_CHARACTER_CLASS if you need it:
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

public class Test {
    public static void main(String[] args) {
        String myString = "this is high-tech job market ‘ „ in which? we make. careers";
        // Split on any Unicode punctuation except the hyphen, or on whitespace.
        String[] arrWords2 = Pattern.compile("[\\p{Punct}&&[^-]]|\\s", Pattern.UNICODE_CHARACTER_CLASS).split(myString);
        List<String> arrayList = new ArrayList<String>(Arrays.asList(arrWords2));
        // Adjacent delimiters produce empty tokens; drop them before counting.
        arrayList.removeAll(Arrays.asList("",null));
        System.out.println(arrayList);
        System.out.println(arrayList.size());
    }
}
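To compare the three approaches discussed in this thread side by side, here is a small sketch that applies each pattern to the same input. The class name PunctComparison, the helper count and the test string are made up for illustration. Non-ASCII characters are written as Unicode escapes so that the source compiles regardless of file encoding.

```java
import java.util.regex.Pattern;

class PunctComparison {
    // ASCII punctuation only (the default meaning of \p{Punct}), minus the hyphen.
    static final Pattern ASCII_PUNCT = Pattern.compile("[\\p{Punct}&&[^-]]");
    // Everything outside [a-zA-Z0-9_], minus the hyphen.
    static final Pattern NON_WORD = Pattern.compile("[\\W&&[^-]]");
    // Unicode punctuation, minus the hyphen.
    static final Pattern UNICODE_PUNCT =
            Pattern.compile("[\\p{Punct}&&[^-]]", Pattern.UNICODE_CHARACTER_CLASS);

    // Hypothetical helper: replace matches with a space, then count the tokens.
    static int count(Pattern p, String s) {
        String cleaned = p.matcher(s).replaceAll(" ").trim();
        return cleaned.isEmpty() ? 0 : cleaned.split("\\s+").length;
    }

    public static void main(String[] args) {
        // "high-tech ‘ „ naïve?" written with escapes for the non-ASCII characters.
        String s = "high-tech \u2018 \u201E na\u00EFve?";
        System.out.println(count(ASCII_PUNCT, s));   // quotes survive as separate tokens
        System.out.println(count(NON_WORD, s));      // the accented letter is stripped too
        System.out.println(count(UNICODE_PUNCT, s)); // only punctuation is stripped
    }
}
```

On this input the ASCII pattern leaves the two quotation marks as tokens of their own (4 tokens), \W removes them but also splits "naïve" into "na" and "ve" (3 tokens), and the Unicode-aware pattern yields the 2 words a reader would expect. Which behaviour is right depends on whether the input can contain non-ASCII letters.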