I have a problem in my program: the bottleneck is replacing and splitting a String. I need to extract the words from a String into an array.
For example:
I have the String: "This is Ala. Does Ala have a cat? Money-making cat."
I need to get a String tab[] with these results:
tab[0]="This"<br>
tab[1]="is"<br>
tab[2]="Ala" not "Ala."<br>
tab[3]="Does"<br>
....<br>
tab[7]="cat" not "cat?"<br>
tab[8]="Money" not "Money-making"<br>
tab[9]="making"<br>
tab[10]="cat" not "cat." <br>
The words can't contain characters like ",./;!:?- etc. They can contain only English letters.
Currently I'm doing it like this:
s = s.replace(",", " ").replace("!", " ").... ;
String [] tab = s.split("\\s+");
But this is really slow. How can I do it faster in Java?
Regular expressions are your friend.
s = s.replaceAll("[^a-zA-Z]"," ");
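As a rough sketch of how the replace step and the split step could be folded into one pass, the same character class can be used directly as the split delimiter. The class name LetterWords and the method words are made up for this example, and precompiling the Pattern is an assumption about what helps here: String.split with a multi-character regex typically compiles the pattern again on each call, so reusing one Pattern may save time when many strings are processed. Measure it on your own data before relying on it.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

// Hypothetical helper class, named for this example only.
class LetterWords {
    // One or more characters that are not English letters.
    // Compiled once and reused across calls.
    private static final Pattern NON_LETTERS = Pattern.compile("[^a-zA-Z]+");

    static String[] words(String s) {
        String[] parts = NON_LETTERS.split(s);
        // split() keeps an empty first element when the input starts
        // with a delimiter (or is empty), so drop it if present.
        if (parts.length > 0 && parts[0].isEmpty()) {
            return Arrays.copyOfRange(parts, 1, parts.length);
        }
        return parts;
    }

    public static void main(String[] args) {
        String s = "This is Ala. Does Ala have a cat? Money-making cat.";
        // prints [This, is, Ala, Does, Ala, have, a, cat, Money, making, cat]
        System.out.println(Arrays.toString(words(s)));
    }
}
```

Trailing empty strings are already removed by split, so only the leading one needs handling.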
You can split at one or more non-word characters:
String[] parts = str.split("\\W+");
Note: non-word characters are anything other than letters, digits and the underscore (_). If you only want letters, then you would have to go with @Bailey S's answer.
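To make the difference between the two delimiters concrete, here is a small comparison. The class name SplitComparison and the sample input are invented for illustration; the sketch assumes Java's default regex behaviour, where \W matches anything outside [a-zA-Z_0-9].

```java
import java.util.Arrays;

// Hypothetical class, named for this example only.
class SplitComparison {
    // Splits on runs of non-word characters: digits and '_' stay inside words.
    static String[] splitNonWord(String s) {
        return s.split("\\W+");
    }

    // Splits on runs of anything that is not an English letter.
    static String[] splitNonLetter(String s) {
        return s.split("[^a-zA-Z]+");
    }

    public static void main(String[] args) {
        String s = "foo_bar 42 baz";
        // prints [foo_bar, 42, baz]
        System.out.println(Arrays.toString(splitNonWord(s)));
        // prints [foo, bar, baz]
        System.out.println(Arrays.toString(splitNonLetter(s)));
    }
}
```

For the requirement in the question (English letters only), the second form is the one that matches.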
You can use replaceAll. For example: s.replaceAll("[?.,]", "")