I have have a class that check id a phrase is contained in a message, I tried to do it with Matcher
and Pattern
and with String.contains()
, but the results returned are odd.
Here is the class:
public class MotsClesFilter implements EmailFilter {
final String NAME = "Filtrage par mots cles";
/*private Pattern chaineSpam;
private Matcher chaineCourriel;*/
private int nbOccMotSpam;
private byte confidenceLevel;
@Override
public String getFilterName() {
return this.NAME;
}
@Override
public byte checkSpam(MimeMessage message) {
analyze(message);
if(this.nbOccMotSpam==0)
this.confidenceLevel = 1;
else if (this.nbOccMotSpam>0 && this.nbOccMotSpam<2)
this.confidenceLevel = CANT_SAY;
else if (this.nbOccMotSpam>1 && this.nbOccMotSpam<3)
this.confidenceLevel = 50;
else if (this.nbOccMotSpam>3 && this.nbOccMotSpam<4)
this.confidenceLevel = 65;
else if (this.nbOccMotSpam>4 && this.nbOccMotSpam<5)
this.confidenceLevel = 85;
else this.confidenceLevel = 90;
return (getConfidenceLevel());
}
public void analyze(MimeMessage message){
try {
List<String> listeChaines = new ArrayList<String>();
BufferedReader bis = new BufferedReader(new InputStreamReader(new FileInputStream(new File("SpamWords.txt"))));
while(bis.ready()){
String ligne = bis.readLine();
listeChaines.add(ligne);
}
String mail = ((String.valueOf(message.getContent())));
//System.out.println(mail);
for (int j =0; j<listeChaines.size();j++){
//System.out.println(listeChaines.get(j));
Pattern chaineSpam = Pattern.compile(listeChaines.get(j),Pattern.CASE_INSENSITIVE);
Matcher chaineCourriel = chaineSpam.matcher(mail);
if (chaineCourriel.matches())
this.nbOccMotSpam++;
}
} catch (FileNotFoundException e) {
// TODO Auto-generated catch block
e.printStackTrace();
} catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
} catch (MessagingException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
}
@Override
public byte getConfidenceLevel() {
// TODO Auto-generated method stub
return this.confidenceLevel;
}
@Override
public boolean enabled() {
// TODO Auto-generated method stub
return true;
}
}
The results returned by checkSpam
are always 1 if use matches and 90 if I use find, it also returns 90 when I use mail.contains(listeChaines.get(j))
.
That means that the message doesn't match any of the strings in the file, but that there are at least 5 strings in the file that can be found inside the message.
matches()
checks if the whole string matches the pattern. Not if some substring matches it.
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.