Java 8: How to list files that don't match pattern using regex

For some reason, my code prints ALL files when I want it to print only the files that DON'T match my regex pattern. I need it to print the files that don't match the pattern because I don't know all the possible inconsistencies in the file naming. I checked my regex pattern on regex101 and it is correct. I am not a coder; I am a psychology student working on a large database.

I've tried turning the Pattern into a list of patterns, and I tried storing patternList.matcher(file.getName()) in its own Matcher variable.

    private static void checkFolder(File root, Pattern patternList) {
        for(File file : root.listFiles())

        if(file.isFile()){

            if(patternList.matcher(file.getName()).matches())
                checkFolder(file, patternList);
            else 
                System.out.println(file); //print if it does not match
        }

For example, if my code looks at these file names:

  • 95F Front Anger.BW
  • 95F.Front.Anger.C.Micro
  • 95F.Front.Fear.C.Micro
  • 95F.Front.Frown.BW

And my regex is this:

    Pattern patternList = Pattern.compile("((\\d{1,3}(F|M)\\.(Front|Profile|Right)"
    +"\\.(Anger|Fear|Frown|Smile)\\.(BW\\.Micro|BW|C\\.Micro|C)))|"
    +"(\\d{1,3}(F|M)\\.(Front|Profile|Right)\\.(Neutral|Smile)\\."
    +"(C\\.Micro|C|BW\\.Micro|BW|HighLight|LowLight|MedLight)\\.(BW\\.Micro|BW|C\\.Micro|C))|"
    +"(\\d{1,3}(F|M)\\.(Selfie1|Selfie2|StudentID)\\.(C\\.Micro|C|BW\\.Micro|BW))");

My code should print only 95F Front Anger.BW, because it has spaces instead of dots, but it still prints all four file names.

I also tried doing this:

    private static void checkFolder(File root, Pattern patternList) {
    for(File file : root.listFiles())

        if(file.isFile()){

            if(patternList.matcher(file.getName()).matches()){
                 checkFolder(file, patternList);  //call checkfolder if the filename matches the pattern

            }
            else if(!patternList.matcher(file.getName()).matches())
            {
               System.out.println(file); //print the file that doesnt match the regex
            }

        }       

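Untested, but I'm guessing you want something like this, assuming you only want to print the files that do not match the pattern. Note that your version calls checkFolder on a regular file whenever its name matches and never descends into directories; the recursive call belongs in the directory branch: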

    private static void checkFolder(File dir, Pattern patternList) {
        File[] entries = dir.listFiles();
        if (entries == null) {
            return; // dir is not a directory, or it could not be read
        }
        for (File file : entries) {
            if (file.isFile()) {
                // only check the pattern against files, not directories
                if (!patternList.matcher(file.getName()).matches()) {
                    System.out.println(file); // name does not match: report it
                }
            } else {
                // recurse into every sub-directory
                checkFolder(file, patternList);
            }
        }
    }

If you wanted to do something with the results other than just print them, you could collect them into a List.
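A minimal sketch of that idea, in case it helps. The class name NonMatchingCollector, the method collectNonMatching, and the shortened pattern in main are made up for illustration; in practice you would pass in your own full pattern.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

class NonMatchingCollector {

    // Recursively gathers every regular file under dir whose name
    // does NOT fully match the pattern.
    static List<File> collectNonMatching(File dir, Pattern pattern) {
        List<File> result = new ArrayList<>();
        File[] entries = dir.listFiles(); // null if dir is not a readable directory
        if (entries == null) {
            return result;
        }
        for (File entry : entries) {
            if (entry.isDirectory()) {
                // merge the results found in the sub-directory
                result.addAll(collectNonMatching(entry, pattern));
            } else if (!pattern.matcher(entry.getName()).matches()) {
                result.add(entry);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Shortened stand-in pattern, for illustration only.
        Pattern pattern = Pattern.compile("\\d{1,3}[FM]\\.Front\\.(Anger|Fear)\\.(BW|C)");
        File root = new File(args.length > 0 ? args[0] : ".");
        for (File file : collectNonMatching(root, pattern)) {
            System.out.println(file);
        }
    }
}
```

Returning the list instead of printing inside the loop leaves it up to the caller whether to print, count, or rename the offending files.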

(And yes, to be pedantically complete: recursion is not the best solution if you expect to traverse very deep directory trees. It can be replaced by a loop with an explicit stack, at the cost of some extra complexity.)
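For what it's worth, the loop-with-a-stack variant could look roughly like this. Again only a sketch: IterativeChecker and findNonMatching are hypothetical names, and the pattern is whatever you compile yourself.

```java
import java.io.File;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.regex.Pattern;

class IterativeChecker {

    // Walks the tree below root without recursion: directories still to be
    // visited are kept on an explicit stack.
    static List<File> findNonMatching(File root, Pattern pattern) {
        List<File> result = new ArrayList<>();
        Deque<File> pending = new ArrayDeque<>();
        pending.push(root);
        while (!pending.isEmpty()) {
            File dir = pending.pop();
            File[] entries = dir.listFiles(); // null if dir cannot be listed
            if (entries == null) {
                continue;
            }
            for (File entry : entries) {
                if (entry.isDirectory()) {
                    pending.push(entry); // visit later
                } else if (!pattern.matcher(entry.getName()).matches()) {
                    result.add(entry);
                }
            }
        }
        return result;
    }
}
```

The order in which files are reported differs from the recursive version, but the set of reported files is the same.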

Your expression works fine as written. If names that use whitespace instead of dots should also be accepted, you could replace \\. with something like:

(?:\.|\s+)

or

\s*\.?

or

[.\s]

or

[. ]

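wherever the separator may vary. Note that these are not equivalent: \s*\.? also accepts no separator at all, while [.\s] and [. ] accept exactly one character. Applied to the first alternative only, the expression becomes: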

((\d{1,3}([FM])(?:\.|\s+)(Front|Profile|Right)(?:\.|\s+)(Anger|Fear|Frown|Smile)(?:\.|\s+)(BW\.Micro|BW|C\.Micro|C)))|(\d{1,3}(F|M)\.(Front|Profile|Right)\.(Neutral|Smile)\.(C\.Micro|C|BW\.Micro|BW|HighLight|LowLight|MedLight)\.(BW\.Micro|BW|C\.Micro|C))|(\d{1,3}(F|M)\.(Selfie1|Selfie2|StudentID)\.(C\.Micro|C|BW\.Micro|BW))

Test

    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    public class RegexTest {
        public static void main(String[] args) {
            final String regex = "((\\d{1,3}([FM])(?:\\.|\\s+)(Front|Profile|Right)(?:\\.|\\s+)(Anger|Fear|Frown|Smile)(?:\\.|\\s+)(BW\\.Micro|BW|C\\.Micro|C)))|(\\d{1,3}(F|M)\\.(Front|Profile|Right)\\.(Neutral|Smile)\\.(C\\.Micro|C|BW\\.Micro|BW|HighLight|LowLight|MedLight)\\.(BW\\.Micro|BW|C\\.Micro|C))|(\\d{1,3}(F|M)\\.(Selfie1|Selfie2|StudentID)\\.(C\\.Micro|C|BW\\.Micro|BW))";
            final String string = "95F    Front   Anger   BW\n"
                 + "95F Front Anger BW\n"
                 + "95F Front Anger.BW\n"
                 + "95F.Front.Anger.C.Micro\n"
                 + "95F.Front.Fear.C.Micro\n"
                 + "95F.Front.Frown.BW";

            // MULTILINE has no effect here because the regex contains no ^ or $
            final Pattern pattern = Pattern.compile(regex, Pattern.MULTILINE);
            final Matcher matcher = pattern.matcher(string);

            // find() reports every match inside the input, one after another
            while (matcher.find()) {
                System.out.println("Full match: " + matcher.group(0));
                for (int i = 1; i <= matcher.groupCount(); i++) {
                    // groups that did not take part in the match print as null
                    System.out.println("Group " + i + ": " + matcher.group(i));
                }
            }
        }
    }
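One thing to keep in mind when comparing this test with the folder check: the test uses find(), which succeeds if the pattern occurs anywhere in the input, while the folder check uses matches(), which requires the whole file name to match. A small sketch of the difference, using a shortened stand-in pattern (the class FindVersusMatches, its method names, and the .jpg example are assumptions made for illustration):

```java
import java.util.regex.Pattern;

class FindVersusMatches {

    // Shortened stand-in for the full pattern, for illustration only.
    static final Pattern P =
        Pattern.compile("\\d{1,3}[FM]\\.Front\\.(Anger|Fear|Frown)\\.(BW|C)");

    // true if the pattern occurs somewhere inside the name
    static boolean foundAnywhere(String name) {
        return P.matcher(name).find();
    }

    // true only if the entire name matches the pattern
    static boolean wholeNameMatches(String name) {
        return P.matcher(name).matches();
    }

    public static void main(String[] args) {
        String[] names = {
            "95F.Front.Anger.BW",
            "95F.Front.Anger.BW.jpg",
            "95F Front Anger.BW"
        };
        for (String name : names) {
            System.out.println(name
                + " find=" + foundAnywhere(name)
                + " matches=" + wholeNameMatches(name));
        }
    }
}
```

With this stand-in pattern, a name with a trailing extension such as 95F.Front.Anger.BW.jpg is found by find() but rejected by matches(), so if your real files carry extensions, every one of them would be reported as non-matching.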

RegEx Circuit

jex.im visualizes regular expressions.

The expression is explained in the top-right panel of regex101.com, if you wish to explore, simplify, or modify it, and in this link you can watch how it matches against some sample inputs.


Alternatively, a looser expression along these lines might also work:

^(\d{1,3})\s*([FM])\s*\.?(\w+)\s*\.?(\w+)\s*\.?(\w+\.\w+|\w+)\s*$

Note that it accepts any word characters in each position rather than the fixed list of labels, so I am not sure it is strict enough.
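To see what this looser expression captures, here is a small sketch. LooseNameParser and parse are hypothetical names; the pattern string is the expression above with the backslashes doubled for a Java string literal.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LooseNameParser {

    // The loose expression from above, escaped for a Java string literal.
    static final Pattern LOOSE = Pattern.compile(
        "^(\\d{1,3})\\s*([FM])\\s*\\.?(\\w+)\\s*\\.?(\\w+)\\s*\\.?(\\w+\\.\\w+|\\w+)\\s*$");

    // Returns the five captured parts, or null when the name does not fit at all.
    static String[] parse(String name) {
        Matcher m = LOOSE.matcher(name);
        if (!m.matches()) {
            return null;
        }
        String[] parts = new String[m.groupCount()];
        for (int i = 1; i <= m.groupCount(); i++) {
            parts[i - 1] = m.group(i);
        }
        return parts;
    }

    public static void main(String[] args) {
        String[] names = {
            "95F.Front.Anger.C.Micro",
            "95F Front Anger.BW",
            "95F.Front.Unknown.BW",
            "Front.Anger.BW"
        };
        for (String name : names) {
            System.out.println(name + " -> " + Arrays.toString(parse(name)));
        }
    }
}
```

Because each part is just \w+, a name such as 95F.Front.Unknown.BW is accepted as well, so the captured parts would still have to be compared against the allowed labels in a separate step.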

DEMO 2


To negate the pattern inside the regex itself, we can wrap it in a negative lookahead. The $ at the end of the lookahead makes sure that a name which merely starts with a valid name is still reported:

^(?!(?:((\d{1,3}([FM])\.(Front|Profile|Right)\.(Anger|Fear|Frown|Smile)\.(BW\.Micro|BW|C\.Micro|C)))|(\d{1,3}(F|M)\.(Front|Profile|Right)\.(Neutral|Smile)\.(C\.Micro|C|BW\.Micro|BW|HighLight|LowLight|MedLight)\.(BW\.Micro|BW|C\.Micro|C))|(\d{1,3}(F|M)\.(Selfie1|Selfie2|StudentID)\.(C\.Micro|C|BW\.Micro|BW)))$).*$
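In Java, a negative lookahead is only one way to get there; negating the result of matches() on the original pattern gives the same answer as long as the lookahead contains an end anchor. A sketch under those assumptions, with a shortened stand-in pattern and hypothetical names (NegatedPattern, isInvalidByLookahead, isInvalidByNegation):

```java
import java.util.regex.Pattern;

class NegatedPattern {

    // Shortened stand-in for the full pattern, for illustration only.
    static final String VALID = "\\d{1,3}[FM]\\.Front\\.(Anger|Fear|Frown)\\.(BW|C)";

    static final Pattern VALID_PATTERN = Pattern.compile(VALID);

    // Negative lookahead; the $ inside it requires the valid name to span
    // the whole input before the name is considered valid.
    static final Pattern INVALID_PATTERN =
        Pattern.compile("^(?!(?:" + VALID + ")$).*$");

    // Variant 1: the regex itself matches only invalid names.
    static boolean isInvalidByLookahead(String name) {
        return INVALID_PATTERN.matcher(name).matches();
    }

    // Variant 2: keep the original pattern and negate the result in Java.
    static boolean isInvalidByNegation(String name) {
        return !VALID_PATTERN.matcher(name).matches();
    }

    public static void main(String[] args) {
        String[] names = {
            "95F.Front.Anger.BW",
            "95F.Front.Anger.BW.jpg",
            "95F Front Anger.BW"
        };
        for (String name : names) {
            System.out.println(name
                + " lookahead=" + isInvalidByLookahead(name)
                + " negation=" + isInvalidByNegation(name));
        }
    }
}
```

Variant 2 is usually easier to read and maintain, since the validating pattern stays unchanged.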

DEMO 3


Or, possibly, a slightly simplified version:

DEMO 4
