Perl: Help writing a regular expression

I am trying to write a single regular expression that handles the following 3 cases:

  • Supernatural_S07E23_720p_HDTV_X264-DIMENSION.mkv
  • the.listener.313.480p.hdtv.x264-2hd.mkv
  • How.I.met.your.mother.s02e07.hdtv.x264-xor.avi

The regular expression should remove the series name from the original string, i.e. the output for the strings above should be:

  • S07E23_720p_HDTV_X264-DIMENSION.mkv
  • 313.480p.hdtv.x264-2hd.mkv
  • s02e07.hdtv.x264-xor.avi

For the basic case of the Supernatural string I wrote the regex below, and it worked fine. But as soon as the series name contains multiple words, it fails.

$string =~ s/^(.*?)[\.\_\- ]//i; #delimiter can be (. - _ )

So I have no idea how to proceed for the cases above. I was thinking along the lines of \\w+{1,6}, but that also failed to do what I need.

PS: An explanation of what the regular expression is doing would be appreciated.

You can check whether the token after each '.' contains a digit; if it does not, treat it as part of the name.

HOWEVER, I personally think there is no perfect solution for this. It would still run into problems with something like:

24.313.480p.hdtv.x264-2hd.mkv            // 24
Warehouse.13.s02e07.hdtv.x264-xor.avi    // warehouse 13
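
The token check described above could be sketched in Perl roughly like this. It is only an illustration: the helper name strip_name_by_tokens is made up here, and the delimiter set (. _ - and space) is taken from the question.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the answer): drop leading tokens until
# one of them contains a digit.
sub strip_name_by_tokens {
    my ($file) = @_;

    # The capturing group makes split return the delimiters too, so the
    # remainder can be rejoined exactly as it appeared in the file name.
    my @parts = split /([._ -])/, $file;

    # @parts alternates token, delimiter, token, ...
    while (@parts && $parts[0] !~ /\d/) {
        shift @parts;              # a name token
        shift @parts if @parts;    # the delimiter after it
    }
    return join '', @parts;
}

print strip_name_by_tokens($_), "\n" for qw(
    Supernatural_S07E23_720p_HDTV_X264-DIMENSION.mkv
    the.listener.313.480p.hdtv.x264-2hd.mkv
    How.I.met.your.mother.s02e07.hdtv.x264-xor.avi
    24.313.480p.hdtv.x264-2hd.mkv
    Warehouse.13.s02e07.hdtv.x264-xor.avi
);
```

The three names from the question come out as requested. The last two show the problem: the "24" name is returned unchanged because its very first token already contains a digit, and the "Warehouse 13" name keeps "13." in front of the episode part.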

As StanleyZ said, you'll always get into trouble with names containing numbers.

But if you set these special cases aside, you can try:

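#perl
use strict;
use warnings;

$\ = $/;    # make every print end with a newline

# Loop over the sample names; each one is in $_ inside the block.
for (qw!
    Supernatural_S07E23_720p_HDTV_X264-DIMENSION.mkv
    the.listener.313.480p.hdtv.x264-2hd.mkv
    How.I.met.your.mother.s02e07.hdtv.x264-xor.avi
  !) {
    # $1 = series name, $2 = episode token and everything after it
    if (/^([\w.]+)[._]([SE\d]+[._].*)$/i) {
        print "Match : Name='$1'        Suffix='$2'";
    } else {
        print "Did not match $_";
    }
}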

which outputs:

Match : Name='Supernatural'     Suffix='S07E23_720p_HDTV_X264-DIMENSION.mkv'
Match : Name='the.listener'     Suffix='313.480p.hdtv.x264-2hd.mkv'
Match : Name='How.I.met.your.mother'     Suffix='s02e07.hdtv.x264-xor.avi'
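
Since the question asks for an explanation, here is the same idea written as a substitution with the /x flag so that each piece can carry a comment. This is a sketch rather than a drop-in solution: the helper name strip_series_name is made up, and a lookahead is used in place of the second capture group so that only the name and its delimiter are deleted.

```perl
use strict;
use warnings;

# Hypothetical helper: remove the series name, keep the rest of the file name.
sub strip_series_name {
    my ($file) = @_;
    (my $rest = $file) =~ s{
        ^ [\w.]+        # series name: letters, digits, underscores, dots (greedy)
        [._]            # the delimiter that ends the name
        (?= [SE\d]+     # lookahead: an episode token such as S07E23 or 313 ...
            [._] )      # ... which must itself be followed by a delimiter
    }{}xi;              # x: allow whitespace and comments, i: ignore case
    return $rest;       # unchanged when the pattern does not match
}

print strip_series_name($_), "\n" for qw(
    Supernatural_S07E23_720p_HDTV_X264-DIMENSION.mkv
    the.listener.313.480p.hdtv.x264-2hd.mkv
    How.I.met.your.mother.s02e07.hdtv.x264-xor.avi
);
```

Because [\w.]+ is greedy, the engine first grabs as much as it can and then backs off until what follows looks like an episode token, so the name ends at the last place where that is true. If a file name has two all-digit tokens in a row after the name, the first one would be swallowed as part of the name, so titles with numbers stay ambiguous.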

Note: aren't you doing something illegal? ;)
