How to extract specific parts of the path and filename in Linux

Question

My current task is to rename a large number of files across multiple directories, giving them different identifiers.

I have several directories such as b01, b02, b03, etc. Each directory contains files with names such as img01.23495.png, img01.3596596.png, img02.2399495.png, etc.

I have to rename the img01 files in b01 to some other identifier, so the new identifier depends on both the directory name and the first part of the filename.

My plan for the pipeline is this: get all the png filenames, extract which folder each one is in, extract the img## part, and store the information in a file, so I'd get a file with something like:

b01 img01
b01 img02
b02 img01
...

This is useful because I can afterwards add the new identifier as a third column, then read the file back in to perform the actual renaming.
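For reference, a minimal sketch of what that final renaming step might look like. Everything here is an assumption for illustration: the helper name rename_from_map, the mapping file name map.txt, and the choice to replace only the img## prefix while keeping the numeric part and extension.

```shell
#!/bin/sh
# rename_from_map is a hypothetical helper, not part of any tool.
#   $1: mapping file with three columns: <directory> <prefix> <new identifier>
#   $2: root directory to search
rename_from_map() {
  while read -r dir prefix newid; do
    # Skip blank lines and rows that have no third column yet.
    [ -n "$newid" ] || continue
    # Find files such as .../b01/img01.23495.png for this row.
    find "$2" -type f -path "*/$dir/$prefix.*.png" |
      while IFS= read -r path; do
        base=${path##*/}        # e.g. img01.23495.png
        rest=${base#"$prefix"}  # e.g. .23495.png
        mv -- "$path" "${path%/*}/$newid$rest"
      done
  done < "$1"
}

# Example call (assumed names):
#   rename_from_map map.txt ./images
```

This sketch assumes filenames without newlines and new identifiers that do not collide with existing prefixes; if one row's new identifier equals another row's old prefix, files could be renamed twice or overwritten.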

Currently, I have paths such as ./images/something/b01/img01.2342394.png.

I'm stuck on the sed part, however. Any suggestions on how to do what I'm trying to do are welcome as well.

Answer 1

find . -name "*.png" | sed 's#^.*/\([^/]*\)/\([^/.]*\)\.[0-9]\+\.png$#\1 \2#' | sort -u

Sorry, I can't fully test this: I'm at work and stuck on OS X, whose sed has some quirks. Anyway, the core of the solution (besides using the -name test for find and the -u flag for sort) is the sed regular expression. You seem to have a handle on it, but I'll explain the whole thing in case anyone else finds this:

s - Search and Replace
  # - Delimiter (Search pattern)
    ^ - Beginning of a line
    . - Any character
    * - zero or more times
    / - a literal '/'
    \( - start a capturing group
      [^/]* - Any character except '/', zero or more times
    \) - End capturing group (#1)
    / - a literal '/'
    \( - start a capturing group
      [^/.]* - Any character except '/' or '.', zero or more times
    \) - End capturing group (#2)
    \. - a literal '.'
    [0-9] - a digit
    \+ - one or more times
    \.png - a literal '.png'
    $ - end of the line
  # - Delimiter, now starting the replace pattern
    \1 - the contents of the first capturing group
       - a space
    \2 - the contents of the second capturing group
  # - Delimiter.  End of all patterns.
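To see the breakdown above in action without touching real files, here is a small sketch that feeds made-up sample paths through the same expression. Two things are assumptions of this sketch rather than part of the original command: the helper name extract_pairs is hypothetical, and `[0-9]\+` is written as `[0-9][0-9]*`, since `\+` is a GNU extension in basic regular expressions and the longer form should also work with other sed implementations.

```shell
#!/bin/sh
# extract_pairs is a hypothetical helper: it reads paths on stdin and
# prints each unique "<directory> <prefix>" pair once.
extract_pairs() {
  sed 's#^.*/\([^/]*\)/\([^/.]*\)\.[0-9][0-9]*\.png$#\1 \2#' | sort -u
}

# Made-up sample paths standing in for the output of: find . -name "*.png"
printf '%s\n' \
  './images/something/b01/img01.2342394.png' \
  './images/something/b01/img01.3596596.png' \
  './images/something/b01/img02.2399495.png' \
  './images/something/b02/img01.23495.png' |
  extract_pairs
# Prints:
#   b01 img01
#   b01 img02
#   b02 img01
```

Note that a line which does not match the pattern (for example a .png without the numeric part) is passed through by sed unchanged, so it is worth checking the output file for stray full paths.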
