Extract a date from a file name in Unix using shell scripting

I am working on a shell script and want to extract the date from a file name.

The file name is: abcd_2014-05-20.tar.gz

The date I want to extract from it is: 2014-05-20

echo abcd_2014-05-20.tar.gz | grep -Eo '[[:digit:]]{4}-[[:digit:]]{2}-[[:digit:]]{2}'

Output:

2014-05-20

Here grep reads its input from echo through a pipe. If the strings are stored in a file, you can pass that file to grep (or pipe it in with cat) instead.

-E Interpret PATTERN as an extended regular expression.

-o Show only the part of a matching line that matches PATTERN.

[[:digit:]] Matches a single digit.

{N} Matches exactly N repetitions of the preceding item, i.e. 4 digits for the year and 2 each for the month and the day.

Most importantly, the pattern does not depend on separators such as "_" and ".", which is why it is the most flexible solution.
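Inside a script you usually want the match in a variable rather than on the terminal. The following is a minimal sketch of that, assuming each file name contains at most one date of interest. The function name extract_date and the variable file_date are hypothetical, and head -n 1 is only there to keep the first match.

```shell
# extract_date is a hypothetical helper name, not part of the answer above.
extract_date() {
    # Print the first YYYY-MM-DD shaped substring of the argument, if any.
    printf '%s\n' "$1" |
        grep -Eo '[[:digit:]]{4}-[[:digit:]]{2}-[[:digit:]]{2}' |
        head -n 1
}

file_date=$(extract_date 'abcd_2014-05-20.tar.gz')
echo "$file_date"
```

If the name contains no date, the function prints nothing, so the variable ends up empty and can be tested with [ -z "$file_date" ].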

Using awk with a custom field separator, it is quite simple:

echo 'abcd_2014-05-20.tar.gz' | awk -F '[_.]' '{print $2}'
2014-05-20
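Note that $2 only works when the part before the date contains no "_" or "." of its own. As a hedged variation, assuming the name always ends in .tar.gz, you can count fields from the end instead. The file name my_backup_2014-05-20.tar.gz and the variable names below are made up for illustration.

```shell
# Hypothetical file name with an extra "_" in the prefix.
name='my_backup_2014-05-20.tar.gz'

# Fields are: my, backup, 2014-05-20, tar, gz. The date is the third from the end.
awk_date=$(echo "$name" | awk -F '[_.]' '{print $(NF-2)}')
echo "$awk_date"
```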

Use grep:

$ ls -1 abcd_2014-05-20.tar.gz | grep -oP '[\d]+-[\d]+-[\d]+'
2014-05-20
  • -o causes grep to print only the matching part
  • -P interprets the pattern as a Perl-compatible regular expression (a GNU grep option)
  • [\d]+-[\d]+-[\d]+ : matches three runs of one or more digits separated by dashes, which is the shape of your date.

I would use a regular expression with the grep command, depending on how your file names are constructed.

If the date always follows the "_" character, I would use something like this:

ls -l | grep '_[REGEXP]'

Here REGEXP is a regular expression that matches your date format.

Take a look here: http://www.linuxnix.com/2011/07/regular-expressions-linux-i.html

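There are multiple ways you could do it:

echo abcd_2014-05-20.tar.gz | sed -n 's/.*_\(.*\)\.tar\.gz/\1/p'

sed captures the text between the last "_" and ".tar.gz" and prints it. The dots are escaped so that they match literal dots rather than any character.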

Another way:

filename=abcd_2014-05-20.tar.gz
temp=${filename#*_}
date=${temp%.tar.gz}

Here temp holds the part of the file name after the first "_", i.e. 2014-05-20.tar.gz. You can then extract the date by removing .tar.gz from the end.
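Parameter expansion does not check that what is left actually looks like a date. A small sketch of how one might guard against that with a case pattern, using only shell built-ins, follows. The variable name date_part is my own choice (used instead of date so it is not confused with the date command), and the error message is illustrative.

```shell
filename='abcd_2014-05-20.tar.gz'
temp=${filename#*_}        # drop everything up to and including the first "_"
date_part=${temp%.tar.gz}  # drop the trailing ".tar.gz"

# Accept the result only if it has the shape NNNN-NN-NN.
case $date_part in
    [0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9])
        echo "$date_part"
        ;;
    *)
        echo "no date found in $filename" >&2
        ;;
esac
```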

Here are a few more examples:

  1. Using the cut command (like awk, cut is easy to read)
echo "abcd_2014-05-20.tar.gz" | cut -d "_" -f2 | cut -d "." -f1

Output is:

2014-05-20
  2. Using the grep command
echo "abcd_2014-05-20.tar.gz" | grep -Eo "[0-9]{4}-[0-9]{2}-[0-9]{2}"

Output is:

2014-05-20

Another advantage of the grep approach is that it also extracts multiple dates from one name, like this:

echo "ab2014-15-12_cd_2014-05-20.tar.gz" | grep -Eo "[0-9]{4}-[0-9]{2}-[0-9]{2}"

Output is:

2014-15-12
2014-05-20
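The pattern only checks the shape of the string, so 2014-15-12 is printed even though 15 is not a valid month. If that matters, a stricter pattern can restrict the month to 01-12 and the day to 01-31. This is only a sketch: it still accepts impossible combinations such as 2014-02-31, and the variable name valid_dates is hypothetical.

```shell
# Month limited to 01-12, day limited to 01-31 (no per-month day check).
valid_dates=$(echo "ab2014-15-12_cd_2014-05-20.tar.gz" |
    grep -Eo '[0-9]{4}-(0[1-9]|1[0-2])-(0[1-9]|[12][0-9]|3[01])')
echo "$valid_dates"
```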
