
BASH shell scripting file parsing [newbie]

I am trying to write a bash script that goes through a file line by line (ignoring the header), extracts a file name from the beginning of each line, and then finds a file by this name in one directory and moves it to another directory. I will be processing hundreds of these files in a loop and moving over a million individual files. A sample of the file is:

ImageFileName    Left_Edge_Longitude    Right_Edge_Longitude   Top_Edge_Latitude  Bottom_Edge_Latitude

21088_82092.jpg:  -122.08007812500000  -122.07733154296875    41.33763821961143    41.33557596965434

21088_82093.jpg:  -122.08007812500000  -122.07733154296875    41.33970040427444    41.33763821961143

21088_82094.jpg:  -122.08007812500000  -122.07733154296875    41.34176252364274    41.33970040427444

I would like to ignore the first line and then grab 21088_82092.jpg as a variable. File names may not always be the same length, but they will always have the format digits_digits.jpg

Any help for an efficient approach is much appreciated.

This should get you started:

$ tail -n +2 input | cut -d: -f1 | while IFS= read -r file; do test -f "$dir/$file" && mv -v "$dir/$file" "$destination"; done
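The one-liner above can also be written as a small script, which may be easier to reuse when looping over many listing files. This is only a sketch under stated assumptions: the function name `move_listed_files` and the throwaway demo directory are made up for illustration, and the pattern check simply enforces the digits_digits.jpg format described in the question.

```shell
#!/usr/bin/env bash
# Move every file named in the first column of a listing from src to dst.
# move_listed_files is a hypothetical helper name, not part of the original answer.
move_listed_files() {
    local list=$1 src=$2 dst=$3 name
    local re='^[0-9]+_[0-9]+\.jpg$'
    # tail skips the header line; cut keeps the text before the first colon.
    tail -n +2 "$list" | cut -d: -f1 | while IFS= read -r name; do
        # Skip blank lines and anything that is not digits_digits.jpg.
        [[ $name =~ $re ]] || continue
        # Only move files that actually exist in the source directory.
        if [[ -f $src/$name ]]; then
            mv -- "$src/$name" "$dst/"
        fi
    done
}

# Demo in a throwaway directory so nothing real is touched.
demo=$(mktemp -d)
mkdir "$demo/src" "$demo/dst"
touch "$demo/src/21088_82092.jpg" "$demo/src/21088_82093.jpg" "$demo/src/99999_1.jpg"
printf '%s\n' \
    'ImageFileName    Left_Edge_Longitude    Right_Edge_Longitude' \
    '21088_82092.jpg:  -122.08007812500000  -122.07733154296875' \
    '' \
    '21088_82093.jpg:  -122.08007812500000  -122.07733154296875' \
    > "$demo/list.txt"

move_listed_files "$demo/list.txt" "$demo/src" "$demo/dst"
ls "$demo/dst"
```

To process many listing files, the function could be called in a loop such as `for list in lists/*.txt; do move_listed_files "$list" "$dir" "$destination"; done`, where `lists/` is a hypothetical directory holding the listings.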

You can generate a script that does something like this, then simply run that script. The following command produces a script that copies the files from one place to another; you can make the generated script more complex simply by changing the awk output:

pax:~$ cat qq.in
ImageFileName     Left_Edge_Longitude  Right_Edge_Longitude
21088_82092.jpg:  -122.08007812500000  -122.07733154296875
21088_82093.jpg:  -122.08007812500000  -122.07733154296875
21088_82094.jpg:  -122.08007812500000  -122.07733154296875

pax:~$ awk -F: '/^[0-9]+_[0-9]+\.jpg:/ {
        printf "cp /srcdir/%s /dstdir\n",$1
    }' qq.in

cp /srcdir/21088_82092.jpg /dstdir
cp /srcdir/21088_82093.jpg /dstdir
cp /srcdir/21088_82094.jpg /dstdir

Redirect the output of that command (the last three lines above) to another file; that file is then your script for doing the actual copies.
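The capture-and-run step might look like the following sketch. It generates `mv` commands rather than `cp`, since the question asks for a move. The file name `moves.sh`, the demo directory, and the `src`/`dst` awk variables are assumptions made for illustration. The generated commands wrap each path in double quotes, which assumes the paths contain no `"`, `$`, or backtick characters.

```shell
#!/usr/bin/env bash
# Demo in a throwaway directory; src, dst and list.txt stand in for the real paths.
demo=$(mktemp -d)
mkdir "$demo/src" "$demo/dst"
touch "$demo/src/21088_82092.jpg" "$demo/src/21088_82093.jpg"
printf '%s\n' \
    'ImageFileName     Left_Edge_Longitude  Right_Edge_Longitude' \
    '21088_82092.jpg:  -122.08007812500000  -122.07733154296875' \
    '21088_82093.jpg:  -122.08007812500000  -122.07733154296875' \
    > "$demo/list.txt"

# Generate one mv command per matching line and save them as a script.
awk -F: -v src="$demo/src" -v dst="$demo/dst" '
    /^[0-9]+_[0-9]+\.jpg:/ { printf "mv -- \"%s/%s\" \"%s/\"\n", src, $1, dst }
' "$demo/list.txt" > "$demo/moves.sh"

cat "$demo/moves.sh"     # review the generated commands before running them
sh "$demo/moves.sh"      # then execute the generated script
```

Keeping generation and execution separate leaves room to inspect the commands first, which can be reassuring before moving a large number of files.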
