I posted this question here on purpose, although in my case it is LaTeX related. The problem itself is generic, though.
I've got several *.tex source files containing references to images, e.g. image1.jpg and image2.png. I want to search all source files for a specified set of extensions (in this case jpg and png) and replace them with their pdf counterparts, so that I end up with the references image1.pdf and image2.pdf. To complicate the problem, I've got a list of 20 files (image3.jpg, image4.png, etc.) that I don't want changed.
Is there a simple solution out there (sed-based or any tool suggestions?) which might help? I'm no regular expressions guru, though. ;)
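An easier way is to temporarily change the case of the extensions you want to keep, run the general replacement, and then change the case back:
image3.jpg, image4.png --> image3.JPG, image4.PNG
*.jpg, *.png --> *.pdf
image3.JPG, image4.PNG --> image3.jpg, image4.png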
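The three steps above can be sketched as a single sed invocation. This is only a sketch: the function name swap_case_convert is made up for illustration, image3.jpg and image4.png stand in for the full list of 20 protected names, and it assumes that no reference in the sources already uses an upper-case extension.

```shell
#!/bin/sh
# Hypothetical helper: reads LaTeX source on stdin, writes the result to stdout.
# sed applies the -e expressions in order to every input line.
swap_case_convert() {
  sed -e 's/image3\.jpg/image3.JPG/g' \
      -e 's/image4\.png/image4.PNG/g' \
      -e 's/\.jpg/.pdf/g' \
      -e 's/\.png/.pdf/g' \
      -e 's/image3\.JPG/image3.jpg/g' \
      -e 's/image4\.PNG/image4.png/g'
}
```

Usage would be something like swap_case_convert < your.tex > output.tex. Note that the middle two rules replace every ".jpg" and ".png" in the text, not only those inside image references, so it is worth checking the result with diff.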
I'd suggest using find to get the file list, filtering it with grep -v to remove the files you don't want changed, and then using xargs to run sed -r -i 's/image([12])\.(jpg|png)/image\1.pdf/g' on the remaining files. Take care: sed -i performs replacements in place, so it's better to back up your files in case something goes wrong. You can also use sed -i.bak to let sed save a copy of each file with a .bak extension before it modifies the file.
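Put together, the pipeline described above might look like the following. Treat it as a sketch under a few assumptions: the function name convert_tree and the skip-list file are hypothetical, the skip list holds one fixed string per line that identifies a source file to leave alone, and file names contain no spaces or newlines (plain xargs splits on whitespace).

```shell
#!/bin/sh
# Hypothetical helper.
#   $1: directory to search for *.tex files
#   $2: skip list, one fixed string per line; matching paths are left untouched
convert_tree() {
  find "$1" -type f -name '*.tex' \
    | grep -v -F -f "$2" \
    | xargs sed -E -i.bak 's/image([12])\.(jpg|png)/image\1.pdf/g'
}
```

Here -E is used instead of -r because both GNU and BSD sed accept it, and -i.bak leaves a main.tex.bak next to each modified main.tex. If the filter removes every file, xargs may still run sed once with no file arguments, so it is safer to check the file list first.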
Replace the filenames on your blacklist with "magic" placeholder strings that do not look like filenames and will not be matched by anything, do the general replacement, and then turn the placeholders back into the original names.
# protect the blacklisted names (the dot is escaped so it matches literally)
sed 's/image3\.jpg/MMMAAAGGGIIICCC1/g' your.tex |
sed 's/image4\.png/MMMAAAGGGIIICCC2/g' |
# ... all others on your blacklist
# then do regexp replacements (\+ and \| are GNU sed extensions)
sed 's/\(image[0-9]\+\)\.\(png\|jpg\)/\1.pdf/g' |
# ... convert all "magics" back
sed 's/MMMAAAGGGIIICCC1/image3.jpg/g' |
sed 's/MMMAAAGGGIIICCC2/image4.png/g' |
# ... and many others
# then output
cat > output.tex
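With 20 names on the blacklist, writing the protect and restore lines by hand gets tedious, so one option is to generate them. The sketch below is not part of the answer above: the function name convert_with_blacklist and the blacklist file (one image name per line) are assumptions, and it assumes the names contain only letters, digits, '.', '_' and '-'. Each placeholder ends in '_' so that placeholder 1 is not a prefix of placeholder 10.

```shell
#!/bin/sh
# Hypothetical helper: LaTeX source on stdin, converted source on stdout.
#   $1: blacklist file, one image filename per line
convert_with_blacklist() {
  blacklist=$1
  script=$(mktemp)    # protect rules, then the generic rules
  restore=$(mktemp)   # restore rules, appended to the script last
  n=0
  while IFS= read -r name; do
    [ -n "$name" ] || continue          # skip empty lines
    n=$((n + 1))
    # escape dots so they match literally in the sed pattern
    pattern=$(printf '%s\n' "$name" | sed 's/\./\\./g')
    printf 's/%s/MMMAAAGGGIIICCC%s_/g\n' "$pattern" "$n" >> "$script"
    printf 's/MMMAAAGGGIIICCC%s_/%s/g\n' "$n" "$name" >> "$restore"
  done < "$blacklist"
  # generic replacement, written as two rules to stay within POSIX sed syntax
  printf '%s\n' 's/\.jpg/.pdf/g' 's/\.png/.pdf/g' >> "$script"
  cat "$restore" >> "$script"
  sed -f "$script"
  rm -f "$script" "$restore"
}
```

It would be called as convert_with_blacklist blacklist.txt < your.tex > output.tex. A blacklisted name is matched wherever it occurs as a substring, so myimage3.jpg would also be protected by an entry for image3.jpg.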