
How can I change file content with find and awk?

I have the following problem: I want to traverse all the XML files under a certain directory and add a prefix to every id I find. I've written the following script to do that:

#!/bin/bash
find . -iregex .+?\.xml -print -exec awk '{print gensub(/(.*?)=\"(@(\+|)id)\/(.+)\"/, "\\1=\"\\2/prefix_\\4\"", "g", $1);}' {} > {} \;

However, the redirection part, > {}, doesn't work. The script runs fine and prints everything as expected on stdout, but it seems that the output cannot be redirected to the same file that awk read from. Any idea how to work around this? Thanks!

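The > {} in your command is parsed by the shell that launches find, not by -exec, so all the output ends up in a single file literally named {}. Exec it to bash instead, so that the redirection happens once per file: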

find . -iregex '.+\.xml' -print \
-exec bash -c "awk '{print gensub(/(.*)=\"(@(\\+|)id)\/(.+)\"/, \"\\\\1=\\\"\\\\2/prefix_\\\\4\\\"\", \"g\", \$0);}' {} > {}.NEW" \;

Note that redirecting to {} itself would overwrite the file you're reading from: the shell truncates it before awk gets to read anything, which is why I added .NEW to the file that I'm redirecting to. Perhaps what you wanted was the >> (append) redirector?
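
If the goal is to end up with the modified content under the original file name, one common pattern is to write awk's output to a temporary file and rename it over the original only when awk succeeds. The following is a minimal sketch under some assumptions: the helper name prefix_ids_in_place is hypothetical, and it uses a simplified, portable pattern (gsub with & for the matched text) rather than the gawk-only gensub expression from the question, so it prefixes anything that follows @id/ or @+id/.

```shell
#!/bin/bash
# Hypothetical helper (not from the original post): rewrite one file "in place"
# by sending awk's output to a temporary file and renaming it over the original.
prefix_ids_in_place() {
    local file=$1 prefix=$2 tmp
    # Create the temporary file next to the original so mv stays on one filesystem.
    tmp=$(mktemp "${file}.XXXXXX") || return 1
    # "&" in the replacement stands for the matched text, e.g. "@+id/" or "@id/".
    if awk -v p="$prefix" '{ gsub(/@[+]?id\//, "&" p); print }' "$file" > "$tmp"; then
        mv -- "$tmp" "$file"
    else
        rm -f -- "$tmp"
        return 1
    fi
}
```

To use it from find, the function would have to be visible to the inner bash, for example with export -f prefix_ids_in_place followed by find . -iname '*.xml' -exec bash -c 'prefix_ids_in_place "$1" prefix_' _ {} \;. Keep in mind that the renamed file takes on the temporary file's permissions, which may differ from the original's.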


It's kind of hard to come up with a complete working example since you didn't provide an example file in your question. However, the following works:

If you have a file named tmp.xml that contains the following:

I have a "
I do not

And then run:

find . -name 'tmp.xml' -exec bash -c "awk '\$0 ~ /\"/ { print \$0 }' {} > {}.NEW" \;

The file tmp.xml.NEW will contain:

I have a "

Notice that, in addition to the double quotes, you have to escape the dollar signs ($) in your awk script, since inside the outer double quotes the shell would otherwise expand them as variables.
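
The escaping can also be avoided altogether. The sketch below is one possible variant, not the only way to do it: it keeps the awk program in a single-quoted variable and hands both the program and the file name to the inner bash as positional parameters. The names demo_dir and awk_prog are made up for this illustration, and the demo file mirrors the tmp.xml example above.

```shell
#!/bin/bash
# Hypothetical scratch directory holding a copy of the tmp.xml example.
demo_dir=$(mktemp -d)
printf '%s\n' 'I have a "' 'I do not' > "$demo_dir/tmp.xml"

# Single quotes: the shell leaves $0 and the double quote untouched.
awk_prog='$0 ~ /"/ { print $0 }'

# The inner bash receives: $0 = "_" (placeholder name), $1 = awk program,
# $2 = the path found by find. Nothing inside the script needs escaping.
find "$demo_dir" -name 'tmp.xml' \
    -exec bash -c 'awk "$1" "$2" > "$2.NEW"' _ "$awk_prog" {} \;

cat "$demo_dir/tmp.xml.NEW"
```

Passing the path as a parameter instead of embedding {} in the script text also keeps file names containing spaces or quotes from being reinterpreted by the inner shell.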
