I am trying to write a script to search and remove htm and html tags from all files recursively. The starting point is given as input in the command to run the script. The resultant files should be saved in new file at the same place ending with _changed. eg, start.html > start.html_changed. Here is the script I wrote so far. It works fine, but the output prints out to the terminal, and I want it to be saved in files respectively.
#!/bin/bash
sudo find $1 -name '*.html' -type f -print0 | xargs -0 sed -n '/<div/,/<\/div>/p'
sudo find $1 -name '*.htm' -type f -print0 | xargs -0 sed -n '/<div/,/<\/div>/p'
Any help is much appreciated.
The following script works just fine, but it is not recursive. how can I make it recursive?
#!/bin/bash
for l in /$1/*.html
do
sed -n '/<div/,/<\/div>/p' $l > "${l}_nobody"
done
for m in /$1/*.htm
do
sed -n '/<div/,/<\/div>/p' $m > "${m}_nobody"
done
Just edit the xargs
part as follows:
xargs -0 -I {} sh -c "sed -n '/<div/,/<\/div>/p' {} > {}_changed"
Explanation:
-I {}
: sets a placeholder > {}_changed"
: does redirection to the file with _changed
suffix
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.