
Search and replace html tags in sed recursively

I am trying to write a script that searches for and removes HTML tags from all .htm and .html files recursively. The starting directory is passed as an argument when running the script. Each result should be saved in a new file next to the original, with _changed appended to the name, e.g. start.html becomes start.html_changed. Here is the script I have so far. It works, but it prints the output to the terminal, and I want it saved to the corresponding files instead.

#!/bin/bash

sudo find "$1" -name '*.html' -type f -print0 | xargs -0 sed -n '/<div/,/<\/div>/p'

sudo find "$1" -name '*.htm' -type f -print0 | xargs -0 sed -n '/<div/,/<\/div>/p'

Any help is much appreciated.

The following script works fine, but it is not recursive. How can I make it recursive?

#!/bin/bash

for l in "/$1"/*.html
do
  sed -n '/<div/,/<\/div>/p' "$l" > "${l}_nobody"
done

for m in "/$1"/*.htm
do
  sed -n '/<div/,/<\/div>/p' "$m" > "${m}_nobody"
done

Just edit the xargs part as follows:

xargs -0 -I {} sh -c "sed -n '/<div/,/<\/div>/p' {} > {}_changed"

Explanation:

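  • -I {} : sets {} as a placeholder that xargs replaces with each file name, running the command once per file
  • sh -c "..." : runs the command through a shell, which is needed because > is shell syntax that xargs does not interpret itself
  • > {}_changed : redirects sed's output to a file with the _changed suffix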
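
One caveat with the one-liner above: {} is pasted directly into the shell command text, so file names containing spaces or quotes can break it. A minimal sketch of a variant that passes the names as arguments instead, assuming a find that supports -exec ... {} + (the function name extract_divs is hypothetical, not from the original post):

```shell
#!/bin/bash

# Sketch only: extract_divs is a hypothetical helper name.
# Recursively finds *.html and *.htm files under the given directory and
# writes the <div ...> to </div> line ranges of each one to <file>_changed.
extract_divs() {
  local root=$1
  find "$root" -type f \( -name '*.html' -o -name '*.htm' \) -exec sh -c '
    # File names arrive as positional parameters, so they are never
    # re-parsed as shell code.
    for f in "$@"; do
      sed -n "/<div/,/<\/div>/p" "$f" > "${f}_changed"
    done
  ' sh {} +
}

# Run on the directory given as the first script argument, if any.
if [ "$#" -gt 0 ]; then
  extract_divs "$1"
fi
```

Saved under a name such as strip_divs.sh (also hypothetical), it would be run as ./strip_divs.sh /path/to/start. Output files end in _changed rather than .html or .htm, so a second run does not pick them up as input.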


 