Using SED to replace a domain name in a large number of HTML files

Ok, I give up. I've been trying for a couple of hours to get sed to replace an incorrectly formatted domain name in several thousand HTML files, but I cannot seem to get the escaping of the slashes (and possibly the dot and colon) correct.

Text to find: http://www.domain.com/http

Replace with: http

What I have tried:

sed -i 's/http:\/\/www.domain.com\/http/http/'
sed -i 's/http\\:\\/\\/www\\.domain\\.com\\/http/http/'
sed -i 's/http\:\/\/www\.domain\.com\/http/http/'
sed -i 's=http://www.domain.com/http=http='

UPDATE:

As it transpires, I was chasing ghosts. A piece of JavaScript was adding http://www.domain.com/ to the beginning of all my img tags! Unfortunately, I now need to remove it from all pages. So instead of the above, I am now looking to:

Replace this: http://www.domain.com/'+img[0]

with this: '+img[0]

I have tried the following to no avail:

find . -name "*.html" -type f -exec sed -i 's|http://www\.domain\.com/\'+img\[0\]|\'+img\[0\]|g' {} \;
find . -name "*.html" -type f -exec sed -i 's|http://www\.domain\.com/\'+img[0]|\'+img[0]|g' {} \;

I appear to be stuck on the escaping of certain characters again. Only this time, when I try to run one of the above commands, it just takes me to a > prompt.

You can avoid a lot of the escaping by using a different delimiter. In this pattern the dot (.) is the only character with special meaning that needs to be escaped; everything else can be matched literally. Also add the global (g) flag so that every occurrence on a line is replaced, not just the first.

sed -i 's|http://www\.domain\.com/http|http|g'
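One way to check the expression before touching real files is to pipe a sample line through sed without -i. This is only a sketch: the sample HTML line and the example.org address are made up for illustration, and only the http://www.domain.com/http prefix comes from the question.

```shell
# Hypothetical sample line; only the URL prefix comes from the question.
sample='<a href="http://www.domain.com/http://example.org/page">link</a>'

# Using | as the delimiter means the slashes in the URL need no escaping.
# The dots are escaped so that they match literal dots only.
result=$(printf '%s\n' "$sample" | sed 's|http://www\.domain\.com/http|http|g')

# Prints: <a href="http://example.org/page">link</a>
printf '%s\n' "$result"
```

Once the output looks right, the same expression can be run with -i against the real files.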

Edit: You can use the following to replace the other part.

sed -i "s|http://www\.domain\.com/\('[+]img\[0\]\)|\1|g"
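The > prompt in the question most likely appears because a backslash does not escape a single quote inside a single-quoted shell string, so the shell sees an unterminated quote and waits for more input. Wrapping the sed script in double quotes, as above, avoids that. Below is a minimal sketch that combines this expression with the find invocation from the question, run against a scratch directory. The file name and the document.write line are assumptions for illustration, and -i.bak is used instead of plain -i so that a backup of each file is kept.

```shell
# Work in a scratch directory so that no real files are touched.
workdir=$(mktemp -d)

# Hypothetical file content modeled on the string in the question.
printf '%s\n' "document.write('<img src=\"http://www.domain.com/'+img[0]+'\">');" \
  > "$workdir/page.html"

# Double quotes around the sed script let the single quote appear literally.
# [+] matches a literal plus sign; \[0\] matches a literal [0].
# \(...\) captures the part to keep, and \1 puts it back.
find "$workdir" -name "*.html" -type f \
  -exec sed -i.bak "s|http://www\.domain\.com/\('[+]img\[0\]\)|\1|g" {} \;

# Prints: document.write('<img src="'+img[0]+'">');
cat "$workdir/page.html"
```

The original of each edited file is left next to it with a .bak suffix, which can be deleted once the result has been checked.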
