sed is adding unwanted whitespace to end of file, making it invalid

I'm trying to replace file contents using sed. The replacement works, but for some reason I get extra whitespace at the end of the resulting output file, which makes the file unreadable/unviewable in the application that opens it.

My command is as follows:

for file in *.example ; do LANG=C sed -i "" "s|https://foo.bar|http://foo.bar|g" "$file" ; done

Things I have tried without success:

  • Not wrapping the s/[...]/g argument in quotes (causes command to fail)
  • Using delimiters other than | such as / or _ or % (makes no difference)
  • Using single quotes instead of double (makes no difference)
  • Escaping the periods and colons as well (makes no difference)

EDIT: This issue appears to be file-type related, and therefore I am no longer interested in a solution. Thank you to those who've replied.

I suggest replacing

\foo.bar

with

foo.bar

With the benefit of hindsight:

BSD/macOS sed is fundamentally unsuitable for making substitutions in binary files, because it invariably outputs a trailing \n (newline) with every output command.

By contrast, GNU sed doesn't have this problem, because it (commendably) only appends a \n if the input "line" had one too.

Note that the concept of newline-separated lines doesn't really apply to binary input: newlines may or may not be present, potentially with large chunks of data in between. In the worst-case scenario, the entire input will be read at once. [1]

You can test this behavior with the following command:

sed -n 'p' <(printf 'x') | cat -et  # input printf 'x' has no trailing \n

Output x$ indicates that a newline (symbolized as $ by cat -et) was appended (BSD Sed), whereas just x indicates that it was not (GNU Sed).
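
If you want the same check in a script, here is a minimal sketch that counts the bytes sed emits for a one-byte input instead of relying on process substitution and cat -et. The variable names out_bytes and sed_kind are illustrative only, and the interpretation of the two outcomes simply follows the BSD/GNU distinction described above.

```shell
# Feed sed a single byte with no trailing newline and count what comes out.
#   1 byte  -> sed left the input alone (the behavior described for GNU sed)
#   2 bytes -> sed appended a newline (the behavior described for BSD/macOS sed)
out_bytes=$(printf 'x' | sed -n 'p' | wc -c | tr -d ' ')

if [ "$out_bytes" -eq 1 ]; then
  sed_kind="preserves-missing-newline"
else
  sed_kind="appends-newline"
fi

echo "sed emitted $out_bytes byte(s): $sed_kind"
```

The tr -d ' ' is there because some wc implementations pad the count with spaces.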

Thus, given that you're on macOS, you could use Homebrew to install GNU Sed with brew install gnu-sed and then use the following command:

LANG=C gsed -i 's|https://foo.bar|http://foo.bar|g' *.example
  • Homebrew installs GNU Sed as gsed, so that it can exist alongside macOS's stock (BSD) sed.

  • LANG=C (slightly more robustly: LC_ALL=C) is needed to pass all bytes of the binary input through as-is, avoiding problems that arise when binary bytes are not recognized as valid characters.
    Note that this approach limits you to ASCII-only characters in the substitution (unless you explicitly add byte values as escape sequences).

  • Note the different, incompatible -i syntax for in-place updating without backup: GNU Sed takes no (separate) option-argument here, unlike the -i "" form used with BSD sed in the question; see this answer of mine for background.

  • Note how '...' (single-quoting) is used around the Sed script, which is generally preferable, as it avoids confusion between shell expansions that happen up front and what Sed ends up seeing.
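
Before running an in-place edit on files you care about, it may be worth checking on a throwaway copy whether your sed changes the end of the file. The following is only a sketch under stated assumptions: last_byte_hex is a hypothetical helper, the sample file and its contents are made up, and plain sed is used so the script runs with whichever sed is first in your PATH (substitute gsed to test the Homebrew one).

```shell
# Hypothetical helper: print the last byte of a file as two hex digits.
last_byte_hex() {
  tail -c 1 "$1" | od -An -tx1 | tr -d ' \n'
}

workdir=$(mktemp -d)
src="$workdir/sample.example"   # stand-in for one of your *.example files
out="$workdir/sample.out"

# Sample data that, like many binary files, does not end in a newline.
printf 'see https://foo.bar/a' > "$src"

# Write to a separate file rather than using -i, so the original stays intact
# and the incompatible -i syntaxes are avoided altogether.
LC_ALL=C sed 's|https://foo\.bar|http://foo.bar|g' "$src" > "$out"

before=$(last_byte_hex "$src")
after=$(last_byte_hex "$out")

if [ "$before" = "$after" ]; then
  echo "last byte unchanged ($after)"
else
  echo "last byte changed: $before -> $after"
fi
```

If the last byte changes to 0a, this sed appended a newline and is likely to corrupt files whose format does not tolerate trailing data.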


[1] Aside from memory use, it is fine to use Sed's default line-parsing behavior here, given that your substitution command doesn't match newlines. If you want to break the input into "lines" by NULs (and also use NULs on output), however, you can use GNU Sed's -z option.
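
A small sketch of the -z variant, assuming GNU Sed is what the name sed resolves to (on macOS with Homebrew you would call gsed instead). The --version probe is a heuristic for detecting GNU Sed, and the input string and the z_bytes variable are invented for the demonstration.

```shell
# -z is a GNU extension; GNU sed accepts --version, whereas BSD sed rejects it.
if sed --version >/dev/null 2>&1; then
  # Two NUL-separated records; the second one has no terminating NUL.
  z_bytes=$(printf 'https://foo.bar\000tail' \
    | LC_ALL=C sed -z 's|https://foo\.bar|http://foo.bar|g' \
    | wc -c | tr -d ' ')
  echo "output is $z_bytes bytes"
else
  z_bytes="skipped"
  echo "this sed does not look like GNU sed; skipping the -z demo"
fi
```

The input is 20 bytes long, so with GNU Sed the count should come out one byte shorter (the dropped "s"), with nothing appended after "tail".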
