
Strip bash script from beginning of gzip file

I have a series of files, each consisting of a bash script with a gzip file concatenated to the end.

I would like a way to strip off the leading bash script, leaving a pure gzip file.

The method I have come up with is to:

  1. Do a hex dump of the file;
  2. Use sed to remove everything before the gzip magic number 1f 8b;
  3. Convert the remaining hex dump back to binary.

i.e.

xxd -c1 -p input | tr "\n" " " | sed 's/^.*1f 8b/1f 8b/' | xxd -r -p > output

This appears to work at first glance. However, it falls apart if the gzip portion of the file happens to contain the byte sequence 1f 8b anywhere other than in the initial header. In those cases it deletes everything before the last occurrence.

Is my initial attempt on the right track, and what can I do to fix it? Or is there a much better way to do this that I have missed?

Perl solution. It sets the input record separator to the magic sequence and prints every record except the first. The magic sequence must be printed once at the start; otherwise it would be lost together with the bash script, which is the first record.

perl -ne 'BEGIN { $/ = "\x1f\x8b"; print $/; } print if $. != 1' input > output.gz

I would use sed's line-range functionality to accomplish this. -n suppresses normal printing, and the range /\x1f\x8b/,$ matches every line from the first one containing \x1f\x8b through the end of the file; p prints them. Note that the whole matching line is printed, so this relies on the gzip data starting on its own line.

sed -n '/\x1f\x8b/,$ p'
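
A hedged usage sketch for the sed range approach follows. It assumes GNU sed, since the `\x` escapes are a GNU extension, and it sets `LC_ALL=C` as a precaution so the input is treated as raw bytes; that setting is an addition, not part of the original answer. The test file contents and names are hypothetical.

```shell
tmp=$(mktemp -d)

# Hypothetical input: the script ends with a newline, so the gzip
# magic number starts its own line.
printf '#!/bin/bash\necho "installer"\nexit 0\n' > "$tmp/input"
printf 'payload\n' | gzip -c >> "$tmp/input"

# Print from the first line containing the magic bytes to the end.
LC_ALL=C sed -n '/\x1f\x8b/,$ p' "$tmp/input" > "$tmp/output.gz"

result=$(gzip -dc "$tmp/output.gz")
printf '%s\n' "$result"
```

If the script did not end with a newline, the tail of its last line would be printed along with the gzip header and the output would not be a valid gzip file.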

Alternatively, depending on your tastes, you can add a text marker "### BEGIN GZIP DATA ###" and delete everything up to and including it:

sed '1,/### BEGIN GZIP DATA ###/ d'
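
The marker approach can be sketched end to end as follows. This assumes you control how the combined file is built; the script body, the payload, and the file names are hypothetical, and `LC_ALL=C` is added as a precaution for binary input.

```shell
tmp=$(mktemp -d)

# Build a hypothetical combined file: script, marker line, gzip data.
# The script exits before the marker, so bash never reaches the binary part.
{
  printf '#!/bin/bash\necho "installer"\nexit 0\n'
  printf '### BEGIN GZIP DATA ###\n'
  printf 'payload\n' | gzip -c
} > "$tmp/input"

# Delete line 1 through the marker line; everything after it is the gzip stream.
LC_ALL=C sed '1,/### BEGIN GZIP DATA ###/ d' "$tmp/input" > "$tmp/output.gz"

result=$(gzip -dc "$tmp/output.gz")
printf '%s\n' "$result"
```

Unlike the magic-number search, this does not depend on the byte sequence 1f 8b at all, but it only applies to files that already contain such a marker.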
