Why do scripts behave differently when called from the command line vs. git attributes?
Updated scripts are attached below; these are now working on my sample document.
Why do the following Python scripts perform differently when called via git attributes or from the command line?
What I have are two scripts that I modeled on the Mercurial zipdoc functionality. All I'm attempting to do is unzip docx files on store (filter.clean) and zip them on restore (filter.smudge). The two scripts work well on their own, but once I put them into git attributes they don't work, and I don't understand why.
I've tested by doing the following:
Testing the Scripts (git bash)
$ cat original.docx | python ~/Documents/pyscripts/unzip.py > uncompress.docx
$ cat uncompress.docx | python ~/Documents/pyscripts/zip.py > compress.docx
$ md5sum uncompress.docx compress.docx
I can open both the uncompressed and compressed files with Microsoft Word with no errors. The scripts work as expected.
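Because the stored and deflated archives are different byte streams, their md5sum values will differ even when the round trip loses nothing. A rough way to check that the contents survived is to compare each member's name and CRC instead. This is only a sketch: it assumes Python 3 (the scripts in this post target Python 2), and `make_zip` and `member_crcs` are hypothetical helper names used for illustration.

```python
import io
import zipfile


def make_zip(members, compression):
    """Build an in-memory zip from a {name: bytes} mapping (illustration only)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression) as zf:
        for name, payload in members.items():
            zf.writestr(name, payload)
    return buf.getvalue()


def member_crcs(data):
    """Map each member name in a zip archive to the CRC-32 of its uncompressed data."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        return {info.filename: info.CRC for info in zf.infolist()}


# Made-up member standing in for the contents of a real .docx file.
members = {"word/document.xml": b"<w:document>hello</w:document>" * 50}
stored = make_zip(members, zipfile.ZIP_STORED)
deflated = make_zip(members, zipfile.ZIP_DEFLATED)

# The raw bytes differ, but the per-member checksums should agree.
print("same bytes:", stored == deflated)
print("same members:", member_crcs(stored) == member_crcs(deflated))
```

For real files, the same comparison could be run on the bytes read from uncompress.docx and compress.docx.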
Test Git Attributes
I'm really lost here. I thought git attributes simply provides input on stdin and reads the result from stdout. I've tested both scripts with a pipe from cat and a redirect of the output, and they work just fine. I know the scripts are running because the files change size as expected; however, a small change is introduced somewhere in the file.
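One way to narrow down where the small change comes from is to reproduce the stdin/stdout hand-off outside of git and compare the bytes directly. The sketch below assumes Python 3; `run_filter` is a hypothetical helper, and the pass-through command is a stand-in for the real unzip.py or zip.py invocation. It only approximates how a filter is driven and is not git's actual implementation.

```python
import hashlib
import subprocess
import sys


def run_filter(cmd, data):
    """Send data to cmd on stdin and return everything it writes to stdout."""
    result = subprocess.run(cmd, input=data, stdout=subprocess.PIPE, check=True)
    return result.stdout


# Hypothetical stand-in filter: copies stdin to stdout using the binary buffers.
PASSTHROUGH = [
    sys.executable,
    "-c",
    "import sys; sys.stdout.buffer.write(sys.stdin.buffer.read())",
]

# Every byte value plus line-ending sequences that text-mode I/O tends to alter.
payload = bytes(range(256)) + b"\r\n\n\x1a"
out = run_filter(PASSTHROUGH, payload)

print("md5 in :", hashlib.md5(payload).hexdigest())
print("md5 out:", hashlib.md5(out).hexdigest())
print("identical:", out == payload)
```

If a filter that should be lossless reports a difference here, the first differing offset is a good place to start looking, for example for newline translation on Windows.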
Additional References
I'm using msgit on Win7; all commands above were typed into the git bash window.
Git Attributes Description
Uncompress Script
import fileinput
import sys
import zipfile

# Set stdin and stdout to binary read/write
if sys.platform == "win32":
    import os, msvcrt
    msvcrt.setmode(sys.stdout.fileno(), os.O_BINARY)
    msvcrt.setmode(sys.stdin.fileno(), os.O_BINARY)

try:
    from cStringIO import StringIO
except:
    from StringIO import StringIO

# Wrap stdio into a file like object
inString = StringIO(sys.stdin.read())
outString = StringIO()

# Store each member uncompressed
try:
    with zipfile.ZipFile(inString,'r') as inFile:
        outFile = zipfile.ZipFile(outString,'w',zipfile.ZIP_STORED)
        for memberInfo in inFile.infolist():
            member = inFile.read(memberInfo)
            memberInfo.compress_type = 0
            outFile.writestr(memberInfo,member)
        outFile.close()
except zipfile.BadZipfile:
    sys.stdout.write(inString.getvalue())

sys.stdout.write(outString.getvalue())
Compress Script
import fileinput
import sys
import zipfile

# Set stdin and stdout to binary read/write
if sys.platform == "win32":
    import os, msvcrt
    msvcrt.setmode(sys.stdout.fileno(), os.O_BINARY)
    msvcrt.setmode(sys.stdin.fileno(), os.O_BINARY)

try:
    from cStringIO import StringIO
except:
    from StringIO import StringIO

# Wrap stdio into a file like object
inString = StringIO(sys.stdin.read())
outString = StringIO()

# Store each member compressed
try:
    with zipfile.ZipFile(inString,'r') as inFile:
        outFile = zipfile.ZipFile(outString,'w',zipfile.ZIP_DEFLATED)
        for memberInfo in inFile.infolist():
            member = inFile.read(memberInfo)
            memberInfo.compress_type = zipfile.ZIP_DEFLATED
            outFile.writestr(memberInfo,member)
        outFile.close()
except zipfile.BadZipfile:
    sys.stdout.write(inString.getvalue())

sys.stdout.write(outString.getvalue())
Edit: Formatting
Edit 2: Scripts updated to write to memory rather than stdout during file processing.
I've found that using zipfile.ZipFile() with stdout as the target was causing a problem. Opening the zip file with a StringIO() as the target, and then writing the full StringIO contents to stdout at the end, has solved that problem.
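The post does not say why writing the archive straight to stdout failed. One plausible explanation, which is an assumption here, is that zipfile relies on tell() and seek() on its target and a pipe does not support them reliably. A minimal Python 3 sketch of the buffer-then-write approach follows; `rezip` is a hypothetical name, `io.BytesIO` stands in for the Python 2 StringIO, and the demo archive is made up.

```python
import io
import zipfile


def rezip(data, compression):
    """Rewrite every member of a zip archive with the given compression.

    Input that is not a zip archive is returned unchanged.
    """
    try:
        with zipfile.ZipFile(io.BytesIO(data), "r") as src:
            buf = io.BytesIO()  # seekable in-memory target instead of stdout
            with zipfile.ZipFile(buf, "w", compression) as dst:
                for info in src.infolist():
                    payload = src.read(info)
                    info.compress_type = compression
                    dst.writestr(info, payload)
            return buf.getvalue()
    except zipfile.BadZipFile:
        return data


# Demo with a small in-memory archive standing in for a .docx file.
source = io.BytesIO()
with zipfile.ZipFile(source, "w", zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("word/document.xml", b"<w:p>text</w:p>" * 200)
original = source.getvalue()

stored = rezip(original, zipfile.ZIP_STORED)      # roughly what the clean step emits
restored = rezip(stored, zipfile.ZIP_DEFLATED)    # roughly what the smudge step emits
print(len(original), len(stored), len(restored))
```

In a real filter, the returned bytes would be written to `sys.stdout.buffer` in a single call once the archive is complete.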
I haven't tested this extensively, and it's possible some .docx contents won't be handled well, but only time will tell. My test files now open without error, and as a bonus the .docx file in your working directory is smaller because the script uses higher compression than the standard .docx format.
I have confirmed that after performing several edits and commits on a .docx file, I can open the file, edit one line, and commit without a large delta being added to the repo size. For example, for a 19KB file with 3 previous edits in the repo history, adding a new line at the top created a delta of only 1KB in the repo after performing garbage collection. Running the same test (as close as I could) with Mercurial resulted in a 9.3KB delta commit. I'm no Mercurial expert; my understanding is that there is no "gc" command for Mercurial, so none was run.