Python - How to correctly get content between two offsets in a file?
I'm trying to get the content between two offsets (essentially part of a file). For that, I found fileslice to be useful.
For testing I'm using a file called hello with the string:
helloworld
I left a trailing newline deliberately, since I'm testing different things.
Using this code:
from fileslice import Slicer
import sys
r = open('hello', 'r')
slicer = Slicer(r)
start = int(sys.argv[1])
size = int(sys.argv[2])
fileslice = slicer (start, size)
sys.stdout.write(fileslice.read())
Anyway, the problem I'm facing is that, for certain offset ranges, the output seems to include a character that should be outside the range:
:~/fileslice$ wc -c hello # using wc to check the size
11 hello
:~/fileslice$ python -u "/home/user/fileslice/testslice.py" 0 11 | xxd # works
00000000: 6865 6c6c 6f77 6f72 6c64 0a helloworld.
:~/fileslice$ python -u "/home/user/fileslice/testslice.py" 0 10 | xxd # works
00000000: 6865 6c6c 6f77 6f72 6c64 helloworld
:~/fileslice$ python -u "/home/user/fileslice/testslice.py" 1 10 | xxd # doesn't work as expected
00000000: 656c 6c6f 776f 726c 640a elloworld.
Here I use the test file and code mentioned above. I check the file size with wc, then run a few tests and pipe the script's output to xxd to inspect it in hex.
As can be seen, the runs commented "works" behave as expected: I get the content between the two offsets just fine.
But in the last one, where I wanted the content starting at the character e (offset 1 in this case), the slice does start in the right place, but the newline (offset 10) that was excluded in the previous test appears again, contrary to the previous test, which worked as expected.
How can I correctly get the content of a file using two offsets (start/end)?
The size is the distance between the two offsets, i.e. end minus start.
size = int(sys.argv[2]) - int(sys.argv[1])
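To see why the third run printed the newline: with start 1 and size 10, the slice covers offsets 1 through 10 inclusive, and offset 10 is the newline. Treating the second argument as an end offset and reading end - start bytes avoids that. Below is a minimal sketch of the same idea using only the standard library's seek and read instead of fileslice; the helper name read_between is hypothetical, and the file is opened in binary mode on the assumption that the offsets are byte offsets (as reported by wc -c).

```python
import os
import tempfile


def read_between(path, start, end):
    """Return the bytes at offsets start..end-1 (end is exclusive).

    Hypothetical helper for illustration; not part of fileslice.
    """
    if start < 0 or end < start:
        raise ValueError("expected 0 <= start <= end")
    with open(path, "rb") as f:   # binary mode: offsets are byte offsets
        f.seek(start)             # jump to the first wanted byte
        return f.read(end - start)  # size = end - start


# Small demo on a temporary copy of the test file from the question.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"helloworld\n")
    demo_path = tmp.name

print(read_between(demo_path, 0, 10))  # b'helloworld'
print(read_between(demo_path, 1, 10))  # b'elloworld'
os.remove(demo_path)
```

With this convention, the pair (1, 10) returns the 9 bytes from e up to, but not including, the newline at offset 10. Reading past the end of the file is not an error; read simply returns fewer bytes.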