
Running Python subprocess.call on a tgz file to untar it and stream the output

I'm using subprocess.call to untar a file from the command line. I need to stream the output of that call into a temporary file so that I can read the contents of the "+CONTENTS" folder within the tgz file.

My failed output is:

./streamContents.py
rsh: ftp: No address associated with hostname
tar (child): ftp://myftpserver.com/pkgsrc/doxygen_pkgs/test.tgz: Cannot open: Input/output error
tar (child): Error is not recoverable: exiting now

gzip: stdin: unexpected end of file
tar: Child returned status 2
tar: Error exit delayed from previous errors
Traceback (most recent call last):
  File "./streamContents.py", line 29, in <module>
    stream = proc.stdout.read(8196)
AttributeError: 'int' object has no attribute 'stdout'


from io import BytesIO
import urllib2
import tarfile
import ftplib
import socket
import threading
import subprocess

tarfile_url = "ftp://myftpserver.com/pkgsrc/doxygen_pkgs/test.tgz"

try:
    ftpstream = urllib2.urlopen(tarfile_url)
except urllib2.URLError as e:
    print "URL timeout"
except socket.timeout:
    print "Socket timeout"

# BytesIO creates an in-memory temporary file.
tmpfile = BytesIO()
last_size = 0
tfile_extract = ""

while True:
    proc = subprocess.call(['tar','-xzvf', tarfile_url], stdout=subprocess.PIPE)
    # Download a piece of the file from the ftp connection
    stream = proc.stdout.read(8196)
    if not stream: break
    tmpfile.write(stream)
    # Seeking back to the beginning of the temporary file.
    tmpfile.seek(0)
    # r|gz forbids seeking backward; r:gz allows seeking backward
    try:
        tfile = tarfile.open(fileobj=tmpfile, mode="r:gz")
        print tfile.extractfile("+CONTENTS")
        tfile_extract_text = tfile_extract.read()
        print tfile_extract.tell()
        if tfile_extract.tell() > 0 and tfile_extract.tell() == last_size:
            print tfile_extract_text
            last_size = tfile_extract.tell()
    except Exception:
        pass

tfile_extract_text = tfile_extract.read()
print tfile_extract_text

# When you're done:
tfile.close()
tmpfile.close()

Expanding on my comment above: you need to download the tar file to a temporary file using urllib2 and tempfile, and then open that temporary file with tarfile.
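On the traceback in the question: subprocess.call waits for the command to finish and returns its exit status as an int, which is why proc.stdout raises AttributeError. subprocess.Popen is the call that returns a process object with a readable stdout pipe. Separately, GNU tar treats an archive name of the form host:path as a remote archive, which is consistent with the rsh error in the output, so tar itself is not fetching the FTP URL. A minimal sketch of the call/Popen difference follows. It is written for Python 3 and uses the Python interpreter as a stand-in child process; the child command and the names exit_status, chunks, and output are illustrative and not from the original script.

```python
import subprocess
import sys

# Stand-in child process that prints one line to stdout.
# (Illustrative only; the question's script runs tar instead.)
cmd = [sys.executable, "-c", "print('hello from child')"]

# subprocess.call runs the command to completion and returns its exit code,
# so the result is a plain int with no .stdout attribute.
exit_status = subprocess.call(cmd, stdout=subprocess.DEVNULL)
print(type(exit_status).__name__, exit_status)

# subprocess.Popen returns a process object; with stdout=PIPE its output
# can be read in chunks while the child runs.
proc = subprocess.Popen(cmd, stdout=subprocess.PIPE)
chunks = []
while True:
    chunk = proc.stdout.read(8192)
    if not chunk:
        break
    chunks.append(chunk)
proc.stdout.close()
return_code = proc.wait()

output = b"".join(chunks)
print(output)
```

Even with Popen, tar would still need a local archive or data on stdin, which is why the download step is handled in Python.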

Here's some code to get started:

import urllib2
import tarfile
from tempfile import TemporaryFile

f_url = 'url_of_your_tar_archive'
ftpstream = urllib2.urlopen(f_url)
tmpfile = TemporaryFile()

# Download contents of tar to a temporary file
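while True:
    s = ftpstream.read(16384)
    if not s:
        break
    tmpfile.write(s)
ftpstream.close()

# Access the temporary file to extract the file you need
# Rewind first so that tarfile reads from the start of the archive
tmpfile.seek(0)
tfile = tarfile.open(fileobj=tmpfile, mode='r:gz')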
print tfile.getnames()
contents = tfile.extractfile("+CONTENTS").read()
print contents
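
For readers on Python 3, where urllib2 has become urllib.request, the same download-then-open approach might look like the sketch below. The function name read_member_from_tgz and its parameters are hypothetical, and shutil.copyfileobj stands in for the manual read loop. Treat it as a starting point under those assumptions rather than a drop-in replacement.

```python
import shutil
import tarfile
import tempfile
import urllib.request


def read_member_from_tgz(url, member_name, chunk_size=16384):
    """Download a gzip-compressed tar archive and return one member's bytes.

    Hypothetical helper: the name and parameters are illustrative.
    """
    with tempfile.TemporaryFile() as tmpfile:
        # Download the whole archive to a temporary file first.
        with urllib.request.urlopen(url) as response:
            shutil.copyfileobj(response, tmpfile, chunk_size)

        # Rewind so tarfile starts reading at the beginning of the archive.
        tmpfile.seek(0)

        # r:gz needs a seekable file object, which the temporary file provides.
        with tarfile.open(fileobj=tmpfile, mode="r:gz") as tfile:
            extracted = tfile.extractfile(member_name)
            if extracted is None:
                # extractfile returns None for members that are not
                # regular files or links (for example, directories).
                raise ValueError("%r is not a regular file" % member_name)
            return extracted.read()
```

A call such as read_member_from_tgz(tarfile_url, "+CONTENTS") would then return the raw bytes of that entry, or raise KeyError if the archive has no member with that name.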

