
Extracting a list of XML files from a tar.gz file on an FTP server

I need to extract a list of xml files that are in a tar.gz file that I'm trying to read.

I tried this:

import os
from ftplib import FTP

def writeline(data):
    filedata.write(data)
    filedata.write(os.linesep)

ftp = FTP('ftp.my.domain.com')
ftp.login(user="username",passwd="password")
ftp.cwd('inner_folder')
filedata = open('mytargz.tar.gz', 'w')
ftp.retrlines('RETR %s' % ftp.nlst()[0], writeline)

I used ftp.nlst()[0] because there is a list of tar.gz files on my FTP server. The data that my writeline callback receives looks like weird symbols, and then filedata.write(data) throws an error: {UnicodeEncodeError}'charmap' codec can't encode character '\x8b' in position 1: character maps to <undefined>. I could really use some help here.

I don't have an FTP server to try this with, but this should work:

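from ftplib import FTP

ftp = FTP('ftp.my.domain.com')
ftp.login(user="username", passwd="password")
ftp.cwd('inner_folder')

# Open the local file in binary mode ('wb'): a tar.gz is compressed binary data, not text.
with open('mytargz.tar.gz', 'wb') as filedata:
    # retrbinary passes raw byte chunks to the callback; write them unchanged.
    ftp.retrbinary('RETR %s' % ftp.nlst()[0], filedata.write)

ftp.quit()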



Note three changes: the file is opened in binary write mode ('wb'), the transfer uses retrbinary so the server's data arrives as bytes rather than text, and the callback only writes the data, without adding line separators.
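Once the archive arrives as bytes, the original goal (a list of the XML files inside it) can be reached with the standard tarfile module. The following is only a minimal sketch, not something tested against a real server: the helper names fetch_bytes and list_xml_names are made up for illustration, and it assumes the archive is small enough to hold in memory.

```python
import io
import tarfile
from ftplib import FTP


def fetch_bytes(ftp: FTP, remote_name: str) -> bytes:
    """Download one remote file into memory using a binary transfer."""
    buffer = io.BytesIO()
    ftp.retrbinary('RETR %s' % remote_name, buffer.write)
    return buffer.getvalue()


def list_xml_names(targz_bytes: bytes) -> list:
    """Return the names of regular .xml members of a gzip-compressed tar archive."""
    with tarfile.open(fileobj=io.BytesIO(targz_bytes), mode='r:gz') as archive:
        return [member.name for member in archive.getmembers()
                if member.isfile() and member.name.lower().endswith('.xml')]
```

With a logged-in ftp object as in the answer above, usage would look like list_xml_names(fetch_bytes(ftp, ftp.nlst()[0])). If the archive was already saved to disk, tarfile.open('mytargz.tar.gz', 'r:gz') can be used in the same way.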

