
How to parse a file, name of which is not known
I want to download a file from S3 using the s3cmd interface. I am using the command:
s3cmd get s3://db-backups/db/production_dump_2013-09-12_12-00.sql.gz dump1.sql.g
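This command works fine. Next, I want to automate the download. The directory contains several files with similar names that differ only in their timestamp, for example: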
production_dump_2013-09-12_09-00.sql.gz
production_dump_2013-09-12_12-00.sql.gz
production_dump_2013-09-12_15-00.sql.gz
production_dump_2013-09-12_18-00.sql.gz
production_dump_2013-09-12_21-00.sql.gz
How do I download the latest file? If the file name is known, I can use:
cmd = 's3cmd get s3://voylladb-backups/db/production_dump_2013-09-12_12-00.sql.gz dump1.sql.gz'
args = shlex.split(cmd)
p=subprocess.Popen(args)
p.wait()
How can I modify this code (or use a different approach) to get the file with the latest timestamp?
Thanks
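You can use s3cmd ls s3://voylladb-backups/db/ to list the files.
Then, assuming you get a list back, you can sort it in reverse order and take the first item. This may not be the most concise way to write it, but it should work: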
import re
import subprocess

# Use subprocess.check_output to capture the output of the terminal command;
# universal_newlines=True returns text (str) rather than bytes
output = subprocess.check_output(["s3cmd", "ls", "s3://voylladb-backups/db/"], universal_newlines=True)
lines = output.splitlines()
# the format is a bit weird so we want to isolate just the s3:// paths
# we'll use a regex search to find the s3:// pattern and any subsequent characters
file_re = re.compile("s3://.+")
files = []
# next we iterate over each line of output from s3cmd ls looking for the s3 paths
for line in lines:
    result = file_re.search(line)
    if result:
        # and add them to our list
        files.append(result.group(0))
# the zero-padded timestamps sort chronologically as plain strings, so
# reverse-sort the list to put the newest file first, and grab the first item
files.sort(reverse=True)
print(files[0])  # s3://voylladb-backups/db/production_dump_2013-09-12_21-00.sql.gz
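To go from the listing to the actual download, the two steps can be combined. The following is only a sketch under some assumptions: the helper names latest_dump and download_latest are made up for illustration, the file names are assumed to follow the production_dump_YYYY-MM-DD_HH-MM.sql.gz pattern from the question, and each line of s3cmd ls output is assumed to end with the s3:// path. It parses the timestamp out of the name instead of relying on string order, and skips any line that does not look like a dump file.

```python
import re
import subprocess
from datetime import datetime

# Timestamp embedded in names like production_dump_2013-09-12_21-00.sql.gz
# (pattern assumed from the file names shown in the question)
STAMP_RE = re.compile(r"production_dump_(\d{4}-\d{2}-\d{2}_\d{2}-\d{2})\.sql\.gz$")


def latest_dump(listing):
    """Return the s3:// path with the newest embedded timestamp, or None."""
    best_path, best_time = None, None
    for line in listing.splitlines():
        match = re.search(r"s3://\S+", line)
        if not match:
            continue
        path = match.group(0)
        stamp = STAMP_RE.search(path)
        if not stamp:
            continue  # skip sub-directories and unrelated objects
        when = datetime.strptime(stamp.group(1), "%Y-%m-%d_%H-%M")
        if best_time is None or when > best_time:
            best_path, best_time = path, when
    return best_path


def download_latest(prefix, dest):
    """List prefix with s3cmd, then fetch the newest dump to dest."""
    listing = subprocess.check_output(["s3cmd", "ls", prefix],
                                      universal_newlines=True)
    path = latest_dump(listing)
    if path is None:
        raise RuntimeError("no dump files found under " + prefix)
    subprocess.check_call(["s3cmd", "get", path, dest])
    return path
```

With this in place, calling download_latest("s3://voylladb-backups/db/", "dump1.sql.gz") would replace the hard-coded command from the question. Keeping the parsing in latest_dump, separate from the subprocess calls, means the selection logic can be checked against a saved listing without touching S3.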