Databricks - mounted S3 - how to get file metadata like last modified date (Python)
I have mounted an S3 bucket in Databricks. I can list the files and also read them with Python:
ACCESS_KEY = "XXXXXXXXXX"
SECRET_KEY = "XXXXXXXXXXXXXX"
ENCODED_SECRET_KEY = SECRET_KEY.replace("/", "%2F")
AWS_BUCKET_NAME = "testbucket"
MOUNT_NAME = "awsmount1"
dbutils.fs.mount("s3a://%s:%s@%s" % (ACCESS_KEY, ENCODED_SECRET_KEY, AWS_BUCKET_NAME), "/mnt/%s" % MOUNT_NAME)
display(dbutils.fs.ls("/mnt/%s/data" % MOUNT_NAME))
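I want to find the last modified date of the file I am reading, but I could not find much on this apart from a Java option ("Databricks read Azure blob last modified date"). Is there a Python-native option in Databricks for reading file metadata?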
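If I understand correctly, you want to get the last modified date of files mounted in Azure Databricks using the native Python SDK.
Here is sample code for getting metadata from Azure Blob Storage: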
from azure.storage.blob import BlockBlobService  # legacy azure-storage-blob SDK (pre-v12)

block_blob_service = BlockBlobService(account_name='accountName', account_key='accountKey')
container_name = 'containerName'
# Creates the container only if it does not exist yet
block_blob_service.create_container(container_name)

# List the blobs and print each one's last-modified time
for blob in block_blob_service.list_blobs(container_name):
    last_modified = block_blob_service.get_blob_properties(container_name, blob.name).properties.last_modified
    print("\t Blob name: " + blob.name)
    print(last_modified)
You can find more details here.
If you are looking for S3, then I suggest using Boto. Boto3 returns a datetime object for LastModified when you use the (S3) Object Python object:
https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/s3.html#S3.Object.last_modified
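As a rough illustration of that attribute, the sketch below reads `last_modified` from a single object and computes how long ago it was written. The helper names `fetch_object` and `object_age_seconds`, as well as the bucket and key in the usage note, are hypothetical and not part of the original answer. `boto3` is imported lazily so the age calculation can be tried without AWS credentials.

```python
from datetime import datetime, timezone


def object_age_seconds(s3_object, now=None):
    """Seconds elapsed since s3_object.last_modified (a timezone-aware datetime)."""
    now = now or datetime.now(timezone.utc)
    return (now - s3_object.last_modified).total_seconds()


def fetch_object(bucket, key):
    """Return a boto3 S3 Object resource; its attributes are loaded on first access."""
    import boto3  # imported here so the helper above has no AWS dependency
    return boto3.resource("s3").Object(bucket, key)
```

With valid credentials, something like `obj = fetch_object("my_bucket", "data/file.csv")` followed by `print(obj.last_modified, object_age_seconds(obj))` should print the timestamp and the object's age.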
To compare LastModified with today's date (Python 3):
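import boto3
from datetime import datetime, timezone

today = datetime.now(timezone.utc).date()
s3 = boto3.client('s3', region_name='eu-west-1')
# list_objects returns at most 1000 keys per call
objects = s3.list_objects(Bucket='my_bucket')
# "Contents" is absent when the bucket is empty
for o in objects.get("Contents", []):
    # LastModified is a timezone-aware UTC datetime; compare calendar dates only,
    # since an exact match against the current instant would never succeed
    if o["LastModified"].date() == today:
        print(o["Key"])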
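For a bucket mounted under /mnt as in the question, another possibility is to go through the local /dbfs path that Databricks exposes on the driver and use the standard library. This is only a sketch: it assumes the local file API is available on the cluster and that the mount reports the object's modification time through `os.stat`. The helper names `last_modified_utc` and `list_last_modified` are hypothetical.

```python
import os
from datetime import datetime, timezone


def last_modified_utc(path):
    """Return the file's modification time as a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(os.stat(path).st_mtime, tz=timezone.utc)


def list_last_modified(directory):
    """Map each regular file name in `directory` to its modification time."""
    return {
        entry.name: last_modified_utc(entry.path)
        for entry in os.scandir(directory)
        if entry.is_file()
    }
```

With the mount from the question, the call would look like `list_last_modified("/dbfs/mnt/awsmount1/data")`.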
Hope this helps.