Copy last added file from a GCS bucket into Azure Blob storage

I'm VERY new to Azure Data Factory, so pardon me for my stupid or obvious question.

I want to schedule a copy of the files stored in a GCS bucket into Azure Blob Storage once a day. So far, I have managed to copy files (both manually and by scheduling the pipeline's activity) from the GCS bucket to which I upload the files manually.

In the near future, the upload will happen automatically once a day at a given time, presumably during the night. My goal is to schedule the copy of just the last added file, and to avoid copying all the files every time and overwriting the existing ones.

Is this something that requires writing a Python script? Or is there some parameter to set?

Thank you all in advance for the replies.

There is no need for any explicit coding. ADF supports a simple Copy activity to move data from GCS to Blob Storage, where GCS acts as the source and Blob Storage acts as the sink of the Copy activity.
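Since the question asks about Python: no script is required, but purely for reference, here is a minimal sketch of the transfer that the Copy activity performs. The bucket name, container name, object path, and connection string are all hypothetical placeholders, not values from the original post.

```python
# A minimal sketch (not ADF's implementation) of a GCS-to-Azure copy:
# read one object from GCS and write it to Azure Blob Storage.
from google.cloud import storage
from azure.storage.blob import BlobServiceClient

gcs_client = storage.Client()  # uses GOOGLE_APPLICATION_CREDENTIALS
bucket = gcs_client.bucket("my-source-bucket")        # hypothetical bucket

blob_service = BlobServiceClient.from_connection_string(
    "<azure-storage-connection-string>"               # placeholder secret
)
container = blob_service.get_container_client("my-sink-container")

# Copy a single object, overwriting any existing blob with the same name.
source = bucket.blob("exports/report.csv")            # hypothetical path
container.upload_blob(
    name=source.name,
    data=source.download_as_bytes(),
    overwrite=True,
)
```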

https://docs.microsoft.com/en-us/azure/data-factory/connector-google-cloud-storage?tabs=data-factory

And to get the latest file, you can use the Get Metadata activity to get the list of files and filter for the latest one.
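For comparison, a minimal Python sketch of that "latest file" selection, mirroring what Get Metadata plus a filter does inside ADF (same hypothetical bucket as above):

```python
# Pick the most recently modified object in the bucket.
# blob.updated is the last-modified timestamp reported by GCS.
from google.cloud import storage

gcs_client = storage.Client()
blobs = list(gcs_client.list_blobs("my-source-bucket"))  # hypothetical bucket
if blobs:
    latest = max(blobs, key=lambda b: b.updated)
    print(f"Latest object: {latest.name} (updated {latest.updated})")
```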
