I have to move files from S3 to GCS. The problem is that on Mondays they upload Monday's files but also the files from Saturday and Sunday, and these files have different dates in their names, for example stack_20220430.csv and stack_20220501.csv. I need to move all of these files in the same Airflow run. Is that possible? I'm using the S3ToGCSOperator:
S3ToGCSOperator(
    task_id="move_files_s3_to_gcs",
    bucket=config["s3_params"]["s3_source_bucket"],
    prefix=config["s3_params"]["s3_source_prefix"],
    delimiter="/",
    dest_gcs=config["gcs_params"]["gcs_destination"],
    aws_conn_id=config["s3_params"]["s3_connector_name"],
)
Obviously the problem is that prefix takes a fixed value. Can I assign a range for {{ ds }}?
The S3ToGCSOperator copies all files under the bucket and prefix you provide. It does this by listing them and then iterating over each file and copying it to GCS. prefix is a templated field, so you can use {{ ds }} with it.
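Since a single templated prefix renders to one value, one way to cover the weekend is to compute one prefix per date and give each to its own task. The sketch below is only an illustration: the helper name prefixes_for_run and the "stack_" base are assumptions taken from the file names in the question, not part of Airflow.

```python
from datetime import date, timedelta
from typing import List


def prefixes_for_run(run_date: date, base: str = "stack_") -> List[str]:
    """Return the date-stamped prefixes a run should pick up.

    Hypothetical helper: on a Monday it also includes the preceding
    Saturday and Sunday; on any other day it returns only that day.
    """
    # weekday() == 0 means Monday
    days_back = 2 if run_date.weekday() == 0 else 0
    return [
        f"{base}{(run_date - timedelta(days=n)):%Y%m%d}"
        for n in range(days_back, -1, -1)
    ]


# 2022-05-02 was a Monday, so the weekend dates are included.
print(prefixes_for_run(date(2022, 5, 2)))
```

Note that {{ ds }} renders as YYYY-MM-DD, while {{ ds_nodash }} renders as YYYYMMDD, which is the format these file names use. Each value returned here could be appended to the source prefix of a separate S3ToGCSOperator task in the same DAG run.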
You can always inherit from S3ToGCSOperator and customize the behavior of the operator to your specific needs.
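As a rough idea of what such a customization could do, a subclass might list everything under a common prefix and keep only the keys whose date stamp falls inside a window. The function below sketches just that filtering step. The name keys_in_window and the "_YYYYMMDD.csv" naming pattern are assumptions based on the question, and how it is wired into the operator depends on the provider version you run.

```python
import re
from datetime import date, datetime
from typing import Iterable, List

# Assumed naming pattern from the question: ..._YYYYMMDD.csv
_DATE_STAMP = re.compile(r"_(\d{8})\.csv$")


def keys_in_window(keys: Iterable[str], start: date, end: date) -> List[str]:
    """Keep keys whose date stamp lies between start and end, inclusive."""
    selected = []
    for key in keys:
        match = _DATE_STAMP.search(key)
        if match is None:
            continue  # no date stamp in the expected position
        try:
            stamp = datetime.strptime(match.group(1), "%Y%m%d").date()
        except ValueError:
            continue  # eight digits that are not a valid calendar date
        if start <= stamp <= end:
            selected.append(key)
    return selected


listed = [
    "in/stack_20220429.csv",
    "in/stack_20220430.csv",
    "in/stack_20220501.csv",
    "in/stack_20220502.csv",
    "in/readme.txt",
]
print(keys_in_window(listed, date(2022, 4, 30), date(2022, 5, 2)))
```

Filtering after listing avoids needing one prefix per date, at the cost of listing more keys than are copied.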