from azure.storage.filedatalake import DataLakeServiceClient

datalake_service_client = DataLakeServiceClient.from_connection_string(connect_str)
myfilesystem = "ContainerName"
myfolder = "FolderName"
myfile = "FileName.csv"
file_system_client = datalake_service_client.get_file_system_client(myfilesystem)
try:
    directory_client = file_system_client.create_directory(myfolder)
except Exception as e:
    directory_client = file_system_client.get_directory_client(myfolder)
file_client = directory_client.create_file(myfile)
data = """Test1"""
file_client.append_data(data, offset=0, length=len(data))
file_client.flush_data(len(data))

Suppose the next append is data = """Test2""". How should I set the offset for append_data and the argument to flush_data?
Thanks.
First, you are calling directory_client.create_file(myfile), which creates a new file every time. So your code never appends to existing content.
Second, you need to check whether the file already exists. If it exists, use the get_file_client method; if it does not, use the create_file method. The complete code is below (on my side, I tested with a .txt file):
from azure.core.exceptions import ResourceNotFoundError
from azure.storage.filedatalake import DataLakeServiceClient

connect_str = "DefaultEndpointsProtocol=https;AccountName=0730bowmanwindow;AccountKey=xxxxxx;EndpointSuffix=core.windows.net"
datalake_service_client = DataLakeServiceClient.from_connection_string(connect_str)
myfilesystem = "test"
myfolder = "test"
myfile = "FileName.txt"
file_system_client = datalake_service_client.get_file_system_client(myfilesystem)
directory_client = file_system_client.create_directory(myfolder)
directory_client = file_system_client.get_directory_client(myfolder)

data = "Test2"
# Note: len(data) counts characters, which equals the byte count only for ASCII text.
print("length of data is " + str(len(data)))
try:
    # The file exists: append at its current end.
    file_client = directory_client.get_file_client(myfile)
    filesize_previous = file_client.get_file_properties().size
except ResourceNotFoundError:
    # The file does not exist yet: create it and start at offset 0.
    file_client = directory_client.create_file(myfile)
    filesize_previous = 0
print("length of currentfile is " + str(filesize_previous))
# offset is where the new data starts; flush_data takes the total length after the append.
file_client.append_data(data, offset=filesize_previous, length=len(data))
file_client.flush_data(filesize_previous + len(data))
This works without problems on my side; please try it on yours. (The above is just an example; you can design something cleaner and more streamlined.)
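To make the offset arithmetic for repeated appends explicit, here is a minimal sketch. The helper append_text and the class InMemoryFileClient are hypothetical names introduced only for illustration. The in-memory class is a rough stand-in for the real file client so the example runs without a storage account; it only approximates the service's behavior. The helper encodes the text first so that the length passed is a byte count, then appends at the current size and flushes at the new total size.

```python
class InMemoryFileClient:
    """Hypothetical stand-in that mimics the append_data/flush_data calls used above."""

    def __init__(self):
        self.committed = b""   # bytes visible after a flush
        self._staged = {}      # offset -> bytes appended but not yet flushed

    def append_data(self, data, offset, length):
        # Stage the chunk; nothing is committed until flush_data is called.
        self._staged[offset] = bytes(data[:length])

    def flush_data(self, offset):
        # Commit staged chunks in offset order; `offset` must be the final length.
        buf = bytearray(self.committed)
        for start in sorted(self._staged):
            if start != len(buf):
                raise ValueError("append offset does not match the current end of file")
            buf.extend(self._staged[start])
        if len(buf) != offset:
            raise ValueError("flush position must equal the total file length")
        self.committed = bytes(buf)
        self._staged.clear()


def append_text(file_client, text, current_size, encoding="utf-8"):
    """Append `text` at `current_size` and return the new file size in bytes."""
    payload = text.encode(encoding)  # use the byte length, not the character count
    file_client.append_data(payload, offset=current_size, length=len(payload))
    new_size = current_size + len(payload)
    file_client.flush_data(new_size)  # flush position = total length after the append
    return new_size


client = InMemoryFileClient()
size = append_text(client, "Test1", 0)      # first write starts at offset 0
size = append_text(client, "Test2", size)   # next write starts where the last one ended
print(client.committed.decode(), size)      # prints: Test1Test2 10
```

With the real SDK, the same helper could presumably be called with the file_client from the code above, passing file_client.get_file_properties().size as current_size for an existing file and 0 for a newly created one.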