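Read and Query Parquet files from Azure Data Lake Using Azure Function without downloading locally C#
We need to read all the Parquet files available in Azure Data Lake and dump them into a SQL database. However, because of some business rules and restrictions on my data, I want to filter the dataset without actually downloading the files to my local machine. Is there a NuGet package or library for .NET that can do this, ideally with sample code? Any suggestions?
Here is a workaround that works in Java: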
StorageCredentials credentials = new StorageCredentialsAccountAndKey(accountName, accountKey);
CloudStorageAccount connection = new CloudStorageAccount(credentials, true);
CloudBlobClient blobClient = connection.createCloudBlobClient();
CloudBlobContainer container = blobClient.getContainerReference(containerName);
CloudBlob blob = container.getBlockBlobReference(fileName);
Configuration config = new Configuration();
config.set("fs.azure", "org.apache.hadoop.fs.azure.NativeAzureFileSystem");
config.set("fs.azure.sas.<containerName>.<accountName>.blob.core.windows.net", token);
URI uri = new URI("wasbs://<containerName>@<accountName>.blob.core.windows.net/" + blob.getName());
InputFile file = HadoopInputFile.fromPath(new org.apache.hadoop.fs.Path(uri), config);
ParquetReader<GenericRecord> reader = AvroParquetReader.<GenericRecord> builder(file).build();
GenericRecord record;
while ((record = reader.read()) != null) {
    System.out.println(record);
}
reader.close();
Here is one workaround you can try in C#:
// Uses the classic storage SDK (CloudStorageAccount, CloudBlobClient) and the Parquet.Net 3.x constructor API.
var connectionString = "<YOUR CONNECTION STRING>";
var storageAccount = CloudStorageAccount.Parse(connectionString);
CloudBlobClient blobClient = storageAccount.CreateCloudBlobClient();
CloudBlobContainer container = blobClient.GetContainerReference("<YOUR CONTAINER NAME>");
// Build a time-limited SAS token for the blob.
SharedAccessBlobPolicy sasConstraints = new SharedAccessBlobPolicy();
sasConstraints.SharedAccessExpiryTime = DateTime.UtcNow.AddDays(2);
sasConstraints.Permissions = SharedAccessBlobPermissions.Read | SharedAccessBlobPermissions.Write | SharedAccessBlobPermissions.List;
CloudBlockBlob blob = container.GetBlockBlobReference("<YOUR PARQUET FILE>");
var blobUrlWithSAS = blob.Uri + blob.GetSharedAccessSignature(sasConstraints);
var client = new HttpClient();
// Parquet.Net needs a seekable stream (the Parquet footer sits at the end of the file),
// so buffer the HTTP response in memory rather than writing it to local disk.
var stream = new MemoryStream();
using (var httpStream = await client.GetStreamAsync(blobUrlWithSAS))
{
    await httpStream.CopyToAsync(stream);
}
stream.Position = 0;
var options = new ParquetOptions {
    TreatByteArrayAsString = true
};
var reader = new ParquetReader(stream, options);
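Once a reader can be opened on the stream, the filtering asked about in the question can happen in memory before anything is written to SQL. The following is only a minimal sketch, assuming Parquet.Net 3.x (synchronous `OpenRowGroupReader` and `ReadColumn`), a seekable input stream, and a flat schema with no repeated fields, so that every column in a row group has the same length. The class `ParquetFilterSketch`, the method `ReadMatchingRows`, and its parameters are hypothetical names introduced for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using Parquet;
using Parquet.Data;

public static class ParquetFilterSketch
{
    // Hypothetical helper: returns every row whose string value in
    // filterColumn equals wanted. Assumes a flat schema and Parquet.Net 3.x.
    public static List<object[]> ReadMatchingRows(
        Stream seekableStream, string filterColumn, string wanted)
    {
        var rows = new List<object[]>();
        var options = new ParquetOptions { TreatByteArrayAsString = true };

        using (var reader = new ParquetReader(seekableStream, options))
        {
            DataField[] fields = reader.Schema.GetDataFields();
            int filterIndex = Array.FindIndex(fields, f => f.Name == filterColumn);
            if (filterIndex < 0)
                throw new ArgumentException("Column not found: " + filterColumn);

            // A Parquet file is split into row groups; read them one at a time.
            for (int g = 0; g < reader.RowGroupCount; g++)
            {
                using (ParquetRowGroupReader groupReader = reader.OpenRowGroupReader(g))
                {
                    // Each DataColumn holds one column of this row group as an Array.
                    DataColumn[] columns = fields.Select(groupReader.ReadColumn).ToArray();
                    Array filterValues = columns[filterIndex].Data;

                    for (int r = 0; r < filterValues.Length; r++)
                    {
                        if (!string.Equals(filterValues.GetValue(r) as string, wanted))
                            continue;
                        // Keep the whole row, one value per column.
                        rows.Add(columns.Select(c => c.Data.GetValue(r)).ToArray());
                    }
                }
            }
        }
        return rows;
    }
}
```

The returned rows could then be loaded into a `DataTable` and sent to SQL Server with `SqlBulkCopy`. Note that this sketch still reads every column of every row group, so the filter saves database writes, not network traffic.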