I am currently using the ChoETL library to read Parquet data. This is the code:
BlobServiceClient blobServiceClient = new BlobServiceClient(azureStorage);
BlobContainerClient container = blobServiceClient.GetBlobContainerClient(contenedor);
var blobs = container.GetBlobs().Where(x => x.Name.Contains(".parquet"));
try
{
    foreach (var item in blobs)
    {
        var blob = container.GetBlobClient(item.Name);
        await blob.OpenReadAsync();
        //Here i'm trying to read the parquet file, as is shown in the official documentation https://github.com/Cinchoo/ChoETL/wiki/QuickParquetLoad
        foreach (dynamic e in new ChoParquetReader(outStream))
        {
            Console.WriteLine("Id: " + e.Id + " FormNumber: " + e.FormNumber);
        }
    }
}
catch (Exception ex)
{
    throw ex;
}
When I try to execute it, it throws an error at this line:
foreach (dynamic e in new ChoParquetReader(outStream))
{
    Console.WriteLine("Id: " + e.Id + " FormNumber: " + e.FormNumber);
}
Is there any solution? I tried Parquet.Net, but I don't like it.
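I cannot find where outStream is defined in your code, but I think that is the problem. Your code awaits blob.OpenReadAsync() but discards the result, so the blob's contents never reach the reader. You need to pass the Stream returned by blob.OpenReadAsync() to ChoParquetReader: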
BlobServiceClient blobServiceClient = new BlobServiceClient(azureStorage);
BlobContainerClient container = blobServiceClient.GetBlobContainerClient(contenedor);
var blobs = container.GetBlobs().Where(x => x.Name.Contains(".parquet"));
try
{
    foreach (var item in blobs)
    {
        var blob = container.GetBlobClient(item.Name);
        // Keep the stream returned by OpenReadAsync and dispose it after reading.
        using var stream = await blob.OpenReadAsync();
        // Read the Parquet file from the blob stream, as shown in the official documentation https://github.com/Cinchoo/ChoETL/wiki/QuickParquetLoad
        foreach (dynamic e in new ChoParquetReader(stream))
        {
            Console.WriteLine("Id: " + e.Id + " FormNumber: " + e.FormNumber);
        }
    }
}
catch (Exception)
{
    // Rethrow with "throw;" so the original stack trace is preserved ("throw ex;" resets it).
    throw;
}
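The underlying pattern can be seen without Azure or ChoETL. The following is a minimal sketch using only the .NET base class library. The local OpenReadAsync helper is a hypothetical stand-in for blob.OpenReadAsync(), and the MemoryStream contents are made up for illustration. It shows that the stream returned by the awaited call has to be assigned to a variable before anything can read from it:

```csharp
using System;
using System.IO;
using System.Text;
using System.Threading.Tasks;

class Program
{
    // Hypothetical stand-in for blob.OpenReadAsync(): asynchronously returns a readable stream.
    static Task<Stream> OpenReadAsync()
    {
        Stream s = new MemoryStream(Encoding.UTF8.GetBytes("Id,FormNumber"));
        return Task.FromResult(s);
    }

    static async Task Main()
    {
        // Wrong: the stream is opened and its reference is thrown away.
        // await OpenReadAsync();

        // Right: capture the returned stream and dispose it when the scope ends.
        using var stream = await OpenReadAsync();
        using var reader = new StreamReader(stream);

        // Any consumer (here a StreamReader) must be given that same stream instance.
        Console.WriteLine(await reader.ReadToEndAsync());
    }
}
```

In the real code, the ChoParquetReader takes the place of the StreamReader here and receives the stream captured from blob.OpenReadAsync().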