Issues with writing a C# function to load BigQuery with CSV file from Google Cloud Storage

Thanks in advance!

I am trying to write a C# function that truncates and reloads a table in BigQuery. The problem is that I cannot confirm the load job is actually honoring the jobOptions I pass in. I know execution reaches the else branch, but it just displays the number of rows within the table. I've verified that the specific file I am trying to load contains only one record. I suspect the problem lies in the job options, specifically the write disposition setting.

    public void LoadTableGcsCsv(string projectId, string datasetId)
    {
        BigQueryClient client = BigQueryClient.Create(projectId);
        var gcsURI = "gs://{bucketname}/{filename}.csv";
        var dataset = client.GetDataset(datasetId);
        var schema = new TableSchemaBuilder
        {
            { "col1", BigQueryDbType.String },
            { "col2", BigQueryDbType.String },
            { "col3", BigQueryDbType.Date },
            { "col4", BigQueryDbType.String },
            { "col5", BigQueryDbType.String },
            { "col6", BigQueryDbType.String },
            { "col7", BigQueryDbType.Int64 },
            { "col8", BigQueryDbType.Date },
            { "col9", BigQueryDbType.String },
            { "col10", BigQueryDbType.Int64 }
        }.Build();
        var destinationTableRef = dataset.GetTableReference(
            tableId: "destinationTableName");

        // Create job configuration
        var jobOptions = new CreateLoadJobOptions()
        {
            // The source format defaults to CSV; line below is optional.
            SourceFormat = FileFormat.Csv,
            CreateDisposition = CreateDisposition.CreateIfNeeded,
            WriteDisposition = WriteDisposition.WriteTruncate,
            SkipLeadingRows = 1
        };

        // Create and run job
        var loadJob = client.CreateLoadJob(
            sourceUri: gcsURI, destination: destinationTableRef,
            schema: schema, options: jobOptions);

        loadJob.PollUntilCompleted();  // Waits for the job to complete.

        if (loadJob.Status.ErrorResult != null)
        {
            foreach (ErrorProto error in loadJob.Status.Errors)
            {
                Console.WriteLine(error.Message);
                Console.ReadLine();
            }
        }
        else
        {
            // Display the number of rows uploaded
            BigQueryTable table = client.GetTable(destinationTableRef);
            Console.WriteLine(
                $"Loaded {table.Resource.NumRows} rows to {table.FullyQualifiedId}");
            Console.ReadLine();
        }
    }
}

Your job is probably failing without you knowing about it. PollUntilCompleted does not modify the instance it is called on; it returns a new instance instead. So before checking for errors you need to do:

var completedLoadJob = loadJob.PollUntilCompleted();
if (completedLoadJob.Status.ErrorResult != null) ...

Or, if you want to propagate an exception instead of writing the errors to the console, you can do:

loadJob.PollUntilCompleted().ThrowOnAnyError();
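
Putting the first variant together, the load-and-check step might look roughly like the sketch below. It assumes the Google.Cloud.BigQuery.V2 NuGet package and valid credentials; the class name LoadJobExample and the method LoadAndReport are hypothetical names used only for illustration, and the client, URI, table reference, schema, and options are expected to be built as in the question.

```csharp
using System;
using Google.Apis.Bigquery.v2.Data;
using Google.Cloud.BigQuery.V2;

public static class LoadJobExample
{
    // Hypothetical helper: starts the load job, waits for it, and reports the outcome.
    public static void LoadAndReport(
        BigQueryClient client, string gcsUri, TableReference destination,
        TableSchema schema, CreateLoadJobOptions options)
    {
        BigQueryJob loadJob = client.CreateLoadJob(
            sourceUri: gcsUri, destination: destination,
            schema: schema, options: options);

        // PollUntilCompleted returns a new BigQueryJob with the final state;
        // the original loadJob instance is left unchanged.
        BigQueryJob completedJob = loadJob.PollUntilCompleted();

        if (completedJob.Status.ErrorResult != null)
        {
            // Print the overall error first, then any detailed errors.
            Console.WriteLine(completedJob.Status.ErrorResult.Message);
            if (completedJob.Status.Errors != null)
            {
                foreach (ErrorProto error in completedJob.Status.Errors)
                {
                    Console.WriteLine(error.Message);
                }
            }
            return;
        }

        // Only reached when the completed job reports no error.
        BigQueryTable table = client.GetTable(destination);
        Console.WriteLine(
            $"Loaded {table.Resource.NumRows} rows to {table.FullyQualifiedId}");
    }
}
```

With this change, a failing job should show its error messages instead of falling through to the row count of the existing table.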
