
How to get the value of a ValueProvider and write it in a BigQuery table?

Good morning,

I created a Dataflow template that reads some data from BigQuery, applies some transformations, and writes the result to a new BigQuery table.

This template takes 2 parameters:

  • Input query
  • Project's name

I wanted to write the project's name to a BigQuery table through the 'WriteToBigQuery' transform, but instead of writing the project name supplied by the user, the pipeline fails with an error.

Do you know how I can get this value and write it?

Thank you for your help!

CODE:

    @classmethod
    def _add_argparse_args(cls, parser):
        parser.add_value_provider_argument(
            '--query',
            default='',
            help='q')
        parser.add_value_provider_argument(
            '--projet',
            default='',
            help='d')

[...]

    my_options = pipeline_options.view_as(BqReaderOptions).query
    myProjet = pipeline_options.view_as(BqReaderOptions).projet

    nb_val = (
        p
        | 'Readl' >> beam.io.ReadFromBigQuery(query=my_options, use_standard_sql=True)
        | beam.Map(lambda elem: elem == ' 0')
        | 'countVal' >> beam.combiners.Count.PerElement()
        | beam.Map(lambda elem: {"Nb": int(elem), 'projet': myProjet}))

ERROR:

    default_encoder "Object of type '%s' is not JSON serializable" % type(obj).__name__) TypeError: Object of type 'RuntimeValueProvider' is not JSON serializable [while running 'writeToBigQuery1/BigQueryBatchFileLoads/ParDo(WriteRecordsToFile)/ParDo(WriteRecordsToFile)/ParDo(WriteRecordsToFile)']

You're getting that error because you're outputting a ValueProvider as the result of a transform, and the write step then attempts a default JSON encoding of it, which fails. What you presumably intended is to output the project as a string rather than the raw ValueProvider. You can read the details on how to use ValueProvider in your own functions, but in short you need to create a DoFn that holds the ValueProvider and call its get method inside process, like so:

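    import apache_beam as beam

    class MyFn(beam.DoFn):
        def __init__(self, project):  # Pass in project as a ValueProvider
            self.project = project

        def process(self, elem):
            # get() is called at run time, when the template parameter is known
            yield {"Nb": int(elem), "project": self.project.get()}

    # Usage in place of the final beam.Map: | beam.ParDo(MyFn(myProjet))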
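
To see why calling get() makes the difference, here is a minimal sketch that needs only the standard library. DeferredValue is a hypothetical stand-in written for this illustration, not a Beam class; it only mimics the get() method of a ValueProvider. The row values are made up as well.

```python
import json


class DeferredValue:
    """Hypothetical stand-in for a ValueProvider: the value sits behind get()."""

    def __init__(self, value):
        self._value = value

    def get(self):
        return self._value


projet = DeferredValue("my-project")  # assumed example value

# Emitting the wrapper object itself reproduces the failure:
# the JSON encoder has no idea how to encode a DeferredValue.
try:
    json.dumps({"Nb": 3, "projet": projet})
    error = None
except TypeError as exc:
    error = str(exc)

# Calling get() first yields a plain string, which encodes without trouble.
row = json.dumps({"Nb": 3, "projet": projet.get()})

print(error)
print(row)
```

The same thing happens in the pipeline: the lambda in the question captures the provider object and puts it into each row, whereas a DoFn that calls get() inside process puts the resolved string into the row.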
