Is there a withFormatFunction equivalent in the Apache Beam Python SDK?
I'm passing a PCollection of dictionaries to the WriteToBigQuery class. However, some fields of each dictionary aren't meant to be written to the BigQuery tables. They are only needed to decide the table name for the element (in streaming mode), which is done by passing a callable in the table parameter. Is there a way to keep those fields out of the written rows in Beam Python? The Java SDK supports this through withFormatFunction of BigQueryIO. Cheers.
There isn't currently an equivalent of withFormatFunction in Beam Python. If you have a fixed set of output tables, you can branch earlier in the pipeline and use a separate WriteToBigQuery transform for each table. You could potentially also make a PCollection of objects of a type that acts like a dict (containing the payload) but also carries fields that the table name callable can read.
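To make the second suggestion concrete, here is a minimal sketch in plain Python of a dict-like element type. The names RoutedRow, to_routed_row and table_for, the event_type field, and the project/dataset in the table name are all made up for illustration; none of them come from Beam. The sketch only shows that the routing field can live outside the mapping and survive pickling. Whether it also survives every stage inside a real pipeline depends on the coder Beam picks for the element type, so treat that part as an assumption to verify.

```python
import pickle


class RoutedRow(dict):
    """Hypothetical dict subclass (not part of Beam).

    The mapping itself holds only the columns meant for BigQuery;
    routing-only data is kept in the `route` attribute.
    """

    def __init__(self, payload, route):
        super().__init__(payload)
        self.route = route


def to_routed_row(element, routing_keys=("event_type",)):
    """Split a plain dict into payload columns and routing-only fields."""
    payload = {k: v for k, v in element.items() if k not in routing_keys}
    route = {k: element[k] for k in routing_keys if k in element}
    return RoutedRow(payload, route)


def table_for(row):
    """Table-name callable: reads the attribute, not the payload."""
    # Assumed naming scheme; adjust project, dataset and prefix as needed.
    return "my-project:my_dataset.events_{}".format(row.route["event_type"])


element = {"event_type": "click", "user_id": 42, "url": "/home"}
row = to_routed_row(element)

# The routing field is gone from the mapping but still readable.
payload_columns = dict(row)
table_name = table_for(row)

# Elements are serialized between stages; with pickle the attribute survives.
restored = pickle.loads(pickle.dumps(row))

# In a pipeline this would be wired up roughly as follows (untested sketch):
#   rows | beam.Map(to_routed_row)
#        | beam.io.WriteToBigQuery(table=table_for, schema=...)
```

The payload is kept in the mapping rather than in a separate attribute so that anything treating the element as a plain dict sees only the BigQuery columns.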