Google Cloud Dataflow example
I am trying to insert multiple CSV files from Cloud Storage into BigQuery by following the link below, but I get the error "AttributeError: 'FileCoder' object has no attribute 'to_type_hint'". Can anyone help me?
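It looks like FileCoder mistakenly does not inherit from beam.coders.Coder; I suspect fixing that will make the problem go away.
In any case, it is actually better to use a DoFn here rather than a Coder, e.g.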
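import csv
import io

import apache_beam as beam


class CsvLineDecoder(beam.DoFn):
    """Decode CSV lines coming from the files into dicts keyed by column name."""

    def __init__(self, columns):
        self._columns = columns
        self._num_columns = len(columns)
        self._delimiter = ","

    def process(self, value):
        st = io.StringIO(value)
        # Parse the single line into a dict keyed by column name.
        cr = csv.DictReader(st,
                            self._columns,
                            delimiter=self._delimiter,
                            quotechar='"',
                            quoting=csv.QUOTE_MINIMAL)
        # process() must produce an iterable of outputs, so yield the one row.
        yield next(cr)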
Then use it as
(p
 | 'Read From Text - ' + input_file >> beam.io.ReadFromText(gs_path, skip_header_lines=1)
 | beam.ParDo(CsvLineDecoder(list(fields.keys())))
 ...)
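If you want to check the parsing logic before running a pipeline, the per-line decoding can be tried on its own with just the standard library. This is only a rough sketch: the helper name decode_csv_line and the sample columns and data are made up for illustration and do not come from the original code.

```python
import csv
import io


def decode_csv_line(line, columns, delimiter=","):
    """Parse one CSV line into a dict keyed by the given column names.

    Hypothetical helper for illustration; it mirrors what the DoFn
    does for each element, without needing Beam installed.
    """
    reader = csv.DictReader(io.StringIO(line),
                            fieldnames=columns,
                            delimiter=delimiter,
                            quotechar='"',
                            quoting=csv.QUOTE_MINIMAL)
    return next(reader)


# Made-up sample line: the quoted field contains the delimiter,
# and the csv module keeps it as a single value.
row = decode_csv_line('1,"Smith, John",42', ["id", "name", "score"])
print(row)
```

Note that every value comes back as a string, so convert them afterwards if later steps expect numbers.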