Structured Streaming in PySpark
I am trying to stream data from another server to HBase and be able to define different column families in Python. I have looked around in the Spark docs and only see:
writeStream.format('jdbc').start('jdbc:///')
How can I achieve the same kind of implementation that writes directly to HBase, with the ability to map data to different column families?
You can use foreach (Scala or Java) to write data to HBase: http://spark.apache.org/docs/latest/structured-streaming-programming-guide.html#using-foreach
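Since Spark 2.4, `foreach` is also available in Python: you pass an object with `open`/`process`/`close` methods to `DataStreamWriter.foreach`. Below is a minimal sketch of that pattern, writing each streaming row to HBase via the third-party happybase client. The host, table name (`events`), column families (`cf1`, `cf2`), and row fields (`id`, `name`, `value`) are assumptions for illustration, not part of your schema.

```python
class HBaseForeachWriter:
    """Writes each streaming row to HBase, mapping fields to column families.

    Hypothetical example: table 'events' with families 'cf1' and 'cf2'.
    """

    def __init__(self, host='hbase-host', table_name='events'):
        self.host = host
        self.table_name = table_name
        self.connection = None
        self.table = None

    def open(self, partition_id, epoch_id):
        # Called once per partition on the executor; create the connection
        # here so it is never serialized from the driver.
        import happybase  # third-party HBase Thrift client
        self.connection = happybase.Connection(self.host)
        self.table = self.connection.table(self.table_name)
        return True

    def process(self, row):
        # Map different fields to different column families via the
        # 'family:qualifier' key convention used by HBase.
        self.table.put(
            str(row.id).encode(),
            {
                b'cf1:name': str(row.name).encode(),
                b'cf2:value': str(row.value).encode(),
            },
        )

    def close(self, error):
        if self.connection is not None:
            self.connection.close()


# Usage inside a Spark job (sketch):
# query = df.writeStream.foreach(HBaseForeachWriter()).start()
```

An alternative is `foreachBatch` (also Spark 2.4+), which hands you each micro-batch as a regular DataFrame, letting you reuse any batch HBase connector instead of writing rows one at a time.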