Scala: Write Random Values to JSON and Save in File then Analyze in Spark
I would like to write ten (or a billion) events to JSON and save them as files.
I am writing in a Databricks notebook in Scala. I want the JSON string to have randomly generated values for fields like "Carbs":
{"Username": "patient1", "Carbs": 92, "Bolus": 24, "Basal": 1.33, "Date": "2017-06-28", "Timestamp": "2017-06-28 21:59:..."}
I successfully used the following to write the date to a DataFrame and then save it as a JSON file.
import org.apache.spark.sql.functions.current_date

val dateDF = spark.range(10)
  .withColumn("today", current_date())
But what is the best way to write random values to an Array and then save the Array as a JSON file?
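To see the shape of one such random event before involving Spark at all, here is a minimal plain-Scala sketch; the helper name `randomEvent`, the value ranges, and the fixed date are illustrative assumptions, not from the original post:

```scala
import java.time.LocalDate
import scala.util.Random

// Hypothetical helper: build one random event as a JSON line.
// Value ranges (carbs 0-149, bolus 0-29, basal 0.00-3.00) are assumptions.
def randomEvent(rng: Random, user: String): String = {
  val date  = LocalDate.of(2017, 6, 28)
  val carbs = rng.nextInt(150)
  val bolus = rng.nextInt(30)
  val basal = math.round(rng.nextDouble() * 300) / 100.0
  s"""{"Username": "$user", "Carbs": $carbs, "Bolus": $bolus, "Basal": $basal, "Date": "$date"}"""
}

val rng    = new Random(42)                 // fixed seed for repeatability
val events = (1 to 10).map(i => randomEvent(rng, s"patient$i"))
events.foreach(println)
```

Each element of `events` is one JSON object on one line, which is exactly the layout Spark's JSON reader and writer expect (JSON Lines).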
Convert the RDD to a DataFrame, then save it in JSON format:
dataframe.write.mode("append").json(path)
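Putting it together, a minimal sketch of generating the random columns and writing them out, assuming a live `SparkSession` named `spark` (as in a Databricks notebook); the column ranges and the output path `/tmp/random_events` are illustrative assumptions:

```scala
import org.apache.spark.sql.functions._

// Sketch only: value ranges and the output path are assumptions.
val eventsDF = spark.range(10)   // increase the count for more rows
  .withColumn("Username", concat(lit("patient"), ((col("id") % 5) + 1).cast("string")))
  .withColumn("Carbs", (rand() * 150).cast("int"))
  .withColumn("Bolus", (rand() * 30).cast("int"))
  .withColumn("Basal", round(rand() * 3, 2))
  .withColumn("Date", current_date())
  .withColumn("Timestamp", current_timestamp())
  .drop("id")

// Each row is written as one JSON object per line.
eventsDF.write.mode("append").json("/tmp/random_events")

// Read it back for analysis in Spark:
val readBack = spark.read.json("/tmp/random_events")
readBack.show(5, truncate = false)
```

Because `rand()` is evaluated per row on the executors, this scales to very large row counts (`spark.range(1000000000L)`) without materializing anything on the driver.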