How to create DataFrame from json data - dicts, lists and arrays within an array
How to add rows from a json (arrays of dicts) in a dataframe already existing?
Hi, I already have a DataFrame, df_init, with all of these columns:
A|B|C|D
I receive a JSON payload like this:
json=[{"A":"1","B":"2","C":"3"},
{"A":"1","B":"2","C":"3","D":"4"},
{"A":"1","B":"2"}]
I would like df_final to look like this:
A|B| C |D
1|2| 3 |None
1|2| 3 |4
1|2|None|None
If I do:
msgJSON=self.spark.sparkContext.parallelize([json_string],1)
df = self.sqlContext.read.option("multiLine", "true").options(samplingRatio=1.0).json(msgJSON)
But I get errors.
Thanks
json = [{"A":"1","B":"2","C":"3"},
{"A":"1","B":"2","C":"3","D":"4"},
{"A":"1","B":"2"}]
msgJSON = spark.sparkContext.parallelize([json],1)
df_final = sqlContext.read.option("multiLine","true").options(samplingRatio=1.0).json(msgJSON)
df_final.show()
+---+---+----+----+
| A| B| C| D|
+---+---+----+----+
| 1| 2| 3|null|
| 1| 2| 3| 4|
| 1| 2|null|null|
+---+---+----+----+
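The snippet above hands a Python list of dicts straight to parallelize and leaves the conversion to text implicit. A minimal sketch, assuming the payload is already parsed into Python objects, that serializes each record explicitly with json.dumps first; the names records and json_lines are hypothetical, and the Spark calls are left as comments because they assume an existing SparkSession called spark.

```python
import json

# Sample payload from the question, already parsed into Python dicts.
records = [
    {"A": "1", "B": "2", "C": "3"},
    {"A": "1", "B": "2", "C": "3", "D": "4"},
    {"A": "1", "B": "2"},
]

# Serialize one JSON document per record, so every RDD element
# is a complete JSON string and the multiLine option is not needed.
json_lines = [json.dumps(record) for record in records]

# Assumed usage with an existing SparkSession named `spark`:
# rdd = spark.sparkContext.parallelize(json_lines)
# df_new = spark.read.json(rdd)
# df_new.show()
```

Keys that are missing from a record, such as C and D in the last one, should come back as null once Spark merges the inferred schemas, as in the output above.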
I copied your code without the self keyword. You cannot use self just anywhere; it is only defined inside a class's methods.
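To append the new rows to the existing df_init rather than build a separate DataFrame, one option is to align every record to the known column list before handing it to Spark. The following is only a sketch under assumptions: COLUMNS and to_rows are hypothetical names, COLUMNS is assumed to match df_init.columns, and the Spark calls are left as comments because they need a live session.

```python
# Column order assumed to match the existing df_init (A|B|C|D).
COLUMNS = ["A", "B", "C", "D"]

records = [
    {"A": "1", "B": "2", "C": "3"},
    {"A": "1", "B": "2", "C": "3", "D": "4"},
    {"A": "1", "B": "2"},
]


def to_rows(records, columns=COLUMNS):
    """Turn each dict into a tuple in column order, using None for missing keys."""
    return [tuple(record.get(column) for column in columns) for record in records]


rows = to_rows(records)

# Assumed usage with an existing SparkSession `spark` and DataFrame `df_init`:
# df_new = spark.createDataFrame(rows, schema=df_init.schema)
# df_final = df_init.union(df_new)
```

Because union matches columns by position, fixing the column order up front keeps the new rows lined up with df_init.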