How can I convert a row from a dataframe in pyspark to a column but keep the column names?
I have an array that is made up of several arrays.
Zip the list and then call the DataFrame constructor:
df = spark.createDataFrame(zip(*all_data), cols)
df.show(truncate=False)
+-----------------------------+-----------+
|name |chromossome|
+-----------------------------+-----------+
|NM_019112.4(ABCA7):c.161-2A>T|19p13.3 |
|CCL2, 767C-G |17q11.2-q12|
+-----------------------------+-----------+
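The transposition itself is plain Python: `zip(*all_data)` turns a list of column arrays into row tuples, which `spark.createDataFrame` then pairs positionally with `cols`. A minimal sketch of that step, using the names `all_data` and `cols` from the answer with invented values for illustration:

```python
# Hypothetical column-oriented input: one inner list per column.
all_data = [
    ["NM_019112.4(ABCA7):c.161-2A>T", "CCL2, 767C-G"],  # name column
    ["19p13.3", "17q11.2-q12"],                          # chromossome column
]
cols = ["name", "chromossome"]

# zip(*all_data) transposes the columns into rows, one tuple per row.
rows = list(zip(*all_data))
print(rows)
# Each tuple lines up positionally with cols, which is exactly what
# spark.createDataFrame(zip(*all_data), cols) relies on.
```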
Or with zip_longest:
from itertools import zip_longest
df = spark.createDataFrame(zip_longest(*all_data, fillvalue=''), cols)
df.show()
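The difference matters when the inner lists are uneven: `zip` stops at the shortest column, silently dropping rows, while `zip_longest` pads the short columns with `fillvalue`. A quick illustration with made-up, uneven columns:

```python
from itertools import zip_longest

# Hypothetical uneven columns: chroms is one item short.
names = ["a", "b", "c"]
chroms = ["1p", "2q"]

truncated = list(zip(names, chroms))                     # zip stops at the shortest: 2 rows
padded = list(zip_longest(names, chroms, fillvalue=""))  # zip_longest pads: 3 rows

print(truncated)
print(padded)  # last row is ("c", "")
```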