How to save only the value in PySpark
I am saving the value of a column of a PySpark DataFrame to a variable. The DataFrame contains only one column:
variable=df.select(df['columnA']).collect()
print(variable)
Output:
[Row(columnA='value')]
But I want the variable to contain only 'value'. How can I achieve this?
Try the code below:
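# df is the PySpark DataFrame from the question. Do not load it with
# pandas: a pandas DataFrame has no collect() method.
# Retrieve the data from the "columnA" column, one Row at a time
for row in df.collect():
    print(row["columnA"])
# First row, first column: the bare value instead of a Row object
variable = df.collect()[0][0]
print(variable)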