Extracting values from selected columns and rows in PySpark

Question

Spark 3.0

I would like to extract specific values from selected columns of a Spark DataFrame and pass them to a print function, so they show up in my Jupyter sub-window. I will be doing this in a for loop so I can automate the monthly files.

For example: print('Average salary for a male in company A as an IT is 26000').

I tried x['company'][0][0], for example, but I am not getting the value I need.

Answer 1

This may be what you're looking for:

df.select('company').collect()[0][0]

0 2020-08-19 22:34:09
