contains pyspark SQL: TypeError: 'Column' object is not callable
I'm using Spark 2.0.1.
df.show()
+--------+------+---+-----+-----+----+
|Survived|Pclass|Sex|SibSp|Parch|Fare|
+--------+------+---+-----+-----+----+
| 0.0| 3.0|1.0| 1.0| 0.0| 7.3|
| 1.0| 1.0|0.0| 1.0| 0.0|71.3|
| 1.0| 3.0|0.0| 0.0| 0.0| 7.9|
| 1.0| 1.0|0.0| 1.0| 0.0|53.1|
| 0.0| 3.0|1.0| 0.0| 0.0| 8.1|
| 0.0| 3.0|1.0| 0.0| 0.0| 8.5|
| 0.0| 1.0|1.0| 0.0| 0.0|51.9|
I have a data frame and I want to add a new column to df using withColumn, where the value of the new column is based on the value of another column. I used something like this:
>>> dfnew = df.withColumn('AddCol' , when(df.Pclass.contains('3.0'),'three').otherwise('notthree'))
It gives this error:
TypeError: 'Column' object is not callable
Can anyone help me overcome this error?
It's because you are trying to call the function contains on the column. In this version of PySpark, Column has no contains method, so df.Pclass.contains is resolved as a field access that returns another Column, and calling that Column raises the TypeError. You should use like instead. Try this:
import pyspark.sql.functions as F
# like() takes a SQL LIKE pattern; the % wildcards turn it into a "contains" match
df = df.withColumn("AddCol", F.when(F.col("Pclass").like("%3%"), "three").otherwise("notthree"))
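A note on the pattern passed to like: it is a SQL LIKE pattern, where % matches any run of characters and _ matches exactly one character, and a pattern with no wildcards matches only the whole string. The sketch below emulates that behaviour in plain Python so it can be tried without Spark. The helper sql_like is hypothetical (it is not part of PySpark and ignores escape characters), and it assumes a numeric Pclass value is compared through its string form, for example "3.0".

```python
import re


def sql_like(value, pattern):
    """Hypothetical helper: emulate SQL LIKE for '%' and '_' (no escape handling)."""
    parts = []
    for ch in pattern:
        if ch == "%":
            parts.append(".*")           # any sequence of characters
        elif ch == "_":
            parts.append(".")            # exactly one character
        else:
            parts.append(re.escape(ch))  # everything else is literal
    return re.fullmatch("".join(parts), value, flags=re.DOTALL) is not None


results = {
    "exact": sql_like("3.0", "3"),       # no wildcard: whole string must equal "3"
    "contains": sql_like("3.0", "%3%"),  # "3" anywhere in the string
    "prefix": sql_like("3.0", "3%"),     # string starts with "3"
    "other": sql_like("1.0", "%3%"),     # no "3" at all
}
print(results)
```

Only the patterns with % match "3.0", which is why a bare "3" is not enough for a contains-style test.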
Or, if you just want it to be exactly the number 3, you should do:
import pyspark.sql.functions as F
# If the column Pclass is numeric
df = df.withColumn("AddCol",F.when(F.col("Pclass") == F.lit(3),"three").otherwise("notthree"))
# If the column Pclass is string
df = df.withColumn("AddCol",F.when(F.col("Pclass") == F.lit("3"),"three").otherwise("notthree"))
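As for why the message says "not callable" rather than "no attribute": PySpark's Column falls back to field access for attribute names it does not define, so an unknown name such as contains yields another Column instead of raising AttributeError. The toy class below imitates that fallback in plain Python. ToyColumn is a hypothetical stand-in written for illustration, not the real pyspark class.

```python
class ToyColumn:
    """Hypothetical stand-in for a column expression (not the real pyspark class)."""

    def __init__(self, expr):
        self.expr = expr

    def like(self, pattern):
        # A method that really is defined, so calling it works.
        return ToyColumn(f"{self.expr} LIKE '{pattern}'")

    def __getattr__(self, name):
        # Invoked only when normal attribute lookup fails.
        # Unknown names are treated as field access and yield another column.
        if name.startswith("__"):
            raise AttributeError(name)
        return ToyColumn(f"{self.expr}.{name}")


pclass = ToyColumn("Pclass")

ok = pclass.like("%3%")      # defined method: works
field = pclass.contains      # undefined name: silently becomes a field access

try:
    pclass.contains("3.0")   # calling the resulting ToyColumn instance fails
    message = None
except TypeError as exc:
    message = str(exc)

print(ok.expr)
print(field.expr)
print(message)
```

The failure happens at the call, not at the attribute lookup, which matches the shape of the TypeError in the question.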
In the Java API, you should use df.col(colName) instead of df.colName.
Example using Java 8 and Spark 2.1:
df.show();
+--------+------+---+-----+-----+----+
|Survived|Pclass|Sex|SibSp|Parch|Fare|
+--------+------+---+-----+-----+----+
| 0| 3| 1| 1| 0| 3|
| 1| 1| 0| 1| 0| 2|
+--------+------+---+-----+-----+----+
// requires: import static org.apache.spark.sql.functions.when;
df = df.withColumn("AddCol", when(df.col("Pclass").contains("3"), "three").otherwise("notthree"));
df.show();
+--------+------+---+-----+-----+----+--------+
|Survived|Pclass|Sex|SibSp|Parch|Fare| AddCol|
+--------+------+---+-----+-----+----+--------+
| 0| 3| 1| 1| 0| 3| three|
| 1| 1| 0| 1| 0| 2|notthree|
+--------+------+---+-----+-----+----+--------+