TypeError: field col1: LongType can not accept object '' in type <class 'str'>

I have JSON data in Python like this:

example = [{"col1":"","col2":"","col3":52272}, ...]

Any column in the JSON may be empty. An empty value is represented as "".

I created the Spark schema:

schema = StructType([
   StructField("col1", LongType(), True),
   StructField("col2", LongType(), True),
   StructField("col3", LongType(), True),])

I try to create the Spark DataFrame like this:

pandas_df = pd.DataFrame(example)
spark_df = spark.createDataFrame(pandas_df, schema = schema)

But I get this error:

TypeError: field col1: LongType can not accept object '' in type <class 'str'>

How can I fix this error? The same error happens if I use other types for this column.

As @tdelaney commented, your schema doesn't reflect your data: col1 and col2 are declared as LongType, but the values they hold are strings ("").
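
To see the mismatch before Spark reports it, you can list the Python types that actually occur in each column. This is only a minimal sketch in plain Python; the helper name observed_types is made up for illustration and is not part of PySpark.

```python
from collections import defaultdict


def observed_types(records):
    """Map each key to the set of Python type names seen across all records.

    Hypothetical helper for illustration; not a PySpark function.
    """
    seen = defaultdict(set)
    for record in records:
        for key, value in record.items():
            seen[key].add(type(value).__name__)
    return dict(seen)


example = [{"col1": "", "col2": "", "col3": 52272}]

# col1 and col2 contain str values, so a LongType field cannot accept them.
print(observed_types(example))
```

Any column that reports str here will be rejected by a LongType field when the schema is verified.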

You could try something like this:

from pyspark.sql import SparkSession
import pyspark.sql.functions as F
from pyspark.sql.types import IntegerType, StringType, StructField, StructType

if __name__ == "__main__":
    spark = SparkSession.builder.master("local").appName("Test").getOrCreate()
    data = [{"col1": "", "col2": "", "col3": 52272}]
    schema = StructType(
        [
            StructField("col1", StringType(), True),
            StructField("col2", StringType(), True),
            StructField("col3", IntegerType(), True),
        ]
    )
    df = spark.createDataFrame(data=data, schema=schema)
    df.show(truncate=False)

Which gives:

+----+----+-----+
|col1|col2|col3 |
+----+----+-----+
|    |    |52272|
+----+----+-----+

If, for example, you want to replace empty strings with None, you could use:

df = df.withColumn("col2", F.when(F.col("col2") != "", F.col("col2")).otherwise(None))

Which gives:

+----+----+-----+
|col1|col2|col3 |
+----+----+-----+
|    |null|52272|
+----+----+-----+
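
If you would rather keep the LongType schema from the question, another option is to turn the empty strings into None on the Python side before the data reaches Spark. The following is a minimal sketch in plain Python; the helper name blank_to_none is hypothetical, and it assumes "" is the only placeholder used for missing values.

```python
def blank_to_none(records):
    """Return copies of the records with every "" value replaced by None.

    Hypothetical helper for illustration; assumes "" is the only
    placeholder for a missing value.
    """
    return [
        {key: (None if value == "" else value) for key, value in record.items()}
        for record in records
    ]


example = [{"col1": "", "col2": "", "col3": 52272}]
cleaned = blank_to_none(example)

# The original list is left untouched; only the copies are changed.
print(cleaned)
```

The cleaned list should then be accepted by spark.createDataFrame(cleaned, schema=schema) with the nullable LongType fields, since None is allowed for a nullable field. Passing the list directly rather than through pandas is probably safer here, because pandas may convert a column that mixes integers and missing values to floats, which LongType would also reject.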
