I'd like to write a string value as an ObjectId to MongoDB in Java Spark.
I tried it like this:
List<StructField> structFields = new ArrayList<>();
structFields.add(DataTypes.createStructField("oid", DataTypes.StringType, true));
StructType structType = DataTypes.createStructType(structFields);
spark.sqlContext().udf().register("toObjectId", (String publisherId) -> {
    return new com.mongodb.spark.sql.fieldTypes.api.java.ObjectId(publisherId);
}, structType);
dataframe = dataframe.withColumn("pub_id",
    functions.callUDF("toObjectId", dataframe.col("publisherId").cast(DataTypes.StringType))
);
Map<String, String> writeOverrides = new HashMap<String, String>();
writeOverrides.put("writeConcern.w", "majority");
WriteConfig writeConfig = WriteConfig.create(jsc).withOptions(writeOverrides);
MongoSpark.save(dataframe, writeConfig);
But I got this error:
Caused by: scala.MatchError: com.mongodb.spark.sql.fieldTypes.api.java.ObjectId@8b2d2973 (of class com.mongodb.spark.sql.fieldTypes.api.java.ObjectId)
at org.apache.spark.sql.catalyst.CatalystTypeConverters$StructConverter.toCatalystImpl(CatalystTypeConverters.scala:236)
at org.apache.spark.sql.catalyst.CatalystTypeConverters$StructConverter.toCatalystImpl(CatalystTypeConverters.scala:231)
at org.apache.spark.sql.catalyst.CatalystTypeConverters$CatalystTypeConverter.toCatalyst(CatalystTypeConverters.scala:103)
at org.apache.spark.sql.catalyst.CatalystTypeConverters$$anonfun$createToCatalystConverter$2.apply(CatalystTypeConverters.scala:379)
How can I write an ObjectId to MongoDB in Spark through the Java API?
You don't need to use udf(). The MatchError happens because the UDF returns the ObjectId POJO while its declared return type is a struct. The MongoDB Spark connector writes a struct column containing a single string field named oid as a BSON ObjectId, so you can build that struct directly.

First import struct and col:

import static org.apache.spark.sql.functions.col;
import static org.apache.spark.sql.functions.struct;

Then use:
dataframe = dataframe.withColumn("pub_id", struct(col("publisherId").as("oid")));
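Putting it together with the write logic from your question, a minimal sketch could look like the following. The method and variable names (addObjectIdAndSave, dataframe, jsc) are illustrative, not from any API; it assumes the mongo-spark-connector and Spark SQL are on the classpath.

```java
import static org.apache.spark.sql.functions.col;
import static org.apache.spark.sql.functions.struct;

import java.util.HashMap;
import java.util.Map;

import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;

import com.mongodb.spark.MongoSpark;
import com.mongodb.spark.config.WriteConfig;

public class SaveWithObjectId {

    // Illustrative helper: adds an ObjectId-typed column and saves the result.
    static void addObjectIdAndSave(Dataset<Row> dataframe, JavaSparkContext jsc) {
        // A struct column {oid: "<hex string>"} is written by the connector
        // as a BSON ObjectId, so no UDF is needed.
        Dataset<Row> withId = dataframe.withColumn("pub_id",
                struct(col("publisherId").as("oid")));

        // Same write overrides as in the question.
        Map<String, String> writeOverrides = new HashMap<>();
        writeOverrides.put("writeConcern.w", "majority");
        WriteConfig writeConfig = WriteConfig.create(jsc).withOptions(writeOverrides);

        MongoSpark.save(withId, writeConfig);
    }
}
```

Note that the struct is built from the original column in one expression, so the whole thing stays a Catalyst-native transformation and avoids the POJO conversion that triggered the MatchError.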