I'm processing data with Scala 2.11.7 and Flink 1.3.2. Now I'd like to store the resulting org.apache.flink.api.scala.DataSet in a neo4j graph database.
There are GitHub projects for compatibility:
Which is the most promising way to go? Or would it be better to use Neo4j's REST API directly?
(BTW: Why does Stack Overflow restrict the number of links posted...?)
I tried flink-neo4j, but it seems that there are some problems with mixing Java and Scala classes:
package dummy.neo4j

import org.apache.flink.api.common.io.OutputFormat
import org.apache.flink.api.java.io.neo4j.Neo4jOutputFormat
import org.apache.flink.api.java.tuple.{Tuple, Tuple2}
import org.apache.flink.api.scala._

object Neo4jDummyWriter {
  def main(args: Array[String]) {
    val env = ExecutionEnvironment.getExecutionEnvironment
    val outputFormat: OutputFormat[_ <: Tuple] = Neo4jOutputFormat.buildNeo4jOutputFormat.setRestURI("http://localhost:7474/db/data/")
      .setConnectTimeout(1000).setReadTimeout(1000).setCypherQuery("UNWIND {inserts} AS i CREATE (a:User {name:i.name, born:i.born})")
      .addParameterKey(0, "name").addParameterKey(1, "born").setTaskBatchSize(1000).finish
    val tuple1: Tuple = new Tuple2("abc", 1)
    val tuple2: Tuple = new Tuple2("def", 2)
    val test = env.fromElements[Tuple](tuple1, tuple2)
    println("test: " + test.getClass)
    test.output(outputFormat)
  }
}
Exception in thread "main" java.lang.ClassCastException: [Ljava.lang.Object; cannot be cast to [Lorg.apache.flink.api.common.typeinfo.TypeInformation;
    at dummy.neo4j.Neo4jDummyWriter$.main(Neo4jDummyWriter.scala:20)
    at dummy.neo4j.Neo4jDummyWriter.main(Neo4jDummyWriter.scala)
and
Type mismatch, expected: OutputFormat[Tuple], actual: OutputFormat[_ <: Tuple]
The solution is to keep the elements typed as Tuple2 instead of widening them to Tuple, and to cast the output format to the matching type:
package dummy.neo4j

import org.apache.flink.api.common.io._
import org.apache.flink.api.java.io.neo4j.Neo4jOutputFormat
import org.apache.flink.api.java.tuple.Tuple2
import org.apache.flink.api.scala._

object Neo4jDummyWriter {
  def main(args: Array[String]) {
    val env = ExecutionEnvironment.getExecutionEnvironment
    val tuple1 = ("user9", 1978)
    val tuple2 = ("user10", 1996)
    val datasetWithScalaTuples = env.fromElements(tuple1, tuple2)
    // Convert Scala tuples to Flink's Java Tuple2, keeping the concrete element types.
    val dataset: DataSet[Tuple2[String, Int]] = datasetWithScalaTuples.map(tuple => new Tuple2(tuple._1, tuple._2))
    // Build the output format, then cast it to the exact element type of the dataset.
    val outputFormat = Neo4jOutputFormat.buildNeo4jOutputFormat.setRestURI("http://localhost:7474/db/data/").setUsername("neo4j").setPassword("...")
      .setConnectTimeout(1000).setReadTimeout(1000).setCypherQuery("UNWIND {inserts} AS i CREATE (a:User {name:i.name, born:i.born})")
      .addParameterKey(0, "name").addParameterKey(1, "born").setTaskBatchSize(1000).finish.asInstanceOf[OutputFormat[Tuple2[String, Int]]]
    dataset.output(outputFormat)
    // Nothing is written until the job is executed.
    env.execute
  }
}
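One way to read the "Type mismatch" error and the asInstanceOf cast above: a Java generic interface is invariant in its type parameter, so an OutputFormat[_ <: Tuple] is not accepted where an OutputFormat of the dataset's exact element type is required. The sketch below reproduces this without Flink or Neo4j. The classes Tuple, Tuple2, OutputFormat and Builder in it are hypothetical stand-ins written for this illustration, not the real Flink or flink-neo4j classes, and the assumption is that the real ones behave the same way with respect to variance.

```scala
// Hypothetical stand-ins for Flink's Java tuple classes (illustration only).
class Tuple
class Tuple2[A, B](val f0: A, val f1: B) extends Tuple

// Invariant in T, as a Java generic interface would be.
trait OutputFormat[T] {
  def writeRecord(record: T): String
}

// Like the builder in the question, this only promises "some subtype of Tuple".
object Builder {
  def finish: OutputFormat[_ <: Tuple] = new OutputFormat[Tuple] {
    def writeRecord(record: Tuple): String = record match {
      case t: Tuple2[_, _] => s"${t.f0},${t.f1}"
      case other           => other.toString
    }
  }
}

object VarianceDemo {
  // Requires a format for exactly T, mirroring the shape of DataSet[T].output.
  def output[T](records: Seq[T], format: OutputFormat[T]): Seq[String] =
    records.map(format.writeRecord)

  def run(): Seq[String] = {
    val records = Seq(new Tuple2("user9", 1978), new Tuple2("user10", 1996))
    // output(records, Builder.finish)  // rejected by the compiler: type mismatch
    // The cast is unchecked (type arguments are erased), so it is only safe
    // because the format really can handle Tuple2 records.
    val format = Builder.finish.asInstanceOf[OutputFormat[Tuple2[String, Int]]]
    output(records, format)
  }
}
```

Calling VarianceDemo.run() returns one formatted string per record. The cast shifts responsibility from the compiler to the caller, which is acceptable here only because the element type of the dataset and the parameter keys of the output format are defined side by side.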