
neo4j with Flink and Scala

I'm processing data with Scala 2.11.7 and Flink 1.3.2. Now I'd like to store the resulting org.apache.flink.api.scala.DataSet in a neo4j graph database.

There are GitHub projects providing compatibility:

  • Flink with neo4j: https://github.com/s1ck/flink-neo4j
  • Scala with neo4j: https://github.com/FaKod/neo4j-scala
  • Flink's graph library "Gelly" with neo4j: https://github.com/albertodelazzari/gelly-neo4j

What is the most promising way to go? Or would it be better to use neo4j's REST API directly?

(BTW: Why does stackoverflow restrict the number of links posted...?)
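For the direct REST route, a minimal sketch is shown below. It assumes a Neo4j 3.x server exposing the legacy transactional Cypher endpoint at /db/data/transaction/commit and reuses the same UNWIND ... CREATE query as the Flink output format; the object name and helper functions here are made up for illustration, not part of any library:

```scala
import java.net.{HttpURLConnection, URL}
import java.nio.charset.StandardCharsets

object Neo4jRestSketch {

  // Build the JSON body expected by Neo4j's transactional Cypher endpoint:
  // {"statements":[{"statement": <cypher>, "parameters": {...}}]}
  // Naive string concatenation for brevity; a real client should use a JSON library.
  def buildPayload(rows: Seq[(String, Int)]): String = {
    val inserts = rows
      .map { case (name, born) => s"""{"name":"$name","born":$born}""" }
      .mkString("[", ",", "]")
    s"""{"statements":[{"statement":"UNWIND {inserts} AS i CREATE (a:User {name:i.name, born:i.born})","parameters":{"inserts":$inserts}}]}"""
  }

  // Hypothetical sender: POSTs the payload to the transactional endpoint
  // and returns the HTTP status code; adjust host and credentials as needed.
  def send(payload: String,
           uri: String = "http://localhost:7474/db/data/transaction/commit"): Int = {
    val conn = new URL(uri).openConnection().asInstanceOf[HttpURLConnection]
    conn.setRequestMethod("POST")
    conn.setRequestProperty("Content-Type", "application/json")
    conn.setDoOutput(true)
    val out = conn.getOutputStream
    out.write(payload.getBytes(StandardCharsets.UTF_8))
    out.close()
    conn.getResponseCode
  }

  def main(args: Array[String]): Unit = {
    val payload = buildPayload(Seq(("user9", 1978), ("user10", 1996)))
    println(payload)
    // send(payload)  // uncomment when a Neo4j 3.x instance is running locally
  }
}
```

This skips Flink entirely, so it only makes sense for small result sets collected on the driver; for large DataSets a proper OutputFormat (as below) keeps the writes parallel.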

I tried flink-neo4j, but it seems that there are some problems with mixing Java and Scala classes:

package dummy.neo4j

import org.apache.flink.api.common.io.OutputFormat
import org.apache.flink.api.java.io.neo4j.Neo4jOutputFormat
import org.apache.flink.api.java.tuple.{Tuple, Tuple2}
import org.apache.flink.api.scala._

object Neo4jDummyWriter {

  def main(args: Array[String]) {
    val env = ExecutionEnvironment.getExecutionEnvironment

    val outputFormat: OutputFormat[_ <: Tuple] = Neo4jOutputFormat.buildNeo4jOutputFormat
      .setRestURI("http://localhost:7474/db/data/")
      .setConnectTimeout(1000)
      .setReadTimeout(1000)
      .setCypherQuery("UNWIND {inserts} AS i CREATE (a:User {name:i.name, born:i.born})")
      .addParameterKey(0, "name")
      .addParameterKey(1, "born")
      .setTaskBatchSize(1000)
      .finish

    val tuple1: Tuple = new Tuple2("abc", 1)
    val tuple2: Tuple = new Tuple2("def", 2)

    val test = env.fromElements[Tuple](tuple1, tuple2)
    println("test: " + test.getClass)
    test.output(outputFormat)
  }

}

Exception in thread "main" java.lang.ClassCastException: [Ljava.lang.Object; cannot be cast to [Lorg.apache.flink.api.common.typeinfo.TypeInformation;
    at dummy.neo4j.Neo4jDummyWriter$.main(Neo4jDummyWriter.scala:20)
    at dummy.neo4j.Neo4jDummyWriter.main(Neo4jDummyWriter.scala)

and

Type mismatch, expected: OutputFormat[Tuple], actual: OutputFormat[_ <: Tuple]

The solution is to not upcast the Tuple2 objects to Tuple, and instead cast the output format to the concrete element type:

package dummy.neo4j

import org.apache.flink.api.common.io._
import org.apache.flink.api.java.io.neo4j.Neo4jOutputFormat
import org.apache.flink.api.java.tuple.Tuple2
import org.apache.flink.api.scala._

object Neo4jDummyWriter {

  def main(args: Array[String]) {
    val env = ExecutionEnvironment.getExecutionEnvironment

    val tuple1 = ("user9", 1978)
    val tuple2 = ("user10", 1996)
    val datasetWithScalaTuples = env.fromElements(tuple1, tuple2)
    val dataset: DataSet[Tuple2[String, Int]] = datasetWithScalaTuples.map(tuple => new Tuple2(tuple._1, tuple._2))

    val outputFormat = Neo4jOutputFormat.buildNeo4jOutputFormat
      .setRestURI("http://localhost:7474/db/data/")
      .setUsername("neo4j")
      .setPassword("...")
      .setConnectTimeout(1000)
      .setReadTimeout(1000)
      .setCypherQuery("UNWIND {inserts} AS i CREATE (a:User {name:i.name, born:i.born})")
      .addParameterKey(0, "name")
      .addParameterKey(1, "born")
      .setTaskBatchSize(1000)
      .finish
      .asInstanceOf[OutputFormat[Tuple2[String, Int]]]

    dataset.output(outputFormat)
    env.execute
  }

}
