
Vertex Property Inheritance - Graphx Scala Spark

--- Edit ---

My main issue is that I do not understand this paragraph given in the GraphX documentation:

In some cases it may be desirable to have vertices with different property types in the same graph. This can be accomplished through inheritance. For example, to model users and products as a bipartite graph we might do the following:

class VertexProperty()
case class UserProperty(val name: String) extends VertexProperty
case class ProductProperty(val name: String, val price: Double) extends VertexProperty
// The graph might then have the type:
var graph: Graph[VertexProperty, String] = null
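To make the hierarchy above concrete, here is a minimal, Spark-free sketch (the `describe` helper and its output strings are illustrative, not from the docs): once values of both subtypes sit behind the common supertype, pattern matching recovers the concrete type.

```scala
// The hierarchy from the GraphX docs, made serializable for Spark use.
class VertexProperty() extends Serializable
case class UserProperty(name: String) extends VertexProperty
case class ProductProperty(name: String, price: Double) extends VertexProperty

object InheritanceSketch {
  // Given a value typed as the supertype, pattern matching gets the subtype back.
  def describe(vp: VertexProperty): String = vp match {
    case UserProperty(n)       => s"user:$n"
    case ProductProperty(n, p) => s"product:$n@$p"
    case _                     => "unknown"
  }

  def main(args: Array[String]): Unit = {
    // Both subtypes fit in one collection of the supertype.
    val props: Seq[VertexProperty] =
      Seq(UserProperty("alice"), ProductProperty("book", 9.99))
    props.map(describe).foreach(println)
  }
}
```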

In the above case, given an RDD of each of UserProperty and ProductProperty and an RDD of EdgeProperty, how does one create a graph of type Graph[VertexProperty, String]? I am looking for an example.


This will help you to create a bipartite graph, where the vertex property lets you distinguish the different vertex classes.

// High-level interface (plays the role of VertexProperty)

trait Node { def getVertexID: Long }

class UserNode(sID: String, sName: String, sAge: String) extends Node with Serializable {
  override def getVertexID: Long = sID.toLong // assumes sID holds a numeric id
}

class ProductNode(sID: String, sNO: String, sDOE: String) extends Node with Serializable {
  override def getVertexID: Long = sID.toLong
}

// Data loading

val users: RDD[Node] = sc.textFile("users.txt")
  .map { row =>
    val cols = row.split(",")
    new UserNode(cols(0), cols(1), cols(2))
  }

val products: RDD[Node] = sc.textFile("products.txt")
  .map { row =>
    val cols = row.split(",")
    new ProductNode(cols(0), cols(1), cols(3))
  }

// Union both RDDs into one RDD of the common supertype

val nodes: RDD[Node] = users ++ products
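From here the graph itself can be assembled: key each Node by its vertex id and combine with an edge RDD. The sketch below is a hypothetical continuation (the `buildGraph` helper, the sample node class, and the edge data are illustrative, not from the original answer); it repeats a minimal Node definition so it is self-contained.

```scala
import org.apache.spark.SparkContext
import org.apache.spark.graphx.{Edge, Graph}
import org.apache.spark.rdd.RDD

// Mirrors the trait defined above, repeated so this sketch compiles on its own.
trait Node extends Serializable { def getVertexID: Long }
case class UserNode(id: Long, name: String) extends Node {
  override def getVertexID: Long = id
}

object GraphBuildSketch {
  // Key each Node by its vertex id and assemble a Graph[Node, String].
  // The String edge attribute is illustrative only.
  def buildGraph(nodes: RDD[Node], edges: RDD[Edge[String]]): Graph[Node, String] =
    Graph(nodes.map(n => (n.getVertexID, n)), edges)

  def main(args: Array[String]): Unit = {
    val sc = new SparkContext("local[1]", "bipartite-sketch")
    val nodes: RDD[Node] =
      sc.parallelize(Seq[Node](UserNode(1L, "alice"), UserNode(2L, "bob")))
    val edges = sc.parallelize(Seq(Edge(1L, 2L, "follows")))
    val graph = buildGraph(nodes, edges)
    println(s"${graph.vertices.count()} vertices, ${graph.edges.count()} edges")
    sc.stop()
  }
}
```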

You can use a message type that can be merged, for example an Iterable[YourClass]. However, you have to take into consideration that merges of this kind can grow very large.
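As a minimal, Spark-free illustration of a mergeable message (the `MergeSketch` name is mine): two Iterable payloads combine by concatenation, which is exactly why the merged collection can grow with every merge step.

```scala
// Sketch of a merge function for Iterable-valued messages, of the shape
// GraphX's aggregateMessages expects for its mergeMsg argument.
// Concatenation is associative, but the result grows with each merge.
object MergeSketch {
  def mergeMsg[A](a: Iterable[A], b: Iterable[A]): Iterable[A] = a ++ b
}
```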

It is a Scala question: convert the elements from the extended type to the abstract type before taking the union. Note that RDD is invariant, so the cast has to be applied per element (via map), not to the RDD itself, for example:

val variable1: RDD[UserProperty]  = {..your code..}
val variable2: RDD[ProductProperty]  = {..your code..}
val result: RDD[VertexProperty] = sc.union(
  variable1.map(_.asInstanceOf[VertexProperty]),
  variable2.map(_.asInstanceOf[VertexProperty]))
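A self-contained sketch of this upcast-then-union pattern (the `UnionSketch` object is illustrative; it uses a type ascription, which achieves the same upcast as asInstanceOf but is checked at compile time):

```scala
import org.apache.spark.rdd.RDD

// The hierarchy from the question, repeated so the sketch compiles on its own.
class VertexProperty() extends Serializable
case class UserProperty(name: String) extends VertexProperty
case class ProductProperty(name: String, price: Double) extends VertexProperty

object UnionSketch {
  // Upcast each RDD's elements to the supertype, then union the RDDs.
  def unionProps(users: RDD[UserProperty],
                 products: RDD[ProductProperty]): RDD[VertexProperty] =
    users.map(u => u: VertexProperty) ++ products.map(p => p: VertexProperty)
}
```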

The same goes for the edge attribute, use

val edge: Edge[EdgeProperty] = Edge(srcID, dstID, variable.asInstanceOf[EdgeProperty])

