
Conversion of graph data from Edge Input Format to Vertex Input Format

I am experimenting with Giraph. To run the algorithms in Giraph, I need the graph data to be in vertex input format, but almost all of the large graph datasets available online are in edge list format. I wrote a Java program to convert the edge list format into VertexInputFormat. It works for smaller graphs with almost 800k edges. For the graph I actually need, however, every run fails with a heap space exceeded error. I tried increasing the heap size to the maximum, but the error persisted.

The file I am running it on is about 15 GB in size.

I don't know much about how the algorithms (PageRank, SingleSourceShortestPath, etc.) are written in Giraph, but I do know that they all take a graph in VertexInputFormat as input.

The help I am looking for is one of the following:

  1. Optimized code to convert EdgeInputFormat to VertexInputFormat, or
  2. Any online tool that can help with this conversion, or
  3. A PageRank implementation that takes EdgeInputFormat as input.

Sorry, I don't understand why you want to use only VertexInputFormat. Giraph also provides an EdgeInputFormat API; why can't you use that?
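
If a vertex-format file is still required, a heap error on a 15 GB input usually suggests that the whole adjacency structure is being built in memory, though that is a guess, since the converter's code is not shown. One way around it is to sort the edge list by source id on disk first (for example with the Unix `sort` utility, which spills to temporary files) and then stream through it, holding only one vertex's edges at a time. Below is a minimal sketch of that streaming step. The class name `EdgeListToAdjacency` is made up for illustration, and the assumptions are: whitespace-separated `source target` lines, numeric ids, lines starting with `#` treated as comments, a placeholder vertex value of `0`, and a placeholder edge weight of `1`. The output layout `[id,value,[[target,weight],...]]` is meant to match what Giraph's JSON vertex input formats expect, as far as I recall; check it against the input format class you actually use.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper, not part of Giraph.
public class EdgeListToAdjacency {

    /**
     * Streams an edge list that is ALREADY sorted (grouped) by source id and
     * writes one line per source vertex: [id,0,[[target,1],[target,1],...]].
     * Only the edges of the current source vertex are held in memory.
     */
    public static void convert(BufferedReader in, Writer out) throws IOException {
        String currentSource = null;
        StringBuilder edges = new StringBuilder();
        String line;
        while ((line = in.readLine()) != null) {
            line = line.trim();
            // Assumption: blank lines and '#' comment lines carry no edges.
            if (line.isEmpty() || line.startsWith("#")) {
                continue;
            }
            String[] parts = line.split("\\s+");
            if (parts.length < 2) {
                continue; // skip malformed lines
            }
            String source = parts[0];
            String target = parts[1];
            if (!source.equals(currentSource)) {
                // A new source id starts: emit the previous vertex, then reset.
                flush(currentSource, edges, out);
                currentSource = source;
                edges.setLength(0);
            }
            if (edges.length() > 0) {
                edges.append(',');
            }
            // Placeholder edge weight of 1.
            edges.append('[').append(target).append(",1]");
        }
        flush(currentSource, edges, out); // emit the last vertex
    }

    private static void flush(String source, StringBuilder edges, Writer out)
            throws IOException {
        if (source == null) {
            return; // nothing read yet
        }
        // Placeholder vertex value of 0.
        out.write("[" + source + ",0,[" + edges + "]]\n");
    }

    // Usage: java EdgeListToAdjacency sorted-edges.txt vertices.json
    public static void main(String[] args) throws IOException {
        try (BufferedReader in = Files.newBufferedReader(Paths.get(args[0]), StandardCharsets.UTF_8);
             BufferedWriter out = Files.newBufferedWriter(Paths.get(args[1]), StandardCharsets.UTF_8)) {
            convert(in, out);
        }
    }
}
```

Note that a vertex with only incoming edges never appears as a source, so this sketch writes no line for it. Whether that matters depends on how your job handles vertices that are missing from the input.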
