
Converting edgelist data to a networkx object

I have a file that looks like this:

1 2 1
1 3 1
2 999 1
2 1029 1
2 1031 1
2 1032 1
2 1197 1
2 1226 1
2 1296 1
3 450 1
3 933 1
3 934 1
3 955 1
3 1032 1
4 5 1

and I want to convert it to a networkx graph, but I am getting the following error:

   G=nx.read_edgelist(fh)
  File "<decorator-gen-400>", line 2, in read_edgelist
  File "E:\anaconda\lib\site-packages\networkx\utils\decorators.py", line 227, in _open_file
    result = func_to_be_decorated(*new_args, **kwargs)
  File "E:\anaconda\lib\site-packages\networkx\readwrite\edgelist.py", line 378, in read_edgelist
    data=data)
  File "E:\anaconda\lib\site-packages\networkx\readwrite\edgelist.py", line 288, in parse_edgelist
    "Failed to convert edge data (%s) to dictionary." % (d))
TypeError: Failed to convert edge data (['1']) to dictionary.

Here is the code:

fh=open("YST_full.net", 'rb')
G=nx.read_edgelist(fh)
fh.close()

What am I doing wrong here?

Edit: I tried converting it to a pandas DataFrame:

df=pd.read_csv("YST_full.net",sep=" ",names=['node1','node2','weight'])
print(df)

G=nx.from_pandas_edgelist(df, 'node1', 'node2', ['weight'])

and now I want to write it out in GraphML format:

nx.write_graphml(G, "YST_full.graphml")

but I get this error:

    nx.write_graphml(G, "YST_full.graphml")
  File "<decorator-gen-440>", line 2, in write_graphml_lxml
  File "E:\anaconda\lib\site-packages\networkx\utils\decorators.py", line 227, in _open_file
    result = func_to_be_decorated(*new_args, **kwargs)
  File "E:\anaconda\lib\site-packages\networkx\readwrite\graphml.py", line 149, in write_graphml_lxml
    infer_numeric_types=infer_numeric_types)
  File "E:\anaconda\lib\site-packages\networkx\readwrite\graphml.py", line 596, in __init__
    self.add_graph_element(graph)
  File "E:\anaconda\lib\site-packages\networkx\readwrite\graphml.py", line 658, in add_graph_element
    T = self.xml_type[self.attr_type(k, "edge", v)]
KeyError: <class 'numpy.int64'>

You must tell networkx that the third column is an attribute called "weight" (or whatever name you prefer):

graph = nx.read_edgelist("YST_full.net", data=(('weight', float),))
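To see why the `data` argument matters, here is a minimal sketch that parses a few rows from the question in memory with `nx.parse_edgelist`, which takes the same `data` specification as `read_edgelist`. By default the extra columns are expected to form a Python dict literal such as `{'weight': 1}`, so a bare `1` is rejected. The `nodetype=int` argument is an addition not present in the answer above; without it the node labels stay strings.

```python
import networkx as nx

# A few rows from the question's file: "source target weight".
lines = ["1 2 1", "1 3 1", "2 999 1", "4 5 1"]

# Default behaviour (data=True): the third column must be a dict literal,
# so a bare "1" raises the TypeError shown in the question.
try:
    nx.parse_edgelist(lines)
    default_failed = False
except TypeError:
    default_failed = True

# Naming and typing the column lets the parser accept it.
graph = nx.parse_edgelist(lines, nodetype=int, data=(("weight", float),))

print(default_failed)  # True
print(graph[1][2])     # {'weight': 1.0}
```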

As for your second question: the KeyError shows that the GraphML writer in this networkx version has no type mapping for numpy.int64, which is the dtype pandas assigned to the weight column, so the value is not converted to a Python int on export. You have to convert it yourself first:

# Replace the NumPy scalars with built-in Python numbers before exporting
weights = {(n1, n2): float(d['weight'])  # or int()
           for n1, n2, d in G.edges(data=True)}
nx.set_edge_attributes(G, weights, 'weight')
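If a graph carries more than one attribute, the same idea can be generalised. The sketch below is only an illustration: `to_builtin` and `normalise_edge_attributes` are hypothetical helper names, not part of networkx. It relies on the fact that NumPy scalars such as `numpy.int64` expose an `.item()` method that returns the equivalent built-in value, and it writes to an in-memory buffer instead of a file.

```python
import io
import networkx as nx


def to_builtin(value):
    """Return a built-in scalar for NumPy scalars; leave other values alone."""
    # NumPy scalar types provide .item(); plain int/float/str do not.
    return value.item() if hasattr(value, "item") else value


def normalise_edge_attributes(G):
    """Convert every edge attribute of G in place (hypothetical helper)."""
    for _, _, attrs in G.edges(data=True):
        for key, value in attrs.items():
            attrs[key] = to_builtin(value)
    return G


G = nx.Graph()
G.add_edge(1, 2, weight=1)  # with pandas input this would be a numpy.int64
normalise_edge_attributes(G)

buffer = io.BytesIO()       # stands in for "YST_full.graphml"
nx.write_graphml(G, buffer)
```

Running the helper just before `nx.write_graphml` keeps the numeric type of the weights, whereas converting the columns to strings changes it.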

This error is caused by the dtypes of your pandas DataFrame. A workaround is to convert the DataFrame columns to string dtype; note that the weight is then exported with attr.type="string", as the output below shows.

df = df.apply(lambda x: x.astype(str))
G=nx.from_pandas_edgelist(df, 'node1', 'node2', 'weight')
nx.write_graphml(G,'test.out')

Output:

<?xml version='1.0' encoding='utf-8'?>
<graphml xmlns="http://graphml.graphdrawing.org/xmlns" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://graphml.graphdrawing.org/xmlns http://graphml.graphdrawing.org/xmlns/1.0/graphml.xsd"><key attr.name="weight" attr.type="string" for="edge" id="d0"/>
<graph edgedefault="undirected"><node id="1"/>
<node id="2"/>
<node id="3"/>
<node id="999"/>
<node id="1029"/>
<node id="1031"/>
<node id="1032"/>
<node id="1197"/>
<node id="1226"/>
<node id="1296"/>
<node id="450"/>
<node id="933"/>
<node id="934"/>
<node id="955"/>
<node id="4"/>
<node id="5"/>
<edge source="1" target="2">
  <data key="d0">1</data>
</edge>
<edge source="1" target="3">
  <data key="d0">1</data>
</edge>
<edge source="2" target="999">
  <data key="d0">1</data>
</edge>
<edge source="2" target="1029">
  <data key="d0">1</data>
</edge>
<edge source="2" target="1031">
  <data key="d0">1</data>
</edge>
<edge source="2" target="1032">
  <data key="d0">1</data>
</edge>
<edge source="2" target="1197">
  <data key="d0">1</data>
</edge>
<edge source="2" target="1226">
  <data key="d0">1</data>
</edge>
<edge source="2" target="1296">
  <data key="d0">1</data>
</edge>
<edge source="3" target="450">
  <data key="d0">1</data>
</edge>
<edge source="3" target="933">
  <data key="d0">1</data>
</edge>
<edge source="3" target="934">
  <data key="d0">1</data>
</edge>
<edge source="3" target="955">
  <data key="d0">1</data>
</edge>
<edge source="3" target="1032">
  <data key="d0">1</data>
</edge>
<edge source="4" target="5">
  <data key="d0">1</data>
</edge>
</graph></graphml>

I started learning network analysis today, and this was the first error I ran into. I fixed it by changing G=nx.read_edgelist(fh) to G=nx.read_weighted_edgelist(fh).

You can also remove the third column and keep using G=nx.read_edgelist(fh).

If you change G=nx.read_edgelist(fh) to G=nx.read_weighted_edgelist(fh), it will work.
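As a quick check of that suggestion, the sketch below feeds a few of the question's rows to `nx.read_weighted_edgelist` through an in-memory binary handle, which stands in for `open("YST_full.net", 'rb')`. The `nodetype=int` argument is an assumption added here so that node labels become integers instead of strings.

```python
import io
import networkx as nx

# In-memory stand-in for the file opened in 'rb' mode in the question.
raw = b"1 2 1\n1 3 1\n2 999 1\n4 5 1\n"
fh = io.BytesIO(raw)

# The third column is read as a float edge attribute named "weight".
G = nx.read_weighted_edgelist(fh, nodetype=int)

print(G[1][2]["weight"])  # 1.0
```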

You don't need to remove the third column. I am using SNAP temporal data for a node2vec project and ran into the same error. I solved it with read_edgelist and an explicit weight column (the data argument belongs to read_edgelist; read_weighted_edgelist does not take it): G = nx.read_edgelist(args.input, nodetype=int, data=(('weight', float),), delimiter=',', create_using=nx.DiGraph())
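A small sketch of the comma-delimited, directed case follows. The rows are made-up sample data, not taken from the SNAP dataset, and `nx.parse_edgelist` is used so that no file is needed; it accepts the same `delimiter`, `data`, `nodetype` and `create_using` arguments.

```python
import networkx as nx

# Hypothetical comma-separated rows: "source,target,weight".
rows = ["1,2,0.5", "2,3,1.5", "3,1,2.0"]

D = nx.parse_edgelist(
    rows,
    delimiter=",",
    nodetype=int,
    data=(("weight", float),),
    create_using=nx.DiGraph(),
)

# Edge direction is preserved: 1 -> 2 exists, 2 -> 1 does not.
print(D.has_edge(1, 2), D.has_edge(2, 1))  # True False
```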
