Converting edgelist data to a networkx object
I have a file that looks like this:
1 2 1
1 3 1
2 999 1
2 1029 1
2 1031 1
2 1032 1
2 1197 1
2 1226 1
2 1296 1
3 450 1
3 933 1
3 934 1
3 955 1
3 1032 1
4 5 1
and I want to convert it to a networkx graph, but I am getting the following error:
G=nx.read_edgelist(fh)
File "<decorator-gen-400>", line 2, in read_edgelist
File "E:\anaconda\lib\site-packages\networkx\utils\decorators.py", line 227, in _open_file
result = func_to_be_decorated(*new_args, **kwargs)
File "E:\anaconda\lib\site-packages\networkx\readwrite\edgelist.py", line 378, in read_edgelist
data=data)
File "E:\anaconda\lib\site-packages\networkx\readwrite\edgelist.py", line 288, in parse_edgelist
"Failed to convert edge data (%s) to dictionary." % (d))
TypeError: Failed to convert edge data (['1']) to dictionary.
Here is the code:
fh=open("YST_full.net", 'rb')
G=nx.read_edgelist(fh)
fh.close()
What am I doing wrong here?
Edit: I tried converting it to a pandas DataFrame:
df=pd.read_csv("YST_full.net",sep=" ",names=['node1','node2','weight'])
print(df)
G=nx.from_pandas_edgelist(df, 'node1', 'node2', ['weight'])
and now I want to convert it to GraphML format:
nx.write_graphml(G, "YST_full.graphml")
but I get this error:
nx.write_graphml(G, "YST_full.graphml")
File "<decorator-gen-440>", line 2, in write_graphml_lxml
File "E:\anaconda\lib\site-packages\networkx\utils\decorators.py", line 227, in _open_file
result = func_to_be_decorated(*new_args, **kwargs)
File "E:\anaconda\lib\site-packages\networkx\readwrite\graphml.py", line 149, in write_graphml_lxml
infer_numeric_types=infer_numeric_types)
File "E:\anaconda\lib\site-packages\networkx\readwrite\graphml.py", line 596, in __init__
self.add_graph_element(graph)
File "E:\anaconda\lib\site-packages\networkx\readwrite\graphml.py", line 658, in add_graph_element
T = self.xml_type[self.attr_type(k, "edge", v)]
KeyError: <class 'numpy.int64'>
You must tell networkx that the third column is an attribute called "weight" (or whatever you want to call it):
graph = nx.read_edgelist("YST_full.net", data=(('weight', float),))
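A minimal sketch of why the data argument matters, assuming the same three-column layout as the sample above. It uses nx.parse_edgelist, the line-level parser that the traceback shows read_edgelist calling, only so that the example runs without a file on disk. The nodetype=int argument is an addition here; without it, node labels are kept as strings.

```python
import networkx as nx

# A few lines in the same "source target weight" layout as the sample file.
lines = ["1 2 1", "1 3 1", "2 999 1"]

# With the default data=True, the third column is expected to be a dict
# literal such as {'weight': 1}, so a bare "1" cannot be converted.
try:
    nx.parse_edgelist(lines)
    failed = False
except TypeError:
    failed = True

# Naming the column and giving its type tells the parser how to read it.
graph = nx.parse_edgelist(lines, nodetype=int, data=(("weight", float),))
print(failed, graph[1][2]["weight"])
```

The same data tuple can list several (name, type) pairs if the file has more than one attribute column.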
As for your second question, networkx sometimes fails to convert a NumPy int64 to a Python int before exporting to GraphML. You have to do the conversion yourself:
weights = {(n1, n2): float(d['weight'])  # or int()
           for n1, n2, d in G.edges(data=True)}
nx.set_edge_attributes(G, weights, 'weight')
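The same idea can be sketched for any edge attribute, not only "weight". This assumes the offending values are NumPy scalars, which expose an .item() method returning the matching Python scalar. The helper name to_native is hypothetical, and the two-edge graph stands in for the one built from the DataFrame.

```python
import io

import networkx as nx
import numpy as np


def to_native(value):
    """Return a plain Python scalar for NumPy scalars; pass other values through."""
    return value.item() if isinstance(value, np.generic) else value


# Small stand-in graph whose weights are NumPy integers.
G = nx.Graph()
G.add_edge(1, 2, weight=np.int64(1))
G.add_edge(1, 3, weight=np.int64(1))

# Replace every edge attribute value with its native-Python equivalent.
for _, _, attrs in G.edges(data=True):
    attrs.update({key: to_native(value) for key, value in attrs.items()})

# Written to an in-memory buffer here; a filename works the same way.
buffer = io.BytesIO()
nx.write_graphml(G, buffer)
```

Unlike casting to str, this keeps the attribute numeric in the exported file.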
This error is due to the dtypes of your pandas DataFrame. A workaround is to convert the DataFrame columns to string dtype. Note that the weight is then written to GraphML as a string.
df = df.apply(lambda x: x.astype(str))
G=nx.from_pandas_edgelist(df, 'node1', 'node2', 'weight')
nx.write_graphml(G,'test.out')
Output:
<?xml version='1.0' encoding='utf-8'?>
<graphml xmlns="http://graphml.graphdrawing.org/xmlns" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://graphml.graphdrawing.org/xmlns http://graphml.graphdrawing.org/xmlns/1.0/graphml.xsd"><key attr.name="weight" attr.type="string" for="edge" id="d0"/>
<graph edgedefault="undirected"><node id="1"/>
<node id="2"/>
<node id="3"/>
<node id="999"/>
<node id="1029"/>
<node id="1031"/>
<node id="1032"/>
<node id="1197"/>
<node id="1226"/>
<node id="1296"/>
<node id="450"/>
<node id="933"/>
<node id="934"/>
<node id="955"/>
<node id="4"/>
<node id="5"/>
<edge source="1" target="2">
<data key="d0">1</data>
</edge>
<edge source="1" target="3">
<data key="d0">1</data>
</edge>
<edge source="2" target="999">
<data key="d0">1</data>
</edge>
<edge source="2" target="1029">
<data key="d0">1</data>
</edge>
<edge source="2" target="1031">
<data key="d0">1</data>
</edge>
<edge source="2" target="1032">
<data key="d0">1</data>
</edge>
<edge source="2" target="1197">
<data key="d0">1</data>
</edge>
<edge source="2" target="1226">
<data key="d0">1</data>
</edge>
<edge source="2" target="1296">
<data key="d0">1</data>
</edge>
<edge source="3" target="450">
<data key="d0">1</data>
</edge>
<edge source="3" target="933">
<data key="d0">1</data>
</edge>
<edge source="3" target="934">
<data key="d0">1</data>
</edge>
<edge source="3" target="955">
<data key="d0">1</data>
</edge>
<edge source="3" target="1032">
<data key="d0">1</data>
</edge>
<edge source="4" target="5">
<data key="d0">1</data>
</edge>
</graph></graphml>
Today I started learning network analysis and this is the first error I encountered. I just changed G=nx.read_edgelist(fh) to G=nx.read_weighted_edgelist(fh).
You can also remove the third column and use G=nx.read_edgelist(fh).
If you change G=nx.read_edgelist(fh) to G=nx.read_weighted_edgelist(fh), it will work.
You don't need to remove the third column.
I am using SNAP temporal data for a node2vec project and ran into the same error. I solved it with the call below. read_weighted_edgelist already reads the third column as a float weight, so it takes no data argument:
G = nx.read_weighted_edgelist(args.input, nodetype=int, delimiter=',', create_using=nx.DiGraph())
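A small sketch of a directed, comma-delimited read with read_weighted_edgelist. The in-memory buffer and its three rows are hypothetical stand-ins for the input file passed as args.input.

```python
import io

import networkx as nx

# Hypothetical comma-separated "source,target,weight" rows standing in for the input file.
sample = io.BytesIO(b"1,2,1\n1,3,1\n2,999,1\n")

G = nx.read_weighted_edgelist(
    sample,
    nodetype=int,               # parse node labels as integers instead of strings
    delimiter=",",              # fields are separated by commas, not whitespace
    create_using=nx.DiGraph(),  # keep edge direction
)

print(G.is_directed(), G[1][2]["weight"], G.has_edge(2, 1))
```

Because the graph is directed, the edge from 1 to 2 does not imply an edge from 2 to 1.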