
Add nodes with attributes from a .txt file with multiple delimiters (networkx/pandas)

I have a .txt file with 46 lines. Each line stands for a node in a network: the node name comes first, followed by a long list of attributes.

Example Name; 03.01.194, Luzern, (LU), Test, Attribute, Other Attribute, 
Kasdasd Alex; 22.12.1957, in Blabla, (ZH), Bürgerorte, Oeschgen (AG),  Zivilstand, 

I'm not sure how to get networkx to treat this as a node list. Here are two things I tried that I thought might work, but so far neither does.

import pandas as pd
import networkx as nx
nodes = pd.read_csv('final.csv', header=None)
nodes

The problem with the code above is that the attributes are separated by commas, but the node name is separated from its attributes by a semicolon.

In a second attempt I wanted to open the file and add the nodes line by line, but I got stuck on the G.add_node() call:

G = nx.Graph()
with open('final.txt') as infile:
    for line in infile:
        G.add_node()

Is one of these two approaches the way to go, or should I try something different?

Also, for further analysis: does networkx offer a way to compare the attributes of nodes and, if they match, create a weighted edge?

You can achieve this by reading the file with ';' as the delimiter, so that the first element is the node key and the rest is the attribute string. Then split the attribute string on ',' and add the resulting list as a node attribute. I copied the sample you provided into a file named 'test.txt' and ran the following code.

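import csv
import networkx as nx

G = nx.DiGraph()

# Split each line on ';': row[0] is the node key, row[1] holds the attributes.
with open("test.txt", newline="") as f:
    for row in csv.reader(f, delimiter=";"):
        attributes = row[1].split(",")
        G.add_node(row[0], attr=attributes)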

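Then I printed the nodes and their attributes as follows:

# G.nodes[n] is the attribute dict stored for node n.
for n in G.nodes():
    print('Node: ' + str(n))
    print('Attributes: ' + str(G.nodes[n]['attr']))

Result (produced under Python 2, which prints the ü in Bürgerorte as the UTF-8 bytes \xc3\xbc):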

Node: Kasdasd Alex

Attributes: [' 22.12.1957', ' in Blabla', ' (ZH)', ' B\xc3\xbcrgerorte', ' Oeschgen (AG)', ' Zivilstand', '']

Node: Example Name

Attributes: [' 03.01.194', ' Luzern', ' (LU)', ' Test', ' Attribute', ' Other Attribute', ' ']
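The leading spaces and the empty last entry in the lists above come from splitting on ',' without any cleanup. If that is unwanted, a minimal sketch of a tidier parse could look like the following. It reads line by line, as in the question's second attempt. The helper names parse_line and parse_nodes are made up for illustration, and io.StringIO stands in for the real file.

```python
import io

# Sample lines copied from the question; io.StringIO stands in for the real file.
SAMPLE = (
    "Example Name; 03.01.194, Luzern, (LU), Test, Attribute, Other Attribute, \n"
    "Kasdasd Alex; 22.12.1957, in Blabla, (ZH), Bürgerorte, Oeschgen (AG),  Zivilstand, \n"
)


def parse_line(line):
    """Split 'name; a, b, c, ' into (name, [a, b, c]). Hypothetical helper."""
    # partition splits on the first ';' only, so the name never loses text.
    name, _, rest = line.partition(";")
    # Strip whitespace around each attribute and drop empty entries.
    attrs = [part.strip() for part in rest.split(",") if part.strip()]
    return name.strip(), attrs


def parse_nodes(handle):
    """Return {name: attributes} for every non-blank line of an open file."""
    nodes = {}
    for line in handle:
        if line.strip():
            name, attrs = parse_line(line)
            nodes[name] = attrs
    return nodes


nodes = parse_nodes(io.StringIO(SAMPLE))
for name, attrs in nodes.items():
    print(name, attrs)
```

Each name and attribute list can then be passed to G.add_node(name, attr=attrs), exactly as in the code above. With a real file, open it with encoding="utf-8" (assuming that is how it was saved) so that names such as Bürgerorte are read correctly.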

As for your last question: yes, networkx offers such capabilities and more. Have a look at the networkx tutorial.
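As one possible illustration of the weighted-edge idea, the sketch below compares every pair of nodes and connects them when their attribute lists overlap, using the number of shared attributes as the weight. This is only one way to define a "match"; the node names and attribute values are made up, and an undirected nx.Graph is used here (instead of the DiGraph above) on the assumption that sharing an attribute has no direction.

```python
from itertools import combinations

import networkx as nx

G = nx.Graph()
# Made-up nodes in the same shape as above: a list stored under the key 'attr'.
G.add_node("A", attr=["Luzern", "(LU)", "Test"])
G.add_node("B", attr=["Luzern", "(LU)", "Other"])
G.add_node("C", attr=["Zürich", "(ZH)", "Test"])
G.add_node("D", attr=["Bern"])

# Compare every unordered pair of nodes once.
for u, v in combinations(list(G.nodes), 2):
    shared = set(G.nodes[u]["attr"]) & set(G.nodes[v]["attr"])
    if shared:
        # Weight = number of attributes the two nodes have in common.
        G.add_edge(u, v, weight=len(shared))

for u, v, w in G.edges(data="weight"):
    print(u, v, w)
```

For 46 nodes the pairwise comparison is cheap (about a thousand pairs). Because attributes are matched purely by string equality, they should be stripped of stray whitespace first, otherwise ' Luzern' and 'Luzern' count as different values.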
