
Networkx: For Loop that Adds Weighted Edges based on Node Properties

I have this CSV:

[image of the CSV; columns: Name, Location, Race, Gender, Age]

I turned it into a graph with NetworkX:

import networkx as nx
import pandas as pd

attr_df = pd.read_csv("names.csv")

# changing all datatypes of every column to string 
attr_df = attr_df.astype(str)
G = nx.Graph()

for index, row in attr_df.iterrows():
    #create dict from row
    row_dict = row.to_dict()

    #pop name and unpack dict to pass to graph
    G.add_node(row_dict.pop('Name'), **row_dict)


# Connect Graph with Weighted Edges based on how many node properties are similar

G.add_edge('Owen', 'Heath', weight=4/4, common='Location= Texas, Race= Black, Gender= Male, Age= 21')
G.add_edge('Owen', 'Roger', weight=2/4, common = 'Race= Black, Gender= Male')
G.add_edge('Owen', 'Cherry', weight=1/4, common='Age= 21')
G.add_edge('Heath', 'Roger', weight=2/4, common='Race= Black, Gender= Male')
G.add_edge('Heath', 'Cherry', weight=1/4, common='Age= 21')
G.add_edge('Amber', 'Susan', weight=2/4, common='Race= White, Gender= Female')
G.add_edge('Amber', 'Cherry', weight=2/4, common='Race= White, Gender= Female')
G.add_edge('Roger', 'Susan', weight=1/4, common='Location= California')
G.add_edge('Susan', 'Cherry', weight=2/4, common='Race= White, Gender= Female')

# Visualize Graph (Matplotlib)
weights = nx.get_edge_attributes(G, 'weight')
common = nx.get_edge_attributes(G, 'common')
nx.draw(G, with_labels = True)

[image of the resulting graph]

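I want to create a loop that adds the edges, their weights (based on the properties two nodes share), and a "common" attribute listing those shared properties. I know I could read through the CSV and create the edges by hand, but that becomes more tedious as the CSV grows. I'm not sure how to write the for loop that creates these edges.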

Probably not the fastest approach, but it works:

from itertools import combinations


row_dicts = [
    {"Name": "Owen", "Location": "Texas", "Race": "Black", "Gender": "Male", "Age": "21"},
    {"Name": "Heath", "Location": "Texas", "Race": "Black", "Gender": "Male", "Age": "21"},
    {"Name": "Amber", "Location": "Ohio", "Race": "White", "Gender": "Female", "Age": "19"},
    {"Name": "Roger", "Location": "California", "Race": "Black", "Gender": "Male", "Age": "18"},
    {"Name": "Susan", "Location": "California", "Race": "White", "Gender": "Female", "Age": "22"},
    {"Name": "Cherry", "Location": "Florida", "Race": "White", "Gender": "Female", "Age": "21"},
]

for P1, P2 in combinations(row_dicts, 2):
    common = {key: v1 for (key, v1), v2 in zip(P1.items(), P2.values()) if v1 == v2}
    if common:
        print(f"G.add_edge('{P1['Name']}', '{P2['Name']}', weight={len(common)}/4, common={common})")
        # G.add_edge(P1['Name'], P2['Name'], weight=len(common)/4, common=str(common))

This gives the output:

G.add_edge('Owen', 'Heath', weight=4/4, common={'Location': 'Texas', 'Race': 'Black', 'Gender': 'Male', 'Age': '21'})
G.add_edge('Owen', 'Roger', weight=2/4, common={'Race': 'Black', 'Gender': 'Male'})
G.add_edge('Owen', 'Cherry', weight=1/4, common={'Age': '21'})
G.add_edge('Heath', 'Roger', weight=2/4, common={'Race': 'Black', 'Gender': 'Male'})
G.add_edge('Heath', 'Cherry', weight=1/4, common={'Age': '21'})
G.add_edge('Amber', 'Susan', weight=2/4, common={'Race': 'White', 'Gender': 'Female'})
G.add_edge('Amber', 'Cherry', weight=2/4, common={'Race': 'White', 'Gender': 'Female'})
G.add_edge('Roger', 'Susan', weight=1/4, common={'Location': 'California'})
G.add_edge('Susan', 'Cherry', weight=2/4, common={'Race': 'White', 'Gender': 'Female'})

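A few notes:

  1. Please don't evaluate that f-string; call G.add_edge directly instead (the commented-out line in the loop). The print is for illustration only.
  2. itertools.combinations accepts any iterable, so you should be able to feed the rows straight into the loop as an iterator rather than building a list first, as I did in this example. This may matter if you have a lot of data, although combinations still keeps its own internal copy of the items, so the saving is limited to the extra list.
  3. The algorithm relies on all rows having the same keys in the same order. If you can't guarantee that, you need to change the dict comprehension slightly so that it looks values up by key.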
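As a rough sketch of how these notes could be put together, the following builds the graph directly instead of printing the calls, and compares values by key so column order does not matter. The names `build_graph`, `CSV_TEXT`, and `name_key` are made up for this illustration and are not part of NetworkX. Dividing by the number of attribute columns instead of a hard-coded 4 is an assumption about the intent. The CSV is inlined so the example runs without the original file.

```python
import csv
import io
from itertools import combinations

import networkx as nx

# Inline stand-in for names.csv, using the rows from the example above.
CSV_TEXT = """Name,Location,Race,Gender,Age
Owen,Texas,Black,Male,21
Heath,Texas,Black,Male,21
Amber,Ohio,White,Female,19
Roger,California,Black,Male,18
Susan,California,White,Female,22
Cherry,Florida,White,Female,21
"""


def build_graph(rows, name_key="Name"):
    """Hypothetical helper: one node per row, one weighted edge per pair
    of rows that share at least one attribute value."""
    rows = list(rows)  # iterated twice below, so materialize once
    G = nx.Graph()

    # Add every row as a node, with the remaining columns as node attributes.
    for row in rows:
        attrs = {k: v for k, v in row.items() if k != name_key}
        G.add_node(row[name_key], **attrs)

    for p1, p2 in combinations(rows, 2):
        # Look values up by key, so the rows may list columns in any order.
        common = {
            k: p1[k]
            for k in p1
            if k != name_key and k in p2 and p1[k] == p2[k]
        }
        if common:
            # Normalize by the number of attribute columns (4 in the example).
            n_attrs = len((p1.keys() | p2.keys()) - {name_key})
            G.add_edge(
                p1[name_key],
                p2[name_key],
                weight=len(common) / n_attrs,
                common=common,
            )
    return G


# csv.DictReader yields one dict per row with string values.
G = build_graph(csv.DictReader(io.StringIO(CSV_TEXT)))

for u, v, data in G.edges(data=True):
    print(u, v, data["weight"], data["common"])
```

With pandas, the same function should accept `attr_df.to_dict("records")` in place of the `csv.DictReader`. The "common" attribute is stored as a dict here; wrap it in `str(...)` if a string is needed, as in the commented-out line of the loop above.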

