NetworkX: For Loop that Adds Weighted Edges Based on Node Properties
I have this CSV:
I turned it into a graph with NetworkX:
import networkx as nx
import pandas as pd

attr_df = pd.read_csv("names.csv")
# Convert every column to string
attr_df = attr_df.astype(str)
G = nx.Graph()
for index, row in attr_df.iterrows():
    # Create a dict from the row
    row_dict = row.to_dict()
    # Pop the name and unpack the remaining dict as node attributes
    G.add_node(row_dict.pop('Name'), **row_dict)
# Connect Graph with Weighted Edges based on how many node properties are similar
G.add_edge('Owen', 'Heath', weight=4/4, common='Location= Texas, Race= Black, Gender= Male, Age= 21')
G.add_edge('Owen', 'Roger', weight=2/4, common = 'Race= Black, Gender= Male')
G.add_edge('Owen', 'Cherry', weight=1/4, common='Age= 21')
G.add_edge('Heath', 'Roger', weight=2/4, common='Race= Black, Gender= Male')
G.add_edge('Heath', 'Cherry', weight=1/4, common='Age= 21')
G.add_edge('Amber', 'Susan', weight=2/4, common='Race= White, Gender= Female')
G.add_edge('Amber', 'Cherry', weight=2/4, common='Race= White, Gender= Female')
G.add_edge('Roger', 'Susan', weight=1/4, common='Location= California')
G.add_edge('Susan', 'Cherry', weight=2/4, common='Race= White, Gender= Female')
# Visualize Graph (Matplotlib)
weights = nx.get_edge_attributes(G, 'weight')
common = nx.get_edge_attributes(G, 'common')
nx.draw(G, with_labels = True)
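I want to create a loop that adds the edges, their weights (based on the number of shared attributes), and an edge attribute listing which attributes the two nodes have in common. I know I can read through the CSV and create the edges by hand, but that becomes more tedious as the CSV grows. I'm not sure how to write the for loop that creates these edges.
Probably not the fastest way, but it works: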
from itertools import combinations
row_dicts = [
{"Name": "Owen", "Location": "Texas", "Race": "Black", "Gender": "Male", "Age": "21"},
{"Name": "Heath", "Location": "Texas", "Race": "Black", "Gender": "Male", "Age": "21"},
{"Name": "Amber", "Location": "Ohio", "Race": "White", "Gender": "Female", "Age": "19"},
{"Name": "Roger", "Location": "California", "Race": "Black", "Gender": "Male", "Age": "18"},
{"Name": "Susan", "Location": "California", "Race": "White", "Gender": "Female", "Age": "22"},
{"Name": "Cherry", "Location": "Florida", "Race": "White", "Gender": "Female", "Age": "21"},
]
for P1, P2 in combinations(row_dicts, 2):
    # Keep the key/value pairs that match in both rows (assumes both dicts list their keys in the same order)
    common = {key: v1 for (key, v1), v2 in zip(P1.items(), P2.values()) if v1 == v2}
    if common:
        print(f"G.add_edge('{P1['Name']}', '{P2['Name']}', weight={len(common)}/4, common={common})")
        # G.add_edge(P1['Name'], P2['Name'], weight=len(common)/4, common=str(common))
This gives the output:
G.add_edge('Owen', 'Heath', weight=4/4, common={'Location': 'Texas', 'Race': 'Black', 'Gender': 'Male', 'Age': '21'})
G.add_edge('Owen', 'Roger', weight=2/4, common={'Race': 'Black', 'Gender': 'Male'})
G.add_edge('Owen', 'Cherry', weight=1/4, common={'Age': '21'})
G.add_edge('Heath', 'Roger', weight=2/4, common={'Race': 'Black', 'Gender': 'Male'})
G.add_edge('Heath', 'Cherry', weight=1/4, common={'Age': '21'})
G.add_edge('Amber', 'Susan', weight=2/4, common={'Race': 'White', 'Gender': 'Female'})
G.add_edge('Amber', 'Cherry', weight=2/4, common={'Race': 'White', 'Gender': 'Female'})
G.add_edge('Roger', 'Susan', weight=1/4, common={'Location': 'California'})
G.add_edge('Susan', 'Cherry', weight=2/4, common={'Race': 'White', 'Gender': 'Female'})
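A few notes:
itertools.combinations accepts any iterable, so you should be able to feed the rows to the loop directly as an iterator instead of first building a list yourself, as I did in this example. Note that combinations copies its input into an internal tuple, so the rows are still held in memory once; passing an iterator only avoids keeping a second copy. This may matter if you have a lot of data.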
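As a rough sketch of the note above, the rows can be streamed from a CSV reader straight into combinations. The helper name shared_attribute_edges and the inline CSV_TEXT are assumptions made for illustration: the sample rows are copied from the row_dicts above, and in practice the reader would wrap open("names.csv", newline=""). Unlike the loop above, this version compares values by key rather than by position, skips the Name column, and derives the weight denominator from the number of columns instead of hard-coding 4.

```python
import csv
import io
from itertools import combinations

# Sample rows copied from row_dicts above; stands in for the real names.csv.
CSV_TEXT = """Name,Location,Race,Gender,Age
Owen,Texas,Black,Male,21
Heath,Texas,Black,Male,21
Amber,Ohio,White,Female,19
Roger,California,Black,Male,18
Susan,California,White,Female,22
Cherry,Florida,White,Female,21
"""


def shared_attribute_edges(rows, name_key="Name"):
    """Yield (u, v, attrs) for each pair of rows sharing at least one value.

    Hypothetical helper for illustration; it is not part of NetworkX.
    """
    for p1, p2 in combinations(rows, 2):
        # Compare by key, so the two dicts need not share the same key order.
        common = {k: v for k, v in p1.items()
                  if k != name_key and p2.get(k) == v}
        if common:
            n_attrs = len(p1) - 1  # every column except the name
            yield p1[name_key], p2[name_key], {
                "weight": len(common) / n_attrs,
                "common": common,
            }


# csv.DictReader is an iterator of dicts, so no intermediate list is built here.
reader = csv.DictReader(io.StringIO(CSV_TEXT))
edges = list(shared_attribute_edges(reader))
for u, v, attrs in edges:
    print(u, v, attrs["weight"], attrs["common"])
```

Each yielded item is a (u, v, attribute dict) triple, which is the form G.add_edges_from accepts, so G.add_edges_from(shared_attribute_edges(reader)) should add all edges in one call. Here common is stored as a dict; wrap it in str() if you prefer the string form used in the commented-out G.add_edge line above.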