
Removing a number of edges in Python

I am wondering whether there is a Pythonic way to do this.

Suppose I have a list of dictionaries:

{'source': 338, 'target': 343, 'value': 0.667693}
{'source': 339, 'target': 342, 'value': 0.628195}
{'source': 340, 'target': 346, 'value': 0.529861}
{'source': 340, 'target': 342, 'value': 0.470139}
{'source': 341, 'target': 342, 'value': 0.762871}
{'source': 342, 'target': 349, 'value': 0.664869}
{'source': 343, 'target': 347, 'value': 0.513025}
{'source': 343, 'target': 344, 'value': 0.486975}
{'source': 344, 'target': 347, 'value': 0.536706}
{'source': 344, 'target': 349, 'value': 0.463294}
{'source': 345, 'target': 349, 'value': 0.546326}
{'source': 345, 'target': 346, 'value': 0.453674}

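It is basically an undirected graph, but it is very messy and I want to clean it up a bit.

So I want to keep the top 2 nodes, the ones with the most edges, intact as in the original.

For the rest of the nodes, there should be at least 5 edges connected to the node.

Right now I just maintain a dictionary of counts and reverse-sort it.

Then I save the top 2 and traverse the list again, removing edges but checking against the top 2.

Is there a cleaner way to do this?

My buggy, messy sample code: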

import json
from pprint import pprint
import operator
json_data=open('topics350_1.json')

data = json.load(json_data)
edges = data["links"]
node_count_dict = {}
super_nodes = 3
min_nodes = 5

for edge in edges:
    keys = [edge['source'], edge['target']]
    for key in keys:
        if key in node_count_dict:
            node_count_dict[key] +=1
        else:
            node_count_dict[key] = 1

sorted_nodes = sorted(node_count_dict.iteritems(), key=operator.itemgetter(1),reverse = True)           
#print sorted_nodes 
top_nodes = sorted_nodes[super_nodes]
final_node_count = {}
for key in sorted_nodes:
    final_node_count[key[0]] = 0
print final_node_count
link_list = []
for edge in edges:
    keys = [edge['source'], edge['target']]
    for key in keys:
        if key not in top_nodes:
            if final_node_count[key] < min_nodes:
                link_list.append(edge)
print link_list




#print data['links']

I strongly suggest using networkx to work with graphs.

import networkx as nx
G = nx.Graph()
# build your Graph
# G.add_node(), G.add_nodes_from(), G.add_edge(), G.add_edges_from()...

nodes = [(g, G.degree(g)) for g in G.nodes()]
# nodes looks like this: [(338, 4), (340, 7)...]
# item one is the node, item two is the number of edges connected to it

nodes.sort(key=lambda n: n[1], reverse=True)

# you want to delete the third node and the other nodes with at most 5 edges, right?
G.remove_node(nodes[2][0])  # index 0 is the node itself, index 1 is its degree
for n, e in nodes[3:]:  # the top 2 stay; the third node is already gone
    if e <= 5:  # degrees were computed before any removal
        G.remove_node(n)
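
The "build your Graph" step is left open above. A minimal sketch of it, assuming the edge dictionaries from the question are available as a list (called `edges` here, with three sample rows copied from the question) and assuming `value` should be stored as an edge attribute:

```python
import networkx as nx

# Sample rows copied from the question; in practice this would be data["links"].
edges = [
    {'source': 338, 'target': 343, 'value': 0.667693},
    {'source': 339, 'target': 342, 'value': 0.628195},
    {'source': 340, 'target': 342, 'value': 0.470139},
]

G = nx.Graph()  # undirected graph
for edge in edges:
    # add_edge creates any missing endpoint nodes automatically
    G.add_edge(edge['source'], edge['target'], value=edge['value'])

# (node, degree) pairs, most-connected first
nodes = sorted(dict(G.degree()).items(), key=lambda pair: pair[1], reverse=True)
top_two = [node for node, _ in nodes[:2]]
```

Going through `dict(G.degree())` is a cautious choice here, since the return type of `G.degree()` differs between networkx 1.x and 2.x.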

Alternatively, your code above could be written like this:

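from collections import Counter

# `edges` is the list of edge dicts from the question (data["links"])
sources = []
for edge in edges:
    sources.append(edge['source'])
    sources.append(edge['target'])

# (node, edge count) pairs, most-connected first
sources_count = Counter(sources)
sources_count = sorted(sources_count.items(), key=lambda s: s[1], reverse=True)

# drop the third-ranked entry so it is not considered below
sources_count.pop(2)
# keep only the node ids (not the (node, count) pairs) so the membership tests work
valid_nodes = set(s[0] for s in sources_count if s[1] <= 5)

link_list = list(filter(
    lambda e: e['source'] not in valid_nodes and e['target'] not in valid_nodes,
    edges
))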
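
For comparison, one possible reading of the question's goal can be sketched without any library: keep every edge that touches one of the top-ranked nodes, and otherwise keep an edge only when both endpoints have at least a minimum number of edges. The function name `prune_edges` and its parameters `keep_top` and `min_edges` are made up for this illustration, and the rule for edges between a top node and a sparse node is an assumption, not something the question spells out.

```python
from collections import Counter


def prune_edges(edges, keep_top=2, min_edges=5):
    """Return the edges that survive pruning (hypothetical helper)."""
    # Count how many edges touch each node.
    degree = Counter()
    for edge in edges:
        degree[edge['source']] += 1
        degree[edge['target']] += 1

    # The keep_top most-connected nodes are left intact.
    top_nodes = {node for node, _ in degree.most_common(keep_top)}
    # Every other node needs at least min_edges edges to stay.
    busy_nodes = {node for node, count in degree.items() if count >= min_edges}
    kept_nodes = top_nodes | busy_nodes

    def keep(edge):
        source, target = edge['source'], edge['target']
        # Assumption: any edge touching a top node is kept as-is.
        if source in top_nodes or target in top_nodes:
            return True
        return source in kept_nodes and target in kept_nodes

    return [edge for edge in edges if keep(edge)]


# Rows from the question, with lower thresholds so the effect is visible.
sample = [
    {'source': 339, 'target': 342, 'value': 0.628195},
    {'source': 340, 'target': 342, 'value': 0.470139},
    {'source': 341, 'target': 342, 'value': 0.762871},
    {'source': 340, 'target': 346, 'value': 0.529861},
    {'source': 345, 'target': 346, 'value': 0.453674},
]
pruned = prune_edges(sample, keep_top=1, min_edges=2)
```

With these sample rows, node 342 is the single top node, and the edge between 345 and 346 is dropped because 345 has only one edge. Degrees are counted once, before pruning, so a node is judged by its original edge count.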

