
Python igraph: delete vertices from a graph

I am working with the Enron email dataset and I am trying to remove email addresses that don't contain "@enron.com" (i.e., I would like to keep Enron addresses only). When I tried to delete the addresses without "@enron.com", some of them were skipped for some reason. A small graph is shown below, where the vertices are email addresses. It is in GML format:

Creator "igraph version 0.7 Sun Mar 29 20:15:45 2015"
Version 1
graph
[
  directed 1
  node
  [
    id 0
    label "csutter@enron.com"
  ]
  node
  [
    id 1
    label "steve_williams@eogresources.com"
  ]
  node
  [
    id 2
    label "kutner.stephen@enron.com"
  ]
  node
  [
    id 3
    label "igsinc@ix.netcom"
  ]
  node
  [
    id 4
    label "dbn@felesky.com"
  ]
  node
  [
    id 5
    label "cheryltd@tbardranch.com"
  ]
  node
  [
    id 6
    label "slover.eric@enron.com"
  ]
  node
  [
    id 7
    label "alkeister@yahoo.com"
  ]
  node
  [
    id 8
    label "econnors@mail.mainland.cc.tx.us"
  ]
  node
  [
    id 9
    label "jafry@hotmail.com"
  ]
  edge
  [
    source 5
    target 5
    weight 1
  ]
]

My code is:

import igraph as ig

G = ig.read("enron_email_filtered.gml")
for v in G.vs:
    print(v['label'])
    if '@enron.com' not in v['label']:
        G.delete_vertices(v.index)
        print('Deleted')

In this dataset, 7 email addresses should be deleted. However, with the code above, only 5 are removed.

As described in the tutorial, you can select all the vertices with a specific property and then delete them in bulk. Collecting the indices first avoids modifying the graph while iterating over it:

to_delete_ids = [v.index for v in G.vs if '@enron.com' not in v['label']]
G.delete_vertices(to_delete_ids)

Here is the output I got:

to delete ids: [1, 3, 4, 5, 7, 8, 9]
Before deletion: IGRAPH D-W- 10 1 --
+ attr: id (v), label (v), weight (e)
+ edges:
5->5
After deletion: IGRAPH D-W- 3 0 --
+ attr: id (v), label (v), weight (e)
label: csutter@enron.com
label: kutner.stephen@enron.com
label: slover.eric@enron.com
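
As for why the original loop skipped some addresses, here is a rough sketch that needs no igraph at all. It assumes that iterating over G.vs while deleting behaves like stepping an index through a shrinking list. That is an assumption rather than something taken from the igraph documentation, but it is consistent with the 5 deletions reported in the question. The helper names delete_while_iterating and collect_then_delete are made up for this illustration; the labels are the ten from the GML sample.

```python
# Labels of the ten vertices from the GML sample, in index order.
labels = [
    "csutter@enron.com",
    "steve_williams@eogresources.com",
    "kutner.stephen@enron.com",
    "igsinc@ix.netcom",
    "dbn@felesky.com",
    "cheryltd@tbardranch.com",
    "slover.eric@enron.com",
    "alkeister@yahoo.com",
    "econnors@mail.mainland.cc.tx.us",
    "jafry@hotmail.com",
]


def delete_while_iterating(items):
    """Hypothetical stand-in for the original loop: delete in place while advancing."""
    items = list(items)
    deleted = []
    position = 0
    while position < len(items):
        if "@enron.com" not in items[position]:
            # Everything after this position shifts down by one, so the
            # element that moves into `position` is never examined.
            deleted.append(items.pop(position))
        position += 1
    return items, deleted


def collect_then_delete(items):
    """Hypothetical stand-in for the bulk approach: decide first, delete once."""
    to_delete = [i for i, label in enumerate(items) if "@enron.com" not in label]
    doomed = set(to_delete)
    kept = [label for i, label in enumerate(items) if i not in doomed]
    return kept, to_delete


remaining, deleted = delete_while_iterating(labels)
print(len(deleted), remaining)

kept, to_delete = collect_then_delete(labels)
print(to_delete, kept)
```

Under that assumption, the first function deletes only 5 labels and leaves dbn@felesky.com and econnors@mail.mainland.cc.tx.us behind, because each of them slid into a position that had just been visited. The second function yields the same index list as in the output above, [1, 3, 4, 5, 7, 8, 9].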
