Creating nodes for an undirected graph starting from pandas

I have a dataframe that looks like this (in reality it has 170,000 observations):

Firm   pat             cited_pat
F_1    [p0,p1,p2]       [p0,p1,p2]    
F_2    []               []
F_3    [p3,p6,p2]       [p5,p0,p23,p29,p12,p8]
F_4    [p0,p9,p25]      [p0,p29,p31]
...

The idea is this:

  1. Create all possible pairs of F_i, F_j;
  2. If two firms F_i, F_j have one (or more) "ps" in common in pat, then put an edge of 1 and stop;
  3. If they do not, then take cited_pat and check how many "ps" they have in common there. If more than 50% are in common, then create an edge=1.

Now, I am struggling to find an easy way to do this. Could you please help me with it?

Here's one way to do things:

import pandas as pd
import numpy as np
import networkx as nx

data = {'Firm': {0: 'F_1', 1: 'F_2', 2: 'F_3', 3: 'F_4'},
 'pat': {0: ['p0','p1','p2'], 1: [], 2: ['p3','p6','p2'], 3: ['p0','p9','p25']},
 'cited_pat': {0: ['p0','p1','p2'],
  1: [],
  2: ['p5','p0','p23','p29','p12','p8'],
  3: ['p0','p29','p31']}}

df = pd.DataFrame(data)

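def cited_pat_func(set_i):
    # Build a test: does set_j share more than half of the combined
    # cited patents with set_i? Strict ">" matches "more than 50%" and
    # keeps two firms with empty lists from being linked (0 > 0 is False).
    def f(set_j):
        return len(set_i & set_j)*2 > len(set_i | set_j)
    return f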

G = nx.Graph()
G.add_nodes_from(df['Firm'])

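# enumerate gives the row position, so the slice below stays correct
# even if df does not have the default 0..n-1 index
for pos, (_, row) in enumerate(df.iterrows()):
    df_tail = df.iloc[(pos+1):,:]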
    F_i = row['Firm']
    pat_i = set(row['pat'])
    cpat_i = set(row['cited_pat'])
    
    cond = (df_tail['pat'].apply(set)
              .apply(pat_i.intersection)
              .astype(bool) |
            df_tail['cited_pat'].apply(set)
              .apply(cited_pat_func(cpat_i)))
    for F_j in df_tail.loc[cond,'Firm']:
        G.add_edge(F_i, F_j)

Here's the graph produced for this example:

[image: drawing of the resulting graph]
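
With 170,000 firms, comparing every pair of rows may be slow. One possible alternative, sketched here in plain Python, is to index firms by patent so that only firms sharing at least one patent are ever paired. This is only a sketch under assumptions: the `firms` dict and the `candidate_pairs` helper are hypothetical names, not part of the code above, and "more than 50% in common" is read as the shared cited patents being strictly more than half of the combined cited patents of the two firms.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical container: firm -> (pat list, cited_pat list), same data as above.
firms = {
    'F_1': (['p0', 'p1', 'p2'], ['p0', 'p1', 'p2']),
    'F_2': ([], []),
    'F_3': (['p3', 'p6', 'p2'], ['p5', 'p0', 'p23', 'p29', 'p12', 'p8']),
    'F_4': (['p0', 'p9', 'p25'], ['p0', 'p29', 'p31']),
}

def candidate_pairs(items_by_firm):
    """Return the firm pairs that share at least one item (inverted index)."""
    holders = defaultdict(set)          # item -> firms that list it
    for firm, items in items_by_firm.items():
        for item in items:
            holders[item].add(firm)
    pairs = set()
    for group in holders.values():
        # sorted() gives each pair one canonical order, e.g. ('F_1', 'F_3')
        pairs.update(combinations(sorted(group), 2))
    return pairs

pat = {f: set(p) for f, (p, _) in firms.items()}
cited = {f: set(c) for f, (_, c) in firms.items()}

# Step 2: any shared patent in pat gives an edge.
edges = candidate_pairs(pat)

# Step 3: a pair can only pass the overlap test if it shares a cited patent,
# so only those pairs (not already linked) need to be checked.
for a, b in candidate_pairs(cited) - edges:
    shared = len(cited[a] & cited[b])
    combined = len(cited[a] | cited[b])
    if 2 * shared > combined:           # assumed reading of "more than 50%"
        edges.add((a, b))

print(sorted(edges))
```

The resulting pairs can be passed to `G.add_edges_from(edges)` after adding all firms as nodes. Note that a patent listed by very many firms still produces a number of pairs quadratic in the size of that group, so the gain depends on how the patents are distributed.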
