
Metrics for hierarchical graphs connectivity

This is my first question on Stack Overflow. It is not really a programming question, but since most of us have to deal with theoretical problems at some point and there might be some graph theory specialists around, I thought I might give it a go.

I am currently doing some research on multilingual websites and I found some interesting patterns in the website structure. The graphs below are the website graphs of two different multilingual websites. Sorry, I don't have enough rep points to post images, so I have left them as links. I used the Force Atlas algorithm for the layout. Vertices are colored according to the page language. The shaded areas correspond to the subgraphs of a specific language.

Here is the graph of a website where different language versions of the same content are very closely linked. Hence the planes representing the different language versions overlap.

http://www.ai.soc.i.kyoto-u.ac.jp/~julien/phd/images/tight.png

In this second graph, we have a website where the language versions are almost independent, so there is almost no overlap.

http://www.ai.soc.i.kyoto-u.ac.jp/~julien/phd/images/loose.png

So here is my question:

Is there a specific metric to quantify this overlap? If so, what is it named?

Since I used a force-based layout, the overlap presumably reflects the number of edges between the language subgraphs. So I guess something like taking the ratio of the number of edges within a subgraph to the number of edges going out of or coming into that subgraph might do the trick. I am sure I am not the first to get this idea, so I was wondering if this metric had a name. I could then Google it from there :)
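A minimal sketch of that ratio, assuming the site graph is available as a list of (source, target) page pairs and a dict mapping each page to its language. The function name `edge_counts`, the page names, and the data are hypothetical, made up purely for illustration:

```python
def edge_counts(edges, lang, target):
    """Count edges inside one language subgraph and edges crossing its boundary.

    edges  : iterable of (source_page, target_page) pairs
    lang   : dict mapping each page to its language code
    target : the language subgraph to measure
    """
    internal = 0  # both endpoints are in the target language
    crossing = 0  # exactly one endpoint is in the target language
    for u, v in edges:
        u_in = lang[u] == target
        v_in = lang[v] == target
        if u_in and v_in:
            internal += 1
        elif u_in or v_in:
            crossing += 1
    return internal, crossing


# Hypothetical toy site: two English pages, two Japanese pages.
edges = [
    ("en/home", "en/about"),
    ("en/home", "ja/home"),
    ("ja/home", "ja/about"),
    ("en/about", "ja/about"),
]
lang = {"en/home": "en", "en/about": "en", "ja/home": "ja", "ja/about": "ja"}

internal, crossing = edge_counts(edges, lang, "en")
ratio = internal / crossing if crossing else float("inf")
print(f"internal={internal} crossing={crossing} ratio={ratio:.2f}")
# internal=1 crossing=2 ratio=0.50
```

A low ratio would suggest tightly interlinked language versions (heavy overlap in the layout), a high ratio nearly independent ones. Edge direction is ignored here, which is a simplification for a hyperlink graph.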

Thank you in advance!

It sounds like what you're looking for is Network Modularity. Given a graph and a partition (a division of the graph into disjoint subgraphs), the modularity is defined as:

The fraction of the edges that fall within the given groups minus the expected such fraction if edges were distributed at random.
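As a rough illustration of that definition, here is a minimal sketch for an undirected, unweighted graph, using the common form in which each group contributes (internal edges / m) minus (sum of degrees in the group / 2m) squared, where m is the total number of edges. Treating the website graph as undirected is an assumption made for simplicity, and the function name `modularity` and the toy data are hypothetical:

```python
from collections import defaultdict


def modularity(edges, group):
    """Modularity of a partition of an undirected, unweighted graph.

    edges : list of (u, v) pairs, each undirected edge listed once
    group : dict mapping each node to its group label (e.g. page language)
    """
    m = len(edges)
    if m == 0:
        raise ValueError("graph has no edges")
    internal = defaultdict(int)  # edges with both endpoints in the same group
    degree = defaultdict(int)    # sum of node degrees per group
    for u, v in edges:
        degree[group[u]] += 1
        degree[group[v]] += 1
        if group[u] == group[v]:
            internal[group[u]] += 1
    # Observed fraction of within-group edges minus the fraction expected
    # if edges were placed at random while preserving node degrees.
    return sum(internal[g] / m - (degree[g] / (2 * m)) ** 2 for g in degree)


# Hypothetical toy graph: two triangles joined by a single edge.
edges = [
    ("a", "b"), ("b", "c"), ("a", "c"),
    ("d", "e"), ("e", "f"), ("d", "f"),
    ("c", "d"),
]
group = {"a": "en", "b": "en", "c": "en", "d": "ja", "e": "ja", "f": "ja"}

print(round(modularity(edges, group), 4))  # 0.3571
```

With the page language used as the partition, a value near zero would indicate that the language versions are about as interlinked as chance would predict, while a higher value would indicate more separate language subgraphs.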

Modularity was the basis of some of the first community detection algorithms on networks, which try to find sets of nodes that are densely connected. Recently, though, modularity has been shown to be a poor metric for community detection because of a resolution limit: in certain cases it fails to detect small groups or breaks apart well-defined groups (see this paper).

There are now approaches other than modularity, designed to overcome the limitations mentioned by job, such as surprise, or the B- and C-scores (designed to be significance indices).
