作为联合概率分布的度相关

$P(k_1, k_2) $ 表示度为$k_1$和$k_2$ 的两个节点连接在一起的概率。 如果没有度相关: $P(k_1, k_2) \sim k_1 P(k_1) k_2 P(k_2) $。

对于无标度网络:$P(k_1, k_2) \sim k_1 P(k_1) k_2 P(k_2) \sim k_1^{1-\gamma} k_2^{1-\gamma} $

密度守恒定律

Density conservation law

$\int P(k_1, k_2) d k_2 = k_1 P(k_1)$

等式两边表示的都是度为$k_1$的节点一共连了多少条边的概率!!!

密度守恒定律,指的是边密度从度分布算和联合度分布算结果总是一样的。

对无标度网络而言,满足 $P(k) \sim k^{-\gamma}$,所以边密度守恒的公式可以表达为:

$\int P(k_1, k_2) d k_2 = k_1 P(k_1) \sim k_1^{-(\gamma -1) }$

硬猜$P(k_1, k_2)$

The statistical similarity of the corresponding plots suggests the invariance of $P(k_1, k_2)$. Accordingly, this suggests that the $k_1$ and $k_2$ dependence can be separated, and the behavior of the tail of the joint degree distribution is

$P(k_1, k_2) \sim k_1^{-(\gamma -1 )} k_2^{-\epsilon}$ (1)

for completely random networks:

$P(k_1, k_2) \sim k_1 P(k_1) k_2 P(k_2) \sim k_1^{1-\gamma} k_2^{1-\gamma}$ (2)

Using Eq. (1), we can see that:

  • For low-degree nodes ($k_1 > k_2$), integrating over $k_2$, we retrieve the $k^{1-\gamma}$ dependence.
  • For the case of hubs, where integration is over $k_1$, the dependence on the degree is $k^{-\epsilon}$

$E_b(k)$

测量度相关的方式很多,Newman提出的度相关系数、$k_{nn}$等,但是不幸的是它们随着重整化并非不变的(invariant)。$E_b(k)$可以做到伴随着重整化不变(但要求是无标度网络才行)。

$E_b(k) \equiv \frac{\int_{bk}^{\infty} P(k | k’) dk’}{ \int_{bk}^{\infty} P( k’) dk’ }$

  • 分子是两节点度为k和大于bk的链接比例;
  • 分母是度大于bk的节点的比例

我们知道对无标度网络而言,满足 $P(k) \sim k^{-\gamma}$,所以边密度守恒的公式可以表达为:$\int P(k_1, k_2) d k_2 = k_1 P(k_1) \sim k_1^{-(\gamma -1) }$

以下哪一个推导正确:

  • 推导1:$P(k | k’) = \frac{P(k, k’) }{\int P(k, k’) dk} = \frac{k^{-(\gamma-1)} k’^{-\epsilon}}{k’^{1-\gamma}} = k^{-(\gamma -1)} k’^{-(1+\epsilon – \gamma)}$
  • 推导2:$P(k | k’) = \frac{P(k, k’) }{P(k’)} = \frac{k^{-(\gamma-1)} k’^{-\epsilon}}{k’^{-\gamma}} = k^{-(\gamma -1)} k’^{-\epsilon + \gamma}$

已知度为k的节点存在的概率为p(k),那么连边的一头连度为k的概率为kP(k)。就是P(k,k’)是连边存在的概率,p(k)为度值为k的节点存在的概率,这个没法放到分母下面,放到分布下面的是度值为k的连边存在的概率,这个值还有乘以k才对。

计算$E_b(k)$

算法设计 – 将链接表达为 (k, k’)的形式
– 使得其中的 k’>k – 选择 b = 3
– 计算: ** – 计算k’ > 3k, k = k的链接数量$N_{kk’}$ **
– 计算$P(k, k’) = \frac{N_{kk’}}{Number\;of\;edges}$ **
– 对于每一个节点, 计算整个网络节点的度大于3k的比例, $P(k’)$和$k’ P(k’)$ **
– 计算$p1 = \sum \frac{P(k, k’)}{k’ P(k’)}$和$P2 = \sum P(k’)$。 **
– 计算 p1/p2

%matplotlib inline
import networkx as nx
import matplotlib.cm as cm
import matplotlib.pyplot as plt
from collections import defaultdict

def ebkk(g, b):
    edge_dict = defaultdict(lambda: defaultdict(int))
    degree_dict = defaultdict(int)
    edge_degree = [sorted(g.degree(e).values()) for e in g.edges()]
    for e in edge_degree:
        edge_dict[e[0]][e[-1]] +=1
    for i in g.degree().values():
        degree_dict[i] +=1
    edge_number = g.number_of_edges()
    node_number = g.number_of_nodes()
    ebks, ks = [], []
    for k1 in edge_dict:
        p1, p2 = 0, 0
        for k2 in edge_dict[k1]:
            if k2 >= b*k1:
                pkk = float(edge_dict[k1][k2])/edge_number
                pk2 = float(degree_dict[k2])/node_number
                k2pk2 = k2*pk2
                p1 += pkk/k2pk2
                p2 += pk2
        if p2 > 0:
            ebks.append(p1/p2)
            ks.append(k1)
    return ebks, ks

# http://snap.stanford.edu/data/ca-CondMat.html
ca = nx.Graph()
with open ('/Users/chengjun/bigdata/ca-CondMat.txt') as f:
    for line in f:
        if line[0] != '#':
            x, y = line.strip().split('\t')
            ca.add_edge(x,y)
nx.info(ca)


ebk, k = ebkk(ca, b=3)

plt.plot(k,ebk,'r^')
plt.xlabel(r'$k$', fontsize = 16)
plt.ylabel(r'$E_b(k)$', fontsize = 16)
plt.xscale('log')
plt.yscale('log')
plt.show()

Reference

Lazaros K. Gallos, Chaoming Song, Hernán A Makse, 2008. Scaling of Degree Correlations and Its Influence on Diffusion in Scale-Free Networks. Physical Review Letters 100(24):248701 Impact Factor: 7.51 DOI: 10.3731/topologica.2.020

Updated: